Code to generate objects to use in Fedora 6 testing #3

Open
tomwrobel opened this issue Oct 31, 2022 · 0 comments
tomwrobel commented Oct 31, 2022

Linked GitLab ticket: https://gitlab.bodleian.ox.ac.uk/ORA4/dps/-/issues/2

An ORA object on disk in OCFL is defined in https://gitlab.bodleian.ox.ac.uk/ORA4/ora_ocfl. Note, however, that the format of the directory path from the storage root to an ORA OCFL object is not set in stone, and we are prepared to be flexible on this: a shorter path with fewer directories is better.

For the purposes of testing, we are only concerned about bitstreams, not about the content of the files.

This task is to write code that will generate test metadata files and binary files for ingest into Fedora 6. This code should be made available to ORA as part of #5. The language and platform used for this code can be negotiated.

For each test object:

Object id

  • the object id should be a UUID4 identifier, with the prefix ora.ox.ac.uk:uuid:, e.g. ora.ox.ac.uk:uuid:34567890-3456-3456-3456-34567890abcd
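A minimal sketch of generating such an identifier in Python, assuming the standard library `uuid` module is acceptable. The names `ORA_ID_PREFIX` and `new_object_id` are hypothetical and not part of any existing ORA code.

```python
import uuid

# Prefix taken from the issue text.
ORA_ID_PREFIX = "ora.ox.ac.uk:uuid:"


def new_object_id() -> str:
    """Return a new object id: the ORA prefix followed by a random UUID4."""
    return f"{ORA_ID_PREFIX}{uuid.uuid4()}"


if __name__ == "__main__":
    print(new_object_id())
```

Each call returns a fresh identifier, so the function can be called once per test object.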

Metadata files

  • every OCFL version of an object should have a metadata file that is a unique bitstream
  • each metadata file should be approx. 2 KB in size
  • each metadata file should have the same title: ora.ox.ac.uk:uuid:{uuid}.ora2.json, e.g. ora.ox.ac.uk:uuid:34567890-3456-3456-3456-34567890abcd.ora2.json
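The metadata requirements above could be sketched as follows. This is an illustration under stated assumptions, not the ORA2 schema: the JSON field names (`id`, `version`, `nonce`, `padding`) are invented for the example, and "approx. 2Kb" is read as roughly 2048 bytes. The function names are hypothetical.

```python
import json
import uuid

# Assumption: "approx. 2Kb" is interpreted as about 2048 bytes.
TARGET_SIZE = 2048


def metadata_filename(object_id: str) -> str:
    """Build the metadata file title from the object id, per the issue text."""
    return f"{object_id}.ora2.json"


def make_metadata_bytes(object_id: str, version: int,
                        target_size: int = TARGET_SIZE) -> bytes:
    """Return a JSON document of about target_size bytes with unique content.

    The field names are placeholders; the real ORA2 metadata schema is
    not described in this issue.
    """
    record = {
        "id": object_id,
        "version": version,
        "nonce": uuid.uuid4().hex,  # random value so every bitstream differs
        "padding": "",
    }
    # Measure the document without padding, then pad up to the target size.
    base_size = len(json.dumps(record).encode("utf-8"))
    record["padding"] = "x" * max(0, target_size - base_size)
    return json.dumps(record).encode("utf-8")
```

Calling `make_metadata_bytes` once per OCFL version gives a different bitstream each time while the file title from `metadata_filename` stays the same for the object.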

Binary files

  • each binary file should be a unique bitstream (this can be done by appending a unique string to a standard block of random bits)
  • binary files can be random bits; they don't need to be valid files of any format
  • binary file titles should have a file suffix for a normal file format (e.g. 2134q2q34.doc, thesis.pdf)
  • it would be beneficial if a few objects contained files with the same file title, but titles can normally be unique
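The "unique string appended to a standard block" approach mentioned above might look like the sketch below. The block size, the use of a UUID4 hex string as the unique suffix, and the names `STANDARD_BLOCK`, `make_binary_bytes` and `random_filename` are all assumptions made for illustration.

```python
import os
import uuid

# Assumption: one shared block of random bytes, generated once per run.
STANDARD_BLOCK = os.urandom(1024)


def make_binary_bytes(size: int, block: bytes = STANDARD_BLOCK) -> bytes:
    """Return `size` bytes: the block repeated, then a unique 32-byte tag."""
    tag = uuid.uuid4().hex.encode("ascii")  # unique string, 32 ASCII bytes
    if size < len(tag):
        raise ValueError(f"size must be at least {len(tag)} bytes")
    body_len = size - len(tag)
    repeats, remainder = divmod(body_len, len(block))
    return block * repeats + block[:remainder] + tag


def random_filename(suffix: str) -> str:
    """Return a random file title with a normal file suffix, e.g. 'pdf'."""
    return f"{uuid.uuid4().hex[:9]}.{suffix}"
```

This version builds the whole file in memory, which is fine for small files; for the large and very large objects the same idea would need to write the block to disk in chunks instead.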

We will, as discussed in #1, need to generate the following types of object:

  • metadata only objects
  • binary file objects
  • large binary file objects
  • complex binary file objects
  • very large binary file objects

We will need to generate each as required for testing, and to add metadata and single file updates as specified in BAU testing.
