
Store commands

Yoann Valeri edited this page Mar 15, 2023 · 4 revisions

Phobos provides both an Application Programming Interface and a Command Line Interface for users. Within them, commands and functions are separated between store commands and admin commands.

In this page, we will detail the store commands, which are mainly wrappers around the API functions. A pointer to the API counterpart is given whenever a CLI command is introduced, but we will not detail the API functions themselves here.

As a side note though, to use the API, you should:

  • Include the header file phobos_store.h
  • Link with the library libphobos_store (using -lphobos_store)
  • Link with the glib-2 dependency (using -lglib-2.0)

Put and Get objects

Writing objects

To put an object in Phobos, use the phobos put command. Its API counterpart is the phobos_put() function. To work, two values must be specified:

  • the source file which contains the data to put,
  • the identifier to address the data for later retrieval.

Moreover, to inform Phobos about where to put the data, you can specify the family of resources on which the data should be written (using the --family option, currently either with the value dir, tape or rados_pool). If not given, it will default to tape.

For instance, a basic call to phobos put is:

phobos put --family dir myinput_file foo-12345

Moreover, Phobos allows you to specify additional information when using phobos put:

  • an arbitrary set of attributes (key-values) to be attached to the object. They are specified when putting the object using the '--metadata' option:
# phobos put <file> <id> --metadata  <k=v,k=v,...>
phobos put myinput_file foo-12345 --metadata  user=$LOGNAME,put_time=$(date +%s)
  • whether the put creates a newer version of an existing object, using the --overwrite option. Without it, Phobos only attempts to insert the object, and returns an error if it already exists. With it, Phobos either inserts or updates the object, depending on whether it already exists in the database. In the latter case, the old version is considered deprecated, and basic operations can no longer be executed on it unless its version number is given (for more information, see the Object versioning section):
phobos put --overwrite myinput_file foo-12345
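When the attributes come from several variables, building the --metadata value by hand is error-prone. The following is a minimal sketch of a helper that joins key=value arguments with commas, as the option expects. The name join_metadata is hypothetical and not part of Phobos, and the sketch assumes that neither keys nor values contain commas.

```shell
# Join key=value arguments into the comma-separated form expected by --metadata.
# Hypothetical helper, not part of Phobos.
join_metadata() {
    # IFS is changed inside a subshell so the caller's IFS is left untouched;
    # "$*" joins the arguments with the first character of IFS.
    ( IFS=,; printf '%s\n' "$*" )
}

# Possible use, with the file and identifier from the examples above:
#   phobos put myinput_file foo-12345 \
#       --metadata "$(join_metadata "user=$LOGNAME" "put_time=$(date +%s)")"
```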

Multiple write

For better performance when writing to tape, it is recommended to write batches of objects in a single put command. This can be done using the phobos mput command.

phobos mput takes as argument a file that contains a list of input files (one per line) and the corresponding identifiers and metadata, with the following format: <src_file> <object_id> <metadata|->.

Example of an input file for mput called list_file:

/srcdir/file.1      obj0123    -
/srcdir/file.2      obj0124    user=foo,md5=xxxxxx
/srcdir/file.3      obj0125    project=29DKPOKD,date=123xyz

Then, using this file, you can do the following:

phobos mput list_file

Note: this command currently only performs a series of calls to the phobos_put() API function; there is no phobos_mput() function yet.
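The input file for mput can be generated rather than written by hand. The sketch below prints one line per regular file of a directory, using the file name (with a prefix) as the object identifier and "-" for the metadata. make_mput_list is a hypothetical helper, and the sketch assumes that file names contain no whitespace, since the fields of the list are separated by whitespace.

```shell
# Print one mput line per regular file found in a directory.
# $1: source directory, $2: prefix for the generated object identifiers
# Hypothetical helper, not part of Phobos.
make_mput_list() {
    for path in "$1"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory
        [ -f "$path" ] || continue
        # "-" means that no metadata is attached to the object
        printf '%s %s%s -\n' "$path" "$2" "$(basename "$path")"
    done
}

# Possible use:
#   make_mput_list /srcdir obj- > list_file
#   phobos mput list_file
```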

Reading objects

To retrieve the data of an object, use phobos get. Its API counterpart is the phobos_get() function. Its arguments are the identifier of the object to be retrieved, as well as a path to a file where the data should be written.

For example:

phobos get obj0123 /tmp/obj0123.back

An object can also be targeted using its uuid and/or its version number:

phobos get --uuid aabbccdd --version 2 obj0123 /tmp/obj0123.back

Reading object attributes

To retrieve the user-defined metadata of an object (specified at put time using --metadata), use the phobos getmd command:

$ phobos getmd obj0123
cksum=md5:7c28aec5441644094064fcf651ab5e3e,user=foo

Its API counterpart is the phobos_getmd() function.
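Scripts often need a single attribute rather than the whole line. Here is a minimal sketch that extracts one value from output shaped like the example above; md_value is a hypothetical helper, and it assumes the k=v,k=v layout shown above with no commas inside values.

```shell
# Read a "k=v,k=v" line on stdin and print the value stored under key $1.
# Returns a non-zero status when the key is absent.
# Hypothetical helper, not part of Phobos.
md_value() {
    tr ',' '\n' | awk -F= -v key="$1" '
        # Print everything after "key=", so values containing "=" or ":" survive
        $1 == key { print substr($0, length(key) + 2); found = 1 }
        END { exit !found }
    '
}

# Possible use:
#   phobos getmd obj0123 | md_value user
```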

Object versioning

Phobos can manage multiple versions and generations of an object. The former are for the same object (i.e. an object foo was overwritten multiple times), while the latter are for objects that share the same oid but are completely unrelated in their content or associated metadata.

With the concepts of versions and generations, Phobos sets a number of rules to manage object versioning:

  • There is always only one version of a living object (identified only by its oid),
  • All prior versions of an object are called deprecated (identified by their uuid and version),
  • All deprecated versions can still be accessed,
  • Every object can be moved between the living and deprecated states, as long as this movement does not result in a conflict with the first rule.

Besides multiple versions of an object, Phobos also allows multiple generations of an object. This means that an object identifier can be reused, as long as doing so does not conflict with the above rules.

Finally, we will detail below the delete and undelete commands, which are commands to move objects from the living state to the deprecated state, and vice-versa.

Note: we currently have no mechanism to completely remove an object from the database and the corresponding data from the system; this is an ongoing development.

Deleting and restoring objects

To delete an object, use the phobos del[ete] command:

phobos del obj0123

To revert this deletion, use the phobos undel[ete] command, which requires either an oid or a uuid. If given an oid, the command will restore the most recent version of the object with that oid (meaning that if an oid corresponds to different generations of objects, only the most recent one will be restored). If given a uuid, the command will restore the most recent version of the object with that uuid.

For instance:

phobos undel oid obj0123
phobos undel uuid uuid0123

The API counterparts of these commands are the phobos_delete() and phobos_undelete() functions.
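In a script, it can be useful to confirm that a deleted object has left the living state. The sketch below deletes an object and then checks that its oid is no longer printed by phobos object list, which by default only prints the names of living objects. delete_and_check is a hypothetical helper, not part of Phobos.

```shell
# Delete object $1, then verify it is no longer listed among living objects.
# Hypothetical helper, not part of Phobos.
delete_and_check() {
    phobos del "$1" || return 1
    # -F: fixed string, -x: whole-line match, -q: no output
    if phobos object list | grep -Fqx -- "$1"; then
        echo "$1 is still listed as a living object" >&2
        return 1
    fi
}

# Possible use, followed by a restoration:
#   delete_and_check obj0123 && phobos undel oid obj0123
```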

Listing objects

To list objects, use the phobos object list command. Its API counterpart is the phobos_store_object_list() function.

For instance:

$ phobos object list
obj01
obj02

Moreover, the command has multiple options available:

  • --deprecated to list deprecated objects,
  • --metadata <key=value> to keep only the objects that carry the given key=value pairs,
  • --output <column_name> to only show the given attributes of the listed objects (if not given, only the names are shown; if set to all, every attribute is shown),
  • --format <format> to output information in the specified format, defaults to "human",
  • --pattern "<pattern>" to keep only the objects whose oid matches a given pattern,
  • --max-width <n> to truncate the user_md (user-defined metadata) column to the given width, which must be at least 5 and defaults to 30,
  • --no-trunc to not truncate the user_md column.

Note: The accepted patterns are Basic Regular Expressions (BRE) and Extended Regular Expressions (ERE). As described in the PostgreSQL manual, PostgreSQL also accepts Advanced Regular Expressions (ARE), but we will not maintain this feature as ARE is not a POSIX standard.

Here are some examples of these options:

$ phobos object list --output oid,user_md
| oid   | user_md         |
|-------|-----------------|
| obj01 | {}              |
| obj02 | {"user": "foo"} |

$ phobos object list --pattern "*01*"
obj01

$ phobos object list --metadata user=foo
obj02

$ phobos object list --metadata user=foo --output oid,user_md
| oid   | user_md         |
|-------|-----------------|
| obj02 | {"user": "foo"} |

$ phobos object list --output oid,user_md --max-width 10
| oid   | user_md    |
|-------|------------|
| obj01 | {}         |
| obj02 | {"user...} |
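Since the accepted patterns are regular expressions, it can help to try an expression on a few sample identifiers before passing it to --pattern. This is a minimal sketch that assumes the local grep -E is a close enough stand-in for ERE matching; the matching done by Phobos through PostgreSQL may differ in corner cases. preview_pattern is a hypothetical helper, not part of Phobos.

```shell
# Print the identifiers read on stdin (one per line) that match the ERE in $1.
# Hypothetical helper, not part of Phobos.
preview_pattern() {
    # "--" keeps a pattern starting with "-" from being parsed as an option
    grep -E -- "$1"
}

# Possible use:
#   printf 'obj01\nobj02\nfoo-12345\n' | preview_pattern '^obj0[0-9]$'
```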

Locating objects

As some types of media can be moved from one device to another, and from one host to another, the objects they hold can be reached from different hosts. To help you find the best host to reach an object (according to some heuristics that are detailed in the Advanced usage page), Phobos provides the phobos locate command (and its API counterpart the phobos_locate() function):

phobos locate obj0123

This prints the name of the host from which it is best to run your phobos get. As part of its operation, the locate command also preemptively assigns the media holding the object to that host, so that a subsequent get call finds the necessary media and devices ready to be used.

This command can also take different options to more precisely target which object you want to locate:

  • --uuid <uuid> to target a specific object uuid,
  • --version <n> to target a specific object version (if not given, will target the most recent one).

Moreover, the phobos locate command can also take a --focus-host parameter, which will tell the locate command that, if an object can be reached from multiple hosts in an equal fashion (according to the heuristics detailed in Advanced usage), the host given by the user should be preferred. This is especially useful for load balancing purposes.

Finally, a --best-host option is also available for the get command, to retrieve the object only if the request is executed on the best host:

phobos get --best-host obj0123 /tmp/obj0123.back
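The locate and get commands can be combined so that the retrieval always runs on the host reported by locate. The sketch below is one possible arrangement under several assumptions that are not stated by the Phobos documentation: phobos locate prints a bare host name that compares equal to the output of hostname, the other hosts are reachable with non-interactive ssh, and the destination path is meaningful on the remote host (for instance on a shared file system). get_from_best_host is a hypothetical helper.

```shell
# Retrieve object $1 into file $2, running the get on the host given by locate.
# Hypothetical helper, not part of Phobos.
get_from_best_host() {
    best=$(phobos locate "$1") || return 1
    if [ "$best" = "$(hostname)" ]; then
        # The local host is the best one: read the object directly
        phobos get "$1" "$2"
    else
        # Assumption: ssh access to the other host, and $2 is valid there
        ssh "$best" phobos get "$1" "$2"
    fi
}

# Possible use:
#   get_from_best_host obj0123 /tmp/obj0123.back
```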