The OCM specification is designed to be extended in several ways.
The definition of such elements is restricted to a minimum set of attributes and may include functional behavior.
It typically consists of a type
attribute.
Those extension points are used to cover technology-specific aspects to be known either by dedicated implementations of the model or by applications using the model.
There are two different kinds of extensions: functional and semantic.
Functional extensions offer the possibility to enrich an implementation of the Open Component Model core with technology-specific parts to support more technology environments, like storage backends or artifact types.
The functional extension points are:
- Component Descriptor Serialization
- Storage Backends
- Access Methods
- Digest Algorithms
- Signing Algorithms
- Artifact Normalization
- Component Descriptor Normalization
- Label Merge Algorithms
Semantic extensions offer the possibility to describe the semantics and structure of an element by arbitrary types, not defined by the Open Component Model itself.
The semantic extension points are:
The elements used to describe a component version can be represented in a serialized format, typically as yaml document. This document MUST contain a format specification version with the concrete representation of the model elements of a component version. The defined formats are described here.
The Open Component Model specification does not describe a dedicated remotely accessible repository API (like for example the OCI distribution specification). Instead, the model is intended to be stored in any kind of storage sub system, which is able to store a potentially unlimited number of blobs with an adequate addressing scheme, supporting arbitrary names.
For example, an OCI registry with a deep repository structure, is suitable to host OCM components (see OCI mapping Scheme).
An OCM repository is therefore always backed by another technical repository, the storage backend.
On the client side, a suitable implementation or language binding must be available to work with component information stored in such a storage backend.
Every such binding must support at least the mandatory set of abstract operations working with elements of the component model (see below).
The Open Component Model defines abstract operations that must be available to work with a component repository view as interpretation layer on-top of dedicated well-known storage subsystems (like an OCI registry or an S3 blob store). These operations build the first extension point of OCM, which allows to map the OCM functionality onto any blobstore-like storage system.
A concrete implementation of a storage backend extension consists of two parts:
- a language-independent mapping of the OCM elements to the elements available in the storage backend and a formal specification how the abstract model operations are mapped to operations provided by the storage backend.
- a language-specific binding of the formerly described mapping
By defining the language-independent part used for those operations the interoperability between different implementations is assured.
An OCM repository is defined by the type of the storage backend and a set of attributes specific for this type, which specify the instance of the used backend. For example, this could be an URL. The set of attributes is called Repository Specification and consists at least of the following fields:
-
type
(required) stringThe globally unique type of the storage backend.
-
(optional) additional type specific attributes
Every OCM repository can therefore be described by such a Repository Specification. This specification can be utilized by a language binding to access the repository. The type is employed to distinguish the backend implementation, and the additional attributes are leveraged by the implementation to faciliprovide access to the specific repository instance.
In a concrete environment all those repositories are usable, for which an implementation of the abstract model operations exists.
The repository type used in a repository specification consists of two parts:
- a repository type name
- a version
The type name specifies the kind of storage backend mapping to be used to implement an OCM repository interface.
The version is used to specify the attribute structure that describes the repository instance as part of the repository specification.
Regardless of the creator of a component version, an access method must be uniquely identifyable. Therefore, the names of access methods must be globally unique.
There are two kinds of type name:
-
Centrally defined type names managed by the OCM organization
Those methods should be implemented by OCM compliant libraries and tools. Using only such access methods guarantees universal access.
The format of a repository type name is described by the following regexp:
[A-Z][a-zA-Z0-9]*
The defined types with their meaning and formats can be found here
-
Vendor specific types
Any organization using the Open Component Model may define own repository types. However, the significance and function of these types must be distinctly and uniquely defined. Organizations are encouraged to share and reuse existing types, rather than introducing new type names.
Using vendor specific repository types always means a restriction on using tools implementing these access methods. For exchanging such tools involved parties must agree on the repository support of the used toolset.
To support a unique namespace for those type names, vendor specific types have to follow a hierarchical naming scheme based on DNS domain names. Every type name has to be suffixed by a DNS domain owned by the providing organization (for example
myspecialrepo.acme.com
). The local type must follow the above rules for centrally defined type names and prepended, separated by a dot (.
).
The compattern for name
is:
[a-z][a-zA-Z0-9].<DNS domain name>
The version
follows this regexp:
v[1-9][0-9]*
The repository specification type consists of the
repository type name, optionally followed by a version separated by a
slash (/
). If not specified, version v1
is assumed.
The metadata of a component version is defined by the serialization format of a component descriptor. It is stored in the storage backend together with the format version. It must be possible to store any supported format version.
The following operations are mandatory:
-
UploadComponentDescriptor(ComponentDescriptor-YAML) error
Persist a serialized form of the descriptor of a component version with its component identity and version name in way so that it is retrievable again using this identity.
-
GetComponentDescriptor(ComponentId, VersionName) (ComponentDescriptor-YAML, error)
Retrieve a formally persisted description of a component version.
-
UploadBlob(ComponentId, VersionName, BlobAccess, MediaType, ReferenceHint) (BlobIdentity, GlobalAccessSpec, error)
Store a byte stream or blob under a namespace given by the component version identity and return a local blob identity (as string) that can be used to retrieve the blob, again (together with the component version identity).
Additionally, a dedicated media type can be used to decide how to internally represent the artifact content.
Optionally, the operation may decide to store the blob in dedicated ways according to its media type. For example, an OCI based implementation can represent blobs containing an OCI artifact as regular, globally addressable object.
A type-specific optional ReferenceHint can be passed to guide the operation for generating an identity, if it decided to make the object externally visible.
If this is the case, an external access specification has to be returned. At least a blob identity or an external access specification has to be returned for not successful executions.
-
GetBlob(ComponentId, VersionName, BlobIdentity) (Blob, error)
Retrieve a formerly stored blob, again, using the blob identity provided by the store operation. Technically this should be a stream or the blob content.
-
ListComponentVersions(ComponentId) ([]VersionName, error)
List all the known versions of a component specified by its component identity.
Optional operations might be:
-
DeleteComponentVersion(ComponentId, VersionName) error
To be able to clean up old information, an operation to delete the information stored for a component version should be available.
-
DeleteBlob(ComponentId, VersionName, BlobIndentity) error
It might be useful to provide an explicit delete operation for blobs stored along with the component version. But the repository implementation may keep track of used blobs on its own.
-
ListComponents(ComponentId-Prefix) ([]ComponentId, error)
List all components in the given identifier namespace. (The structure of a component id based on hierarchical namespace).
-
ListComponentClosure(ComponentId-Prefix) ([]ComponentId, error)
List all components in the given identifier namespace, recursively. It should not only return component identities, that are direct children, but traverse the complete subtree.
The task of an access method is to provide access to the physical content of an artifact described by a component version. The content is always provided as blob with a dedicated media type, either depending on the access method itself or the artifact type. To fulfill its task an access method gets an access specification.
The list of centrally defined access methods types can be found here
The technical access to the physical content of an artifact described as part of a Component Version is expressed by an Access Specification. It specifies which access method to use and additionally the type-specific attributes, which are required by the access method to access the content. In an implementation the Access Method Type is mapped to code for finally accessing the content of an artifact.
An access specification consists at least of the following fields:
-
type
(required) stringThe globally unique type of the access method.
-
(optional) additional type specific attributes
The access type used in a access specification consists of two parts:
- an access type name
- a version
The type name specifies the kind of access method to be used to access the content of an artifact.
The version is used to specify the attribute structure used to describe the information to identify the artifact in sme repository as part of the access specification.
Regardless of the creator of a component version, an access method must be uniquely identifyable. Therefore, the names of access methods must be globally unique.
There are two kinds of type name:
-
Centrally defined type names managed by the OCM organization
Those methods should be implemented by OCM compliant libraries and tools. Using only such access methods guarantees universal access.
The format of an access type name is described by the following regexp:
[a-z][a-zA-Z0-9]*
The defined types with their meaning and formats can be found in here
-
Vendor specific types
Any organization using the open component model may define access method types on their own. Nevertheless, the meaning and purpose of those types must be clearly and uniquely defined. Organizations should share and reuse existing types instead of introducing new type names.
Using component versions with vendor specific access methods always means a restriction on using tools implementing these access methods. For exchanging such component versions involved parties must agree on the used toolset.
To support a unique namespace for those type names, vendor specific types have to follow a hierarchical naming scheme based on DNS domain names. Every type name has to be suffixed by a DNS domain owned by the providing organization (for example
myspecialaccess.acme.com
). The local type must follow the above rules for centrally defined type names and prepended, separated by a dot (.
).So, the complete pattern looks as follows:
[a-z][a-zA-Z0-9].<DNS domain name>
The version follows the following regexp:
v[1-9][0-9]*
The access specification type consists of the
access method type name optionally followed by a version separated by a
slash (/
). If not specified the version v1
is assumed.
Examples:
ociArtifact/v1
myprotocol.acme.org/v1alpha1
Every access method MUST define a specification of the attributes required to locate the content.
This specification MAY be versioned. The type of the access specification MUST contain the access method name
and MAY have an optional specification type to uniquely describe the method and the attribute set.
Therefore, in addition to the access method type name the access specification type name MAY include
a version appended by a slash (/
) to completely describe the description format.
The version MUST match the following regular expression:
v[1-9][0-9]*
Examples:
ociArtifact/v1
myprotocol.acme.org/v1alpha1
If no version is specified, implicitly the version v1
is assumed.
The access method type is part of the access specification. The access method type may define additional specification attributes required to specify the access path to the artifact blob.
For example, the access method ociBlob
requires the OCI repository reference and the blob digest to be able to access the blob.
...
access:
type: ociArtefact
imageReference: ghcr.io/jensh007/ctf/github.com/open-component-model/ocmechoserver/echoserver:0.1.0
There must be an implementation for all supported external access methods according to their specifications. The local access method is mapped to the local blob access provided by the repository.
They have to support read access, only. At least a media type and stream access for the denoted blob is required.
-
<method>.GetMediaType(RepositoryContext, ComponentVersion, AccessSpecification) (string, error)
Provide the media type of the described artifact. It might explicitly be stored as part of the access specification, or implicitly provided by the access method.
-
<method>.GetStream(RepositoryContext, ComponentVersion, AccessSpecification) (Byte Stream, error)
Provide access to the blob content described by a dedicated access specification.
-
<method>.GetInexpensiveContentVersionIdentity(RepositoryContext, ComponentVersion, AccessSpecification) (string, error)
This optional operation is used to provide a unique identifier for the version of the described content, without the requirement of an (expensive) access to the byte stream. For example, the OCI artifact access method can extract the content digest from the reference part of its access specification. If this operation is not supported or proveds an empty string, an identity will be calculated based on accessing the byte stream.
Digest algorithms describe the way digests are calculated from a byte stream. The defined algorithms can be found here.
A signing algorithm is used to provide a signature byte sequence for a given digest. The algorithm is denoted by a name following the syntax
[A-Z][A-Z0-9-_]*
The result of a signing is a structured data set with the following fields:
-
mediatype
(required) stringThe mediatype used to represent the signature value. Possible values:
application/x-pem-file
signature is stored as multi-block PEM document. The signature block uses the typeSIGNATURE
. This block might describe the signature algorithm with the block headerSignature Algorithm
. Additionally there might be blocks describing the certificate chain of the used public key.application/vnd.ocm.signature.rsa
signature is stored as HEX encoded byte stream.
-
value
(required) stringThe signature byte stream according to the specified media type.
-
algorithm
(required) stringThe technical algorithm used to verify the signature. For example, this name might be different from the signing algorithm name. The signing algorithm might describe the usage of a signing server, which provides an RSA signature. In this case the verification algorithm would be
RSASSA-PKCS1-V1_5
. -
issuer
(optional) stringThe distinguished name of the subject of the public key certificate.
If a component is signed this signature should cover the content provided by the component resources. Therefore a digest is calculated for the resource content blobs. To be able to provide a format-independent digest, the resource blob can be normalized before a digest is calculated. For example, an OCI artifact is represented as blob following the OCI Image Layout Specification.
Unfortunately the byte stream of the resulting artifact blob is not stable, it depends on the used archiving tool, timestamps and the archiving order of the files. So, reccreating an OCI artifact in an OCI repository and recreating it into an archive does not necessaryily provide the same byte sequence. Therfore the natural blob digest is not necessarily a source of providing stable digests for signing. Hence OCM provides a mechanism to calculate digests for resources based on the logical content of an artifact. This mechanism is called Artifact Normalization.
Hereby a stable digest is provided for an artifact, independent of the physical byte representation. The normalization is selected based on the resource type and the artifact media type. The result of the normalization is a digest specification with the following fields
-
normalizationType
(required) stringThe name of the algorithm used to provide a digest for an artifact blob. The default digets algorithm is
genericBlobDigest/v1
, which calculates the byte stream digest of the blob. -
hashAlgorithm
(required) stringThe type of the provided digest algorithm.
-
value
(required) stringThe HEX encoded digest value.
The already defined digesters can be found here.
Example:
resources:
- name: apiserver-proxy
version: v0.14.0
type: ociImage
access:
type: ociRegistry
imageReference: >-
mycompany.com/myrepo/apiserver-proxy:v0.14.0-mod1
digest:
hashAlgorithm: SHA-256
normalisationAlgorithm: ociArtifactDigest/v1
value: 9dc9c7c74abe301e0ee2cb168c004051179e7365d269719db434d3582a12dcb6
relation: local
The component descriptor contains several kinds of information:
- volatile label settings, which might be changeable.
- artifact access information, which might be changed during transport steps.
- static information describing the features and artifacts of a component version.
The digest of a component descriptor is calculated on a normalized form of its elements. The normalized form contains only the signature relevant information, everything else gets removed during the normalization process. The resulting string is the source for calculating the digest. This digest is then finally signed (and verified).
A normalized component descriptor is a subset of its elements containing only the properties relevant for signing.
Like for signature algorithms, the model offers the possibility to work with different normalization algorithms and formats.
The algorithms used for normalization are listed in the extensions section.
Relevant fields are:
- Component Name
- Component Version
- Labels (for component, provider, resources, sources and references) (only signature-relevant labels, see below)
- Component Provider
- Resources without access method specification see below
- Sources without access method specification see below
- References see below
Access method specifications for sources and resources are completely ignored.
A resource or source is ignored, if the access method type is none
or the hash algorithm of the digest specification is NO-DIGEST
and the
normalization algorithm is EXCLUDE-FROM-SIGNATURE
.
Labels by default are removed before signing, but can be marked with a special boolean
property signing
set to true
. This property indicates that the label is
signing-relevant and therefore becomes part of the digest. As a consequence such
labels cannot be changed during the lifecycle of a component version anyomre
and SHOULD only describe static information.
Example:
labels:
- name: label1
value: foo
- name: label2
value: bar
signing: true
label1
will be excluded from the digest, whereas label2
will be included.
The value of any label is taken as is, preserving a potentially deeply nested structure.
If a component version contains references to other component versions, their digests are stored along with the reference as digest descriptor. The digest descriptor is similar to the artifact digest. Hereby the component descriptor normalization algorithm is registered instead of the artifact normalization algorithm.
Example:
...
componentReferences:
- name: etcd-druid
componentName: github.com/gardener/etcd-druid
version: v0.21.0
digest:
hashAlgorithm: SHA-256
normalisationAlgorithm: jsonNormalisation/v1
value: 7f5255bc89cdfc1eb06ce20dba7fb6e1d93a065533354ca9fccfe958c90eac73
...
The normalization algorithm provides a stable deserialization format based on the elements of a component descriptor, excluding the fields not relevant for signing. Afterwards the resulting byte stream is hashed using a digest algorithm. The resulting digest is the digest of the component version.
Value merge algorithms are used during a transfer of a component version into a target repository to merge label values, in case the transferred version is already present and the new content does not hamper the digest of the old one.
This scenario is used to re-transfer updated content of non-signature relevant labels (for example updated routing slips).
Hereby, potential changes in the target must be merged with the new inbound content. This is done by executing value merge algorithms for changed label values.
If no specific algorithm is configured for a label, the algorithm used is default
.
An algorithm is uniquely defined by its name. Therefore we distinguish between standard names and vendor specific names. Standard algorithms always use flat names. All non-standard algorithms MUST use a hierarchical name prefixed at least with a DNS-like domain owned by the provider of the algorithm.
<DNS domain name>/[a-z][a-zA-Z0-9]*
The merge algorithm is described by a specification descriptor optionally provided
by the field merge
as part of a label descriptor. It has the following fields:
-
algorithm
(required) stringThe name of the merge algorithm to be used.
-
config
(optional) anyAn arbitrary description of the config. The structure depends on the selected algorithm.
Example:
labels:
- name: mylabel
value: ...
merge:
algorithm: mapListMerge
config:
keyField: name
entries:
algorithm: ...
config: ...
The configuration may recursively contain further merge specifications used for nested parts of the label value, depending on the outer algorithm.
A model implementation MUST provide the possibility to declare merge algorithms for dedicated label names to be able to omit merge specifications as part of the component descriptor.
The currently specified algorithms for label merge can be found in the extensions section.
Artifact types describe the meaning of an artifact independent of their technical representation in a blob format. The artifact types defined by the core model (this specification) are described in section Artifact Types
Dynamic attribution of model elements with additional information is possible using Labels. To be interpretable by tools the meaning of a label must be uniquely derivable from its name, regardless of the creator of a concrete label entry in a component version. To assure that every consumer of a component version has the same understanding of the label, label names MUST be globally unique.
To combine globally uniqueness and arbitrarely extensibility of label names, they must comply with some namespaced naming scheme.
There are two flavors of labels:
-
labels with a predefined meaning within the component model.
Those labels are used by the standard OCM library and tool set to control some behaviour. Labels without a namespace are relevant for the component model itself.
Such labels use flat names following a camel case scheme with the first character in lower case.
Their format is described by the following regexp:
[a-z][a-zA-Z0-9]*
-
vendor specific labels
any organization using the open component model may define own labels. Nevertheless, these names must be globally unique. Basically there may be multiple such labels provided by different organizations with the same meaning. Such label names MUST use a namespace.
To support a unique namespace vendor specific labels have to follow a hierarchical naming scheme based on DNS domain names. Every label name has to be preceded by a DNS domain owned by the providing organization (for example
landscaper.gardener.cloud/blueprint
). The local name MUST follow the above rules for centrally defined names and is appended, separated by a slash (/
).So, the complete pattern looks as follows:
<DNS domain name>/[a-z][a-zA-Z0-9]*
To be interpretable by tools, every label MUST define a specification of its attributes, to describe its value space. This specification may be versioned.
The version must match the following regexp
v[0-9]+([a-z][a-z0-9]*)?
So far, no centrally predefined labels have been defined.