Alessio Lombardi edited this page Mar 11, 2020 · 24 revisions

Data validation is the process of ensuring data [...] is both correct and useful.
It uses routines, often called "validation rules", [...] that check for correctness, meaningfulness, and security of data that are input to the system. The rules may be implemented through the automated facilities of a data dictionary, or by the inclusion of explicit application program validation logic of the computer and its application.

(from Wikipedia, Data Validation)

In AECDeltas, validation is defined as the process of checking object properties against a certain predefined validation ruleset.

Definition of the validation Payload

The validation ruleset can be defined as an additional object to be uploaded to the Stream.

Its schema is the following:

"validation_ruleset" = {
    "name" : <string>,
    "id": <uuid>,
    "rules" : <structured data>,     // all validation rules, see below
    "enforce_validation" : <bool>,   // if true, server *must* validate and can update the stream only on success
    "from_revision" : <string>,      // if present, validation will be done only on objects created from this revision onwards
    "timestamp": <time_t>,           // time of latest update of the validation_ruleset
    "signature": <base64 AES string>,
    "sender": <string>               // Client used by the author, e.g. BHoM, 3D Repo, Speckle, etc.
}

Note the presence of:

  • an enforce_validation boolean. This indicates whether the validation must be enforced. If false, validation is not enforced when updating the stream revision. This allows for storing the data regardless of the validation success;
  • a from_revision identifier. This determines whether the validation is applied retroactively to older objects: if it is present, only objects created from that revision onwards are validated.
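
As a rough illustration, a Python client could model the payload above as follows. This is a sketch, not part of the specification: the class name ValidationRuleset, the method to_payload, the use of Unix seconds for timestamp, and the empty placeholder for signature are all assumptions. Only the field names come from the schema.

```python
import time
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ValidationRuleset:
    """Hypothetical in-memory mirror of the validation_ruleset payload."""

    name: str
    rules: Dict[str, Any]
    enforce_validation: bool = False
    from_revision: Optional[str] = None  # None: validate all objects
    sender: str = "unknown"
    signature: str = ""  # placeholder; signing is out of scope here
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: int = field(default_factory=lambda: int(time.time()))

    def to_payload(self) -> Dict[str, Any]:
        """Return a plain dict, omitting from_revision when it is not set."""
        payload = asdict(self)
        if payload["from_revision"] is None:
            del payload["from_revision"]
        return payload


ruleset = ValidationRuleset(
    name="structural checks",
    rules={"Revit_column": [["Range", "column_height", 2.7, 4]]},
    enforce_validation=True,
    sender="BHoM",
)
payload = ruleset.to_payload()
```

Leaving from_revision out of the payload, rather than sending an empty value, matches the "if present" wording of the schema comment.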

Since the smallest unit of change is an AEC object, the rules can be designed as a dictionary of rules to be checked against certain AEC object types. More details in the dedicated paragraph. The following is just an example:

// Example rules. See the rules schema definition for the actual schema.
"rules" = {
     "Revit_column" = { 
          "max_column_height" = 4,
          "min_column_height" = 2.7
     },
     "GSA_beam" = { 
          "allowed_cross_sections" = ["UB", "H", "I"],
          "min_depth" = 0.3
     },
}

In this example, when the validation process is triggered, the following checks will be done:

  • all objects of type Revit_column must have a column_height property that is between 2.7 and 4
  • all objects of type GSA_beam must have a cross_section which is one of the types UB, H, I and a minimum depth of 0.3.

More details in the dedicated paragraph.

Requirements of the AECDeltas validation process

The AECDeltas validation process must adhere to the following requirements.

General validation requirements

  1. A validation_ruleset object may or may not exist in a Stream Revision.

  2. It is the responsibility of a client to define and upload a validation_ruleset object.

  3. It follows that if the validation is to be performed, a validation_ruleset must exist in the Stream Revision that is the target of the validation.

  4. The validation_ruleset is an object to which the concept of the Delta applies, i.e. it is possible to modify the validation rules as a project goes forward, which is a common scenario.

    • Addendum: once property-level diffs are supported by the AECDeltas specification, it will be possible to trace how the rules were modified and who modified them.
  5. The validation may be performed by both the client and the server. I.e. whoever possesses an AEC object and a validation_ruleset should be able to implement their validation function based on the rules explained in the "Rules schema definition" paragraph.

  6. The validation_ruleset object may or may not include a from_revision identifier. If it does not, the validation checks are applied to all objects in the Stream Revision. Otherwise, the checks are applied only to objects from the specified Revision onwards. This also requires each object to store the revision_id of the Revision in which it was created or last modified.

  7. If there is more than one validation_ruleset object in the model, the validation should be performed using all of them.

  8. If any validation fails, the failure must be recorded in a validation response message. The schema is in a later paragraph.
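
Requirements 6 and 7 can be sketched as a scoping step that runs once per ruleset. The example below is only an illustration under stated assumptions: objects are plain dictionaries carrying a revision_id, and the stream exposes its revisions as a list ordered oldest first (revision_history). The function name objects_in_scope is hypothetical.

```python
from typing import Any, Dict, List, Optional


def objects_in_scope(
    objects: List[Dict[str, Any]],
    revision_history: List[str],
    from_revision: Optional[str],
) -> List[Dict[str, Any]]:
    """Select the objects that a ruleset applies to (requirement 6)."""
    if from_revision is None:
        return list(objects)
    # Keep revisions at or after from_revision, assuming oldest-first order.
    # list.index raises ValueError if from_revision is unknown.
    start = revision_history.index(from_revision)
    in_scope = set(revision_history[start:])
    return [obj for obj in objects if obj["revision_id"] in in_scope]


history = ["rev-1", "rev-2", "rev-3"]
objects = [
    {"object_type": "Revit_column", "revision_id": "rev-1"},
    {"object_type": "Revit_column", "revision_id": "rev-2"},
    {"object_type": "GSA_beam", "revision_id": "rev-3"},
]
rulesets = [
    {"name": "all objects", "rules": {}},
    {"name": "recent only", "rules": {}, "from_revision": "rev-2"},
]

# Requirement 7: every ruleset is used, each with its own scope.
scopes = {
    rs["name"]: objects_in_scope(objects, history, rs.get("from_revision"))
    for rs in rulesets
}
```

Here the ruleset without from_revision covers all three objects, while the second one covers only the objects stored at rev-2 and rev-3.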

Server-side validation requirements

  1. The validation of a certain Stream Revision against a validation_ruleset existing in that Stream Revision may be triggered at any moment through a dedicated REST API Endpoint. More details in the REST API specification.

  2. When any Delta payload is received, the server must check whether a validation ruleset is present and whether its enforce_validation boolean is set to true. In that case, the server must ensure that the validation is successful before it uses the content of the payload to update the stream.

  3. In all cases where the validation fails on the server, a validation response message must be returned to the client with the failed checks.
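
The server-side requirements 2 and 3 might translate into a handler along the following lines. This is a minimal sketch, not the server implementation: handle_delta, the in-memory stream list, the returned dictionary, and the stand-in range_only_validator are all hypothetical, and the validator is passed in as a function so that any implementation of the rules can be plugged in.

```python
from typing import Any, Callable, Dict, List

Failures = Dict[str, List[list]]
Validator = Callable[[List[Dict[str, Any]], Dict[str, Any]], Failures]


def handle_delta(
    stream: List[Dict[str, Any]],
    delta_objects: List[Dict[str, Any]],
    rulesets: List[Dict[str, Any]],
    validate: Validator,
) -> Dict[str, Any]:
    """Apply a Delta unless an enforcing ruleset rejects it."""
    failures: Failures = {}
    for ruleset in rulesets:
        if not ruleset.get("enforce_validation", False):
            continue  # non-enforcing rulesets never block the update
        found = validate(delta_objects, ruleset["rules"])
        for object_type, failed in found.items():
            failures.setdefault(object_type, []).extend(failed)
    if failures:
        # Requirement 3: report the failed checks, leave the stream as is.
        return {"accepted": False, "validation_response": failures}
    stream.extend(delta_objects)
    return {"accepted": True, "validation_response": {}}


def range_only_validator(objects, rules):
    """Stand-in validator for the demo: understands only "Range" rules."""
    failed = {}
    for obj in objects:
        for rule in rules.get(obj["object_type"], []):
            kind, prop, low, high = rule
            if kind == "Range" and prop in obj and not low <= obj[prop] <= high:
                failed.setdefault(obj["object_type"], []).append(rule)
    return failed


stream = []
rulesets = [{
    "enforce_validation": True,
    "rules": {"Revit_column": [["Range", "column_height", 2.7, 4]]},
}]
too_tall = [{"object_type": "Revit_column", "column_height": 5}]
valid = [{"object_type": "Revit_column", "column_height": 3}]

rejected = handle_delta(stream, too_tall, rulesets, range_only_validator)
accepted = handle_delta(stream, valid, rulesets, range_only_validator)
```

The first call leaves the stream untouched and returns the failed rule; the second call appends the column to the stream.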

Client-side validation requirements

  • Validation enforcement prior to payload transmission should be defined by the individual clients. I.e. the client may or may not enforce the validation prior to uploading the payload to the server.

    • It follows that whether the validation is successful or not does not constitute a requirement for the Revision to be transmitted and stored on the Server, unless the client decides so.
  • The client may request the validation checks to be performed by the server at any time through the dedicated REST API endpoint; the client should therefore be able to interpret the validation response message, appropriately exposing that to the user.

Rules schema definition

The validation process can be implemented as a function that simply goes through the objects and performs some specific checks as specified by the validation ruleset. Different checks correspond to specific kinds of Data Validation.

The schema of the rules is:

"rules" = {
     'objectType_A' = [
           ['validationKind_1', parameters],
           ['validationKind_2', parameters]
      ],
     'objectType_B' = [
           ['validationKind_1', parameters],
           ['validationKind_2', parameters]
      ],
      ...,
}

where:

  • objectType = type of the AEC object that is the target of the validation checks
  • validationKind = one of the different kinds of validation checks, see next paragraph
  • parameters = sequence of parameters to be used by the validation check.
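
To make the nested structure concrete, the sketch below walks a rules dictionary written in the schema's list-of-lists form and yields one (objectType, validationKind, parameters) triple per check. The helper name iter_rules and the KNOWN_KINDS set are assumptions made for illustration; the kind names are the ones used in the paragraphs that follow.

```python
from typing import Dict, Iterator, List, Tuple

# Validation kinds used later in this page, spelled as in its examples.
KNOWN_KINDS = {"type", "single", "Range", "tuple"}


def iter_rules(rules: Dict[str, List[list]]) -> Iterator[Tuple[str, str, list]]:
    """Yield (object_type, validation_kind, parameters) for every check."""
    for object_type, checks in rules.items():
        for check in checks:
            kind, *parameters = check
            if kind not in KNOWN_KINDS:
                raise ValueError(f"unknown validation kind: {kind!r}")
            yield object_type, kind, parameters


rules = {
    "Revit_column": [
        ["type", "Autodesk.Revit.DB.Structure.AnalyticalModelColumn"],
        ["Range", "column_height", 2.7, 4],
    ],
    "GSA_beam": [
        ["tuple", "cross_section", "UB", "H", "I"],
    ],
}
flat = list(iter_rules(rules))
```

Rejecting unknown kinds early is a design choice of this sketch; an implementation could equally skip them, since the set of kinds may be expanded in the future.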

Kinds of Data Validation in AECDeltas

The AECDeltas specification defines the following kinds of Data Validation. The kinds of data validation might be expanded in the future.

  1. Data type validation
  2. Range and constraint validation

1. Data type validation

The data type validation is triggered if a type field is encountered.

It checks that the type of the object corresponds to the type specified in the ruleset.

"rules" = {
     "Revit_column" = ["type", "requiredType"]
     },

where requiredType may be the type name plus namespace.

Taking the example of a Revit Column, a client might wish to ensure that the column type is AnalyticalModelColumn:

"rules" = {
     "Revit_column" = ["type", "Autodesk.Revit.DB.Structure.AnalyticalModelColumn"]
     },
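
A data type check of this kind could look like the sketch below. It assumes that an object is a dictionary whose "type" field holds the fully qualified type name and that an exact string match is required; neither assumption is stated by the specification, and check_type and the second type name are made up for the example.

```python
from typing import Any, Dict


def check_type(obj: Dict[str, Any], required_type: str) -> bool:
    """Return True when the object's type field equals the required type."""
    return obj.get("type") == required_type


rule = ["type", "Autodesk.Revit.DB.Structure.AnalyticalModelColumn"]
column = {"type": "Autodesk.Revit.DB.Structure.AnalyticalModelColumn"}
other = {"type": "Example.Namespace.OtherType"}  # made-up type name

column_ok = check_type(column, rule[1])
other_ok = check_type(other, rule[1])
```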

2. Range and constraint validation

This kind of validation may examine user input for consistency with a minimum/maximum range, or with a test that evaluates a sequence of characters, such as one or more tests against regular expressions. For example, a beam cross section type must be among the UK standard sections.

For this check to be done by the validation function, a valid property type name must be specified for the given object type.

2.1. Single constraint value

Again taking a theoretical Revit column, if a validation is given as:

"rules" = {
     "Revit_column" = ["single", "column_height", 4]
     },

this means that if column_height is a recognised property of the Revit_column object, then its value must be 4 ("single" value).

2.2. Range constraint

The same check could be applied for ranges of values.

If "Range" is specified as the first item of the ruleset values, like:

"rules" = {
     "Revit_column" = ["Range", "column_height", 2.7, 4]
     },

then this means that if column_height is a recognised property of the Revit_column object, then its value must be in the range between 2.7 and 4 (inclusive).
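
The "single" and "Range" checks can be sketched as two small functions. The text does not say what happens when the property is not recognised, so this sketch assumes the check is simply not applicable in that case and returns None; the function names are hypothetical.

```python
from typing import Any, Dict, Optional


def check_single(obj: Dict[str, Any], prop: str, expected: Any) -> Optional[bool]:
    """True if obj[prop] equals expected; None if the property is absent."""
    if prop not in obj:
        return None
    return obj[prop] == expected


def check_range(obj: Dict[str, Any], prop: str, low: float, high: float) -> Optional[bool]:
    """True if low <= obj[prop] <= high (both bounds inclusive)."""
    if prop not in obj:
        return None
    return low <= obj[prop] <= high


column = {"column_height": 4}
single_ok = check_single(column, "column_height", 4)
range_ok = check_range(column, "column_height", 2.7, 4)  # upper bound included
too_short = check_range({"column_height": 2.5}, "column_height", 2.7, 4)
not_applicable = check_range({}, "column_height", 2.7, 4)
```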

2.3. Tuple constraint

The same check could be applied for a tuple of allowed values by specifying "tuple":

"rules" = {
     "Revit_column" = ["tuple", "cross_section", "UB", "H", "I"]
     },

this means that if cross_section is a recognised property of the Revit_column object, then its value must be one of "UB", "H", "I".
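
A membership test is enough to sketch the "tuple" check. As before, the function name check_tuple is an assumption, and so is the choice to return None when the property is absent.

```python
from typing import Any, Dict, Optional


def check_tuple(obj: Dict[str, Any], prop: str, *allowed: Any) -> Optional[bool]:
    """True if obj[prop] is one of the allowed values; None if absent."""
    if prop not in obj:
        return None
    return obj[prop] in allowed


rule = ["tuple", "cross_section", "UB", "H", "I"]
kind, prop, *allowed = rule  # split the rule into its parts

ub_ok = check_tuple({"cross_section": "UB"}, prop, *allowed)
chs_ok = check_tuple({"cross_section": "CHS"}, prop, *allowed)
```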

Validation response message schema

The validation response message should include only the failed checks.

The response schema can therefore simply be the same as the Rules schema, only including the rules that failed the check:

validation_response = {
     'objectType_A' = [
           ['validationKind_7', parameters]
      ],
     'objectType_B' = [
           ['validationKind_3', parameters],
      ],
      ...,
}
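
Putting the pieces together, a validation function that produces a response in this shape could be sketched as follows. It is an illustration under several assumptions: objects are dictionaries with a hypothetical "object_type" key used to look up their rules, the "type" field holds the qualified type name, rules on unrecognised properties are treated as passed, and each failed rule is reported once per object type. The names rule_passes and build_response are made up.

```python
from typing import Any, Dict, List


def rule_passes(obj: Dict[str, Any], rule: list) -> bool:
    """Evaluate one rule against one object."""
    kind, *params = rule
    if kind == "type":
        return obj.get("type") == params[0]
    prop, *values = params
    if prop not in obj:
        return True  # assumption: unrecognised properties are not checked
    if kind == "single":
        return obj[prop] == values[0]
    if kind == "Range":
        return values[0] <= obj[prop] <= values[1]
    if kind == "tuple":
        return obj[prop] in values
    raise ValueError(f"unknown validation kind: {kind!r}")


def build_response(
    objects: List[Dict[str, Any]],
    rules: Dict[str, List[list]],
) -> Dict[str, List[list]]:
    """Collect the failed rules per object type, mirroring the rules schema."""
    response: Dict[str, List[list]] = {}
    for obj in objects:
        for rule in rules.get(obj["object_type"], []):
            if not rule_passes(obj, rule):
                failed = response.setdefault(obj["object_type"], [])
                if rule not in failed:  # report each failed rule once
                    failed.append(rule)
    return response


rules = {
    "Revit_column": [
        ["Range", "column_height", 2.7, 4],
        ["tuple", "cross_section", "UB", "H", "I"],
    ],
}
objects = [
    {"object_type": "Revit_column", "column_height": 5, "cross_section": "UB"},
    {"object_type": "Revit_column", "column_height": 3, "cross_section": "UB"},
]
validation_response = build_response(objects, rules)
```

In this run only the "Range" rule appears in the response. Because the response reuses the Rules schema, it records which rules failed per object type but not which individual object failed.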