Working with the low-code framework involves editing a boilerplate YAML file. This section takes a closer look at the components of that file.
Streams define the schema of the data to sync, as well as how to read it from the underlying API source. A stream generally corresponds to a resource within the API. Streams are analogous to tables in a relational database source.
A stream's schema can be defined as a JSON Schema file in <source_connector_name>/schemas/<stream_name>.json.
More information on how to define a stream's schema can be found here
The schema of a stream object is:
Stream:
  type: object
  additionalProperties: true
  required:
    - name
    - retriever
  properties:
    "$options":
      "$ref": "#/definitions/$options"
    name:
      type: string
    primary_key:
      "$ref": "#/definitions/PrimaryKey"
    retriever:
      "$ref": "#/definitions/Retriever"
    stream_cursor_field:
      type: string
    transformations:
      "$ref": "#/definitions/RecordTransformation"
    checkpoint_interval:
      type: integer
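To make the required and optional fields concrete, here is a minimal Python sketch that checks a stream definition against the schema above. The helper name `check_stream` and the example values are hypothetical. It covers only the fields listed in this section, does not inspect the contents of `retriever`, and is not the framework's own validation.

```python
from typing import Any, Dict, List


def check_stream(stream: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a stream definition (empty if none).

    Hypothetical helper: mirrors only the fields shown in the Stream schema.
    """
    problems = []
    # `name` and `retriever` are the two required properties.
    for key in ("name", "retriever"):
        if key not in stream:
            problems.append(f"missing required property: {key}")
    # Properties the schema types as string.
    for key in ("name", "stream_cursor_field"):
        if key in stream and not isinstance(stream[key], str):
            problems.append(f"{key} must be a string")
    # `checkpoint_interval` is typed as an integer. bool is rejected explicitly
    # because bool is a subclass of int in Python.
    interval = stream.get("checkpoint_interval")
    if interval is not None and (
        isinstance(interval, bool) or not isinstance(interval, int)
    ):
        problems.append("checkpoint_interval must be an integer")
    return problems


# Example stream definition with made-up values; the retriever is a placeholder.
example_stream = {
    "name": "customers",
    "primary_key": "id",
    "retriever": {},
    "stream_cursor_field": "updated_at",
    "checkpoint_interval": 100,
}

print(check_stream(example_stream))
print(check_stream({"name": "customers"}))
```

The second call omits `retriever`, so the returned list contains one problem describing the missing property.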
More details on streams and sources can be found in the basic concepts section.
The data retriever defines how to read the data for a Stream and acts as an orchestrator for the data retrieval flow.
It is described by:
- Requester: Describes how to submit requests to the API source
- Paginator: Describes how to navigate through the API's pages
- Record selector: Describes how to extract records from an HTTP response
- Stream slicer: Describes how to partition the stream, enabling incremental syncs and checkpointing
Each of these components (and each of their subcomponents) is defined by an explicit interface and one or more implementations. The developer can choose and configure the implementation they need, depending on the specifications of the API they are integrating with.
Since the Retriever is defined as part of the Stream configuration, different Streams for a given Source can use different Retriever definitions if needed.
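The orchestration role of the retriever can be sketched in a few lines of Python. This is an illustration of the flow only, under the assumption that each of the four components can be modeled as a plain callable or iterable. The function `read_records`, its parameter names, and the in-memory `FAKE_PAGES` data are all hypothetical and are not the framework's interfaces.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional

Response = Dict[str, Any]
Record = Dict[str, Any]
Slice = Dict[str, Any]


def read_records(
    slices: Iterable[Slice],
    send_request: Callable[[Slice, Optional[str]], Response],
    next_page_token: Callable[[Response], Optional[str]],
    select_records: Callable[[Response], List[Record]],
) -> Iterator[Record]:
    """Combine the four roles: slice, request, select, paginate."""
    for stream_slice in slices:  # stream slicer: partitions the stream
        page_token: Optional[str] = None
        while True:
            response = send_request(stream_slice, page_token)  # requester
            yield from select_records(response)  # record selector
            page_token = next_page_token(response)  # paginator
            if page_token is None:
                break  # no more pages for this slice


# Fake in-memory API used only for this demonstration.
FAKE_PAGES = {
    ("2022", None): {"data": [{"id": 1}, {"id": 2}], "next": "p2"},
    ("2022", "p2"): {"data": [{"id": 3}], "next": None},
    ("2023", None): {"data": [{"id": 4}], "next": None},
}

records = list(
    read_records(
        slices=[{"year": "2022"}, {"year": "2023"}],
        send_request=lambda s, token: FAKE_PAGES[(s["year"], token)],
        next_page_token=lambda response: response["next"],
        select_records=lambda response: response["data"],
    )
)
print(records)
```

Because each role is passed in separately, swapping one implementation (for example, a different pagination strategy) does not affect the others, which is the point of defining each component behind its own interface.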
The schema of a retriever object is:
Retriever:
  type: object
  anyOf:
    - "$ref": "#/definitions/SimpleRetriever"
SimpleRetriever:
  type: object
  additionalProperties: true
  required:
    - name
    - requester
    - record_selector
  properties:
    "$options":
      "$ref": "#/definitions/$options"
    name:
      type: string
    primary_key:
      "$ref": "#/definitions/PrimaryKey"
    requester:
      "$ref": "#/definitions/Requester"
    record_selector:
      "$ref": "#/definitions/HttpSelector"
    paginator:
      "$ref": "#/definitions/Paginator"
    stream_slicer:
      "$ref": "#/definitions/StreamSlicer"
PrimaryKey:
  type: string
Incremental syncs are supported by using a DatetimeStreamSlicer to iterate over a datetime range.
Given a start time, an end time, and a step function, it will partition the interval [start, end] into small windows of the size described by the step.
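The partitioning described above can be illustrated with a short Python sketch. It assumes a fixed-size step expressed as a `timedelta` and assumes that adjacent windows share their boundary. The actual DatetimeStreamSlicer may treat window boundaries and step formats differently, and the function name `datetime_windows` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def datetime_windows(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs covering [start, end]."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    window_start = start
    while window_start < end:
        # The last window is clipped so it never extends past `end`.
        window_end = min(window_start + step, end)
        yield window_start, window_end
        window_start = window_end


windows = list(
    datetime_windows(datetime(2022, 1, 1), datetime(2022, 1, 4), timedelta(days=1))
)
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

With a three-day range and a one-day step, this produces three windows. Each window can be synced and checkpointed on its own, so an interrupted sync can resume from the last completed window.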
Note that the StreamSlicer's cursor_field must match the Stream's stream_cursor_field.
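A small Python sketch of this consistency rule follows. It assumes the stream definition has been loaded into a dictionary, with the slicer nested under `retriever` and `stream_slicer` as in the schemas above. The helper name `cursor_fields_match` and the example values are hypothetical.

```python
from typing import Any, Dict


def cursor_fields_match(stream: Dict[str, Any]) -> bool:
    """True when the slicer's cursor_field equals the stream's stream_cursor_field."""
    slicer = stream.get("retriever", {}).get("stream_slicer")
    if slicer is None:
        return True  # no slicer configured, so there is nothing to compare
    return slicer.get("cursor_field") == stream.get("stream_cursor_field")


# Made-up definitions: the first is consistent, the second is not.
consistent = {
    "name": "customers",
    "stream_cursor_field": "updated_at",
    "retriever": {"stream_slicer": {"cursor_field": "updated_at"}},
}
inconsistent = {
    "name": "customers",
    "stream_cursor_field": "updated_at",
    "retriever": {"stream_slicer": {"cursor_field": "created_at"}},
}

print(cursor_fields_match(consistent))
print(cursor_fields_match(inconsistent))
```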
More information on DatetimeStreamSlicer can be found in the stream slicers section.