OMOP on dbt: a demonstration for the OHDSI Symposium 2022
Abstract available at: https://www.ohdsi.org/2022showcase-2/
This repo includes:
- Part of the production pipeline for OMOP CDM conversion at Siriraj Hospital (the 'Dev' box in the figure above).
- A sub-repo containing the dbt project and models that handle the ETL in SQL.
For demonstration purposes only, we use the data pipeline and ETL conventions from OHDSI/ETL-Synthea.
Learn more about dbt.
`dbt docs serve` provides full documentation with a data lineage graph, making it easier for developers to maintain their conversion.
From dbt manifest to Apache Airflow: the dbt project is wrapped into a DAG of tasks, generated dynamically with one task per dbt model and following the models' execution order.
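As a minimal sketch of the manifest-to-DAG idea, the Python below reads the `nodes` section of a dbt manifest, keeps only the models, and derives an order in which each model runs after the models it depends on. The manifest shown is a trimmed-down, hypothetical example (the project name `demo` and the model names are assumptions, not taken from this repo), and the function names are illustrative. It does not import Airflow; in an Airflow DAG file, each entry of the resulting order could become one task that runs the corresponding model.

```python
from graphlib import TopologicalSorter

# Hypothetical, trimmed-down manifest. A real one is written by dbt to
# target/manifest.json and can be loaded with json.load().
manifest = {
    "nodes": {
        "model.demo.stg_patients": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.demo.synthea.patients"]},
        },
        "model.demo.stg_encounters": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.demo.synthea.encounters"]},
        },
        "model.demo.person": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.demo.stg_patients"]},
        },
        "model.demo.visit_occurrence": {
            "resource_type": "model",
            "depends_on": {
                "nodes": ["model.demo.person", "model.demo.stg_encounters"]
            },
        },
        "test.demo.unique_person_person_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.demo.person"]},
        },
    }
}


def model_dependencies(manifest):
    """Map each model's unique_id to the set of models it depends on."""
    nodes = manifest["nodes"]
    models = sorted(
        uid for uid, node in nodes.items() if node["resource_type"] == "model"
    )
    return {
        uid: {dep for dep in nodes[uid]["depends_on"]["nodes"] if dep in models}
        for uid in models
    }


def execution_order(manifest):
    """Return model ids so that every model comes after its upstream models."""
    return list(TopologicalSorter(model_dependencies(manifest)).static_order())


if __name__ == "__main__":
    for unique_id in execution_order(manifest):
        # In an Airflow DAG, this is where one task per model could be created.
        print(unique_id)
```

Sources and tests are filtered out here so that only models become tasks; whether tests should also be scheduled as separate tasks is a design choice this sketch leaves open.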
Some ETL patterns are repeated (for example, mapping concepts). Defining parameterized functions in one place keeps them maintainable, since there is no need to edit every `.sql` file that uses the same pattern.
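The "define once, reuse everywhere" idea can be illustrated outside dbt with a small Python sketch. In a dbt project this role would be played by a macro written in Jinja; the function below is only an analogy, and the table and column names (`concept_map`, `source_code`, `source_vocabulary_id`) are hypothetical rather than taken from this repo.

```python
from string import Template

# One shared definition of the concept-mapping join. Changing it here
# changes every query that is built from it.
_MAPPING_JOIN = Template(
    "LEFT JOIN $map_table AS m\n"
    "  ON m.source_code = $alias.$code_column\n"
    " AND m.source_vocabulary_id = '$vocabulary_id'"
)


def concept_mapping_join(alias, code_column, vocabulary_id, map_table="concept_map"):
    """Render the mapping join for one source table (hypothetical names)."""
    return _MAPPING_JOIN.substitute(
        map_table=map_table,
        alias=alias,
        code_column=code_column,
        vocabulary_id=vocabulary_id,
    )


# Two different queries reuse the same pattern with different parameters.
condition_sql = (
    "SELECT c.patient, m.target_concept_id\n"
    "FROM conditions AS c\n" + concept_mapping_join("c", "code", "SNOMED")
)
drug_sql = (
    "SELECT d.patient, m.target_concept_id\n"
    "FROM medications AS d\n" + concept_mapping_join("d", "code", "RxNorm")
)

if __name__ == "__main__":
    print(condition_sql)
    print(drug_sql)
```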
Developers can quickly run preliminary checks with `dbt test`, such as uniqueness of ID columns and relationships between concept IDs and the concept table (PK and FK), before proceeding to DQD.
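To make the two kinds of checks concrete, here is a small sketch using Python's built-in `sqlite3` module with made-up rows. It is not how dbt runs its tests internally; it only mirrors the idea that a test is a query selecting the offending rows, so an empty result means the check passes. The table contents are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (concept_id INTEGER);
    CREATE TABLE person (person_id INTEGER, gender_concept_id INTEGER);
    INSERT INTO concept VALUES (8507), (8532);
    -- person_id 2 is duplicated; concept 999 does not exist in concept.
    INSERT INTO person VALUES (1, 8507), (2, 8532), (2, 8507), (3, 999);
    """
)


def failing_rows(sql):
    """Run a check query; any row returned is a violation."""
    return conn.execute(sql).fetchall()


# Uniqueness check: ID values that appear more than once.
duplicate_ids = failing_rows(
    """
    SELECT person_id
    FROM person
    WHERE person_id IS NOT NULL
    GROUP BY person_id
    HAVING COUNT(*) > 1
    """
)

# Relationship check: foreign keys with no matching row in the parent table.
orphan_concepts = failing_rows(
    """
    SELECT p.person_id, p.gender_concept_id
    FROM person AS p
    LEFT JOIN concept AS c ON c.concept_id = p.gender_concept_id
    WHERE p.gender_concept_id IS NOT NULL
      AND c.concept_id IS NULL
    """
)

if __name__ == "__main__":
    print("duplicate ids:", duplicate_ids)
    print("orphan concept references:", orphan_concepts)
```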
The back-end infrastructure is packaged in a Dockerfile, which allows deployment on any container platform (Docker, Kubernetes, etc.) and version control via GitHub or GitLab.
This article is an independent publication and has not been authorized, sponsored, or otherwise approved by dbt Labs, Inc., the owner of dbt™, or any owners of the products mentioned therein.