forked from ltscomputingllc/faersdbstats
Home
wolfderby edited this page May 22, 2024 · 44 revisions
Instructions for running the Pentaho ETL process that standardizes FAERS data and generates safety signals
- Postgres 9.1+ (create table if not exists)
- Pentaho Spoon (built w/ 8.4 then 9.2)
- AWS S3 bucket and credentials
- a license for the latest available OHDSI CDM v5 Vocabulary tables from the OHDSI Athena website
- data_from_s3 (created by Pentaho)
- faersdbstats (created by cloning the repo)
- logs (created by Pentaho in stage_1_setup.kjb)
- my_config.conf (you create it from example_conf.conf)
- config var
  - LOAD_ALL_TIME=1
    - Drops all tables
  - REBUILD_ALL_TIME_DATA_LOCALLY (1=yes or 0=no)
    - if yes... (clobber alert: back up your local domain import files first)
      - Stage one
        - runs FDA data download scripts for LAERS and FAERS
        - attempts to rebuild domain import files (the stage_1_setup > fix_fda_data step needs development)
        - syncs/uploads to S3 bucket
      - Stage two
        - S3 data sync (should be quick, no change)
    - if no...
      - Stage one
        - skips downloads
      - Stage two
        - S3 data sync (should be quick, no change; or downloads all data if it is not present locally)
- puts all data (FAERS and LAERS) onto your S3 bucket
- downloads all S3 data to your machine
- loads it into your local Postgres database
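The branching described above for a full load can be summarized in a short sketch. This is only an illustration of the list, not the Pentaho jobs themselves: the function name `plan_all_time_load` is hypothetical, and the step ordering is a reading of the bullets rather than something verified against the `.kjb` files.

```python
from typing import List


def plan_all_time_load(rebuild_locally: bool) -> List[str]:
    """Return the steps listed above for LOAD_ALL_TIME=1 (hypothetical helper).

    rebuild_locally mirrors REBUILD_ALL_TIME_DATA_LOCALLY (1=yes, 0=no).
    """
    steps = ["drop all tables"]
    if rebuild_locally:
        # Clobber alert: back up local domain import files before this path runs.
        steps += [
            "stage 1: run FDA download scripts for LAERS and FAERS",
            "stage 1: attempt to rebuild domain import files",
            "stage 1: sync/upload to S3 bucket",
        ]
    else:
        steps.append("stage 1: skip downloads")
    # Both paths then sync from S3 and load the data into Postgres.
    steps += ["stage 2: sync data from S3", "load into local Postgres database"]
    return steps


if __name__ == "__main__":
    for step in plan_all_time_load(rebuild_locally=False):
        print(step)
```

With `rebuild_locally=False` the plan skips the FDA downloads and relies on the S3 sync to bring down anything missing locally.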
- config vars
  - LOAD_NEW_YR=2022
  - LOAD_NEW_QTR=Q4
  - LOAD_ALL_TIME=0
- downloads the new quarter of data locally
- puts the new quarter of data onto S3
- loads the new quarter of data into your Postgres database
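A quick sanity check of the single-quarter settings above might look like the following sketch. It assumes the config file uses plain `KEY=VALUE` lines, as the variables are written here, and that `#` starts a comment; both are assumptions, and `parse_config` and `check_new_quarter` are hypothetical helper names, not part of the repo.

```python
import re
from typing import Dict, Tuple


def parse_config(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and '#' comments (assumed format)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_new_quarter(values: Dict[str, str]) -> Tuple[int, str]:
    """Validate the three variables for a single-quarter load."""
    if values.get("LOAD_ALL_TIME") != "0":
        raise ValueError("LOAD_ALL_TIME must be 0 for a single-quarter load")
    year = values.get("LOAD_NEW_YR", "")
    quarter = values.get("LOAD_NEW_QTR", "")
    if not re.fullmatch(r"\d{4}", year):
        raise ValueError(f"LOAD_NEW_YR should be a four-digit year, got {year!r}")
    if not re.fullmatch(r"Q[1-4]", quarter):
        raise ValueError(f"LOAD_NEW_QTR should be Q1..Q4, got {quarter!r}")
    return int(year), quarter


if __name__ == "__main__":
    sample = "LOAD_NEW_YR=2022\nLOAD_NEW_QTR=Q4\nLOAD_ALL_TIME=0\n"
    print(check_new_quarter(parse_config(sample)))
```

Running a check like this before opening the job in Pentaho catches a mistyped quarter or a leftover `LOAD_ALL_TIME=1`, which would otherwise drop all tables.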
1. git clone https://github.com/dbmi-pitt/faersdbstats.git
2. Open stage_0_set_pentaho_vars/example_config.config
   - Add your values
   - Save as faers_config.config in the repo's parent directory (your BASE_FILE_DIR config variable's value)
3. Open ./meta.kjb in Pentaho
   - Follow the wiki pages for additional stage documentation
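Before opening the job in Pentaho, the file layout from the steps above can be checked with a small script. This is a sketch under stated assumptions: it only looks for `meta.kjb` inside the cloned repo and `faers_config.config` in the repo's parent directory, as described in the steps, and the function name `preflight` is hypothetical.

```python
from pathlib import Path
from typing import List


def preflight(repo_dir: str) -> List[str]:
    """Return a list of problems found with the expected file layout."""
    repo = Path(repo_dir).resolve()
    problems = []
    # Step 3 opens ./meta.kjb from the cloned repo.
    if not (repo / "meta.kjb").is_file():
        problems.append(f"meta.kjb not found in {repo}")
    # Step 2 saves faers_config.config in the repo's parent directory.
    if not (repo.parent / "faers_config.config").is_file():
        problems.append(f"faers_config.config not found in {repo.parent}")
    return problems


if __name__ == "__main__":
    for problem in preflight("faersdbstats"):
        print(problem)
```

An empty result means both files are where the steps expect them; it does not validate the contents of the config file.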