Achilles Dockerfiles

Automated Characterization of Health Information at Large-scale Longitudinal Evidence Systems (ACHILLES) - descriptive statistics about an OMOP CDM database.

Quickstart

Use the following command to run Achilles. Change the values of the environment variables as needed. In the example below, the database username and password appear as postgres:s3cret.

docker run \
  --rm \
  --net=host \
  -v "$(pwd)"/output:/opt/app/output \
  -e "ACHILLES_SOURCE=Docker Default" \
  -e "ACHILLES_DB_URI=postgresql://postgres:s3cret@localhost:5432/cdm" \
  -e "ACHILLES_CDM_SCHEMA=public" \
  -e "ACHILLES_VOCAB_SCHEMA=public" \
  -e "ACHILLES_RES_SCHEMA=webapi" \
  -e "ACHILLES_CDM_VERSION=5" \
  uwcarg/ohdsi-achilles:1.5.0

Test

The first test is to make sure you see no errors in the output and that the output volume specified in the docker run command contains the JSON output files.
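The output check can be sketched in the shell as follows. This is a minimal sketch, assuming the output volume is the ./output directory from the docker run command above. The helper name count_json_files and the OUTPUT_DIR variable are hypothetical, and the search is recursive because the layout of the JSON files inside the volume is not assumed.

```shell
#!/bin/sh
# Count the JSON files under a directory, searching recursively.
# count_json_files is an illustrative helper, not part of Achilles.
count_json_files() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    # A missing directory counts as zero files.
    echo 0
    return 0
  fi
  # tr strips the padding that some wc implementations add.
  find "$dir" -type f -name '*.json' | wc -l | tr -d '[:space:]'
}

# OUTPUT_DIR is a hypothetical override; it defaults to the mounted volume.
n=$(count_json_files "${OUTPUT_DIR:-./output}")
if [ "$n" -gt 0 ]; then
  echo "Found $n JSON file(s)"
else
  echo "No JSON files found; check the docker run output for errors"
fi
```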

To see the Achilles results in Atlas, you will have to configure an appropriate data source. This can be tricky, since data sources are not very well documented. Start by reading this. There are two components to a data source, the source and the source_daimon. These are both tables in the schema used for the WebAPI. Each source has a name, key, and connection string, as well as three corresponding entries in the source_daimon table. Each entry in the source_daimon table references a single source, and has a type (numeric value 0, 1 or 2) and a table qualifier (corresponding to the schema name in postgres).

The important types are

  • 0 (CDM) - The table qualifier (schema) should be the one containing the CDM data.
  • 1 (Vocabulary) - The table qualifier (schema) should be the one containing the CDM vocabulary metadata.
  • 2 (Results) - The table qualifier (schema) should be the one containing the results (e.g. Achilles results).

I believe the idea behind this schema mechanism is to allow multiple CDM databases to be accessible through Atlas. It also allows some schemas to be read-only (e.g. CDM data), while others are writable (e.g. Results).

The following SQL could be used to create a source with source_daimon entries that refer to data and vocabulary metadata in the public schema, while referring to results in the webapi schema. This corresponds to the docker run command above.

INSERT INTO webapi.source (source_id, source_name, source_key, source_connection, source_dialect) VALUES (2, 'WebAPI source', 'webapi_source', 'jdbc:postgresql://${DBHOST}:${DBPORT}/${DBNAME}?&user=${DBUSER}&password=${DBPASS}', 'postgresql');
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority) VALUES (1,2,0, 'public', 0); -- public is the postgres schema containing the cdm data
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority) VALUES (2,2,1, 'public', 0); -- public is the postgres schema containing the vocab metadata
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority) VALUES (3,2,2, 'webapi', 0); -- webapi is the postgres schema containing the Achilles results

To use the above you'll need to replace DBHOST, DBPORT, DBNAME, DBUSER and DBPASS.
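One way to produce the connection string for the first INSERT statement is sketched below. The helper name jdbc_url is hypothetical, and the example values are taken from the Quickstart command. The sketch keeps the query-string format used in the SQL above and does no URL-encoding, so it assumes the username and password contain no special characters.

```shell
#!/bin/sh
# Build the JDBC connection string stored in the source_connection column.
# jdbc_url is an illustrative helper; arguments are host, port, database,
# user and password, in that order. No URL-encoding is applied.
jdbc_url() {
  printf 'jdbc:postgresql://%s:%s/%s?&user=%s&password=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example values matching the Quickstart command.
jdbc_url localhost 5432 cdm postgres s3cret
```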

Note that two source_daimon entries of the same type can refer to the same source.

Details

The Docker image is built on Docker Hub and contains a build of OHDSI Achilles.

Achilles Configuration

Environment Variables

For details see the Achilles README. A summary of the environment variables is included below for convenience.

ACHILLES_SOURCE

The name of the data source containing the CDM data. See here for some information on data sources.

ACHILLES_DB_URI

The database connection string, including the username:password pair. In the example above, replace postgres with your database username and s3cret with the password.

ACHILLES_CDM_SCHEMA

The name of the database schema containing the CDM data.

ACHILLES_VOCAB_SCHEMA

The name of the database schema containing the CDM vocabulary metadata.

ACHILLES_RES_SCHEMA

The name of the schema in which you want the results to be written.

ACHILLES_CDM_VERSION

The CDM version. This can be either 4 or 5.
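Before starting the container, the variables above can be checked with a short script. This is a sketch only: is_valid_cdm_version and missing_achilles_vars are hypothetical helpers, not part of the image, and whether the image supplies defaults for unset variables is not assumed either way.

```shell
#!/bin/sh
# Succeed only for the CDM versions listed above (4 or 5).
# is_valid_cdm_version is an illustrative helper.
is_valid_cdm_version() {
  case "$1" in
    4|5) return 0 ;;
    *)   return 1 ;;
  esac
}

# Print the name of every documented variable that is unset or empty.
# missing_achilles_vars is an illustrative helper.
missing_achilles_vars() {
  for name in ACHILLES_SOURCE ACHILLES_DB_URI ACHILLES_CDM_SCHEMA \
              ACHILLES_VOCAB_SCHEMA ACHILLES_RES_SCHEMA ACHILLES_CDM_VERSION; do
    # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}
```

A caller could run missing_achilles_vars first and stop if it prints anything, then pass "$ACHILLES_CDM_VERSION" to is_valid_cdm_version.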

Troubleshooting

Generating the analysis results can use a lot of disk space (e.g. #22), so make sure enough disk space is available to Docker. On macOS, open Preferences... from the Docker icon in the menu bar and select the Disk option. Set the virtual disk image size to 64GB or 128GB, depending on how much space you're using for other images and volumes.
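A rough way to check free space from the shell is sketched below. The helper name free_kb is hypothetical, and it assumes the filesystem name reported by df contains no spaces. It reports space on the host filesystem; on macOS the limit that matters is Docker's virtual disk, and the docker system df command mentioned in the comment summarizes how much space images, containers, and volumes are using.

```shell
#!/bin/sh
# Print the free space, in kilobytes, on the filesystem holding a path.
# free_kb is an illustrative helper. -P selects the portable POSIX output
# format, in which the fourth column of the second line is "Available".
free_kb() {
  df -Pk "${1:-.}" | awk 'NR==2 { print $4 }'
}

echo "Free space here: $(free_kb .) KB"

# Docker's own usage summary (requires a running Docker daemon):
#   docker system df
```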

License

CC BY-SA