Merge pull request #192 from PNNL-CompBio/drug_descrip
Drug descriptor addition to pipeline, other fixes included
Showing 32 changed files with 2,155 additions and 1,579 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,24 +1,68 @@ | ||
## BeatAML Data generation | ||
|
||
This directory builds the data for the BeatAML samples | ||
This directory builds the data for the BeatAML samples. To build and | ||
test this module, run the following commands from the root directory. | ||
|
||
### Sample generation | ||
## Build with test data | ||
Build commands should be similar to every other coderdata build | ||
module. | ||
|
||
To generate samples, you need to pass in the path of the previous | ||
sample file. | ||
|
||
### Build gene table | ||
First, build the gene table. | ||
|
||
1. Build genes docker | ||
``` | ||
python GetBeatAML.py --token $SYNAPSE_AUTH_TOKEN --prevSamples=[path to previous samples] | ||
docker build -f build/docker/Dockerfile.genes -t genes . --build-arg HTTPS_PROXY=$HTTPS_PROXY | ||
``` | ||
|
||
### Drug generation | ||
2. Build gene file | ||
``` | ||
docker run -v $PWD:/tmp genes sh build_genes.sh | ||
``` | ||
|
||
How are the drugs generated??? | ||
### Build AML data | ||
1. Build the Docker image: | ||
``` | ||
docker build -f build/docker/Dockerfile.beataml -t beataml . --build-arg HTTPS_PROXY=$HTTPS_PROXY | ||
``` | ||
|
||
### Omics and Experiment Data | ||
2. Generate new identifiers for these samples to create a | ||
`beataml_samples.csv` file. This pulls from the latest Synapse | ||
project metadata table. | ||
``` | ||
docker run -e SYNAPSE_AUTH_TOKEN=$SYNAPSE_AUTH_TOKEN -v $PWD:/tmp beataml sh build_samples.sh /tmp/build/build_test/test_samples.csv | ||
``` | ||
|
||
3. Pull the data and map it to the samples. This uses the metadata | ||
table pulled above. | ||
``` | ||
docker run -v $PWD:/tmp -e SYNAPSE_AUTH_TOKEN=$SYNAPSE_AUTH_TOKEN beataml sh build_omics.sh /tmp/build/build_test/test_genes.csv /tmp/beataml_samples.csv | ||
``` | ||
|
||
``` | ||
python GetBeatAML.py --token $SYNAPSE_AUTH_TOKEN --curSamples=[path to generated sample file] | ||
4. Process drug data | ||
``` | ||
docker run -e SYNAPSE_AUTH_TOKEN=$SYNAPSE_AUTH_TOKEN -v $PWD:/tmp beataml sh build_drugs.sh /tmp/build/build_test/test_drugs.tsv | ||
``` | ||
|
||
5. Process experiment data. This uses the metadata from above as well as the file metadata on Synapse: | ||
``` | ||
docker run -e SYNAPSE_AUTH_TOKEN=$SYNAPSE_AUTH_TOKEN -v $PWD:/tmp beataml sh build_exp.sh /tmp/beataml_samples.csv /tmp/beataml_drugs.tsv.gz | ||
``` | ||
|
||
Run the steps in the order listed: later steps read files that earlier | ||
steps produce, so the dataset will not compile correctly otherwise. | ||
|
||
|
||
### BeatAML Dataset structure | ||
The build commands above create the following files in the local directory: | ||
|
||
``` | ||
├── beataml_samples.csv.gz | ||
├── beataml_transcriptomics.csv.gz | ||
├── beataml_mutations.csv.gz | ||
├── beataml_proteomics.csv.gz | ||
├── beataml_drugs.tsv.gz | ||
├── beataml_drug_descriptors.tsv.gz | ||
├── beataml_experiments.tsv.gz | ||
``` |
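
Review note: the README asks for the AML steps to be run in order and then lists the files they should leave behind. A minimal sketch of a post-build check follows, assuming the files land in the directory mounted as `/tmp` and that the proteomics file carries a `.csv.gz` suffix like its siblings. The names `BUILD_STEPS`, `EXPECTED_OUTPUTS`, and `missing_outputs` are hypothetical helpers and not part of this commit; the command strings are copied from the README.

```python
from pathlib import Path

# AML build steps in README order (commands copied from the README).
# They are only printed here, never executed.
BUILD_STEPS = [
    ("samples", "docker run -e SYNAPSE_AUTH_TOKEN=$SYNAPSE_AUTH_TOKEN -v $PWD:/tmp "
                "beataml sh build_samples.sh /tmp/build/build_test/test_samples.csv"),
    ("omics", "docker run -v $PWD:/tmp -e SYNAPSE_AUTH_TOKEN=$SYNAPSE_AUTH_TOKEN "
              "beataml sh build_omics.sh /tmp/build/build_test/test_genes.csv "
              "/tmp/beataml_samples.csv"),
    ("drugs", "docker run -e SYNAPSE_AUTH_TOKEN=$SYNAPSE_AUTH_TOKEN -v $PWD:/tmp "
              "beataml sh build_drugs.sh /tmp/build/build_test/test_drugs.tsv"),
    ("experiments", "docker run -e SYNAPSE_AUTH_TOKEN=$SYNAPSE_AUTH_TOKEN -v $PWD:/tmp "
                    "beataml sh build_exp.sh /tmp/beataml_samples.csv "
                    "/tmp/beataml_drugs.tsv.gz"),
]

# File names from the README listing.
# Assumption: the proteomics file ends in .csv.gz like the other omics files.
EXPECTED_OUTPUTS = [
    "beataml_samples.csv.gz",
    "beataml_transcriptomics.csv.gz",
    "beataml_mutations.csv.gz",
    "beataml_proteomics.csv.gz",
    "beataml_drugs.tsv.gz",
    "beataml_drug_descriptors.tsv.gz",
    "beataml_experiments.tsv.gz",
]


def missing_outputs(directory):
    """Return the expected file names that are not present in `directory`."""
    root = Path(directory)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]


if __name__ == "__main__":
    # Print the steps in order, then report which outputs are still absent.
    for number, (label, command) in enumerate(BUILD_STEPS, start=1):
        print(f"{number}. {label}: {command}")
    print("missing outputs:", missing_outputs("."))
```

Running it from the repository root after the build prints an empty list when every listed file exists. The check tests only for presence, not for content.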
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,2 @@ | ||
python GetBeatAML.py --token $SYNAPSE_AUTH_TOKEN --drugs --drugFile $1 | ||
python build_drug_desc.py --drugtable /tmp/beataml_drugs.tsv --desctable /tmp/beataml_drug_descriptors.tsv.gz |
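
Review note: the new second line passes two flags, `--drugtable` and `--desctable`, to `build_drug_desc.py`, whose source is not shown in this view. The sketch below is a hypothetical stand-in for that command line, written only to illustrate how the two paths in the shell script would be received. It assumes both flags are required and take a single path each; the real script may differ, and the descriptor computation itself is not reproduced.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags used in the shell script (hypothetical stand-in)."""
    parser = argparse.ArgumentParser(
        description="Illustrative command line mirroring the build_drug_desc.py call"
    )
    # Assumption: input drug table, a TSV written by the preceding drug step.
    parser.add_argument("--drugtable", required=True, help="path to the input drug table")
    # Assumption: output path for the descriptor table.
    parser.add_argument("--desctable", required=True, help="path for the output descriptor table")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Same argument values as the line added to the shell script.
    args = parse_args([
        "--drugtable", "/tmp/beataml_drugs.tsv",
        "--desctable", "/tmp/beataml_drug_descriptors.tsv.gz",
    ])
    print(args.drugtable, "->", args.desctable)
```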
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,3 @@ | ||
/opt/venv/bin/python 03a-nci60Drugs.py | ||
Rscript 03-createDrugFile.R CTRPv2,GDSC,gCSI,PRISM,CCLE,FIMM | ||
/opt/venv/bin/python build_drug_desc.py --drugtable /tmp/broad_sanger_drugs.tsv --desctable /tmp/broad_sanger_drug_descriptors.tsv.gz |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,12 @@ | ||
pandas | ||
matplotlib | ||
numpy | ||
numpy==1.26.4 | ||
argparse | ||
tqdm | ||
scikit-learn | ||
scipy | ||
requests | ||
openpyxl | ||
polars | ||
mordredcommunity | ||
rdkit |
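
Review note: reading the hunk header (10 lines before, 12 after), this change appears to pin `numpy` to `1.26.4` and add `mordredcommunity` and `rdkit`, leaving every other entry unpinned. A small sketch for listing which entries carry an exact pin follows. It assumes the post-change file is the twelve entries below and handles only the `name==version` form; `parse_requirements` is a hypothetical helper, not something in this repository.

```python
# Assumed post-change contents of the requirements file, inferred from the diff.
REQUIREMENTS = """\
pandas
matplotlib
numpy==1.26.4
argparse
tqdm
scikit-learn
scipy
requests
openpyxl
polars
mordredcommunity
rdkit
"""


def parse_requirements(text):
    """Map each package name to its exact pin, or None when it is unpinned."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        pins[name.strip()] = version.strip() if sep else None
    return pins


if __name__ == "__main__":
    pins = parse_requirements(REQUIREMENTS)
    pinned = {name: version for name, version in pins.items() if version}
    unpinned = [name for name, version in pins.items() if version is None]
    print("pinned:", pinned)
    print("unpinned:", unpinned)
```

Unpinned entries resolve to whatever version is current when the image is built, so rebuilds at different times may not produce identical environments.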