Genomics England pancancer signatures

Code and results for the paper:

Comprehensive repertoire of the chromosomal alteration and mutational signatures across 16 cancer types from 10,983 cancer patients

Disclaimer: This code was written inside the Genomics England Research Environment without GitHub, and it has not been tested outside the Research Environment. We provide it here to aid the transparency of the publication and to make our analysis methods more accessible and reproducible. The pipelines provided will almost certainly not be of significant use outside Genomics England; however, we hope users find pieces of the code helpful for understanding our work and useful in their own research.

Getting started

Clone the repository:

git clone https://github.com/Wedge-lab/Gel_pan_cancer_signatures.git

Create .env file

cd Gel_pan_cancer_signatures
cp .env-template .env

In .env, change all file names as required so that they point to your local files.
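Before running anything, it can help to confirm that the edited entries resolve to real files. The following is a minimal sketch, assuming the .env file uses plain KEY=VALUE lines and that every value is a file or directory path; neither assumption has been checked against .env-template, and the helper names parse_env and missing_paths are hypothetical rather than part of this repository.

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments.

    Assumes a plain KEY=VALUE layout (no `export` prefix, no multi-line values).
    """
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        entries[key.strip()] = value.strip().strip("'\"")
    return entries


def missing_paths(entries):
    """Return the keys whose values do not exist on disk (assumes values are paths)."""
    return sorted(key for key, value in entries.items() if not Path(value).exists())


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        for key in missing_paths(parse_env(env_file.read_text())):
            print(f"{key}: path not found")
    else:
        print(".env not found; copy .env-template to .env first")
```

Run from the repository root, this lists each entry whose path does not exist. Entries that are not paths will also be reported, so treat the output as a prompt to check rather than as an error.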

Run an editable install in a fresh conda environment (python=3.9):

pip install -e .

Contained in this repository

Curate gene mutations

The Bash scripts in ./scripts/data_prep collect and curate genetic mutations from the WGS data available in Genomics England, including inferring their expected impact using CADD scores or OncoKB annotations.

Combine signatures

Association analysis

A series of scripts in ./scripts run association analysis on mutation rates against various covariates, including germline and somatic gene inactivations and treatment exposures:

germline_assoc.sh
treatment_assoc.sh
genotype_assoc.sh
somatic_assoc.sh
twohit_assoc.sh

Each script prepares data and runs the signatureAssociations.nf pipeline.

The signatureAssociations.nf pipeline is probably the most interesting part of this code. It contains the GLM association analysis with resampling, which reduces the false positive rate of associations. Go to pipelines/signaturessignatureAssociations.nf/README.md for more details.
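To give a feel for why resampling helps, here is a minimal sketch of the general idea. It is not the pipeline's implementation. It assumes a Poisson model of mutation counts with a single binary covariate (for example, carrier versus non-carrier) and uses permutation of the covariate as the resampling scheme. The pipeline's actual model, covariates, and resampling procedure are described in its own README and may differ. The function names poisson_lr_stat and resampled_p_value are hypothetical.

```python
import math
import random


def poisson_lr_stat(counts, exposed):
    """Likelihood-ratio statistic for a Poisson GLM with one binary covariate.

    With a single binary covariate the fitted rates are simply the group
    means, so no iterative fitting is needed in this simplified setting.
    """
    def loglik(values, rate):
        # Poisson log-likelihood, omitting the log(v!) term that cancels
        # between the null and alternative models.
        if rate == 0:
            return 0.0  # every value is zero, so the likelihood is 1
        return sum(v * math.log(rate) - rate for v in values)

    group1 = [c for c, e in zip(counts, exposed) if e]
    group0 = [c for c, e in zip(counts, exposed) if not e]
    if not group1 or not group0:
        return 0.0
    null = loglik(counts, sum(counts) / len(counts))
    alt = (loglik(group1, sum(group1) / len(group1))
           + loglik(group0, sum(group0) / len(group0)))
    return 2.0 * (alt - null)


def resampled_p_value(counts, exposed, n_resamples=2000, seed=0):
    """Empirical p-value from shuffling the covariate labels.

    The observed statistic is compared with statistics computed on shuffled
    labels rather than with an asymptotic chi-squared distribution.
    """
    rng = random.Random(seed)
    observed = poisson_lr_stat(counts, exposed)
    shuffled = list(exposed)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point ties count as "at least as extreme".
        if poisson_lr_stat(counts, shuffled) >= observed - 1e-12:
            hits += 1
    # The +1 terms keep the estimate away from an exact zero.
    return (hits + 1) / (n_resamples + 1)
```

Because the reference distribution is built from the data themselves, the empirical p-value does not rely on the model's distributional assumptions holding exactly. Asymptotic p-values can be too optimistic when they do not, which is one common motivation for resampling.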
