Release 1.6.7 (#432)
* Fix fusion report text and author name, update test md5

* Fix treatment options text; insert a comma, tidy up spacing, update test md5

* Fix spacing in 'below threshold to call MS score' message, move make_ordinal method into general-purpose HTML builder

* Make WGS description past tense throughout

* Evaluate and apply thresholds for HRD and MSI; record threshold status in results; plugin tests pass

* add total MS sites

* Update treatment header text and test md5

* Add description and links for NCCN compendium

* Fix TAR description, and move links to versions.py

* move microsatellite total into versions.py

* update test md5sums

* bugfix, MSI cells were not correctly generated

* make thresholds for MSI and HRD greater-than-or-equal instead of greater-than in code and disclaimer

* delete extra period

* Change to FDA/NCCN wording

* Improved NCCN explanatory text with document versions

* Fix NCCN compendium text

* update md5 sums

* update version and changelog for release 1.6.7
iainrb authored Jul 23, 2024
1 parent 79c787f commit 184c563
Showing 21 changed files with 120 additions and 74 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
# CHANGELOG

## v1.6.7: 2024-07-23
- GCGI-1396: Fixes to report text requested by clinical geneticist
- Correct threshold for reporting HRD; genomic landscape plugin has new "sample type" parameter
- Improved explanation and links for NCCN compendium
- Added number of MSI sites

## v1.6.6: 2024-07-19
- GCGI-1391: Fixed column names in data_CNA_oncoKBgenes_nonDiploid.txt which impacted oncoKB therapy annotation

@@ -28,7 +28,7 @@ def test_gene_info(self):
self.assertEqual(merger.ini_defaults.get(cc.RENDER_PRIORITY), 50)
html = merger.render(inputs)
md5_found = self.getMD5_of_string(html)
self.assertEqual(md5_found, '07f1328c64a5c19aa3751905af228239')
self.assertEqual(md5_found, 'd94877de73bced10b7aeebd8254e8bbc')

if __name__ == '__main__':
unittest.main()
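Most of the test changes in this commit are updated md5 sums, because the tests compare a checksum of the rendered HTML rather than the HTML itself. A minimal sketch of that idea follows, using only the standard library. The helper name `md5_of_string` is a stand-in for illustration; the project's own `getMD5_of_string` is not shown in this diff and may differ.

```python
import hashlib


def md5_of_string(text):
    """Return the hex MD5 digest of a string, encoded as UTF-8.

    Hypothetical stand-in for the test helper used in the diff.
    """
    return hashlib.md5(text.encode('utf-8')).hexdigest()


# Any change to the rendered text, even a single space, gives a new digest,
# which is why wording and spacing fixes require new expected md5 values.
before = md5_of_string('<p>option(s) indicating an FDA Approved</p>')
after = md5_of_string('<p>option(s) indicating an FDA-approved</p>')
print(before != after)
```

This makes the tests sensitive to every character of report text, so each wording fix in the commit message is paired with an md5 update.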
@@ -8,14 +8,10 @@
<!-- treatment -->
${html_builder.section_cells_begin("Treatment Options", True)}

<p>
Review identified <strong>${approved_total}</strong> option(s) indicating an FDA Approved and/or NCCN Recommended Biomarker
<strong>${investig_total}</strong> option(s) indicating investigational therapies,
and <strong>${prognostic_total}</strong> option(s) indicating NCCN-listed biomarkers.
</p>
<p>Review identified <strong>${approved_total}</strong> option(s) indicating an FDA-approved and/or NCCN-compendium listed treatment, <strong>${investig_total}</strong> option(s) indicating investigational therapies, and <strong>${prognostic_total}</strong> option(s) indicating NCCN-listed biomarkers.</p>

% if approved_total > 0:
<h3 class="header3">FDA Approved and/or NCCN Recommended Biomarker:</h3>
<h3 class="header3">FDA-approved and/or NCCN-recommended Biomarker:</h3>
<table class="variants" width="100%">
<thead style="background-color:white">
<tr>
@@ -82,4 +78,4 @@ <h3 class="header3">NCCN-listed Biomarker:</h3>
</table>
% endif

${html_builder.section_cells_end()}
${html_builder.section_cells_end()}
3 changes: 1 addition & 2 deletions src/lib/djerba/plugins/fusion/fusion_template.html
@@ -14,8 +14,7 @@

<p>
<strong>${results.get(fusion.TOTAL_VARIANTS)}</strong> cancer gene(s) were subject to rearrangement.
<strong>${results.get(fusion.CLINICALLY_RELEVANT_VARIANTS)}</strong> fusion(s) were oncogenic according to OncoKB.
,in addition to which <strong>${results.get(fusion.NCCN_RELEVANT_VARIANTS)}</strong> rearrangement(s) appeared in NCCN's biomarker compendium.
<strong>${results.get(fusion.CLINICALLY_RELEVANT_VARIANTS)}</strong> fusion(s) were oncogenic according to OncoKB and <strong>${results.get(fusion.NCCN_RELEVANT_VARIANTS)}</strong> rearrangement(s) appeared in the NCCN biomarker compendium.
</p>

% if results.get(fusion.CLINICALLY_RELEVANT_VARIANTS) > 0:
2 changes: 1 addition & 1 deletion src/lib/djerba/plugins/fusion/test/fusion.ini
@@ -2,7 +2,7 @@
archive_name = djerba
archive_url = http://admin:[email protected]:5984
attributes =
author = CGI Author
author = Test Author
configure_priority = 100
depends_configure =
depends_extract =
2 changes: 1 addition & 1 deletion src/lib/djerba/plugins/fusion/test/plugin_test.py
@@ -50,7 +50,7 @@ def test(self):
params = {
self.INI: self.INI_NAME,
self.JSON: self.JSON_NAME,
self.MD5: '02c816c95f5bdb147ab3ccb6bc484a2c'
self.MD5: '140f972adcb7c796128970df64edfba5'
}
self.run_basic_test(input_dir, params, 'fusion', logging.ERROR, work_dir)

6 changes: 6 additions & 0 deletions src/lib/djerba/plugins/genomic_landscape/constants.py
@@ -10,6 +10,12 @@
DONOR = 'donor'
MSI_FILE = 'msi_file'
CTDNA_FILE = 'ctdna_file'
SAMPLE_TYPE = 'sample_type'
UNKNOWN_SAMPLE_TYPE = 'Unknown sample type'

# biomarker reportability
CAN_REPORT_HRD = 'can_report_hrd'
CAN_REPORT_MSI = 'can_report_msi'

# For MSI file
MSI_RESULTS_SUFFIX = '.recalibrated.msi.booted'
@@ -10,20 +10,18 @@
${html_builder().section_cells_begin("<h2>Genomic Landscape</h2>","main")}

<p >
Tumour Mutation Burden (TMB) was <strong>${results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.TMB_PER_MB)}</strong> coding mutations per Mb (${html_builder().k_comma_format(results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.TMB_TOTAL))} mutations)
which corresponds to the ${gl_html_builder().make_ordinal(results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.PAN_CANCER_PERCENTILE))}
percentile of the pan-cancer cohort and classified it as <strong>${results.get(constants.BIOMARKERS).get(constants.TMB).get(constants.METRIC_TEXT)}</strong>.

Tumour Mutation Burden (TMB) was <strong>${results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.TMB_PER_MB)}</strong> coding mutations per Mb (${html_builder().k_comma_format(results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.TMB_TOTAL))} mutations) which corresponds to the ${html_builder().make_ordinal(results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.PAN_CANCER_PERCENTILE))} percentile of the pan-cancer cohort and classified it as <strong>${results.get(constants.BIOMARKERS).get(constants.TMB).get(constants.METRIC_TEXT)}</strong>.
% if results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.CANCER_SPECIFIC_PERCENTILE) != "NA" :
This TMB placed the tumour in the <strong>${gl_html_builder().make_ordinal(results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.CANCER_SPECIFIC_PERCENTILE))}</strong> percentile of the ${results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.CANCER_SPECIFIC_COHORT)} cohort.
This TMB placed the tumour in the <strong>${html_builder().make_ordinal(results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.CANCER_SPECIFIC_PERCENTILE))}</strong> percentile of the ${results.get(constants.GENOMIC_LANDSCAPE_INFO).get(constants.CANCER_SPECIFIC_COHORT)} cohort.
% endif

% if results.get(constants.PURITY) > 50:
% if results.get(constants.CAN_REPORT_MSI):
The microsatellite status was <strong>${results.get(constants.BIOMARKERS).get(constants.MSI).get(constants.METRIC_TEXT)}</strong>.
% endif
This tumour had <strong>${html_builder().k_comma_format(results.get(constants.CTDNA).get(constants.CTDNA_CANDIDATES))}</strong> candidate somatic SNVs genome-wide,
making the sample <strong>${results.get(constants.CTDNA).get(constants.CTDNA_ELIGIBILITY)}</strong> for OICR's plasma WGS cfDNA assay (minimum of 4,000 SNVs required).
This sample shows signatures consistent with <strong>${results.get(constants.BIOMARKERS).get(constants.HRD).get(constants.METRIC_TEXT)}</strong>.
This tumour had <strong>${html_builder().k_comma_format(results.get(constants.CTDNA).get(constants.CTDNA_CANDIDATES))}</strong> candidate somatic SNVs genome-wide, making the sample <strong>${results.get(constants.CTDNA).get(constants.CTDNA_ELIGIBILITY)}</strong> for OICR&apos;s plasma WGS cfDNA assay (minimum of 4,000 SNVs required).
% if results.get(constants.CAN_REPORT_HRD):
This sample shows signatures consistent with <strong>${results.get(constants.BIOMARKERS).get(constants.HRD).get(constants.METRIC_TEXT)}</strong>.
% endif
</p>

<!-- other biomarkers table -->
@@ -34,12 +32,10 @@
<th style="width:80%">Score & Confidence</th>
</thead>
<tbody>
% for row in gl_html_builder().biomarker_table_rows(results.get(constants.BIOMARKERS), results.get(constants.PURITY)):
% for row in gl_html_builder().biomarker_table_rows(results.get(constants.BIOMARKERS), results.get(constants.CAN_REPORT_HRD), results.get(constants.CAN_REPORT_MSI)):
${row}
% endfor
</tbody>


</table>


31 changes: 27 additions & 4 deletions src/lib/djerba/plugins/genomic_landscape/plugin.py
@@ -3,6 +3,7 @@
"""
import csv
import os
import re

import djerba.core.constants as core_constants
import djerba.plugins.genomic_landscape.constants as glc
@@ -44,7 +45,8 @@ def specify_params(self):
glc.PURITY_INPUT,
glc.MSI_FILE,
glc.CTDNA_FILE,
glc.HRDETECT_PATH
glc.HRDETECT_PATH,
glc.SAMPLE_TYPE
]
for key in discovered:
self.add_ini_discovered(key)
@@ -75,6 +77,7 @@ def configure(self, config):
dpi = core_constants.DEFAULT_PATH_INFO
oc = oncokb_constants.ONCOTREE_CODE
w = self.update_wrapper_if_null(w, ipf, glc.TCGA_CODE)
w = self.update_wrapper_if_null(w, ipf, glc.SAMPLE_TYPE, fallback=glc.UNKNOWN_SAMPLE_TYPE)
w = self.update_wrapper_if_null(w, ipf, oc, self.INPUT_PARAMS_ONCOTREE_CODE)
w = self.update_wrapper_if_null(w, ppf, purple_constants.PURITY)
w = self.update_wrapper_if_null(w, dsi, glc.TUMOUR_ID)
@@ -96,16 +99,36 @@ def extract(self, config):
plugin_dir = os.path.dirname(os.path.realpath(__file__))
r_script_dir = os.path.join(plugin_dir, 'Rscripts')

# Make a file where all the (actionable) biomarkers will go
# Make a file where all the (actionable) biomarkers will go, and initialize results
biomarkers_path = self.make_biomarkers_maf(work_dir)
results = tmb_processor(self.log_level, self.log_path).run(
work_dir, data_dir, r_script_dir, tcga_code, biomarkers_path, tumour_id
)
results[glc.PURITY] = wrapper.get_my_float(glc.PURITY_INPUT) * 100
# evaluate HRD and MSI reportability
purity = wrapper.get_my_float(glc.PURITY_INPUT)
sample_type = wrapper.get_my_string(glc.SAMPLE_TYPE)
sample_is_ffpe = False
if re.search('FFPE', sample_type.upper()):
sample_is_ffpe = True
self.logger.debug('FFPE sample detected')
elif sample_type == glc.UNKNOWN_SAMPLE_TYPE:
self.logger.warning("Unknown sample type in config; assuming non-FFPE sample")
else:
self.logger.debug('Non-FFPE sample detected')
if purity >= 0.5 or (purity >= 0.3 and not sample_is_ffpe):
results[glc.CAN_REPORT_HRD] = True
else:
results[glc.CAN_REPORT_HRD] = False
if purity >= 0.5:
results[glc.CAN_REPORT_MSI] = True
else:
results[glc.CAN_REPORT_MSI] = False
# evaluate biomarkers
results[glc.CTDNA] = ctdna_processor(self.log_level, self.log_path).run(wrapper.get_my_string(glc.CTDNA_FILE))
hrd = hrd_processor(self.log_level, self.log_path)
results[glc.BIOMARKERS][glc.HRD] = hrd.run(
work_dir, wrapper.get_my_string(glc.HRDETECT_PATH)
work_dir,
wrapper.get_my_string(glc.HRDETECT_PATH)
)
results[glc.BIOMARKERS][glc.MSI] = msi_processor(self.log_level, self.log_path).run(
work_dir,
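The reportability rules added to `extract` in this hunk can be read as a small pure function. The sketch below restates them outside the plugin, assuming purity is a fraction between 0 and 1 as in the diff. The function name `evaluate_reportability` and the module-level constant are illustrative only and do not appear in the commit.

```python
import re

# Mirrors glc.UNKNOWN_SAMPLE_TYPE from constants.py in this commit
UNKNOWN_SAMPLE_TYPE = 'Unknown sample type'


def evaluate_reportability(purity, sample_type=UNKNOWN_SAMPLE_TYPE):
    """Return (can_report_hrd, can_report_msi) for a purity fraction.

    Hypothetical helper restating the thresholds in the diff:
    HRD needs purity >= 0.5 for FFPE samples and >= 0.3 otherwise;
    MSI needs purity >= 0.5 regardless of sample type.
    """
    # Any sample type containing 'FFPE' (case-insensitive) counts as FFPE.
    # An unknown sample type is treated as non-FFPE, as in the diff.
    sample_is_ffpe = re.search('FFPE', sample_type.upper()) is not None
    can_report_hrd = purity >= 0.5 or (purity >= 0.3 and not sample_is_ffpe)
    can_report_msi = purity >= 0.5
    return can_report_hrd, can_report_msi


# Boundary values are inclusive, matching the greater-than-or-equal change
print(evaluate_reportability(0.5, 'FFPE'))
print(evaluate_reportability(0.3, 'Fresh frozen'))
```

One consequence worth noting: because an unknown sample type is assumed to be non-FFPE, a sample with purity between 0.3 and 0.5 and no configured `sample_type` would still have HRD reported; the plugin logs a warning in that case.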
38 changes: 15 additions & 23 deletions src/lib/djerba/plugins/genomic_landscape/render.py
@@ -11,35 +11,27 @@ def assemble_biomarker_plot(self,biomarker,plot):
cell = template.format(biomarker,plot)
return(cell)

def biomarker_table_rows(self, biomarkers, purity):
def biomarker_table_rows(self, biomarkers, can_report_hrd, can_report_msi):
rows = []
for marker, info in biomarkers.items():
cells = [
hb.td(info[constants.ALT]),
hb.td(info[constants.METRIC_ALTERATION]),
hb.td(self.assemble_biomarker_plot(info[constants.ALT], info[constants.METRIC_PLOT]))
]
if marker == "MSI" and purity < 50:
if marker == "HRD" and not can_report_hrd:
cells = [
hb.td(info[constants.ALT]),
hb.td("NA"),
hb.td("Cancer cell content &#8804; 50 &#37;, below threshold to call MS score")
hb.td("Cancer cell content below threshold to evaluate HRD; must be &#8805;50&#37; for FFPE samples, &#8805;30&#37; otherwise")
]
elif marker == "MSI" and not can_report_msi:
cells = [
hb.td(info[constants.ALT]),
hb.td("NA"),
hb.td("Cancer cell content below threshold to call MS score; must be &#8805;50&#37;")
]
else:
cells = [
hb.td(info[constants.ALT]),
hb.td(info[constants.METRIC_ALTERATION]),
hb.td(self.assemble_biomarker_plot(info[constants.ALT], info[constants.METRIC_PLOT]))
]
rows.append(hb.table_row(cells))
return rows

def make_ordinal(self,n):
'''
Convert an integer into its ordinal representation::
make_ordinal(0) => '0th'
make_ordinal(3) => '3rd'
make_ordinal(122) => '122nd'
make_ordinal(213) => '213th'
'''
n = int(n)
if 11 <= (n % 100) <= 13:
suffix = 'th'
else:
suffix = ['th', 'st', 'nd', 'rd', 'th'][min(n % 10, 4)]
return str(n) + suffix
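The new branching in `biomarker_table_rows` can be exercised in isolation. The following is a sketch only: `td` and `table_row` are simplified stand-ins for the project's `hb.td` and `hb.table_row` helpers, and the dictionary keys `alt`, `alteration` and `plot` are placeholders for the constants used in the real code.

```python
def td(content):
    # Hypothetical stand-in for hb.td
    return '<td>{0}</td>'.format(content)


def table_row(cells):
    # Hypothetical stand-in for hb.table_row
    return '<tr>{0}</tr>'.format(''.join(cells))


HRD_BELOW_THRESHOLD = ('Cancer cell content below threshold to evaluate HRD; '
                       'must be &#8805;50&#37; for FFPE samples, &#8805;30&#37; otherwise')
MSI_BELOW_THRESHOLD = ('Cancer cell content below threshold to call MS score; '
                       'must be &#8805;50&#37;')


def biomarker_table_rows(biomarkers, can_report_hrd, can_report_msi):
    """Build one table row per biomarker, masking unreportable HRD/MSI results."""
    rows = []
    for marker, info in biomarkers.items():
        if marker == 'HRD' and not can_report_hrd:
            cells = [td(info['alt']), td('NA'), td(HRD_BELOW_THRESHOLD)]
        elif marker == 'MSI' and not can_report_msi:
            cells = [td(info['alt']), td('NA'), td(MSI_BELOW_THRESHOLD)]
        else:
            cells = [td(info['alt']), td(info['alteration']), td(info['plot'])]
        rows.append(table_row(cells))
    return rows
```

Passing the two booleans instead of the raw purity keeps the threshold decision in `extract`, so the renderer only needs to know whether a value may be shown.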
@@ -1,4 +1,5 @@
[core]
author = Test Author

[genomic_landscape]
oncotree code=PAAD
@@ -7,4 +8,5 @@ purity=0.78
tumour_id=PLACEHOLDER
hrd_path = $DJERBA_TEST_DIR/plugins/hrd/hrdetect.signatures.json
msi_file=$DJERBA_TEST_DIR/plugins/genomic-landscape/PLACEHOLDER.filter.deduped.realigned.recalibrated.msi.booted
ctdna_file=$DJERBA_TEST_DIR/plugins/genomic-landscape/SNP.count.txt
ctdna_file=$DJERBA_TEST_DIR/plugins/genomic-landscape/SNP.count.txt
sample_type=FFPE
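The test INI now sets `sample_type=FFPE`, while the plugin falls back to `'Unknown sample type'` when the parameter is absent. As a rough analogue using only the standard library, the sketch below reads the option with `configparser` and a fallback. Djerba's own configuration handling (`update_wrapper_if_null`) is not shown in this diff, so this is an assumption about behaviour, not its implementation; `read_sample_type` is a hypothetical name.

```python
import configparser

# Trimmed-down version of the test INI in this commit
INI_TEXT = """
[core]
author = Test Author

[genomic_landscape]
purity=0.78
sample_type=FFPE
"""


def read_sample_type(ini_text, fallback='Unknown sample type'):
    """Return the sample_type option, or the fallback if it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get('genomic_landscape', 'sample_type', fallback=fallback)


print(read_sample_type(INI_TEXT))
```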
@@ -60,7 +60,7 @@ def testGenomicLandscapeLowTmbStableMsi(self):
params = {
self.INI: self.INI_NAME,
self.JSON: json_location,
self.MD5: 'b8483c476a1c17404ac81f9e3a439641'
self.MD5: '690242139aab0153348e7a4634d2f17c'
}
self.run_basic_test(input_dir, params)

@@ -9,8 +9,8 @@
Shallow whole genome sequencing (sWGS) libraries were prepared using the KAPA Hyper Prep kit with DNA extracted from FFPE, cfDNA, fresh frozen tissue (for tumour samples).
Paired-end sequencing was performed using the ${versions.TAR_ILLUMINA_VERSION} illumina technology to a minimum target coverage of 0.1x.
Alignments were performed using <a href=${versions.BWAMEM_LINK}>bwa mem</a> (${versions.BWAMEM_VERSION})
against reference genome <a href=${versions.REFERENCE_GENOME_LINK}>${versions.REFERENCE_GENOME_VERSION}</a>.
and copy number amplifications are called using <a href=${versions.ICHORCNA_LINK}>ichorCNA</a>.
against reference genome <a href=${versions.REFERENCE_GENOME_LINK}>${versions.REFERENCE_GENOME_VERSION}</a>.
Copy number amplifications were called using <a href=${versions.ICHORCNA_LINK}>ichorCNA</a>.

TAR libraries were prepared using the KAPA Hyper Prep kit with DNA extracted from FFPE, cfDNA, fresh frozen tissue (for tumour samples) or buffy coat blood specimens (for matched normal blood samples).
Paired-end sequencing was performed using the ${versions.TAR_ILLUMINA_VERSION} illumina technology.
@@ -14,16 +14,16 @@
and annotated with <a href=${versions.VARIANTEFFECTPREDICTOR_LINK}>VariantEffectPredictor</a> (v.${versions.VARIANTEFFECTPREDICTOR_VERSION})
using MANE transcripts (MANE Clinical version 1.0 when available, <a href=${versions.MANE_LINK}>MANE Select</a> version ${versions.MANE_VERSION} for all other transcripts).
Variants were further annotated for oncogenicity and actionability by <a href=${versions.ONCOKB_LINK}>OncoKB</a>.
In cases where OncoKB does not use MANE Select, links in annotation use the corresponding alteration in OncoKB.
In cases where OncoKB did not use MANE Select, links in annotation have used the corresponding alteration in OncoKB.
Copy number variations were called using <a href=${versions.PURPLE_LINK}>Purple</a> (${versions.PURPLE_VERSION}).
Microsatellite (MS) Instability status is called using <a href=${versions.MICROSATELLITE_LINK}>msisensor-pro</a> (${versions.MICROSATELLITE_VERSION}) and a custom list of MS sites created by msisensor-pro for the current reference genome.
Homologous recombination deficiency (HRD) status is called using HRDetect <a href="https://pubmed.ncbi.nlm.nih.gov/28288110/">(Davies et al. 2017)</a>, a weighted logistic regression model, using the signature.tools.lib R package <a href="https://pubmed.ncbi.nlm.nih.gov/32118208/">(Degasperi et al. 2020).</a>. HRDetect takes SNVs and in/dels from MuTect2. The proportion of deletions that are at microhomologous sites is summarized as "Microhomologous Deletions".
The counts of SNVs are categorized into exposures based on their trinucleotide context using
Microsatellite (MS) Instability status was called using <a href=${versions.MICROSATELLITE_LINK}>msisensor-pro</a> (${versions.MICROSATELLITE_VERSION}) and a custom list of ${versions.MICROSATELLITE_CUSTOM_SITES} MS sites created by msisensor-pro for the current reference genome.
Homologous recombination deficiency (HRD) status was called using HRDetect <a href="https://pubmed.ncbi.nlm.nih.gov/28288110/">(Davies et al. 2017)</a>, a weighted logistic regression model, using the signature.tools.lib R package <a href="https://pubmed.ncbi.nlm.nih.gov/32118208/">(Degasperi et al. 2020)</a>. HRDetect takes SNVs and in/dels from MuTect2. The proportion of deletions occurring at microhomologous sites has been summarized as "Microhomologous Deletions".
The counts of SNVs were categorized into exposures based on their trinucleotide context using
<a href="https://cran.r-project.org/web/packages/deconstructSigs/index.html">DeconstructSigs</a> (v. 1.8.0) and SBS signatures
as defined in <a href="https://cancer.sanger.ac.uk/signatures/downloads/">COSMIC version 1</a>.
HRDetect also takes in LOH and structural variants. Structural variants are first called by
HRDetect also takes in LOH and structural variants. Structural variants were first called by
<a href="https://github.com/PapenfussLab/gridss">GRIDSS</a> (v.2.13.2)
and then passed to PURPLE (v.3.8.1) for integrated LOH calling. Structural variants are then categorized into exposures based on break-end
and then passed to PURPLE (v.3.8.1) for integrated LOH calling. Structural variants were then categorized into exposures based on break-end
characteristics using <a href="https://github.com/Nik-Zainal-Group/signature.tools.lib">signature.tools.lib</a> (v. 2.1.2)
and the rearrangement signature set defined in <a href="https://www.nature.com/articles/nature17676">Nik-Zainal et al. (2016)</a>.
</p>
@@ -1,8 +1,8 @@
<p>
Based on a minimum tumour purity of 30%, the sensitivity for SNVs and in/dels is 96% and 89%, respectively.
The sensitivity for CNVs and RNA fusions is 100% and 32%, respectively.
The limit of detection is 10% VAF for SNVs and 20% for in/dels. The limit of detection for MSI is cellularity >50%.
For HRD, the sensitivity is 83% and the specificity is 90%. The lower limit of detection is >50% cellularity in FFPE samples and >30% cellularity in fresh frozen samples.
The limit of detection is 10% VAF for SNVs and 20% for in/dels. The limit of detection for MSI is cellularity &ge;50%.
For HRD, the sensitivity is 83% and the specificity is 90%. The lower limit of detection is &ge;50% cellularity in FFPE samples and &ge;30% cellularity in fresh frozen samples.
For LOH, the sensitivity and specificity are both 100%. LOH is currently reported for autosomes; LOH on the X chromosome is not reported.
Although whole genome sequencing encompasses all genes in a specimen, this report is restricted to cancer genes defined by OncoKB as of the date the report is issued.
This test was developed and its performance characteristics determined by OICR Genomics. It has not been cleared or approved by the US Food and Drug Administration.