2021.01 (#3065)
antgonza authored Jan 22, 2021
1 parent 30d7f5e commit e6e5d80
Showing 11 changed files with 1,913 additions and 32 deletions.
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,18 @@
# Qiita changelog


Version 2021.01
---------------

* Moved the qiita repo from biocore to [qiita-spots](https://github.com/qiita-spots/qiita/).
* Created the [Qiita portal for the Cancer Microbiome](https://qiita.ucsd.edu/cancer/).
* The EBI-ENA code now verifies that the sample information file has a description column; this wasn't previously required because it was automatically prefilled by the QIIME 1 mapping file.
* It is now possible to download the per-preparation sample information file and the sample-preparation summary.
* Added a faster metagenomic/metatranscriptomic adaptor and host removal step based on fastp and minimap2. The previous version, using atropos and bowtie2 for QC host filtering, is now deprecated.
* Added qiime2.2020.11 to the system, which updated these plugins: qp-qiime2, qtp-biom, qtp-diversity, qtp-visualization.
* Added [WoL](https://biocore.github.io/wol/) tree for phylogenetic analyses (/projects/wol/release/databases/qiime2/phylogeny.qza) with per-genome WoL artifacts.
* Fixed the following issues: [#3060](https://github.com/qiita-spots/qiita/issues/3060), [#3049](https://github.com/qiita-spots/qiita/issues/3049), and [#2751](https://github.com/qiita-spots/qiita/issues/2751).

Version 2020.11
---------------

2 changes: 1 addition & 1 deletion README.rst
@@ -1,7 +1,7 @@
Qiita (canonically pronounced *cheetah*)
========================================

|Build Status| |Coverage Status| |Gitter|
|Build Status| |Coverage Status|

Advances in sequencing, proteomics, transcriptomics and metabolomics are giving
us new insights into the microbial world and dramatically improving our ability
1,865 changes: 1,865 additions & 0 deletions logos/qiita_cancer.ai

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion qiita_core/__init__.py
@@ -6,4 +6,4 @@
# The full license is in the file LICENSE, distributed with this software.
# -----------------------------------------------------------------------------

__version__ = "2020.11"
__version__ = "2021.01"
2 changes: 1 addition & 1 deletion qiita_db/__init__.py
@@ -27,7 +27,7 @@
from . import user
from . import processing_job

__version__ = "2020.11"
__version__ = "2021.01"

__all__ = ["analysis", "artifact", "archive", "base", "commands",
"environment_manager", "exceptions", "investigation", "logger",
2 changes: 1 addition & 1 deletion qiita_pet/__init__.py
@@ -6,4 +6,4 @@
# The full license is in the file LICENSE, distributed with this software.
# -----------------------------------------------------------------------------

__version__ = "2020.11"
__version__ = "2021.01"
2 changes: 1 addition & 1 deletion qiita_pet/handlers/api_proxy/__init__.py
@@ -38,7 +38,7 @@
from .user import (user_jobs_get_req)
from .util import check_access, check_fp

__version__ = "2020.11"
__version__ = "2021.01"

__all__ = ['prep_template_summary_get_req', 'data_types_get_req',
'study_get_req', 'sample_template_filepaths_get_req',
@@ -25,9 +25,11 @@ gene data: sequence clustering and sequence deblur.
Sequencing deblur (preferred)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For this we use `deblur <https://github.com/biocore/deblur>`_. Here 2 BIOM tables are generated by default: final.biom and final.only-16s.biom. The former is the full BIOM table, which can be used with any target gene and wetlab work;
the latter is trimmed to those sequences that match Greengenes at 80% similarity, a very basic and naive filtering. Each of those BIOM tables is accompanied by a FASTA that contains
the representative sequences. The OTU IDs are given by the unique sequence.
For this we use `deblur <https://github.com/biocore/deblur>`_. Here 2 BIOM tables are generated by default:
`deblur final table` and `deblur reference hit table`. The former is the full BIOM table, which can be used with any
target gene and wetlab work; the latter is trimmed to those sequences that match Greengenes at 80% similarity, a
very basic and naive filtering. Each of those BIOM tables is accompanied by a FASTA that contains the representative sequences.
The OTU IDs are given by the unique sequence.

Note that deblur needs all sequences to be trimmed to the same length; thus the recommended pipeline is to trim everything to 150bp and then run deblur.
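
The fixed-length requirement can be illustrated with a minimal sketch. This is not the Qiita or deblur implementation; `trim_reads` and the tuple layout are hypothetical names chosen for illustration, and dropping reads shorter than the target length is an assumption of this sketch.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical in-memory read representation: (header, sequence, quality).
Read = Tuple[str, str, str]


def trim_reads(reads: Iterable[Read], length: int = 150) -> Iterator[Read]:
    """Yield reads cut to exactly `length` bases.

    Assumption of this sketch: reads shorter than `length` are dropped,
    so every read that survives has the same length.
    """
    for header, seq, qual in reads:
        if len(seq) < length:
            continue  # too short to be trimmed to the common length
        yield header, seq[:length], qual[:length]


reads = [
    ("r1", "ACGT" * 50, "I" * 200),  # 200 bp, will be cut to 150 bp
    ("r2", "ACGT" * 10, "I" * 40),   # 40 bp, will be dropped
]
trimmed = list(trim_reads(reads, length=150))
```

After this step every remaining read has the same length, which is the precondition mentioned above.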

@@ -49,25 +51,28 @@ Below you will find more information about each of these options.

The current workflow is as follows:

#. Removal of adapter sequence and quality control: `Atropos <https://github.com/jdidion/atropos/>`_
#. Removal of host contamination using `Bowtie2 <http://bowtie-bio.sourceforge.net/bowtie2/index.shtml>`_
#. A single per-sample step that performs adapter removal (via `fastp <https://academic.oup.com/bioinformatics/article/34/17/i884/5093234>`_) and host filtering (via `minimap2 <https://academic.oup.com/bioinformatics/article/34/18/3094/4994778>`_); more information below.
#. Taxonomy profiling using bowtie2 as an aligner and two different reference databases; see sections below

Note that we recommend only uploading sequences that have already been through QC and human sequence removal. However, we
recommend that all sequence files go through adapter and quality control within the system to ensure they are ready for
subsequent analyses. Currently, the command removes adaptor sequences (only KAPA HyperPlus with iTru, which are compatible
with Illumina TruSeq).

Sequences generated with an instrument that relies on two-color chemistry (NextSeq, NovaSeq), need to undergo an additional
quality control step. This step removes trailing G nucleotides which signify that the instrument has finished capturing new
information. Per Illumina's specification, NovaSeq instruments have 3 quality levels (11, 25 and 37), and
high-quality trailing Gs need to be removed. Typically this can be done in conjunction with adapter removal, with Atropos
we recommend using the `--nextseq-trim 30` parameter.

For host removal we currently support *Danio Rerio* (zebrafish), *Drosophila Melanogaster* (fruit fly), *Mus Musculus* (mouse),
*Rattus Norvegicus* (rat), and Enterobacteria phage phiX174 (the Illumina spike-in control).
recommend that all sequence files go through adapter and host filtering within the system to ensure they are ready for
subsequent meta-analyses. Currently, the `fastp` command is set to auto-detect adapters, so it can be used regardless of the
wetlab protocol. We provide the following host references for your convenience:

- auto-detect adapters and artifacts + phix filtering: This is a `deblur artifacts <https://github.com/biocore/deblur/blob/master/deblur/support_files/artifacts.fa>`_ reference, mainly for debugging and testing
- auto-detect adapters and cheetah + phix filtering
- auto-detect adapters and cow + phix filtering
- auto-detect adapters and hamster + phix filtering
- auto-detect adapters and horse + phix filtering
- auto-detect adapters and merge_genomes + phix filtering: the combined genomes of cheetah, cow, hamster, horse, human, mouse, pig, rabbit, and rat
- auto-detect adapters and mouse + phix filtering
- auto-detect adapters and pig + phix filtering
- auto-detect adapters and rabbit + phix filtering
- auto-detect adapters and rat + phix filtering
- auto-detect adapters only filtering [not recommended]
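
The options above combine adapter trimming with optional host filtering. A rough sketch of how such a single-step pipeline could be assembled for single-end reads is shown here. This is not the command Qiita runs; `build_qc_pipeline`, the file names, and the index path are hypothetical, and the flags used (`fastp -i/--stdout/-w`, `minimap2 -a -x sr -t`, `samtools fastq -f 4`) should be checked against the documentation of the installed tool versions.

```python
import shlex
from typing import List, Optional


def build_qc_pipeline(fastq: str, out_fastq: str,
                      host_index: Optional[str], threads: int = 4) -> str:
    """Return a shell pipeline string; nothing is executed here.

    host_index=None mirrors the "adapters only" option: reads are
    adapter-trimmed but no host filtering is applied.
    """
    # fastp auto-detects adapters for single-end input and streams
    # the passing reads to stdout.
    stages: List[List[str]] = [
        ["fastp", "-i", fastq, "--stdout", "-w", str(threads)]
    ]
    if host_index is not None:
        # Align short reads against the host reference, emitting SAM.
        stages.append(["minimap2", "-a", "-x", "sr", "-t", str(threads),
                       host_index, "-"])
        # Keep only unmapped reads (SAM flag 4), i.e. non-host reads.
        stages.append(["samtools", "fastq", "-f", "4", "-"])
    pipeline = " | ".join(
        " ".join(shlex.quote(arg) for arg in stage) for stage in stages
    )
    return f"{pipeline} > {shlex.quote(out_fastq)}"


with_host = build_qc_pipeline("sample.fastq", "clean.fastq", "host.mmi", threads=2)
adapters_only = build_qc_pipeline("sample.fastq", "clean.fastq", None, threads=2)
```

Streaming the three tools through pipes avoids writing intermediate files, which is one plausible reason a combined step is faster than separate QC and host-removal commands.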

Note that the command produces up to 6 output artifacts based on the aligner and database selected:

- Alignment Profile: contains the raw alignment file and the no rank classification BIOM table
- Taxonomic Prediction - phylum: contains the phylum level taxonomic predictions BIOM table
- Taxonomic Prediction - genus: contains the genus level taxonomic predictions BIOM table
Expand Down Expand Up @@ -186,19 +191,18 @@ Note that some of these are legacy option but not available for new processing.
Metatranscriptome processing
----------------------------

Qiita currently has one active Metatranscriptome data analysis pipeline, as follows:

#. Ribosomal read filtering via `SortMeRNA <https://pubmed.ncbi.nlm.nih.gov/23071270/>`_; details below. This produces a `Ribosomal reads` and a `Non-ribosomal reads` artifact.
#. Taxonomic profiling via Woltka; for more information see details above.

Sample processing guidelines for metatranscriptomic data
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Total community RNA extracted from samples contains both coding and non-coding RNA. Typically, ribosomal RNA makes up
>90% of the library if not depleted prior to library construction. Ribosomal depletion allows for mRNA enrichment. Even if
you are dealing with ribosomal RNA subtracted cDNA libraries, there will be some
residual ribosomal RNA in the libraries that you want to remove/separate from the non-ribosomal RNA sequences.

Ribosomal read filtering
^^^^^^^^^^^^^^^^^^^^^^^^

`SortMeRNA <https://bioinfo.lifl.fr/RNA/sortmerna/>`_
is used for removal of ribosomal reads from quality filtered Metatranscriptome data
`SortMeRNA <https://pubmed.ncbi.nlm.nih.gov/23071270/>`_ is used for removal of ribosomal reads from quality-filtered Metatranscriptome data.

Latest SortMeRNA version: v2.1

2 changes: 1 addition & 1 deletion qiita_ware/__init__.py
@@ -6,4 +6,4 @@
# The full license is in the file LICENSE, distributed with this software.
# -----------------------------------------------------------------------------

__version__ = "2020.11"
__version__ = "2021.01"
2 changes: 1 addition & 1 deletion scripts/qiita-auto-processing
@@ -46,7 +46,7 @@ full_pipelines = [
'steps': [
{'previous-step': None,
'plugin': 'qp-meta',
'version': '2020.11',
'version': '2021.01',
'cmd_name': 'Atropos v1.1.24',
'input_name': 'input',
'ignore_parameters': ['Number of threads used'],
2 changes: 1 addition & 1 deletion setup.py
@@ -10,7 +10,7 @@
from setuptools import setup
from glob import glob

__version__ = "2020.11"
__version__ = "2021.01"


classes = """

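This commit bumps the same `__version__` string in several files. A small consistency check along the following lines could catch a file that was missed. It is only a sketch and not part of the repository; `find_versions` and the inline file contents are hypothetical stand-ins for reading the real files.

```python
import re
from typing import Dict

# Matches lines such as: __version__ = "2021.01"
VERSION_RE = re.compile(r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE)


def find_versions(sources: Dict[str, str]) -> Dict[str, str]:
    """Map each file name to the __version__ string found in its text."""
    versions = {}
    for name, text in sources.items():
        match = VERSION_RE.search(text)
        if match:
            versions[name] = match.group(1)
    return versions


# Hypothetical file contents; in practice these would be read from disk.
sources = {
    "qiita_core/__init__.py": '__version__ = "2021.01"\n',
    "qiita_db/__init__.py": 'from . import user\n\n__version__ = "2021.01"\n',
    "setup.py": 'from glob import glob\n\n__version__ = "2020.11"\n',  # deliberately stale
}
versions = find_versions(sources)
stale = {name: v for name, v in versions.items() if v != "2021.01"}
```

Here `stale` lists every file whose version differs from the release being tagged, so an empty dictionary means the bump was applied everywhere.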