Protocol specific setup
The new release of zUMIs2 has made feature extraction from sequencing files much more flexible. Because of this, we could remove the protocol-specific processing modes necessary for some more complicated sequencing setups.
zUMIs is also compatible with combinatorial indexing protocols, such as sci-RNA-seq (Cao et al., 2018) and SPLiT-seq (Rosenberg et al., 2018).
In the previous version of zUMIs, a preprocessing step was necessary to accommodate the library structure of these protocols. This is no longer necessary!
Here is an example for sci-Seq:
project: SciSeq
sequence_files:
  file1:
    name: i7.fastq.gz
    base_definition:
      - BC(1-8)
  file2:
    name: i5.fastq.gz
    base_definition:
      - BC(1-8)
  file3:
    name: R1.fastq.gz
    base_definition:
      - UMI(1-6)
      - BC(7-16)
  file4:
    name: R2.fastq.gz
    base_definition:
      - cDNA(1-50)
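
To make the SciSeq layout above more concrete, here is a minimal Python sketch of how such base definitions could be turned into barcode, UMI and cDNA strings. It is not zUMIs code: `parse_definition` and `extract_features` are hypothetical helpers, the reads are made-up toy sequences, and it assumes that ranges are 1-based and inclusive and that barcode pieces from several files are concatenated in file order.

```python
import re

# Matches one definition such as "BC(1-8)" or "BC(11-18,49-56,87-94)".
DEFINITION = re.compile(r"^(BC|UMI|cDNA)\(([\d,\-]+)\)$")


def parse_definition(definition):
    """Return (feature, [(start, end), ...]) with 1-based inclusive ranges."""
    match = DEFINITION.match(definition.strip())
    if match is None:
        raise ValueError(f"unrecognised base definition: {definition!r}")
    feature, spec = match.groups()
    ranges = []
    for part in spec.split(","):
        start, end = (int(x) for x in part.split("-"))
        ranges.append((start, end))
    return feature, ranges


def extract_features(reads, layout):
    """Slice every read and concatenate the pieces per feature.

    Assumption: pieces are joined in the order the files are listed.
    """
    features = {}
    for file_key, definitions in layout.items():  # dicts keep insertion order
        sequence = reads[file_key]
        for definition in definitions:
            feature, ranges = parse_definition(definition)
            piece = "".join(sequence[start - 1:end] for start, end in ranges)
            features[feature] = features.get(feature, "") + piece
    return features


# Layout copied from the SciSeq example; the reads are toy sequences.
sci_layout = {
    "file1": ["BC(1-8)"],
    "file2": ["BC(1-8)"],
    "file3": ["UMI(1-6)", "BC(7-16)"],
    "file4": ["cDNA(1-50)"],
}
toy_reads = {
    "file1": "AAAAAAAA",
    "file2": "CCCCCCCC",
    "file3": "GGGGGG" + "TTTTTTTTTT",
    "file4": "ACGT" * 13,
}
result = extract_features(toy_reads, sci_layout)
print(len(result["BC"]), len(result["UMI"]), len(result["cDNA"]))
```

With these toy reads, the combined barcode is the 8 i7 bases, the 8 i5 bases and the 10 bases from R1, which is why no separate preprocessing step is needed to merge the index reads.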
Here is an example for SPLiT-seq:
project: SplitSeq
sequence_files:
  file1:
    name: R1.fastq.gz
    base_definition:
      - cDNA(1-50)
  file2:
    name: R2.fastq.gz
    base_definition:
      - UMI(1-10)
      - BC(11-18,49-56,87-94)
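
A quick way to sanity-check a multi-range definition like the one above is to count the bases it covers, for instance to compare against the expected barcode length. The sketch below uses a hypothetical helper, `feature_length`, and assumes the ranges are 1-based and inclusive.

```python
import re


def feature_length(definition):
    """Count the bases covered by a definition like 'BC(11-18,49-56,87-94)'."""
    spec = re.search(r"\(([^)]*)\)", definition).group(1)
    total = 0
    for part in spec.split(","):
        start, end = (int(x) for x in part.split("-"))
        total += end - start + 1  # ranges are 1-based and inclusive
    return total


print(feature_length("BC(11-18,49-56,87-94)"))  # 24: three 8-base blocks
print(feature_length("UMI(1-10)"))  # 10
```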
Illumina/BioRad ddSeq data can be processed with zUMIs from version 2.2.0 onwards. In this protocol, the cell barcode is composed of three blocks that may be shifted within the read. zUMIs detects the linker sequence and uses it to account for the phase shift. When setting up base definitions, give the linker sequence and the base ranges for the "unshifted" case.
Follow this example for ddSeq data:
sequence_files:
  file1:
    name: BCUMIread_R1.fastq.gz
    base_definition:
      - BC(1-6,21-26,41-46)
      - UMI(50-57)
    correct_frameshift: TAGCCATCGCATTGC
  file2:
    name: cDNAread_R2.fastq.gz
    base_definition: cDNA(1-75)
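
The idea behind linker-based phase correction can be sketched in a few lines of Python. This is an illustration only, not the zUMIs implementation: `correct_frameshift` and `slice_ranges` are hypothetical helpers, the linker is matched exactly (how zUMIs actually matches it is not described here), and the layout is a toy one with two barcode blocks around a single linker, not the real ddSeq coordinates.

```python
def correct_frameshift(read, linker, expected_start, ranges):
    """Shift 1-based inclusive ranges by the observed linker offset.

    expected_start is the 1-based linker position in the unshifted case
    (a parameter of this sketch, not a zUMIs option).
    """
    found = read.find(linker)  # 0-based index, -1 if the linker is absent
    if found == -1:
        return None  # no exact linker match; this sketch gives up here
    shift = (found + 1) - expected_start
    return [(start + shift, end + shift) for start, end in ranges]


def slice_ranges(read, ranges):
    """Concatenate the bases covered by 1-based inclusive ranges."""
    return "".join(read[start - 1:end] for start, end in ranges)


LINKER = "TAGCCATCGCATTGC"
toy_ranges = [(1, 6), (22, 27)]  # toy layout: block, 15-base linker, block
unshifted = "AACCGG" + LINKER + "TTGGCC"
shifted = "GT" + unshifted  # two extra bases in front of the first block

for read in (unshifted, shifted):
    adjusted = correct_frameshift(read, LINKER, 7, toy_ranges)
    print(adjusted, slice_ranges(read, adjusted))
```

Both toy reads yield the same barcode, because the ranges given for the unshifted case are moved by however far the linker sits from its expected position.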
InDrops data can be processed by zUMIs when generated by the v2 or v3 protocols. The InDrops-specific mode has been removed because zUMIs2 can handle the data directly.
Here are examples for InDrops:
project: InDropsV2
sequence_files:
  file1:
    name: cdnaread.R1.fastq.gz
    base_definition:
      - cDNA(1-36)
  file3:
    name: librarybc.R2.fastq.gz
    base_definition:
      - BC(1-6)
  file4:
    name: barcodeUMIread.R3.fastq.gz
    base_definition:
      - BC(1-8,31-38)
      - UMI(39-44)
    correct_frameshift: AAGGCGTCACAAGCAATCACTC

project: InDropsV3
sequence_files:
  file1:
    name: cdnaread.R1.fastq
    base_definition:
      - cDNA(1-50)
  file2:
    name: barcode1read.R2.fastq
    base_definition:
      - BC(1-8)
  file3:
    name: librarybc.R3.fastq
    base_definition:
      - BC(1-6)
  file4:
    name: barcode2UMIread.R4.fastq
    base_definition:
      - BC(1-8)
      - UMI(9-14)
zUMIs can process STRT and STRT-2i data. The STRT-specific mode and options have been removed in zUMIs2.
Here is an example for STRT-2i:
project: STRT
sequence_files:
  file1:
    name: umicdnaread.R1.fastq
    base_definition:
      - UMI(1-6)
      - cDNA(9-50)
  file2:
    name: barcode1read.R2.fastq
    base_definition:
      - BC(1-8)
  file3:
    name: barcode1read.R3.fastq
    base_definition:
      - BC(1-8)