- Support dual UMI indexes with bamtag.
- Add support for dual UMI indexes. Thanks @lbeltrame!
- Ensure headers are not written when writing out a Series, to make us compatible with pandas > 0.24.
- Fix deprecated `.ix` call; `.loc` is the replacement. Thanks to @naumenko-sa.
- Fix for the python3 fix.
- Fix for cb_filter with python3.
- Enable cb_histogram to be used on samples without UMIs.
- Enable filtering of cells during `demultiplex_cells`.
- Fix incorrect pandas.read_csv call with header=-1.
- Python 3 support
- Add `demultiplex_cells` subcommand to break a transformed FASTQ file into separate FASTQ files by cell.
- Future proofing for changes to pandas' `to_csv` function.
- Add support for click 7.0.
- Fix for min-length filtering with paired samples. Previously only one read was required to be longer; the fix requires both.
- Fix tests for fastqtagcount to use indexed BAM files.
- Support gzipped cellular barcode files.
- Support 10x V2 barcoding scheme. Thanks to @tomasgomes for the fix.
- Re-enable streaming for cellular barcode filtering.
- Add `--umi_matrix` option to fasttagcount. This outputs a non-umi-deduped matrix of counts, useful for QC.
- Support gzipped files for `sb_filter`, `mb_filter` and `add_uid`.
- Fix `fasttagcount` off-by-one issue.
- Add `version` subcommand.
- Fix missing pandas import in `sparse` subcommand.
- Fix for kallisto output failing due to defaultdict not being imported. Thanks to @andreas-wilm for the fix.
- Added `tagcount` option `--parse_tags` to use BAM tags rather than parsing read names (`UM` for UMI, `CR` for cell barcode).
- Added `tagcount` option `--gene_tags` to use BAM tags to get ID of mapping gene (`GX` tag).
- Fix tagcount with `--genemap` option not including a column name for the index.
- Add `sparse` subcommand to turn a matrix into a sparse matrix.
- Add `fasttagcount` subcommand. This assumes the input BAM/SAM file is coordinate sorted. Reduces memory usage by over 100x and runtime by 30-40% for deep samples.
- Warn, don't fail if transcripts are missing from the genemap.
- Fix skipping first piece of evidence when tagcounting.
- Add test for tagcount.
- Output full sorted transcript table from tagcount rather than only the observed transcripts.
- Add `--sparse` option to output tagcount matrices in MatrixMarket format.
- Allow cb_histogram subcommand to take gzipped files.
- Allow cb_filter subcommand to take gzipped files.
- Add support for triple-cellular barcodes.
- Add example for Illumina SureCell (https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/surecell-wta-ddseq.html)
- Fix automatic format detection in cb_histogram.
- Add tests for cb_histogram.
- Re-enable streaming bamtagging. Thanks to @chapmanb for the suggestion.
- Add subset_bamfile to subset a BAM file to keep alignments with a given set of cellular barcodes.
- Speed improvements for reading gzipped FASTQ files.
- Memory usage improvements for tagcount.
- Fix for handling unicode. Thanks to @chapmanb and @sowmyaiyer.
- Adds support for adding BAM tags to aligned fastqtransformed files. Thanks to @chapmanb.
- Adds support for UMI-only fastqtransformation.
- Adds support for paired-end target sequences.
- Adds support for detecting sample barcodes via the SB tag in the regex.
- Adds support for sample-based demultiplexing with error correction.
- Now supports transforming 3-file input, as from the Linnarsson lab STRT-Seq data.
- New kallisto subcommand formats read files for input to kallisto's UMI mode.
- Fix gzip based fastq reading on Python 3.5.
- Include preliminary subcommand for guessing cell cutoff from cb_histogram.
- Added MANIFEST file; its absence broke pip installation.
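
The min-length fix for paired samples noted above can be illustrated with a minimal sketch. This is not the project's code: the function name `filter_pairs_by_min_length`, the tuple representation of a read pair, and the use of `>=` as the length comparison are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical representation: (read 1 sequence, read 2 sequence).
ReadPair = Tuple[str, str]


def filter_pairs_by_min_length(
    pairs: Iterable[ReadPair], min_length: int
) -> Iterator[ReadPair]:
    """Yield only the pairs in which BOTH reads meet the minimum length."""
    for read1, read2 in pairs:
        # Replacing `and` with `or` would keep a pair when only one read
        # is long enough, which is the behaviour the changelog entry
        # describes as the earlier bug.
        if len(read1) >= min_length and len(read2) >= min_length:
            yield read1, read2


pairs = [
    ("ACGTACGT", "ACGTACGT"),  # both long enough: kept
    ("ACGTACGT", "ACG"),       # read 2 too short: dropped
    ("AC", "A"),               # both too short: dropped
]
kept = list(filter_pairs_by_min_length(pairs, min_length=5))
```

Here `kept` contains only the first pair, since it is the only one in which both reads reach the assumed threshold of 5 bases.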