Consensus sequences from multiple alignments

seq_consensus is a simple Python 3 library for calculating consensus sequences from multiple sequence alignments, including alignments that contain ambiguous letters. It uses NumPy under the hood and currently supports DNA/RNA sequences.

The package also provides a small utility (cons_tool) for calculating consensus sequences on the command line.

How is the consensus calculated?

The method is identical to the approach used by Geneious and very similar to the ConsensusSequence function from the DECIPHER R package (the options differ slightly). The API documentation describes the method in more detail.
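To give a feel for threshold-based consensus calling, here is a minimal pure-Python sketch. It is not the seq_consensus implementation and may differ from it in details. It assumes that an ambiguous letter contributes an equal fraction to each base it represents, that gaps are counted like any other symbol, that ties are broken alphabetically, and that a column needing both a gap and bases to reach the threshold is reported as '?'. The names IUPAC, CODE_FOR, column_consensus and sketch_consensus are made up for this illustration.

```python
from collections import defaultdict

# IUPAC nucleotide codes and the bases they stand for (U is treated as T)
IUPAC = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T', 'U': 'T',
    'R': 'AG', 'Y': 'CT', 'S': 'CG', 'W': 'AT', 'K': 'GT', 'M': 'AC',
    'B': 'CGT', 'D': 'AGT', 'H': 'ACT', 'V': 'ACG', 'N': 'ACGT',
}
# Reverse lookup: set of bases -> single (possibly ambiguous) letter
CODE_FOR = {frozenset(b): code for code, b in IUPAC.items() if code != 'U'}


def column_consensus(column, threshold):
    """Consensus letter for one alignment column (illustrative only)."""
    weights = defaultdict(float)
    for letter in column:
        if letter == '-':
            weights['-'] += 1.0
        else:
            bases = IUPAC[letter.upper()]
            for base in bases:
                # assumption: ambiguity is split evenly among its bases
                weights[base] += 1.0 / len(bases)
    total = sum(weights.values())
    chosen, cumulative = set(), 0.0
    # add symbols from most to least frequent until the threshold is met
    for symbol, weight in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])):
        chosen.add(symbol)
        cumulative += weight
        if cumulative >= threshold * total - 1e-9:  # tolerate float rounding
            break
    if chosen == {'-'}:
        return '-'
    if '-' in chosen:
        return '?'  # assumption of this sketch: mixed gap/base column
    return CODE_FOR[frozenset(chosen)]


def sketch_consensus(seqs, threshold):
    """Consensus of equally long, aligned sequences, column by column."""
    return ''.join(column_consensus(col, threshold) for col in zip(*seqs))


print(sketch_consensus(['ATTGC', 'AT-CC', 'RT-C-'], threshold=0.6))
```

Under these assumptions the sketch prints AT-CC: in the third column, for instance, gaps make up two thirds of the letters, which exceeds the threshold of 0.6, so a gap is reported.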

Documentation

The complete user guide is found here and the API is documented here. Below are some small examples for demonstration:

Usage example

from seq_consensus import consensus

seqs = [
    'ATTGC',
    'AT-CC',
    'RT-C-'
]

consensus(seqs, threshold=0.6)

This returns:

'AT-CC'

Command-line tool example

The cons_tool script provides the same functionality on the command line. An especially useful feature is the ability to group sequences by an arbitrary regular expression pattern matched in the sequence headers:

cons_tool -k 'p:\w+' input.fasta

Example output (given that taxonomic annotations are present in the headers):

>p:Evosea consensus (n=124)
TACKATTTA--RTATTGAC-?TWA?-GKTACTAAAGCATGGGKA-T?AAA?AGGATTAGAGACCCTYGTA
>p:Chordata consensus (n=7065)
TWAYTTTA?--WAW-YWAY-YTGAA-YCCACGAAAGCTAAGAMA-CAAACTGGGATTAGATACCCCACTA
>p:Mollusca consensus (n=843)
TWAWTWTAW--WAW?WWAY-TTGAA-KYYAYGAAAKCTWRGRWA-YAAACTAGGATTAGATACCCTAYTA
>p:Chordata consensus (n=8509)
TWAYTTTA?--WAW-YMAC-TTGAA-CCCACGAAAGCTARGAMA-CAAACTGGGATTAGATACCCCACTA
>p:Platyhelminthes_ consensus (n=130)
TWAWTWTAA--WDW?TKWY-YTGAA-KYYACGAAAGYTAKGWTA-YAAACTGGGATTAGATACCCCATTA
>p:Ascomycota consensus (n=280)
TTAWTWTAA--WAA?TDAC-TTGAR-K??ACGAAAGCTWRGRWA-CAAACTAGGATTAGATACCCYABTA
>p:Streptophyta consensus (n=269)
TWAWTWTAW--WAW?TRAY-TTGAR-KY?ACGAAAGCTTRGRKA-CAAACTAGGATTAGATACCCTAKTA
(...)
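The grouping idea can be illustrated in plain Python. The following is only a sketch of grouping by a regular expression match, not the code behind cons_tool. The helper group_by_header and the example headers are hypothetical, and skipping headers without a match is an assumption of this sketch.

```python
import re
from collections import defaultdict


def group_by_header(records, pattern):
    """Group sequences by the text that `pattern` matches in each header."""
    regex = re.compile(pattern)
    groups = defaultdict(list)
    for header, seq in records:
        match = regex.search(header)
        if match:  # assumption: headers without a match are skipped
            groups[match.group(0)].append(seq)
    return dict(groups)


# Hypothetical (header, sequence) pairs with taxonomic annotations
records = [
    ('seq1 k:Eukaryota,p:Chordata', 'ATTGC'),
    ('seq2 k:Eukaryota,p:Chordata', 'AT-CC'),
    ('seq3 k:Eukaryota,p:Mollusca', 'RT-C-'),
    ('seq4 no annotation', 'ATTCC'),
]

groups = group_by_header(records, r'p:\w+')
for key, seqs in groups.items():
    print(f'{key}: n={len(seqs)}')
```

Each group of aligned sequences could then be passed separately to consensus() from the usage example.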
