Merge pull request #9 from maxibor/dev
Version 1.0.0-beta
maxibor authored Apr 21, 2022
2 parents 41afa20 + fa8184f commit edaa34a
Showing 63 changed files with 2,529 additions and 879 deletions.
38 changes: 19 additions & 19 deletions .github/workflows/publish_pypi.yml
@@ -1,25 +1,25 @@
name: publish-pypi

on:
on:
release:
types: [published, edited]

jobs:
build-and-publish:
name: Build and publish Python 🐍 distributions 📦 to PyPI and TestPyPI
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v1
- name: Setup Python 3.7
uses: actions/setup-python@v1
with:
python-version: 3.7
- name: Build sam2lca
run: |
pip install wheel
python setup.py sdist bdist_wheel
- name: Publish distribution 📦 to PyPI
uses: pypa/gh-action-pypi-publish@master
with:
password: ${{ secrets.PYPI_TOKEN }}

build-and-publish:
name: Build and publish Python 🐍 distributions 📦 to PyPI and TestPyPI
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v3
- name: Setup Python
uses: actions/setup-python@v3
with:
python-version: "3.9"
- name: Build sam2lca
run: |
pip install wheel
python setup.py sdist bdist_wheel
- name: Publish a Python distribution to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
user: __token__
password: ${{ secrets.PYPI_TOKEN }}
20 changes: 5 additions & 15 deletions .github/workflows/sam2lca_ci.yml
@@ -2,17 +2,16 @@ name: sam2lca-CI

on: [push, pull_request]


jobs:
sam2lca_ci:
name: sam2lca_ci
runs-on: 'ubuntu-latest'
runs-on: "ubuntu-latest"
if: "!contains(github.event.head_commit.message, '[skip_ci]')"
steps:
- uses: actions/checkout@v2
- uses: conda-incubator/setup-miniconda@v2
with:
python-version: 3.7
python-version: 3.9
mamba-version: "*"
channels: conda-forge,bioconda,defaults
channel-priority: true
@@ -26,15 +25,6 @@ jobs:
- name: Test with pytest
shell: bash -l {0}
run: |
pip install -e .
pip install pytest
# pytest
- name: Check sam2lca help message
shell: bash -l {0}
run: |
sam2lca --help
- name: Check sam2lca on test data
shell: bash -l {0}
run: |
sam2lca -m test update-db
sam2lca -m test analyze -b -t tests/data/taxonomy/test.tree tests/data/aligned.sorted.bam
pip install .
pip install pytest pytest-console-scripts
pytest -s -vv --script-launch-mode=subprocess
5 changes: 4 additions & 1 deletion .gitignore
@@ -106,4 +106,7 @@ venv.bak/
.nextflow
.vscode/
mappings
aligned.sorted.sam2lca.*
aligned.sorted.sam2lca.*
sam2lca_test_dbdir
sam2lca.egg-info
.DS_Store
40 changes: 40 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,40 @@
# Changelog

All notable changes to [sam2lca](https://github.com/maxibor/sam2lca) will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 1.0.0beta

### Added

- Added sam2lca tutorial
- Add Custom acc2tax with json
- Total number of reads is computed early on to provide progress bar
- unit and integration testing with PyTest
- Total Descendant read counts for each taxon
- GTDB taxonomy and acc2tax added
- 18s acc2tax added
- XN and XR flag in bam output for, respectively, Taxon name and rank
- Add edit distance threshold filtering
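
The `XN` (taxon name) and `XR` (rank) tags listed above can be read back from an annotated BAM file. The following is a minimal sketch, not sam2lca's own code: `count_taxa` is a hypothetical helper, and it only assumes that each read exposes `has_tag` and `get_tag`, as `pysam.AlignedSegment` does.

```python
from collections import Counter
from typing import Iterable


def count_taxa(reads: Iterable) -> Counter:
    """Count reads per (taxon name, rank) pair using the XN and XR tags.

    `reads` is any iterable of objects exposing `has_tag` and `get_tag`,
    such as the records yielded when iterating a `pysam.AlignmentFile`.
    """
    counts: Counter = Counter()
    for read in reads:
        # Skip reads that carry no taxon annotation.
        if read.has_tag("XN") and read.has_tag("XR"):
            counts[(read.get_tag("XN"), read.get_tag("XR"))] += 1
    return counts
```

With pysam installed, the reads could come from something like `pysam.AlignmentFile(path, "rb")`, where `path` points at the BAM file written by `sam2lca analyze` (the exact output file name is not stated here and is left as an assumption).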

### Changed

- Code refactoring for speedup, making use of multithreading on shared dictionaries
- Improve logging and replace print statements with logging.info
- ete3 has been replaced by taxopy
- Unclassified TAXID is now `12908` by default
- RocksDB params changed
- TAXID of LCA is only attributed to alignment segments passing threshold
- Identity threshold is now a range selection in CLI.
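
To illustrate what an identity range selection means in practice, here is a small sketch. It assumes that identity is derived from the edit distance as `1 - edit_distance / aligned_length`; that definition, the function names and the default bounds are illustrative assumptions, not taken from the sam2lca source.

```python
def alignment_identity(edit_distance: int, aligned_length: int) -> float:
    """Identity of an alignment, assumed here to be 1 - edit_distance / length."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 1 - edit_distance / aligned_length


def passes_identity_range(
    identity: float,
    min_identity: float = 0.8,  # hypothetical default lower bound
    max_identity: float = 1.0,  # hypothetical default upper bound
) -> bool:
    """Keep an alignment only if its identity lies within the closed range."""
    return min_identity <= identity <= max_identity


if __name__ == "__main__":
    # 5 mismatches over 100 aligned bases: kept with the bounds above.
    print(passes_identity_range(alignment_identity(5, 100)))
    # 30 mismatches over 100 aligned bases: discarded.
    print(passes_identity_range(alignment_identity(30, 100)))
```

A range, rather than a single minimum, also lets a caller exclude perfect matches, for example by setting `max_identity` below 1.0.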

### Dependencies

- ete3 removed
- ordered-set removed
- pytest-console-scripts 1.3.1
- scipy
- Jinja2 pinned to version 3.1 (see [RTD issue](https://github.com/readthedocs/readthedocs.org/issues/9038))

### Removed
46 changes: 32 additions & 14 deletions README.md
@@ -1,11 +1,18 @@
![sam2lca-CI](https://github.com/maxibor/sam2lca/workflows/sam2lca-CI/badge.svg) [![Documentation Status](https://readthedocs.org/projects/sam2lca/badge/?version=latest)](https://sam2lca.readthedocs.io/en/latest/?badge=latest)
![](docs/img/sam2lca_logo_text.png)

<p align="center">
<a href="https://github.com/maxibor/sam2lca/actions"><img src="https://github.com/maxibor/sam2lca/workflows/sam2lca-CI/badge.svg"/></a>
<a href="https://sam2lca.readthedocs.io"><img src="https://readthedocs.org/projects/sam2lca/badge/?version=latest"/></a>
<a href="https://pypi.org/project/sam2lca"><img src="https://img.shields.io/badge/install%20with-pip-blue"/></a>
</p>

# sam2lca

[Lowest Common Ancestor](https://en.wikipedia.org/wiki/Lowest_common_ancestor) from a SAM/BAM/CRAM sequence alignment file
[Lowest Common Ancestor](https://en.wikipedia.org/wiki/Lowest_common_ancestor) from a [SAM/BAM/CRAM](<https://en.wikipedia.org/wiki/SAM_(file_format)>) sequence alignment file.
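
As a rough illustration of the Lowest Common Ancestor idea, the sketch below finds the deepest taxon shared by the lineages of several TAXIDs. It is a toy example under stated assumptions, not sam2lca's implementation: the `PARENT` table is a hand-written, heavily truncated taxonomy, and `lineage` and `lowest_common_ancestor` are hypothetical helper names.

```python
from typing import Dict, List, Sequence

# Toy taxonomy for illustration only: child TAXID -> parent TAXID.
# Intermediate ranks are omitted; the root is its own parent.
PARENT: Dict[int, int] = {
    1: 1,      # root
    2: 1,      # Bacteria
    561: 2,    # Escherichia
    562: 561,  # Escherichia coli
    564: 561,  # Escherichia fergusonii
    1279: 2,   # Staphylococcus
}


def lineage(taxid: int, parent: Dict[int, int]) -> List[int]:
    """Return the path from the root down to `taxid`."""
    path = [taxid]
    while parent[path[-1]] != path[-1]:
        path.append(parent[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxids: Sequence[int], parent: Dict[int, int]) -> int:
    """Return the deepest TAXID present in the lineage of every input TAXID."""
    if not taxids:
        raise ValueError("at least one TAXID is required")
    lineages = [lineage(t, parent) for t in taxids]
    lca = lineages[0][0]  # every lineage starts at the root
    # Walk all lineages in parallel from the root until they diverge.
    for nodes in zip(*lineages):
        if len(set(nodes)) != 1:
            break
        lca = nodes[0]
    return lca
```

For a read aligned to both *E. coli* (562) and *E. fergusonii* (564), this returns the genus 561; adding a hit to *Staphylococcus* (1279) pushes the result up to Bacteria (2).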

## Quick start
## TLDR

Quick analysis of sequencing reads aligned to a DNA database
Analysis of sequencing reads aligned to a DNA database with NCBI accession numbers, using the NCBI taxonomy

```bash
sam2lca analyze myfile.bam
@@ -16,31 +23,42 @@ See all options
```bash
sam2lca --help
sam2lca update-db --help
sam2lca list-db --help
sam2lca analyze --help
```

> For further information, check out the [sam2lca documentation](https://sam2lca.readthedocs.io) and [tutorial](https://sam2lca.readthedocs.io/en/latest/tutorial.html)
## Installation

### From source
### With [Conda](https://docs.conda.io/en/latest/) (recommended)

```bash
git clone git@github.com:maxibor/sam2lca.git
conda env create -f environment.yml
conda activate sam2lca
pip install git+ssh://git@github.com/maxibor/sam2lca.git
conda install -c conda-forge -c bioconda -c maxibor sam2lca
```

### From Conda
### With [pip](https://pypi.org/project/pip/)

```bash
conda install -c conda-forge -c bioconda -c maxibor sam2lca
pip install sam2lca
```
### From Pypi

### For development purposes, from the dev branch

```bash
pip install sam2lca
git clone git@github.com:maxibor/sam2lca.git
cd sam2lca
git checkout dev
conda env create -f environment.yml
conda activate sam2lca
pip install -e .
```

or

```bash
pip install git+ssh://git@github.com/maxibor/sam2lca.git@dev
```

## Documentation

The documentation of sam2lca, including tutorials, is available here: [sam2lca.readthedocs.io](https://sam2lca.readthedocs.io)
The documentation of sam2lca, including tutorials, is available here: [sam2lca.readthedocs.io](https://sam2lca.readthedocs.io)
3 changes: 1 addition & 2 deletions conda/meta.yaml
@@ -21,13 +21,12 @@ requirements:
- python
- click >=7.0
- pip >=20.0.2
- conda-forge::ete3 >=3.1.1
- bioconda::taxopy=0.9.2
- bioconda::pysam >=0.15.2
- conda-forge::rocksdb >=6.10.1
- conda-forge::python-rocksdb >=0.7.0
- conda-forge::xopen >=0.9.0
- tqdm >=4.45.0
- ordered-set >=4.0.2
- pandas >=1.1.4

test:
Binary file added docs/img/sam2lca_logo_text.png
Binary file added docs/img/sam2lca_square.png
45 changes: 45 additions & 0 deletions docs/source/contributing.md
@@ -0,0 +1,45 @@
# Contributing

We welcome any contributions!

To further develop sam2lca, or add documentation, please read further:

## Clone the sam2lca repository, and checkout the dev branch

```bash
git clone git@github.com:maxibor/sam2lca.git
cd sam2lca
git checkout dev
```

## Install and activate the development environment

```bash
conda env create -f environment.yml
conda activate sam2lca
```

## Install sam2lca with pip in editable mode

```bash
pip install -e .
```

## Run the unit and integration tests

```bash
pytest -s -vv --script-launch-mode=subprocess
```

## Build the documentation

```bash
cd docs
make html
```

The docs are built in the `docs/build/html` directory.

### Claim your sticker

Thanks for contributing to sam2lca!
If you want to spread the word about sam2lca, please get in touch with me to claim your sticker (maxime_borry[at]eva.mpg.de)!