Update README.md
haeussma authored May 7, 2024
1 parent 9f90af1 commit d82a465
Showing 1 changed file with 3 additions and 36 deletions.
39 changes: 3 additions & 36 deletions README.md
@@ -8,7 +8,7 @@

## About 📖
pyEED is a toolkit enabling object-oriented analysis of protein sequences, instead of working with sequences in a file-oriented fashion. This will enable the user to easily access and manipulate sequence information and to perform analyses on the sequence data.
This library is currently under development and thus the API is subject to change. Check out the [Roadmap](#roadmap-%EF%B8%8F) for more information. If you have feature ideas or notice issues with pyEED, submit an [issue](https://github.com/PyEED/pyeed/issues).
This library is currently under development and thus the API is subject to change.
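
To make the contrast between file-oriented and object-oriented sequence handling concrete, here is a minimal sketch in plain Python. It does not use pyEED and does not reflect its API: `ProteinRecord`, `parse_fasta`, and the toy FASTA text are hypothetical names and data introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class ProteinRecord:
    """Hypothetical stand-in for a sequence object; not pyEED's data model."""

    accession: str
    description: str
    sequence: str

    @property
    def length(self) -> int:
        return len(self.sequence)

    def fraction(self, residue: str) -> float:
        """Fraction of positions occupied by the given one-letter residue."""
        return self.sequence.count(residue) / self.length if self.sequence else 0.0


def _build(header: str, chunks: list[str]) -> ProteinRecord:
    # The first whitespace-delimited token of the header is taken as the accession.
    accession, _, description = header.partition(" ")
    return ProteinRecord(accession, description, "".join(chunks))


def parse_fasta(text: str) -> Iterator[ProteinRecord]:
    """Yield one ProteinRecord per FASTA entry in `text`."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _build(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _build(header, chunks)


# Toy data, made up for this example.
FASTA = """>seq1 toy example A
MKTAYIAK
QRQISFVK
>seq2 toy example B
MKK
"""

records = list(parse_fasta(FASTA))
longest = max(records, key=lambda r: r.length)
print(longest.accession, longest.length)
```

Once the entries are objects, questions such as "which sequence is longest" become attribute lookups rather than repeated text parsing, which is the kind of workflow the paragraph above describes.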


## Installation ⚙️
@@ -20,42 +20,9 @@ pip install git+https://github.com/PyEED/pyeed.git

## Quick start 🚀

In the following example, information on the [aldolase](https://www.ncbi.nlm.nih.gov/protein/NP_001287541.1/) (*Drosophila melanogaster*) is retrieved from the corresponding GenBank entry. Thereafter, a protein BLAST search is started, and the sequence information of the hits is fetched and stored as `ProteinSequence` objects.

```python
from pyeed.core import ProteinInfo

# Get a protein entry from NCBI by accession id
aldolase = ProteinInfo.from_ncbi("NP_001287541.1")
print(aldolase)

# Start a blast search with the protein sequence of the aldolase as query
blast_results = aldolase.pblast(n_hits=10)

# Get the corresponding nucleotide entry
aldolase_cds = aldolase.get_dna()

# Print the protein sequence of the 2nd blast result
print(blast_results[1].sequence)

# Print the nucleotide sequence of the aldolase
print(aldolase_cds)
```
The library is currently being refactored; the quick start will be updated soon!

## Documentation 📘

Check out the [documentation](https://pyeed.github.io/pyeed/) for in-depth information on how to set up `pyeed`,
use the built-in tools, and store sequence data in databases.

## Roadmap 🛣️

- [x] `ProteinSequence` data model: Object-oriented representation of a protein sequence database entry.
- [x] `ProteinSequence` query: get protein sequence by accession id from NCBI database
- [x] Blast search: get protein sequences as a result from a blast search
- [x] Retrieve corresponding coding sequence
- [x] Storing `ProteinSequence` in a SQL database
- [x] Running pairwise alignments
- [x] Network analysis / visualization
- [ ] Create phylogenetic trees
- [x] Multi-sequence alignments
- [x] Representative clustering
use the built-in tools, and store sequence data in databases.
