Newer python versions and Bio Alphabet #112
Comments
I tested removal of the import calls in __init__.py and one other file, and tracer loaded correctly, but I haven't made a test run.
Hi Nathan, Thanks for this! I'll be happy to accept a PR that updates this if you'd like to submit one. All the best, Mike
I've spent several days working on a pull request. I removed the Bio.Alphabet dependencies and changed how the Seq objects are created so that they no longer depend on Bio.Alphabet.IUPAC. I have also been editing the Dockerfile to bring the packages up to modern versions and to run the tests.
I can send you what I have so far, but there's an error in 'tracer test'. There still seems to be an obscure call to Bio.Alphabet in the pickle dump/load that I find difficult to trace. Probably in part because I'm not a Python hacker, I can't resolve this one. Some help from the group would be appreciated.
(A fragment of the tracer test output is below. I can't find a remaining reference to Bio.Alphabet anywhere in the code base.)
##Running Kallisto##
[build] loading fasta file /tracer/test_data/results/cell1/expression_quantification/kallisto_index/cell1_transcriptome.fa
##Quantifying with Kallisto##
[quant] fragment length distribution will be estimated from the data
##Filtering by read count##
I think the error is hard to trace because Bio.Alphabet is embedded in the pickled (.pkl) reference test data files in directories like this: https://github.com/Teichlab/tracer/tree/master/test_data/results/cell2/unfiltered_TCR_seqs
If that's true, then the error is due to modern Python not being able to load the old reference test results that were pickled.
N
(Some text strings from the pkl file are below.)
S'alphabet'p154g0(cBio.AlphabetHasStopCodonp155g2Ntp156Rp157(dp158S'stop_symbol'p159S'*'p160sg154g0(cBio.Alphabet.IUPACExtendedIUPACProteinp161g2Ntp162Rp163sS'letters'
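The check described in this comment, whether a .pkl file still names Bio.Alphabet classes, can be made without unpickling the file. Below is a minimal sketch using the standard-library pickletools module. The helper names referenced_globals and uses_module are hypothetical, and the sketch assumes the file uses pickle protocol 0 to 2, where classes are named by GLOBAL or INST opcodes as in the fragment above; newer protocols use STACK_GLOBAL, which this sketch does not handle. The demo pickles a fractions.Fraction so that it runs without Biopython installed.

```python
import pickle
import pickletools
from fractions import Fraction


def referenced_globals(data):
    """Return the set of (module, name) pairs a pickle refers to, without loading it.

    Assumption: protocol 0-2 pickles, which name classes via GLOBAL/INST opcodes.
    """
    found = set()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # pickletools reports the argument as "module name" joined by one space
            module, name = arg.split(" ", 1)
            found.add((module, name))
    return found


def uses_module(data, prefix):
    """True if any referenced class lives in `prefix` or one of its submodules."""
    return any(
        module == prefix or module.startswith(prefix + ".")
        for module, _name in referenced_globals(data)
    )


if __name__ == "__main__":
    # Stand-in data: a protocol 2 pickle of a standard-library object.
    demo = pickle.dumps(Fraction(1, 2), protocol=2, fix_imports=False)
    print(sorted(referenced_globals(demo)))
    print(uses_module(demo, "Bio.Alphabet"))
```

To apply this to one of the reference files, read it in binary mode and pass the bytes to uses_module with "Bio.Alphabet" as the prefix. Because the opcodes are only scanned and never executed, this works even in an environment where Bio.Alphabet cannot be imported.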
Thanks Nathan. Yes, I think you're right that the error comes from the pickled test data.
I think that a solution here would be to use an environment with the old BioPython to load those pickled files and then write them out as some kind of parseable text file (not as a pickle). The pickles are representations of a class defined at line 10 in 84f53e5 and of Recombinant (line 298 in 84f53e5).
These classes aren't very complex, so you could write out a text file containing their instance variables. You could then switch to an environment with the new version of BioPython, recreate the objects using the values in your text file, and then repickle them. Those should then be compatible.
Cheers, Mike
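The export and repickle steps suggested here could look roughly like the sketch below. It is an illustration under stated assumptions, not tracer's actual code: the Recombinant class is a hypothetical stand-in with made-up fields, the helper names (export_state, rebuild, repickle) are invented, and every instance variable is assumed to be JSON-serialisable. In the real objects, any Seq or alphabet attributes would first need converting to plain strings (or dropping) in the old environment.

```python
import json
import pickle


class Recombinant:
    """Hypothetical stand-in; the real class in tracerlib has different fields."""

    def __init__(self, contig_name, locus, productive):
        self.contig_name = contig_name
        self.locus = locus
        self.productive = productive


def export_state(obj):
    """Step 1 (old BioPython environment): dump instance variables as JSON text."""
    record = {"class": type(obj).__name__, "state": vars(obj)}
    return json.dumps(record, indent=2, sort_keys=True)


def rebuild(text, registry):
    """Step 2 (new environment): recreate an object from the exported text.

    `registry` maps class names to classes, so only expected types are built.
    """
    record = json.loads(text)
    cls = registry[record["class"]]
    obj = cls.__new__(cls)  # skip __init__; restore the saved attributes directly
    obj.__dict__.update(record["state"])
    return obj


def repickle(obj, path):
    """Step 3 (new environment): write a fresh pickle free of old references."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)


if __name__ == "__main__":
    original = Recombinant("contig_1", "TCR_A", True)  # made-up values
    text = export_state(original)
    print(text)
    restored = rebuild(text, {"Recombinant": Recombinant})
    print(vars(restored) == vars(original))
```

Keeping the intermediate file as plain text means it can be inspected and diffed by hand, and it does not depend on any particular BioPython version being importable.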
Hello, I'm trying to build a running tracer on a more modern version of Python (3.8.10). In the meantime, Bio.Alphabet has been removed from Biopython, and the recommendation is that calls to it (IUPAC) can be removed from most code without a problem.
Is it feasible to do this? Are there any known successes or issues with later versions of Python?
Thank you.
  File "/usr/local/lib/python3.8/site-packages/tracer-0.5-py3.8.egg/tracerlib/tracer_func.py", line 29, in <module>
    from Bio.Alphabet import IUPAC
  File "/usr/local/lib/python3.8/site-packages/Bio/Alphabet/__init__.py", line 20, in <module>
    raise ImportError(
ImportError: Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the ``molecule_type``