Cannot import pyabpoa without Internet - medaka doesn’t make a sequence consensus #116

Open
evfratov opened this issue Dec 16, 2024 · 5 comments
Labels
question Further information is requested

Comments

@evfratov

Ask away!

On Linux (both native Ubuntu and Ubuntu under WSL), wf-artic fails on the demo data when there is no network connection. The argument '--update_data false' does not help: the consensus file contains empty sequences. barcodeNN.artic.log.txt contains this error (barcode49 as an example):

[M::main] CMD: minimap2 -a -x map-ont -t 4 SARS-CoV-2/Midnight-ONT/V3/SARS-CoV-2.reference.fasta barcode49_barcode49.fastq
[M::main] Real time: 0.701 sec; CPU: 1.721 sec; Peak RSS: 0.048 GB
Cannot import pyabpoa, some features may not be available.
Failed to interpret '[email protected]:consensus' as a basecaller model.
Traceback (most recent call last):
  File "/opt/custflow/epi2meuser/conda/lib/python3.8/site-packages/medaka/medaka.py", line 36, in __call__
    model_fp = medaka.models.resolve_model(val)
  File "/opt/custflow/epi2meuser/conda/lib/python3.8/site-packages/medaka/models.py", line 46, in resolve_model
    raise ValueError(
ValueError: Model [email protected]:consensus is not a known model or existant file.

It fails with the current wf-artic 1.2.2, as well as with 1.2.1 and 1.2.0.
It fails with the current Nextflow 24.10.2, as well as with 23.10.4 and 22.10.7.
Manually installing python3-pyabpoa does not help.

With network access everything works, apart from occasional failures when the connection is unstable.

What is wrong? Why isn't this critical network dependency documented?

@evfratov evfratov added the question Further information is requested label Dec 16, 2024
@ammaraziz

ammaraziz commented Feb 5, 2025

I have this same issue. This is related to or caused by updating the ontresearch/wf-artic docker image.

Edit: I was wrong. I think the issue in your case is that the model name being used, [email protected]:consensus, is invalid. What command did you run?

@ammaraziz

ammaraziz commented Feb 5, 2025

I also get this error:

[M::mm_idx_gen::0.002*1.90] collected minimizers
[M::mm_idx_gen::0.004*2.64] sorted minimizers
[M::main::0.004*2.63] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.005*2.45] mid_occ = 3
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.005*2.40] distinct minimizers: 5587 (99.93% are singletons); average occurrences: 1.004; average spacing: 5.332; total length: 29903
[M::worker_pipeline::12.006*2.60] mapped 168736 sequences
[M::main] Version: 2.18-r1015
[M::main] CMD: minimap2 -a -x map-ont -t 4 SARS-CoV-2/ARTIC-ONT/V5.2/SARS-CoV-2.reference.fasta 24188419_24188419.fastq
[M::main] Real time: 12.007 sec; CPU: 31.221 sec; Peak RSS: 0.275 GB
Cannot import pyabpoa, some features may not be available.
Failed to interpret '[email protected]:consensus' as a basecaller model.
Traceback (most recent call last):
  File "/opt/custflow/epi2meuser/conda/lib/python3.8/site-packages/medaka/medaka.py", line 36, in __call__
    model_fp = medaka.models.resolve_model(val)
  File "/opt/custflow/epi2meuser/conda/lib/python3.8/site-packag+ mock_artic
+ echo 'Mocking artic results'
Mocking artic results

The message about pyabpoa is actually a warning and not relevant here. The real problem is the model name [email protected]:consensus.
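To make the warning-versus-error distinction concrete, here is a minimal sketch that sorts the lines of an artic log into harmless warnings and fatal errors. The patterns are taken only from the log excerpts in this thread; the function name `triage_log` is hypothetical and the pattern lists are not an exhaustive catalogue of medaka or artic messages.

```python
import re

# Patterns copied from the log excerpts in this thread (illustrative only).
WARNING_PATTERNS = [re.compile(r"^Cannot import pyabpoa")]
ERROR_PATTERNS = [
    re.compile(r"^Failed to interpret '.*' as a basecaller model"),
    # Final line of a Python traceback, e.g. "ValueError: ..."
    re.compile(r"^\w*(Error|Exception): "),
]


def triage_log(text):
    """Split log text into (warnings, errors) lists of matching lines."""
    warnings, errors = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if any(p.search(stripped) for p in WARNING_PATTERNS):
            warnings.append(stripped)
        elif any(p.search(stripped) for p in ERROR_PATTERNS):
            errors.append(stripped)
    return warnings, errors
```

Reading barcodeNN.artic.log.txt and passing its contents to `triage_log` should put the pyabpoa line in the first list and the model-resolution failure in the second, which is the part worth chasing.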

@ammaraziz

I think the issue stems from the automatic model selector, which (however it works) mangles the model name. You can work around this by specifying the model manually with --override_basecaller_cfg.

@evfratov
Author

evfratov commented Feb 5, 2025

> I think the issue stems from the auto model selector, however it works, messes with the model name. You can fix this by manually specifying the model using --override_basecaller_cfg.

Specifying the basecaller model directly has no effect on the consensus assembly:

nextflow run epi2me-labs/wf-artic --fastq wf-artic-demo/fastq --update_data false --out_dir output-def-net    # with network: works
nextflow run epi2me-labs/wf-artic --fastq wf-artic-demo/fastq --update_data false --out_dir output-def-nonet    # no network: fails
nextflow run epi2me-labs/wf-artic --fastq wf-artic-demo/fastq --update_data false --override_basecaller_cfg [email protected] --out_dir output-cfg10-nonet    # no network: fails
nextflow run epi2me-labs/wf-artic --fastq wf-artic-demo/fastq --update_data false --override_basecaller_cfg [email protected] --out_dir output-cfg9-nonet    # no network: fails
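For anyone who wants to repeat this comparison, a rough sketch of the test matrix in Python follows. It only builds the command lines, using the flags shown above, and checks a consensus FASTA for records with no sequence. The helper names `build_cmd` and `empty_records` are hypothetical, and the location of the consensus file inside the output directory is not assumed here: pass in the file contents yourself.

```python
import shlex


def build_cmd(out_dir, model=None, fastq="wf-artic-demo/fastq"):
    """Return the nextflow command as an argument list (nothing is executed)."""
    cmd = ["nextflow", "run", "epi2me-labs/wf-artic",
           "--fastq", fastq, "--update_data", "false"]
    if model is not None:
        # Optional manual model override, as suggested earlier in the thread.
        cmd += ["--override_basecaller_cfg", model]
    cmd += ["--out_dir", out_dir]
    return cmd


def empty_records(fasta_text):
    """Return the names of FASTA records that contain no sequence."""
    names, current, length = [], None, 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if current is not None and length == 0:
                names.append(current)
            current, length = line[1:], 0
        elif line:
            length += len(line)
    if current is not None and length == 0:
        names.append(current)
    return names
```

`shlex.join(build_cmd("output-def-nonet"))` gives a string that can be pasted into a shell, and a non-empty result from `empty_records` on the consensus file marks the run as failed.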

@ammaraziz

You are correct: the problem is not the model name but the lack of an internet connection. The models are not bundled in the Docker image, which leads to the errors we are seeing.

We are facing the same issue, which is now ~6 months old:
#95

A hacky solution was posted by another user in the above issue. I'm testing it out today.
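In the meantime, a quick pre-flight check can tell you whether a given model resolves in your environment before you start a full offline run. This is a minimal sketch: `medaka.models.resolve_model` is the function named in the traceback above, but what it does internally when a model is missing (for example, whether it attempts a download) is an assumption based on the behaviour described in this thread. The wrapper `check_model` and its `resolver` parameter are hypothetical.

```python
def check_model(name, resolver=None):
    """Return (ok, detail) describing whether `name` resolves to a model.

    `resolver` defaults to medaka.models.resolve_model, the function seen
    in the traceback; a stand-in can be injected for testing.
    """
    if resolver is None:
        # Imported lazily so the sketch can be exercised without medaka.
        from medaka.models import resolve_model as resolver
    try:
        path = resolver(name)
    except Exception as exc:
        # The traceback shows ValueError; other exception types are possible.
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(path)
```

Run it inside the same container or conda environment the workflow uses, with the network disabled, and pass the model name exactly as it appears in your own log.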
