Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bugs in miassembler #31

Merged
merged 8 commits into from
Jan 23, 2025
Merged

Fix bugs in miassembler #31

merged 8 commits into from
Jan 23, 2025

Conversation

Ge94
Copy link
Member

@Ge94 Ge94 commented Dec 17, 2024

Draft PR for numerous bugs to be fixed in miassembler:

  • adapted human reference genome name to genome name in codon
  • adjusted modules name in modules.config

@Ge94 Ge94 changed the title Fixed regex in modules.config + codon ref genome Fix bugs in miassembler Dec 17, 2024
@jmattock5
Copy link
Contributor

Solving the following bugs:

  • pipeline failed when the assembly, after host cleaning, was empty
  • ENA rejecting assemblies being uploaded that contained fewer than 2 contigs
  • Scientific notation error when the jgi_summarize_contigs output was being used to calculate depth - due to long contigs who's length was for example 2.60658e+06

I have added a filtering step to short_reads_assembly_qc.nf, after the host cleaning, which counts the number of contigs and only generates cleaned_contigs.fa.gz if the threshold is met - I had to remove the extra metadata generated during this quality check for the rest of the pipeline to run. I've set the default as 2 contigs so this fix solves problem 1&2.

I think there was still an issue with which file was being published in PUBLISH_CLEANED_CONTIGS. It was filtered_contigs which I think meant any non-human host cleaning still wasn't being used. Please check that this logic makes sense in my changes!

I've added another if statement to short_reads_assembly_qc.nf so if no reference is specified it still uses the length and human filtered outputs. Again please check this logic.

The qc_failed_runs csv generated now includes samples that fail the assembly qc check I've added. We could split this into another csv if we wanted this to be separate from the read qc output.

@jmattock5 jmattock5 marked this pull request as ready for review January 17, 2025 15:21
@jmattock5 jmattock5 requested a review from mberacochea January 17, 2025 15:22
Copy link
Member

@mberacochea mberacochea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @jmattock5. It looks pretty good to me. I added some nitpicky comments; I've learned that having clear names makes it easier to follow the code, so I've included some suggestions to improve them.

subworkflows/local/short_reads_assembly_qc.nf Outdated Show resolved Hide resolved
subworkflows/local/short_reads_assembly_qc.nf Outdated Show resolved Hide resolved
subworkflows/local/short_reads_assembly_qc.nf Show resolved Hide resolved
subworkflows/local/short_reads_assembly_qc.nf Outdated Show resolved Hide resolved
subworkflows/local/short_reads_assembly_qc.nf Outdated Show resolved Hide resolved
workflows/miassembler.nf Outdated Show resolved Hide resolved
workflows/miassembler.nf Outdated Show resolved Hide resolved
workflows/short_reads_assembler.nf Outdated Show resolved Hide resolved
workflows/short_reads_assembler.nf Outdated Show resolved Hide resolved
Copy link
Member Author

@Ge94 Ge94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff @jmattock5 !

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
bin/calculate_assembly_coverage.py Show resolved Hide resolved
nextflow_schema.json Outdated Show resolved Hide resolved
subworkflows/local/short_reads_assembly_qc.nf Outdated Show resolved Hide resolved
tests/main.nf.test Show resolved Hide resolved
workflows/short_reads_assembler.nf Outdated Show resolved Hide resolved
@jmattock5
Copy link
Contributor

Thanks for your comments! I've made all the changes you suggested but for two, please look at my comments for justification

Copy link
Member

@mberacochea mberacochea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 🏅 @jmattock5

@jmattock5 jmattock5 merged commit 644311f into dev Jan 23, 2025
3 checks passed
@jmattock5 jmattock5 deleted the bugfix/various_bug_fixes branch January 23, 2025 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants