Add Random Subsampling to Nanopore mNGS Pipeline #371

Merged
merged 1 commit into from
Jun 20, 2024


phoenixAja
Contributor

@phoenixAja phoenixAja commented Jun 17, 2024

Switching from using head for subsampling to using seqtk sample. I timed head against seqtk sample on a 6.7 GB nanopore sample: seqtk sample took about 2.3 times longer to run than head, but still finished in only 27 seconds. Since 6.7 GB is on the larger side for a nanopore sample, I think we're safe swapping head for seqtk sample.

The long-read-mngs Docker container already has seqtk installed, so no container changes are needed.

root@50040e0c1bc9:/mnt/long-read-mngs# time head -4000000 beefy.sample.validated.fastq > beefy_sample_head.subsampled.fastq

real	0m11.675s
user	0m4.105s
sys	0m7.569s
root@50040e0c1bc9:/mnt/long-read-mngs# time seqtk sample -s42  beefy.sample.validated.fastq 4000000 > beefy_sample_seqtk.subsampled.fastq

real	0m27.252s
user	0m10.956s
sys	0m16.262s
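
For anyone rerunning this comparison, note that the two commands count differently: head -4000000 keeps the first 4,000,000 lines (1,000,000 reads if every FASTQ record spans exactly four lines), while the trailing number given to seqtk sample is a count of sequences. Below is a minimal sketch, under that four-line assumption, of helpers that put both schemes on the same read count. The function names and file names are hypothetical and are not taken from the pipeline; the default seed of 42 simply mirrors the timing run above.

```shell
# Sketch only: function names here are hypothetical, not the pipeline's.

# Count reads, assuming every FASTQ record spans exactly four lines.
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Old scheme: keep the first N reads. head counts lines, so ask for 4 * N.
subsample_head() {
    local in_fastq=$1 n_reads=$2
    head -n "$(( n_reads * 4 ))" "$in_fastq"
}

# New scheme: draw N reads at random, with a fixed seed so reruns match.
# seqtk treats the trailing argument as a number of sequences when it is > 1.
subsample_random() {
    local in_fastq=$1 n_reads=$2 seed=${3:-42}
    seqtk sample -s"$seed" "$in_fastq" "$n_reads"
}
```

Example usage, with a hypothetical input file: subsample_random sample.validated.fastq 1000000 > sample.subsampled.fastq, followed by count_reads sample.subsampled.fastq to confirm the output size.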

ticket: CZID-9693

Tested on this sample on staging and cross-referenced it against this sample, which used the existing subsampling scheme, to verify that the new approach gave similar results.

@phoenixAja phoenixAja marked this pull request as ready for review June 20, 2024 19:25
@phoenixAja phoenixAja requested review from rzlim08 and a team June 20, 2024 19:26
Collaborator

@rzlim08 rzlim08 left a comment

Looks good to me!

@phoenixAja phoenixAja merged commit a7c54be into main Jun 20, 2024
15 checks passed
@phoenixAja phoenixAja deleted the phoenix/random-subsample-mNGS branch June 20, 2024 20:09
2 participants