
Very few subreads and consensus reads despite many reads after preprocessing #20

Open
kvg opened this issue Jun 2, 2021 · 1 comment


kvg commented Jun 2, 2021

Hello,
I'm testing out C3POa v2.2.3 on a small test dataset (176,000 reads from a much larger PromethION run). I'm hoping to use C3POa's demultiplexing feature and I've prepared a splints file with four sequences. Initial processing looks good at first:

$ python3 C3POa.py -r /data/chunk.fastq -s /data/splints.fasta -l 100 -d 500 -g 1000 -o out
Aligning splints to reads with blat
Preprocessing:  99%|█████████████████████████████████████████████████████████████████████████▌| 176/177 [02:09<00:00,  1.36it/s]
Catting psls: 100%|██████████████████████████████████████████████████████████████████████████| 176/176 [00:01<00:00, 129.95it/s]
Removing preprocessing files: 100%|█████████████████████████████████████████████████████████| 176/176 [00:00<00:00, 2590.99it/s]
Calling consensi:   0%|                                                                                 | 0/177 [02:25<?, ?it/s]
Catting consensus reads: 100%|█████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11949.58it/s]
Catting subreads: 100%|███████████████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 7898.00it/s]
Removing files: 100%|█████████████████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 4450.05it/s]
Catting consensus reads: 100%|██████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 9177.91it/s]
Catting subreads: 100%|███████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 8104.33it/s]
Removing files: 100%|█████████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 4262.17it/s]
Catting consensus reads: 100%|███████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 13879.81it/s]
Catting subreads: 100%|██████████████████████████████████████████████████████████████████████| 87/87 [00:00<00:00, 11671.71it/s]
Removing files: 100%|█████████████████████████████████████████████████████████████████████████| 87/87 [00:00<00:00, 4974.09it/s]
Catting consensus reads: 100%|████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 7833.72it/s]
Catting subreads: 100%|██████████████████████████████████████████████████████████████████████| 82/82 [00:00<00:00, 10553.00it/s]
Removing files: 100%|█████████████████████████████████████████████████████████████████████████| 82/82 [00:00<00:00, 4555.04it/s]
(lr-c3poa) root@f8924132b3ed:/#

$ cat out/c3poa.log
C3POa version: v2.2.3
Total reads: 176000
No splint reads: 37306 (21.20%)
Under len cutoff: 0 (0.00%)
Total thrown away reads: 37306 (21.20%)
Reads after preprocessing: 138694

However, in checking the output subread and consensus files, I see very few entries:

# grep -c '^[>@]' out/10x_Splint_*/*
out/10x_Splint_1/R2C2_Consensus.fasta:4
out/10x_Splint_1/R2C2_Subreads.fastq:96
out/10x_Splint_2/R2C2_Consensus.fasta:1
out/10x_Splint_2/R2C2_Subreads.fastq:60
out/10x_Splint_3/R2C2_Consensus.fasta:19
out/10x_Splint_3/R2C2_Subreads.fastq:282
out/10x_Splint_4/R2C2_Consensus.fasta:13
out/10x_Splint_4/R2C2_Subreads.fastq:325

These seem like awfully low numbers to me, and it's not clear where the reads are getting lost. Shouldn't the total number of subreads add up to the number of reads after preprocessing? And assuming 5-10 passes per subread, shouldn't the number of consensus reads be somewhere between 14k and 30k? Is there a way to find out what's happening to the rest of the reads? Or is my understanding simply incorrect?
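For reference, totting up the counts above (a quick sanity check, with the numbers copied from the `grep` output and the log; the 5-10 pass assumption is mine):

```python
# Sum the per-splint grep counts reported above and compare them to the
# "Reads after preprocessing" total from c3poa.log.
consensus_counts = [4, 1, 19, 13]     # R2C2_Consensus.fasta per splint
subread_counts = [96, 60, 282, 325]   # R2C2_Subreads.fastq per splint
reads_after_preprocessing = 138_694

total_consensus = sum(consensus_counts)  # 37
total_subreads = sum(subread_counts)     # 763

# At an assumed 5-10 subread passes per consensus, ~138k preprocessed
# reads would be expected to yield on the order of 14k-28k consensus
# reads, not 37.
print(total_consensus, total_subreads)
print(f"{total_subreads / reads_after_preprocessing:.4%} of preprocessed "
      "reads appear in the subread files")
```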

Thanks,
-Kiran

@Usamahussein551980
Hello Kiran,

I'm trying to analyze long-read data generated by an ONT sequencer following the C3POa workflow, but preprocessing doesn't continue and stops at "Calling consensi".
The tools were installed with their dependencies, and I prepared the UMI_Splints.fasta used in the experiment, but unfortunately the process stopped as shown below:

command:
(base) [ukhussein@ldragon3 C3POa-2.2.3]$ python3 C3POa.py -r ../../projects/nanopore_R2C2/10X_071_R2C2/test/dngqu0264_71_fastq_pass.tar.gz -s ./UMI_Splint.fasta/UMI_Splints.fasta -d 500 -l 100 -g 1000 -n 32 -o out2

abpoa

abpoa

Output:
pr-processing
pr-processing

Log Contents:
$ cat out2/c3poa.log
C3POa version: v2.2.3
Total reads: 1687451
No splint reads: 1505291 (89.21%)
Under len cutoff: 15 (0.00%)
Total thrown away reads: 1505306 (89.21%)
Reads after preprocessing: 182145
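For what it's worth, the log's arithmetic is internally consistent; the numbers below are copied from the log, and the check simply confirms that nearly 90% of reads were discarded because no splint was found:

```python
# Sanity-check the c3poa.log totals quoted above.
total_reads = 1_687_451
no_splint = 1_505_291
under_len = 15

thrown_away = no_splint + under_len       # 1505306, matching the log
remaining = total_reads - thrown_away     # 182145, matching the log

print(thrown_away, remaining)
print(f"{thrown_away / total_reads:.2%} of reads thrown away")
```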

Could you please help me to figure out what is the problem?
