duplicate subread record issue #16

Closed
JYLeeBioinfo opened this issue Feb 18, 2021 · 3 comments

@JYLeeBioinfo

There are duplicate subread records after running C3POa.py.

Could this be a problem for consensus read generation by C3POa.py?
Also, is it okay to proceed with the downstream 10xR2C2 processing with these duplicate reads in subreads.fastq?

[hrs@node35 Splint1]$ cat R2C2_Subreads.fastq | awk 'NR%4==1'  | uniq -c | awk '$1!=1' | head
      3 @8746a94c-ed1c-4415-8ff6-46b3410d05e4_subread_2
      3 @ae3b22d7-e17b-4c0a-a601-816dd374ea95_subread_2
      2 @d33dedd5-b1b1-4e79-8894-00353593798b_subread_0
      3 @d33dedd5-b1b1-4e79-8894-00353593798b_subread_1
      2 @d33dedd5-b1b1-4e79-8894-00353593798b_subread_2
      7 @d33dedd5-b1b1-4e79-8894-00353593798b_subread_3
      3 @f4837886-b2ea-4ed0-95d5-13a4c48a21c3_subread_17
      3 @686f952f-49da-4e18-9d98-5b6038b8cded_subread_12
      7 @d5da835f-1e94-4a42-a609-6743df4b8bdf_subread_0
      3 @d5da835f-1e94-4a42-a609-6743df4b8bdf_subread_2
[hrs@node35 Splint1]$ cat R2C2_Subreads.fastq | awk 'NR%4==0'  | uniq -c | awk '$1!=1' | head | less -S
      3 A?<8FAC<>?9678;(009;<.*8ABN@>?=>;@9:91):*//=IGTMPCIF=AA;;?BADAF?9<AAEGB=>?FGAA013AB25;41D=@B?>9>E<;<>G<<=?==%%'%$4:7*3+&&($%$*97,=D@B=GE9<A?BBKB;:':<<CC:0,@?&/5&'2A@@;:@BB>A@E=948:;754>>?;=8?AFG9'0,;?+((&&??DBC
      3 AB??.03;<9<KHHD3<:@@:C@7=@@7)@<:>?A5?::53(878824<AB=HHIBCB5?44124->CCBADD?@L68946A@>25DIGCFDFE:[email protected])*+*/5799;;6/&%&7<<769?84:944@?DHIDGEI9;CBDPSIHAFIBHCC>AB>?@-2@BH0DIJLE>A@=?44AGHDKKJAA<-=CC?
      2 26022;4<AD8;>EB6+.<AA54:500-=?;?<>,,,++8..04FFPMHAC8<>?:/--'*(.-/((-<=?DCKDEB????204,.88::;>79;?<5632%'222BC=?B=:?C;HHE=9CED?LJG?4HG?/;<?A?=75998:620.,*))('''&&####"%#$$$$$$%%&&&'())((''&%'14<D>:/,?=>ACG8-%.GEI
      3 101;:@BBC=?AB=47&;??-+&$*+-+%%$$##$"%*/0;6CDDE@<90..$&$&%<?>@@A@AFECDE?B>BDC><.568@?><=?9<<=90:88<400/-&&)+:A=A@9:8<:5111&+187CC928EDB===?8010-)1>?2,,,+++++++,,-//00/..-+)'&&%%%%03578;88:=01''8''&)----.//0**78:
      2 =<7.FA@<>B:=A@;0)*8FB205A?C>BBA?7+%(()**34<<CAC1)(7221+*23-%$+(057:@>AA=B1))-.76886=FEAD?==;-1555;+))++27:01=?=EFA?7:A<CIEADE@@>><BA5;=<9=CGBGB7%%(<?2'&%%%$$$$$$$$%%%%%%&&''''''''''''&&''''''&&&&%*129?BC=II@;;9
      7 -0.--8<?C;DDF<?EFA79<?CC=CD:,:?:587<<>BF:><8<;2@>;<>41-;-57?A>=<</+&1$%+0AA=E:>04;?DGHGKKMHJB4*(*=D39;AMNEQNNLLEBIB=@EE?A@E@DDBSKD<;55@55435+0--11&/3/>+28)70.$+443:,A:71)$:65=>>;8;CAD8>EDCGEAE?EADJJKKD@A@C;?A?=
      3 34+)555336&+-5*3:(<E;024589:;2//)$$$$('257:9?=:99+7).676:;=;;;;:4*+,$,*++)'''&3236622995893,45:633712$$0.&'487986646<57/*022:8>>A@>7),68863966-**022/.,+*('$*//45642748:<3-239-657'$%$%$%#&$$)*-0/,%.8;:9;=:7-'((+
      3 5=??>..??@UPRKAF@CG?C?%/:*,6;<>@B=9BHGJ=9<67:KJ=CKHBOG@;<@9.7:=F>DDFHECEAAAKLIB8<C@89;BHD>?@AA?FFC9854420/0,/:>=>;AEHCC@>C?A--1<AIGG<>9B?A3;A;<@;:C<<CDHSO3>?B?AB>:F@KIGCMC;;=@?66999<@@B=<732889=;79?;<=>@>--885(
      7 >?$'),)*+576><>B@026><>>A;()8<=5GEAGF;01+1,,7DCC=7=>>0:8A;<>A>CB@>:9:??+''01/29.100)&+')563$$$$%&967:<A=D=;<;ACBAED@B3--@5?>A>@;<555,1>?>AD@DEFBA8((*7&547?;0DFCAB@A>>@AEBLLE@>98(*4478?9>>CAEIBAECI>@E@AH>BEF??@?
      3 )%&,*01212<:<677<+048897-<?.*&)8221'(%99>=?>*:@AA<;:;7=?<&&+*(+FCD66>AA=D?<;'D?;B&&&4<4<9<@A+8<CBL<G??<?<<?>G:?91%111.:A?9=;A:@?DC;>?CE>C@'++,214;=@CE<DE8;IC??2B>>?@>C><-*+8:;AKHJMDC<<=DNNL5QHC:>DF><245JH:BC<<=
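Note that `uniq -c` only collapses adjacent identical lines, so the counts above reflect back-to-back repeats only. A minimal Python sketch for counting repeated headers anywhere in the file is shown below. It assumes strict 4-line FASTQ records (the same assumption as `awk 'NR%4==1'`), and the function name `duplicate_headers` is made up for this illustration; it is not part of C3POa.

```python
import sys
from collections import Counter


def duplicate_headers(handle):
    """Return {header: count} for FASTQ headers seen more than once.

    Assumes every record is exactly 4 lines, so headers sit on
    lines 1, 5, 9, ... (0-based index divisible by 4).
    """
    counts = Counter(
        line.rstrip("\n")
        for i, line in enumerate(handle)
        if i % 4 == 0
    )
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_dups.py R2C2_Subreads.fastq
    with open(sys.argv[1]) as fh:
        for name, n in duplicate_headers(fh).items():
            print(n, name)
```

Unlike the `uniq -c` pipeline, this reports a header as duplicated even when its copies are separated by other records.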
@rvolden
Owner

rvolden commented Feb 18, 2021

What version are you running? Subread naming was changed back to the way it was in v1.0.0 (readName_subreadNum). If you pull the most recent version, I don't think you'll have this issue. I also did a quick check on my test dataset and I found no duplicate subread headers.

@JYLeeBioinfo
Author

Thank you for your fast response! I cloned this repo on Jan 28th.
Let me check whether the new version resolves the issue.

@JYLeeBioinfo
Author

The new version of C3POa resolved the issue! Thank you!
