Surjection based exploration issue #6

dduchen · 2019-10-28T01:47:05Z

Hello,
I'm attempting to surject aligned reads onto individual references (stored as paths) and build a table of the output, as described in the 'Surjection based exploration' exercise of the virus pangenome material. Some of the code may be deprecated, but I'd still like to understand where I'm going wrong.

Also, I can't find any documentation on vg surject, so any additional information on this command would be very helpful.

The section describing the CPAN2018 workflow and code is here: https://github.com/Pfern/PANGenomics/blob/master/exercises/HIV/ideas.md

Briefly:
The reference was created from the 5-virus mix of HIV, found here: https://github.com/cbg-ethz/5-virus-mix/blob/master/data/REF.fasta.
I then map the first 10k reads (the first 40,000 FASTQ lines) and surject them onto each of the reference paths:

vg map -d REF4 -f <(zcat SRR961514_1.fastq.gz | head -40000) >SRR961514_1.first10k.gam 
#--
for ref in $(vg paths -L REF4.vg)
do ( vg surject -x REF4.xg -p $ref SRR961514_1.first10k.gam | vg view -a - \
    | jq -cr '[.name, "'$ref'", .identity]' | sed s/null/0/g | jq -cr @tsv ) | gzip >first10k.surj.$ref.tsv.gz
done

With some tinkering, I end up with one of two results: either the tables contain only 0s for the path-relative identity of every read against every reference genome/path, or the tables are all identical, so I don't seem to be getting path/reference-specific identity values for each read.
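Both symptoms can be checked directly on the output files. The following is only a minimal sketch, assuming each table is a gzipped TSV with the three columns produced by the loop above (read name, reference name, identity). The function names `summarize_identity` and `same_identities` are hypothetical helpers and not part of vg.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of vg) for inspecting the per-reference tables.
# Assumed input: gzipped TSV with columns: read name, reference name, identity.

# Print "<rows><TAB><rows with identity > 0>" for one table.
summarize_identity() {
    gzip -dc "$1" | awk -F'\t' '
        { n++; if ($3 + 0 > 0) nz++ }
        END { printf "%d\t%d\n", n, nz }'
}

# Succeed (exit 0) if two tables carry the same identity for every read,
# ignoring the reference-name column.
same_identities() {
    local a b
    a=$(gzip -dc "$1" | cut -f1,3 | sort | cksum)
    b=$(gzip -dc "$2" | cut -f1,3 | sort | cksum)
    [ "$a" = "$b" ]
}
```

Running `summarize_identity` on each `first10k.surj.*.tsv.gz` file shows whether any read has a nonzero identity, and `same_identities` on a pair of files shows whether two references really produced the same values.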

Any guidance would be greatly appreciated!
Thank you for your time
