This document describes running the TOPMed RNA-seq pipeline in the Helium DataCommons environment.
Steps:
- Register files into iRODS:
  - STAR index
  - RSEM reference
  - Genome FASTA
  - Genes GTF
- Unpack tar and gz files.
$ cd /renci/irods/topmed-demo/
$ gunzip -c LC_C13_cRNA_sequence_R1.txt.gz > LC_C13_cRNA_sequence_R1.fastq
$ gunzip -c LC_C13_cRNA_sequence_R2.txt.gz > LC_C13_cRNA_sequence_R2.fastq
$ mkdir star_index
$ tar -xvvf STAR_genome_GRCh38_noALT_noHLA_noDecoy_ERCC_v26_oh100.tar.gz -C star_index --strip-components=1
- Run the ireg command to register all files into the desired resource.
- Start an appliance using the instructions here.
- Run the workflow with Toil:
cwltoil --noLinkImports --jobStore /<path>/jobstore1 --batchSystem chronos --workDir ./ /<path>/<workflow>.cwl /<path>/<workflow-inputs>.yml
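
The unpack, register, and run steps above can be sketched as a few shell functions. This is only a rough sketch, not part of the pipeline itself: the function names and the DRY_RUN variable are hypothetical, and the iRODS resource name and target collection are site-specific values that the caller must supply. The job store, workflow, and inputs paths are likewise passed in rather than assumed. The file is meant to be sourced; nothing runs until a function is called.

```shell
# Hypothetical helpers for the unpack, register, and run steps.
# Set DRY_RUN=1 to print the ireg and cwltoil commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Decompress every *.txt.gz read file in a directory to a .fastq file,
# keeping the compressed original (same effect as the gunzip -c commands).
unpack_reads() {
    local dir="$1" gz
    for gz in "$dir"/*.txt.gz; do
        [ -e "$gz" ] || continue   # the glob matched nothing
        gunzip -c "$gz" > "${gz%.txt.gz}.fastq"
    done
}

# Register every entry of a directory into iRODS.
# $1 should be an absolute path that the iRODS server can see.
# $2 (resource) and $3 (target collection) are site-specific assumptions.
register_files() {
    local dir="$1" resource="$2" collection="$3" f target
    for f in "$dir"/*; do
        target="$collection/$(basename "$f")"
        if [ -d "$f" ]; then
            # -C tells ireg that the physical path is a directory
            run ireg -C -R "$resource" "$f" "$target"
        elif [ -f "$f" ]; then
            run ireg -R "$resource" "$f" "$target"
        fi
    done
}

# Launch the workflow with the same options as the cwltoil command above.
run_workflow() {
    local jobstore="$1" workflow="$2" inputs="$3"
    run cwltoil --noLinkImports --jobStore "$jobstore" \
        --batchSystem chronos --workDir ./ "$workflow" "$inputs"
}
```

A dry run is a cheap way to check the generated commands before touching iRODS, for example `DRY_RUN=1 register_files /renci/irods/topmed-demo <resource> <collection>`, with the resource and collection filled in for your deployment.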