Error while running ss3_isoform.py #5

Open
kwglam opened this issue Jul 9, 2021 · 13 comments
@kwglam

kwglam commented Jul 9, 2021

Hi Angela,

I tried to do the isoform reconstruction by running your ss3_isoform.py script. However, the program halted with the following error messages. Would you please kindly advise what the potential problem is? Thanks!!

Preprocessing on input BAM ...
[bam_sort_core] merging from 104 files and 8 in-memory blocks...
[main_samview] fail to read the header from "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam".
[main_samview] fail to read the header from "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam".
[main_samview] fail to read the header from "-".
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
Collect informative reads per gene...
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/expression_ensembl/ex_210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/expression_ensembl/ex_210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
...for genes on 1
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/pyModule/informative_reads.py", line 479, in _get_reads
report_gene = gobj.get_aligned_reads(n_read_limit, passed_cells)
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/pyModule/informative_reads.py", line 84, in get_aligned_reads
samfile = pysam.AlignmentFile(self.in_bam_uniq, "rc")
File "pysam/libcalignmentfile.pyx", line 742, in pysam.libcalignmentfile.AlignmentFile.cinit
File "pysam/libcalignmentfile.pyx", line 947, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file does not contain alignment data
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/ss3_isoform.py", line 109, in <module>
main()
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/ss3_isoform.py", line 99, in main
fetch_gene_reads(in_bam_uniq, in_bam_multi, conf_data, op.species, out_path)
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/pyModule/informative_reads.py", line 550, in fetch_gene_reads
report_genes = pool.map(func, genes, chunksize=1)
File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
ValueError: file does not contain alignment data

@PingChen-Angela
Contributor

Hi, it looks like something is wrong with the BAM files.

@kwglam
Author

kwglam commented Aug 6, 2021

Hi Angela,
Thanks for the comment. After running zUMIs, four BAM files are generated. The one ending in ".filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is the only one that comes with a .bai file. Is it the correct BAM file for running ss3_isoform.py? I have successfully used this file to run stitcher.py, which generated a SAM file with stitched RNA molecules. Do you know of any way to check what is wrong with the BAM file? Thanks!
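
One way to check a BAM file without extra tools is to look at the things samtools and pysam complain about in the log above: whether the file is gzip/BGZF-compressed, whether it starts with the BAM magic bytes, how many reference sequences the header declares, and whether the file ends with the BGZF end-of-file block. The following is a minimal sketch using only the Python standard library. The function name `inspect_bam` and the report keys are made up for illustration; they are not part of ss3iso, samtools, or pysam.

```python
import gzip
import struct
import zlib

# The 28-byte empty BGZF block that marks the end of a complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)


def inspect_bam(path):
    """Return basic sanity checks for a BAM file (hypothetical helper)."""
    report = {
        "is_gzip": False,         # starts with the gzip magic bytes
        "has_bam_magic": False,   # decompressed data starts with b"BAM\x01"
        "n_ref": None,            # number of reference sequences in the header
        "has_eof_marker": False,  # ends with the BGZF end-of-file block
    }
    with open(path, "rb") as raw:
        report["is_gzip"] = raw.read(2) == b"\x1f\x8b"
        raw.seek(0, 2)  # jump to the end to learn the file size
        if raw.tell() >= len(BGZF_EOF):
            raw.seek(-len(BGZF_EOF), 2)
            report["has_eof_marker"] = raw.read() == BGZF_EOF
    if not report["is_gzip"]:
        return report
    try:
        with gzip.open(path, "rb") as fh:
            report["has_bam_magic"] = fh.read(4) == b"BAM\x01"
            if report["has_bam_magic"]:
                # Header layout: l_text (int32), header text, n_ref (int32).
                l_text = struct.unpack("<i", fh.read(4))[0]
                if l_text >= 0:
                    fh.read(l_text)
                    report["n_ref"] = struct.unpack("<i", fh.read(4))[0]
    except (OSError, EOFError, struct.error, zlib.error):
        pass  # corrupt or truncated data: leave the remaining fields unset
    return report
```

An empty or truncated file would show `is_gzip` or `has_eof_marker` as False. To my understanding, a header with `n_ref == 0` is what pysam reports as "file has no sequences defined". `samtools quickcheck` performs a comparable test from the command line.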

@PingChen-Angela
Contributor

@kwglam Hi, is this issue solved?

@kwglam
Author

kwglam commented Sep 7, 2021

Yes, this issue has been solved. Thanks!

kwglam closed this as completed Sep 7, 2021
@HaniJieunKim

Hi @PingChen-Angela! Thanks for maintaining such a useful package!

Just following on from this thread regarding the inputs of ss3_isoform.py: I have run zUMIs and would now like to run the isoform matching.

Would filtered.tagged.Aligned.out.bam from running zUMIs be the correct input for -i [path/to/inputBAM]? I noticed in the above thread that the following BAM may be required instead: filtered.Aligned.GeneTagged.UBcorrected.sorted.bam, which I think is the BAM output from running zUMIs with the velocyto option.

Thanks in advance for clarifying.

Best regards,
Hani

@cziegenhain
Collaborator

Hi Hani,

The *.filtered.tagged.Aligned.out.bam lacks gene assignment and UMI error correction, which are both needed for isoform inference.
The velocyto output from zUMIs has nothing to do with this and is labelled *.tagged.forVelocyto.bam.
Hence, please use the *.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam file.

Best,
Christoph
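
The three file names mentioned above can be told apart by their suffix. The sketch below encodes only what this comment states; the helper name `describe_zumis_bam` and the wording of the notes are made up for illustration and are not part of zUMIs or ss3iso.

```python
import os

# Suffixes and their meaning, as described in the comment above.
ZUMIS_BAM_SUFFIXES = {
    ".filtered.Aligned.GeneTagged.UBcorrected.sorted.bam":
        "use for ss3_isoform.py",
    ".filtered.tagged.Aligned.out.bam":
        "lacks gene assignment and UMI error correction",
    ".tagged.forVelocyto.bam":
        "velocyto output, unrelated to isoform inference",
}


def describe_zumis_bam(path):
    """Return a short note on whether a zUMIs BAM suits ss3_isoform.py."""
    name = os.path.basename(path)
    for suffix, note in ZUMIS_BAM_SUFFIXES.items():
        if name.endswith(suffix):
            return note
    return "not one of the zUMIs BAM outputs discussed here"
```

For example, `describe_zumis_bam("run1.filtered.tagged.Aligned.out.bam")` returns the note about missing gene assignment and UMI error correction (`run1` is a placeholder project name).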

@HaniJieunKim

I see, thanks Christoph for the clarification!

@xucaoling

Hi cziegenhain,
When I use *.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam with ss3_isoform.py, I get an error:
Preprocessing on input BAM ...
[bam_sort_core] merging from 88 files and 8 in-memory blocks...
Collect informative reads per gene...
...for genes on chr1
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/data/vip55/miniconda3/envs/zUMIs-env/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/data/vip55/miniconda3/envs/zUMIs-env/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/data/vip55/software/Smart-seq3-master/ss3iso/pyModule/informative_reads.py", line 479, in _get_reads
report_gene = gobj.get_aligned_reads(n_read_limit, passed_cells)
File "/home/data/vip55/software/Smart-seq3-master/ss3iso/pyModule/informative_reads.py", line 84, in get_aligned_reads
samfile = pysam.AlignmentFile(self.in_bam_uniq, "rc")
File "pysam/libcalignmentfile.pyx", line 741, in pysam.libcalignmentfile.AlignmentFile.cinit
File "pysam/libcalignmentfile.pyx", line 990, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rc') - is it SAM/BAM format? Consider opening with check_sq=False
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/data/vip55/software/Smart-seq3-master/ss3iso/ss3_isoform.py", line 109, in <module>
main()
File "/home/data/vip55/software/Smart-seq3-master/ss3iso/ss3_isoform.py", line 99, in main
fetch_gene_reads(in_bam_uniq, in_bam_multi, conf_data, op.species, out_path)
File "/home/data/vip55/software/Smart-seq3-master/ss3iso/pyModule/informative_reads.py", line 550, in fetch_gene_reads
report_genes = pool.map(func, genes, chunksize=1)
File "/home/data/vip55/miniconda3/envs/zUMIs-env/lib/python3.6/multiprocessing/pool.py", line 266, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/data/vip55/miniconda3/envs/zUMIs-env/lib/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
ValueError: file has no sequences defined (mode='rc') - is it SAM/BAM format? Consider opening with check_sq=False

The command I ran is:
$ python /home/data/vip55/software/Smart-seq3-master/ss3iso/ss3_isoform.py -i smartseq3_mouse_fibroblast.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam -e smartseq3_mouse_fibroblast -o ss3 -p 8 -s mm10 -P -R -c ss3_isoform.conf

What is going wrong?

Best,
Anna

@xucaoling

> Yes, this issue has been solved. Thanks!

Hi kwglam, how did you solve it?

@Shinichiro03

Hi xucaoling,

I have the same issue. Did you solve it?

Best,
Shin

cziegenhain reopened this Jul 1, 2022
@lamyankin

> Hi xucaoling,
>
> I also have the same issue. Do you solve the issue?
>
> Best, Shin

Hi Shinichiro03, have you solved the issue?

@kwglam
Author

kwglam commented Nov 29, 2022

@lamyankin, @xucaoling, and @Shinichiro03,
I have forgotten exactly what the problems were because I have not used it for quite a long time. As far as I recall, you have to stick with an old version of bedtools (v2.26 or older), and you have to change umi_file_prefix = 'UBfix.sort.bam' to umi_file_prefix = 'UBcorrected.sorted.bam' on line 67 of the ss3_isoform.py script. Hope it works.
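
The two points recalled above can be sketched as follows. This is an illustration under stated assumptions, not the actual ss3iso code: the helper names are hypothetical, the v2.26 limit is taken from the recollection above rather than from any documented requirement, and the output format of `bedtools --version` (for example `bedtools v2.26.0`) is assumed.

```python
import re
import subprocess

# The one-line change described above (line 67 of ss3_isoform.py per the comment).
umi_file_prefix = 'UBcorrected.sorted.bam'  # was 'UBfix.sort.bam'


def parse_bedtools_version(version_text):
    """Extract (major, minor, patch) from text such as 'bedtools v2.26.0'."""
    match = re.search(r"v(\d+)\.(\d+)(?:\.(\d+))?", version_text)
    if match is None:
        raise ValueError("no version number found in %r" % version_text)
    return tuple(int(part) if part else 0 for part in match.groups())


def bedtools_is_old_enough(version_text, newest_allowed=(2, 26)):
    """True if the version is at most v2.26 (limit assumed from the comment)."""
    return parse_bedtools_version(version_text)[:2] <= newest_allowed


def installed_bedtools_version():
    """Ask the bedtools on PATH for its version string (requires bedtools)."""
    result = subprocess.run(["bedtools", "--version"],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

With bedtools installed, `bedtools_is_old_enough(installed_bedtools_version())` would indicate whether the installed version falls within the range recalled above.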

@lokeshbio

After fixing the umi_file_prefix = 'UBcorrected.sorted.bam' problem, I get the following error! Does this look familiar? I couldn't quite figure out what the problem is!

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/project/ss3iso/pyModule/informative_reads.py", line 468, in _get_reads
    gobj.get_exon_coordinates(gene)
  File "/project/ss3iso/pyModule/informative_reads.py", line 64, in get_exon_coordinates
    gene_id = fds[-1].split(';')[0].split('=')[1]
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/project/ss3iso/ss3_isoform.py", line 112, in <module>
    main()
  File "/project/ss3iso/ss3_isoform.py", line 102, in main
    fetch_gene_reads(in_bam_uniq, in_bam_multi, conf_data, op.species, out_path)
  File "/project/ss3iso/pyModule/informative_reads.py", line 550, in fetch_gene_reads
    report_genes = pool.map(func, genes, chunksize=1)
  File "/opt/conda/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/opt/conda/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
IndexError: list index out of range
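
The failing line in the traceback takes the last field of an annotation line, keeps the text before the first `;`, and expects it to look like `key=value`. If that text contains no `=`, then `split('=')` returns a single element and index `[1]` raises this IndexError. One possible cause, which this thread does not confirm, is an annotation file whose attribute column uses GTF style (`gene_id "X";`) rather than `gene_id=X;`. The sketch below reproduces the behavior; it assumes `fds` holds the tab-separated fields of one line, and the gene ID and function names are made up.

```python
def strict_gene_id(line):
    """Mimic the failing line; assumes fds is the tab-split annotation line."""
    fds = line.rstrip("\n").split("\t")
    return fds[-1].split(';')[0].split('=')[1]


def tolerant_gene_id(line):
    """Hypothetical variant accepting both key=value and GTF-style attributes."""
    first_attr = line.rstrip("\n").split("\t")[-1].split(";")[0].strip()
    if "=" in first_attr:
        return first_attr.split("=", 1)[1]
    parts = first_attr.split(None, 1)  # e.g. ['gene_id', '"GENE1"']
    if len(parts) == 2:
        return parts[1].strip('"')
    return None


# Made-up example lines in the two attribute styles.
kv_line = "chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id=GENE1;transcript_id=TX1"
gtf_line = 'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "GENE1"; transcript_id "TX1";'

print(strict_gene_id(kv_line))     # key=value style parses
print(tolerant_gene_id(gtf_line))  # GTF style needs the tolerant variant
try:
    strict_gene_id(gtf_line)
except IndexError as err:
    print("IndexError:", err)
```

Checking the last column of the annotation file in use against these two styles would show whether this explanation applies.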
