
Help with halPhyloPTrain.py #274

Open
xiaoyezao opened this issue Jun 15, 2023 · 7 comments

@xiaoyezao

Hello, I got the following error from halPhyloPTrain.py. Could you please help me debug it?

halPhyloPTrain.py $hal_file $reference $neutralRegions.bed $neutralModel.mod --numProc 12

Reading alignment from Lactuca_neutralModel_halPhyloPTrain_temp_NADOHCS_Lactuca_neutralModel_halPhyloPTrain_temp_NADOHCS_Lsat_1_Genome_v11.01.annotation_Maker.gff.tidy.chr8.4d4d.maf ...
ERROR msa_reorder_rows: covered[new_to_old[5]]=1 should be 0
Traceback (most recent call last):
  File "/home/CBS2021/app/cactus-bin-v2.5.2/bin/halPhyloPTrain.py", line 260, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/CBS2021/app/cactus-bin-v2.5.2/bin/halPhyloPTrain.py", line 257, in main
    computeModel(args)
  File "/home/CBS2021/app/cactus-bin-v2.5.2/bin/halPhyloPTrain.py", line 134, in computeModel
    computeAgMAFStats(options)
  File "/home/CBS2021/app/cactus-bin-v2.5.2/bin/halPhyloPTrain.py", line 103, in computeAgMAFStats
    runShellCommand("msa_view -o SS -z --in-format MAF --aggregate %s %s > %s" % (
  File "/home/CBS2021/app/cactus-bin-v2.5.2/lib/hal/stats/halStats.py", line 27, in runShellCommand
    raise RuntimeError("Command: %s exited with non-zero status %i" %
RuntimeError: Command: msa_view -o SS -z --in-format MAF --aggregate Anc0,T.koksaghyz,Anc1,L.virosa,Anc2,L.saligna,Anc3,L.serriola,L.sativa Lactuca_neutralModel_halPhyloPTrain_temp_NADOHCS*.maf > Lactuca_neutralModel_halPhyloPTrain_temp_NADOHCS.ss exited with non-zero status 1

The head of the intermediate *.maf file is:

##maf version=1 scoring=N/A

a
s       L.sativa.Lsat_1_v11_chr8        94777886        1       +       343517054       T
s       Anc0.Anc0refChr3        163905  1       +       189369  T
s       Anc1.Anc1refChr1593     44277   1       +       81021   T
s       Anc2.Anc2refChr2805     62794   1       -       240409  T
s       Anc3.Anc3refChr2040     413548  1       +       2493988 T
s       L.saligna.chr8  142086748       1       -       238633233       T
s       L.serriola.Lser_1_US96UC23_v10_chr8     93752352        1       +       329941051       T
s       L.virosa.Lvir_CGN04683_V4_scf4  186602678       1       -       342977835       T
s       T.koksaghyz.GWHBCHF00000009     5976022 1       +       111408619       T

a
s       L.sativa.Lsat_1_v11_chr8        94777904        1       +       343517054       T
s       Anc0.Anc0refChr3        163923  1       +       189369  T
s       Anc1.Anc1refChr1593     44295   1       +       81021   T
s       Anc2.Anc2refChr2805     62812   1       -       240409  T
s       Anc3.Anc3refChr2040     413566  1       +       2493988 T
s       L.saligna.chr8  142086766       1       -       238633233       T
s       L.serriola.Lser_1_US96UC23_v10_chr8     93752370        1       +       329941051       T
s       L.virosa.Lvir_CGN04683_V4_scf4  186602696       1       -       342977835       T
s       T.koksaghyz.GWHBCHF00000009     5976040 1       +       111408619       T

a
s       L.sativa.Lsat_1_v11_chr8        94777919        1       +       343517054       C
s       Anc0.Anc0refChr3        163938  1       +       189369  C
s       Anc1.Anc1refChr1593     44310   1       +       81021   C
s       Anc2.Anc2refChr2805     62827   1       -       240409  C
s       Anc3.Anc3refChr2040     413581  1       +       2493988 C
s       L.saligna.chr8  142086781       1       -       238633233       C
s       L.serriola.Lser_1_US96UC23_v10_chr8     93752385        1       +       329941051       C
s       L.virosa.Lvir_CGN04683_V4_scf4  186602711       1       -       342977835       C
s       T.koksaghyz.GWHBCHF00000009     5976055 1       +       111408619       C
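
For reference, the failing step can be re-run by hand on the intermediate files (a sketch using the exact command from the traceback above; this may help isolate which MAF block trips msa_reorder_rows):

msa_view -o SS -z --in-format MAF \
    --aggregate Anc0,T.koksaghyz,Anc1,L.virosa,Anc2,L.saligna,Anc3,L.serriola,L.sativa \
    Lactuca_neutralModel_halPhyloPTrain_temp_NADOHCS*.maf > Lactuca_neutralModel_halPhyloPTrain_temp_NADOHCS.ss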
@glennhickey
Collaborator

I worry that the halPhyloP tools have gone stale, since they haven't been tested in a long time. Reviving them is a project we are discussing. In the meantime, I think you'd be best served by using cactus-hal2maf --dupeMode single to export a single-copy MAF, then running phyloP directly on that MAF.
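
Roughly like this (a sketch only, not a tested recipe: the file names are placeholders, and the phyloP invocation assumes PHAST is installed and that you already have a neutral model, e.g. fitted with phyloFit):

cactus-hal2maf ./js alignment.hal alignment.maf --refGenome L.sativa --dupeMode single --chunkSize 1000000
phyloP --msa-format MAF --method LRT --mode CONACC --wig-scores neutral.mod alignment.maf > phylop.wig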

@GeorgeBGM

Hi, I want to do something similar. As a first step I used cactus-hal2maf --dupeMode single to export a single-copy MAF, but it produced the error below. How should I solve this?

cactus-hal2maf js --workDir work --maxCores 5 --dupeMode single mc.full.hal mc.full.maf --chunkSize 1000000 --refGenome Chimpanzee

[screenshot of the error message; not reproduced here]

@GeorgeBGM

Hi, are there any suggestions? I'm looking forward to your reply.

@glennhickey
Collaborator

Please share your full log.

@GeorgeBGM

Hi, the following is the full error message.

[2023-06-29T20:49:19+0800] [MainThread] [I] [toil.statsAndLogging] Setting batchCores to 20
[2023-06-29T20:49:19+0800] [MainThread] [I] [toil.statsAndLogging] Enabling realtime logging in Toil
[2023-06-29T20:49:19+0800] [MainThread] [I] [toil.statsAndLogging] Cactus Command: /home/ddu/Software/Anaconda/mambaforge-pypy3/envs/MC/bin/cactus-hal2maf /home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/js --workDir /home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/work --maxCores 20 --dupeMode single /home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/Merge-V1-mc.full.hal /home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/Merge-V1-mc.full.maf --chunkSize 3 --refGenome Chimpanzee --restart
[2023-06-29T20:49:19+0800] [MainThread] [I] [toil.statsAndLogging] Cactus Commit: a33d3eabb909873746ecd8e7e1528344e526d95b
[2023-06-29T20:49:19+0800] [MainThread] [I] [toil.statsAndLogging] Using default batch count of 1
[2023-06-29T20:49:20+0800] [MainThread] [C] [toil.jobStores.abstractJobStore] Repairing job: kind-hal2maf_batch/instance-88w6p79f
[2023-06-29T20:49:20+0800] [MainThread] [I] [toil] Running Toil version 5.11.0a1-ee11d4bc8e9a0d38c636208d0090c619bce76a4b on host cpu13.
[2023-06-29T20:49:20+0800] [MainThread] [I] [toil.realtimeLogger] Starting real-time logging.
[2023-06-29T20:49:22+0800] [MainThread] [I] [toil.leader] Issued job 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v4 with job batch system ID: 1 and disk: 675.0 Gi, memory: 2.0 Gi, cores: 20, accelerators: [], preemptible: False
[2023-06-29T20:49:24+0800] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2023-06-29T20:49:31+0800] [MainThread] [I] [toil.worker] Redirecting logging to /home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/work/25669f8a9a105c5fa94e0c82e81877df/39e5/worker_log.txt
[2023-06-29T20:52:27+0800] [MainThread] [I] [toil-rt] Reading HAL file from job store to /home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/work/25669f8a9a105c5fa94e0c82e81877df/39e5/d115/tmp7k0r51a5/Merge-V1-mc.full.hal
[2023-06-29T21:49:26+0800] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2023-06-29T22:25:47+0800] [Thread-1 ] [E] [toil.batchSystems.singleMachine] Got exit code -15 (indicating failure) from job _toil_worker hal2maf_batch file:/home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/js kind-hal2maf_batch/instance-88w6p79f.
[2023-06-29T22:25:48+0800] [MainThread] [W] [toil.leader] Job failed with exit value -15: 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v4
Exit reason: None
[2023-06-29T22:25:48+0800] [MainThread] [W] [toil.leader] No log file is present, despite job failing: 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v4
[2023-06-29T22:25:48+0800] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v4 with ID kind-hal2maf_batch/instance-88w6p79f to 1
[2023-06-29T22:25:48+0800] [MainThread] [I] [toil.leader] Issued job 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v5 with job batch system ID: 2 and disk: 675.0 Gi, memory: 2.0 Gi, cores: 20, accelerators: [], preemptible: False
[2023-06-29T22:26:48+0800] [MainThread] [I] [toil.worker] Redirecting logging to /home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/work/25669f8a9a105c5fa94e0c82e81877df/f0bd/worker_log.txt
[2023-06-29T22:30:02+0800] [MainThread] [I] [toil-rt] Reading HAL file from job store to /home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/work/25669f8a9a105c5fa94e0c82e81877df/f0bd/c076/tmpnwx0ah9k/Merge-V1-mc.full.hal
[2023-06-29T22:49:27+0800] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2023-06-29T23:30:45+0800] [Thread-1 ] [E] [toil.batchSystems.singleMachine] Got exit code -15 (indicating failure) from job _toil_worker hal2maf_batch file:/home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/js kind-hal2maf_batch/instance-88w6p79f.
[2023-06-29T23:30:45+0800] [MainThread] [W] [toil.leader] Job failed with exit value -15: 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v5
Exit reason: None
[2023-06-29T23:30:45+0800] [MainThread] [W] [toil.leader] No log file is present, despite job failing: 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v5
[2023-06-29T23:30:46+0800] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v5 with ID kind-hal2maf_batch/instance-88w6p79f to 0
[2023-06-29T23:30:46+0800] [MainThread] [W] [toil.leader] Job 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v6 is completely failed
[2023-06-29T23:31:28+0800] [MainThread] [I] [toil.leader] Finished toil run with 3 failed jobs.
[2023-06-29T23:31:28+0800] [MainThread] [I] [toil.leader] Failed jobs at end of the run: 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v6 'hal2maf_workflow' kind-hal2maf_workflow/instance-v3_4s4en v2 'hal2maf_all' kind-hal2maf_ranges/instance-g1d_w2o2 v5
[2023-06-29T23:31:28+0800] [MainThread] [I] [toil.realtimeLogger] Stopping real-time logging server.
[2023-06-29T23:31:29+0800] [MainThread] [I] [toil.realtimeLogger] Joining real-time logging server thread.
Traceback (most recent call last):
  File "/home/ddu/Software/Anaconda/mambaforge-pypy3/envs/MC/bin/cactus-hal2maf", line 8, in <module>
    sys.exit(main())
  File "/home/ddu/Software/Anaconda/mambaforge-pypy3/envs/MC/lib/python3.9/site-packages/cactus/maf/cactus_hal2maf.py", line 173, in main
    maf_id = toil.restart()
  File "/home/ddu/Software/Anaconda/mambaforge-pypy3/envs/MC/lib/python3.9/site-packages/toil/common.py", line 1101, in restart
    return self._runMainLoop(rootJobDescription)
  File "/home/ddu/Software/Anaconda/mambaforge-pypy3/envs/MC/lib/python3.9/site-packages/toil/common.py", line 1511, in _runMainLoop
    return Leader(config=self.config,
  File "/home/ddu/Software/Anaconda/mambaforge-pypy3/envs/MC/lib/python3.9/site-packages/toil/leader.py", line 289, in run
    raise FailedJobsException(self.jobStore, failed_jobs, exit_code=self.recommended_fail_exit_code)
toil.exceptions.FailedJobsException: The job store '/home/ddu/Project/Test/001.Merge_Test_V1/Merge-V1_Detail/js' contains 3 failed jobs: 'hal2maf_batch' kind-hal2maf_batch/instance-88w6p79f v6, 'hal2maf_workflow' kind-hal2maf_workflow/instance-v3_4s4en v2, 'hal2maf_all' kind-hal2maf_ranges/instance-g1d_w2o2 v5
Command exited with non-zero status 1
Command being timed: "bash 002.cmd_test_all_Phylo.sh"
User time (seconds): 8676.49
System time (seconds): 850.47
Percent of CPU this job got: 96%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:43:51
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 442428960
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 829
Minor (reclaiming a frame) page faults: 242907409
Voluntary context switches: 1087443
Involuntary context switches: 30726
Swaps: 0
File system inputs: 67947858
File system outputs: 106817984
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 1

@glennhickey
Collaborator

Sorry, I can't really make that out. I've had some problems with --restart in the past, and I'm guessing that option is what's at issue here. I will also say that --chunkSize 3 is far too small and will certainly not help: if, say, you were working on a human-sized alignment, it would ask for roughly 1,000,000,000 hal2maf jobs of 3 bp each, which would certainly crash your filesystem.
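
The arithmetic behind that estimate (illustrative, assuming a ~3.1 Gb human genome):

jobs ≈ genome length / chunkSize
--chunkSize 3:        3.1e9 / 3    ≈ 1.0e9 jobs
--chunkSize 1000000:  3.1e9 / 1e6  ≈ 3,100 jobs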

@GeorgeBGM

Hi, I will rerun this code using the --chunkSize 1000000 parameter.
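
That is, something like this (same paths as my first attempt, starting from a fresh job store rather than --restart):

cactus-hal2maf js --workDir work --maxCores 20 --dupeMode single mc.full.hal mc.full.maf --chunkSize 1000000 --refGenome Chimpanzee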
