Hello,
I am a beginner with Kaldi, and I am trying to fine-tune the daanzu model on the mini-librispeech data (just a simple trial) to understand the process.
I first prepared the data and computed MFCC features, and then used this script for fine-tuning: https://github.com/kaldi-asr/kaldi/blob/master/egs/aishell2/s5/local/nnet3/tuning/finetune_tdnn_1a.sh
I used https://github.com/daanzu/kaldi-active-grammar/releases/download/v1.8.0/tree_sp.zip since it is the tree directory for the most recent models (I mainly used the ali.x.gz files). My setup steps are sketched below.
I ran into the issue below:
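For reference, my preparation looked roughly like this (a minimal sketch; the exact paths and options are illustrative, not copied verbatim from my run):

# Prepare the mini-librispeech training data and compute hires MFCCs
# (illustrative directory names).
utils/copy_data_dir.sh data/train data/train_hires
steps/make_mfcc.sh --mfcc-config conf/mfcc_hires.conf --nj 4 data/train_hires
steps/compute_cmvn_stats.sh data/train_hires
utils/fix_data_dir.sh data/train_hires

# Unpack the daanzu tree/alignment directory where the recipe expects its
# --ali-dir (exp/train_ali matches the log below; the unzip target is my assumption).
unzip tree_sp.zip -d exp/train_ali

# Run the AISHELL-2 fine-tuning recipe.
bash local/nnet3/tuning/finetune_tdnn_1a.sh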
2021-05-30 14:27:20,165 [steps/nnet3/train_dnn.py:36 - - INFO ] Starting DNN trainer (train_dnn.py)
steps/nnet3/train_dnn.py --stage=-10 --cmd=run.pl --mem 4G --feat.cmvn-opts=--norm-means=false --norm-vars=false --trainer.input-model exp/nnet3/tdnn_sp_train/input.raw --trainer.num-epochs 5 --trainer.optimization.num-jobs-initial 1 --trainer.optimization.num-jobs-final 1 --trainer.optimization.initial-effective-lrate 0.0005 --trainer.optimization.final-effective-lrate 0.00002 --trainer.optimization.minibatch-size 1024 --feat-dir data/train_hires --lang data/lang --ali-dir exp/train_ali --dir exp/nnet3/tdnn_sp_train
['steps/nnet3/train_dnn.py', '--stage=-10', '--cmd=run.pl --mem 4G', '--feat.cmvn-opts=--norm-means=false --norm-vars=false', '--trainer.input-model', 'exp/nnet3/tdnn_sp_train/input.raw', '--trainer.num-epochs', '5', '--trainer.optimization.num-jobs-initial', '1', '--trainer.optimization.num-jobs-final', '1', '--trainer.optimization.initial-effective-lrate', '0.0005', '--trainer.optimization.final-effective-lrate', '0.00002', '--trainer.optimization.minibatch-size', '1024', '--feat-dir', 'data/train_hires', '--lang', 'data/lang', '--ali-dir', 'exp/train_ali', '--dir', 'exp/nnet3/tdnn_sp_train']
2021-05-30 14:27:20,172 [steps/nnet3/train_dnn.py:178 - train - INFO ] Arguments for the experiment
{'ali_dir': 'exp/train_ali',
'backstitch_training_interval': 1,
'backstitch_training_scale': 0.0,
'cleanup': True,
'cmvn_opts': '--norm-means=false --norm-vars=false',
'combine_sum_to_one_penalty': 0.0,
'command': 'run.pl --mem 4G',
'compute_per_dim_accuracy': False,
'dir': 'exp/nnet3/tdnn_sp_train',
'do_final_combination': True,
'dropout_schedule': None,
'egs_command': None,
'egs_dir': None,
'egs_opts': None,
'egs_stage': 0,
'email': None,
'exit_stage': None,
'feat_dir': 'data/train_hires',
'final_effective_lrate': 2e-05,
'frames_per_eg': 8,
'initial_effective_lrate': 0.0005,
'input_model': 'exp/nnet3/tdnn_sp_train/input.raw',
'lang': 'data/lang',
'max_lda_jobs': 10,
'max_models_combine': 20,
'max_objective_evaluations': 30,
'max_param_change': 2.0,
'minibatch_size': '1024',
'momentum': 0.0,
'num_epochs': 5.0,
'num_jobs_compute_prior': 10,
'num_jobs_final': 1,
'num_jobs_initial': 1,
'num_jobs_step': 1,
'online_ivector_dir': None,
'preserve_model_interval': 100,
'presoftmax_prior_scale_power': -0.25,
'prior_subset_size': 20000,
'proportional_shrink': 0.0,
'rand_prune': 4.0,
'remove_egs': True,
'reporting_interval': 0.1,
'samples_per_iter': 400000,
'shuffle_buffer_size': 5000,
'srand': 0,
'stage': -10,
'train_opts': [],
'use_gpu': 'yes'}
nnet3-info exp/nnet3/tdnn_sp_train/input.raw
2021-05-30 14:27:20,373 [steps/nnet3/train_dnn.py:238 - train - INFO ] Generating egs
steps/nnet3/get_egs.sh --cmd run.pl --mem 4G --cmvn-opts --norm-means=false --norm-vars=false --online-ivector-dir --left-context 34 --right-context 34 --left-context-initial -1 --right-context-final -1 --stage 0 --samples-per-iter 400000 --frames-per-eg 8 --srand 0 data/train_hires exp/train_ali exp/nnet3/tdnn_sp_train/egs
steps/nnet3/get_egs.sh: creating egs. To ensure they are not deleted later you can do: touch exp/nnet3/tdnn_sp_train/egs/.nodelete
steps/nnet3/get_egs.sh: feature type is raw, with 'apply-cmvn'
steps/nnet3/get_egs.sh: working out number of frames of training data
steps/nnet3/get_egs.sh: working out feature dim
*** steps/nnet3/get_egs.sh: warning: the --frames-per-eg is too large to generate one archive with
*** as many as --samples-per-iter egs in it. Consider reducing --frames-per-eg.
steps/nnet3/get_egs.sh: creating 1 archives, each with 238983 egs, with
steps/nnet3/get_egs.sh: 8 labels per example, and (left,right) context = (34,34)
steps/nnet3/get_egs.sh: copying data alignments
copy-int-vector ark:- ark,scp:exp/nnet3/tdnn_sp_train/egs/ali.ark,exp/nnet3/tdnn_sp_train/egs/ali.scp
ERROR (copy-int-vector[5.5.929~1539-9bca2]:ReadBasicType():base/io-funcs-inl.h:68) ReadBasicType: did not get expected integer type, 0 vs. 4. You can change this code to successfully read it later, if needed.
[ Stack-Trace: ]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x793) [0x7f27463671c3]
copy-int-vector(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x25) [0x5650dbd339dd]
copy-int-vector(kaldi::BasicVectorHolder::Read(std::istream&)+0xba9) [0x5650dbd3adcb]
copy-int-vector(kaldi::SequentialTableReaderArchiveImpl<kaldi::BasicVectorHolder >::Next()+0xf3) [0x5650dbd3b159]
copy-int-vector(main+0x484) [0x5650dbd3320d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f2745f5f0b3]
copy-int-vector(_start+0x2e) [0x5650dbd32cce]
WARNING (copy-int-vector[5.5.929~1539-9bca2]:Read():util/kaldi-holder-inl.h:308) BasicVectorHolder::Read, read error or unexpected data at archive entry beginning at file position 18446744073709551615
WARNING (copy-int-vector[5.5.929~1539-9bca2]:Next():util/kaldi-table-inl.h:574) Object read failed, reading archive standard input
LOG (copy-int-vector[5.5.929~1539-9bca2]:main():copy-int-vector.cc:83) Copied 2697018 vectors of int32.
ERROR (copy-int-vector[5.5.929~1539-9bca2]:~SequentialTableReaderArchiveImpl():util/kaldi-table-inl.h:678) TableReader: error detected closing archive standard input
[ Stack-Trace: ]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x793) [0x7f27463671c3]
copy-int-vector(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x25) [0x5650dbd339dd]
copy-int-vector(kaldi::SequentialTableReaderArchiveImpl<kaldi::BasicVectorHolder >::~SequentialTableReaderArchiveImpl()+0x121) [0x5650dbd3734d]
copy-int-vector(kaldi::SequentialTableReaderArchiveImpl<kaldi::BasicVectorHolder >::~SequentialTableReaderArchiveImpl()+0xd) [0x5650dbd3764d]
copy-int-vector(kaldi::SequentialTableReader<kaldi::BasicVectorHolder >::~SequentialTableReader()+0x16) [0x5650dbd37816]
copy-int-vector(main+0x520) [0x5650dbd332a9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f2745f5f0b3]
copy-int-vector(_start+0x2e) [0x5650dbd32cce]
terminate called after throwing an instance of 'kaldi::KaldiFatalError'
what(): kaldi::KaldiFatalError
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
steps/nnet3/get_egs.sh: line 272: 1101381 Exit 1        for id in $(seq $num_ali_jobs); do gunzip -c $alidir/ali.$id.gz; done
        1101382 Aborted        (core dumped) | copy-int-vector ark:- ark,scp:$dir/ali.ark,$dir/ali.scp
Traceback (most recent call last):
File "steps/nnet3/train_dnn.py", line 459, in main
train(args, run_opts)
File "steps/nnet3/train_dnn.py", line 253, in train
stage=args.egs_stage)
File "steps/libs/nnet3/train/frame_level_objf/acoustic_model.py", line 61, in generate_egs
egs_opts=egs_opts if egs_opts is not None else ''))
File "steps/libs/common.py", line 129, in execute_command
p.returncode, command))
Exception: Command exited with status 1: steps/nnet3/get_egs.sh --cmd "run.pl --mem 4G" --cmvn-opts "--norm-means=false --norm-vars=false" --online-ivector-dir "" --left-context 34 --right-context 34 --left-context-initial -1 --right-context-final -1 --stage 0 --samples-per-iter 400000 --frames-per-eg 8 --srand 0 data/train_hires exp/train_ali exp/nnet3/tdnn_sp_train/egs
(GPU) /scratch/asma-kaldi/egs/aishell2/s5$ bash local/nnet3/tuning/finetune_tdnn_1a.sh
2021-05-30 14:31:18,075 [steps/nnet3/train_dnn.py:36 - - INFO ] Starting DNN trainer (train_dnn.py)
steps/nnet3/train_dnn.py --stage=-10 --cmd=run.pl --mem 4G --feat.cmvn-opts=--norm-means=false --norm-vars=false --trainer.input-model exp/nnet3/tdnn_sp_train/input.raw --trainer.num-epochs 5 --trainer.optimization.num-jobs-initial 1 --trainer.optimization.num-jobs-final 1 --trainer.optimization.initial-effective-lrate 0.0005 --trainer.optimization.final-effective-lrate 0.00002 --trainer.optimization.minibatch-size 1024 --feat-dir data/train_hires --lang data/lang --ali-dir exp/train_ali --dir exp/nnet3/tdnn_sp_train
['steps/nnet3/train_dnn.py', '--stage=-10', '--cmd=run.pl --mem 4G', '--feat.cmvn-opts=--norm-means=false --norm-vars=false', '--trainer.input-model', 'exp/nnet3/tdnn_sp_train/input.raw', '--trainer.num-epochs', '5', '--trainer.optimization.num-jobs-initial', '1', '--trainer.optimization.num-jobs-final', '1', '--trainer.optimization.initial-effective-lrate', '0.0005', '--trainer.optimization.final-effective-lrate', '0.00002', '--trainer.optimization.minibatch-size', '1024', '--feat-dir', 'data/train_hires', '--lang', 'data/lang', '--ali-dir', 'exp/train_ali', '--dir', 'exp/nnet3/tdnn_sp_train']
2021-05-30 14:31:18,082 [steps/nnet3/train_dnn.py:178 - train - INFO ] Arguments for the experiment
{'ali_dir': 'exp/train_ali',
'backstitch_training_interval': 1,
'backstitch_training_scale': 0.0,
'cleanup': True,
'cmvn_opts': '--norm-means=false --norm-vars=false',
'combine_sum_to_one_penalty': 0.0,
'command': 'run.pl --mem 4G',
'compute_per_dim_accuracy': False,
'dir': 'exp/nnet3/tdnn_sp_train',
'do_final_combination': True,
'dropout_schedule': None,
'egs_command': None,
'egs_dir': None,
'egs_opts': None,
'egs_stage': 0,
'email': None,
'exit_stage': None,
'feat_dir': 'data/train_hires',
'final_effective_lrate': 2e-05,
'frames_per_eg': 8,
'initial_effective_lrate': 0.0005,
'input_model': 'exp/nnet3/tdnn_sp_train/input.raw',
'lang': 'data/lang',
'max_lda_jobs': 10,
'max_models_combine': 20,
'max_objective_evaluations': 30,
'max_param_change': 2.0,
'minibatch_size': '1024',
'momentum': 0.0,
'num_epochs': 5.0,
'num_jobs_compute_prior': 10,
'num_jobs_final': 1,
'num_jobs_initial': 1,
'num_jobs_step': 1,
'online_ivector_dir': None,
'preserve_model_interval': 100,
'presoftmax_prior_scale_power': -0.25,
'prior_subset_size': 20000,
'proportional_shrink': 0.0,
'rand_prune': 4.0,
'remove_egs': True,
'reporting_interval': 0.1,
'samples_per_iter': 400000,
'shuffle_buffer_size': 5000,
'srand': 0,
'stage': -10,
'train_opts': [],
'use_gpu': 'yes'}
nnet3-info exp/nnet3/tdnn_sp_train/input.raw
2021-05-30 14:31:18,286 [steps/nnet3/train_dnn.py:238 - train - INFO ] Generating egs
steps/nnet3/get_egs.sh --cmd run.pl --mem 4G --cmvn-opts --norm-means=false --norm-vars=false --online-ivector-dir --left-context 34 --right-context 34 --left-context-initial -1 --right-context-final -1 --stage 0 --samples-per-iter 400000 --frames-per-eg 8 --srand 0 data/train_hires exp/train_ali exp/nnet3/tdnn_sp_train/egs
steps/nnet3/get_egs.sh: creating egs. To ensure they are not deleted later you can do: touch exp/nnet3/tdnn_sp_train/egs/.nodelete
steps/nnet3/get_egs.sh: feature type is raw, with 'apply-cmvn'
steps/nnet3/get_egs.sh: working out number of frames of training data
steps/nnet3/get_egs.sh: working out feature dim
*** steps/nnet3/get_egs.sh: warning: the --frames-per-eg is too large to generate one archive with
*** as many as --samples-per-iter egs in it. Consider reducing --frames-per-eg.
steps/nnet3/get_egs.sh: creating 1 archives, each with 238983 egs, with
steps/nnet3/get_egs.sh: 8 labels per example, and (left,right) context = (34,34)
steps/nnet3/get_egs.sh: copying data alignments
copy-int-vector ark:- ark,scp:exp/nnet3/tdnn_sp_train/egs/ali.ark,exp/nnet3/tdnn_sp_train/egs/ali.scp
ERROR (copy-int-vector[5.5.929~1539-9bca2]:ReadBasicType():base/io-funcs-inl.h:68) ReadBasicType: did not get expected integer type, 0 vs. 4. You can change this code to successfully read it later, if needed.
[ Stack-Trace: ]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x793) [0x7f3acf1181c3]
copy-int-vector(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x25) [0x55f3b30bd9dd]
copy-int-vector(kaldi::BasicVectorHolder::Read(std::istream&)+0xba9) [0x55f3b30c4dcb]
copy-int-vector(kaldi::SequentialTableReaderArchiveImpl<kaldi::BasicVectorHolder >::Next()+0xf3) [0x55f3b30c5159]
copy-int-vector(main+0x484) [0x55f3b30bd20d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f3aced100b3]
copy-int-vector(_start+0x2e) [0x55f3b30bccce]
WARNING (copy-int-vector[5.5.929~1539-9bca2]:Read():util/kaldi-holder-inl.h:308) BasicVectorHolder::Read, read error or unexpected data at archive entry beginning at file position 18446744073709551615
WARNING (copy-int-vector[5.5.929~1539-9bca2]:Next():util/kaldi-table-inl.h:574) Object read failed, reading archive standard input
LOG (copy-int-vector[5.5.929~1539-9bca2]:main():copy-int-vector.cc:83) Copied 2697018 vectors of int32.
ERROR (copy-int-vector[5.5.929~1539-9bca2]:~SequentialTableReaderArchiveImpl():util/kaldi-table-inl.h:678) TableReader: error detected closing archive standard input
[ Stack-Trace: ]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x793) [0x7f3acf1181c3]
copy-int-vector(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x25) [0x55f3b30bd9dd]
copy-int-vector(kaldi::SequentialTableReaderArchiveImpl<kaldi::BasicVectorHolder >::~SequentialTableReaderArchiveImpl()+0x121) [0x55f3b30c134d]
copy-int-vector(kaldi::SequentialTableReaderArchiveImpl<kaldi::BasicVectorHolder >::~SequentialTableReaderArchiveImpl()+0xd) [0x55f3b30c164d]
copy-int-vector(kaldi::SequentialTableReader<kaldi::BasicVectorHolder >::~SequentialTableReader()+0x16) [0x55f3b30c1816]
copy-int-vector(main+0x520) [0x55f3b30bd2a9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f3aced100b3]
copy-int-vector(_start+0x2e) [0x55f3b30bccce]
terminate called after throwing an instance of 'kaldi::KaldiFatalError'
what(): kaldi::KaldiFatalError
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
gzip: stdout: Broken pipe
steps/nnet3/get_egs.sh: line 272: 1101630 Exit 1        for id in $(seq $num_ali_jobs); do gunzip -c $alidir/ali.$id.gz; done
        1101631 Aborted        (core dumped) | copy-int-vector ark:- ark,scp:$dir/ali.ark,$dir/ali.scp
Traceback (most recent call last):
File "steps/nnet3/train_dnn.py", line 459, in main
train(args, run_opts)
File "steps/nnet3/train_dnn.py", line 253, in train
stage=args.egs_stage)
File "steps/libs/nnet3/train/frame_level_objf/acoustic_model.py", line 61, in generate_egs
egs_opts=egs_opts if egs_opts is not None else ''))
File "steps/libs/common.py", line 129, in execute_command
p.returncode, command))
Exception: Command exited with status 1: steps/nnet3/get_egs.sh --cmd "run.pl --mem 4G" --cmvn-opts "--norm-means=false --norm-vars=false" --online-ivector-dir "" --left-context 34 --right-context 34 --left-context-initial -1 --right-context-final -1 --stage 0 --samples-per-iter 400000 --frames-per-eg 8 --srand 0 data/train_hires exp/train_ali exp/nnet3/tdnn_sp_train/egs
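If I read the log correctly, the step that actually fails is this pipeline from line 272 of steps/nnet3/get_egs.sh (shown here with $alidir and $dir substituted from the log above; $num_ali_jobs is just the number of ali.*.gz files):

# The alignment-copying step that aborts (reconstructed from the log):
for id in $(seq $num_ali_jobs); do gunzip -c exp/train_ali/ali.$id.gz; done | \
  copy-int-vector ark:- ark,scp:exp/nnet3/tdnn_sp_train/egs/ali.ark,exp/nnet3/tdnn_sp_train/egs/ali.scp

In case it helps, I assume a single archive can be inspected in isolation with something like:

# Dump one alignment archive as text to eyeball its format (my assumption).
copy-int-vector "ark:gunzip -c exp/train_ali/ali.1.gz |" ark,t:- | head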
Any ideas, please? (For information, I am using two GPUs with 25 GB of memory each.)