Refactoring #4
That was fast! Thanks!
)

# TODO: Use eos_id as ignore_id.
# tgt_key_padding_mask = decoder_padding_mask(ys_in_pad, ignore_id=eos_id)
It is commented out since existing models are trained with it disabled.
If it is enabled, the WER becomes worse.
We should enable it when we start to train a new model.
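To illustrate what the commented-out line would compute, here is a minimal sketch in plain Python, using nested lists instead of tensors. The helper `padding_mask_sketch` is a hypothetical stand-in, not the `decoder_padding_mask` from this pull request, and the token IDs and the `eos_id` value are made up for the example.

```python
from typing import List


def padding_mask_sketch(ys_in_pad: List[List[int]], ignore_id: int) -> List[List[bool]]:
    """Return True at positions holding ignore_id (treated as padding).

    Hypothetical stand-in for illustration only; the real helper in the
    pull request presumably operates on tensors.
    """
    return [[token == ignore_id for token in row] for row in ys_in_pad]


# Made-up batch: two target sequences padded to length 5 with eos_id.
eos_id = 1
ys_in_pad = [
    [7, 8, 9, 1, 1],
    [5, 6, 1, 1, 1],
]
mask = padding_mask_sketch(ys_in_pad, ignore_id=eos_id)
for row in mask:
    print(row)
```

In PyTorch's attention layers, a boolean key padding mask excludes the positions marked True. With the mask left disabled, the decoder also attends to the padded positions, which is the setting the existing models were trained with.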
The following is the WER from the model trained by #3 and decoded with this pull-request:
Epochs 14-26 are used in model averaging.
I have uploaded the above checkpoints to https://huggingface.co/csukuangfj/conformer_ctc/tree/main
To reproduce the decoding result:
The results are expected to become better if trained with more epochs.
Great!!
On Tue, Aug 3, 2021 at 8:16 PM Fangjun Kuang wrote:
The following is the WER from the model trained by #3 and decoded with this pull-request:
(With n-gram LM rescoring and attention decoder. The model is trained for 26 epochs.)
For test-clean, the WERs of different settings are:
ngram_lm_scale_0.7_attention_scale_0.6 2.96 best for test-clean
ngram_lm_scale_0.9_attention_scale_0.5 2.96
ngram_lm_scale_0.7_attention_scale_0.5 2.97
ngram_lm_scale_0.7_attention_scale_0.7 2.97
ngram_lm_scale_0.9_attention_scale_0.6 2.97
ngram_lm_scale_0.9_attention_scale_0.7 2.97
ngram_lm_scale_0.9_attention_scale_0.9 2.97
ngram_lm_scale_1.0_attention_scale_0.7 2.97
ngram_lm_scale_1.0_attention_scale_0.9 2.97
ngram_lm_scale_1.0_attention_scale_1.0 2.97
ngram_lm_scale_1.0_attention_scale_1.1 2.97
ngram_lm_scale_1.0_attention_scale_1.2 2.97
ngram_lm_scale_1.0_attention_scale_1.3 2.97
ngram_lm_scale_1.1_attention_scale_0.9 2.97
Epochs 14-26 are used in model averaging.
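The setting names in the table encode the two scales used in rescoring. As a small sketch, the snippet below parses a few of those names and picks the lowest WER. The function `parse_setting` is a hypothetical helper written for this illustration, and only a handful of rows are copied from the table.

```python
from typing import Dict, Tuple


def parse_setting(key: str) -> Tuple[float, float]:
    """Split 'ngram_lm_scale_X_attention_scale_Y' into (X, Y)."""
    left, right = key.split("_attention_scale_")
    ngram_lm_scale = float(left[len("ngram_lm_scale_"):])
    return ngram_lm_scale, float(right)


# A few rows copied from the table above (setting -> WER on test-clean).
wers: Dict[str, float] = {
    "ngram_lm_scale_0.7_attention_scale_0.6": 2.96,
    "ngram_lm_scale_0.9_attention_scale_0.5": 2.96,
    "ngram_lm_scale_0.7_attention_scale_0.5": 2.97,
    "ngram_lm_scale_1.0_attention_scale_1.3": 2.97,
}

# min() keeps the first entry on ties, so insertion order decides
# between settings with equal WER.
best_key = min(wers, key=wers.get)
print(parse_setting(best_key), wers[best_key])
```

The WERs differ by at most 0.01 across the listed settings, so the result is not very sensitive to the exact scales in this range.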
------------------------------
I have uploaded the above checkpoints to
https://huggingface.co/csukuangfj/conformer_ctc/tree/main
To reproduce the decoding result:
1. clone the above repo containing checkpoints and put it into conformer_ctc/exp/
2. after step 1, you should have conformer_ctc/exp/epoch-{14,15,...,26}.pt
3. run
   ./prepare.sh
   ./conformer_ctc/decode.py --epoch 26 --avg 13 --max-duration=50
4. You should get the above result.
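The relation between `--epoch 26 --avg 13` and "Epochs 14-26" can be sketched as follows. This is a toy illustration, not the code in `decode.py`: it assumes that `--avg` counts checkpoints backwards from `--epoch` and that averaging is a plain arithmetic mean of parameters. Both function names are hypothetical, and floats stand in for tensors.

```python
from typing import Dict, List


def epochs_to_average(epoch: int, avg: int) -> List[int]:
    """Epochs covered when averaging `avg` checkpoints ending at `epoch`."""
    start = epoch - avg + 1
    return list(range(start, epoch + 1))


def average_params(checkpoints: List[Dict[str, float]]) -> Dict[str, float]:
    """Element-wise mean of parameters; plain floats stand in for tensors."""
    n = len(checkpoints)
    return {name: sum(c[name] for c in checkpoints) / n for name in checkpoints[0]}


# Under the stated assumption, this lists epochs 14 through 26.
print(epochs_to_average(epoch=26, avg=13))

# Toy checkpoints with a single made-up parameter "w".
toy = [{"w": 1.0}, {"w": 2.0}, {"w": 3.0}]
print(average_params(toy))
```

Under these assumptions, the 13 selected epochs match the files `epoch-{14,15,...,26}.pt` from step 2.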
Nice! I'm curious -- did you ever try to run the same thing but with MMI instead of CTC?
Yes, I am planning to do that with a pretrained P. All the related code can be found in snowfall.
Merging it to avoid conflicts.
TODOs