Describe the issue
Running the mfa train command with the --temporary_directory option raises a UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 37: invalid start byte
The error does not occur when this option is omitted. The --output_directory option works without problems.
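For readers unfamiliar with this error, here is a minimal sketch of what the message means, independent of MFA. The byte 0xb0 can only appear as a continuation byte in UTF-8, so a strict decoder rejects it when it occurs where a character should start. The Latin-1 reading shown here is only an illustration; the actual encoding of the offending file is not known from this report.

```python
# A lone 0xb0 byte is not valid UTF-8 on its own.
raw = b"\xb0"

try:
    raw.decode("utf-8")
    message = ""
except UnicodeDecodeError as exc:
    # Same error class and wording as in the traceback in this issue.
    message = str(exc)

print(message)

# Illustration only: in Latin-1 the same byte is the degree sign.
print(raw.decode("latin-1"))

# A tolerant decode substitutes U+FFFD instead of raising.
print(repr(raw.decode("utf-8", errors="replace")))
```

This suggests that some file read during log parsing contains bytes written in a non-UTF-8 encoding, although the report does not establish which file or why.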
What language is the corpus in? This problem occurs in Bislama & Tok Pisin corpora
How many files/speakers? BIS: 27 speakers, 30 files; TPI: 52 speakers, 55 files
Are you using lab files or TextGrid files for input? TextGrid
Dictionary
Are you using a dictionary from MFA? If so, which one? No
If it's a custom dictionary, what is the phoneset? It is just the roman alphabet as well as 'ng'
Acoustic model
If you're using an acoustic model, is it one downloaded through MFA? If so, which one? No
If it's a model you've trained, what data was it trained on? The same data as above
Log file
I searched through all the log files, but they either do not exist or report that the step completed successfully. I have provided the traceback below.
Traceback:
/opt/miniconda3/envs/aligner/bin/mfa train \
--temporary_directory "/Volumes/PassmoreSSD/PacificCreoles/BIS/acoustic_model" \
--clean \
--verbose \
--debug \
"/Volumes/PassmoreSSD/PacificCreoles/BIS/training" \
"/Volumes/PassmoreSSD/PacificCreoles/BIS/pronunciation_dictionary.txt" \
"/Volumes/PassmoreSSD/PacificCreoles/BIS/BIS_acoustic_model.zip"
INFO Setting up corpus information...
INFO Loading corpus from source files...
52% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 52/100 [ 0:00:01 < -:--:-- , ? it/s ]
INFO Found 27 speakers across 52 files, average number of utterances per speaker: 271.51851851851853
INFO Initializing multiprocessing jobs...
INFO Normalizing text...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7,331/7,331 [ 0:00:02 < 0:00:00 , 6,049 it/s ]
INFO Generating MFCCs...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7,331/7,331 [ 0:00:49 < 0:00:00 , 147 it/s ]
INFO Calculating CMVN...
INFO Generating final features...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7,331/7,331 [ 0:00:03 < 0:00:00 , 4,528 it/s ]
INFO Creating corpus split...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7,331/7,331 [ 0:00:02 < 0:00:00 , 6,225 it/s ]
INFO Filtering utterances for training...
INFO Initializing training for monophone...
INFO Compiling training graphs...
INFO Generating initial alignments...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7,334/7,331 [ 0:00:07 < 0:00:00 , 2,040 it/s ]
INFO Initialization complete!
INFO monophone - Iteration 1 of 40
INFO Generating alignments...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7,331/7,331 [ 0:01:20 < 0:00:00 , 58 it/s ]
INFO Accumulating statistics...
68% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━ 4,976/7,331 [ 0:00:03 < 0:00:01 , 2,628 it/s ]
ERROR There was an error in the run, please see the log.
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/opt/miniconda3/envs/aligner/bin/mfa", line 10, in <module>
    sys.exit(mfa_cli())
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/rich_click/rich_command.py", line 367, in __call__
    return super().__call__(*args, **kwargs)
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/rich_click/rich_command.py", line 152, in main
    rv = self.invoke(ctx)
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/command_line/train_acoustic_model.py", line 151, in train_acoustic_model_cli
    trainer.train()
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/acoustic_modeling/trainer.py", line 607, in train
    trainer.train()
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/acoustic_modeling/base.py", line 395, in train
    self.train_iteration()
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/acoustic_modeling/base.py", line 370, in train_iteration
    parse_logs(self.working_log_directory)
  File "/opt/miniconda3/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/utils.py", line 364, in parse_logs
    for line in f:
  File "/opt/miniconda3/envs/aligner/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 37: invalid start byte
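The traceback shows the failure occurring while parse_logs iterates over a file in the working log directory, but it does not name the file. One way to narrow this down is to scan the temporary directory for files that are not valid UTF-8. The following is a rough sketch and not part of MFA: the helper name find_non_utf8_files is hypothetical, and the default "*.log" pattern is an assumption about how the log files are named.

```python
from pathlib import Path


def find_non_utf8_files(root, pattern="*.log"):
    """Return (path, byte_offset, offending_byte) for each matching file
    under root that cannot be decoded as strict UTF-8.

    Hypothetical helper; the "*.log" default is an assumption about
    log file naming and may need adjusting.
    """
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        data = path.read_bytes()
        try:
            data.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte
            # within the whole file, since the file is decoded in one go.
            hits.append((path, exc.start, data[exc.start]))
    return hits
```

Calling find_non_utf8_files on the directory passed to --temporary_directory and printing each hit would show which file, if any, contains the 0xb0 byte. The offset reported here is relative to the start of the file, so it need not match the "position 37" in the traceback, which comes from a decoder working on a buffered chunk.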
Desktop (please complete the following information):
OS: macOS
Version: 13.3.1 (a) (22E772610a) - M1 Processor
Debugging checklist
[X] Have you read the troubleshooting page (https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/troubleshooting.html) and searched the documentation to ensure that your issue is not addressed there?
[X] Have you updated to the latest MFA version (check https://montreal-forced-aligner.readthedocs.io/en/latest/changelog/changelog_3.0.html)? What is the output of mfa version?
[X] Have you tried rerunning the command with the --clean flag?