Issue Urdu-English translation #19

ahmedraza1235 · 2020-07-16T02:17:44Z

Hi Mikel!
I applied all the steps your toolkit requires, as described in the paper, to an Urdu-English corpus, but I get a very poor BLEU score, around 0.5 or 0.9.
Data preprocessing:
Step 1) On the monolingual data I applied tokenization, truecasing, and cleaning to sentence lengths of 1-50 with Moses.
Step 2) I trained word embeddings with word2vec (parameters: epochs=5, window_size=5, dimension=300), then applied MUSE for alignment, mapping them into a shared space with VecMap.
The size of my corpus is 13k. (Is that enough?)
My first question is whether this toolkit supports the Urdu language.
Second, I used the toolkit's default parameters.
If the parameters affect model training, kindly share the details.
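
For reference, the length cleaning in step 1 is roughly what the sketch below does. It is only a minimal Python illustration and not the Moses script itself: the function name `filter_by_length` and its parameters are hypothetical, and it assumes the sentences are already tokenized so that tokens are separated by whitespace.

```python
def filter_by_length(lines, min_len=1, max_len=50):
    """Keep sentences whose whitespace-token count lies in [min_len, max_len].

    Hypothetical helper for illustration; assumes pre-tokenized input.
    """
    kept = []
    for line in lines:
        n_tokens = len(line.split())
        if min_len <= n_tokens <= max_len:
            kept.append(line.strip())
    return kept


if __name__ == "__main__":
    sample = [
        "",                        # empty line, dropped
        "hello world",             # 2 tokens, kept
        " ".join(["tok"] * 51),    # 51 tokens, dropped
    ]
    print(filter_by_length(sample))
```

The same filter would be applied to the Urdu and the English monolingual files separately, since the data here is monolingual and there are no sentence pairs to keep in sync.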
