How to prepare data #3

Open
akanshajainn opened this issue Sep 28, 2018 · 1 comment

Comments

@akanshajainn

In this repo you have provided the zipped data that is used to train the MT system. I am planning to try it on a different set of languages, but I am stuck on how to prepare the data for that. I know how to tokenize and binarize the data, but I don't know how to obtain the dictionary and the first translation data.

@guillaumekln
Contributor

As described in the README, the first translation data were generated by this project: https://github.com/jsenellart/papers/tree/master/WordTranslationWithoutParallelData

You should also be able to use the official project: https://github.com/facebookresearch/MUSE
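
For illustration only, here is a minimal Python sketch of what producing a word-by-word first translation from a bilingual dictionary could look like. It assumes the dictionary is plain text with one whitespace-separated `source target` pair per line and that the corpus is already tokenized; check the output format of whichever project you use. The names `load_dictionary` and `translate_word_by_word` and the toy entries are hypothetical and are not part of either linked project.

```python
from typing import Dict, Iterable, List


def load_dictionary(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'source target' pairs, keeping the first target seen per source word."""
    table: Dict[str, str] = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        src, tgt = parts
        table.setdefault(src, tgt)
    return table


def translate_word_by_word(tokens: List[str], table: Dict[str, str]) -> List[str]:
    """Replace each token by its dictionary entry; copy unknown tokens unchanged."""
    return [table.get(tok, tok) for tok in tokens]


if __name__ == "__main__":
    # Toy dictionary (assumption): in practice, read the lines from the induced dictionary file.
    toy_dictionary = ["cat chat", "the le", "cat minou", "sleeps dort"]
    table = load_dictionary(toy_dictionary)
    sentence = "the cat sleeps today".split()
    print(" ".join(translate_word_by_word(sentence, table)))
```

Copying unknown tokens through unchanged is one possible choice for out-of-vocabulary words; dropping them or mapping them to a placeholder token are alternatives.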
