add correct mecab installation instructions #132

Open
wants to merge 1 commit into
base: main

@stet-stet stet-stet commented Feb 26, 2020

TL;DR
I present this PR to prevent people from having issues like #54, #18.

Update: I also strongly suspect that issues such as #111 were caused by an incorrectly configured MeCab (built with the default encodings), which may cause an assertion in fastBPE.hpp (line 480) to fail, resulting in a failure to produce output files after fastBPE.

At least in my system locale, failing to set any one of these UTF-8-enabling flags (see install_external_tools.sh) led to empty outputs in the embed task, encoding errors (at $LASER/source/lib/romanize_lc.py), and much confusion. Regrettably, it is hard to know this before you run into the problem.
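For anyone who wants to check an existing installation before running the embed task, here is a minimal sketch that reads the dictionary charset MeCab reports. It assumes that `mecab -D` prints its dictionary information as `key: value` lines including a `charset` entry; the helper names (`parse_dictionary_charset`, `is_utf8`, `installed_mecab_charset`) are hypothetical and not part of LASER or MeCab.

```python
import re
import shutil
import subprocess


def parse_dictionary_charset(dictionary_info):
    """Return the value of the 'charset' line in `mecab -D`-style text, or None.

    Assumption: the text consists of 'key: value' lines, one of them 'charset'.
    """
    for line in dictionary_info.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "charset":
            return value.strip()
    return None


def is_utf8(charset):
    """Treat spellings such as 'utf8', 'UTF-8' and 'utf_8' as the same encoding."""
    if charset is None:
        return False
    return re.sub(r"[-_]", "", charset).lower() == "utf8"


def installed_mecab_charset():
    """Query the mecab binary on PATH; return None if it is not installed."""
    if shutil.which("mecab") is None:
        return None
    result = subprocess.run(
        ["mecab", "-D"],
        capture_output=True,
        text=True,
        errors="replace",  # do not crash on non-UTF-8 bytes in the output
        check=False,
    )
    return parse_dictionary_charset(result.stdout)


if __name__ == "__main__":
    charset = installed_mecab_charset()
    print(f"MeCab dictionary charset: {charset!r} (UTF-8: {is_utf8(charset)})")
```

If this reports a non-UTF-8 charset, the build was probably made without the flags mentioned above, and rebuilding MeCab and its dictionary is likely needed.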

Also, I changed README.md a bit so that MeCab hopefully feels more optional for people not dealing with the Japanese language.

Additional question: Why was the auto-installation of MeCab dropped?

@facebook-github-bot

Hi @stet-stet!

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@facebook-github-bot added the CLA Signed label Feb 26, 2020
@stet-stet
Author

I just noticed that #97 will install MeCab correctly and automatically. However, since we do not know why the auto-installation of MeCab was dropped, I will still leave this PR open.
