toRomaji() option for KUNREI method #65

Open
tsomeq opened this issue Aug 31, 2017 · 2 comments

Comments

@tsomeq

tsomeq commented Aug 31, 2017

Japanese romaji has two standard methods: Kunrei-shiki and Hepburn-shiki
(the word "shiki" means "method").
Kunrei-shiki is also known as ISO 3602.
https://en.wikipedia.org/wiki/Kunrei-shiki_romanization

How about supporting KUNREI method with an option?

@DJTB
Collaborator

DJTB commented Oct 10, 2017

I'm not sure that supporting multiple romaji implementations is something we want to include in the scope of Wanakana. It would require a new conversion table and increase the file size, and then there's the question of whether we should support other romaji variants as well.

More importantly, there are features of Kunrei that we couldn't support out of the box without some sort of syntactical/morphological analysis (which Wanakana is not set up to do, since basic Hepburn is very straightforward and doesn't need it).

  • When he (へ) is used as a particle, it is written as e, not he (as in Nihon-shiki).
  • When ha (は) is used as a particle, it is written as wa, not ha.
  • wo (を/ヲ) is used only as a particle, written o.
  • Vowels that are separated by a morpheme boundary are not considered to be a long vowel. For example, おもう (思う) is written omou, not omô.

We don't analyse sentence structure in any form, so using a simple conversion table for Kunrei and attempting something like

toRomaji("かれはとうきょうへいったことがあるとおもう", { variant: 'kunrei' })

would result in "kare ha tôkyô he itta koto ga aru to omô"
rather than the correct "kare wa tôkyô e itta koto ga aru to omou".
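
To make the failure mode concrete, here is a minimal sketch of a per-character table lookup. It does not use Wanakana; `KUNREI_TABLE` and `naiveKunrei` are hypothetical names invented for illustration, and the table only covers the few kana needed here.

```javascript
// Hypothetical, deliberately tiny kana-to-Kunrei table (not Wanakana's real table).
const KUNREI_TABLE = {
  か: 'ka', れ: 're', は: 'ha', お: 'o', も: 'mo', う: 'u', と: 'to',
};

// Naive converter: looks up each kana in isolation, then collapses "ou"
// into a long vowel. It knows nothing about particles or morpheme boundaries.
function naiveKunrei(kana) {
  const romaji = Array.from(kana)
    .map((ch) => KUNREI_TABLE[ch] || ch) // unknown characters pass through
    .join('');
  return romaji.replace(/ou/g, 'ô');
}

console.log(naiveKunrei('かれは')); // "kareha", but the particle は should be "wa"
console.log(naiveKunrei('おもう')); // "omô", but Kunrei wants "omou" here
console.log(naiveKunrei('とう'));   // "tô", which is the intended long vowel
```

The lookup cannot tell the particle は from an ordinary は, nor the long vowel in とう from the verb ending in おもう, which is why some form of morphological analysis would be needed.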

Of course the user could input "かれわとうきょうえいった..." but that's not correct Japanese, and wouldn't be clear without explicit instructions to mangle their input. Moreover, I think a partial Kunrei option would be misleading for any projects using toRomaji() for text block conversion (whereas the Hepburn conversion is still correct when multiple words are strung together).

I do see the benefit of enabling different romaji outputs, but I think that using an external dedicated script would be more appropriate in this case rather than extending Wanakana.

Any thoughts @vietqhoang @mimshwright?

Personally I'm against a partial implementation of Kunrei, or attempting to add complete syntactical/morphological analysis.

@DJTB DJTB mentioned this issue Oct 23, 2017
@DJTB
Collaborator

DJTB commented Dec 24, 2017

Recent changes on the dev branch allow custom mappings.

If anyone wants to try a PR for Kunrei, you can clone or fork the dev branch.

Any questions, ping @Geggles and @DJTB.
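
The thread does not spell out how the custom mapping works, so the following is only a rough sketch of the general idea, not Wanakana's actual API. `HEPBURN_TABLE`, `KUNREI_OVERRIDES`, and `toRomajiWithMapping` are hypothetical names, and the tables are cut down to a handful of kana where Hepburn and Kunrei differ.

```javascript
// Hypothetical minimal Hepburn table; a real one covers all kana.
const HEPBURN_TABLE = { し: 'shi', ち: 'chi', つ: 'tsu', ふ: 'fu', じ: 'ji' };

// Kunrei spellings for the same kana, supplied as per-kana overrides.
const KUNREI_OVERRIDES = { し: 'si', ち: 'ti', つ: 'tu', ふ: 'hu', じ: 'zi' };

// Merge the caller's overrides on top of the default table, then convert
// character by character. Unknown characters pass through unchanged.
function toRomajiWithMapping(kana, customMapping = {}) {
  const table = { ...HEPBURN_TABLE, ...customMapping };
  return Array.from(kana)
    .map((ch) => table[ch] || ch)
    .join('');
}

console.log(toRomajiWithMapping('ふじ'));                   // "fuji"
console.log(toRomajiWithMapping('ふじ', KUNREI_OVERRIDES)); // "huzi"
console.log(toRomajiWithMapping('つじ', KUNREI_OVERRIDES)); // "tuzi"
```

Overrides of this kind cover the spelling differences between the two systems, but they still work one kana at a time, so the particle and morpheme-boundary cases discussed earlier in the thread would remain unhandled.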
