Jiten-pai

A Gjiten-lookalike Japanese dictionary written in Python.

Prerequisites

Running jiten-pai.py requires Python version >= 3.7 installed, plus the PyQt5 python bindings for Qt5. Tested with Python 3.7.3, PyQt5 5.11.3.

Dictionary files are not included in Jiten-pai, these have to be downloaded and installed separately, see next section.

Get and Install Dictionary Files

Jiten-pai supports word dictionary files in EDICT format, as made available by the Electronic Dictionary Research and Development Group:

EDICT2 (essential)
- EDICT main dictionary; revised format; EUC-JP encoding;
- download file, then unpack and convert to UTF-8:
  
  zcat edict2.gz | recode EUC-JP..UTF-8 > edict2
- install in Jiten-pai using the Edit→Preferences dialog
- HINT: In case the zcat or recode utilities are not available, the included simple transcoding utility can be used instead, e.g.:

            eucjp_to_utf8.py edict2.gz edict2
            eucjp_to_utf8.py enamdict.gz enamdict
            eucjp_to_utf8.py kanjidic.gz kanjidic

EDICT (obsolete)
- predecessor to EDICT2; legacy format; EUC-JP encoding
- download file, unpack and convert to UTF-8 (see above)
- install via Edit→Preferences
ENAMDICT (optional)
- named entity dictionary; EDICT format; EUC-JP encoding
- download file, unpack and convert to UTF-8 (see above)
- install via Edit→Preferences

More word dictionaries and alternative language versions are available at the EDRDG archive. The respective accompanying documentation will have the details, and in particular indicate whether a file is actually in EDICT(2) format. In most cases a conversion from EUC-JP to UTF-8 will be necessary, see above.

In addition to any of the abovementioned word dictionaries, the KanjiDic part of Jiten-pai requires installation of one of the kanjidic files, also made available by the EDRDG:

KANJIDIC (recommended)
- Kanji dictionary; plain text format; EUC-JP encoding
- contains all kanji covered by the JIS X 0208-1998 standard
- download file, unpack and convert to UTF-8 (see above)
- install via Edit→Preferences
KANJIDIC_COMB (alternative)
- Kanji dictionary; plain text format; UTF-8 encoding
- additionally contains kanji from JIS X 0212/0213 supplementary sets
- download file, unpack, install via Edit→Preferences
KANJIDIC2 (alternative)
- Kanji dictionary; XML format; UTF-8 encoding
- additionally contains kanji from JIS X 0212/0213 supplementary sets
- caveat: does not support full text search
- download file, unpack, install via Edit→Preferences

The EDRDG licence page provides dictionary copyright information and licensing terms.

Notes

If the search term contains any Katakana or Hiragana, Jiten-pai will always report matches for both syllabaries. This is intentional.
The word search supports Python regular expression syntax to allow for more flexible queries. Consequently, backslash-escaping is required when, for whatever reason, searching verbatim for regex special characters like '.', '(', ')', etc. Using the '^' and '$' regex anchors is strongly discouraged. Equivalent functionality is already provided by the various search options. (If this paragraph is all Greek to you, chances are you can safely ignore it.)
During startup Jiten-pai will look for the vconj.utf8 verb conjugation file as well as the kradfile[2].utf8 and radkfile[2].utf8 kanji radical cross-reference files in the following directories, in the given order:
- $HOME/.local/share/jiten-pai/
- /usr/local/share/jiten-pai/
- /usr/share/jiten-pai/
- current working directory
Without these files verb de-inflection and radical search, respectively, will not be available.

Command Line

Jiten-pai supports a few command line options which might come in handy for workflow integration. These should be fairly self explaining:

    usage: jiten-pai.py [-h] [-k] [-K] [-c] [-v] [-l KANJI] [-w WORD]

    Jiten-pai Japanese dictionary

    optional arguments:
      -h, --help                      show this help message and exit
      -k, --kanjidic                  start with KanjiDic
      -K                              same as -k, but word dictionary visible
      -c, --clip-kanji                look up kanji from clipboard
      -v, --clip-word                 look up word from clipboard
      -l KANJI, --kanji-lookup KANJI  look up KANJI in kanji dictionary
      -w WORD, --word-lookup WORD     look up WORD in word dictionary

    Only one of these options should be used at a time.

License

Jiten-pai incorporates parts taken from other projects, namely:

Kana conversion code adapted from jaconv; Copyright (c) 2014 Yukino Ikegami; MIT License
VCONJ verb de-inflection rule file adapted from XJDIC; Copyright (c) 1998-2003 J.W. Breen; GNU General Public License v2.0 Modifications for Gjiten 1999-2005 by Botond Botyanszki
RADKFILE and KRADFILE kanji radical cross-reference adapted from The KRADFILE/RADKFILE Project; Copyright (c) James William BREEN and The Electronic Dictionary Research and Development Group; Creative Commons Attribution-ShareAlike Licence (V3.0)

The remaining majority of Jiten-pai code is distributed under the Modified ("3-clause") BSD License. See LICENSE file for more information.

Name		Name	Last commit message	Last commit date
Latest commit History 158 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
eucjp_to_utf8.py		eucjp_to_utf8.py
jiten-pai.py		jiten-pai.py
jiten-pai.svg		jiten-pai.svg
kanjidic.py		kanjidic.py
kradfile.utf8		kradfile.utf8
kradfile2.utf8		kradfile2.utf8
radkfile.utf8		radkfile.utf8
radkfile2.utf8		radkfile2.utf8
vconj.utf8		vconj.utf8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Jiten-pai

Prerequisites

Get and Install Dictionary Files

Notes

Command Line

License

About

Releases

Packages

Languages

License

irrwahn/jiten-pai

Folders and files

Latest commit

History

Repository files navigation

Jiten-pai

Prerequisites

Get and Install Dictionary Files

Notes

Command Line

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages