If you are new to studying Japanese, or have been studying the language for a short while, you might have come across the name Jim Breen at some point. Even if you don’t recognize the name right away, chances are you have seen it before. Many Japanese-English dictionaries use the entries from the EDICT project. Most mobile Japanese-English dictionary apps, such as Aedict, Imiwa (formerly known as Kotoba) and JED, to name just a few, use the EDICT database.
Who is Jim Breen?
Jim is an Adjunct Senior Research Fellow in the Japanese Studies Centre at Monash University in Australia. In 1991, Jim began the EDICT project. A DOS-based Japanese word processor had been released, containing the initial EDICT file. The file was continuously modified and expanded. By 1999, the file had around 60 000 entries, and the first XML-format JMdict file was released. A new format, known as EDICT2, was created in 2003 as an expanded version of the EDICT file. The new format contains multiple kanji headwords and readings, cross-references and other information fields.
In 2000, the Electronic Dictionary Research and Development Group was established with the Faculty of Information Technology at Monash University. Their objectives were to compile electronic dictionaries and to carry out research and development in applied computational linguistics.
The group, known as the EDRDG, consists of Jim Breen and collaborators from around the world. The EDICT project is currently housed at the EDRDG, and the copyright for the project files is assigned to the group. In addition to the EDICT project, the group is currently engaged in several other projects, including:
- The ENAMDICT/JMnedict Japanese Proper Names Electronic Database
- The KANJIDIC and KANJIDIC2 Kanji information databases
- The WWWJDIC dictionary server
The EDICT dictionary file is still expanding and improving, largely due to the effort of Jim Breen and the EDRDG. The copyright scheme behind the EDICT project has also allowed the dictionary file to proliferate in the mobile app and web browser space. The files are available for developers to use, as long as acknowledgement is given to the group. This is why many mobile apps use the Jim Breen EDICT file as well as other databases from the EDRDG. For students of Japanese, the database at the heart of several apps and browser extensions has been an indispensable resource for learning the language.
One major criticism of the file has been the sheer number of results generated when a search is conducted. For instance, the Aedict app will show 20+ entries for a simple word such as “go”. In newer app versions, this has been mitigated by a ranking scheme that marks words in common use in contemporary Japanese. Obsolete words are still included but marked as such, so that language learners do not use outdated vocabulary.
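The ranking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of Aedict or any other app: the `Entry` class, its `common` and `obsolete` flags, the `rank` function and the sample entries (including which ones are flagged) are all hypothetical and are not copied from the EDICT file.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    # Hypothetical, simplified entry; a real dictionary entry carries more fields.
    headword: str
    reading: str
    gloss: str
    common: bool = False    # assumed flag: word is in common contemporary use
    obsolete: bool = False  # assumed flag: word is marked as obsolete


def rank(entries):
    """Return entries with common words first and obsolete words last."""
    def sort_key(entry):
        if entry.obsolete:
            return 2
        return 0 if entry.common else 1

    # sorted() is stable, so entries within the same group keep their order.
    return sorted(entries, key=sort_key)


# Illustrative sample data; the flags here are made up for the example.
results = [
    Entry("往ぬ", "いぬ", "to go", obsolete=True),
    Entry("参る", "まいる", "to go"),
    Entry("行く", "いく", "to go", common=True),
]

for entry in rank(results):
    note = " (obsolete)" if entry.obsolete else ""
    print(f"{entry.headword} [{entry.reading}] {entry.gloss}{note}")
```

Keeping obsolete entries in the list, but at the bottom and labelled, mirrors the behaviour described above: the learner still sees them, but is warned away from using them.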
- The EDICT project is meant to be a Japanese-to-English dictionary; however, many developers have incorporated the database in such a way that it can also be used in reverse (English to Japanese).
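
One plausible way to get the reverse direction is to index every English gloss back to its Japanese headword. The sketch below assumes entries have already been loaded as (headword, glosses) pairs; the `build_reverse_index` function and the sample data are hypothetical, and real apps may well use a different approach such as full-text search.

```python
from collections import defaultdict


def build_reverse_index(entries):
    """Map each lower-cased English gloss to the Japanese headwords that carry it."""
    index = defaultdict(list)
    for headword, glosses in entries:
        for gloss in glosses:
            index[gloss.lower()].append(headword)
    return index


# Illustrative sample data, not copied from the EDICT file.
entries = [
    ("行く", ["to go", "to move"]),
    ("犬", ["dog"]),
    ("食べる", ["to eat"]),
    ("参る", ["to go", "to come"]),
]

index = build_reverse_index(entries)
print(index["to go"])  # Japanese headwords whose glosses include "to go"
```

An exact-match index like this is the simplest option. It would miss a search for "go" when the gloss is "to go", which is one reason an app might prefer word-level or full-text matching instead.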