Yazi
Extending Yazi Dictionaries
Yazi's current dictionaries come from two sources: the Unicode Unihan database and the CEDict Chinese-English dictionary. CEDict was modeled on a Japanese dictionary project, but we haven't converted that dictionary for this application. The file format we use is based on the one used for the Unihan data.
Yazi's dictionaries are keyed on Unicode characters. Every entry in the dictionary begins with one or more Unicode characters, followed by a piece of meta information about those characters. For the time being we have chosen to use a simple text-based format instead of something more complicated like XML. The biggest advantage is that this makes it very easy to add new data.
A sample set of dictionary entries could look something like this:
U+65BD+8010+6DB5	kMandarin	shi1 nai4 han2
U+65BD+8010+6DB5	kDefinition	Nathan Sturtevant, the author of Yazi
Each entry has 3 tab-separated parts. The first part is the Unicode characters: it starts with U+, followed by the hexadecimal Unicode value of each character, with a + between characters.
After the Unicode characters there is a tab, and then the field type. Many field types are pre-defined, but it is possible to add your own fields. After another tab is the actual data for those characters.
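The three-part layout described above can be sketched in Python. This is only an illustration, not Yazi's own code: the function names `make_key` and `parse_entry` are hypothetical, and the sketch assumes that every value in the key is a plain hexadecimal number, as in the sample entries.

```python
from typing import Tuple


def make_key(chars: str) -> str:
    """Build the first part of an entry, e.g. 'U+65BD+8010+6DB5'."""
    # One '+XXXX' group per character, in upper-case hexadecimal.
    return "U" + "".join(f"+{ord(c):04X}" for c in chars)


def parse_entry(line: str) -> Tuple[str, str, str]:
    """Split one entry into (characters, field type, data)."""
    # Split on the first two tabs only, so the data part may contain tabs.
    key, field, data = line.rstrip("\n").split("\t", 2)
    if not key.startswith("U+"):
        raise ValueError(f"entry does not start with U+: {key!r}")
    # Drop the leading 'U+' and turn each hexadecimal value into a character.
    chars = "".join(chr(int(value, 16)) for value in key[2:].split("+"))
    return chars, field, data


if __name__ == "__main__":
    line = "U+65BD+8010+6DB5\tkMandarin\tshi1 nai4 han2"
    chars, field, data = parse_entry(line)
    print(make_key(chars), field, data)
```

A line with fewer than three tab-separated parts makes `parse_entry` raise `ValueError`, which is a convenient way to catch entries typed with spaces instead of tabs.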
The Unihan and CEDict dictionaries are combined and stored inside the Yazi application, along with the data file that describes the possible fields. Yazi will also look for and load data from Library/Application Support/Yazi Dictionaries/UserDictionary.txt in your home directory, if that file exists.
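As a rough sketch, new entries could be appended to the user dictionary with a short Python script. The helper names `format_entry` and `append_entry` are hypothetical. Writing the file as UTF-8 with one entry per line is an assumption; this page does not say which text encoding Yazi expects for data outside plain ASCII.

```python
from pathlib import Path

# Location given above, relative to the home directory.
USER_DICT = (Path.home() / "Library" / "Application Support"
             / "Yazi Dictionaries" / "UserDictionary.txt")


def format_entry(chars: str, field: str, data: str) -> str:
    """Return one tab-separated entry line, ending in a newline."""
    key = "U" + "".join(f"+{ord(c):04X}" for c in chars)
    return f"{key}\t{field}\t{data}\n"


def append_entry(chars: str, field: str, data: str,
                 path: Path = USER_DICT) -> None:
    """Append one entry, creating the folder and file if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: UTF-8 text; the key and field name are ASCII either way.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(format_entry(chars, field, data))
```

For example, `append_entry("\u65bd\u8010\u6db5", "kMandarin", "shi1 nai4 han2")` would add the first sample entry shown earlier to the file.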
Possible dictionary fields that you might want to add new entries for:
kCantonese     The Cantonese pronunciation(s) of this character
kJapaneseKun   The Japanese pronunciation(s) of this character
kJapaneseOn    The Sino-Japanese pronunciation(s) of this character
kKorean        The Korean pronunciation(s) of this character
kMandarin      The Pinyin pronunciation(s) of this character
kVietnamese    The Vietnamese pronunciation(s) of this character
kDefinition    The English definition of the character
If you control-click on the Yazi application and choose "Show Package Contents" you can view the raw format of the dictionary fields in the Contents/Resources directory.
Yazi offers a Unicode service to convert characters to their Unicode encoding. See services for more information. If you have suggestions for dictionary support, or any other questions, you can contact us at the e-mail below.
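Before saving a user dictionary, it can help to check which field types it uses. The sketch below groups entries by their characters and reports any field type that is not in the list above. It is a hypothetical helper, not part of Yazi: the name `group_entries` is made up, and keeping only the last value when the same characters repeat a field is an assumption, since this page does not say how Yazi treats repeated fields. An unlisted field is not an error, because custom fields are allowed, but it is often a typo.

```python
from collections import defaultdict

# Field types listed above.
KNOWN_FIELDS = {
    "kCantonese", "kJapaneseKun", "kJapaneseOn", "kKorean",
    "kMandarin", "kVietnamese", "kDefinition",
}


def group_entries(lines):
    """Return ({key: {field: data}}, set of field types not listed above)."""
    entries = defaultdict(dict)
    unlisted = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Raises ValueError if the line has fewer than 3 tab-separated parts.
        key, field, data = line.split("\t", 2)
        entries[key][field] = data  # a repeated field overwrites the earlier one
        if field not in KNOWN_FIELDS:
            unlisted.add(field)
    return dict(entries), unlisted
```

Calling `group_entries(open(path, encoding="utf-8"))` on a user dictionary returns the grouped entries together with the set of unlisted field types to review.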
This software is provided without warranty implied or otherwise.