Hunspell: word normalization
Latin is an inflected language, so a word can have several forms, but dictionaries usually contain only its normal form, such as the nominative singular for nouns. For this reason, one cannot simply copy an unknown word into GoldenDict, but has to type its normal form manually.
Hunspell is a spell-checking system that can also produce lemmas, i.e. the normal forms of words. For example, the word “puellam” will be transformed to “puella”, which significantly simplifies work with dictionaries. Normalization also handles ligatures (“pæne” > “paene”) and diacritics (“curâ” > “cura”, “malè” > “male”).
Note: Hunspell helps to find lemmas for Latin, but it does not do so for Ancient Greek.
Hunspell itself needs a dictionary. To date, there are two options:
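To make the idea of normalization concrete, here is a minimal Python sketch of the two steps described above: expanding ligatures and stripping diacritics, then reducing an inflected form to its lemma. It is only an illustration and not how Hunspell is implemented. The ligature table, the tiny lexicon, and the suffix rules are assumptions made up for this example; real Hunspell reads its word list and affix rules from the dictionary files.

```python
import unicodedata
from typing import Optional

# Illustrative assumptions: this ligature table, the tiny lexicon and the
# suffix rules are invented for the example. Real Hunspell takes its word
# list and affix rules from the dictionary's .dic and .aff files.
LIGATURES = {"æ": "ae", "œ": "oe", "Æ": "Ae", "Œ": "Oe"}
LEXICON = {"puella", "paene", "cura", "male"}
SUFFIX_RULES = [("am", "a"), ("ae", "a"), ("as", "a"), ("arum", "a")]


def normalize(word: str) -> str:
    """Expand ligatures and strip combining diacritics."""
    expanded = "".join(LIGATURES.get(ch, ch) for ch in word)
    # NFD splits "â" into "a" plus a combining circumflex, which is dropped.
    decomposed = unicodedata.normalize("NFD", expanded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def lemmatize(word: str) -> Optional[str]:
    """Return the lemma of a word, or None if the toy rules find no match."""
    plain = normalize(word).lower()
    if plain in LEXICON:
        return plain
    for ending, replacement in SUFFIX_RULES:
        if plain.endswith(ending):
            candidate = plain[: -len(ending)] + replacement
            if candidate in LEXICON:
                return candidate
    return None
```

With these toy rules, `lemmatize("puellam")` yields “puella”, and `lemmatize("pæne")` yields “paene”. A word that matches neither the lexicon nor any rule returns `None`.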
- Dictionary by Karl Zeiler and Jean-Pierre Sutto
- Dictionary by Konrad Kokoszkiewicz (at the very bottom of the page)
The first dictionary is the more universal one; the second contains a strictly classical lexicon (up to the end of the 2nd century AD). Either dictionary should work fine with GoldenDict. Additionally, they can be used for spell checking in LibreOffice, Chrome, Firefox, &c.
Note: Linux users can install Hunspell dictionaries as ordinary packages (see the Software Center or a similar tool). For example, Arch Linux provides a package based on the dictionary of Karl Zeiler and Jean-Pierre Sutto. After installing it, continue from step 3 in the list below, or start from the beginning if your distribution has no Hunspell dictionary for Latin.
The procedure is simple:
- Download the dictionary and extract the files (Karl Zeiler’s variant).
- Start GoldenDict, open the menu Edit > Dictionaries, then the tab Sources > Morphology.
- Change “Path to a directory with Hunspell/Myspell dictionaries” to the folder where you saved the files. (Linux: if you installed the dictionary as a package, the correct path will be “/usr/share/hunspell”.)
- Enable (check) “Latin Morphology” in the list.
- Press OK. Try typing an incorrect word; you should see a list of spelling suggestions (see image).
- If not, open the settings (Edit > Dictionaries) and, on the Groups tab, add “Latin Morphology” to the group.
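If “Latin Morphology” does not appear in the list, it can help to check that the chosen folder really contains a complete dictionary. Hunspell dictionaries ship as a pair of files with the same base name, a .dic word list and an .aff affix file. The following Python sketch lists the complete pairs in a folder; the function name is hypothetical and the script is not part of GoldenDict.

```python
from pathlib import Path
from typing import List


def find_hunspell_dictionaries(directory: str) -> List[str]:
    """Return base names that have both a .dic and an .aff file in directory."""
    folder = Path(directory)
    if not folder.is_dir():
        return []
    dic_names = {p.stem for p in folder.glob("*.dic")}
    aff_names = {p.stem for p in folder.glob("*.aff")}
    # Only names present in both sets form a usable dictionary pair.
    return sorted(dic_names & aff_names)
```

For example, calling `find_hunspell_dictionaries("/usr/share/hunspell")` on a Linux system where the dictionary was installed as a package should include the Latin dictionary's base name in the result. An empty list suggests that the path is wrong or that the files were not extracted.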
For GoldenDict on a phone, copy la_LA.dic, together with its companion .aff file, to the phone’s SD card, into the GoldenDict folder. Run the application; it will recognize them as a new dictionary.
The mobile application BlueDict for Android can use a special morphology dictionary. Downloading it from the website (all in Chinese) requires registration. This file can probably be used in other applications that read the MDict format, but this has not been tested.
MDict supports unmodified Hunspell dictionaries. The files (la_LA.dic and its companion .aff file) should be copied into the “/mdict/data/” folder.