New autogenerated spellchecking dictionaries for many applications

Started by 'eylan na'viyä, June 28, 2010, 05:36:20 PM

'eylan na'viyä

To ensure that the spellcheckers can always rely on the newest dictionary, I wrote a little PHP script that automatically generates the word list and the add-ons for all the applications I had previously made an add-on for by hand.

Applications
Firefox
Download na7vi_mozilla.xpi and drag and drop it into Firefox.
I planned to make this add-on auto-updateable, but due to signature issues and addons.mozilla.org limitations, updates need to be done by hand. Because of this it won't be up to date all the time, and the add-on manager might show an error.
You can still update the add-on by hand by repeating the install procedure.
Thunderbird
Works the same way as with Firefox, but hasn't been tested much yet.
Other Mozilla applications
Please report whether it works or not.
Google Chrome
Chrome uses a strange format, so the dictionary needs to be compiled on a Windows machine. The tool is included in the package and can be run by clicking generate.bat. Under Linux, Wine would be required (you can still compile it on a Windows machine and copy the files to the target system).
For an easy (un)installation on Windows, download na7vi_chrome_inst.zip, extract it, and run either (un)install_winvista&7.bat or (un)install_winxp.bat.
To be able to use the spellchecker, you need to select Dansk as the language (I apologize if you are from Denmark; it isn't possible without that hack).
Opera
For an easy (un)installation on Windows, download na7vi_opera_inst.zip, extract it, and run (un)install_windows.bat.
OpenOffice
On Windows you should be able to install it by double-clicking the downloaded na7vi_oo.oxt.
To be able to use the spellchecker, you need to select Dansk as the language (I apologize if you are from Denmark; it isn't possible without that hack).
Gnome
Many applications that come with the Gnome desktop for Linux (including Ubuntu) use shared hunspell dictionaries for spell checking.
To (un)install the Na'vi dictionary, download na7vi_gnome_inst.zip and run (un)install.sh with root rights.
OSX
OSX 10.6 or higher also supports hunspell dictionaries, but I haven't been able to test it yet. So try running install.sh (I don't know if OSX even supports sh), or copy the .dic & .aff to the folders that are used in install.sh.
required file: na7vi_osx_inst.zip
Other Applications
I am sure that there are other applications that also support hunspell. To use the dictionary there, you need to find out for yourself where to place the .dic & .aff.
suggested file: na7vi_dictionary.zip

download

If you see that something is not working as expected, or you have an idea for how to improve something, please let me know. With that many different programs and operating systems, it's impossible to test everything in advance.

The same applies to the spellchecking itself.

Known issues with the dictionary:
- an ' at the beginning or end of a word is not detected correctly (bug in hunspell)
- hunspell doesn't support infixes, so I had to write a PHP script to generate the huge number of possible combinations.
 I think these combinations cover 99% of daily usage, but they are not all the combinations, because that would blow up the file size even more.
 In fact, I don't even know which combinations are possible.
 This is what I implemented: eyk|äp|-   ,   am|ìm|ìy|ay|er|ol|arm|ìrm|*ìry|*ary|*alm|*ìlm|*ìly|*aly|asy|ìsy|iv|imv|iyev|ìyev|irv|ilv|-  ,  äng|ei|uy|ats|-     + awn + us
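
For readers who want to see what generating these combinations looks like, here is a minimal Python sketch. It is not the original PHP script. The three slot lists are copied from the line above; the template notation (`t{0}{1}ar{2}on`) and all function names are assumptions made for illustration. The post does not say what the `*` marker means, so the sketch simply strips it, and it leaves out `awn` and `us` because the post does not say how they combine with the other slots.

```python
from itertools import product

# Slot lists copied from the post; "-" means "no infix in this slot".
# The meaning of "*" is not stated in the post, so it is stripped here.
PRE_FIRST = "eyk|äp|-"
FIRST = ("am|ìm|ìy|ay|er|ol|arm|ìrm|*ìry|*ary|*alm|*ìlm|*ìly|*aly|"
         "asy|ìsy|iv|imv|iyev|ìyev|irv|ilv|-")
SECOND = "äng|ei|uy|ats|-"


def parse_slot(spec):
    """Turn 'a|*b|-' into ['a', 'b', '']."""
    return ["" if item == "-" else item.lstrip("*") for item in spec.split("|")]


def infix_combinations():
    """Return every (pre-first, first, second) combination as a list of tuples."""
    return list(product(parse_slot(PRE_FIRST), parse_slot(FIRST), parse_slot(SECOND)))


def expand(template):
    """Fill a hypothetical template such as 't{0}{1}ar{2}on' with every combination."""
    return sorted({template.format(*combo) for combo in infix_combinations()})


forms = expand("t{0}{1}ar{2}on")
print(len(infix_combinations()), "combinations per verb, for example:", forms[:3])
```

Every verb in the word list is multiplied by the number of combinations, which is why the file size grows so quickly.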


Autogeneration
To be sure that everything is up to date, you can open these pages before downloading:

autogenerate dictionaries
autogenerate addons

Other resources
download dictionaries
used: Eana Eltu SQL source, Eana Eltu

DutchNavi

Ma 'eylan na'viyä,

Amazing work but I noticed that you haven't implemented two grammar rules that are mentioned in "Na'vi in a Nutshell" and the reference grammar of wm.annis:

2.3.2. Pseudovowel Contraction. Due to the shape of the aspect infixes, ‹er› and ‹ol›, it is possible for the pseudovowels to occur immediately after their consonantal counterpart, as in *p‹ol›lltxe. When this happens in an unstressed syllable, the pseudovowel disappears, poltxe. In a stressed syllable, the infix disappears, *f‹er›rrfen > frrfen.

Some of the incorrect entries in your word list: ferrrfen, molllte and pollltxe.
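
As a rough illustration of how a generator could apply the contraction rule quoted above, here is a small Python sketch. The function name and the way the word is split into the part before and after the infix are assumptions; the generator would also need to know whether the syllable is stressed, which the caller has to supply here.

```python
# The pseudovowel each aspect infix can collide with.
PSEUDOVOWEL = {"er": "rr", "ol": "ll"}


def insert_aspect_infix(before, infix, after, stressed):
    """Join before + infix + after, applying pseudovowel contraction.

    `stressed` says whether the syllable holding the pseudovowel is stressed;
    this sketch assumes the caller already knows that.
    """
    pseudovowel = PSEUDOVOWEL.get(infix)
    if pseudovowel and after.startswith(pseudovowel):
        if stressed:
            # Stressed syllable: the infix disappears.
            return before + after
        # Unstressed syllable: the pseudovowel disappears.
        return before + infix + after[len(pseudovowel):]
    return before + infix + after


print(insert_aspect_infix("p", "ol", "lltxe", stressed=False))  # poltxe
print(insert_aspect_infix("f", "er", "rrfen", stressed=True))   # frrfen
```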

2.3.3. Affect Infix Epenthesis. When the positive affect infix ‹ei› is followed by the vowel i a y is inserted, *s‹ei›i > seiyi.

Some of the incorrect entries in your word list: 'ampeii, ätxäle seii, eltu seii, fmeii, kelku seii, keiin, lrrtok seii, mun'eii, nari seii, reiikx, sngä'eii, uvan seii, kavuk seii, latseii, lew seii, muntxa seii, piak seii, steykeii, steii, tì'awm seii, tìkangkem seii, tìsraw seykeii, tìsraw seii, tsaheyl seii, tsap'alute seii, tsre'eii, tstu seii, win säpeii, win seii, srung seii, txopu seii, pamrel seii, ultxa seii, feweii, yemfpay seii, tìkxey seii, kxll seii, tsulfä seii, teya seii, kem seii, irayo seii, eltur tìtxen seii and law seii.
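
The epenthesis rule is simple enough to sketch in a few lines of Python. Again, the function name and the before/after split are assumptions made for illustration, not part of the existing script.

```python
def insert_affect_infix(before, infix, after):
    """Join before + infix + after, inserting y between ‹ei› and a following i."""
    if infix == "ei" and after.startswith("i"):
        return before + infix + "y" + after
    return before + infix + after


print(insert_affect_infix("s", "ei", "i"))      # seiyi
print(insert_affect_infix("tar", "ei", "on"))   # tareion
```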



'eylan na'viyä

Sorry for making you wait so long.
Of course I will release the source. I think I posted half of it somewhere else, but I will also put it here.
I also wanted to make it compatible with other languages. The script itself almost is, but the includes would have to be modified by hand.
The script is split into 2 parts. The first one gets the SQL dump, finds the words, generates the infixed forms (which are not natively supported in hunspell), adds word-type suffixes, and then outputs the complete dictionary. The second script takes the dictionary and puts it together with all the other files that need to be in a dictionary add-on file, which are different for most applications. Then it generates the zip-packed, ready-to-use dictionary add-ons.
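
To give a rough idea of the second step, here is a minimal Python sketch (the real scripts are PHP and are in the archives below). It writes a hunspell-style .dic text, which starts with an entry count followed by one word per line, and zips it together with an affix file. The archive layout, file names, and function names are hypothetical; each application expects its own set of extra files.

```python
import io
import zipfile


def build_dic(words):
    """Hunspell .dic text: an entry count on the first line, then one word per line."""
    unique = sorted(set(words))
    return "\n".join([str(len(unique)), *unique]) + "\n"


def pack_addon(files):
    """Zip a mapping of archive path -> text and return the archive as bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, text in files.items():
            archive.writestr(path, text.encode("utf-8"))
    return buffer.getvalue()


dic_text = build_dic(["kaltxì", "taron", "tolaron", "taron"])
addon = pack_addon({
    "dictionaries/na7vi.dic": dic_text,       # hypothetical path inside the archive
    "dictionaries/na7vi.aff": "SET UTF-8\n",  # minimal affix file, encoding only
})
print(len(addon), "bytes")
```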
There was a little problem with the auto-update function of the Firefox add-on. I don't remember exactly. It was because of external hosting on addons.mozilla.org, which requires an SSL certificate or some key-signing method that could not be automated, and because dictionary add-ons were treated differently. But you can still update by hand, as you have to with all the other applications.

Here are the 2 scripts, including all the required folders, includes, and some old output.
I don't know if it still works properly because Tuiq modified some things, but from a rough glance it looks good.
hunspellzipper.zip
na7vi_dic.zip

About the errors in the output: maybe I can fix them, but I think it would be a better approach to handle this closer to the source of the words ;) That way it can't suddenly stop working.
here: http://forum.learnnavi.org/projects/tsim-apiak-number-converter-translator/
or here: http://github.com/Quit/EanaEltu

Tuiq

Eana Eltu: PDF/TSV/jMemorize

Yawne Zize’ite

About the apostrophe error, have you tried using U+02BC ʼ instead of U+0027 ' for the apostrophe?  Many programs consider ' to be punctuation or a control code, but they accept ʼ as a letter.  You could have the dictionary fetcher script run a search-and-replace before import.
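
A minimal Python sketch of that search-and-replace step, assuming the words are available as a list of strings before import (the function name is made up for illustration):

```python
APOSTROPHE = "\u0027"           # ' (typewriter apostrophe)
MODIFIER_APOSTROPHE = "\u02bc"  # ʼ (modifier letter apostrophe)


def normalize_apostrophes(words):
    """Replace every U+0027 with U+02BC in each word."""
    return [word.replace(APOSTROPHE, MODIFIER_APOSTROPHE) for word in words]


print(normalize_apostrophes(["'eylan", "na'vi", "taron"]))
```

Text typed with a plain ' would then no longer match the dictionary entries, so the same replacement would have to happen on the input side as well.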

Oe Lu Toruk Makto

What version of Firefox is na7vi_mozilla.xpi compatible with? I ask because it says "Na'vi Classic Notation dictionary could not be installed because it is not compatible with Firefox 16.0.1."
Toruk Makto is king of the Ikran riders

Eana Unil

Just wanted to ask if this is ever going to be re-uploaded or updated?
If not, would someone else be willing to create a Na'vi spellchecker like this one? Such a thing would be awesome.