Na'vi Firefox dictionary

Started by Hufwe ta'em, January 24, 2010, 07:50:48 PM

Hufwe ta'em

Does anyone know if a Na'vi dictionary exists for Firefox?


Ftiafpi


Nume fpi sänume

Well, it hasn't yet been proposed as a project here in the projects area. I also think it would be a great project for someone to undertake :D

Hufwe ta'em

Quote from: Nume fpi sänume on January 24, 2010, 10:41:50 PM
Well, it hasn't yet been proposed as a project here in the projects area. I also think it would be a great project for someone to undertake :D
I'm too busy to start it :(


Eight

If you have some more specifics about what you'd like then I might do this. Now that my rhyming dictionary is done, I'm forced to actually do my job during the days.... must put a stop to that.

Edit: are you saying you just want a dictionary file for Firefox's own spellchecking, or a plugin thing with its own interface etc.?

Nume fpi sänume

I think they mean a dictionary file for Firefox so it stops telling me I'm spelling things wrong :P

Eight

Quote from: Nume fpi sänume on January 25, 2010, 03:25:11 PM
I think they mean a dictionary file for Firefox so it stops telling me I'm spelling things wrong :P
Hmm... off the top of my head I could probably churn out a dictionary file of:


  • nouns inflected for duality/triality/plurality and case [1]
  • adjectives (with and without -a)
  • pronouns
  • prepositions/particles
  • adverbs
  • verb roots

(All with and without the dashes that are used in the beginners forum)

...in about a day depending on workload, since I essentially have everything I need in the source file and software for my rhyming dictionary.

Verb inflections (with and without brackets on infixes) and adpositions could take a bit longer.

[1] Some nouns might need to be reviewed and filtered out if they have irregularities or don't work with me-, pxe- and ay- etc.

I think I'd definitely need a bit of help - i.e. someone to help run through the generated output and look for any major flaws. If anyone with a bit more knowledge than me is up for it then I'd be prepared to take a run at this.

RyleVao

Sounds cool, but would be a nightmare to make!
You are like a Baby!  Making noise, don't know what to do!
-Neytiri

Eight

Quote from: RyleVao on January 25, 2010, 05:05:43 PM
Sounds cool, but would be a nightmare to make!
Actually no... from what I think we know about Na'vi at this point, it's regular enough that you can easily write software to do a lot of work.

For instance, the rhyming dictionary was generated by my own software tool and LaTeX. The only hard work was putting all the entries from Taronyu's dictionary in... and watching him update the master every time you think you're done. :D

The same tool (with a few tweaks) can deal with most parts of speech correctly enough (I think) to output in the format we'd need for a Firefox dictionary. The only real complication is verbs... where you need to be able to break down the syllables correctly to generate all the possible inflections... and that's something I started doing but initially got wrong. However, I've since read a better explanation of Na'vi syllable structure and I'm relatively confident even this bit could be done without too much hassle.

I'll do some tests today, but again, would really need some wiser menari at some point.

A side benefit of this project, which has really swung it for me, is that if Firefox can recognise swira as a mistake and suggest swirä, then it would save having to type in the "special" characters when on this forum.

dcb

Hi Eight,

I have a Na'vi project in FieldWorks Language Explorer which can parse some Na'vi words automatically. So far it is parsing verbs OK, I think. It can't generate lists of valid words though. So I would be happy to try putting the list of words you generate through the parser and see which ones parse OK. I'll need to check the parser rules for each one that doesn't parse correctly. This would give us an interlinear text with each word, and its gloss and part of speech. That should make it easy to see the meaning of each word, and how it is constructed too.

It would be helpful if the word lists could be divided according to part of speech, as I would have to fix the parsing rules for each part of speech.

All the best,
David B.

Eight

Quote from: dcb on January 26, 2010, 02:13:46 AM
It can't generate lists of valid words though. So I would be happy to try putting the list of words you generate through the parser and see which ones parse OK. I'll need to check the parser rules for each one that doesn't parse correctly. This would give us an interlinear text with each word, and its gloss and part of speech. That should make it easy to see the meaning of each word, and how it is constructed too.
Hmm... could be interesting.

Let me try a test build of a Firefox dictionary, and if everything is OK there then it might be fun to look at adding a second output format suitable for FieldWorks. Could be some useful work possible with that combination.

Eight

Ok folks. The results are in.

I modified my rhyme processor to spit out an appropriately formatted wordlist of pronouns. For each of these pronouns, it also generates versions with suffixes for erg(2)/dat(3)/gen(3)/top(2)/acc(2), selecting the suffix based on whether the word ends in a vowel or a consonant. Versions with and without dashes are also generated, e.g. oeyä and oe-yä.
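For anyone curious what that generation step might look like, here is a minimal Python sketch. It is not Eight's actual tool. The suffix table is an illustrative assumption with just one vowel form and one consonant form per case (the real processor has two or three variants per case, as noted above), and the names VOWELS, CASE_SUFFIXES and case_forms are made up for the example.

```python
# Illustrative only: one (after-vowel, after-consonant) ending per case.
# These endings are assumptions for the sketch, not a vetted paradigm.
VOWELS = set("aäeiìou")

CASE_SUFFIXES = {
    "erg": ("l", "ìl"),
    "acc": ("t", "it"),
    "dat": ("r", "ur"),
    "gen": ("yä", "ä"),
    "top": ("ri", "ìri"),
}


def case_forms(word):
    """Return the bare word plus each case form, with and without a dash."""
    after_vowel = word[-1].lower() in VOWELS
    forms = [word]
    for vowel_suffix, consonant_suffix in CASE_SUFFIXES.values():
        suffix = vowel_suffix if after_vowel else consonant_suffix
        forms.append(word + suffix)        # joined form, e.g. oeyä
        forms.append(f"{word}-{suffix}")   # dashed form, e.g. oe-yä
    return forms


if __name__ == "__main__":
    print("\n".join(case_forms("oe")))
```

Each input word yields eleven entries here (the bare word plus five cases in two spellings), so a full table with every suffix variant would grow the list quickly.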

It was then reasonably easy (little bit of faffing around initially) to package this up into a downloadable & installable dictionary extension for Firefox.

Notes:

1. Firefox seems to be misbehaving with the dashes, e.g. if I type oe-ya then it will only tell me that ya is incorrect; it doesn't currently suggest oe-yä as a fix. I'd imagine there's a way around this; certainly if suffixes were listed in my dictionary then it would at least suggest -yä.

2. It is possible to type a word without the diacritics and have the Firefox dictionary suggest the version that does contain the special characters. Typing oeya will cause Firefox to highlight it as a mistake and suggest oeyä.

3. Pronouns are a pain. Because Taronyu's dictionary does not list determiners and personal pronouns differently, mine does not either, and so the tool adds the noun case endings to these. However, a determiner (e.g. this book is boring vs. this is boring) would not take suffixes. So I'd need help to blacklist invalid forms.

4. Following on from point 3 - if anyone wants me to continue with this then I will need support from someone who knows their onions Na'vi. This means someone who can answer the odd question, and is prepared to go through simple lists and pull out any forms (e.g. genitive with determiners) that we don't want, and simply whack them into a text file for me to add to the blacklist.
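The blacklist step from points 3 and 4 could be as simple as the following Python sketch. The function name apply_blacklist, the one-form-per-line file layout and the "#" comment convention are all assumptions for illustration; nothing here is taken from the actual tool.

```python
def apply_blacklist(generated, blacklist_lines):
    """Drop generated forms that appear in the blacklist.

    Assumed file layout: one unwanted form per line; blank lines and
    lines starting with '#' are ignored.
    """
    blocked = {
        line.strip()
        for line in blacklist_lines
        if line.strip() and not line.lstrip().startswith("#")
    }
    # Keep the original order of the generated list.
    return [form for form in generated if form not in blocked]
```

A reviewer would then only need to paste unwanted forms into a plain text file, and the generator would pass the open file (or its lines) as blacklist_lines before writing the wordlist.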

But the overall outcome of all this is... it's possible. And not too difficult really.

Nume fpi sänume

Good to know. I think tackling the portion of the language we have now will help with the rest later on, or at least pave the way. Excellent work.

dcb

Hi Eight,

To begin with, a simple word list with all of the generated words would be fine for FieldWorks.
At the moment I'm still adding the lexemes and rules necessary to parse the letter from Prof. Frommer.


Eight

I'm just tweaking a few things then will have a go at getting a useful wordlist for you - and I owe Kaltxì Palulukan one as well.

I decided to push on with this add-on even though there's a fair few things I would check on if I had a bit of support.

It's currently at the stage of having pronouns and nouns inflected for dual/trial/plural in all cases (lenition is active). Adjectives with and without a- and -a (at either end), proper nouns in all cases, particles/interjections/adverbs/numerals/conjunctions/unknown-types in as they stand in Taronyu's dictionary. Currently missing prepositions/adpositions and verbs (probably something else, I forget).
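As a rough illustration of the "lenition is active" part, here is a Python sketch of how plural forms might be generated. The lenition table is an assumption based on commonly described Na'vi lenition rules, not something taken from this thread or from the add-on's code, and it ignores capitalisation, vowel contraction and irregular nouns. The names LENITION, lenite and plural_forms are hypothetical.

```python
# Assumed lenition table, longest initial first so "px" wins over "p".
LENITION = [
    ("px", "p"), ("tx", "t"), ("kx", "k"),
    ("ts", "s"),
    ("p", "f"), ("t", "s"), ("k", "h"),
    ("'", ""),
]


def lenite(word):
    """Apply the first matching initial-consonant change, if any."""
    for initial, replacement in LENITION:
        if word.startswith(initial):
            return replacement + word[len(initial):]
    return word


def plural_forms(noun):
    """Return ay- plus the lenited noun, and the short plural if lenition applied."""
    lenited = lenite(noun)
    forms = ["ay" + lenited]
    if lenited != noun:
        forms.append(lenited)  # short plural: the ay- prefix is dropped
    return forms
```

Under these assumptions, plural_forms("'eveng") gives both the ay- form and the bare lenited form, which would then be run through the case-suffix step like any other noun.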

The biggest issue right now is that Firefox is refusing to play ball with apostrophes.
E.g.
'awpo highlights as an error. The suggestion is 'awpo which then creates ''awpo, and even this is then highlighted as a mistake.

But e.g.
'evengur works fine as evengur is in the dictionary (lenited plural).

However, considering everything, it's currently usable and doing a pretty reasonable job (I've been using it on here for a few days now).

The generation side of things is fine (I think), just throws up the odd form here and there due to oddities in my dictionary file.

Alìm Tsamsiyu

What kind of support are you needing Eight? I'll help you look over the list to see if anything seems out of place.
Oeyä ayswizawri tswayon alìm ulte takuk nìngay.
My arrows fly far and strike true.

Eight

Quote from: Alìm Tsamsiyu on February 01, 2010, 02:15:19 PM
What kind of support are you needing Eight? I'll help you look over the list to see if anything seems out of place.
That's essentially the gist of it mate. I'll PM you.

'eylan na'viyä

Do any of the files still exist, whether unfinished or otherwise useful?

'eylan na'viyä

I figured out that Firefox uses Hunspell as its spell checker, which is also used by OpenOffice, Google Chrome, Opera, Thunderbird, ...
It supports prefixes and suffixes, even multiple ones, and maybe infixes.
I also found some kind of documentation:
http://www.digipedia.pl/man/doc/view/hunspell.4/

I made up a showcase dictionary addon.
-the problem with words starting/ending with ' still exists
-prefixes and suffixes are not implemented yet but it is possible

To generate an updated version, download http://eanaeltu.learnnavi.org/dicts/NaviDictionaryjm.csv,
regexp-replace "([^"]*)","[^"]*","[^"]*" with \1, and replace the content (except the first line) of nvi.dic in the .xpi (a zip file) with the result.

New regexp: replace "([^"]*)","[^"]*","([^"]*)" with \1/\2, then map n. -> N; adj. -> A; adv. -> D; v. -> V.
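The same conversion can be scripted instead of done by hand in an editor. Below is a minimal Python sketch. It assumes each CSV row has exactly three quoted fields with the word first and the part of speech last (which is what the regexp implies), and the function name csv_to_dic is made up. The flag mapping, including the "adj." key, is an assumption and has to match whatever flags the .aff file defines.

```python
import re

# Same pattern as the regexp above: word, an ignored field, part of speech.
ROW = re.compile(r'"([^"]*)","[^"]*","([^"]*)"')

# Assumed part-of-speech to Hunspell flag mapping.
POS_FLAGS = {"n.": "N", "adj.": "A", "adv.": "D", "v.": "V"}


def csv_to_dic(csv_text):
    """Convert the CSV text into Hunspell .dic content."""
    entries = []
    for line in csv_text.splitlines():
        match = ROW.fullmatch(line.strip())
        if not match:
            continue  # skip rows that do not have the expected shape
        word, pos = match.groups()
        flag = POS_FLAGS.get(pos)
        entries.append(f"{word}/{flag}" if flag else word)
    # The first line of a .dic file is the (approximate) entry count.
    return "\n".join([str(len(entries))] + entries) + "\n"
```

Note that a quoted header row, if the CSV has one, would match the pattern and end up as a dictionary entry, so it may need to be skipped explicitly.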

'eylan na'viyä

Now I've made something that is actually usable:
- single prefixes and suffixes work
- the apostrophe ('), multiple prefixes/suffixes and infixes will be at least much more complicated

Edit: This addon works for Firefox AND Thunderbird
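For reference, single suffixes in Hunspell are declared in the .aff file as SFX classes, and words in the .dic file opt in with a flag (e.g. word/N). The Python sketch below just renders such a class as text. The helper name sfx_block, the flag N and the two genitive-style endings are illustrative assumptions, not the contents of the actual add-on.

```python
def sfx_block(flag, rules):
    """Render one Hunspell SFX class.

    rules: list of (suffix, condition) pairs. Nothing is stripped from
    the stem ("0"), and "Y" allows combination with prefix classes.
    """
    lines = [f"SFX {flag} Y {len(rules)}"]
    for suffix, condition in rules:
        lines.append(f"SFX {flag} 0 {suffix} {condition}")
    return "\n".join(lines)


VOWELS = "aäeiìou"

# Assumed example: one ending after vowels, another after consonants.
aff = "\n".join([
    "SET UTF-8",
    sfx_block("N", [
        ("yä", f"[{VOWELS}]"),   # applies when the stem ends in a vowel
        ("ä", f"[^{VOWELS}]"),   # applies when the stem ends in a consonant
    ]),
]) + "\n"

if __name__ == "__main__":
    print(aff, end="")
```

Prefix classes look the same with PFX instead of SFX, with the condition matched against the start of the word rather than the end.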