Conquer MS Word!!

Started by Taronyu Ayunilyä Alahe, February 19, 2010, 07:16:15 AM


Taronyu Ayunilyä Alahe

just a simple idea...
let's make MS Word recognize Na'vi :)
right click on an underlined Na'vi word (the underline appears when MS Word does not recognize a word or something) and press the "add to dictionary" thing..

hm, but that will make the Na'vi words mix with English... can we make a whole new language mode in MS Word??

any ideas?


(oh and is this the right place to place this topic? o.O)
ke plltxe ngeyä kawng tìrey lu

'eylan na'viyä

I don't use Word, but maybe the spell-check system has an import function.

Eight

Looked at this when I started the Firefox dictionary - it's certainly possible.

That reminds me, must find time to finish that project off. :(

hawnuyuna'viyä

Refer to the MS support article for custom dictionaries: http://support.microsoft.com/kb/322198.

This takes you through creation and importing.
It will simply be a case of typing the words into their system instead.
This can be simplified, however, since the dictionary format is simply a CR+LF (DOS newline) separated list of words, saved with a .dic extension.

So simply create a file, rename it so that it has a .dic extension, then enter each word you want in the dictionary on a new line. (Just don't forget the different cases you will need to enter the words in.)
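
For anyone who would rather generate the file than type it, here is a minimal Python sketch of the format described above: one word per line, CR+LF line endings. The function name and the sample file name are made up for illustration, and the default encoding is an assumption (recent versions of Word appear to store custom dictionaries as UTF-16, so check what your version expects).

```python
def write_custom_dic(words, path, encoding="utf-16"):
    """Write one word per line with CR+LF line endings; return the word count."""
    # Drop blank entries and duplicates, and keep a stable sorted order.
    unique = sorted({w.strip() for w in words if w.strip()})
    # newline="\r\n" makes Python translate every "\n" written into CR+LF.
    # The encoding default is an assumption, not something the KB article is quoted on.
    with open(path, "w", encoding=encoding, newline="\r\n") as handle:
        for word in unique:
            handle.write(word + "\n")
    return len(unique)
```

Calling `write_custom_dic(["kaltxì", "Kaltxì", "irayo"], "navi.dic")` would produce a three-line file that can then be added through Word's custom dictionary dialog.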


Our Lady of Toast

The tricky thing is that those custom dictionaries are really intended to extend your main language (such as English).

I've been corresponding with an Office developer at Microsoft about the spellcheck, and extending it to languages that aren't currently supported.  (I'm a linguist and we regularly have to include a lot of text in other languages.)  Obviously there is not going to be an official Na'vi spellchecker coming out from Microsoft any time soon, but I'm hoping that if they give me enough information to let me create something for a human language, we'd be able to use it for Na'vi as well.

Kaltxì Palulukan!

Custom dictionaries are very easy in MS Word. Just make sure to save these words ONLY in your custom dictionary ;)

Nume fpi sänume

Yes, I've added quite a few words to MS Word's custom dictionary.

Taronyu Ayunilyä Alahe

Quote from: hawnuyu na'viyä on February 19, 2010, 04:24:44 PM
Refer to the MS support article for custom dictionaries: http://support.microsoft.com/kb/322198.

This takes you through creation and importing.
It will simply be a case of typing the words into their system instead.
This can be simplified, however, since the dictionary format is simply a CR+LF (DOS newline) separated list of words, saved with a .dic extension.

So simply create a file, rename it so that it has a .dic extension, then enter each word you want in the dictionary on a new line. (Just don't forget the different cases you will need to enter the words in.)


oh man, I'm feeling stupid and slow today so I dunno what those words actually mean...
hm, so can we make a new language, or does Microsoft have to make it?

EDIT: I did it! I made a new language spellcheck thingy! now all I have to do is.. add the words ><

EDIT: oh man, even if I add all the words I still have to add the words with tenses :O

hawnuyuna'viyä

Yep, that is the slow bit.

We could just write a little Ruby script to parse the jMemorize list (http://forum.learnnavi.org/your-projects-other-resources/new-jmemorize-lists/). Since it records which words are verbs, we could then apply the infixes, but I am not sure how we would explain to the computer where to apply them.

Or we could use the list to pull out everything but the verbs and put that straight into the dictionary, with the verbs then added manually with their infixes. (Probably the better idea.)
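
As a rough illustration of the second option, here is a short sketch in Python (rather than Ruby) that separates verbs from everything else. The input layout is an assumption: each row is taken to be `word<TAB>part of speech`, with verb tags starting with "v". The real jMemorize export may look different, so the parsing line would need adjusting.

```python
def split_verbs(lines):
    """Return (non_verbs, verbs) from rows of the form 'word<TAB>part-of-speech'."""
    non_verbs, verbs = [], []
    for line in lines:
        line = line.strip()
        if not line or "\t" not in line:
            continue  # skip blank or malformed rows
        word, pos = line.split("\t", 1)
        # Hypothetical tagging scheme: verb tags start with "v" (e.g. "v.", "vtr.").
        target = verbs if pos.strip().lower().startswith("v") else non_verbs
        target.append(word.strip())
    return non_verbs, verbs
```

The first list can go straight into the dictionary file; the second is the pile that still needs infixed forms.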

'eylan na'viyä

I think that with the help of eanaeltu it might be possible to create all possible infix combinations, but I think there will be far too many combinations to add them all to a dictionary. Identifying the verbs would take some scripting, but I don't know if that is possible in MS Word.

hawnuyuna'viyä

No, I was suggesting that we use scripting to help in the generation of the dictionary OUTSIDE of MS Word.

The only bit that would take human intervention would be the generation of all possible infix combinations with all verbs.

'eylan na'viyä

Quote from: hawnuyu na'viyä on February 20, 2010, 11:20:52 AM
No, I was suggesting that we use scripting to help in the generation of the dictionary OUTSIDE of MS Word.
I got that, and I agree that this is useful.

Quote from: hawnuyu na'viyä on February 20, 2010, 11:20:52 AM
The only bit that would take human intervention would be the generation of all possible infix combinations with all verbs.

It's not only that this would be quite a lot of work; I fear that it could blow the dictionary size up to X MB.

Here is a rough calculation of the number of possibilities.

I don't know if all combinations are possible, logical or practical:

(eyk|äp|us|-)(am|ìm|-|ìy|ay)(ol|-|er)(iv|-)(ei|äng|uy|ats|-)
4 * 5 * 3 * 2 * 5 = 600

In the worst case it could even be:

(eyk|-)(äp|-)(us|-)(am|ìm|-|ìy|ay)(ol|-|er)(iv|-)(ei|äng|-)(uy|-)(ats|-)
2 * 2 * 2 * 5 * 3 * 2 * 3 * 2 * 2 = 2880

I guess you can see now that this is quite a lot for just one verb.
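
The two totals can be double-checked with a few lines of Python. This is only a counting sketch: the slot lists follow the two estimates above (with "ay" as the general future infix and an empty string standing for "-", i.e. no infix), and it says nothing about which combinations are actually grammatical.

```python
from itertools import product
from math import prod

# Each inner list is one slot; "" means the slot is left empty.
SIMPLE_SLOTS = [
    ["eyk", "äp", "us", ""],
    ["am", "ìm", "", "ìy", "ay"],
    ["ol", "", "er"],
    ["iv", ""],
    ["ei", "äng", "uy", "ats", ""],
]
WORST_CASE_SLOTS = [
    ["eyk", ""], ["äp", ""], ["us", ""],
    ["am", "ìm", "", "ìy", "ay"],
    ["ol", "", "er"],
    ["iv", ""],
    ["ei", "äng", ""], ["uy", ""], ["ats", ""],
]

def count_combinations(slots):
    """Multiply the number of choices in each slot."""
    return prod(len(slot) for slot in slots)

def all_combinations(slots):
    """List every combination as a tuple with one choice per slot."""
    return list(product(*slots))

print(count_combinations(SIMPLE_SLOTS), count_combinations(WORST_CASE_SLOTS))
```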

hawnuyuna'viyä

Oh, I see what you mean. However, it will not make the dictionary large (MB-wise; it will be long, though).

Assuming the worst case of 2880 forms for one verb (it shouldn't really be this big) at 30 characters each, stored as UTF-8 with all 4 bytes required for every character (not actually the case): 338KB.

A more reasonable case of 600 forms at 20 characters each, stored as UTF-8 (1 byte for 18 of the characters, 2 bytes for the other 2): about 13KB, or 14KB once the CR+LF line endings are counted.
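
The arithmetic behind these estimates can be reproduced with a small sketch. The function and parameter names are made up for illustration; the optional line-ending bytes are an extra that the estimate above does not spell out.

```python
def size_kib(forms, single_byte_chars, multi_byte_chars, multi_byte_width,
             line_ending_bytes=0):
    """Estimated file size in KiB for `forms` words of a fixed byte layout."""
    per_word = (single_byte_chars
                + multi_byte_chars * multi_byte_width
                + line_ending_bytes)
    return forms * per_word / 1024

worst = size_kib(2880, 0, 30, 4)          # 30 characters, 4 bytes each
typical = size_kib(600, 18, 2, 2)         # 18 one-byte + 2 two-byte characters
typical_crlf = size_kib(600, 18, 2, 2, line_ending_bytes=2)
print(round(worst), round(typical), round(typical_crlf))
```

Even the pessimistic figure stays well under a megabyte per verb, which is the point being made.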

'eylan na'viyä

If it doesn't mean too much additional work, a dictionary for OpenOffice would be nice too.

http://wiki.services.openoffice.org/wiki/Extension_Dictionaries
It seems that these .oxt files are zip files with some XML files in them.
The actual dictionary file (.dic) looks quite uncomplicated.
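
If the .oxt really is an ordinary zip archive, as it seems, its contents can be inspected with Python's standard zipfile module. This is just a sketch for poking around an existing extension; the function names are made up here.

```python
import zipfile

def list_extension_members(oxt_path):
    """Return the sorted names of the files packed inside an .oxt extension."""
    with zipfile.ZipFile(oxt_path) as archive:
        return sorted(archive.namelist())

def find_dictionary_files(oxt_path):
    """Pick out the word list (.dic) and affix (.aff) files, if any."""
    return [name for name in list_extension_members(oxt_path)
            if name.lower().endswith((".dic", ".aff"))]
```

Running `find_dictionary_files` on a downloaded dictionary extension should show which files hold the actual word list.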

Eight

Done it already (most of the way anyway) and generating the possible combinations is far from a nightmare... and with our current word list the output is not huge at all.

It could be - if Na'vi's word list grows to the size of a real world language - which is why dictionary systems are typically made up of a .dic file of words and roots, and a .aff file describing rules for affixes etc. I didn't go down that route for two reasons

  • just because at this point there is no need, and it is far easier to write a program/script to do the work than master those affix files.
  • and the first test was to see whether forms that include dashes and <>s suitable for the beginners forum would work (which would have been a problem in the .aff files)

hawnuyuna'viyä

Quote from: 'eylan na'viyä on February 20, 2010, 02:15:20 PM
If it doesn't mean too much additional work, a dictionary for OpenOffice would be nice too.

http://wiki.services.openoffice.org/wiki/Extension_Dictionaries
It seems that these .oxt files are zip files with some XML files in them.
The actual dictionary file (.dic) looks quite uncomplicated.

It will be quite possible to create the dictionary for OpenOffice as well.
The actual word list format is almost identical; it simply has a bit more XML to tell it what the dictionary actually is.

Quote from: Eight on February 20, 2010, 02:28:27 PM
Done it already (most of the way anyway) and generating the possible combinations is far from a nightmare... and with our current word list the output is not huge at all.
May I ask how you generated the word list with all combinations?

Quote from: Eight on February 20, 2010, 02:28:27 PM
It could be - if Na'vi's word list grows to the size of a real world language - which is why dictionary systems are typically made up of a .dic file of words and roots, and a .aff file describing rules for affixes etc. I didn't go down that route for two reasons

  • just because at this point there is no need, and it is far easier to write a program/script to do the work than master those affix files.
  • and the first test was to see whether forms that include dashes and <>s suitable for the beginners forum would work (which would have been a problem in the .aff files)
For MS Word we would only be generating a 'custom' dictionary rather than a 'proper' one, so its system would not allow us to split entries into root + affix.

hawnuyuna'viyä

Then, because it uses the same format as the MS Word dictionary (plus a header), we could create an Aspell dictionary, which would then work across all KDE applications (assuming spell check was enabled when they were compiled).

See http://samat.org/2008/11/02/creating-your-own-personal-aspell-dictionary
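
A rough sketch of building such a file in Python follows. The header layout (`personal_ws-1.1`, language code, word count, optional encoding) is how I understand the Aspell personal dictionary format; the "en" language code is only a placeholder assumption, since there is no Na'vi language pack, and Aspell may reject words containing characters its language data does not allow.

```python
def aspell_personal_wordlist(words, lang="en", encoding="utf-8"):
    """Return the text of an Aspell personal word list: a header line, then one word per line."""
    unique = sorted({w.strip() for w in words if w.strip()})
    # Header fields: format tag, language code (placeholder), word count, encoding.
    header = f"personal_ws-1.1 {lang} {len(unique)} {encoding}"
    return "\n".join([header, *unique]) + "\n"
```

The returned text would be saved under whatever personal dictionary file name your Aspell setup expects.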

Eight

Quote from: hawnuyu na'viyä on February 20, 2010, 02:39:02 PM
May I ask how you generated the word list with all combinations?
.NET program - processes the word list from the rhyming dictionary and does stuff like:

add dictionary form to output
plurality for nouns - add prefixes (lenition is easy to code) - add word(s) to output
run base form and new plural forms through the case functions below:
extra for nouns - inflect for erg (i.e. if it ends in a Na'vi vowel add -l, if not, add -ìl) - add word to output
                    - inflect for dat (i.e. if it ends in a Na'vi vowel add -r, if not, add -ur) - add word to output
                    - inflect for dat2 (i.e. if it ends in a Na'vi vowel then die, if not, add -ru) - add word to output
                    - etc. etc.
extra for pronouns - much the same as nouns - but no plurality as my word list includes those forms anyway
extra for adjectives - add a- to front - add word to output
                     - if it ends in a, do nothing, else add -a - add word to output
etc. etc.
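
To make the outline concrete, here is a small Python sketch of two of the noun cases and the adjective rule exactly as listed above. It is not the .NET program itself: the function names and the "n"/"adj" class labels are made up, plurals and the remaining cases are left out, and the vowel check is a simplification that treats anything not ending in a plain vowel letter as a consonant.

```python
NAVI_VOWELS = set("aäeiìou")

def ends_in_vowel(word):
    """True if the last letter is one of the plain Na'vi vowel letters."""
    return word[-1].lower() in NAVI_VOWELS

def ergative(noun):
    # -l after a vowel, -ìl otherwise
    return noun + ("l" if ends_in_vowel(noun) else "ìl")

def dative(noun):
    # -r after a vowel, -ur otherwise
    return noun + ("r" if ends_in_vowel(noun) else "ur")

def adjective_forms(adj):
    # a- on the front, and -a on the end unless the word already ends in a
    return ["a" + adj, adj if adj.endswith("a") else adj + "a"]

def expand(word, word_class):
    """Return the dictionary form plus its generated forms, without duplicates."""
    forms = [word]
    if word_class == "n":
        forms += [ergative(word), dative(word)]
    elif word_class == "adj":
        forms += adjective_forms(word)
    return list(dict.fromkeys(forms))  # drops repeats, keeps order
```

For example, `expand("nantang", "n")` gives the base form plus `nantangìl` and `nantangur`, and the output of every word can simply be concatenated into the final list.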

Verbs are a bit more of a pain, I could have specified the syllables in my word list but I didn't. But you can use your knowledge of Na'vi morphology to find the infix positions with relative ease. And your functions want to be somewhat recursive for dealing with multiple affixes.

Then you end up with some forms that don't make a huge amount of sense due to the word classes given in Taronyu's dictionary, so you make sure the program has a simple blacklist function that you can use to stop invalid forms making it into the final output.
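
A stripped-down Python sketch of the verb step and the blacklist might look like this. Everything about the input is an assumption made for illustration: the verb is given as a template with two "." markers showing the infix positions (the word list described above does not carry them), only one infix per position is tried instead of the recursive stacking mentioned above, and the infix lists are deliberately short.

```python
from itertools import product

# Illustrative subsets only; "" means no infix in that position.
FIRST_POSITION = ["", "am", "ìm", "ìy", "ay", "ol", "er"]
SECOND_POSITION = ["", "ei", "äng"]

def verb_forms(template, blacklist=frozenset()):
    """Fill both marked positions with every infix pair, skipping blacklisted forms."""
    parts = template.split(".")
    if len(parts) != 3:
        raise ValueError("expected exactly two '.' position markers")
    head, middle, tail = parts
    forms = []
    for first, second in product(FIRST_POSITION, SECOND_POSITION):
        form = head + first + middle + second + tail
        if form not in blacklist:  # the simple blacklist check
            forms.append(form)
    return forms
```

With the template `t.ar.on` this yields forms such as `taron` and `tolaron`, and anything that turns out to be nonsense can be added to the blacklist set rather than special-cased in code.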

Doesn't need to be elegant, just needs to work.

Taronyu Ayunilyä Alahe

wow I'm quite surprised at the responses... I just started yesterday and not much has been done. I'm going on a 5-day trip tomorrow, soo I'll analyse everyone's help/blah then and continue filling in words. irayo for everything! :)

'eylan na'viyä

I made one for OpenOffice out of the Hunspell dictionary I created for Firefox:
http://forum.learnnavi.org/your-projects-other-resources/navify-software/msg203197/#msg203197
Installation:
download
double-click and install
when creating a new document, go to tools->language->all the text->more...
out of the list for "western", choose Esperanto (<- I know it's stupid, but I had to choose a language already in the list or it would not have been displayed)
click OK and it should work