Author Topic: Eana Eltu: Translator, Dictionary, API and putxìng.  (Read 20034 times)

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Back to sqool
« Reply #100 on: May 28, 2010, 05:22:37 am »
I've been asked many times whether there could be a better interface for providing data to offline applications. The API provides almost all the information you will probably ever need, but it is only usable with an active internet connection; the CSV/TSV files work offline, but they don't contain nearly as much. Until now!



It took over 15 years, 15 billion US Dollars, a few cookies and some little animals but now it's here: The SQLinator. It contains almost everything the API could tell you, but it's easily usable even offline! The file is generated every time the PDFs are.

Also, I'm looking forward to next week when I'm going to implement even MORE hats! Stay tuned.
« Last Edit: May 28, 2010, 05:50:04 am by Tuiq »
Eana Eltu: PDF/TSV/jMemorize

Offline omängum fra'uti

  • Moderator Emeritus
  • Palulukan Makto
  • *****
  • *
  • Posts: 3804
  • Karma: 127
  • Na'vi's first grammar nazi
    • Pronounced Na'vi words
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #101 on: May 28, 2010, 05:27:16 am »
Not everyone is pro-hat, you know...

Ftxey lu nga tokx ftxey lu nga tirea? Lu oe tìkeftxo.
Listen to my Na'vi Lessons podcast!

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #102 on: May 28, 2010, 05:33:57 am »
You're crazy. Everybody wants hats.
Eana Eltu: PDF/TSV/jMemorize

Offline Toruk Makto

  • LearnNavi Admin
  • Toruk Makto
  • Palulukan Makto
  • *****
  • *
  • Posts: 6199
  • nv Eywa'eveng
  • Karma: 215
  • . Txepsiyu Markì .
    • Learn Na'vi
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #103 on: May 29, 2010, 04:51:03 pm »
...and safety dance!

Lì’fyari leNa’vi ’Rrtamì, vay set ’almong a fra’u zera’u ta ngrrpongu
Na'vi Dictionary: http://files.learnnavi.org/dicts/NaviDictionary.pdf

Offline Sіr. Ηaxalot

  • Palulukan Makto
  • *****
  • Posts: 1307
  • Karma: 45
  • ¯\_(ツ)_/¯
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #104 on: May 30, 2010, 07:17:53 am »
I have to report a bug.

Some of the IPA seems to be double-encoded, e.g. the IPA for za'ärìp is "zaˈʔ.æɾ.ɪpÌ", while most of the IPA looks fine.
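
For readers unfamiliar with the term: "double encoding" usually means UTF-8 bytes were read as a single-byte encoding such as Latin-1 and then written out as UTF-8 again. The sketch below only illustrates that mechanism in Python; it assumes Latin-1 as the wrong encoding and is not a diagnosis of what Eana Eltu actually does. The helper names `double_encode` and `repair` are made up for this example. The stray "Ì" in the report is at least consistent with this: a combining grave accent (U+0300) is the bytes CC 80 in UTF-8, and CC read as Latin-1 is "Ì".

```python
def double_encode(text: str) -> str:
    """Simulate the bug: UTF-8 bytes are misread as Latin-1 (assumed cause)."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Undo one round of mis-decoding; return the text unchanged if it does not fit."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Latin-1 (so it was never
        # garbled this way) or the bytes are not valid UTF-8.
        return text


word = "za'ärìp"
garbled = double_encode(word)
print(garbled)          # za'Ã¤rÃ¬p
print(repair(garbled))  # za'ärìp
print(repair(word))     # za'ärìp (already fine, left alone)
```

A repair like this is only a stopgap; the real fix is making sure every step (database, script, output file) agrees on UTF-8.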

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #105 on: June 01, 2010, 03:43:07 pm »
Not sure, but should be fixed.
Eana Eltu: PDF/TSV/jMemorize

Offline Muzer

  • Palulukan Makto
  • *****
  • Posts: 1348
  • Karma: 10
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #106 on: June 03, 2010, 05:30:22 am »
Hmm - why does the translator want "salivew" rather than (what I'm pretty sure is correct) sivalew?
[21:42:56] <@Muzer> Apple products used to be good, if expensive
[21:42:59] <@Muzer> now they are just expensive

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #107 on: June 03, 2010, 06:37:59 am »
It defines salew as sal<1><2><3>ew (composed of sa and lew) and applies the composed-verb rules. I don't know much about Na'vi, so I need somebody to confirm or deny that behaviour.
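
To make the slot notation concrete, here is a small Python sketch of how a template with <1>, <2>, <3> markers could be filled in. It is not Eana Eltu's code: the function `apply_infixes` and the assumption that <1>, <2>, <3> correspond to infix positions 0, 1 and 2 are mine, and the second template, s<1><2>al<3>ew, is only a guess derived from the form "sivalew" mentioned above.

```python
import re


def apply_infixes(template: str, pos0: str = "", pos1: str = "", pos2: str = "") -> str:
    """Fill the <1>, <2>, <3> slots of a verb template (empty string = no infix)."""
    slots = {"<1>": pos0, "<2>": pos1, "<3>": pos2}
    return re.sub(r"<[123]>", lambda m: slots[m.group(0)], template)


# Template as currently stored, treating the verb as sa + lew:
print(apply_infixes("sal<1><2><3>ew", pos1="iv"))  # salivew

# Hypothetical template with the first two slots after the initial consonant:
print(apply_infixes("s<1><2>al<3>ew", pos1="iv"))  # sivalew
```

So the odd output would come purely from where the slots sit in the stored template, not from the infix rules themselves.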
Eana Eltu: PDF/TSV/jMemorize

Offline omängum fra'uti

  • Moderator Emeritus
  • Palulukan Makto
  • *****
  • *
  • Posts: 3804
  • Karma: 127
  • Na'vi's first grammar nazi
    • Pronounced Na'vi words
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #108 on: June 03, 2010, 06:48:09 am »
Lew is an adjective, not a verb, and sa isn't even a word, so there are two reasons that's wrong right there. The meaning of lew has little to do with salew.
Ftxey lu nga tokx ftxey lu nga tirea? Lu oe tìkeftxo.
Listen to my Na'vi Lessons podcast!

Offline 'eylan na'viyä

  • Omatikaya
  • ****
  • Posts: 447
  • Karma: 10
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #109 on: June 08, 2010, 02:45:30 pm »
Hi Tuiq,
Some time ago I made a spell-checking dictionary based on hunspell, which is used by many applications.
http://forum.learnnavi.org/your-projects-other-resources/navify-software/
But hunspell has a problem: it does not support infixes. So I'd like to ask whether you could make a script that exports a hunspell .dic file from the database, including the most common conjugations of the verbs.
The file format is simple. Here is the current dictionary file:
http://forum.learnnavi.org/your-projects-other-resources/navify-software/?action=dlattach;attach=4950

The letter after the / defines the word type. I'd work out the replacement rules for all the types.

It would be cool if you could implement this export function.

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #110 on: June 08, 2010, 02:58:23 pm »
"Most common" is kind of hard to pin down. I mean, for example, which are the most used verb tenses in English? Also, what's that number at the beginning? Size?

Just thinking: there are, I guess, about 2*3*4 infix combinations per verb (and Eana does not even support smashed forms yet), which would create 24 entries for one verb. That's an awful lot: there are about 210 verbs, and 210 * 24 = 5040 entries is more than all the words we have so far (~1000).
Eana Eltu: PDF/TSV/jMemorize

Offline 'eylan na'viyä

  • Omatikaya
  • ****
  • Posts: 447
  • Karma: 10
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #111 on: June 08, 2010, 05:45:57 pm »
Actually I don't know what the number at the beginning is. I think it's an identification code and can be a random number. I just copied it from another file and it worked in every application I used it in.

I think these infixes would cover 99% of daily usage (there are more, but most are mere assumptions):

position 0: eyk | äp | awn | us | -
position 1: am | ìm | ìy | ay | er | ol | arm | ìrm | *ìry | *ary | *alm | *ìlm | *ìly | *aly | asy | ìsy | iv | imv | iyev | ìyev | irv | ilv | -
position 2: äng | ei | uy | ats | -
* = not in the PDF; awn and us need to be treated as adjectives

5 * 23 * 5 * 210 = 575 * 210 = 120 750

That's an incredibly large number, but the file only needs to be machine-readable and machine-writable, and it still would be. E.g. the Belarusian dictionary has 1 570 000 lines.

But I think for now it would be enough if only 1 infix per word were recognized.
That would be (4+22+4+1)*210 = 31*210 = 6 510 entries. Still a lot, but it would result in quite a small file for a dictionary.
Or are there other limitations than file size?
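
The counts in this post can be checked with a few lines of Python. The three lists are copied from the infix table above (with the empty string standing for "no infix" and the * markers dropped); the figure of 210 verbs is the estimate given earlier in the thread, and the variable names are just for illustration. Note that the single-infix case works out to 31 * 210 = 6510.

```python
from itertools import product

# Infixes per position; "" means the position is left empty.
POS0 = ["eyk", "äp", "awn", "us", ""]
POS1 = ["am", "ìm", "ìy", "ay", "er", "ol", "arm", "ìrm", "ìry", "ary",
        "alm", "ìlm", "ìly", "aly", "asy", "ìsy", "iv", "imv", "iyev",
        "ìyev", "irv", "ilv", ""]
POS2 = ["äng", "ei", "uy", "ats", ""]
VERBS = 210  # rough number of verbs quoted in the thread

# Every combination of one choice per position.
all_combos = list(product(POS0, POS1, POS2))

# Only the combinations that use at most one infix in total.
single = [c for c in all_combos if sum(1 for infix in c if infix) <= 1]

print(len(all_combos), len(all_combos) * VERBS)  # 575 120750
print(len(single), len(single) * VERBS)          # 31 6510
```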
« Last Edit: June 08, 2010, 06:47:15 pm by 'eylan na'viyä »

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #112 on: June 09, 2010, 04:55:37 am »
It's the time. It already takes up to 30 seconds to generate the SQL, the dictionaries, CSV and TSV. If I add many more outputs (or one big...), this could very easily slow the whole thing down a lot. If that's the case I'll have to switch the system: dictionaries would be automagically generated at midnight, which may break all addons. I'll have to think about that. It's quite complicated.

However, are there any other people/apps that could benefit from that format?
Eana Eltu: PDF/TSV/jMemorize

Offline 'eylan na'viyä

  • Omatikaya
  • ****
  • Posts: 447
  • Karma: 10
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #113 on: June 09, 2010, 10:48:46 am »
30 seconds is quite a lot for such a small file. Do you know what exactly takes that long? If it's only the database access, I think this wouldn't take much longer, because each verb still has to be loaded only once.

At the moment I don't know of other uses for the resulting file, but a word list is a very basic thing that could be useful for things nobody is thinking about yet.

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #114 on: June 09, 2010, 01:47:14 pm »
"Small file?" You're kidding, right? NaviDictionary.pdf, NaviCatDictionary.pdf, DictionaryNavi.pdf: three PDFs have to be created out of .tex. This doesn't work at supersonic speed. Then there is the giant SQL file, the jMemorize file, a CSV file, a TSV file. All of this takes some time.
Eana Eltu: PDF/TSV/jMemorize

Offline 'eylan na'viyä

  • Omatikaya
  • ****
  • Posts: 447
  • Karma: 10
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #115 on: June 09, 2010, 03:57:05 pm »
The PDFs are big, that's true. I misread and thought it was 30 seconds per file. The dictionary that I made is only 7k. Compared to creating 3 PDFs, inflecting some verbs should not take that long, I guess.

Edit: I have made the replacement table now:
Quote
adj.          A
adj., adv.    A & D
adj., intj.   A & J
adj., n.      A & N
adp.          Z
adv.          D
adv., intj.   D & J
conj.         C
dem.          N
dem., pn.     N
inter.        I
intj.         J
n.            N
n., adv.      N & D
n., intj.     N & J
num.          A
part.         C
phrase        J
pn.           N
pn., adv.     N & D
prefix        Z
v.            V
v., intj.     V & J
""            J

Maybe it's easier and better to split the type on "," than to handle each of the combinations like "n., adv." individually.
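
The splitting idea could look roughly like this in Python. This is only a sketch: `FLAGS`, `flags_for` and `dic_lines` are hypothetical names, the two sample entries are just the words discussed earlier in the thread, and it assumes single-letter flags that the matching .aff file actually defines. As far as I know, Hunspell's documented .dic format starts with a line holding the approximate number of entries, followed by one word/FLAGS line per entry, so the sketch writes the count first.

```python
# Single-type mapping taken from the replacement table; "" is an entry with no type.
FLAGS = {
    "adj.": "A", "adp.": "Z", "adv.": "D", "conj.": "C", "dem.": "N",
    "inter.": "I", "intj.": "J", "n.": "N", "num.": "A", "part.": "C",
    "phrase": "J", "pn.": "N", "prefix": "Z", "v.": "V", "": "J",
}


def flags_for(word_type: str) -> str:
    """Split a type such as 'n., adv.' on ',' and join the flags without duplicates."""
    seen = []
    for part in word_type.split(","):
        flag = FLAGS[part.strip()]  # raises KeyError for an unknown type
        if flag not in seen:
            seen.append(flag)
    return "".join(seen)


def dic_lines(entries):
    """Build .dic lines: the entry count first, then word/FLAGS for each entry."""
    return [str(len(entries))] + [f"{word}/{flags_for(t)}" for word, t in entries]


print(flags_for("n., adv."))   # ND
print(flags_for("dem., pn."))  # N
print(dic_lines([("lew", "adj."), ("salew", "v.")]))  # ['2', 'lew/A', 'salew/V']
```

With this, a new combination such as "adv., n." needs no extra table row; only a completely new type has to be added to the mapping.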

I'm also making a script that can generate dictionary packages with installers for every supported application automatically.
« Last Edit: June 09, 2010, 04:49:04 pm by 'eylan na'viyä »

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #116 on: June 11, 2010, 04:55:56 am »
It's just that, as of now, the files are FTP'ed to learnnavi.org, which takes the most time. Transferring one more file will take even longer, and it's already at the point where you can really feel every second you have to wait. Not to mention that aborted scripts could trigger hell.
Eana Eltu: PDF/TSV/jMemorize

Offline 'eylan na'viyä

  • Omatikaya
  • ****
  • Posts: 447
  • Karma: 10
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #117 on: June 11, 2010, 06:02:31 am »
You would not need to copy the file anywhere. You only need to trigger this script, and it will download the file and generate the packages. It's almost complete now.
« Last Edit: October 07, 2010, 06:27:47 pm by 'eylan na'viyä »

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #118 on: June 13, 2010, 05:03:06 am »
All Eana services will be down tomorrow for a few hours/days. The already generated PDF, TSV and SQL files hosted on eanaeltu.learnnavi.org are not affected.
Eana Eltu: PDF/TSV/jMemorize

Offline Tuiq

  • Tute
  • ***
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Eana Eltu: Translator, Dictionary, API and putxìng.
« Reply #119 on: June 22, 2010, 03:14:23 pm »
Changed the jMemorize format from .csv to .tsv. The same change applies to the filename; jm.csv won't be updated anymore.
Eana Eltu: PDF/TSV/jMemorize

 
