Author Topic: Translating Taronyu's dictionary  (Read 20640 times)


Offline marcin1509

  • Ketuwong
  • *
  • Posts: 35
  • Karma: 0
Re: Translating Taronyu's dictionary
« Reply #120 on: June 30, 2016, 07:57:29 am »
Maybe we could create one app that brings together all the translations in different languages.
Maybe I'll try to write it in Java + SQLite. After that, I might try writing it in another technology.

Offline Tìtstewan

  • LearnNavi Zeykoyu
  • Toruk Makto
  • Palulukan Makto
  • *****
  • *
  • *
  • Posts: 9804
  • de Germany
  • Karma: 321
  • Ke lu oeru kea krr krrtalun!
    • My YouTube Channel
Re: Translating Taronyu's dictionary
« Reply #121 on: June 30, 2016, 08:50:38 am »
If I am not mistaken, there is an app in development that supports, or is planned to support, multiple languages. *looks at Tirea Aean* :P

-| Dict-Na'vi.com | Na'viteri Files | FAQ | LM | Puk Pxaw 'Rrta | Kem si fu kem rä'ä si, ke lu tìfmi. |-

Offline marcin1509

Re: Translating Taronyu's dictionary
« Reply #122 on: June 30, 2016, 09:20:50 am »
I meant a Windows application, not a mobile (Android) one. Maybe a universal app for Windows Mobile and Windows 10 desktop.

Offline Tirea Aean

  • The Blue One
  • Olo'eyktan Anawm
  • Palulukan Makto
  • *****
  • *
  • *
  • *
  • Posts: 9847
  • nv Eywa'eveng
  • Karma: 243
  • Oeri ran lu srung
    • Tirea Aean
Re: Translating Taronyu's dictionary
« Reply #123 on: July 02, 2016, 11:23:16 am »
Quote
I meant a Windows application, not a mobile (Android) one. Maybe a universal app for Windows Mobile and Windows 10 desktop.

I've already been working on this for years. :D

Which reminds me, I need to somehow get around to compiling it for Windows (it's already compiled for Linux now). See my GitHub, http://github.com/tirea, specifically the Fwew and possibly the vrrtepcli projects. The only issue with these for Windows users is that no one likes to use cmd.exe as an interface on Windows. HRH. So if you pull this off with a nice GUI, I might just stop cross-compiling my apps ;)

kelku ikranä a hawnventi yom podcast (na'vi-only): https://tirearadio.com/podcast
Learn Na'vi Discord Chat: https://discord.gg/WF6qcmv

Offline Tirea Aean

Re: Translating Taronyu's dictionary
« Reply #124 on: January 19, 2017, 02:22:35 am »
Not to resurrect the dead, but...


I'm just not sure what to make of our Russian portion of the database. It's been a pile of question marks in localizedWords for a couple years now. What happened and how can this be fixed?


As you can see, the localizedInfixes table looks okay, but localizedWords is just not working out for our Russian-speaking friends. This bug is present in production, so the exported SQL file contains the question marks too, and so do the projects that use this data (such as vrrtepcli and fwew).


Offline Tìtstewan

Re: Translating Taronyu's dictionary
« Reply #125 on: January 19, 2017, 03:17:26 am »
Apparently it's because there was an issue when saving Russian characters in the database. I have the same issue when using the database for the dictionary generator.
Code: (php)
<?php
// Select your language. The following are available (according to EE's NaviData.sql):
// eng, de, pl, est, hu, sv, nl and ru (ru = Russian currently shows only ??? instead of words, so don't use it)
$lang = "'eng'";
?>


Offline Tuiq

  • Eyktan
  • Tute
  • *****
  • Posts: 364
  • Karma: 18
  • I am the terror that flaps in the night.
Re: Translating Taronyu's dictionary
« Reply #126 on: January 19, 2017, 04:57:39 am »
I don't think the issue occurs when saving into the database. The problem was the encoding of the characters. While latin1 works fine for English, German and various other languages that don't stray too far from the Latin alphabet, Russian's Cyrillic alphabet isn't part of latin1, which results in the question marks you see.
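
To make the latin1 point concrete, here is a minimal Python sketch. It only imitates the effect (a lossy conversion that substitutes '?' for anything the charset cannot represent); it is not the actual code path of the export script, and the helper name is made up for illustration.

```python
# Illustration only: push sample words through latin1 the lossy way,
# substituting '?' for every character the charset cannot represent.
def to_latin1_lossy(text: str) -> str:
    """Round-trip text through latin1, replacing unencodable characters with '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


german = to_latin1_lossy("berühren")  # ü exists in latin1, so the word survives
russian = to_latin1_lossy("трогать")  # latin1 has no Cyrillic, so every letter is lost

print(german)
print(russian)
```

The German word comes back unchanged, while the Russian one collapses into one question mark per letter, which matches what the thread describes for localizedWords.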

EE's main database contains the proper data (see the attachment), so it seems like the encoding is not properly written into the exported files (as the TSV shows the same issue). This is kind of weird... I think I'm exporting it explicitly as UTF-8, as is obvious when you take a look at the German words ('berühren'). I'm not entirely sure why it doesn't work for the Russian one. Looking at it, there seem to be other characters that don't seem to be properly encoded either.

So, errr... It's been a few (... 3? 4? 5? 6?) years since I've actually messed around on the server, or the database for that matter. I can't tell you more than something with the encoding doesn't seem right when exporting the files; but it's right in the web interface. So I suppose I would start investigating where the mistake in the export script lies... Is the encoding for the output stream not properly set, or even ignored? Is the file okay, but the webserver serves it wrong? Is the data read somehow corrupted from the database?

I think by now I've lost all access to EE/the database, so if I should take a look at that, I would need the credentials, URIs and all that again.

Edit: Okay, so the fact that words don't work but infixes do is odd. Like, really odd. I suspect that something in SpeakNavi is not dealing with the UTF-8 properly, but then again, why is only THAT affected?
« Last Edit: January 19, 2017, 01:51:51 pm by Tuiq »
Eana Eltu: PDF/TSV/jMemorize

Offline Tirea Aean

Re: Translating Taronyu's dictionary
« Reply #127 on: January 19, 2017, 03:14:47 pm »
Those screenshots I posted were from the server's MySQL via SSH. I wasn't sure of the source of the bug. I ran mysql with the --default-character-set=utf8 parameter. I didn't find anything wrong with other languages when selecting stuff from the table. ¯\_(ツ)_/¯


Offline Tuiq

Re: Translating Taronyu's dictionary
« Reply #128 on: January 19, 2017, 03:24:42 pm »
Using another client displays it correctly. In all likelihood, the font you're using does not support Cyrillic characters?

Some other languages seem to be broken, too. For example, VALUES ('8','pl','M?otog?ów','rz.') seems a bit borked - I don't expect it to have a question mark in the middle of the word... twice.
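
A small sketch of how one might check which characters of a word go missing. The original spelling used here ("Młotogłów") is an assumption reconstructed from the damaged value above, and latin1 as the culprit charset is likewise assumed; the function name is hypothetical.

```python
from typing import List


def unrepresentable(text: str, charset: str = "latin-1") -> List[str]:
    """Return the characters of text that cannot be encoded in charset."""
    lost = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            lost.append(ch)
    return lost


# Assumed original Polish spelling, written with escapes so the exact
# code points are unambiguous: U+0142 is ł, U+00F3 is ó.
word = "M\u0142otog\u0142\u00f3w"

lost = unrepresentable(word)
damaged = word.encode("latin-1", errors="replace").decode("latin-1")

print(lost)
print(damaged)
```

Under that assumption only ł is lost (ó is part of latin1), which would explain why the word has exactly two question marks while the ó survives.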

Offline Tirea Aean

Re: Translating Taronyu's dictionary
« Reply #129 on: January 19, 2017, 03:26:54 pm »
But the Cyrillic chars in the localizedInfixes table query look fine. (See other screenshot up there)


Offline Tìtstewan

Re: Translating Taronyu's dictionary
« Reply #130 on: January 19, 2017, 03:43:10 pm »
For me, the question marks are in the NaviData.sql file I use for the generator.

EDIT: And I use mysqli_set_charset($db_link, 'utf8'); in my script.


Offline Tuiq

Re: Translating Taronyu's dictionary
« Reply #131 on: January 19, 2017, 03:52:45 pm »
Right, so let's clear up some misunderstandings first...

There are four data sets (or rather, steps) involved:

1. The original database, which also happens to be MySQL. Here, everything is OK.
2. Because the original dataset was meant for LaTeX, it's rather ugly. Therefore, the script reads the data, transforms it into a nicer form, and makes it available for exporters.
3. The SQL exporter takes the refined data and writes it in a SQL format. We know that this SQL is wrong, as it omits certain characters... in certain blocks.
4. You're loading the SQL again in a MySQL database.

So either step 2 or step 3 is borking. Step 1 can't be, because it's fine in my database - see the example (dumped as JSON):

Quote
[
    {
        "id": "1",
        "arg1": null,
        "arg2": null,
        "arg3": "\u043f\u0435\u0440.",
        "arg4": "\u0442\u0440\u043e\u0433\u0430\u0442\u044c",
        "arg5": null,
        "arg6": null,
        "arg7": null,
        "arg8": null,
        "arg9": null,
        "arg10": null,
        "odd": ""
    },
    {
        "id": "2",
        "arg1": null,
        "arg2": null,
        "arg3": "\u0441\u0443\u0449.",
        "arg4": "\u043c\u043e\u043b\u043e\u0442\u043e\u0433\u043b\u0430\u0432 (\u0436\u0438\u0432\u043e\u0442\u043d\u043e\u0435)",
        "arg5": null,
        "arg6": null,
        "arg7": null,
        "arg8": null,
        "arg9": null,
        "arg10": null,
        "odd": "",
        "lc": "ru"
    },

It's encoded for JSON, but if you're evaluating it, it's the proper stuff (трогать, молотоглав (животное)). So step 1 isn't breaking. Either 2 or 3 is acting up, and I'm not sure which one it is yet. I'll need to get access to the current scripts that are exporting the stuff, so I can see what's going on.
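
For anyone who wants to verify the "evaluating it" part, here is a minimal Python sketch. The excerpt below is a trimmed copy of the dump above (only id, arg3 and arg4 are kept, and the array is closed so that it parses); nothing else is assumed about the database.

```python
import json

# Trimmed copy of the JSON dump: only the fields that carry text are kept.
# The raw string keeps the \uXXXX escapes intact for the JSON parser.
excerpt = r'''
[
    {"id": "1", "arg3": "\u043f\u0435\u0440.", "arg4": "\u0442\u0440\u043e\u0433\u0430\u0442\u044c"},
    {"id": "2", "arg3": "\u0441\u0443\u0449.", "arg4": "\u043c\u043e\u043b\u043e\u0442\u043e\u0433\u043b\u0430\u0432 (\u0436\u0438\u0432\u043e\u0442\u043d\u043e\u0435)"}
]
'''

rows = json.loads(excerpt)
for row in rows:
    # Each escape decodes to one Cyrillic letter.
    print(row["id"], row["arg3"], row["arg4"])
```

The escapes decode to proper Cyrillic, so the source database holds the right text and the damage has to happen further down the pipeline.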

Technically, I would say #2 is broken, because as far as I can tell, the SQL exporter is setting the output encoding properly and everything. However, that the infixes work, but the words don't, sounds fishy... which makes it more likely that 2 is broken again.

Seriously, I should just rewrite this stuff already in C#.

Offline Tuiq

Re: Translating Taronyu's dictionary
« Reply #132 on: January 19, 2017, 07:21:51 pm »
It was MySQL's fault. I'm not entirely sure why, but I'm not going to question it, nor am I going to go deeper into this issue.

All content should now be available and properly UTF-8 encoded. Russian was just the most obvious one; there are other entries in the .sql which weren't properly encoded. Those should be fixed now, too. The infixes worked because they're being loaded with another DB wrapper, which was already forcing MySQL to send it using UTF-8 or something... something I've had to do in the other wrapper myself now.
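
For consumers of the export who want to double-check a downloaded file, a rough sketch of a scan for leftover damage follows. The heuristic (a '?' squeezed between two letters, or a run of several '?') is my own assumption and will produce false positives; the INSERT statement layout and the German and Russian sample rows are hypothetical, only the Polish VALUES tuple is taken from this thread.

```python
import re
from typing import List, Tuple

# Heuristic: '?' between two word characters, or a run of two or more '?'.
SUSPICIOUS = re.compile(r"\w\?\w|\?{2,}")


def find_suspicious_lines(sql_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that look like lossy-encoded text."""
    hits = []
    for number, line in enumerate(sql_text.splitlines(), start=1):
        if SUSPICIOUS.search(line):
            hits.append((number, line))
    return hits


# Hypothetical sample in the spirit of NaviData.sql.
sample = "\n".join([
    "INSERT INTO localizedWords VALUES ('8','pl','M?otog?ów','rz.');",
    "INSERT INTO localizedWords VALUES ('1','de','berühren','vtr.');",
    "INSERT INTO localizedWords VALUES ('1','ru','???????','???.');",
])

hits = find_suspicious_lines(sample)
for number, line in hits:
    print(number, line)
```

On this sample the Polish and Russian rows are flagged and the German row passes.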

Offline Tirea Aean

Re: Translating Taronyu's dictionary
« Reply #133 on: January 20, 2017, 07:19:22 am »
Ma Tuiq, thank you for all your time investigating and fixing this!


Offline eejmensenikbenhet

  • Palulukan Makto
  • *****
  • *
  • Posts: 1019
  • nl Netherlands
  • Karma: 15
    • Ketuwong aNeyn
Re: Translating Taronyu's dictionary
« Reply #134 on: June 03, 2017, 03:38:57 am »
I'm having trouble saving the Dutch dict... I just translated the Mo'ara definition in 13.41 and entered the new changelog line before clicking "Create", and now it gives me a wall of red error text.

I've included the complete Log in the attachments.

EDIT: Also, it seems a lot of the variables have been changed... Complete translations have vanished? The intro text, daytime definitions and more...

Offline eejmensenikbenhet

Re: Translating Taronyu's dictionary
« Reply #135 on: June 03, 2017, 03:45:18 am »
(Excuse the double-post)
Upon further inspection it seems that it has broken off every translation containing an accented vowel. ì/ä/ë/ó
Those are used both in Na'vi and in Dutch but I never had any trouble with them.

EDIT: It seems that every time I try to compile the document it cuts off all translations containing an accented vowel... I luckily have saved the 13.332 version on my laptop so I can copy and edit the intro texts and such, but I'm hoping that this didn't affect any of the word translations.
« Last Edit: June 03, 2017, 04:16:53 am by eejmensenikbenhet »

Offline Tuiq

Re: Translating Taronyu's dictionary
« Reply #136 on: June 03, 2017, 04:15:23 am »
Hello!

Runaway argument?
{säpllhrr). Dank aan Elf! \item {\bf 13.301} - Tikfouten/formatterin\ETC.
! File ended while scanning use of \textbf .


=> 13.31 - Onnozele tikfouten van \textbf{pllhrr} en \textbf{säpllhrr). Dank aan Elf!

In CHANGELOG, you've used the wrong bracket ')' instead of '}'.

Fixed it. Dictionary compiles again.

Offline eejmensenikbenhet

Re: Translating Taronyu's dictionary
« Reply #137 on: June 03, 2017, 06:01:54 am »
Ah, so it does, thanks, that was fairly stupid of me...
Now, there's still the problem with the accented vowels: it cuts off a lot of text, which in turn leaves \textbf{ brackets open and turns the entire document bold.

EDIT: I'm working around it now using LaTeX accented vowels.
\`{i} and \'{o} seem to do the trick for now...
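
The workaround can be automated. Below is a minimal sketch, assuming only grave, acute and trema accents need handling and that the \`{i} style from this post is what the build accepts; the function and table names are made up for illustration.

```python
import unicodedata

# Combining marks mapped to the LaTeX accent commands used in this workaround.
ACCENT_COMMANDS = {
    "\u0300": "\\`",  # combining grave, as in ì
    "\u0301": "\\'",  # combining acute, as in ó
    "\u0308": '\\"',  # combining diaeresis (trema), as in ä, ë, ï
}


def to_latex_accents(text: str) -> str:
    """Replace accented letters with LaTeX accent commands, e.g. ì becomes \\`{i}."""
    out = []
    for ch in text:
        # NFD splits a precomposed letter into base letter + combining mark.
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in ACCENT_COMMANDS:
            base, mark = decomposed
            out.append(f"{ACCENT_COMMANDS[mark]}{{{base}}}")
        else:
            out.append(ch)
    return "".join(out)


print(to_latex_accents("s\u00e4pllhrr"))
```

Anything that is not a single letter with one of these three accents passes through untouched, so plain ASCII text and existing LaTeX commands are left alone.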
« Last Edit: June 03, 2017, 06:08:08 am by eejmensenikbenhet »

Offline Tuiq

Re: Translating Taronyu's dictionary
« Reply #138 on: June 03, 2017, 06:44:09 am »
It's not your fault; if anything, the system should catch those mistakes and not let you save them in the first place. Or, even better, not even allow you to make them - e.g. by not using LaTeX but Markdown or something.
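
As a rough idea of what "catching those mistakes" could look like, here is a minimal brace check that could run before an entry is saved. It is only a sketch and not something the system currently does: it ignores LaTeX comments and verbatim environments, and the function name is hypothetical.

```python
from typing import Optional


def find_brace_error(latex: str) -> Optional[str]:
    """Return a description of the first brace problem, or None if braces balance."""
    open_positions = []
    i = 0
    while i < len(latex):
        ch = latex[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes (\{, \}, \\)
            continue
        if ch == "{":
            open_positions.append(i)
        elif ch == "}":
            if not open_positions:
                return f"unmatched '}}' at offset {i}"
            open_positions.pop()
        i += 1
    if open_positions:
        return f"unclosed '{{' opened at offset {open_positions[-1]}"
    return None


# The changelog line from earlier in this thread, with ')' instead of '}'.
broken = r"Onnozele tikfouten van \textbf{pllhrr} en \textbf{säpllhrr). Dank aan Elf!"
fixed = r"Onnozele tikfouten van \textbf{pllhrr} en \textbf{säpllhrr}. Dank aan Elf!"

print(find_brace_error(broken))
print(find_brace_error(fixed))
```

A check like this would have rejected the changelog entry on save instead of letting the whole dictionary fail to compile.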

What's an accented vowel for you? ì? ò? ö? Technically, as long as it is UTF-8, LaTeX should eat it... but it was always a bit nitpicky.

Offline eejmensenikbenhet

Re: Translating Taronyu's dictionary
« Reply #139 on: June 03, 2017, 06:50:22 am »
Up until now I've had no problems using tremas (ä in Na'vi and ë, ï in Dutch) or accents (ì in Na'vi and ó in Dutch), but all of a sudden it stopped working. Oh well, it's fixed now.
Time for the next issue: it doesn't seem to accept guillemets in some cases («these infix markers»). In the regular intro text they're fine, but in the intro of the other dicts (NL-Na'vi, Categorised and Concise) it removes all of the text following them. I can replace them with \guillemotleft and \guillemotright, but that seems foolish considering they work fine in the original intro text.

 
