Translating Taronyu's dictionary

Started by Tuiq, July 29, 2010, 04:25:55 AM

Puvomun

Quote from: Tuiq on February 17, 2012, 01:14:38 AM
Which reminds me - I now have tools to check dictionaries for unmatched brackets or dollar signs. It's not a complete LaTeX parser and it cannot detect all errors, but it can find the most annoying ones.

I might make these tools public somehow (so you could see which word definitions are "wrong").

That would be splendid! Looking forward to those, as a typo is so easy to make and so hard to spot...
Krr a lì'fya lam sraw, may' frivìp utralit.

Ngopyu ayvurä.

Tuiq

Don't get too excited - all it does is iterate through all words and all of their fields, counting the $ and {} - and if they don't match, it gives you the word and the argument that is supposedly broken.
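
For anyone curious, a check of that kind could look roughly like the sketch below. It is written in Python purely for illustration (the actual tool is reportedly Perl and is not shown in this thread), and the function name `find_unbalanced`, the dictionary layout and the sample entries are all made up.

```python
def find_unbalanced(entries):
    """Return (word, field) pairs whose text has unmatched $ or {}."""
    problems = []
    for word, fields in entries.items():
        for name, text in fields.items():
            odd_dollars = text.count("$") % 2 != 0
            brace_mismatch = text.count("{") != text.count("}")
            if odd_dollars or brace_mismatch:
                problems.append((word, name))
    return problems


# Hypothetical sample data; the real database layout is an assumption.
sample = {
    "kaltxì": {"definition": "{\\bf hello}", "note": "used as a greeting"},
    "taronyu": {"definition": "{\\bf hunter", "note": "costs $5"},
}

print(find_unbalanced(sample))
# [('taronyu', 'definition'), ('taronyu', 'note')]
```

Like the tool described, this only counts characters: it would also flag an escaped \$ or \{, and it cannot tell whether the braces are in the right order.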

Originally I planned - a long time ago - to abandon LaTeX in favour of BBCode, which means that instead of {\bf foobar} you would simply have [b]foobar[/b]. Before the BBCode is translated, all invalid LaTeX would be escaped appropriately so things like these couldn't happen at all.
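
A rough sketch of that idea, in Python and only for illustration: escape everything LaTeX treats specially first, then translate the one BBCode tag mentioned here. The names `escape_latex` and `bbcode_to_latex` are hypothetical, the escape table is a subset, and none of this is the code that was actually planned.

```python
import re

# A subset of the characters LaTeX treats specially (assumed list, not exhaustive).
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "%": r"\%",
    "#": r"\#",
    "_": r"\_",
}


def escape_latex(text):
    """Escape special characters one at a time, so nothing is escaped twice."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def bbcode_to_latex(text):
    """Escape the raw text, then turn [b]...[/b] into {\\bf ...}."""
    escaped = escape_latex(text)
    return re.sub(
        r"\[b\](.*?)\[/b\]",
        lambda m: "{\\bf " + m.group(1) + "}",
        escaped,
    )


print(bbcode_to_latex("[b]foobar[/b]"))  # {\bf foobar}
print(bbcode_to_latex("50% of $x"))      # 50\% of \$x
```

Because the escaping runs before the translation, the only braces and backslashes that reach LaTeX unescaped are the ones the translator itself emits.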

However, it would make things more complicated and is just too much effort for too little gain.
Eana Eltu: PDF/TSV/jMemorize

Puvomun

Quote from: Tuiq on February 17, 2012, 01:36:33 AM
Don't get too excited - all it does is iterate through all words and all of their fields, counting the $ and {} - and if they don't match, it gives you the word and the argument that is supposedly broken.

Originally I planned - a long time ago - to abandon LaTeX in favour of BBCode, which means that instead of {\bf foobar} you would simply have [b]foobar[/b]. Before the BBCode is translated, all invalid LaTeX would be escaped appropriately so things like these couldn't happen at all.

However, it would make things more complicated and is just too much effort for too little gain.

Not sure who, but someone mentioned that you had something like that, written in Perl I think? When I had a problem with the Dutch translation I made a similar script (great minds think alike). The problem, however, turned out to be a forgotten space somewhere that created a bad LaTeX command  :-[

But anything that helps, helps.
Krr a lì'fya lam sraw, may' frivìp utralit.

Ngopyu ayvurä.

Tirea Aean

Could it also check whether \( or some other LaTeX \<something> occurs where it should not? \( was the long-undetected culprit that kept the nl dictionary from compiling for months.

Tuiq

I don't know anything about LaTeX, so I won't implement checks that I don't know about.

It's simple Perl though: write me a function that takes one argument (a string) and returns 0 if everything is okay and 1 if there was an error.
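
A checker with that interface (one string in, 0 or 1 out) might look roughly like the following. This is a Python sketch rather than Perl, and both the function name `check_field` and the allow-list of commands are assumptions; the commands the dictionaries really use are not listed in this thread.

```python
import re

# Hypothetical allow-list of LaTeX commands a definition may contain.
ALLOWED_COMMANDS = {"bf", "it", "textbf", "textit"}


def check_field(text):
    """Return 0 if the field looks fine, 1 if it has a suspicious command."""
    # After each backslash, capture a run of letters, or else the single
    # character that follows (empty if the backslash ends the string).
    for match in re.finditer(r"\\([A-Za-z]+|.?)", text, flags=re.DOTALL):
        if match.group(1) not in ALLOWED_COMMANDS:
            return 1
    return 0


print(check_field("{\\bf foobar}"))  # 0
print(check_field("see \\( here"))   # 1
print(check_field("{\\bffoobar}"))   # 1
```

The last call shows the forgotten-space case: without the space, \bffoobar is read as one unknown command and gets flagged.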
Eana Eltu: PDF/TSV/jMemorize

Tuiq

It was pointed out to me today that the Hungarian output is broken. All of it, that is: the daemon itself is not capable of processing certain characters, though they are fine in the database. Pretty much all output is affected, but as far as I can tell, the problem is limited to certain Hungarian characters (at least "ő").

They are displayed as question marks instead. I'll take this as an occasion to rewrite the way every file is produced.
Eana Eltu: PDF/TSV/jMemorize

Puvomun

Quote from: Tuiq on February 21, 2012, 01:44:54 PM
It was pointed out to me today that the Hungarian output is broken. All of it, that is: the daemon itself is not capable of processing certain characters, though they are fine in the database. Pretty much all output is affected, but as far as I can tell, the problem is limited to certain Hungarian characters (at least "ő").

They are displayed as question marks instead. I'll take this as an occasion to rewrite the way every file is produced.

Good luck.
Krr a lì'fya lam sraw, may' frivìp utralit.

Ngopyu ayvurä.

P.A.'li makto

Quote from: Tuiq on February 21, 2012, 01:44:54 PM
It was pointed out to me today that the Hungarian output is broken. All of it, that is: the daemon itself is not capable of processing certain characters, though they are fine in the database. Pretty much all output is affected, but as far as I can tell, the problem is limited to certain Hungarian characters (at least "ő").

They are displayed as question marks instead. I'll take this as an occasion to rewrite the way every file is produced.
Can it be fixed? I'm worried about it, because I'm helping Tukan with his project and don't know what will happen to the Hungarian dictionary now...  :(

facebook: soaia leNa`vi

Puvomun

Quote from: P.A.'li makto on February 21, 2012, 01:53:39 PM
Quote from: Tuiq on February 21, 2012, 01:44:54 PM
It was pointed out to me today that the Hungarian output is broken. All of it, that is: the daemon itself is not capable of processing certain characters, though they are fine in the database. Pretty much all output is affected, but as far as I can tell, the problem is limited to certain Hungarian characters (at least "ő").

They are displayed as question marks instead. I'll take this as an occasion to rewrite the way every file is produced.
Can it be fixed? I'm worried about it, because I'm helping Tukan with his project and don't know what will happen to the Hungarian dictionary now...  :(

The re-write Tuiq mentions means that he will change the way that the dictionaries are created, to fix the problem.
Krr a lì'fya lam sraw, may' frivìp utralit.

Ngopyu ayvurä.

Tuiq

It could be fixed without a complete overhaul.

I added the code to "fix" it and I'm currently re-generating all content - let's see what happens. Worst case, we end up with double-encoded UTF-8.
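
To illustrate the two failure modes in play here (question marks in place of "ő", and double-encoded UTF-8), here is a small Python sketch. It assumes the broken step went through Latin-1, which this thread does not confirm; the helper name `undo_double_encoding` is made up.

```python
text = "ő"  # U+0151: representable in UTF-8, but absent from Latin-1

# Lossy encode: Latin-1 has no "ő", so the encoder substitutes "?".
lossy = text.encode("latin-1", errors="replace")

# "é" does exist in Latin-1, so it survives the same step unharmed.
fine = "é".encode("latin-1", errors="replace")

# Double encoding: correct UTF-8 bytes are misread as Latin-1 text
# and then encoded to UTF-8 a second time.
utf8_bytes = text.encode("utf-8")
double = utf8_bytes.decode("latin-1").encode("utf-8")


def undo_double_encoding(data):
    """Reverse one round of UTF-8-read-as-Latin-1 double encoding."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


print(lossy)       # b'?'
print(fine)        # b'\xe9'
print(utf8_bytes)  # b'\xc5\x91'
print(double)      # b'\xc3\x85\xc2\x91'
print(undo_double_encoding(double) == text)  # True
```

Under that assumption, it would also explain why only some Hungarian characters break: letters such as "é" or "ö" have a Latin-1 code point, while "ő" does not.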
Eana Eltu: PDF/TSV/jMemorize

Tuiq

Well, that was easy.

Fixed. All output seems to be stable UTF8 now. That way, I don't have to rewrite the code... yet.
Eana Eltu: PDF/TSV/jMemorize

P.A.'li makto

Quote from: Tuiq on February 21, 2012, 02:48:00 PM
Well, that was easy.

Fixed. All output seems to be stable UTF8 now. That way, I don't have to rewrite the code... yet.
Oh, thanks a lot!  :)

facebook: soaia leNa`vi

Puvomun

Quote from: Tuiq on February 21, 2012, 02:48:00 PM
Well, that was easy.

Fixed. All output seems to be stable UTF8 now. That way, I don't have to rewrite the code... yet.

Great job. I'm glad it was an easy one!
Krr a lì'fya lam sraw, may' frivìp utralit.

Ngopyu ayvurä.

Tuiq

Advance notice: As of 30.04.2012, my domain will cease to exist. Whether I get a new one, and which one, is yet to be decided; the only sure thing is that I'm getting rid of the current one. This will affect all translation tool links and break every link to the current dictionaries that points to mwf-data.clonk2c.ch instead of eanaeltu.learnnavi.org.

I think that three months should be enough to change links when necessary.

Edit: If anybody wants to take over the project, I think now would be a pretty fitting time for it. Taking over the project would require you to be able to set up the current development tools, or to simply code your own. In any case, serious people who can show me they are able to continue this project would be given the complete current code and a complete database dump including everything I have.
Eana Eltu: PDF/TSV/jMemorize

Tuiq

With only a bit more than three weeks to go, I'd like to stress again that all EE services will shut down on April 30, including the translation service. Remaining dictionaries will stay as they are.

As of now, there are no plans to replace the domain.
Eana Eltu: PDF/TSV/jMemorize

`Eylan Ayfalulukanä

Quote from: Tuiq on April 05, 2012, 02:51:36 PM
With only a bit more than three weeks to go, I'd like to stress again that all EE services will shut down on April 30, including the translation service. Remaining dictionaries will stay as they are.

As of now, there are no plans to replace the domain.

I take it, ma/zhey Tuiq, that this will also affect the Dothraki dictionary server, as that is currently on mwf-data.clonk2c.ch as well. As the current principal dictionary editor there, what do I need to do?

Yawey ngahu!
pamrel si ro [email protected]

Tuiq

Taronyu set up an upload to learnnavi.org ages ago if I'm not mistaken - the Dothraki dictionaries should be lying around somewhere there too.
Eana Eltu: PDF/TSV/jMemorize

`Eylan Ayfalulukanä

Quote from: Tuiq on April 05, 2012, 09:36:03 PM
Taronyu set up an upload to learnnavi.org ages ago if I'm not mistaken - the Dothraki dictionaries should be lying around somewhere there too.

I will check with Taronyu (now Skxawng Makto) to see if he has done anything, as the tools I use still point to clonk2c.ch.

Thanks for your work in getting this started. If I were more of a programmer, I would consider taking up the maintenance of these tools.

Yawey ngahu!
pamrel si ro [email protected]

Puvomun

I wish I had time to dive into this. Alas, I have not... :(
Krr a lì'fya lam sraw, may' frivìp utralit.

Ngopyu ayvurä.

Tuiq

The tools ARE still on clonk2c.ch; the produced PDFs, however, are on learnnavi.org - for Dothraki too.
Eana Eltu: PDF/TSV/jMemorize