New jMemorize lists

Started by Mirri, January 18, 2010, 11:06:00 AM

Tuiq

All eana files are of course UTF-8. This might have been the problem.
Eana Eltu: PDF/TSV/jMemorize

Mirri

Quote from: tirea kaya'evengä on April 29, 2010, 03:45:07 PM
When I try to open the list with jMemorize, I get a message saying: an error occured while loading file. And then:
Ver 1.3.0  (200803122134) - Java 1.6.0_20 , OS Windows Vista
java.io.IOException: Content is not allowed in prolog.
   at jmemorize.core.Main.loadLesson(Unknown Source)
   at jmemorize.gui.swing.frames.MainFrame.loadLesson(Unknown Source)
   at jmemorize.core.Main.run(Unknown Source)
   at jmemorize.core.Main.main(Unknown Source)


The memorize lists have been abandoned. We're working on generating them automatically from Taronyu's dictionary at the moment.
Stay tuned.
Ngaya poanìl new mune 'uti: hrrap sì uvan. Talun poanìl new ayfoeti -- ayfo lu lehrrap ayu leuvan.

hawnuyuna'viyä

Quote from: Mirri on May 04, 2010, 05:00:44 AM
Quote from: tirea kaya'evengä on April 29, 2010, 03:45:07 PM
When I try to open the list with jMemorize, I get a message saying: an error occured while loading file. And then:
Ver 1.3.0  (200803122134) - Java 1.6.0_20 , OS Windows Vista
java.io.IOException: Content is not allowed in prolog.
   at jmemorize.core.Main.loadLesson(Unknown Source)
   at jmemorize.gui.swing.frames.MainFrame.loadLesson(Unknown Source)
   at jmemorize.core.Main.run(Unknown Source)
   at jmemorize.core.Main.main(Unknown Source)

This is a problem with the XML lesson file, not the word list. Would you mind posting or PMing your lesson file?
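
For anyone who wants to check their own file first: this parser message is commonly reported when something other than XML comes before the first `<`, for example a UTF-8 byte order mark or a plain word list opened as if it were a lesson. A minimal sketch, assuming that is what happened here (the function name is made up for illustration and this is not part of jMemorize):

```python
import codecs


def describe_prolog(data: bytes) -> str:
    """Classify the first bytes of a file that is supposed to be XML."""
    # A UTF-8 byte order mark in front of the XML declaration.
    if data.startswith(codecs.BOM_UTF8):
        return "bom"
    # Anything else that is not markup, e.g. a CSV/TSV word list.
    if not data.lstrip().startswith(b"<"):
        return "not-xml"
    return "ok"


if __name__ == "__main__":
    import sys

    # Usage: python check_prolog.py path/to/lesson-file
    with open(sys.argv[1], "rb") as handle:
        print(describe_prolog(handle.read(64)))
```

If this prints "not-xml", the file is probably a word list that needs to be imported rather than opened as a lesson.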

Quote from: Mirri on May 04, 2010, 05:00:44 AM
The memorize lists have been abandoned. We're working on generating them automatically from Taronyu's dictionary at the moment.
Stay tuned.
This sounds like a great solution to the problem that all the projects here have had: out-of-date word lists. I started looking into how much effort it would take to parse the Vocab wiki page, but using the dictionary source is much cleaner and easier.
Can you give an ETA? :)

Tuiq

I think I'm the one better placed to give one, because I'm the ~only developer doing things with the dictionary. I guess.

As I stated before, there is already a jMemorize file. Since I have no interest in jMemorize myself, and nobody has been able to explain to me /correctly/ what jMemorize files look like (or better: what jMemorize is capable of, escaping for example), I don't know if or how to continue with that.

Mirri

Quote from: Tuiq on May 05, 2010, 01:19:22 PM
I think I'm the one better placed to give one, because I'm the ~only developer doing things with the dictionary. I guess.

As I stated before, there is already a jMemorize file. Since I have no interest in jMemorize myself, and nobody has been able to explain to me /correctly/ what jMemorize files look like (or better: what jMemorize is capable of, escaping for example), I don't know if or how to continue with that.

I've given you examples and instructions on how a jMemorize file works by PM. I thought you hadn't gotten around to looking at them yet. You have not replied or indicated to me that there was a problem with my instructions.
Please let me know what the problem is so I can help you.

Tuiq

Well, no, the instructions were just. You know. Not really helpful :|

The main problem is, in my opinion, that ';' (or ','? I think both would work) is common in the file. Escaping is usually done by putting '\' in front of the character (Foo,bar => Foo\,bar). Since I can't get jMemorize running here, somebody else would have to test whether it works that way.

A sample file could look something like this:

Front,Back
Foo,Bar
Foo and\, important\, bar,Foobar
Foobar is Foobar,Foobar
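
A file in that shape could be produced along these lines. This is only a sketch in Python, chosen for illustration; the function name is made up, and whether jMemorize actually honours backslash escaping is exactly the part that is still untested.

```python
import csv
import io


def to_backslash_csv(rows):
    """Write rows as CSV, escaping commas with a backslash instead of quoting."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        quoting=csv.QUOTE_NONE,  # never wrap fields in quotes
        escapechar="\\",         # put a backslash in front of special characters
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    ["Front", "Back"],
    ["Foo", "Bar"],
    ["Foo and, important, bar", "Foobar"],
]
print(to_backslash_csv(rows), end="")
```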


... I can't remember now, was there anything other than the comma thing? Oh, categories. That will be evil since all dictionaries are. No, wait, it won't. What am I talking about.

Mirri

Quote from: Tuiq on May 05, 2010, 01:40:24 PM
Well, no, the instructions were just. You know. Not really helpful :|

The main problem is, in my opinion, that ';' (or ','? I think both would work) is common in the file. Escaping is usually done by putting '\' in front of the character (Foo,bar => Foo\,bar). Since I can't get jMemorize running here, somebody else would have to test whether it works that way.

A sample file could look something like this:

Front,Back
Foo,Bar
Foo and\, important\, bar,Foobar
Foobar is Foobar,Foobar


... I can't remember now, was there anything other than the comma thing? Oh, categories. That will be evil since all dictionaries are. No, wait, it won't. What am I talking about.

Maybe you should check my instructions again, because I've explained this to you already. Every field is enclosed in double quotes "" and anything between the quotes is taken literally. A field can contain commas only when it sits between quotes.

Tuiq

Yes, and then we've got the problem of quotes-in-quotes and entries beginning or ending with quotes.

Mirri

Quote from: Tuiq on May 06, 2010, 12:15:05 PM
Yes, and then we've got the problem of quotes-in-quotes and entries beginning or ending with quotes.

I've checked this out and unfortunately there is no way to use quotes inside quotes with this program. What I suggest is to search and replace all the double quotes in the Na'vi dictionary with single quotes ` just before creating the word list. This should get around the quote problem.
It may not be a pretty or elegant solution, but it's functional and people won't have to wait for the word lists anymore :)
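
The search and replace could be sketched like this, in Python purely for illustration (the thread doesn't say what the dictionary generator is written in, and the function name is made up). Note that the step is lossy: the original double quotes can't be recovered from the output.

```python
def replace_double_quotes(field, substitute="`"):
    """Swap every double quote in a field for a substitute character."""
    return field.replace('"', substitute)


print(replace_double_quotes('n. "light" (as in glow)'))
```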

Tuiq

It's not just not pretty. It's illegal. You can't do that for all languages. In German, '' and "" have completely different meanings. This may sound evil, but I'm willing to wait until hell freezes over. I'm not going to change anything just because some coders are unable to write a program that is capable of the simplest tasks possible.

I don't understand the program anyway. The documentation is really sparse, or better: for creating documents, it does not even exist. I've seen at least three completely different versions of databases which all seem to work with jMemorize.

Mirri

Quote from: Tuiq on May 12, 2010, 09:57:27 AM
It's not just not pretty. It's illegal. You can't do that for all languages. In German, '' and "" have completely different meanings. This may sound evil, but I'm willing to wait until hell freezes over. I'm not going to change anything just because some coders are unable to write a program that is capable of the simplest tasks possible.

Well, since this isn't German, but Na'vi and English, there shouldn't be a problem.
How about just doing the search and replace for now? We can ask the jMemorize people to update their program to accept quotes, and once they get that sorted we can simply remove the search and replace function.

Tuiq

Yes, but I do not really want to code exceptions again. The ones for the Na'vi<->Na'vi dictionary are enough already. I'm pretty sure there has to be some kind of escaping.

If not, who said that jMemorize files have to be separated by ','? Can't it be a tab? Or something else? For example, quotes aren't necessary either.

Mirri

Quote from: Tuiq on May 12, 2010, 01:43:18 PM
If not, who said that jMemorize files have to be separated by ','? Can't it be a tab? Or something else? For example, quotes aren't necessary either.

It can import either comma-separated or tab-separated files. The CSV format is this:
"'ampi", "v. touch", "Interactions", "0"


You can use tabs instead of commas if you want to, but it doesn't really change anything.
You can also use CSV without the quotes, but only if the entries themselves contain no commas:
'ampi, v. touch, Interactions, 0

If you want to use commas in the entry, you have to put quotes around it:
atan, n. light, "Animals, Flora, World", 0

The problem is that the program stops reading the entry when it sees the second quote. So if I have a file with this:
"hey"this is in quotes"", v. touch, Interactions, 0

then the first entry will be:
hey
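
To make that behaviour concrete, here is a toy model of it in Python. It only reproduces what is described above for the first entry of a line; it is not jMemorize's actual code, and the function name is made up.

```python
def read_first_field(line, delimiter=","):
    """Toy model of the observed import behaviour for the first field."""
    if line.startswith('"'):
        # A field that starts with a quote ends at the very next quote,
        # no matter what follows it.
        end = line.find('"', 1)
        return line[1:end] if end != -1 else line[1:]
    # An unquoted field simply runs up to the first delimiter.
    return line.split(delimiter, 1)[0]


print(read_first_field('"hey"this is in quotes"", v. touch, Interactions, 0'))
```

Under this model a quoted field can hold commas but never another quote, which matches the examples above.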



Tuiq

So. Where exactly is the problem using TSV files..?

Mirri

Quote from: Tuiq on May 12, 2010, 02:11:32 PM
So. Where exactly is the problem using TSV files..?

They handle quotes exactly the same way as CSV files, so it doesn't make any difference.

Tuiq

Grah. Do even entries that do not begin with " make the program stop reading when a quote comes up?

Mirri

Quote from: Tuiq on May 12, 2010, 03:38:27 PM
Grah. Do even entries that do not begin with " make the program stop reading when a quote comes up?

Aha! You've solved it!
Using TSV, I've managed to enter the following line:
'ampi _"special"_ things, that, "lo,ok," , "good"   (tab)v.,touch   (tab)Interactions   (tab)0

The first entry shows up with all three sets of quotes in it and every comma :)
So we should be able to use all the quotes and commas we want as long as the entry doesn't start with a quote.

I tried to make a TSV file myself in a text editor, but jMemorize kept reporting errors when I opened it, so there must be something invisible in the file that I'm missing.
I took the file from the website and changed it around. It's attached to this post and it works, so you can base the dictionary output on that.
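
For generating such a file from a program instead of by hand, a rough sketch in Python follows. The four columns are taken from the example line above; the function names are made up, and writing UTF-8 without a byte order mark is an assumption on my part. Nothing here establishes what the invisible problem in the hand-made file actually was.

```python
def to_tsv_line(fields):
    """Join one entry's fields with tabs, refusing fields that would break the row."""
    for field in fields:
        if "\t" in field or "\n" in field:
            raise ValueError("field contains a tab or newline: %r" % field)
    return "\t".join(fields)


def write_tsv(path, rows):
    """Write all rows to path, one entry per line."""
    # Python's "utf-8" codec does not emit a byte order mark.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        for row in rows:
            handle.write(to_tsv_line(row) + "\n")


print(to_tsv_line(["'ampi", "v. touch", "Interactions", "0"]))
```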

Tuiq

Try what happens if there is whitespace (either the normal or a reserved one) between the tab and the quote.

[TAB][WHITESPACE]"Here text that begins with a quote but does not end with one.

Mirri

Quote from: Tuiq on May 13, 2010, 08:04:13 AM
Try what happens if there is whitespace (either the normal or a reserved one) between the tab and the quote.

[TAB][WHITESPACE]"Here text that begins with a quote but does not end with one.

Doesn't look like it's fooled by that. Same bad result as previously, the entry cuts off at the second quote it sees.

Tuiq

Luckily, as far as I can see, no entry starts with a quote. So we can consider this problem "solved". The next thing to do is to add categories, which should be translatable somehow. I think I might add some fields to edit_dict_meta.
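
The "no entry starts with a quote" assumption could be checked automatically before each export. A small sketch in Python, for illustration only: the function name is made up, and leading whitespace is skipped because the tests in this thread showed that whitespace in front of the quote does not help.

```python
def entries_starting_with_quote(rows):
    """Return (row, column, field) for every field that begins with a double quote."""
    offenders = []
    for row_number, row in enumerate(rows, start=1):
        for column, field in enumerate(row, start=1):
            # str.lstrip() also removes non-breaking spaces.
            if field.lstrip().startswith('"'):
                offenders.append((row_number, column, field))
    return offenders


rows = [
    ["'ampi", "v. touch"],
    [' "quoted" start', "x"],
    ['mid "quote" ok', "y"],
]
print(entries_starting_with_quote(rows))
```

An empty result means the export is safe under the behaviour observed so far.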