VrrtepCLI

Tirea Aean · June 06, 2011, 07:09:06 PM

Quote from: Kä'eng on June 06, 2011, 03:45:51 PM
Quote from: Swoka Ikran on June 06, 2011, 11:57:00 AM
Quote from: Tirea Aean on June 06, 2011, 09:13:09 AM
I've also encountered an anagram while playing scramble: lor and rol. (I put lor; the answer was rol.) I'm at a loss as to how, or if, I can fix that.
Can't be fixed easily. It's doable, but in my experience, the code to do this can make programs extremely slow.
It doesn't have to be slow. Here's a possibility: at the beginning, create a set of all words in addition to the list (wordset = set(wordlist)). Instead of checking that the user entered the chosen word, check that the user entered any Na'vi word and that it's an anagram of the chosen word (a in wordset and sorted(a) == sorted(word)). So it might pick rol, show olr, but both lor and rol would be accepted.

~~but theoretically, any other scrambled version of that could be a "valid anagram" how does it know that olr and rlo are not words but rol and lor are, when orl is displayed?~~

EDIT: just kidding. I get it. I see what you did there. Yeah, that is a possibility. I'll see if I can work on that.
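Kä'eng's suggestion can be sketched roughly as follows. This is a minimal illustration, not the actual scramble code: the function name make_answer_checker and the sample word list are made up for the example, and only the set lookup plus the sorted() comparison come from the post above.

```python
def make_answer_checker(wordlist):
    """Build a checker that accepts any real word that is an anagram of the chosen word.

    `wordlist` stands in for the program's Na'vi word list (hypothetical here).
    """
    wordset = set(wordlist)  # built once, so each membership test is cheap

    def is_correct(answer, chosen):
        # Accept the answer only if it is a known word AND uses exactly
        # the same letters as the word that was scrambled.
        return answer in wordset and sorted(answer) == sorted(chosen)

    return is_correct


# Usage with a tiny made-up word list:
check = make_answer_checker(["lor", "rol", "kame"])
print(check("lor", "rol"))  # a different valid word with the same letters
print(check("olr", "rol"))  # same letters, but not a word
```

Because the set is built once at startup, each answer costs one hash lookup and two short sorts, so this should not slow the game down noticeably.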

Blue Elf · June 21, 2011, 08:33:29 AM

Hello guys, I was using vrrtepCli for some time without problems, but today I ran into some.
I'm using version 1.9 (not sure if a newer one is available), playing the quiz game. Usually I do up to 100 questions with no problem. Today the program crashed twice (somewhere around the 50th round). Attaching console output:

Word: Traceback (most recent call last):
  File "vrrtepcli.py", line 321, in <module>
  File "vrrtepcli.py", line 302, in main
  File "quiz.pyc", line 86, in quiz
  File "encodings\cp437.pyc", line 12, in encode
UnicodeEncodeError: 'charmap' codec can't encode character u'\xcc' in position 1: character maps to <undefined>

And the second:

Word: nemrey

1 - sting
2 - Traceback (most recent call last):
  File "vrrtepcli.py", line 321, in <module>
  File "vrrtepcli.py", line 302, in main
  File "quiz.pyc", line 101, in quiz
  File "encodings\cp437.pyc", line 12, in encode
UnicodeEncodeError: 'charmap' codec can't encode character u'\x92' in position 22: character maps to <undefined>

Recently I tried to update the dictionaries; there was a problem getting some of them, but on the second attempt everything was OK.
Any ideas? I'm using Windows XP 32-bit.

Tirea Aean · June 21, 2011, 08:40:02 AM

That's just for the Windows version; it has to do with CP437 encodings, I think. !summon SwokaIkran

Also, the dictionaries will be updated tonight for this, as the main LN dict has been updated.

Swoka Ikran · June 21, 2011, 11:59:05 AM

Quote from: Blue Elf on June 21, 2011, 08:33:29 AM
Hello guys, I was using vrrtepCli for some time without problems, but today I ran into some.
I'm using version 1.9 (not sure if a newer one is available), playing the quiz game. Usually I do up to 100 questions with no problem. Today the program crashed twice (somewhere around the 50th round). Attaching console output:

Word: Traceback (most recent call last):
  File "vrrtepcli.py", line 321, in <module>
  File "vrrtepcli.py", line 302, in main
  File "quiz.pyc", line 86, in quiz
  File "encodings\cp437.pyc", line 12, in encode
UnicodeEncodeError: 'charmap' codec can't encode character u'\xcc' in position 1: character maps to <undefined>

And the second:

Word: nemrey

1 - sting
2 - Traceback (most recent call last):
  File "vrrtepcli.py", line 321, in <module>
  File "vrrtepcli.py", line 302, in main
  File "quiz.pyc", line 101, in quiz
  File "encodings\cp437.pyc", line 12, in encode
UnicodeEncodeError: 'charmap' codec can't encode character u'\x92' in position 22: character maps to <undefined>

Recently I tried to update the dictionaries; there was a problem getting some of them, but on the second attempt everything was OK.
Any ideas? I'm using Windows XP 32-bit.

The code itself hasn't changed in almost a month.

Since the problem occurred after a dictionary update, those errors are likely caused by malformed dictionaries. There's no dictionary validation in-program, so anything wrong with it will break the program.

Somewhere in the dict, there are characters (the \x92 and \xcc) that are not valid as UTF-8. Either they're orphaned (missing their second part), are part of characters that aren't valid Unicode, or are part of characters not printable in a standard CP437 console window (which we shouldn't have in this app).

For the time being, delete your dictionaries and get the old ones from the release package.

@TA: Since you plan to update them anyway: check the dictionary and remove unicode that's not part of ì ä é characters.

Blue Elf · June 21, 2011, 02:32:39 PM

Yes, I also think the problem could be in the update. In the end I did another update, all files were downloaded correctly, and I was able to do my regular hundred words without problems....
Maybe the download could be done as a "transaction": get the newer dictionaries into a temp folder and, if all files succeed, move them to the correct location. Right now the data are deleted first, then downloaded. And if an error occurs....

Swoka Ikran · June 21, 2011, 03:43:04 PM

Quote from: Blue Elf on June 21, 2011, 02:32:39 PM
Maybe the download could be done as a "transaction": get the newer dictionaries into a temp folder and, if all files succeed, move them to the correct location. Right now the data are deleted first, then downloaded. And if an error occurs....

That can be done.

If WGET succeeds on all files, move into place.

The problem is that this still won't catch files that download fine, but were corrupted during transfer. I'd need to add some form of integrity checking for that.
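As a rough sketch of the "transaction" idea plus an integrity check (an illustration under stated assumptions, not the real updater): it uses Python's urllib in place of WGET, and the names update_dictionaries, fetch_url, and the checksums mapping are hypothetical. It also assumes that known-good SHA-256 hashes are published somewhere, which the thread does not say.

```python
import hashlib
import os
import shutil
import tempfile
from urllib.request import urlopen


def fetch_url(url):
    """Download one file and return its bytes (stands in for the WGET call)."""
    with urlopen(url) as response:
        return response.read()


def update_dictionaries(base_url, filenames, dest_dir, checksums=None, download=fetch_url):
    """Download every file into a staging folder, then move them into place.

    If any download or checksum check fails, the existing files in
    dest_dir are left untouched. `checksums` is a hypothetical mapping of
    file name -> expected SHA-256 hex digest; files without an entry are
    not verified.
    """
    checksums = checksums or {}
    os.makedirs(dest_dir, exist_ok=True)
    # Stage inside dest_dir so the final move stays on the same drive.
    staging = tempfile.mkdtemp(prefix=".update-", dir=dest_dir)
    try:
        for name in filenames:
            data = download(base_url + name)
            expected = checksums.get(name)
            if expected is not None and hashlib.sha256(data).hexdigest() != expected:
                raise ValueError("checksum mismatch for " + name)
            with open(os.path.join(staging, name), "wb") as out:
                out.write(data)
        # Only reached when every file downloaded and verified.
        for name in filenames:
            os.replace(os.path.join(staging, name), os.path.join(dest_dir, name))
    finally:
        shutil.rmtree(staging, ignore_errors=True)
```

The old dictionaries are only overwritten after all files have arrived, so a failed or corrupted download leaves the previous set usable.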

Tirea Aean · June 21, 2011, 04:11:39 PM

But the thing is, NOTHING AT ALL has changed. I STILL have yet to change anything in any dictionary file. What you downloaded with the update function just re-downloaded what you already had before, i.e. I haven't changed the dicts for this program for about 2 or 3 Markì-updates back in the Dictionary Part II thread...

EDIT: As I update, I'll still check for those two partial chars

DOUBLE EDIT: just to be clear:

Python sees:

"ì" == "\xc3\xac"
"ä" == "\xc3\xa4"
"é" == "\xc3\xa9"

so really, the \x92 and \xcc things I have NO IDEA where they are coming from... /me investigates
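For anyone following along, the byte pairs listed above can be checked with a short sketch. It is written for Python 3 (the program in this thread runs on Python 2, where a plain "..." literal already holds bytes), and the helper name utf8_bytes is just for illustration.

```python
def utf8_bytes(text):
    """Return the UTF-8 encoding of `text` as a bytes object."""
    return text.encode("utf-8")


# The three accented letters from the post above, with their UTF-8 bytes.
EXPECTED = {
    "ì": b"\xc3\xac",
    "ä": b"\xc3\xa4",
    "é": b"\xc3\xa9",
}

for letter, encoded in EXPECTED.items():
    assert utf8_bytes(letter) == encoded

# Each letter is a single code point but two bytes once encoded, so a
# value such as \xcc can mean different things depending on whether it
# names a byte or a code point.
assert len("ì") == 1
assert len(utf8_bytes("ì")) == 2
```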

Tirea Aean · June 21, 2011, 04:22:14 PM

after some curious python interpreter play:

EDIT: after further messing, these two "mystery unicode snippets" have not been adding anything here. Z is just \x5a, j is just \x6a, and z is just \x7a; the \xcc did not add anything. :\

DOUBLE EDIT: However, we ARE onto something here with

>>> print "\xc3\x92"
Ò

Swoka Ikran · June 21, 2011, 04:52:06 PM

I wonder if his downloads might have gotten corrupted in transfer the first time around...

Kä'eng · June 21, 2011, 05:02:09 PM

u'\xcc' is Ì, which is in the word 'Ìnglìsì in naviWords.txt. u'\x92' is ’ (right single quote), which is present in two lines of eng.txt:
"be busy (negative sense): be tired out and overwhelmed by an activity that’s keeping one busy"
"in a fashion as if one’s life were at stake"
Since these characters are not in CP437, you get an error trying to print them to the console.
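A small Python 3 sketch of what is going on, and one possible way to keep the console from crashing. The helper names can_encode and console_safe are invented for this example and are not part of vrrtepcli. The byte 0x92 is the right single quote in Windows-1252, which would explain why it shows up as u'\x92' in the traceback.

```python
def can_encode(text, encoding="cp437"):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def console_safe(text, encoding="cp437"):
    """Swap characters the console code page lacks for '?' instead of crashing."""
    return text.encode(encoding, errors="replace").decode(encoding)


# Lowercase ì, ä and é exist in CP437; uppercase Ì and the curly quote do not.
assert can_encode("ìäé")
assert not can_encode("Ì")
assert not can_encode("\u2019")  # right single quote
assert not can_encode("\x92")    # the raw value from the traceback

# Printing through console_safe degrades the text rather than raising.
assert console_safe("one\u2019s life") == "one?s life"
```

Cleaning the dictionary files fixes the data at the source; a fallback like console_safe would only guard against the next unexpected character.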

Tirea Aean · June 21, 2011, 05:12:47 PM

Quote from: Swoka Ikran on June 21, 2011, 04:52:06 PM
I wonder if his downloads might have gotten corrupted in transfer the first time around...

This is possible... Which reminds me... I'll post here after I update all the dictionary files.

Swoka Ikran · June 21, 2011, 05:27:34 PM

Quote from: Kä'eng on June 21, 2011, 05:02:09 PM
u'\xcc' is Ì, which is in the word 'Ìnglìsì in naviWords.txt. u'\x92' is ’ (right single quote), which is present in two lines of eng.txt:
"be busy (negative sense): be tired out and overwhelmed by an activity that’s keeping one busy"
"in a fashion as if one’s life were at stake"
Since these characters are not in CP437, you get an error trying to print them to the console.

That explains the errors

I guess TA should replace the ’ with a plain ', and change the Ì to a lowercase ì.

Also...I rewrote the updater and am debugging it now. It asks before installing updates and offers to try again if any of the 11 files fail during download.

One thing I discovered in the process: you never published an est.txt file on tirea.skxawng.lu

Tirea Aean · June 21, 2011, 06:01:35 PM

Noted. I will make all these changes tonight when I have time.

Swoka Ikran · June 21, 2011, 06:06:59 PM

At the est.txt issue: Turns out the file is there after all. When you made the original linux updater, you wrote /soruce/ instead of /source/ in the WGET for est.txt. Since my Windows version was a port of the linux version, I didn't catch the error until now.

I've fixed it in the Windows version and will commit later.

Tirea Aean · June 21, 2011, 06:11:44 PM

Quote from: Swoka Ikran on June 21, 2011, 06:06:59 PM
At the est.txt issue: Turns out the file is there after all. When you made the original linux updater, you wrote /soruce/ instead of /source/ in the WGET for est.txt. Since my Windows version was a port of the linux version, I didn't catch the error until now.

I've fixed it in the Windows version and will commit later.

That means I will also need to do that for the linux version and commit as well. good catch.

EDIT: found "soruce" fail in updater script and fixed. will commit after making notes on the LN dict thread and making the changes.

Swoka Ikran · June 22, 2011, 12:30:13 AM

Not sure where your commit is, so I committed my changes. r7 (Windows): Updater updated.

Also, just to let you know, I may not be around Wed. or Thurs, because I'm not sure if I'll be able to find wifi at my grandparents. If I can, I'll make an attempt to read this.

Tirea Aean · June 22, 2011, 12:51:37 AM

the commit will most likely be there tomorrow. or later today. depends when you are reading this and where you are in the world.

Swoka Ikran · June 22, 2011, 01:23:42 AM

Quote from: Tirea Aean on June 22, 2011, 12:51:37 AM
the commit will most likely be there tomorrow. or later today. depends when you are reading this and where you are in the world.

OK. I'll probably have to get it on Friday.

(Oh, and it's later today for me.)

Blue Elf · June 22, 2011, 04:05:44 AM

Some strange behavior:

C:\WINDOWS>vrrtepcli tìlam

C:\WINDOWS>echo off
Vrrtep CLI v1.9 by Tirea Aean. run 'vrrtepcli -h' for usage.
Windows version by Swoka Ikran
Standalone version

Not Found

but:

C:\WINDOWS>vrrtepcli -l appearance

C:\WINDOWS>echo off
Vrrtep CLI v1.9 by Tirea Aean. run 'vrrtepcli -h' for usage.
Windows version by Swoka Ikran
Standalone version

Query matches:
n. tìlam

Can you check why tìlam is not found? It seems that Windows has a problem with ì; words with ä are found...

Tirea Aean · June 22, 2011, 10:00:32 AM

I get

tirea@tirea:~$ vrrtepcli tìlam
Vrrtep CLI v1.91 by Tirea Aean. run 'vrrtepcli -h' for usage.

n. appearance

tirea@tirea:~$ vrrtepcli -l appearance
Vrrtep CLI v1.91 by Tirea Aean. run 'vrrtepcli -h' for usage.

Query matches:
n. tìlam

It's either locale or unicode failure. >.<

(I <3 Ubuntu)
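One way to test the locale theory is to check which Windows code pages can represent the query at all. This is only a diagnostic sketch in Python 3: the function name and the list of code pages are my own picks, and which code page Blue Elf's system actually uses is an assumption that would need confirming.

```python
def code_pages_supporting(word, code_pages):
    """Return the code pages from `code_pages` that can encode every letter of `word`."""
    supported = []
    for code_page in code_pages:
        try:
            word.encode(code_page)
        except UnicodeEncodeError:
            continue  # at least one letter is missing from this code page
        supported.append(code_page)
    return supported


# Candidate code pages to compare (an assumed selection, not taken from the thread).
CANDIDATES = ["cp437", "cp1250"]

# ì is missing from the Central European code page cp1250, while ä is present,
# which would match "ì fails but ä works" if that code page is in use.
print(code_pages_supporting("tìlam", CANDIDATES))
print(code_pages_supporting("kä", CANDIDATES))
```

If the argument is converted through a code page that lacks ì before the program sees it, the lookup would be done on a different string than the one typed.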

EDIT: Linux 1.91 is not much different than 1.9:

Quote from: ChangeLog.txt
v1.91
repeat looping now works on command line.

v1.9
[...]