Na'vi Scrabble!

Started by Kayrìlien, February 22, 2010, 01:43:18 AM

Kayrìlien

I was browsing the older threads in the forums (things that were written before I registered), and one of Kaltxì Palulukan's projects intrigued me. He had done a letter frequency count of all the words in the word list and made a chart of how often each letter appeared. Cool, huh?

Flash forward to this evening: I was playing Scrabble with my family, and (WARNING: SHAMELESS SELF-PROMOTION AHEAD!) after utterly blowing them out with sequential bingoes BAYONETS (on a triple word score) and REVOLVER (just double...oh darn), I picked up the letters AAEGNOY, and the first word I saw was...AYOENG! Of course, that set off a whole series of thoughts, and, three hours later, I'm here upstairs at my computer thinking about playing Scrabble in Na'vi!

Now, I remembered Kaltxì Palulukan's thread, which was really helpful as far as coming up with ideas for this, but I figured that a letter count of every word in the dictionary is very different from a letter count in a conversation, because some words are much more common than others.

So, to save (a lot of) time, I simply used Karyu Paul's letter to us from January 20. I wanted to first look at the Na'vi letter count, including digraphs and diphthongs as individual letters, then compare it to the letter count with individual letters.

Here is a chart with the count and frequencies of Na'vi letters including digraphs and diphthongs:

Letter - Tally - Frequency (%)

A - 63 - 10.26
E - 53 - 8.63
L - 39 - 6.35
O - 34 - 5.54
U - 33 - 5.37
N - 33 - 5.37
Ì - 31 - 5.05
T - 30 - 4.89
NG - 28 - 4.56
I - 27 - 4.40
' - 24 - 3.91
R - 24 - 3.91
EY - 22 - 3.58
AY - 20 - 3.26
F - 16 - 2.61
M - 16 - 2.61
S - 16 - 2.61
Ä - 14 - 2.28
V - 14 - 2.28
Y - 13 - 2.12
K - 12 - 1.95
P - 10 - 1.63
AW - 9 - 1.47
W - 8 - 1.30
TX - 8 - 1.30
H - 4 - 0.65
TS - 3 - 0.49
Z - 3 - 0.49
EW - 2 - 0.33
RR - 2 - 0.33
PX - 2 - 0.33
KX - 1 - 0.16
LL - 0 - 0.00

Total: 614

This looked somewhat similar to English in that the vowels are mostly clustered near the top, and the consonants are very unevenly distributed.

Next I converted all of these numbers into counts and frequencies of individual letters. While I'm aware that some languages that use digraphs actually have Scrabble tiles with both letters on them (for example, IJ in Dutch), I felt that having only individual letter tiles would be more interesting and more fun, for reasons I will explain later.

Here are the tallies and frequencies for individual letters:

Letter - Tally - Frequency (%) - Role

A - 92 - 12.94 - Vowel
E - 77 - 10.83 - Vowel
N - 61 - 8.58 - Consonant
Y - 55 - 7.74 - Either/Or
T - 41 - 5.77 - Consonant
L - 39 - 5.49 - Either/Or
O - 34 - 4.78 - Vowel
U - 33 - 4.64 - Vowel
Ì - 31 - 4.36 - Vowel
R - 28 - 3.94 - Either/Or
G - 28 - 3.94 - Dependent
I - 27 - 3.80 - Vowel
' - 24 - 3.38 - Consonant
W - 19 - 2.67 - Either/Or
S - 19 - 2.67 - Consonant
M - 16 - 2.25 - Consonant
F - 16 - 2.25 - Consonant
Ä - 14 - 1.97 - Vowel
V - 14 - 1.97 - Consonant
K - 13 - 1.83 - Consonant
P - 12 - 1.69 - Consonant
X - 11 - 1.55 - Dependent
H - 4 - 0.56 - Consonant
Z - 3 - 0.42 - Consonant

Total: 711
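
As a cross-check on the conversion, here is a minimal Python sketch that splits each digraph and diphthong tally from the first chart into its component letters. The dictionary and function names are made up for illustration, and the only data is the tally column above. It reproduces the single-letter total of 711 (for example, A = 63 + 20 + 9 = 92).

```python
from collections import Counter

# Tallies copied from the first chart (digraphs and diphthongs counted as letters).
DIGRAPH_TALLY = {
    "A": 63, "E": 53, "L": 39, "O": 34, "U": 33, "N": 33, "Ì": 31,
    "T": 30, "NG": 28, "I": 27, "'": 24, "R": 24, "EY": 22, "AY": 20,
    "F": 16, "M": 16, "S": 16, "Ä": 14, "V": 14, "Y": 13, "K": 12,
    "P": 10, "AW": 9, "W": 8, "TX": 8, "H": 4, "TS": 3, "Z": 3,
    "EW": 2, "RR": 2, "PX": 2, "KX": 1, "LL": 0,
}


def split_into_single_letters(tally):
    """Credit every character of a (possibly two-character) letter with its tally."""
    single = Counter()
    for letter, count in tally.items():
        for ch in letter:
            single[ch] += count
    return single


single = split_into_single_letters(DIGRAPH_TALLY)
total = sum(single.values())

# Print in the same "Letter - Tally - Frequency (%)" layout as the charts.
for letter, count in single.most_common():
    print(f"{letter} - {count} - {100 * count / total:.2f}")
print("Total:", total)
```

Note that RR contributes two R's per occurrence, which is why R goes from 24 to 28.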

Notes about the fourth column: "Either/Or" means the letter can either be used by itself as a consonant or as part of a diphthong or syllabic vowel. "Dependent" means that the letter cannot exist by itself; it can only exist alongside another letter, much like Q in English (the overwhelming majority of the time...words like "qintar" and "sheqalim" can, uh...how does Jake put it? Kiss the darkest part of my lily white--).

I then compared this frequency chart to English, and again noticed many similarities; most of the vowels are at the top, with two consonants being much more frequent than others. N is easily the most common consonant, though it is helped immensely by being part of NG. Y is also rather common, as it is not only part of two diphthongs but also of many affixes, such as plural ay+, genitive -yä, future and near-future infixes <ìy> and <ay>, etc.

Finally, to compare this frequency chart with actual Scrabble gameplay, I looked at how often each letter tile appears in the English version of Scrabble. I wanted to preserve a similar ratio of vowels to consonants (Na'vi's is slightly higher, perhaps by one or two percent), plus a similar assortment of tile point values and a similar "points potential", which is what I called the sum of all the point values on the tiles. (If someone with a background in Game Theory knows what this is actually called, let me know.) In English, this sum is 187.

Keeping all these things in mind, this is what I came up with. For all the TL;DR-type people in the forum, this is what you're looking for.

Number of Tiles - Letter @ Points Value

12 - A @ 1
8 - E @ 1
6 - N @ 1
6 - Y @ 1
6 - T @ 1
6 - L @ 1
5 - O @ 1
5 - U @ 1
4 - R @ 1
4 - I @ 1
4 - Ì @ 2
4 - (') @ 2
3 - Ä @ 2
3 - K @ 3
3 - P @ 3
3 - X @ 8
3 - S @ 2
3 - M @ 3
2 - G @ 4
2 - W @ 4
2 - F @ 4
2 - V @ 4
1 - H @ 5
1 - Z @ 10
2 - [BLANK] @ 0 (Just as in English Scrabble)

Total: 100 Tiles, 187 Total Points (also just as in English Scrabble)
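
The totals can be re-tallied mechanically. The sketch below is only an illustration: the tuple layout and names are assumptions, the counts and point values are copied from the list above, and the roles come from the Role column of the single-letter chart. Run as written, it reports 100 tiles and 188 points for the values as listed, one more than the stated 187, so one of the listed values may be off by a point.

```python
from collections import Counter

# (letter, number of tiles, points per tile, role), copied from the lists above.
TILES = [
    ("A", 12, 1, "Vowel"), ("E", 8, 1, "Vowel"), ("N", 6, 1, "Consonant"),
    ("Y", 6, 1, "Either/Or"), ("T", 6, 1, "Consonant"), ("L", 6, 1, "Either/Or"),
    ("O", 5, 1, "Vowel"), ("U", 5, 1, "Vowel"), ("R", 4, 1, "Either/Or"),
    ("I", 4, 1, "Vowel"), ("Ì", 4, 2, "Vowel"), ("'", 4, 2, "Consonant"),
    ("Ä", 3, 2, "Vowel"), ("K", 3, 3, "Consonant"), ("P", 3, 3, "Consonant"),
    ("X", 3, 8, "Dependent"), ("S", 3, 2, "Consonant"), ("M", 3, 3, "Consonant"),
    ("G", 2, 4, "Dependent"), ("W", 2, 4, "Either/Or"), ("F", 2, 4, "Consonant"),
    ("V", 2, 4, "Consonant"), ("H", 1, 5, "Consonant"), ("Z", 1, 10, "Consonant"),
    ("[BLANK]", 2, 0, "Blank"),
]

total_tiles = sum(n for _, n, _, _ in TILES)
total_points = sum(n * pts for _, n, pts, _ in TILES)

# Number of tiles in each role category.
by_role = Counter()
for _, n, _, role in TILES:
    by_role[role] += n

print("Tiles:", total_tiles)
print("Points:", total_points)
for role, n in by_role.items():
    print(f"{role}: {n}")
```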

Notes:

1)  There are 41 Vowel tiles, 34 Consonant tiles, 18 Either/Or tiles, 5 Dependent tiles, and 2 Blank tiles. English has 42 Vowel tiles, 53 Consonant tiles, 2 Either/Or tiles (Y), 1 Dependent tile (Q), and 2 Blank tiles. As you can see, the large number of Either/Or letters in Na'vi allows for the higher Vowel-to-Consonant ratio to be expressed without everyone's tile rack looking like Hawaiian.

2)  The number of tiles has been kept rather close to the overall frequency of the letters in the language, with a few exceptions. I chose to give a bit of emphasis to the letters not found in English, namely Ä, Ì, and ('), so they show up in slightly higher quantities than the numbers say they should.

3)  Na'vi has two dependent letters, X and G, and they are not nearly as uncommon as Q is in English. Because ejectives are another thing that is "different" in Na'vi, there are three X's to go along with twelve T's, P's, and K's combined (six T's, three P's, and three K's), which is the same 1-to-4 ratio as Q to U in English Scrabble. Because of their dependence, and the fact that Na'vi has fewer extremely rare letters, these three X's have a point value of 8, which, coincidentally, is the same point value X has in English Scrabble. To compensate for this, there is only one 10-point tile, the lone Z.

G is a bit of a different animal because NG is extremely common in Na'vi words. Because of this, even though there are actually fewer G's in this tile set than there are X's, G is only worth 4 points. It shouldn't ever be hard to use them, especially since you can just stick <äng> into a verb.

4)  Obviously a language with a four-digit number of words is going to be hard to play with. For this reason, I would suggest playing with a rack of eight or nine tiles rather than the usual seven. I thought about putting extra blanks into the mix, but that takes a lot of the skill out of the game.

5)  English has a surprising number of "hooks" that can be used to build off of words, notably adding S to the end of a word to make it plural. Na'vi doesn't have this sort of flexibility, as most of its word alterations are either infixes (which Scrabble doesn't like at all) or multiple letters long. The best "hook" in Na'vi is likely Ä, which can be added to the end of many nouns to form the genitive.

6)  (The BIG one.) This whole thing is probably flat out wrong. Why? Well, I don't think that using one letter, even if it is from Karyu Pawl, is really an accurate sampling of an entire language. I really wish I had the time to analyze other parts of the canon and use them to give me more accurate figures, but until that time materializes (read: College is so freaking busy I want to rip my hair out), this is all I'm going to base my numbers on. I don't think that anything will change drastically, but if anyone wants to take the initiative, it'd be much appreciated.

Well, that's about all I have. If anyone wants an Excel file with all of the data I used to do this with, just send me an email.

Now...all I have to do is find someone within 100 miles of me that actually knows Na'vi and would consider playing Scrabble with me!

Hopefully you all enjoy this.

Eywa ayngahu,

Kayrìlien

Payoang

Awesome compilation. Moving to Projects; "Language Updates" is for official updates.

Kayrìlien

Quote from: Seabass on February 22, 2010, 02:21:54 AM
Awesome compilation. Moving to Projects; "Language Updates" is for official updates.

Irayo for that; I wasn't sure which forum this belonged in!

Kayrìlien

Erimeyz

Quote from: Kayrìlien on February 22, 2010, 01:43:18 AM
I don't think that using one letter, even if it is from Karyu Pawl, is really an accurate sampling of an entire language. I really wish I had the time to analyze other parts of the canon and use them to give me more accurate figures, but until that time materializes (read: College is so freaking busy I want to rip my hair out), this is all I'm going to base my numbers on.
When I get the chance, I'll send you the cleaned-up intermediate file I used for my word-count analysis of the Na'vi Only forum posts.  I had been thinking about doing a letter-count analysis as well, but never got around to it, and you seem to be well-equipped for the task. :)

  - Eri

Erimeyz

(Oh, and btw, this is freakin' awesome.)

Kayrìlien

Quote from: Erimeyz on February 22, 2010, 10:56:44 AM
Quote from: Kayrìlien on February 22, 2010, 01:43:18 AM
I don't think that using one letter, even if it is from Karyu Pawl, is really an accurate sampling of an entire language. I really wish I had the time to analyze other parts of the canon and use them to give me more accurate figures, but until that time materializes (read: College is so freaking busy I want to rip my hair out), this is all I'm going to base my numbers on.
When I get the chance, I'll send you the cleaned-up intermediate file I used for my word-count analysis of the Na'vi Only forum posts.  I had been thinking about doing a letter-count analysis as well, but never got around to it, and you seem to be well-equipped for the task. :)

  - Eri


Oh wow, that would be awesome! I think it'd definitely be more accurate than my rather limited source material. Specifically, it'll probably boost O up a bunch (people sure like to talk about themselves...so many oe's...), but I'm gonna have to decline adding a B tile for "buzzlightyear", sorry.  ;D

Quote from: Erimeyz on February 22, 2010, 10:57:23 AM
(Oh, and btw, this is freakin' awesome.)

Irayo! I'm glad you enjoy it. Until I get around to going to a local craft store to buy a bunch of little wooden square pieces for tiles, I'll probably amuse myself by just using a random number generator to give me sample racks of seven tiles and seeing what Na'vi words I can make.

Kayrìlien

Erimeyz

Also btw - at first, I was surprised and skeptical that you used monographs as letters instead of including the digraphs on single tiles.  It just didn't seem right, you know?  Because kx ISN'T K + X, it's a letter all to itself and completely different from K.  Right?  Right?

Right.

But.  You made a very convincing case, that FOR SCRABBLE the game is more interesting and aesthetically appropriate using monographs.  So.  I was wrong, you are right, and deservedly so. :)

I find your construction of the tileset (frequencies, points) to be well-chosen.  I do wonder, though, whether it's more appropriate for tile frequencies to be based off of a literary / conversational corpus as you're trying to do, or off of the dictionary instead?  After all, as you point out, people like to say "Oe" a whole lot...

  - Eri

Kayrìlien

Quote from: Erimeyz on February 22, 2010, 12:41:54 PM
Also btw - at first, I was surprised and skeptical that you used monographs as letters instead of including the digraphs on single tiles.  It just didn't seem right, you know?  Because kx ISN'T K + X, it's a letter all to itself and completely different from K.  Right?  Right?

Right.

But.  You made a very convincing case, that FOR SCRABBLE the game is more interesting and aesthetically appropriate using monographs.  So.  I was wrong, you are right, and deservedly so. :)

I find your construction of the tileset (frequencies, points) to be well-chosen.  I do wonder, though, whether it's more appropriate for tile frequencies to be based off of a literary / conversational corpus as you're trying to do, or off of the dictionary instead?  After all, as you point out, people like to say "Oe" a whole lot...

  - Eri


As far as the first point goes, I really do think you could go both ways, but some of the reasons why I chose using individual tiles rather than digraphs are:

1) Words like ayngeyä that consist mostly of digraphs would become rather short words, which I could see as a bit of a letdown to many people. They just saw this awesome seven-letter word, but it's only four tiles, perhaps six points maximum (without premium squares). While I know this is a very English-centric way of looking at the game, as Na'vi would definitely not see ayngeyä as a seven-letter word but (correctly) as a four-letter word, I'm thinking about how people will react to actually playing the game, and I can see this being a major drawback, at least to people who are very concerned with achieving a high score.

On the other hand, using digraphs would allow "long" Na'vi words to be played more easily, something that would open up the useful vocabulary a bit. You really could go both ways with this issue, so I just chose one and went with it. I'd be more than happy to derive a tileset and point values for a digraphs-version of the game as well.

2) I think that having five dependent letters adds a bit of strategy to the game when dealing with those letters. Anyone who has played Scrabble before knows how much fun (pain?) the Q can be, and how having it in your rack can drastically alter how you play the game. Having three X's and two G's that all share characteristics with English Q seems like it would add another dimension to the game, plus, using digraphs, PX and KX would likely only have one tile each.
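
To make the ayngeyä comparison concrete, here is a rough Python sketch that counts tiles both ways and scores the word with the single-letter point values from the tile list. Everything named here is hypothetical. The digraph split is a greedy left-to-right match, which is an assumption rather than real Na'vi syllabification: it can mis-split a word where a vowel is followed by a Y or W that actually starts the next syllable. No digraph point values are defined in this thread, so only the single-letter score is computed.

```python
import unicodedata

# Two-character letters from the first chart, lower-cased.
DIGRAPHS = ("ng", "ts", "tx", "px", "kx", "ll", "rr", "ay", "ey", "aw", "ew")

# Single-letter point values copied from the proposed tile set.
POINTS = {
    "a": 1, "e": 1, "n": 1, "y": 1, "t": 1, "l": 1, "o": 1, "u": 1,
    "r": 1, "i": 1, "ì": 2, "'": 2, "ä": 2, "k": 3, "p": 3, "x": 8,
    "s": 2, "m": 3, "g": 4, "w": 4, "f": 4, "v": 4, "h": 5, "z": 10,
}


def clean(word):
    """Lower-case and compose accents so 'ä' and 'ì' are single characters."""
    return unicodedata.normalize("NFC", word).lower()


def digraph_tiles(word):
    """Greedy split into tiles for a digraph-tile version of the game."""
    word = clean(word)
    tiles, i = [], 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            tiles.append(word[i:i + 2])
            i += 2
        else:
            tiles.append(word[i])
            i += 1
    return tiles


def single_letter_score(word):
    """Face value of a word using one tile per written letter."""
    return sum(POINTS[ch] for ch in clean(word))


word = "ayngeyä"
print(len(clean(word)), "single-letter tiles, score", single_letter_score(word))
print(len(digraph_tiles(word)), "digraph tiles:", digraph_tiles(word))
```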

The second point is equally valid. While I agree that using a dictionary is a more logical way to determine letter frequency, as we are looking at simply forming words regardless of meaning, I basically just followed the same process that the original creator of Scrabble did, who used articles from The New York Times to determine his letter counts. Yes, using conversational dialogue places extra emphasis on pronouns and commonly-used discourse-related words, but I also think that those are the words that most people are going to think of first; the ones that they use more frequently in conversation. Eltungawng probably doesn't come up in conversations that often, so people might not remember what it is right away. But (almost) everybody knows what irayo means, and it will probably be seen as a potential play a lot more often than something obscure.

Kayrìlien

Erimeyz

Since I had nothing better to do than fart around on Google for a few minutes:


Scrabble's not really my thing (although Na'vi Scrabble could TOTALLY become one of my things), so if anyone knows of other custom tile makers, speak up.

  - Eri

Kayrìlien

Quote from: Erimeyz on February 22, 2010, 01:35:41 PM
Since I had nothing better to do than fart around on Google for a few minutes:


Scrabble's not really my thing (although Na'vi Scrabble could TOTALLY become one of my things), so if anyone knows of other custom tile makers, speak up.

  - Eri


Wow, I'm surprised there are companies out there that do this sort of thing. I guess I underestimated how serious some people are about board games. Then again, I am learning a made-up language from a sci-fi movie.  :D

I was thinking something more along the lines of Michael's, just buying a bag of square wooden chips that they have for miscellaneous crafts and the like. (I've definitely used their bargain bins before to make those "Name Trains" for people I know, except I used "STFU" instead of their name.  :D) Shouldn't cost more than five dollars, and I can just Sharpie the letters on. Low budget operation ftw!

In other news, I got Excel to give me a random tile generator (though I'm positive that anyone with even moderate knowledge of computer programming would scoff at how illogical it is), and here are the first three racks of tiles it gave me:

1)  AÄMRTVY

2) AAìNPW'

3) EEGìKRY

If anyone wants to try their hand at coming up with the best Na'vi word for each of those racks as a sort of test to make sure this would actually be, you know, fun, that'd be cool.
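
For anyone who would rather not do this in Excel, a random rack draw can be sketched in a few lines of Python. This is not the spreadsheet method mentioned above, just one simple way to do it: build the 100-tile bag from the proposed counts and sample without replacement. The names are made up for illustration, and showing the two blanks as "?" is an assumption.

```python
import random

# Tile counts copied from the proposed tile set; "?" stands in for a blank.
TILE_COUNTS = {
    "A": 12, "E": 8, "N": 6, "Y": 6, "T": 6, "L": 6, "O": 5, "U": 5,
    "R": 4, "I": 4, "Ì": 4, "'": 4, "Ä": 3, "K": 3, "P": 3, "X": 3,
    "S": 3, "M": 3, "G": 2, "W": 2, "F": 2, "V": 2, "H": 1, "Z": 1,
    "?": 2,
}


def build_bag(counts):
    """Expand the counts into a list with one entry per physical tile."""
    return [letter for letter, n in counts.items() for _ in range(n)]


def draw_rack(bag, size=7, rng=random):
    """Draw `size` tiles without replacement and remove them from the bag."""
    rack = rng.sample(bag, size)
    for tile in rack:
        bag.remove(tile)
    return rack


bag = build_bag(TILE_COUNTS)
for number in range(1, 4):
    print(f"{number}) {''.join(sorted(draw_rack(bag)))}")
```

Passing `size=8` or `size=9` gives the larger racks suggested in the notes, and passing a seeded `random.Random` as `rng` makes the draws repeatable.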

Irayo,

Kayrìlien

Erimeyz

Quote from: Kayrìlien on February 22, 2010, 01:58:07 PM
Shouldn't cost more than five dollars, and I can just Sharpie the letters on. Low budget operation ftw!

Sounds like work.  These days I like to pay people to do work for me. :)

  - Eri

Erimeyz

Okay, as promised, here's a (mostly) cleaned-up wordlist from the Na'vi Only forums, as of a couple weeks ago.  I tried to remove the non-Na'vi words (including Na'vi-ified proper names and loan words) and to fix obvious spelling errors and to convert everything to lower-case and to remove punctuation and to switch stress-marked letters like á and é into a and e etc.  It was mostly automated, but partially manual, and some stuff may have gotten missed.  Still, it should be pretty good.

I'm attaching two versions.  The second one is just a uniqued version of the first, so that if you want to do a frequency count by lexicon rather than discourse (but using a discourse-like corpus as the lexical source) you can just use the second file and save yourself a step.
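
The difference between the two files can be illustrated with a small Python sketch. The word list below is a made-up sample, not the attached data, and the function name is hypothetical. Loading the real attachment is left out because its format isn't shown here. Counting over the full list weights letters by how often words are used (discourse), while counting over the uniqued list weights every word once (lexicon).

```python
from collections import Counter


def letter_counts(words):
    """Count individual written letters across an iterable of words."""
    counts = Counter()
    for word in words:
        counts.update(word)
    return counts


# Hypothetical sample standing in for the cleaned-up forum word list.
sample_words = ["oe", "oe", "oe", "irayo", "nga"]

discourse = letter_counts(sample_words)       # every occurrence counts
lexicon = letter_counts(set(sample_words))    # each distinct word counts once

print("discourse O count:", discourse["o"])
print("lexicon O count:", lexicon["o"])
```

In this toy sample the repeated "oe" inflates O in the discourse count, which is the effect discussed earlier in the thread.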

Have fun!

  - Eri

Erimeyz

... and on a related note: Na'vi Boggle!

Blue Panther (link above) also makes custom dice.  Or, you can use blanks and a Sharpie...  But what distribution of letters on cubes should you use?  I have absolutely no clue, but here are some links that might help spur ideas if anyone was so inclined:

* Boggle letter distribution
* Boggle letter distributions from lots of different versions
* Boggle letter distribution from Big Boggle (5x5 grid)
* Ideas for algorithms for generating word search grids

  - Eri

'Awve Tìkameie

This is a great idea! You can probably do a lot of different board games about the language. Good luck!
My uncle's dog came over a little less than one week ago. I stupidly left the computer plug on the floor. That night, his dog chewed it up. So, the next morning, I go looking for the plug and I find a chewed up piece of metal. I sadly cannot purchase a new plug right now, and my computer's battery is out. I'm at my public library.

Also, I discovered this promising "small business" film company called Mirror Entertainment. It looks really promising: http://www.mirrorente

eanayo

Quote from: Erimeyz on February 22, 2010, 02:55:22 PM
... and on a related note: Na'vi Boggle!

Months ago I was laughing at Klingon Boggle - today I wish I could play Na'vi Boggle with some guys. How weird is that?

Great job coming up with all these ideas for this beautiful language. It's amazing to see it pick up speed this fast.

Ah, Neytiri, nice one! Too bad that's a proper noun.

Visit Our Dictionary for eBook readers, The Na'vi Word Puzzle Game and the Cryptogram Generator
srake tsun pivlltxe san [ˈɔaχkat͡slʃwɔaf]?

Kayrìlien

Quote from: Aysyal on February 22, 2010, 03:41:40 PM
Quote from: Erimeyz on February 22, 2010, 02:55:22 PM
... and on a related note: Na'vi Boggle!

Months ago I was laughing at Klingon Boggle - today I wish I could play Na'vi Boggle with some guys. How weird is that?

This is ONE HUNDRED PERCENT the situation I'm in. Sitting on the couch watching The Big Bang Theory, I thought it was absurd that people would ever spend that much time and energy trying to learn a language that "no one in their right mind" would ever speak. (Plus, it was funny how Howard was attempting to use Yiddish words as Klingon; apparently they sound similar, I don't know...) At that time it was something that I just dismissed as "geeky" and never thought about again.

BOY WAS I WRONG.

Now look at me, absolutely enchanted by both the beauty and technical challenges of Na'vi, sitting here, again on my couch, though with the Olympics on rather than CBS sitcoms, thinking about just how awesome it would be to do basically the same thing that I was openly ridiculing no more than four months ago. I actually feel rather guilty, and I made that rather clear in the petition sent to Karyu Pawl.

I mean, technically, I don't really like Boggle, Scrabble is much much better, but...semantics, who cares?

Quote from: Erimeyz on February 22, 2010, 02:28:13 PM
Okay, as promised, here's a (mostly) cleaned-up wordlist from the Na'vi Only forums, as of a couple weeks ago.  I tried to remove the non-Na'vi words (including Na'vi-ified proper names and loan words) and to fix obvious spelling errors and to convert everything to lower-case and to remove punctuation and to switch stress-marked letters like á and é into a and e etc.  It was mostly automated, but partially manual, and some stuff may have gotten missed.  Still, it should be pretty good.

I'm attaching two versions.  The second one is just a uniqued version of the first, so that if you want to do a frequency count by lexicon rather than discourse (but using a discourse-like corpus as the lexical source) you can just use the second file and save yourself a step.

Have fun!

  - Eri


I'll take a closer look at this a bit later in the week, as I'm absolutely swamped until Wednesday evening. Thanks a million for this, though, it looks immensely useful. LOL at the Mr. Ed joke, BTW.


Quote
In other news, I got Excel to give me a random tile generator (though I'm positive that anyone with even moderate knowledge of computer programming would scoff at how illogical it is), and here are the first three racks of tiles it gave me:

1)  AÄMRTVY

2) AAìNPW'

3) EEGìKRY

If anyone wants to try their hand at coming up with the best Na'vi word for each of those racks as a sort of test to make sure this would actually be, you know, fun, that'd be cool.

The best words I could find for these are:

1) vay

2) nì'aw

3) <er>eyk

There's probably a better one for the first rack, though I couldn't see anything off the top of my head.
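
Checking whether a candidate word fits a rack is easy to automate. The following Python sketch is an illustration with hypothetical names: it compares letter counts, ignores case, and strips the angle brackets used to mark infixes. It doesn't handle blank tiles, and it doesn't check that the word is actually valid Na'vi.

```python
from collections import Counter
import unicodedata


def normalize(text):
    """Lower-case, compose accents, and drop the <...> infix markers."""
    text = unicodedata.normalize("NFC", text).lower()
    return "".join(ch for ch in text if ch not in "<>")


def can_play(word, rack):
    """True if every letter of `word` is available often enough in `rack`."""
    need = Counter(normalize(word))
    have = Counter(normalize(rack))
    return all(have[ch] >= n for ch, n in need.items())


for word, rack in [("vay", "AÄMRTVY"), ("nì'aw", "AAìNPW'"), ("<er>eyk", "EEGìKRY")]:
    print(word, "from", rack, "->", can_play(word, rack))
```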

Thanks again for all the feedback,

Kayrìlien

Erimeyz

Quote from: Kayrìlien on February 22, 2010, 01:58:07 PM
3) EEGìKRY

If anyone wants to try their hand at coming up with the best Na'vi word for each of those racks as a sort of test to make sure this would actually be, you know, fun, that'd be cool.

Bingo!  GEEKÌRY !!

What?

Oh, come on.  You can't tell me geekìry isn't a Na'vi word!  Na'vi is pure geekìry!

  - Eri

'Awve Tìkameie

I was thinking, since many people can't order/buy a Na'vi Scrabble, what if you were to create an online Scrabble game? Kind of like a Scrabble computer game. If you would like, I can find some info and send it to you.

Kayrìlien

Quote from: 'Awve Tìkameie on February 24, 2010, 11:14:26 PM
I was thinking, since many people can't order/buy a Na'vi Scrabble, what if you were to create an online scrabble game. Kind of like a scrabble computer game. If you would like, I can find some info and send it to you

I was actually thinking about doing this as well, though my own knowledge of computer programming is minimal. I've already gotten a couple of PM's regarding doing this, so there's definitely interest. Plus, having it online is much more practical than simply having a physical board game, at least for people like me who are rather far away from any other Na'vi speakers.

I'm sure there are copyright issues involved with calling something Scrabble, so I think if this ever happens we'd have to use Keylstxatsmen's phonetic suggestion, Skìrapll leNa'vi! (Much as online Pictionary is called "iSketch")

I'm finally somewhat free tonight, so I'm looking at Erimeyz' rather comprehensive word list to see if there are any discrepancies in letter count, plus I can finally playtest a bit with the wooden tiles I bought at Michael's a couple hours ago.

Quote from: Erimeyz on February 22, 2010, 04:26:22 PM
Quote from: Kayrìlien on February 22, 2010, 01:58:07 PM
3) EEGìKRY

If anyone wants to try their hand at coming up with the best Na'vi word for each of those racks as a sort of test to make sure this would actually be, you know, fun, that'd be cool.

Bingo!  GEEKÌRY !!

What?

Oh, come on.  You can't tell me geekìry isn't a Na'vi word!  Na'vi is pure geekìry!

  - Eri


That made me laugh. Pure geekìry, lol...

Kayrìlien

'Awve Tìkameie

Quote
I was actually thinking about doing this as well, though my own knowledge of computer programming is minimal. I've already gotten a couple of PM's regarding doing this, so there's definitely interest. Plus, having it online is much more practical than simply having a physical board game, at least for people like me who are rather far away from any other Na'vi speakers.

I'm sure there are copyright issues involved with calling something Scrabble, so I think if this would ever happen we'd have to use Keylstxatsmen's phonetic suggestion, Skìrapll leNa'vi! (Much as online Pictionary is called "iSketch")

I'm finally somewhat free tonight, so I'm looking at Erimeyz' rather comprehensive word list to see if there are any discrepancies in letter count, plus I can finally playtest a bit with the wooden tiles I bought at Michael's a couple hours ago.

I'll look online for some info, to help out. I really like this idea. I'll see if there is an application that allows you to "make your own scrabble," if you know what I mean. If I can't find that, I'll look up info on how to make it.

Also, as another option, you can ask people around the forum for help. They probably have some valuable information on making something like this.
Once I find enough, I'll PM you. In the meantime, good luck!