Hyphenation?

Started by Yawne Zize’ite, July 07, 2013, 04:54:49 PM

Yawne Zize’ite

As before, I don't know where this should go, but Intermediate felt best; please move it if it belongs elsewhere, ma moderators.

So, I was playing with TeX and set a Naʼvi text in a narrow column. To nobody's surprise, it was heavily hyphenated by the program. To my surprise, despite its predictability, it was hyphenated in ways that felt wrong; in various runs, yrrap was hyphenated yr-rap, nìftxan was hyphenated nìft-xan, and tengkrr was hyphenated ten-gkrr.

After irritably typing in every hyphenation point I could think of for the short passage and finally getting an acceptable result, I started thinking: what are the hyphenation rules for Naʼvi?
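For anyone repeating the manual step, here is a minimal Python sketch that turns hand-marked words such as yrr-ap into TeX input. It only uses the standard TeX `\-` discretionary hyphen and the `\hyphenation{...}` exception list; the function names are made up for illustration, and whether `\hyphenation` accepts letters like ì or ʼ depends on the TeX engine and font setup, so treat that part as an assumption to test.

```python
def with_discretionary_hyphens(marked: str) -> str:
    """Turn a hand-marked word like 'yrr-ap' into 'yrr\\-ap' for inline use."""
    return marked.replace("-", "\\-")


def hyphenation_list(marked_words) -> str:
    """Build a single \\hyphenation{...} exception list from hand-marked words."""
    return "\\hyphenation{" + " ".join(marked_words) + "}"


if __name__ == "__main__":
    # Break points taken from the discussion; they are typed in by hand, not computed.
    words = ["yrr-ap", "nì-ftxan", "teng-krr"]
    print(hyphenation_list(words))
    for word in words:
        print(with_discretionary_hyphens(word))
```

The inline `\-` form is the more robust of the two for unusual characters, at the cost of having to mark every occurrence in the text.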

Starting with English rules is a horrible idea, since English speakers can't even agree on what they are, and any set of English hyphenation principles relies heavily on keeping word roots intact so readers can recognize them. Languages spelled (more) phonetically don't do this.

So, I went back to Latin. The rules for Latin have some complexity, but can be summed up like so:

1. Break at the syllable boundary. (There are many sub-rules for trying to find the syllable boundary, which can be summed up as "adhere to the Maximal Onset Principle" and at any rate do not concern Naʼvi.)
2. However, if a word is a compound word whose elements can be seen clearly, break at the division between elements.
3. Don't break a word so only one letter will be left on a line. (For English the rule is a minimum of two letters on the first line and three letters on the second line, but Latin does not have that restriction.)

These sound like good rules for Naʼvi too: yrr-ap, nì-ftxan, teng-krr, Ey-wa-ʼe-veng, Ey-wa-ʼe-ve-ngä. They run into issues with a/e+w/y+V sequences, but so does anything that relies on Naʼvi syllabification (e.g. tswayon would simply have to be entered as an exception tsway-on).

Since my native language is English, I have to fight to resist the urge to hyphenate to preserve word roots: not ay-ik-ran but a-yik-ran (which becomes ayik-ran by rule #3) - unless it really is ay.ik.ran - not tsa-atan but tsa-tan, and so on.
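As a rough illustration of rules 1 and 3, here is a small Python sketch. It does not find syllable boundaries itself: the word is passed in already syllabified with dots (as in a dictionary entry), and the function only drops break points that would leave a single character alone on a line. The name `hyphenate`, the dotted input format, and counting Unicode characters rather than Naʼvi letters are all assumptions made for this example; rule 2 (compounds) is not handled.

```python
import unicodedata


def hyphenate(dotted: str, min_left: int = 2, min_right: int = 2) -> str:
    """Insert hyphens at syllable boundaries, skipping breaks that would
    leave fewer than min_left / min_right characters on either side."""
    # NFC so that a letter like 'ì' counts as one character.
    syllables = unicodedata.normalize("NFC", dotted).split(".")
    total = sum(len(s) for s in syllables)
    pieces, current, pos = [], "", 0
    for syllable in syllables[:-1]:
        current += syllable
        pos += len(syllable)
        if pos >= min_left and total - pos >= min_right:
            pieces.append(current)
            current = ""
    pieces.append(current + syllables[-1])
    return "-".join(pieces)


if __name__ == "__main__":
    for entry in ["yrr.ap", "nì.ftxan", "teng.krr", "a.yik.ran"]:
        print(entry, "->", hyphenate(entry))
```

With the default limits, `hyphenate("a.yik.ran")` gives `ayik-ran`, matching the rule #3 example above. Counting characters means digraphs such as ng or tx count as two, which may or may not be what one wants for Naʼvi.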

Unfortunately I don't have the programming skill to write up the hyphenation algorithm.

Any thoughts?

Tìtstewan

#1
Have you taken a look at this syllables file, which includes all Na'vi syllables?
Maybe that will help.
Usually a word is separated at the syllables it contains, for example teng-krr, nì-ftxan, etc.
I don't think you can separate a word like this: nìft-xan.

My thought:
I would separate words into syllables.
atxkxe = atx-kxe
kllwo = kll-wo
atxkxerel = atx-kxe-rel
meoauniaea = me-o-a-u-ni-a-e-a
me'em = -me-'em


But I wouldn't split diphthongs like ey, ay, etc., ejectives like px, kx, or tx, or the pseudovowels ll and rr.
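The "don't split these letters" idea can be sketched in Python as a check on a proposed break position. The digraph list and the greedy left-to-right matching are assumptions for this example, and the function names are invented. Diphthongs (ay, ey, aw, ew) are deliberately left out, because whether a/e + w/y belongs together depends on the syllabification, which is exactly the ambiguity discussed in this thread.

```python
# Two-character units that should never be split by a hyphen (assumed list).
DIGRAPHS = ("px", "tx", "kx", "ts", "ng", "ll", "rr")


def graphemes(word: str) -> list:
    """Split a word into units, keeping the digraphs above together."""
    units, i = [], 0
    while i < len(word):
        if word[i:i + 2].lower() in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def splits_a_unit(word: str, index: int) -> bool:
    """True if a break just before word[index] would cut a digraph in half."""
    pos = 0
    for unit in graphemes(word):
        if pos < index < pos + len(unit):
            return True
        pos += len(unit)
    return False


if __name__ == "__main__":
    print(graphemes("atxkxe"))
    print(splits_a_unit("nìftxan", 4))   # candidate break nìft-xan
    print(splits_a_unit("tengkrr", 4))   # candidate break teng-krr
```

A check like this would reject the three breaks that looked wrong in the first post (yr-rap, nìft-xan, ten-gkrr) without needing full syllabification.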

-| Na'vi Vocab + Audio | Na'viteri as one HTML file | FAQ | Useful Links for Beginners |-
-| Kem si fu kem rä'ä si, ke lu tìfmi. |-

Yawne Zize’ite

Would tìmeʼem be tì-me-ʼem?

So no one disagrees with the actual principles or would be more comfortable with a morphological system? To a native English speaker, Eywaʼeve-ngit looks strange and Eywaʼeveng-it looks "right". This is why

Also, ma ʼEylan Ayfalulukanä, note that Central European norms and common sense suggest that Dothraki contracted digraphs should expand when hyphenated: atthar but ath-thar.

Tìtstewan

Quote from: Yawne Zize'ite on July 08, 2013, 01:15:51 PM
Would tìmeʼem be tì-me-ʼem?
Well, it should be me'em = me-'em
(Mine was a typo... :-[)

Quote from: Yawne Zize'ite on July 08, 2013, 01:15:51 PM
So no one disagrees with the actual principles or would be more comfortable with a morphological system? To a native English speaker, Eywaʼeve-ngit looks strange and Eywaʼeveng-it looks "right". This is why
To me, Eywaʼeve-ngit also sounds strange. In German, we usually separate words into their syllables, as in aufstehen = auf-ste-hen.
For Na'vi I would separate words into syllables, as in this
example: Eywaʼevengit = Ey-wa-ʼe-veng-it
A tip: note the red dots that should be in the dictionary: [ɛj.wa.ˈʔɛ.vɛŋ.it]

Quote from: Yawne Zize'ite on July 08, 2013, 01:15:51 PM
Also, ma ʼEylan Ayfalulukanä, note that Central European norms and common sense suggest that Dothraki contracted digraphs should expand when hyphenated: atthar but ath-thar.
Na'vi =/= Dothraki
I don't know about Dothraki, but it seems to use a different hyphenation system.


Yawne Zize’ite

#4
Quote from: Tìtstewan on July 08, 2013, 02:09:25 PM
Quote from: Yawne Zize'ite on July 08, 2013, 01:15:51 PM
So no one disagrees with the actual principles or would be more comfortable with a morphological system? To a native English speaker, Eywaʼeve-ngit looks strange and Eywaʼeveng-it looks "right". This is why
To me, Eywaʼeve-ngit also sounds strange. In German, we usually separate words into their syllables, as in aufstehen = auf-ste-hen.
For Na'vi I would separate words into syllables, as in this
example: Eywaʼevengit = Ey-wa-ʼe-veng-it
A tip: note the red dots that should be in the dictionary: [ɛj.wa.ˈʔɛ.vɛŋ.it]

That's an important question - when you attach prefixes, suffixes, and infixes to a word, how does it change where the syllable boundaries fall?

I assume that a suffix beginning with a vowel would "pick up" the final consonant, and change the word from [ɛj.wa.ˈʔɛ.vɛŋ] to [ɛj.wa.ˈʔɛ.vɛ.ŋit]. Similarly ay- might "pick up" a following vowel and turn ayikran into [a.ˈjik.ran], but I'm not as sure of that, because there's the counterpull of examples like ayhilwan.

Then infixes are their own kettle of fish. Is t<ay>aron [ta.ˈja.ɾon] or [taj.ˈa.ɾon]? My guess is the first one since it fits the infix pattern, but I don't know.
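To make the suffix guess concrete, here is a Python sketch of the "pick up the final consonant" idea. It is only a model of the assumption stated above, not a confirmed rule of Naʼvi. The function name and the letter lists are invented for the example, a final w or y is treated as an ordinary consonant (itself a guess), and only single-syllable suffixes are handled.

```python
VOWELS = set("aäeiìou")
CONSONANT_DIGRAPHS = ("px", "tx", "kx", "ts", "ng")
PSEUDOVOWELS = ("ll", "rr")


def attach_suffix(syllables, suffix):
    """Return a new syllable list after adding a one-syllable suffix."""
    last = syllables[-1]
    # A suffix with no vowel of its own (e.g. -t) just joins the last syllable.
    if not any(ch in VOWELS for ch in suffix.lower()):
        return syllables[:-1] + [last + suffix]
    starts_with_vowel = suffix[0].lower() in VOWELS
    ends_in_nucleus = (last[-1].lower() in VOWELS
                       or last[-2:].lower() in PSEUDOVOWELS)
    if starts_with_vowel and not ends_in_nucleus:
        # Move the final consonant (or consonant digraph) onto the suffix.
        size = 2 if last[-2:].lower() in CONSONANT_DIGRAPHS else 1
        if len(last) > size:
            return syllables[:-1] + [last[:-size], last[-size:] + suffix]
    # Otherwise the suffix simply forms a syllable of its own.
    return syllables + [suffix]


if __name__ == "__main__":
    print(attach_suffix(["Ey", "wa", "ʼe", "veng"], "it"))
    print(attach_suffix(["ya", "yo"], "t"))
    print(attach_suffix(["ya", "yo"], "ä"))
```

Under this model Eywaʼeveng + -it comes out as Ey-wa-ʼe-ve-ngit. Prefixes and infixes are not covered, since the thread has not settled how they behave.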

Plumps

The only official example I can think of is from this post in which Frommer gives the syllabification of

renu ngampamä as /re.nu ngam.pa.mä/

which strengthens your first guess.

Concerning infixes, I also think it's the first one, since in some instances the second would create illegal syllabifications; think of paylltxe. It has to be /pa.yll.txe/ because *pay.ll.txe would be illegal.

Tìtstewan

#6
Quote from: Yawne Zize'ite on July 08, 2013, 03:32:02 PM
That's an important question - when you attach prefixes, suffixes, and infixes to a word, how does it change where the syllable boundaries fall?

I assume that a suffix beginning with a vowel would "pick up" the final consonant, and change the word from [ɛj.wa.ˈʔɛ.vɛŋ] to [ɛj.wa.ˈʔɛ.vɛ.ŋit]. Similarly ay- might "pick up" a following vowel and turn ayikran into [a.ˈjik.ran], but I'm not as sure of that, because there's the counterpull of examples like ayhilwan.

Then infixes are their own kettle of fish. Is t<ay>aron [ta.ˈja.ɾon] or [taj.ˈa.ɾon]? My guess is the first one since it fits the infix pattern, but I don't know.
I think in that case infixes will split off into new syllables:
tayaron = ta-ya-ron
tolaron = to-la-ron
But I'm not sure whether this is allowed or confirmed anywhere. Otherwise, though, I don't see why it shouldn't be like this.

With prefixes and suffixes, it should be much easier:
ayyayo = ay-ya-yo
ayikranit = ay-ik-ran-it
fìtsengìl = fì-tseng-ìl

But if the affix is only one letter, like this:
yayoä = ya-yo-ä
yayot = ya-yot !!! Here you can't separate the suffix into its own syllable, because that makes no sense.
taronyut = ta-ron-yut
atxkxel = atx-kxel

EDIT:
Quote from: Plumps on July 08, 2013, 03:53:46 PM

Concerning infixes, I also think it's the first one, since in some instances the second would create illegal syllabifications; think of paylltxe. It has to be /pa.yll.txe/ because *pay.ll.txe would be illegal.
Hmm, I thought about
payll-txe ???
Thinking about the syllable of paytx


Oe skxawng... ::)





Plumps

Quote from: Tìtstewan on July 08, 2013, 03:54:36 PM
Quote from: Plumps on July 08, 2013, 03:53:46 PM

Concerning infixes, I also think it's the first one, since in some instances the second would create illegal syllabifications; think of paylltxe. It has to be /pa.yll.txe/ because *pay.ll.txe would be illegal.
Hmm, I thought about
payll-txe ???
Thinking about the syllable of paytx

That would work if ll and rr weren't restricted to coming only after a consonant or consonant cluster; in your example, /ay/ counts as a vowel (though a diphthong), which can't precede ll and rr.

*paytx is of course one syllable, as snaytx is ;)

Tìtstewan

I already corrected that...



Ma Plumps,

What do you say about the hyphenation of the infix-containing verbs and words like yayot??


Yawne Zize’ite

While further pinning down how to divide syllables in Naʼvi is very useful, I should note that it is possible for hyphenation to be done on morphological grounds English-style, instead of on phonological grounds Romance-style.

I'm not convinced it's a good idea, but it's not my call to make.

`Eylan Ayfalulukanä

Ma Yawne Zizeʼite, how did you know I would be reading this? ;) Dothraki hyphenation is certainly worthy of discussion-- in the Dothraki forums (now expanded to include Valyrian!).

I think the tense and mood infixes are designed to cause a syllable break in many cases. It is certainly very apparent with <iv>. Hyphenation is a complex issue, I think, in most languages.

Yawey ngahu!