If we want to look something up in a Thai dictionary or
lexicon, we have to deal with the sorting of Thai words.
Unlike our alphabet, which is presented as a single
combined list of consonants and vowels, the Thai script
separates consonants and vowels into two lists.
Words are sorted first by their consonants,
in the order in which we learn them:
The last five vowel signs listed here (เ, แ, โ, ใ, ไ) are written in front
of the initial consonant of a syllable and require special
consideration: they are sorted as if they stood behind
the next consonant. In the case of a double initial consonant,
these characters move to a position between the two
initial consonants. The remaining parts of a vowel combination
stay in their place around the last initial consonant.
Tone marks and other diacritical marks are ignored
completely in the first step of sorting.
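The rule for the vowel signs written in front of a consonant can be sketched in a few lines of Python. This is only a minimal illustration, not a complete implementation: the function name move_leading_vowels is made up for this example, and the sketch assumes that each leading vowel is directly followed by the consonant it belongs to.

```python
# The five vowel signs written in front of the consonant: เ แ โ ใ ไ
LEADING_VOWELS = set("\u0e40\u0e41\u0e42\u0e43\u0e44")

def move_leading_vowels(word):
    """Return the word with each leading vowel placed behind the next character."""
    chars = list(word)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in LEADING_VOWELS:
            # Swap the vowel with the consonant that follows it.
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the pair that was just swapped
        else:
            i += 1
    return "".join(chars)

print(move_leading_vowels("เกม"))  # single initial consonant: vowel goes behind ก
print(move_leading_vowels("แปล"))  # double initial consonant: vowel lands between ป and ล
```

The result is meant only for comparing words, not for display.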
The first step of sorting then compares two words character
by character until a difference is found.
The first differing character determines the order in the
sorted list: two different consonants are ordered according to the
priority list of the consonants; a consonant has higher priority
than a vowel; two different vowels are ordered according to the
priority list of the vowels. If the search for the first differing
character reaches the end of one word before the end of the
other word, then the shorter word has the higher priority.
If the two words turn out to be exactly the same in this first step, then
(and only then) the special characters are taken into account in
a second step, in the following priority order:
-, -่, -้, -๊,
-๋, -็, -์, -ๆ, -ฯ
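The two steps can be combined into a single sort key. The following Python sketch rests on several assumptions: the name thai_sort_key is invented for this example, the special characters are exactly the eight signs listed above, and the Unicode code point order of Thai consonants and vowels is taken to match the two priority lists (consonants before vowels). It should be read as an illustration of the rules, not as a tested dictionary sorter.

```python
# The five vowel signs written in front of the consonant: เ แ โ ใ ไ
LEADING_VOWELS = set("\u0e40\u0e41\u0e42\u0e43\u0e44")
# Special signs in the priority order given above; "no sign" gets rank 0.
MARK_RANK = {mark: rank for rank, mark in enumerate(
    "\u0e48\u0e49\u0e4a\u0e4b\u0e47\u0e4c\u0e46\u0e2f", start=1)}

def thai_sort_key(word):
    """Return (letters, signs): letters decide first, signs break ties."""
    letters, signs = [], []
    for ch in word:
        if ch in MARK_RANK:
            if letters:
                signs[-1] = MARK_RANK[ch]  # sign belongs to the preceding letter
        else:
            letters.append(ch)
            signs.append(0)
    # Treat each leading vowel as if it stood behind the next consonant.
    i = 0
    while i < len(letters) - 1:
        if letters[i] in LEADING_VOWELS:
            letters[i], letters[i + 1] = letters[i + 1], letters[i]
            signs[i], signs[i + 1] = signs[i + 1], signs[i]
            i += 2
        else:
            i += 1
    return ("".join(letters), tuple(signs))

words = ["ม้า", "ขาย", "เกม", "มา"]
print(sorted(words, key=thai_sort_key))  # ['เกม', 'ขาย', 'มา', 'ม้า']
```

Python compares the first element of the key before the second, and a string that is the beginning of a longer string sorts first, so the rule that the shorter word has the higher priority needs no extra code.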
Summary
1. Two given words are compared character by
character from left to right up to the first difference. This
difference alone decides, according to the priorities given above.
It does not matter how many characters were equal before it, nor
what follows after it. If several consonants follow one another,
there is no need to investigate whether they form a special kind of
consonant cluster, or whether they are initial or final consonants.
In vowel combinations, only the individual characters count, each
in its current position, compared with the character that
is found in the same position in the other word. The only
exceptions are vowel signs in front of a consonant, which
are treated as standing behind that consonant.
2. If the comparison finds that the
two words are equal, any tone marks and other special
characters are considered in a second step. If this reveals a
difference, it decides the sort order.
3. If the shorter of two different words is identical
to the beginning of the longer word, the shorter one has
the higher sort priority.
Special case: Compound Words
The first two words in the list below, considered
in isolation, are in the right order: their first three
characters are the same, so the shorter word has
the higher priority. Because the third word in the list is
identical to the first once the special character is disregarded,
the tone rule says it should take
second place. However, there are dictionaries in which the
following sequence is applied:
จอง  [ja:wng--]  v. book, hold, reserve
จองตั๋ว  [ja:wng--tu:a \/]  v. purchase a ticket
จ้อง  [ja:wng /\]  v. fix, look at, stare
จ้องดู  [ja:wng /\ du:--]  v. gaze, look at, stare
The purpose of this order is unmistakable.
It does not, however, result from the sorting rules described above.
Applying them would produce the following sequence:
จอง  [ja:wng--]  v. book, hold, reserve
จ้อง  [ja:wng /\]  v. fix, look at, stare
จ้องดู  [ja:wng /\ du:--]  v. gaze, look at, stare
จองตั๋ว  [ja:wng--tu:a \/]  v. purchase a ticket
Both orders, however, are considered correct.
Today, probably more dictionaries are sorted by computer than
by hand. A sorting algorithm that produces the first
form is not easy to program and works only with the help of
a large exception dictionary. Therefore, most current
dictionaries are probably sorted by the rules and show these words in the
sequence of the second list.
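To make the difference concrete, here is a rough Python sketch that produces both sequences for the four example words. Everything in it is an assumption for illustration: rule_key is a simplified version of the rules (it leaves out the reordering of leading vowels, which the sample words do not need, and it collects the special signs in order of appearance), and HEAD_WORD is a hypothetical two-entry stand-in for the large exception dictionary a real program would need.

```python
# Special signs in the priority order of the second sorting step.
SIGNS = "\u0e48\u0e49\u0e4a\u0e4b\u0e47\u0e4c\u0e46\u0e2f"

def rule_key(word):
    """Simplified rule-based key: letters first, special signs as tie-breaker."""
    letters = "".join(c for c in word if c not in SIGNS)
    signs = tuple(SIGNS.index(c) + 1 for c in word if c in SIGNS)
    return (letters, signs)

# Hypothetical exception dictionary: compound word -> head word it is filed under.
HEAD_WORD = {"จองตั๋ว": "จอง", "จ้องดู": "จ้อง"}

def grouped_key(word):
    """Sort by the head word first, then by the word itself."""
    head = HEAD_WORD.get(word, word)
    return (rule_key(head), rule_key(word))

words = ["จ้องดู", "จองตั๋ว", "จ้อง", "จอง"]
by_rule = sorted(words, key=rule_key)     # sequence of the second list
grouped = sorted(words, key=grouped_key)  # sequence of the first list
print(by_rule)
print(grouped)
```

Without an entry in HEAD_WORD a compound simply falls back to the rule-based position, which is why the first form depends so heavily on the size of the exception dictionary.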
Sorting with Excel?
Where word lists and dictionaries exist as a Word file or
Excel spreadsheet, the built-in sorting function cannot be
used on Thai columns. Such programs simply sort by the numeric code
value of each character and pay no attention to the rules for
preceding vowels or special characters.
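The effect is easy to reproduce in Python, whose plain sorted() also compares strings by the code value of each character. In this small sketch each pair is written in dictionary order (to the best of my knowledge); the word pairs are my own examples, not taken from a particular dictionary.

```python
# Each pair is given in dictionary order according to the rules above.
pairs = [
    ["เกม", "ขาย"],  # leading vowel เ counts as standing behind ก, and ก comes before ข
    ["ม้า", "มาก"],  # without its tone mark, ม้า is the shorter word
]

for dictionary_order in pairs:
    by_code_value = sorted(dictionary_order)  # plain comparison by code value
    print(dictionary_order, "->", by_code_value)
```

In both cases the plain sort reverses the pair: the leading vowel and the tone mark have higher code values than the characters they are compared with, so they push their words to the back.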
I need your help!
I'm not a native English speaker and my English is poor.
I've translated this page from my German site because it can be helpful
for Thai students everywhere in the world.