Thai (TH)
Standard Thai is based on Central Thai, also known as Siamese Thai as it is the language of the core region of the kingdom of Siam. It is closely related to Lanna (Kham Mueang or Northern Thai), Isan-Lao (Northeastern Thai), and Thai Tai (Southern Thai).
The Thai script is based on the Old Khmer script, which in turn follows the Indic Brahmic script model that is widespread across South and Southeast Asia. Non-native learners, even those who have never studied an Indic script before, will therefore find it helpful to learn some of that model's basic principles in order to understand the logic behind the script's structure. Additionally, the Thai tonal system developed as part of the East Asian Sinospheric tone system Sprachbund, so it also helps to learn some principles of how these tonal systems work in order to understand how Thai tone marks interact with the Indic-based script.
In the following chart of initial consonants, the column "Ancestral Indic" shows what sound the letter originally transcribed in Sanskrit/Pali, "Ancestral Thai" shows what sound the letter originally transcribed in earlier Thai (different from modern Thai), "Ancestral Lexx Rom" gives an etymological Lexx Rom transcription reflecting those ancestral values, and "Tone Class" shows what tone class the letter belongs to based on its original phonetic value. The letters are named by the initial sound plus -ǫǫ for Middle and Low tone class or -ǫ̌ǫ for High tone class, followed by a word that uses the letter.
Initials
Script | Ancestral Indic | Ancestral Thai | Ancestral Lexx Rom | Tone Class | Modern IPA | Lexx Rom | Name | Name Lexx Rom |
---|---|---|---|---|---|---|---|---|
ก | k | k | k | middle | k | k | ไก่ | ka̩i |
ข | kʰ | kʰ | kh | high | kʰ | kh | ไข่ | kha̩i |
ฃ | - | x | x | high | kʰ | kh | ขวด | khu̩at |
ค | ɡ | ɡ | g | low | kʰ | kh | ควาย | khwaai |
ฅ | - | ɣ | ğ | low | kʰ | kh | คน | khon |
ฆ | ɡʱ | ɡ | gh | low | kʰ | kh | ระฆัง | rákhang |
ง | ŋ | ŋ | ṅ | low | ŋ | ng | งู | nguu |
จ | tʃ | tʃ | c | middle | tʃ | c | จาน | caan |
ฉ | tʃʰ | tʃʰ | ch | high | tʃʰ | ch | ฉิ่ง | chi̩ng |
ช | dʒ | dʒ | j | low | tʃʰ | ch | ช้าง | cháang |
ซ | - | z | z | low | s | s | โซ่ | sôo |
ฌ | dʒʱ | dʒ | jh | low | tʃʰ | ch | เฌอ | chơơ |
ญ | ɲ | ɲ | ñ | low | j | y | หญิง | yǐng |
ฎ | ʈ | ɗ | đ̣ | middle | d | d | ชฎา | chádaa |
ฏ | ʈ | t | ṭ | middle | t | t | ปฏัก | pa̩ta̩k |
ฐ | ʈʰ | tʰ | ṭh | high | tʰ | th | ฐาน | thǎan |
ฑ | ɖ | d | ḍ | low | tʰ | th | มณโฑ | monthoo |
ฒ | ɖʱ | d | ḍh | low | tʰ | th | ผู้เฒ่า | phûuthâo |
ณ | ɳ | n | ṇ | low | n | n | เณร | neen |
ด | t | ɗ | đ | middle | d | d | เด็ก | de̩k |
ต | t | t | t | middle | t | t | เต่า | ta̩o |
ถ | tʰ | tʰ | th | high | tʰ | th | ถุง | thǔng |
ท | d | d | d | low | tʰ | th | ทหาร | tháhǎan |
ธ | dʱ | d | dh | low | tʰ | th | ธง | thong |
น | n | n | n | low | n | n | หนู | nǔu |
บ | p | ɓ | ƀ | middle | b | b | ใบไม้ | bai máai |
ป | p | p | p | middle | p | p | ปลา | plaa |
ผ | pʰ | pʰ | ph | high | pʰ | ph | ผึ้ง | phư̂ng |
ฝ | - | f | f | high | f | f | ฝา | fǎa |
พ | b | b | b | low | pʰ | ph | พาน | phaan |
ฟ | - | v | v | low | f | f | ฟัน | fan |
ภ | bʱ | b | bh | low | pʰ | ph | สำเภา | sǎmphao |
ม | m | m | m | low | m | m | ม้า | máa |
ย | j | j | y | low | j | y | ยักษ์ | yák |
ร | r | r | r | low | r | r | เรือ | rưa |
ล | l | l | l | low | l | l | ลิง | ling |
ว | ʋ | w | w | low | w | w | แหวน | wę̌ęn |
ศ | ʃ | s | ś | high | s | s | ศาลา | sǎalaa |
ษ | ʂ | s | ṣ | high | s | s | ฤๅษี | rưưsǐi |
ส | s | s | s | high | s | s | เสือ | sư̌a |
ห | h | h | h | high | h | h | หีบ | hi̩ip |
ฬ | ɭ | l | ḷ | low | l | l | จุฬา | cu̩laa |
อ | - | ʔ | (') | middle | ʔ | (') | อ่าง | a̩ang |
ฮ | - | ɦ | h̤ | low | h | h | นกฮูก | nók hûuk |
As in many Southeast Asian Indic orthographies, the Indic short a letter is used in Thai as the null/glottal stop initial. In Lexx Rom, this sound can be omitted from transcription at the beginning of a word.
The three tone classes in Thai are High, Middle, and Low, based on the ancestral sound of each letter. The names High, Middle, and Low refer to an areal pattern in East Asia in which voiced initials trigger lower tones on their syllables than voiceless initials do, a cross-linguistically attested acoustic tendency, since voicing lowers the fundamental frequency (pitch) at the onset of the following vowel. However, the names no longer reflect the modern Thai tone reflexes for these letters, due to language change over time.
Voiceless aspirated plosives/affricate (ข kh, ฉ ch, ฐ ṭh, ถ th, ผ ph) and voiceless fricatives (ฃ x, ฝ f, ศ ś, ษ ṣ, ส s, ห h) comprise the High tone class.
Unaspirated voiceless plosives/affricate (ก k, จ c, ฏ ṭ, ต t, ป p) and glottalized initials (ฎ đ̣, ด đ, บ ƀ, อ ') comprise the Middle tone class.
Voiced plosives/affricate (ค g, ช j, ฑ ḍ, ท d, พ b), breathy voiced initials (ฆ gh, ฌ jh, ฒ ḍh, ธ dh, ภ bh, ฮ h̤), voiced fricatives (ฅ ğ, ซ z, ฟ v), and sonorants (ง ṅ, ญ ñ, ณ ṇ, น n, ม m, ย y, ร r, ล l, ว w, ฬ ḷ) comprise the Low tone class.
Some sonorants can also have a ห h written in front of them, which turns them into High tone class versions of that consonant (this does not affect the pronunciation of the initial consonant in modern Thai): หง hṅ, หญ hñ, หน hn, หม hm, หย hy, หร hr, หล hl, หว hw. อ (') can also be written in front of ย y in a few words, changing it to a Middle tone class consonant.
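The class assignments above can be sketched as a simple lookup table. This is a minimal illustration, not part of any existing library: the names `TONE_CLASS` and `initial_tone_class` and the string labels "high", "middle", "low" are assumptions made for this example. Because ห is itself a High class letter and อ a Middle class letter, looking up the first written letter of the initial also covers the ห + sonorant and อ + ย spellings described above.

```python
# Letters of each tone class, as listed in the text above.
HIGH = "ขฃฉฐถผฝศษสห"
MIDDLE = "กจฎฏดตบปอ"
LOW = "คฅฆงชซฌญฑฒณทธนพฟภมยรลวฬฮ"

# Hypothetical lookup table: consonant letter -> tone class label.
TONE_CLASS = {letter: "high" for letter in HIGH}
TONE_CLASS.update({letter: "middle" for letter in MIDDLE})
TONE_CLASS.update({letter: "low" for letter in LOW})


def initial_tone_class(initial: str) -> str:
    """Return the tone class of a written initial.

    The first letter decides the class, so a sonorant preceded by ห
    comes out as "high" and ย preceded by อ comes out as "middle".
    """
    if not initial:
        raise ValueError("initial must contain at least one letter")
    return TONE_CLASS[initial[0]]


print(initial_tone_class("น"))   # plain sonorant
print(initial_tone_class("หน"))  # same sonorant with ห in front
```

The table holds all 44 consonant letters from the initials chart, so a missing letter raises `KeyError` rather than silently defaulting to a class.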
As seen in the chart above by comparing the ancestral sounds to the modern Thai sounds, voiced plosives/fricatives and breathy voiced initials devoiced and became aspirated.
Thai no longer has velar fricatives, so ฃ x and ฅ ğ are now obsolete, pronounced like ข kh and ค g respectively (modern kʰ in High and Low tone classes).
/r/ is generally merged with /l/ in casual Thai, but it is preserved in formal speech, and Thai people generally place great importance on using it when teaching foreigners. In casual Thai, as the second component of an initial cluster, it is commonly dropped.
Note that there is only one series of coronal consonants in Thai; the retroflex consonant letters are used for transcribing Sanskrit/Pali loanwords, and are pronounced the same as their dental counterparts.
As modern Thai does not have a voiced /ɡ/ or /dʒ/ initial, these sounds in modern loanwords are transcribed using the letters for the corresponding voiceless sounds, ก k and จ c. In casual Roman-script spelling, these two letters are often written as g and j, a convention that some Romanization systems also follow for convenience.
Final Codas
While there are many initial consonants in Thai, augmented even further by extra letters dedicated to transcribing Indic sounds, there are only six options for final codas (plus the semivowels -j and -w, which are shown in the rime section below), so each of the dozens of consonant letters must collapse into one of these six options, based on which sound is the most similar.
Final Coda | Other Associated Letters | IPA | Lexx Rom |
---|---|---|---|
-ก | ข,ค,ฆ | -k̚ | -k |
-ง | - | -ŋ | -ng |
-ด | จ,ฉ,ช,ซ,ฌ,ฎ,ฏ,ฐ,ฑ,ฒ,ต,ถ,ท,ธ,ศ,ษ,ส | -t̚ | -t |
-น | ญ,ณ,ร,ล,ฬ | -n | -n |
-บ | ป,ผ,ฝ,พ,ฟ,ภ | -p̚ | -p |
-ม | - | -m | -m |
These six codas -k, -ng, -t, -n, -p, -m are a common set of final codas in the East Asia region. Note that the letters used by default to write finals -t and -p are actually the letters for initial /d/ and /b/.
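The coda table above amounts to a many-to-one mapping from written final letters to six sounds. The following sketch encodes it directly; the names `CODA_GROUPS`, `FINAL_TO_CODA`, `coda_of`, and `is_checked_coda` are hypothetical and chosen only for this example. It covers only the letters listed in the table and ignores the semivowel finals and the glottal coda of bare short vowels.

```python
# Lexx Rom coda -> default letter followed by the other associated letters.
CODA_GROUPS = {
    "k": "กขคฆ",
    "ng": "ง",
    "t": "ดจฉชซฌฎฏฐฑฒตถทธศษส",
    "n": "นญณรลฬ",
    "p": "บปผฝพฟภ",
    "m": "ม",
}

# Invert the groups into a letter -> coda lookup.
FINAL_TO_CODA = {
    letter: coda for coda, letters in CODA_GROUPS.items() for letter in letters
}


def coda_of(final_letter: str) -> str:
    """Return the Lexx Rom coda that a written final consonant collapses into."""
    return FINAL_TO_CODA[final_letter]


def is_checked_coda(final_letter: str) -> bool:
    """True for the stop codas -p, -t, -k, which matter for tone assignment."""
    return coda_of(final_letter) in {"p", "t", "k"}


print(coda_of("ส"), coda_of("ล"), coda_of("ภ"))
```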
Vowels and Rimes
Thai has nine basic vowels, all of which can occur as short or long. The representation of the vowel in the script is sometimes different if there is a final coda consonant coming after it. The names of these letters are composed of the word สระ sa̩ra̩ + the vowel sound. The null/glottal stop initial is used as the base carrier for these vowels.
Basic Vowels
IPA | Script without Coda | Script with Coda | Lexx Rom |
---|---|---|---|
a | ◌ะ | ◌ั◌ | a |
aː | ◌า | = | aa |
i | ◌ิ | = | i |
iː | ◌ี | = | ii |
ɯ | ◌ึ | = | ư |
ɯː | ◌ือ | ◌ื | ưư |
u | ◌ุ | = | u |
uː | ◌ู | = | uu |
e | เ◌ะ | เ◌็◌ | e |
eː | เ◌ | = | ee |
ɛ | แ◌ะ | แ◌็◌ | ę |
ɛː | แ◌ | = | ęę |
o | โ◌ะ | ◌◌ | o |
oː | โ◌ | = | oo |
ɔ | เ◌าะ | ◌็อ◌ | ǫ |
ɔː | ◌อ | = | ǫǫ |
ɤ | เ◌อะ | (∅) | ơ |
ɤː | เ◌อ | เ◌ิ◌,เ◌อ◌ | ơơ |
/ɛ(ː)/ is quite low, close to /æ(ː)/. Short /a/ can approach /ɐ/, though not quite exactly reaching the position of Cantonese /ɐ/.
Short vowels without any coda consonant ending actually carry a light glottal stop coda /ʔ/, though it is not always fully pronounced if not in word-final position, and is sometimes omitted in rapid speech as well. This glottal coda means that these short vowel rimes actually count as checked syllables for the purposes of tonal behavior (discussed in the next section). This glottal stop coda is not indicated in Lexx Rom transcription.
Note that the null/glottal stop initial อ also does double duty as a potential vowel indicator in some vowel rimes, so for example ออ indicates <ǫǫ>.
Falling Diphthongs
IPA | Script | Lexx Rom |
---|---|---|
iə | เ◌ียะ | iah |
iːə | เ◌ีย | ia |
ɯə | เ◌ือะ | ưah |
ɯːə | เ◌ือ | ưa |
uə | ◌ัวะ | uah |
uːə | ◌ัว | ua |
The short falling diphthongs /iə, uə, ɯə/ are marginal to the phonological system and do not occur with a coda letter following, so the default Lexx Rom <ia, ua, ưa> indicate the full length diphthongs /iːə, uːə, ɯːə/.
Other Compound Rimes
IPA | Script | Lexx Rom |
---|---|---|
aj | ไ◌, ใ◌, ไ◌ย, ◌ัย | ai |
aːj | ◌าย | aai |
aw | เ◌า | ao |
aːw | ◌าว | aao |
iw | ◌ิว | iu |
uj | ◌ุย | ui |
uːj | ◌ูย | uui |
ew | เ◌็ว | eo |
eːw | เ◌ว | eeo |
ɛːw | แ◌ว | ęęo |
ɤːj | เ◌ย | ơơi |
ɔj | ◌็อย | ǫi |
ɔːj | ◌อย | ǫǫi |
oːj | โ◌ย | ooi |
iəw | เ◌ียว | iao |
uəj | ◌วย | uai |
ɯəj | เ◌ือย | ưai |
The mark ใ◌ once indicated a different sound /aɰ/, still found in other Tai languages such as Shan, but in Thai, this sound has merged into /aj/, though the spelling still reflects it in a handful of high-frequency words for which the spelling must be memorized. Likewise, words with /aj/ spelled as ไ◌ย, ◌ัย (generally due to etymological transliteration of Indic loans) must have the spelling memorized.
Special Rimes
Thai Script | IPA | Lexx Rom |
---|---|---|
◌ำ | am | am |
ฤ | rɯ | rư |
ฤๅ | rɯː | rưư |
ฦ | lɯ | lư |
ฦๅ | lɯː | lưư |
◌ร | ɔːn | ǫǫn |
ฤ, ฤๅ, ฦ, ฦๅ are letters for the Sanskrit syllabic liquids /r̩, r̩ː, l̩, l̩ː/ <r̥, r̥̄, l̥, l̥̄>, which are relatively uncommon. In some words, the inherent vowel for these letters will be /i/ instead of /ɯ/, which must be memorized word by word.
Tones
Thai has five contrastive tones, though the script does not directly indicate them in a one-to-one corresponding manner. Tones in Thai are indicated through an intricate interaction between initial consonant tone class (refer to the initials section of this guide) and tone mark. Syllables in the Middle tone class that do not have a checked coda (-p, -t, -k, or -ʔ) will be used as reference, because the tones with tone marks correspond neatly as follows:
Tones (ordered by Middle tone class)
IPA | Tone Number | Tone Mark | Description | Lexx Rom |
---|---|---|---|---|
˧ | 1 | ◌ | mid-flat | aa |
˨˩ | 2 | ◌่ | low or low falling | a̩a |
˦˥˧ | 3 | ◌้ | high falling or peaking | âa |
˦˥ | 4 | ◌๊ | high or high rising | áa |
˧˩˦ | 5 | ◌๋ | rising or dipping | ǎa |
Tone 1 is a mid-flat tone that can have some downdrift towards the end.
Tone 2 is a tone that is at the bottom of the vocal register, which cross-linguistically is ambiguous between being a low level tone or a low falling tone.
Tone 3 is a high falling tone that often carries some peaking during the higher segment, so it is often described as a peaking tone as well.
Tone 4 is a high tone that rises higher.
Tone 5 is a low rising tone that can have some dipping at the beginning, and in rapid speech in non-final position it can also surface as a low falling tone (without having time to rise up again).
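As an illustration of how the Lexx Rom column in the tone chart is formed, the sketch below places a tone diacritic on the first vowel letter of an unmarked syllable, which is the placement seen in the transcriptions throughout this guide. The function name `add_tone_diacritic` is hypothetical, and the combining code points are assumptions, in particular U+0329 (vertical line below) for tone 2.

```python
import unicodedata

# Assumed combining marks for tones 2-5; tone 1 stays unmarked.
TONE_DIACRITICS = {
    1: "",
    2: "\u0329",  # vertical line below (assumed code point)
    3: "\u0302",  # circumflex
    4: "\u0301",  # acute
    5: "\u030C",  # caron
}


def add_tone_diacritic(syllable: str, tone: int) -> str:
    """Place the diacritic for `tone` on the first vowel letter of `syllable`."""
    decomposed = unicodedata.normalize("NFD", syllable)
    for i, ch in enumerate(decomposed):
        if ch in "aeiou":
            end = i + 1
            # Step over marks that belong to the vowel letter itself
            # (the horn of ư/ơ, the ogonek of ę/ǫ).
            while end < len(decomposed) and unicodedata.combining(decomposed[end]):
                end += 1
            marked = decomposed[:end] + TONE_DIACRITICS[tone] + decomposed[end:]
            return unicodedata.normalize("NFC", marked)
    raise ValueError(f"no vowel letter found in {syllable!r}")


print(add_tone_diacritic("haa", 3))
print(add_tone_diacritic("thung", 5))
```

Working on the decomposed (NFD) form keeps the logic independent of whether the input uses precomposed letters, and the result is recomposed (NFC) before it is returned.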
High tone class:
For syllables in the High tone class that do not have a checked coda, the default tone if there is no tone mark is instead <ǎa>.
Tone marks ◌่ and ◌้ will result in the same tone as in the above chart for Middle tone class (<a̩a> and <âa>).
Tone marks ◌๊ and ◌๋ are not used with the High tone class.
Low tone class:
For syllables in the Low tone class that do not have a checked coda, the default tone if there is no tone mark is the same as in the Middle tone class (<aa>).
However, the tones when combined with tone marks ◌่ and ◌้ are <âa> and <áa> respectively, which differs from the pattern seen in the Middle/High tone classes. Tone marks ◌๊ and ◌๋ are not used with the Low tone class.
In checked syllables:
For syllables that have a checked coda (-p, -t, -k, or -ʔ), if the initial is Middle or High tone class, the default tone (no tone mark) will be <a̩>.
For syllables with initial in the Low tone class with default tone (no tone mark), one must check another step further and determine whether it is a short vowel rime or a long vowel rime; short vowel rimes will take tone <á>, and long vowel rimes will take tone <âa>. The short <á> in these checked syllables may not have that much time to rise up in tone, so it can surface as a [˦] in rapid speech.
Syllables with checked codas will rarely take tone marks; these only occur in particles, onomatopoeia, or loanwords. If taking a tone mark, the tone becomes the equivalent of whatever tone the syllable would have been if there had not been a checked coda present.
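The rules for the three tone classes and for checked syllables can be summarized as a small decision procedure. The sketch below is one possible encoding under stated assumptions: the function name `thai_tone`, its parameters, and the class labels are hypothetical, and the constants `MAI_EK`, `MAI_THO`, `MAI_TRI`, `MAI_CHATTAWA` follow the Unicode character names of the four tone marks (U+0E48 to U+0E4B). It returns the tone number used in the chart above and does not model the neutralized tone of sesquisyllabic words.

```python
MAI_EK, MAI_THO, MAI_TRI, MAI_CHATTAWA = "\u0E48", "\u0E49", "\u0E4A", "\u0E4B"

# Tone number by tone class and tone mark ("" means no mark),
# for syllables without a checked coda.
UNCHECKED_TONES = {
    "middle": {"": 1, MAI_EK: 2, MAI_THO: 3, MAI_TRI: 4, MAI_CHATTAWA: 5},
    "high": {"": 5, MAI_EK: 2, MAI_THO: 3},
    "low": {"": 1, MAI_EK: 3, MAI_THO: 4},
}


def thai_tone(tone_class: str, mark: str = "", checked: bool = False,
              long_vowel: bool = False) -> int:
    """Return the tone number (1-5) of a syllable.

    tone_class: "high", "middle", or "low" (class of the initial).
    mark:       the tone mark character, or "" if there is none.
    checked:    True if the coda is -p, -t, -k, or the glottal stop.
    long_vowel: only consulted for checked Low class syllables.
    """
    if tone_class not in UNCHECKED_TONES:
        raise ValueError(f"unknown tone class: {tone_class!r}")
    if checked and mark == "":
        if tone_class in ("middle", "high"):
            return 2
        return 3 if long_vowel else 4
    # A tone mark on a checked syllable gives the tone the syllable
    # would have had without the checked coda.
    try:
        return UNCHECKED_TONES[tone_class][mark]
    except KeyError:
        raise ValueError(
            f"mark {mark!r} is not used with the {tone_class} tone class"
        ) from None


print(thai_tone("low", MAI_THO))                        # e.g. ช้าง
print(thai_tone("low", checked=True, long_vowel=True))  # e.g. ฮูก
```

Checking a few words from the initials chart against the function is a quick way to confirm the rules: ช้าง (Low class with ◌้) gives tone 4, โซ่ (Low class with ◌่) gives tone 3, and หีบ (High class, checked) gives tone 2.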
In addition to these tones, there is also a neutralized tone that occurs only on a short a in the first segment of a sesquisyllabic word. This is not indicated with any tone mark in Lexx Rom (recall that the short <a> would otherwise be <a̩> or <á>, depending on the tone class of the initial, so there is no ambiguity).
The general etymological correspondences between Thai tones and tones in other Tai languages, as well as in broader East Asia, are as follows:
Tone 1 corresponds to A2 (Yangping).
Tone 2 corresponds to B1 (Yinqu) and D1 (Yinru).
Tone 3 corresponds to C1 (Yinshang), B2 (Yangqu) and DL2 (Long Yangru).
Tone 4 corresponds to C2 (Yangshang) and DS2 (Short Yangru).
Tone 5 corresponds to A1 (Yinping).
Other Symbols
◌์ on top of a consonant letter cancels its entire sound out, meaning the entire letter is silent, and only retained for etymological purposes.
ๆ reduplicates the prior word.
ฯ marks an abbreviation.
◌ฺ under a consonant letter can be used in Indic transliteration to indicate that the inherent vowel has been canceled out (but not the entire letter as in ◌์). This mark is not used in written Thai otherwise, so it is not important for beginners.
Numerals
Thai | International | Spelling | Lexx Rom |
---|---|---|---|
๐ | 0 | ศูนย์ | sǔun |
๑ | 1 | หนึ่ง | nư̩ng |
๒ | 2 | สอง | sǫ̌ǫng |
๓ | 3 | สาม | sǎam |
๔ | 4 | สี่ | si̩i |
๕ | 5 | ห้า | hâa |
๖ | 6 | หก | ho̩k |
๗ | 7 | เจ็ด | ce̩t |
๘ | 8 | แปด | pę̩ęt |
๙ | 9 | เก้า | kâao |
๑๐ | 10 | สิบ | si̩p |
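Since the Thai digits map one-to-one onto the international digits and are written positionally (as in ๑๐ for 10), converting between the two is a character-for-character substitution. A minimal sketch follows; the helper names `thai_to_int` and `int_to_thai` are hypothetical and introduced only for this example.

```python
THAI_DIGITS = "๐๑๒๓๔๕๖๗๘๙"
ASCII_DIGITS = "0123456789"

# Translation tables in both directions.
_TO_ASCII = str.maketrans(THAI_DIGITS, ASCII_DIGITS)
_TO_THAI = str.maketrans(ASCII_DIGITS, THAI_DIGITS)


def thai_to_int(text: str) -> int:
    """Parse a number written with Thai digits."""
    return int(text.translate(_TO_ASCII))


def int_to_thai(number: int) -> str:
    """Write an integer with Thai digits."""
    return str(number).translate(_TO_THAI)


print(thai_to_int("๑๐"))
print(int_to_thai(2024))
```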