Hindi

Hindi is the main language of North India and the official language of the central government of India. Hindi and Urdu are both standardized versions of Hindustani, the language of the Delhi region. They differ in writing system, with Hindi typically using the Devanagari script and Urdu the Arabic script, and in the source of their higher-register vocabulary: literary Hindi typically draws on Sanskrit, while literary Urdu typically draws on Arabic or Persian. The everyday colloquial spoken registers of the two are much more similar to each other, with the same overall grammatical structure and basic vocabulary.

Hindustani is closely related to other linguistic varieties in the North Indian area such as Braj Bhasha, Haryanvi, Awadhi, and Bhojpuri (some of which are sometimes referred to as "dialects"), as well as other Indo-Aryan languages of South Asia, such as Punjabi, Gujarati, Sindhi, Nepali, and Bengali. Due to migration, there are also some varieties of Hindi spoken in other countries, such as Fiji Hindi and Caribbean Hindi, many of which have a strong Bhojpuri component.

The Devanagari script used for Hindi is an abugida: each base consonant letter carries an inherent short a vowel. The Lexx Rom romanization system for Hindi is largely in line with ISO Indic transliteration.

Main Vowels

IPA        Lexx Rom   Base Vowel   Vowel Mark
ɐ ~ ə      a          अ            -
a:         ā          आ            ा
ɪ          i          इ            ि
i:         ī          ई            ी
ʊ          u          उ            ु
u:         ū          ऊ            ू
ɾɪ         r̥          ऋ            ृ
e:         ē          ए            े
ɛ:         ai         ऐ            ै
o:         ō          ओ            ो
ɔː         au         औ            ौ

The inherent short vowel in Hindi is often described as a schwa /ə/, though its phonetic realization is typically lower, as [ɐ]. The distinction between it and its long counterpart /a:/ is typically neutralized in word-final position to [a].

Similarly, the distinction between short-long counterparts /ɪ/ /i:/ and /ʊ/ /u:/ is neutralized in word-final position to [i] and [u] respectively.
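The word-final neutralization described above can be sketched as a small lookup over a list of segments. This is only an illustration: the function name, the list-of-strings representation of a transcription, and the table name are assumptions made for this example, not part of any existing tool.

```python
# Hypothetical sketch: word-final neutralization of short/long vowel pairs.
# A word is represented as a list of phoneme strings (an assumption of this
# example), using the same symbols as the vowel table above.
FINAL_NEUTRALIZATION = {
    "ɐ": "a", "ə": "a", "a:": "a",   # inherent vowel and its long counterpart
    "ɪ": "i", "i:": "i",
    "ʊ": "u", "u:": "u",
}

def neutralize_final(segments):
    """Return a copy of `segments` with only the last segment neutralized."""
    if not segments:
        return []
    *head, last = segments
    return head + [FINAL_NEUTRALIZATION.get(last, last)]

print(neutralize_final(["k", "ɪ"]))
print(neutralize_final(["n", "ɐ", "d", "i:"]))
```

Only the final segment is looked up, so vowels elsewhere in the word keep their length distinction.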

The vowel letter ऋ <r̥> originally represented a syllabic liquid like /ɹ̩/ in Sanskrit. In modern Hindi it is pronounced as the sequence /ɾɪ/ in words directly derived from Sanskrit. The vowel in this sequence is no different from /ɪ/, but the original Sanskrit vowel letter is still used in the spelling.

/e:/ and /o:/ do not have separate short counterparts, but in terms of derivational morphology, in words where they are reduced they often correspond to short /ɪ/ and /ʊ/ respectively.

/ɛ:/ and /ɔː/ originally represented the diphthongs /ai/ and /au/, but in standard Hindi/Urdu these are now monophthongized. They are often still diphthongs in other Indian languages, so the romanization still reflects their diphthongal origin.

Short [ɛ] and [ɔ] can surface as allophones of /ə/ when two /ə/ surround the consonant /ɦ/.

In Hindi, the inherent short a vowel can be deleted in certain positions, such as syllable-final position. Where it is not pronounced, it is left out of the Lexx Rom romanization.
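As a rough illustration of how the inherent vowel interacts with romanization, the sketch below handles only the simplest case: the inherent a is written after a consonant unless a vowel mark or virama follows, and it is dropped at the end of a multi-letter word. This is a deliberately simplified assumption, not the Lexx Rom implementation. Word-internal deletion is not modeled, and the letter tables are a small demo subset chosen for this example.

```python
# Hypothetical sketch: romanization with word-final inherent-vowel deletion only.
CONSONANTS = {"क": "k", "म": "m", "ल": "l", "न": "n", "र": "r", "ब": "b", "स": "s"}
VOWEL_MARKS = {
    "\u093E": "ā", "\u093F": "i", "\u0940": "ī",
    "\u0941": "u", "\u0942": "ū", "\u0947": "ē", "\u094B": "ō",
}
VIRAMA = "\u094D"  # explicitly suppresses the inherent vowel

def romanize_word(word):
    out = []
    for i, ch in enumerate(word):
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            is_last = i == len(word) - 1
            if is_last and len(word) > 1:
                continue  # assumed: word-final inherent a is not pronounced
            if not is_last and (word[i + 1] in VOWEL_MARKS or word[i + 1] == VIRAMA):
                continue  # an explicit vowel mark or virama replaces the inherent a
            out.append("a")
        elif ch in VOWEL_MARKS:
            out.append(VOWEL_MARKS[ch])
        elif ch == VIRAMA:
            continue
        else:
            raise ValueError(f"letter not in the demo tables: {ch!r}")
    return "".join(out)

print(romanize_word("कमल"))
print(romanize_word("नाम"))
```

A single-consonant word keeps its vowel in this sketch, since there is nothing else in the word to carry the syllable.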

Main Consonants

IPA    Lexx Rom   Devanagari
k      k          क
kʰ     kh         ख
ɡ      g          ग
ɡʱ     gh         घ
ŋ      ṅ          ङ
tɕ     c          च
tɕʰ    ch         छ
dʒ     j          ज
dʒʱ    jh         झ
ɲ      ñ          ञ
ʈ      ṭ          ट
ʈʰ     ṭh         ठ
ɖ      ḍ          ड
ɖʱ     ḍh         ढ
ɳ      ṇ          ण
t̪      t          त
t̪ʰ     th         थ
d̪      d          द
d̪ʱ     dh         ध
n      n          न
p      p          प
pʰ     ph         फ
b      b          ब
bʱ     bh         भ
m      m          म
j      y          य
ɾ      r          र
l      l          ल
ʋ      v          व
ʃ      ś          श
ʂ      ṣ          ष
s      s          स
ɦ      h          ह

The first 25 letters in a typical Indic-template script consist of five series of five letters each. The five letters in a series represent voiceless unaspirated, voiceless aspirated, plain voiced, breathy voiced, and nasal consonants, in that order. The five series are velar (k row), palatal (c row), retroflex (ṭ row), dental (t row), and labial (p row).

The remaining consonant letters in the main Indic-template block consist of two series, one series of sonorants (y, r, l, v) and one series of fricatives (ś, ṣ, s, h).
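The grid layout described above lends itself to a simple index calculation: a letter's position in the 25-letter block gives its series (row) and its manner (column). The following is a minimal sketch of that idea. The function and constant names are assumptions made for this example.

```python
# Hypothetical sketch: classify a Devanagari consonant by its position in the
# traditional letter order (five series of five, then sonorants and fricatives).
SERIES = ["velar", "palatal", "retroflex", "dental", "labial"]
MANNERS = ["voiceless unaspirated", "voiceless aspirated",
           "plain voiced", "breathy voiced", "nasal"]
STOPS_AND_NASALS = "कखगघङचछजझञटठडढणतथदधनपफबभम"  # 25 letters, row by row
SONORANTS = "यरलव"
FRICATIVES = "शषसह"

def classify(letter):
    """Return (series, manner) for a single consonant letter, or None if unknown."""
    if len(letter) != 1:
        return None
    i = STOPS_AND_NASALS.find(letter)
    if i >= 0:
        return SERIES[i // 5], MANNERS[i % 5]  # row = series, column = manner
    if letter in SONORANTS:
        return "sonorant", None
    if letter in FRICATIVES:
        return "fricative", None
    return None

for letter in "कढमश":
    print(letter, classify(letter))
```

The string literal is used rather than a code point range because the Unicode Devanagari block does not keep these 25 letters fully contiguous.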

/ŋ/ and /ɲ/ are not contrastive in Hindi; they only occur before the consonants in their respective series.

The dental series consonants are apical dentals, in which the tip of the tongue must touch the back of the front teeth. Alveolar coronal consonants like /t/ and /d/ in English loanwords are therefore perceived as retroflex series consonants and spelled as such. The dental component is not necessary in the case of /n/, because the retroflex /ɳ/ is generally realized as [ɽ̃] except before another retroflex consonant.

The /pʰ/ sound for some speakers is pronounced as [f].

/ɾ/ can also be realized as a short trill, and when geminate, it is typically realized as a trill [rː].

There is typically no distinction between the palatal and retroflex sibilants /ʃ/ and /ʂ/, with both being pronounced as /ʃ/. Some speakers also do not distinguish this /ʃ/ from /s/.

Other Consonants

Some supplementary consonants in Hindi that aren't part of the core Indic-template consonant block can be indicated through the use of a nuqta, or dot diacritic, placed underneath a base consonant letter.

IPA    Lexx Rom   Devanagari
ɽ      ṛ          ड़
ɽʱ     ṛh         ढ़

Hindi modifies the letters for <ḍ> and <ḍh> with the nuqta to create their flapped counterparts. These sounds originally developed when <ḍ> and <ḍh> became flapped in non-initial position, so they do not typically appear at the beginning of a word. Loanwords that entered Hindi-Urdu after this sound shift can have the retroflex plosive <ḍ> in non-initial position, so there are now minimal pairs between the flaps and the original plosive consonants.

IPA    Lexx Rom   Devanagari
q      q          क़
x      x          ख़
ɣ      ġ          ग़
z      z          ज़
f      f          फ़
ʒ      ž          झ़

These nuqta letters are used for sounds that entered Hindi-Urdu through loanwords, especially loanwords from Arabic, Persian, and English. The nuqta is often omitted in writing, so these words are frequently seen with just the base letter instead. In terms of pronunciation, many speakers do not distinguish these letters from their base letters, but English-educated speakers will typically use /z/ and /f/ where appropriate in an English loanword. As mentioned above, some speakers also use /f/ across the board in place of the original /pʰ/ sound. Some English loanwords with /ʒ/ can occasionally be seen written with झ़, but it is not often distinguished from ज़ or ज.

The letters क़ ख़ ग़ are found in words of Arabic or Persian origin, and so they are generally more characteristic of Urdu-register pronunciation.
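Because the nuqta is often omitted in writing, matching the two spellings of a word requires some normalization. The sketch below is one possible approach using Python's standard `unicodedata` module: it decomposes the text so that every nuqta is a separate combining character (U+093C), then drops the nuqta only after the loanword base letters listed above. The function name and the choice to leave the flaps ड़ and ढ़ untouched are assumptions of this example.

```python
import unicodedata

NUQTA = "\u093C"
# Base letters of the loanword nuqta letters; the flaps are deliberately
# excluded because they contrast with the plain retroflex plosives.
LOAN_BASES = set("कखगजफझ")

def drop_loanword_nuqta(text):
    """Remove the nuqta from loanword letters, returning decomposed text."""
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if ch == NUQTA and out and out[-1] in LOAN_BASES:
            continue  # skip the dot, keep the base letter
        out.append(ch)
    return "".join(out)

# The precomposed letter U+095E followed by the rest of the word
print(drop_loanword_nuqta("\u095E\u093F\u0932\u094D\u092E"))
```

Normalizing first means the function treats the precomposed letters (U+0958 to U+095F) and the base-plus-nuqta sequences identically.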

Other Vowels and Vowel-like Marks

ं (anusvar) before a consonant indicates a homorganic nasal consonant from that same series.

ँ (candrabindu) above indicates nasalization of a vowel. In cases where it would not fit size-wise due to another vowel mark competing for the same space, the nasalization is indicated with the ं anusvar.
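The homorganic reading of the anusvar can be sketched as a lookup in the five-by-five consonant grid: the nasal is the last letter of whichever row contains the following consonant. This is a minimal sketch with hypothetical function names. It assumes every anusvar before a grid consonant is consonantal, which will not hold where the anusvar stands in for candrabindu, and it leaves the anusvar unchanged before any other letter.

```python
ANUSVARA = "\u0902"
VIRAMA = "\u094D"  # joins the nasal to the following consonant
GRID = ["कखगघङ", "चछजझञ", "टठडढण", "तथदधन", "पफबभम"]

def homorganic_nasal(consonant):
    """Return the nasal letter of the series containing `consonant`, else None."""
    if len(consonant) != 1:
        return None
    for row in GRID:
        if consonant in row:
            return row[4]  # the fifth letter of each series is its nasal
    return None

def expand_anusvara(word):
    """Spell out each anusvar as nasal + virama when a grid consonant follows."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        nasal = homorganic_nasal(nxt)
        if ch == ANUSVARA and nasal:
            out.append(nasal + VIRAMA)
        else:
            out.append(ch)
    return "".join(out)

print(expand_anusvara("\u0939\u093F\u0902\u0926\u0940"))  # anusvar before a dental
```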

ः (visarga) indicates a soft h pronunciation after the vowel, optionally followed by a short echo of the vowel sound. For the most part it is used only in words derived directly from Sanskrit, with a couple of exceptions. It is romanized as <ḥ>.

There are two vowel letters with a candra (half-moon) without the bindu dot, that can be used for /æː/ and /ɑː/ in English loanwords:

IPA        Lexx Rom   Base Vowel   Vowel Mark
æː         â          ऍ            ॅ
ɑː ~ ɒː    ô          ऑ            ॉ

Of these, ऍ is now obsolete in Hindi, being replaced with ऐ in English loanwords where the vowel would be used (meaning that in some words ऐ can indicate /æː/ rather than /ɛ:/). The vowel mark ॅ is still used in Marathi, though the base vowel there is considered to be ॲ.

Much as English /ɔː/ is not always distinguished from /ɒː/ or /ɑː/ depending on dialect, the vowel /ɑː/ is not always distinguished from /a:/ (which itself can be realized quite close to [ɑː] in Hindi). As such, it is often written without the candra, as the simple आ.

There are also some other Sanskrit syllabic liquid vowel letters that are extremely rare, even in Sanskrit:

IPA    Lexx Rom   Base Letter   Vowel Mark
rɪ:    r̥̄          ॠ             ॄ
lɪ:    l̥̄          ॡ             ॣ

These letters will practically never be encountered in written Hindi unless one is reading specifically about Sanskrit.