Languages & Dialects
Every language family carries many dialects — each with its own vocabulary, script, and community. Browse what Dialect Atlas currently documents, and watch the directory grow as more communities contribute.
English
21 dialectsAboriginal Australian English
also: AAE
A distinct Australian English variety used widely across Aboriginal communities. Shows substrate influence from many Aboriginal languages and forms a continuum with Kriol in northern Australia.
African American Vernacular English
also: AAVE, Black English, Ebonics
A widespread English variety with a coherent grammatical system, including aspectual habitual "be" and the zero copula. Spoken across the African American community throughout the United States.
Appalachian English
The English of the Appalachian mountains across West Virginia, Kentucky, Tennessee, and western North Carolina. Preserves several archaic Scotch-Irish features lost in other US varieties.
Australian English
also: General Australian
The mainstream variety of Australian English, used by the majority of speakers nationally. The General Australian sociolect, sitting between Broad and Cultivated extremes.
Broad Australian
also: Strine
The most strongly-marked sociolect of Australian English, traditionally rural and outback in association. Famously caricatured as "Strine"; recognisable by its diphthongal vowels and broad nasalisation.
Cajun English
The English of the Cajun communities of southern Louisiana, descended from generations of French-English bilingualism. Distinguished by intonation, vowel realisations, and lexical borrowings from Cajun French.
Canadian English
The English of English-speaking Canada. Closer to General American than to British English in most features, but with its own vocabulary and the well-known Canadian raising.
Chicano English
also: Mexican-American English
A native English variety spoken in Mexican-American communities of the US Southwest, particularly Los Angeles. Despite its name, most speakers are monolingual English speakers.
Cultivated Australian
The historically prestigious Australian English variety, modelled on British Received Pronunciation. Once dominant in broadcasting and law; now rare in everyday speech.
Eastern New England English
also: Boston English
The traditional English of Boston and eastern New England. Non-rhotic in conservative speech, with the cot-caught merger and the famously Boston-broad short-a.
General American
also: Standard American English
The broad accent associated with national US broadcasting and most of the inland north and west. A reference point rather than a single regional variety.
Geordie
also: Tyneside English
The dialect of Newcastle and the Tyneside region. Retains a number of Old English and Norse-derived features lost elsewhere in England.
Hawaiian Pidgin
also: Hawaii Creole English, Pidgin
An English-based creole that emerged on the plantations of Hawaiʻi in the late 19th century. Now native to a substantial share of the Hawaiʻi-born population, with influences from Hawaiian, Portuguese, Cantonese, and Japanese.
Hiberno-English
also: Irish English
The English of Ireland, shaped by long contact with Irish Gaelic. Distinctive grammar, intonation, and vocabulary set it apart from British and American varieties.
Indian English
The pluricentric variety of English used across India. Shaped by long contact with Hindi and other Indian languages, with its own pronunciation, grammar, and lexical norms.
New York English
also: NYC English, New Yorkese
The English of New York City and surrounding areas. Marked by the famous tense-lax split of short-a, raised /ɔ/, and (historically) non-rhoticity.
Newfoundland English
also: Newfie English
The most distinctive variety of Canadian English, descended from 17th- and 18th-century West Country and Hiberno-English settlers. Strongly divergent from mainland Canadian English in vowels and grammar.
Received Pronunciation
also: RP, BBC English, Standard Southern British
The traditional prestige accent of southern England. Long carried by national broadcasting and higher education, but no longer dominant in everyday British speech.
Scottish English
The variety of English spoken across Scotland. Distinct from Scots (which is sometimes counted as a separate language) but heavily influenced by it.
Singlish
also: Singapore Colloquial English
The colloquial English of Singapore, mixing English with grammatical features and lexicon drawn from Hokkien, Malay, Cantonese, and Tamil.
Southern American English
also: Southern Drawl
The dialects of the US South, characterised by the Southern Vowel Shift, the pin-pen merger, and a distinct lexical tradition.
Spanish
17 dialectsAndalusian Spanish
also: Andaluz
The Spanish of southern Spain, marked by aspiration of /s/, the loss of final consonants, and a strong influence on the formation of Latin American varieties via the Atlantic ports.
Andean Spanish
The Spanish of the Andean highlands across Peru, Bolivia, and Ecuador. Heavily influenced in pronunciation, grammar, and lexicon by Quechua and Aymara.
Bolivian Spanish
also: Cochabambino, Camba (Lowland Bolivian)
The Spanish of Bolivia, internally split between Andean (La Paz, Cochabamba) and Lowland Camba (Santa Cruz) varieties. Heavy contact features from Quechua and Aymara in the highlands.
Canarian Spanish
also: Canario
The Spanish of the Canary Islands. Linguistically a bridge between Andalusian and Caribbean varieties, with shared seseo and weakened final consonants.
Caribbean Spanish
The Spanish of Cuba, the Dominican Republic, and Puerto Rico. Shares with Andalusian the aspiration of /s/ and the loss of final consonants, and carries a distinctive lexicon shaped by Taino and African languages.
Castilian Spanish
also: Castellano, Peninsular Spanish
The standard variety of Spain, centred on Madrid and the historical kingdom of Castile. Distinguished from American varieties by the distinción between /θ/ and /s/.
Central American Spanish
The Spanish of Central America, centred on Guatemala City. Linguistically transitional between Mexican and Caribbean varieties, with widespread voseo verb forms.
Chicano Spanish
also: Mexican-American Spanish, US Spanish
The Spanish of Mexican-American communities across the US Southwest. Diverse and bilingual; shaped by long contact with English and characterised by code-switching (Spanglish).
Chilean Spanish
The Spanish of Chile. Known for rapid speech, heavy aspiration of /s/, distinct second-person verb forms, and a rich slang lexicon.
Colombian Spanish
also: Bogotano
The Spanish of Colombia, internally varied by region. The Andean Bogotá variety is widely cited for its conservative phonology and is often described as one of the clearest spoken varieties in the Americas.
Cuban-American Spanish
also: Miami Spanish
The Spanish of the Cuban-American community of Florida, especially Miami. A Caribbean Spanish variety with substantial English contact features and a distinctive bilingual code-switching register.
Ecuadorian Spanish
also: Costeño Ecuadoriano
The Spanish of coastal Ecuador, centred on Guayaquil. Distinct from the Andean Quito variety, with stronger Caribbean Spanish features including /s/-aspiration.
Mexican Spanish
The most widely spoken Spanish variety in the world. Heavily shaped by Nahuatl and other indigenous languages, especially in lexicon, and exported widely through Mexican film and music.
Northern Mexican Spanish
also: Norteño Spanish
The Spanish of northern Mexico, centred on Monterrey and Chihuahua. Distinct from central Mexican Spanish in vowel realisation, intonation, and a distinct lexicon shaped by long contact with US English.
Rioplatense Spanish
also: Río de la Plata Spanish
The Spanish of Argentina and Uruguay. Distinctive for its voseo verb forms, the sheísmo / zheísmo realisation of <ll>/<y>, and a strong Italian-influenced intonation.
Venezuelan Spanish
The Spanish of Venezuela, centred on Caracas. A Caribbean Spanish variety closely related to Cuban and Dominican Spanish, with strong /s/-aspiration and a distinctive lexicon.
Yucatec Spanish
also: Yucatecan Spanish, Español yucateco
The Spanish of the Yucatán peninsula. Heavily shaped by contact with Yucatec Maya, with distinctive intonation, glottalised stops, and a substantial Mayan-derived lexicon.
Arabic
16 dialectsAlgerian Arabic
also: Dziri, Darja
The Maghrebi Arabic of Algeria. Internally varied between urban Tellian, Saharan, and Hilalian rural varieties; carries deep Berber substrate and French contact lexicon.
Chadian Arabic
also: Shuwa Arabic, Baggara Arabic
Spoken across Chad and adjoining parts of Sudan, north-eastern Nigeria, and northern Cameroon. The main Sahelian Arabic variety, used as a lingua franca well beyond Arab communities.
Egyptian Arabic
also: Masri
The most widely understood Arabic dialect, carried across the region by cinema and music.
Gulf Arabic
also: Khaliji
Spoken across the Arabian Peninsula. Retains classical features alongside modern urban slang.
Hassaniya Arabic
A Bedouin-descended Arabic variety spoken across Mauritania, Western Sahara, and adjacent parts of the Sahel. Carries deep Berber substrate and a strong oral poetic tradition.
Hejazi Arabic
also: Hijazi
The Arabic of western Saudi Arabia, centred on Mecca, Medina, and Jeddah. Sits between Egyptian and Najdi varieties and is widely understood across the peninsula.
Iraqi Arabic
also: Mesopotamian Arabic
Spoken across Iraq and parts of eastern Syria and southwestern Iran. Carries strong Aramaic and Turkic influence and splits internally between gilit (southern) and qeltu (northern) varieties.
Levantine Arabic
also: Shami
Spoken across Syria, Lebanon, Palestine, and Jordan. Known for soft consonants and a shared cultural vocabulary.
Libyan Arabic
also: Sulaimitian Arabic
The Maghrebi Arabic of Libya, transitional between western Maghrebi and Egyptian. Tripolitanian and Cyrenaican varieties differ noticeably in vocabulary and phonology.
Maghrebi Arabic
also: Darija
Spoken across Morocco, Algeria, Tunisia, and Libya. Shaped by Berber, French, and Spanish contact.
Moroccan Darija
also: Darija, Moroccan Arabic
The Maghrebi Arabic of Morocco, with heavy Berber substrate and substantial French and Spanish loanwords. Distinct enough from Mashreqi Arabic to function as a separate everyday language.
Najdi Arabic
The Arabic of central Saudi Arabia around Riyadh. Closely related to Gulf Arabic but with distinct vowel and consonant features rooted in the bedouin Najd plateau.
Saidi Arabic
also: Upper Egyptian Arabic
The Arabic of Upper Egypt, from Beni Suef south to Aswan. Conservative relative to Cairene Egyptian Arabic, with retained interdentals and a distinct rural prestige.
Sudanese Arabic
The Arabic of Sudan, retaining a number of older features lost in other dialects and shaped by long contact with Nubian and other African languages.
Tunisian Arabic
also: Tounsi, Derja
The Maghrebi Arabic of Tunisia. Notable for vowel reductions, a rich set of French and Italian loanwords, and a literary written tradition unusual among colloquial Arabic varieties.
Yemeni Arabic
A cluster of conservative Arabic varieties across Yemen. Often cited as preserving features close to Classical Arabic that have shifted elsewhere.
French
11 dialectsAcadian French
The French of the Maritime provinces, especially New Brunswick. Preserves several 17th-century features lost in both Quebec and Metropolitan French.
Belgian French
also: Français de Belgique
The French of Wallonia and Brussels. Maintains a number of vocabulary and number forms (septante, nonante) that the Paris standard has dropped.
Cajun French
also: Louisiana French, Français cadien
The French of the Cajun communities of southern Louisiana, descended from 18th-century Acadian settlers expelled from the Maritimes. Endangered, with active revitalisation through CODOFIL and the Louisiana French immersion programme.
Franco-Ontarian
also: Ontario French, Français ontarien
The French of the historic Francophone communities of Ontario, especially around Sudbury, Ottawa, and the Northern shore. Closely related to Quebec French but with stronger English contact features.
Maghrebi French
The French of North Africa, used as a second language across Algeria, Tunisia, and Morocco. Marked by code-switching with Maghrebi Arabic and Berber.
Meridional French
also: Southern French, Français méridional
The southern French of Marseille, Toulouse, and the broader Occitan-speaking belt. Marked by realisation of mute vowels and a distinct intonation contour.
Métis French
also: Mitchif Français, Prairie French
The French variety of the Métis Nation of the Canadian Prairies. Distinct from Quebec and Acadian French; not to be confused with Michif, the mixed Cree-French language of the same communities.
Metropolitan French
also: Parisian French, Standard French
The Paris-area standard of European French. Carried nationally by education and broadcasting; the reference point for most French taught abroad.
Quebec French
also: Québécois, Joual
The dominant French variety of Canada. Strongly distinct from European French in vowel realisation, lexicon, and the colloquial Joual register.
Swiss French
also: Romand French, Suisse romand
The French of western Switzerland. Like Belgian French, retains older numerals (septante, huitante, nonante) and shows long contact with German-Swiss varieties.
West African French
The French of Francophone West Africa — used in administration, education, and media. Strong contact features from Wolof, Bambara, and other regional languages.
Portuguese
10 dialectsAngolan Portuguese
The Portuguese of Angola, used as a national lingua franca alongside Bantu languages such as Kimbundu and Umbundu, which have shaped both pronunciation and vocabulary.
Azorean Portuguese
also: Açoriano
The Portuguese of the Azores archipelago. Highly distinctive vowels and a strong island-by-island variation, often cited as one of the more divergent European varieties.
Carioca
also: Rio de Janeiro Portuguese, Fluminense
The Portuguese of Rio de Janeiro. Recognisable for its palatalised /s/ in coda position — a feature shared with Lisbon — and a distinctive intonation.
European Portuguese
also: Portuguese (Portugal), Lisbon Portuguese
The standard variety of Portugal, centred on Lisbon. Distinguished from Brazilian Portuguese by reduced unstressed vowels and a denser consonant cluster system.
Gaúcho
also: Sulista, Southern Brazilian Portuguese
The Brazilian Portuguese of Rio Grande do Sul and southern Brazil. Distinct from northern Brazilian varieties, with strong Spanish (Rioplatense) and Italian/German immigrant contact features.
Mineiro
also: Belo Horizonte Portuguese
The Brazilian Portuguese of Minas Gerais and the surrounding interior. Distinguished by characteristic vowel reductions, monophthongisation of diphthongs, and a distinctive lexicon.
Mozambican Portuguese
The Portuguese of Mozambique. A national lingua franca shaped by long contact with Bantu languages including Makhuwa, Sena, and Tsonga.
Nordestino
also: Northeastern Brazilian Portuguese
The Portuguese of Brazil's Northeast region around Recife, Salvador, and Fortaleza. Internally varied, with a distinct lexicon shaped by African and indigenous languages.
Northern Portuguese
also: Portuense, Norte de Portugal
The Portuguese of the north of Portugal around Porto. Preserves several older features lost in the standard, including the bilabial /β/ and the betacism that distinguishes <b> from <v>.
Paulistano
also: São Paulo Portuguese
The Portuguese of São Paulo and the surrounding state. The most populous Brazilian variety and a common reference for the broader Brazilian standard.
Russian
10 dialectsFar Eastern Russian
also: Dalnevostochny govor
The Russian of the Russian Far East around Vladivostok and Khabarovsk. The youngest major regional variety, formed through 19th- and 20th-century settlement; close to standard Russian with regional vocabulary tied to Pacific life.
Kazakhstani Russian
The Russian of Kazakhstan, used by ethnic Russians and as a second language by many Kazakhs. Based on standard Russian but with regional vocabulary, code-switching, and Kazakh contact features.
Moscow Russian
also: Standard Russian, Moskovskoye
The Moscow-area standard of Russian. The reference variety in education, broadcasting, and most second-language teaching.
Northern Russian
The northern Russian dialect group around Vologda and Arkhangelsk. Preserves okanye (clear /o/ in unstressed positions) lost in standard varieties.
Pomor Russian
also: Pomorsky govor
The Russian of the Pomor coastal communities along the White Sea. A North Russian variety with a distinct lexicon shaped by maritime life, fishing, and contact with Karelian and Saami.
Saint Petersburg Russian
also: Petersburg pronunciation
The traditional educated speech of Saint Petersburg. Distinct from Moscow Russian in clearer enunciation of unstressed vowels and a number of lexical preferences.
Siberian Russian
also: Sibirskie govory
The Russian varieties of Western and Central Siberia, descended from northern-Russian settlers. Distinguished by retained okanye, a conservative lexicon, and contact features from indigenous Siberian languages.
Southern Russian
The southern Russian dialect group spanning Voronezh, Rostov, and the broader steppe. Marked by fricative /ɣ/ and a distinct verb-ending system.
Ural Russian
also: Uralsky govor
The Russian of the Ural region around Yekaterinburg and Perm. Mixes Northern and Central Russian features with contact influence from local Finno-Ugric and Turkic languages.
Volga Russian
also: Vladimir-Volga dialects, Middle Russian
The Central Russian dialects of the middle Volga, around Nizhny Novgorod and Vladimir. A transitional zone between the Northern and Southern dialect groups.
German
9 dialectsBairisch
also: Bavarian, Bavarian-Austrian
A large Upper German dialect group spoken across Bavaria and most of Austria, with distinctive vowel shifts and vocabulary far removed from Standard German.
Berlinerisch
also: Berlin German, Berlinisch
The urban dialect of Berlin and surrounding Brandenburg. Mixes Low German substrate with Standard German, famous for its dry humour and clipped delivery.
Hessisch
also: Hessian
A Central German dialect group covering Hesse around Frankfurt. The Frankfurt variety is widely recognisable from German television and theatre.
Kölsch
also: Cologne German, Ripuarian
A Ripuarian Central Franconian dialect spoken in Cologne and the surrounding Rhineland. Distinctive for its melodic intonation and rich local vocabulary.
Plattdüütsch
also: Low German, Niederdeutsch, Plattdeutsch
The Low German varieties of northern Germany and the eastern Netherlands. Linguistically closer to Dutch and English than to Standard German in several core features.
Sächsisch
also: Saxon, Upper Saxon
An East Central German dialect spoken in Saxony around Dresden and Leipzig, characterised by softened plosives and a distinctive intonation.
Schwäbisch
also: Swabian
An Alemannic dialect spoken across Baden-Württemberg and parts of Bavaria. Recognisable for its diminutive "-le" suffix and softened consonants.
Schwyzerdütsch
also: Swiss German, Schweizerdeutsch
A cluster of High Alemannic varieties spoken across German-speaking Switzerland. Mutually unintelligible with Standard German for most non-Swiss listeners.
Wienerisch
also: Viennese
The urban Bavarian dialect of Vienna, layered with Yiddish, Czech, and Italian loanwords from the city's long Habsburg history.
Greek
8 dialectsCappadocian Greek
The Greek of inner Anatolia, isolated for centuries before the 1923 population exchange brought its speakers to Greece. Heavily Turkic-influenced; once thought extinct but rediscovered in central Greece in the 2000s.
Cretan Greek
The Greek of Crete. The literary medium of the late-medieval Cretan Renaissance and the basis of a long oral and musical tradition.
Cypriot Greek
also: Kypriaki
The Greek of Cyprus, distinct from Standard Modern Greek in pronunciation, grammar, and vocabulary. Preserves a number of medieval features.
Griko
also: Italiot Greek, Salentino Greek
The Greek of southern Italy, spoken in two enclaves in Salento and Calabria. A direct descendant of the Magna Graecia varieties; one of the oldest continuously-spoken Greek varieties outside Greece.
Mariupolitan Greek
also: Rumeika, Azov Greek
The Greek of the Mariupol region in Ukraine, descended from communities resettled from Crimea by Catherine the Great in 1778. Endangered; structurally close to Pontic Greek.
Pontic Greek
also: Romeyka, Pontiaka
The Greek variety historically spoken along the Pontic coast of the Black Sea, brought to mainland Greece by refugees of the 1923 population exchange. Its Anatolian-resident form (Romeyka) survives among Muslim communities near Trabzon.
Standard Modern Greek
also: Demotic Greek
The Athens-based standard of modern Greek, derived from the Demotic vernacular and codified in the late 20th century.
Tsakonian
also: Tsakonika
The only living descendant of ancient Doric Greek, spoken in a few villages of the Peloponnese around Leonidio. Critically endangered; structurally distinct from all other Greek varieties, which descend from Attic-Ionic.
Japanese
8 dialectsHakata-ben
also: Fukuoka dialect
The dialect of Fukuoka and the Hakata district on northern Kyushu. Recognisable for sentence-final particles like "to" and "tai".
Hiroshima-ben
also: Geibi dialect
The Chūgoku-region dialect of Hiroshima and surrounding Geibi area. Known to many Japanese viewers through Hiroshima-set film and television.
Hokkaido-ben
The variety spoken across Hokkaido. A relatively young dialect shaped by 19th-century settlement, mostly close to Standard Japanese but with Tōhoku and coastal influences.
Kansai-ben
also: Kansai dialect, Kinki dialect, Osaka-ben
The major dialect group of the Kansai region, with Osaka as its modern centre. Strongly distinct from Tokyo Japanese in pitch accent, copula forms, and idiom. Kyoto-ben is treated separately here.
Kyoto-ben
also: Kyō-kotoba, Kyoto dialect
The Kyoto variety of Kansai Japanese. Historically the prestige speech of the imperial court, marked by softer cadence, distinct honorific forms, and a long literary heritage.
Nagoya-ben
also: Owari dialect
The dialect of Nagoya and the Owari plain. Sits between the Tokyo and Kansai regions and shares features with both, while keeping a distinct vowel inventory.
Tōhoku-ben
also: Tōhoku dialect, Zūzū-ben
The dialects of north-eastern Honshu around Sendai. Distinguished by reduced vowel contrasts and merged sibilants — historically caricatured as "zūzū-ben".
Tokyo Japanese
also: Hyōjungo, Standard Japanese
The Tokyo-area variety that forms the basis of Standard Japanese (Hyōjungo). Carried nationwide by broadcasting, education, and government use.
Romanes
8 dialectsArli
A Balkan Romani variety spoken widely across North Macedonia, Kosovo, southern Serbia, and Bulgaria.
Gurbeti
A Balkan Romani variety with communities across Bosnia, Serbia, and surrounding countries.
Kalderash
A Vlax Romani variety with a long history in Romania and the Balkans. Traditionally associated with coppersmith trades.
Lovari
A Vlax Romani variety with strong Hungarian and Slavic influence. Spoken across Central Europe.
Polish-Baltic Romani
also: Polska Roma, Xaladitka Roma
A northern Romani variety historically rooted in Poland and the Baltic region, shaped by long contact with Slavic and Baltic languages.
Sinte Manouche
also: Sinti, Manush, Sinté
A northwestern Romanes variety spoken by Sinti and Manouche communities across Germany, France, the Benelux, and northern Italy. Shaped by long contact with Germanic and Romance languages.
Slovak Romani
also: Central Slovak Romani, Servika Roma
A Central European Romanes variety spoken by Romanes communities in Slovakia, with strong contact-influenced vocabulary from Slovak, Hungarian, and neighbouring languages.
Zargari
also: Zargari Romani, Zargari Romanes, Zargar Romani
A highly endangered Romani variety spoken in and around the village of Zargar in Qazvin province of north-western Iran, west of Tehran. The community is traditionally said to have been settled in the Safavid period and is one of the easternmost Romani-speaking populations on record; the dialect retains a Romani lexical core but shows heavy contact influence from Persian and Azerbaijani Turkish in phonology, vocabulary, and grammar. Speaker numbers are small and intergenerational transmission has weakened sharply.
Hindi
7 dialectsAwadhi
also: Avadhi
Spoken across the Awadh region of central Uttar Pradesh. Carries a major literary tradition through Tulsidas' Ramcharitmanas.
Bhojpuri
Spoken across eastern Uttar Pradesh, western Bihar, and parts of southern Nepal. Carried abroad by indenture-era diasporas to Mauritius, Fiji, Suriname, and Trinidad.
Braj Bhasha
also: Braj
The variety of the Braj region around Mathura and Vrindavan. A major literary medium of medieval Hindi devotional poetry, especially the bhakti tradition.
Bundeli
also: Bundelkhandi
Spoken in the Bundelkhand region straddling Madhya Pradesh and Uttar Pradesh, around Jhansi. Carries a strong oral epic tradition.
Haryanvi
also: Bangaru
The Western Hindi variety of Haryana and adjacent districts of Delhi. Distinct from Khariboli in vowels, perfective forms, and rural vocabulary.
Khariboli
also: Standard Hindi, Dehlavi
The variety spoken around Delhi and the western Doab. Forms the basis of Modern Standard Hindi and shares its core grammar with Standard Urdu.
Marwari
The Rajasthani variety of the Marwar region around Jodhpur. Often classified separately from Hindi in linguistic literature, but commonly grouped with the wider Hindi-belt continuum.
Italian
7 dialectsLombard
also: Lombardo
The Gallo-Italic variety of Lombardy and Italian-speaking Switzerland. Internally split between Western (Milan) and Eastern (Bergamo) groupings.
Neapolitan
also: Napulitano
The variety of Naples and surrounding Campania. Often classified as a distinct Italo-Romance language; a major literary and operatic medium.
Romanesco
also: Roman Italian
The Italian of Rome. A Central Italian variety widely heard through national cinema and television, occupying a middle position between Tuscan and Southern varieties.
Sardinian
also: Sardu
The Romance variety of Sardinia. The most conservative Romance language, often cited as the closest living variety to Latin in several phonological respects.
Sicilian
also: Sicilianu
The variety of Sicily, often counted as a separate Italo-Romance language. Carries strong Greek, Arabic, Norman, and Spanish influences from the island's history.
Standard Italian
also: Italiano standard, Tuscan-based
The national standard of Italy, historically based on Florentine Tuscan. The reference variety in education and national media.
Venetian
also: Vèneto
The Romance variety of the Veneto region. Linguistically distinct enough from Italian that many linguists classify it as a separate language.
Tamazight
7 dialectsCentral Atlas Tamazight
also: Tamazight, Beraber
The Berber variety of the Middle Atlas mountains in central Morocco. Roughly 2.5 million speakers; the basis of much of the standardised Moroccan Tamazight taught in schools.
Kabyle
also: Taqbaylit
The Berber variety of the Kabylie region in northern Algeria. The most documented and standardised Algerian Tamazight, with a strong written literary tradition.
Nafusi
also: Nafusi Berber, Jebel Nafusa
The Berber variety of the Nafusa mountains in north-western Libya. Around 200,000 speakers; one of the few surviving Berber languages inside Libya.
Siwi
also: Siwa Berber
The easternmost surviving Berber language, spoken in the Siwa Oasis of western Egypt. Heavily restructured by Arabic contact; the only Tamazight variety inside Egypt.
Tamasheq
also: Tuareg Berber
The Berber variety of the Tuareg confederations across the central Sahara. Written historically in the Tifinagh script, which Tuareg communities have preserved continuously.
Tarifit
also: Riffian
The Berber variety of the Rif region in northern Morocco. Heavily distinct from other Tamazight varieties in phonology, including a strong final-vowel reduction.
Tashelhit
also: Shilha, Soussian Berber
The Berber variety of the southern Atlas and Souss valley in Morocco. The most numerous Tamazight variety, with a literary tradition going back centuries.
Ukrainian
7 dialectsBukovinian
also: Bukovyna Ukrainian, Pokuttia–Bukovinian
The Southwestern Ukrainian variety of the Bukovyna region around Chernivtsi. Shaped by long contact with Romanian, German, and Polish under Habsburg rule.
Carpathian Rusyn
also: Rusyn, Ruthenian, Lemko
The East Slavic varieties of the Carpathian highlands across Zakarpattia, eastern Slovakia, and southeastern Poland. Often classified as a separate language; historically grouped with Ukrainian.
Polissian
also: Northern Ukrainian, Polissky govir
The Northern Ukrainian dialect group of the Polissia region, straddling northern Ukraine and southern Belarus. Forms a transitional zone between Ukrainian and Belarusian.
Sloboda Ukrainian
also: Slobozhansky, Slobozhanshchyna Ukrainian
The Southeastern Ukrainian variety of the Sloboda region around Kharkiv and Sumy. Close to the standard, but with a distinctive lexicon shaped by long contact with Russian.
Standard Ukrainian
also: Kyivan Ukrainian
The Kyiv-based standard of Ukrainian. The state language of Ukraine and the second-largest East Slavic language by speakers.
Surzhyk
The Russian-Ukrainian mixed colloquial register, widely spoken across central and eastern Ukraine. Treated by linguists as a code-mixing continuum rather than a fixed dialect.
Western Ukrainian
also: Galician Ukrainian, Lviv Ukrainian
The Ukrainian of Galicia and the western regions around Lviv. Carries contact features from Polish and Slovak and was historically the prestige variety in Habsburg-era Galicia.
Persian
6 dialectsDari
The Persian variety spoken in Afghanistan, sharing deep roots with Farsi while carrying its own vocabulary and pronunciation.
Farsi
The standard Persian variety spoken in Iran, with a rich literary tradition rooted in classical poetry and modern expression.
Hazaragi
A distinctive Persian dialect of the Hazara people, with its own verb forms, vocabulary, and cultural identity rooted in central Afghanistan.
Herati
also: Herati Persian, Herati Dari, Herātī
The Dari of Herat and the Hari Rud valley in western Afghanistan. Transitional between Kabuli Dari and the Khorasani Persian of north-eastern Iran, with strong Iranian features in vocalism and lexicon. Around two million speakers.
Kabuli
also: Kabuli Persian, Kabuli Dari, Kābolī
The Dari variety of Kabul and the prestige basis of standard Afghan Persian as used in broadcast, government, and education. Sociolinguistically dominant across most of eastern and central Afghanistan, with several million speakers.
Tajik
The Persian variety spoken in Tajikistan, written in Cyrillic script with unique vocabulary influenced by Central Asia.
Domari
5 dialectsEgyptian & Sudanese Domari
also: Ghagar, Halebi
Dom communities across Egypt and Sudan, sometimes grouped under broader regional labels. Linguistic documentation remains limited.
Iranian Domari
Dom communities in Iran, historically referred to by various local names. Academic documentation of their speech is sparse.
Jerusalem Domari
also: Domari of Jerusalem, Nawari
The most thoroughly documented Domari variety, spoken by the Dom community of the Old City of Jerusalem. Severely endangered, with only a few hundred fluent speakers; the reference variety in Matras (2012), A Grammar of Domari.
Levantine Domari
Domari spoken by Dom communities across the wider Levant — the West Bank and Gaza beyond Jerusalem, Jordan, Lebanon, and Syria. Documentation outside Jerusalem is uneven and largely lexical.
Turkish Domari
also: Abdal
Dom communities in Türkiye (also known locally as Dom or Abdal), concentrated in southeastern Anatolia, with significant contact influence from Turkish and Kurdish.
Inuit
5 dialectsInuktitut
also: Eastern Canadian Inuktitut
The Inuit language of Nunavut and Nunavik. The largest Eastern Canadian Inuit variety and one of Nunavut's official languages, written in the Inuktitut syllabary.
Inuktun
also: Polar Eskimo, Avanersuarmiutut
The Inuit language of north-western Greenland around Qaanaaq, the world's northernmost Indigenous language community. Linguistically closer to Canadian Inuktitut than to Kalaallisut.
Inuvialuktun
also: Western Canadian Inuit
The Inuit language of the Inuvialuit communities of the western Canadian Arctic. Smaller than Inuktitut and historically split between three sub-varieties: Sallirmiutun, Uummarmiutun, and Kangiryuarmiutun.
Kalaallisut
also: West Greenlandic, Greenlandic
The West Greenlandic standard and the official language of Greenland. The most-spoken Inuit variety, with continuous use in education, literature, and media.
Tunumiisut
also: East Greenlandic
The Inuit language of east Greenland, centred on Tasiilaq. So divergent from West Greenlandic in pronunciation and lexicon that the two are barely mutually intelligible; often classified as a separate language.
Korean
5 dialectsGyeongsang
also: Southeastern Korean, Busan dialect
The Korean of Busan and the south-eastern Gyeongsang region. The most widely recognised non-standard Korean dialect, marked by its pitch-accent system.
Jeju
also: Jejueo
The variety of Jeju Island, often classified as a separate Koreanic language. Carries archaic features lost on the mainland and is critically endangered.
Jeolla
also: Southwestern Korean
The Korean of the south-western Jeolla region around Gwangju. Distinct vowel realisations, sentence-final endings, and a strong rural culinary lexicon.
Koryo-mar
also: Koryo-mal, Koryo-saram Korean
The Korean variety of the Koryo-saram, descended from communities deported from the Soviet Far East in 1937 to Central Asia. A Hamgyŏng-derived dialect heavily influenced by Russian, now critically endangered.
Seoul Korean
also: Standard Korean, Pyojuneo
The Seoul-based standard of South Korean. Carried nationwide by broadcasting and public education, and the basis of most Korean instruction abroad.
Kurdish
5 dialectsBahdini
also: Behdini, Badini, Behdînî, Bahdînî
A southern Kurmanji subdialect spoken in Duhok province of Iraqi Kurdistan and adjacent areas of southeastern Turkey. Mutually intelligible with northern Kurmanji but with distinct phonology and vocabulary, and traditionally written in Arabic script alongside Sorani rather than the Latin script used for Kurmanji further north.
Hawrami
An older and distinctive Kurdish variety spoken in pockets along the Iran-Iraq border, rich in classical poetic tradition.
Kurmanji
The most widely spoken Kurdish variety, used across southeast Turkey, northern Syria, and parts of Iraq and Iran.
Sorani
Spoken across Iraqi Kurdistan and western Iran. The main written language in Iraqi Kurdistan.
Zaza
Spoken in eastern Turkey. Closely related to Kurdish varieties, with its own grammar and vocabulary.
Mandarin
5 dialectsBeijing Mandarin
also: Pekingese, Standard Mandarin (basis)
The Beijing-area variety of Mandarin and the phonological basis of Standard Chinese (Pǔtōnghuà). Distinguished by heavy erhua and clipped syllable-final endings.
Northeastern Mandarin
also: Dōngběi Huà
The Mandarin of China's three Northeastern provinces around Shenyang and Harbin. Close to standard Mandarin but with a distinctive intonation and rich slang.
Sichuanese Mandarin
also: Sìchuānhuà
A Southwestern Mandarin variety spoken across Sichuan and Chongqing. Notably divergent in tones and vocabulary, often partially unintelligible to standard speakers.
Taiwanese Mandarin
also: Guóyǔ, ROC Standard Mandarin
The Mandarin standard of Taiwan. Diverges from Mainland Pǔtōnghuà in pronunciation, lexicon, and tone, with substantial influence from Taiwanese Hokkien.
Yunnan Mandarin
also: Yúnnánhuà
The Southwestern Mandarin of Yunnan province. Carries strong contact influence from neighbouring Tibeto-Burman and Tai-Kadai languages.
Kazakh
4 dialectsEastern Kazakh
also: Semey Kazakh
The Kazakh of eastern Kazakhstan around Öskemen and Semey, near the Altai border. Carries contact features from Mongolic Oirat and Russian.
Kazakh
also: Qazaqşa, Southern Kazakh
A Kipchak Turkic language and the official language of Kazakhstan. The Almaty-area variety, traditionally the southern standard. Currently transitioning from Cyrillic to a Latin-based alphabet.
Northern Kazakh
also: Astana Kazakh
The Kazakh of northern Kazakhstan around Astana and Pavlodar. Closer to Tatar in several lexical and phonological features, with stronger Russian-contact influence.
Western Kazakh
also: Atyrau Kazakh
The Kazakh of western Kazakhstan around Atyrau and Aktobe. The dialect group with the most extensive contact features from Tatar and Bashkir; transitional toward the Volga Kipchak varieties.
Swahili
4 dialectsCongo Swahili
also: Kingwana
The Swahili used as a lingua franca across eastern DR Congo. A simplified contact variety, structurally divergent from the East African standard.
Kimvita
also: Mombasa Swahili
The historical Swahili variety of Mombasa and the Kenyan coast. The classical literary Swahili of the 18th and 19th centuries.
Sheng
The youth vernacular of urban Kenya, especially Nairobi. A code-mixed register drawing on Swahili, English, and various Kenyan languages.
Standard Swahili
also: Kiswahili Sanifu
The Tanzania-based literary and educational standard of Swahili. Modelled on the Zanzibari Kiunguja variety; Tanzania's national language and a continental lingua franca.
Turkish
4 dialectsAnatolian Turkish
A broad cluster of regional dialects across central and eastern Anatolia. Internally varied; many features sit at the boundary with Azerbaijani and other Oghuz Turkic varieties.
Balkan Turkish
also: Rumelice
The Turkish varieties of the Balkans, surviving from Ottoman-era settlement. Spoken across pockets of Bulgaria, North Macedonia, Kosovo, and northern Greece.
Cypriot Turkish
also: Kıbrıs Türkçesi
The Turkish of the northern half of Cyprus. Preserves Anatolian features lost in standard Turkish and shows long contact with Cypriot Greek.
Istanbul Turkish
also: Standard Turkish, İstanbul Türkçesi
The Istanbul-area standard of Turkish. The basis of the modern written language and of broadcast Turkish.
Albanian
3 dialectsArbëresh
also: Italo-Albanian
The Albanian variety of the Arbëreshë communities of southern Italy, descended from 15th-century settlers fleeing the Ottoman conquest. A medieval Tosk Albanian preserved through five centuries of relative isolation.
Gheg Albanian
also: Gegërishte
The northern Albanian dialect group, spoken across northern Albania, Kosovo, and parts of North Macedonia and Montenegro. Distinguished by nasal vowels and the retention of older Albanian features.
Tosk Albanian
also: Toskërishte, Standard Albanian basis
The southern Albanian dialect group, centred on Tirana and the south. The basis of the modern Albanian literary standard codified in 1972.
Bengali
3 dialectsBangladeshi Bengali
also: Bāṅlā, Dhaka Bengali
The Bengali of Bangladesh, with Dhaka as its national centre. Closely related to the West Bengal standard but with its own pronunciation, vocabulary, and Persian-Arabic loan stratum.
Standard Bengali
also: West Bengal Bengali, Kolkata Bengali
The variety of West Bengal centred on Kolkata. The basis of literary Bengali and the medium of a major modern literature including the works of Tagore.
Sylheti
A Bengali variety of the Sylhet region of north-eastern Bangladesh. Often classified as a separate language; the dominant heritage variety of the British Bangladeshi community.
Cantonese
3 dialectsGuangzhou Cantonese
also: Canton Cantonese, Guǎngzhōuhuà
The Cantonese of Guangzhou and the broader Pearl River Delta. The historical prestige Cantonese; closely related to Hong Kong Cantonese but with distinct vocabulary.
Hong Kong Cantonese
also: Yuè, HK Cantonese
The Yue Chinese variety of Hong Kong, the most internationally recognised Cantonese. Prestige variety of Cantonese-language film, music, and broadcasting.
Taishanese
also: Toisanese, Hoisan-wa
A divergent Yue Chinese variety from the Sze Yup region of Guangdong. Once the dominant variety in early Chinese-American communities, especially in San Francisco and New York.
Cree
3 dialectsEastern Cree
also: Iyiyiw-Iyimiwin, James Bay Cree
The Cree varieties of Quebec and Labrador, including the James Bay and Atikamekw communities. The Cree dialect cluster with the most active language-revitalisation infrastructure.
Plains Cree
also: Nēhiyawēwin, y-dialect Cree
The westernmost Cree variety, spoken across Saskatchewan and Alberta. The most populous Cree dialect and the basis of much published Cree literature.
Swampy Cree
also: n-dialect Cree, Maskēkowīhew
The Cree of the Hudson Bay lowlands across northern Manitoba and Ontario. Distinguished from Plains Cree by the use of /n/ where Plains Cree has /j/.
Dutch
3 dialectsFlemish
also: Belgian Dutch, Vlaams
The Dutch varieties of Flanders. Distinguished from Netherlands Dutch in pronunciation, lexicon, and a separate broadcasting standard (VRT-Nederlands).
Standard Dutch
also: Algemeen Nederlands
The Holland-based standard of Dutch. The official language of the Netherlands and one of the official languages of Belgium and Suriname.
Surinamese Dutch
The Dutch of Suriname. The official language of the country, used alongside Sranan Tongo and other languages of the Surinamese plurilingual landscape.
Guarani
3 dialectsBolivian Guarani
also: Eastern Bolivian Guarani
The Guarani varieties of southeastern Bolivia, recognised as one of the country's official Indigenous languages.
Mbyá Guaraní
A Guarani variety distinct from Paraguayan Guaraní, spoken by the Mbyá people across southern Brazil, eastern Paraguay, and north-eastern Argentina. The most widely spoken Indigenous Guaraní variety in Brazil.
Paraguayan Guarani
also: Avañeʼẽ, Jopará
The Guarani of Paraguay. A national co-official language alongside Spanish, used by a majority of the population, often code-mixed with Spanish (Jopará).
Hausa
3 dialectsKano Hausa
also: Standard Hausa
The Kano-based standard of Hausa. A major lingua franca across the West African Sahel, used widely in Nigeria, Niger, and beyond.
Niger Hausa
The Hausa of the Republic of Niger, used alongside French as a national language. Differs from Nigerian Hausa in spelling conventions and contact lexicon.
Sokoto Hausa
also: Sakkwatanci
The Hausa of Sokoto in north-western Nigeria. Carries the historical prestige of the Sokoto Caliphate and a distinct lexicon of religious and political terms.
Indonesian
3 dialectsBahasa Indonesia
also: Standard Indonesian
The national lingua franca and official language of Indonesia, derived from Malay and shaped by extensive contact with Javanese, Sundanese, Dutch, and English.
Bahasa Malaysia
also: Malaysian Malay, Bahasa Melayu
The Malay standard of Malaysia. Shares core grammar with Indonesian but differs in vocabulary, with stronger English borrowings and distinct spelling conventions.
Brunei Malay
also: Bahasa Melayu Brunei
The Malay variety of Brunei. A vernacular distinct from both Bahasa Malaysia and Indonesian, used as the everyday spoken standard alongside formal Bahasa Melayu.
Norwegian
3 dialectsEastern Norwegian
also: Oslo Norwegian, Bokmål-aligned
The Oslo-area variety, the most widely heard spoken Norwegian in national broadcasting. Closely associated with Bokmål, the more Danish-influenced of the two written standards.
Trøndersk
also: Trøndelag Norwegian
The Norwegian of Trøndelag in central Norway, around Trondheim. Distinct from both eastern and western varieties in vowel realisation and intonation.
Western Norwegian
also: Bergen Norwegian, Vestlandsk
The Norwegian of the western fjords, centred on Bergen. The basis of much of Nynorsk, the second written Norwegian standard built from rural dialects rather than Dano-Norwegian.
Punjabi
3 dialectsDoabi
The Punjabi of the Doaba region between the Beas and Sutlej rivers. The variety with the largest Sikh diaspora footprint, especially in the United Kingdom and Canada.
Lahnda
also: Western Punjabi, Saraiki cluster
The Western Punjabi cluster of Pakistan, including Saraiki and Hindko. Often classified as a distinct language group rather than a Punjabi dialect.
Majhi Punjabi
also: Standard Punjabi
The Majhi variety of central Punjab on both sides of the India-Pakistan border. The basis of the literary standard for both Gurmukhi-script Indian Punjabi and Shahmukhi-script Pakistani Punjabi.
Quechua
3 dialectsBolivian Quechua
also: South Bolivian Quechua
The Quechua of the Bolivian highlands. The largest Quechua variety by national share of speakers and one of Bolivia's recognised official languages.
Cusco Quechua
also: Qusqu Runasimi
The Quechua of the Cusco region in southern Peru. Often used as a reference variety for Southern Quechua and the medium of much of modern Quechua-language media.
Kichwa
also: Ecuadorian Quichua
The Northern Quechua varieties of Ecuador, often spelled Kichwa. Distinct in vowel inventory and verb morphology from Southern Peruvian and Bolivian Quechua.
Tagalog
3 dialectsBatangas Tagalog
also: Batangueño
The Tagalog of Batangas province, often cited as more conservative than Manila Tagalog. Preserves several older verb forms and a distinctive intonation.
Bulacan Tagalog
The Tagalog of Bulacan province, just north of Manila. Historically influential as a literary variety in the development of modern Filipino.
Manila Tagalog
also: Filipino, Standard Tagalog
The Manila-area variety that forms the basis of Filipino, the national language of the Philippines. Heavily mixed with English in everyday speech (Taglish).
Tamil
3 dialectsMadurai Tamil
The Tamil of Madurai and the broader southern Tamil Nadu region. Carries a distinctive lexicon and intonation widely featured in Tamil cinema.
Sri Lankan Tamil
also: Jaffna Tamil, Eelam Tamil
The Tamil of northern and eastern Sri Lanka. Often described as more conservative than Indian Tamil, retaining several features lost in the Chennai variety.
Standard Tamil
also: Chennai Tamil
The Chennai-area variety of Tamil that forms the basis of the modern spoken standard. Tamil itself has one of the longest continuous literary traditions in the world.
Thai
3 dialectsCentral Thai
also: Standard Thai, Bangkok Thai
The Bangkok-area variety that forms the basis of the modern Thai standard. The reference variety in education, broadcasting, and most Thai instruction abroad.
Isan
also: Northeastern Thai, Lao-Isan
The Tai language of Thailand's Northeast region. Closer to Lao than to Central Thai; spoken by the largest non-Central-Thai population in the country.
Lanna
also: Northern Thai, Kham Mueang
The Tai language of Thailand's Northern region around Chiang Mai. Carries its own historical script and a literary tradition tied to the former Lanna kingdom.
Vietnamese
3 dialectsHanoi Vietnamese
also: Northern Vietnamese, Standard Vietnamese
The Northern Vietnamese variety of Hanoi. The basis of the national standard and of most Vietnamese taught abroad.
Huế Vietnamese
also: Central Vietnamese
The Central Vietnamese variety of Huế. Preserves a distinctive five-tone system and lexicon rooted in the former Nguyễn-dynasty imperial capital.
Saigon Vietnamese
also: Southern Vietnamese, Hồ Chí Minh City
The Southern Vietnamese variety of Ho Chi Minh City. The most widely heard variety abroad, carried by the post-1975 Vietnamese diaspora.
Yoruba
3 dialectsIfe Yoruba
The Yoruba of Ile-Ife, the religious centre of Yoruba mythology. A southeastern Yoruba variety with distinct vowel inventory and tonal features.
Lagos Yoruba
also: Èkó Yorùbá
The Yoruba of Lagos and the broader Èkó area. Heavily code-mixed with Nigerian English and Pidgin in everyday urban use.
Standard Yoruba
also: Èdè Yorùbá Òde
The Oyo-based standard of literary and broadcast Yoruba. One of the major languages of Nigeria, with substantial communities in Benin and Togo.
Amharic
2 dialectsGojjam Amharic
The Amharic of the Gojjam region in north-western Ethiopia. Often cited as one of the most conservative Amharic varieties.
Standard Amharic
also: Addis Amharic
The Addis-Ababa-based standard of Amharic. Ethiopia's federal working language and one of the major Semitic languages of the Horn of Africa.
Aymara
2 dialectsAzerbaijani
2 dialectsNorth Azerbaijani
also: Azeri, Azerbaijan Turkish
The Oghuz Turkic language of Azerbaijan, centred on Baku. Closely related to Turkish; written in a Latin-based alphabet since the post-Soviet reform.
South Azerbaijani
also: Iranian Azeri, Tabrizi
The Azerbaijani of north-western Iran, centred on Tabriz. The most-spoken Azerbaijani variety; uses a Perso-Arabic script and shows long contact with Persian.
Balochi
2 dialectsEastern Balochi
also: Pakistani Balochi
The Balochi of Pakistan's Balochistan, centred on Quetta. The largest Balochi variety by speakers and the basis of most Balochi-language broadcasting and publishing.
Western Balochi
also: Iranian Balochi, Makrani Balochi
The Balochi of south-eastern Iran around Zahedan and the Makran coast. Phonologically distinct from Eastern Balochi, with strong contact features from Persian.
Bulgarian
2 dialectsRhodope Bulgarian
also: Rodopski govori
The Bulgarian dialects of the Rhodope mountains in southern Bulgaria. Among the most conservative South Slavic varieties; some sub-dialects retain features lost in the standard a millennium ago.
Standard Bulgarian
also: Bălgarski
The Sofia-based standard of Bulgarian. A South Slavic language with one of the most analytical grammars in the family, having lost most case marking in favour of articles and prepositions.
Burmese
2 dialectsMandalay Burmese
also: Upper Burmese
The Burmese of Mandalay and Upper Myanmar. Often described as the more conservative variety, especially in pronunciation of historical consonant clusters.
Standard Burmese
also: Yangon Burmese
The Yangon-area variety that forms the basis of standard Burmese. The national lingua franca and the medium of education and broadcasting in Myanmar.
Cherokee
2 dialectsEastern Cherokee
also: Kituwah, Tsalagi (Eastern Band)
The Cherokee of the Eastern Band in western North Carolina. Descended from the communities that escaped the 1838 Trail of Tears removal; spoken in the Qualla Boundary.
Western Cherokee
also: Otali, Cherokee Nation
The Cherokee of Oklahoma, descended from the communities forcibly removed from the Southeast in 1838. The basis of the modern Cherokee Nation's language programme and the Cherokee syllabary press.
Croatian
2 dialectsČakavian
also: Čakavski
A South Slavic dialect group of the Adriatic coast and islands of Croatia, named for its pronoun "ča". Older than Štokavian and the medium of significant medieval Croatian literature.
Croatian
also: Hrvatski, Standard Croatian
The Zagreb-based standard of Croatian, built on a Štokavian basis. Distinguished from Serbian by lexical preferences and a strict Latin-script orthography.
Czech
2 dialectsMoravian Czech
also: Moravské nářečí
The Czech varieties of Moravia. Internally diverse; eastern Moravian dialects sit close to Slovak in several features.
Standard Czech
also: Spisovná čeština
The Prague-based literary and broadcast standard of Czech. The official language of Czechia and a West Slavic counterpart to Slovak with which it is highly mutually intelligible.
Danish
2 dialectsGreenlandic Danish
The Danish of Greenland, co-official with Kalaallisut. Used widely in administration and education; carries some contact features from Kalaallisut.
Standard Danish
also: Rigsdansk
The Copenhagen-based standard of Danish. A North Germanic language, widely cited for its highly reduced vowels and famously challenging stød (a quasi-tonal feature) for learners.
Gujarati
2 dialectsKathiawari Gujarati
The Gujarati of the Kathiawar peninsula in western Gujarat. Distinct in vowels, verb endings, and a number of regional vocabulary items.
Standard Gujarati
also: Surati / Charotari
The Gujarati of central and southern Gujarat around Ahmedabad and Vadodara. The basis of the literary and broadcast standard.
Macedonian
2 dialectsOhrid Macedonian
also: Western Macedonian, Ohridsko-prespanski
The Western Macedonian dialect group around Lake Ohrid and Prespa. Provided much of the structural basis for the modern literary standard, especially in stress placement and verb morphology.
Standard Macedonian
also: Makedonski
The Skopje-based standard of Macedonian, codified after the Second World War. Closely related to Bulgarian; written in Cyrillic with several letters unique to Macedonian.
Māori
2 dialectsEastern Māori
The Māori varieties of the East Coast of the North Island. Distinguished by tribal lexicon and tonal patterns associated with iwi of the region.
Te Reo Māori
also: Standard Māori
The Māori language of Aotearoa New Zealand. An Eastern Polynesian language and one of New Zealand's official languages, central to ongoing language-revitalisation efforts.
Marathi
2 dialectsStandard Marathi
also: Puneri, Pune Marathi
The Pune-area variety of Marathi that forms the literary and educational standard across Maharashtra.
Varhadi
also: Vidarbha Marathi
The Marathi of the Vidarbha region in eastern Maharashtra. Distinct from Standard Marathi in pronouns, verb endings, and Hindi-contact vocabulary.
Mongolian
2 dialectsInner Mongolian
also: Chakhar Mongolian, Southern Mongolian
The Mongolian of China's Inner Mongolia Autonomous Region, centred on Hohhot. Closely related to Khalkha but written in the traditional vertical Mongolian script and with strong Mandarin contact features.
Khalkha Mongolian
also: Standard Mongolian, Khalkh
The Khalkha-based standard of Mongolia. The official language of the country and the largest Mongolic variety; written in Cyrillic since the mid-20th century, with a planned return to traditional Mongolian script.
Ojibwe
2 dialectsCentral Ojibwe
also: Anishinaabemowin, Chippewa
The largest Ojibwe variety, spoken across Ontario, Manitoba, Minnesota, and Wisconsin. An Algonquian language with a continuous oral tradition and recent literary growth.
Odawa
also: Ottawa, Nishnaabemwin
The Ojibwe variety of Manitoulin Island and the Bruce Peninsula. Distinguished from Central Ojibwe by extensive vowel syncope and a distinct pronoun system.
Pashto
2 dialectsNorthern Pashto
also: Yusufzai Pashto, Peshawari Pashto
The Pashto of Khyber Pakhtunkhwa in Pakistan and parts of eastern Afghanistan. The literary basis of Pakistani Pashto, with the characteristic /ʃ/-realisation of the historical retroflex sibilant.
Southern Pashto
also: Kandahari Pashto
The Pashto of southern Afghanistan around Kandahar. Considered the most conservative Pashto variety, retaining retroflex consonants lost in the northern dialects.
Polish
2 dialectsSilesian
also: Ślōnskŏ gŏdka
The Slavic variety of Upper Silesia, often classified as a separate language from Polish. Carries strong contact features from German and Czech.
Standard Polish
also: Język polski
The Warsaw-based standard of Polish. The official language of Poland and the largest Slavic language by speakers within the European Union.
Romanian
2 dialectsAromanian
also: Arumun, Vlach
An Eastern Romance variety spoken across the southern Balkans. Often classified as a separate language; carries significant contact influence from Greek, Albanian, and Slavic.
Daco-Romanian
also: Standard Romanian
The Bucharest-based standard of Romanian, spoken across Romania and Moldova. The largest Eastern Romance language and the basis of the modern literary standard.
Slovenian
2 dialectsPrekmurje Slovenian
also: Prekmurščina
The Slovenian variety of the Prekmurje region between the Mura and the Hungarian border. Carries a separate literary tradition that developed independently from the central standard until the 20th century.
Standard Slovenian
also: Slovenščina, Ljubljana Slovenian
The Ljubljana-based standard of Slovenian. A South Slavic language with a strikingly diverse internal dialect landscape — historically grouped into seven major dialect bases.
Sorbian
2 dialectsLower Sorbian
also: Dolnoserbski, Niedersorbisch, Wendish
A West Slavic language spoken by the Sorb minority of Lower Lusatia in Brandenburg, centred on Cottbus (Chóśebuz). Critically endangered, with revitalisation efforts in bilingual schools and the Witaj programme.
Upper Sorbian
also: Hornjoserbsce, Obersorbisch
A West Slavic language spoken by the Sorb minority of Upper Lusatia in Saxony, centred on Bautzen (Budyšin). Around 13,000 speakers; carries a continuous literary tradition since the 16th century.
Sotho
2 dialectsSepedi
also: Northern Sotho, Pedi
The Sotho-Tswana variety of Limpopo and Mpumalanga in north-eastern South Africa. One of the eleven official languages of South Africa; mutually distinct enough from Sesotho to be treated separately.
Sesotho
also: Southern Sotho, Sotho
A Sotho-Tswana Bantu language and the national language of Lesotho. Also one of the official languages of South Africa, where it is widely spoken in the Free State and Gauteng.
Swedish
2 dialectsFinland Swedish
also: Finlandssvenska
The Swedish of Finland's Swedish-speaking minority, co-official with Finnish. More conservative than Standard Swedish in pronunciation, retaining several older features.
Standard Swedish
also: Rikssvenska
The Stockholm-based standard of Swedish. The largest North Germanic language; mutually intelligible with Norwegian and Danish, particularly in writing.
Telugu
2 dialectsCoastal Telugu
also: Andhra Telugu
The Telugu of the coastal Andhra region around Vijayawada. Forms the basis of the literary standard and of much of Telugu cinema.
Telangana Telugu
also: Hyderabadi Telugu
The Telugu of Telangana, centred on Hyderabad. Distinguished from Coastal Telugu in lexicon and shaped by long contact with Dakhini Urdu.
Zulu
2 dialectsStandard Zulu
also: isiZulu
The largest Bantu language by first-language speakers in South Africa. Used widely as a second language across KwaZulu-Natal and Gauteng.
Urban Zulu
also: Tsotsitaal-influenced isiZulu
The Zulu of South Africa's townships and major cities. Heavily code-mixed with English, Tsotsitaal, and other Bantu languages.
Other languages
210 dialectsAbkhaz
also: Apswa byzshwa
A Northwest Caucasian language of Abkhazia. Famous for one of the smallest vowel inventories of any natural language (just two phonemic vowels) and a correspondingly large consonant system.
Acholi
also: Acoli, Lwo
A Western Nilotic language of the Luo cluster, spoken across northern Uganda and adjacent South Sudan. Closely related to Lango and to the Luo of western Kenya.
Adyghe
also: West Circassian
A Northwest Caucasian language of the Adygea Republic in Russia. Together with Kabardian, makes up the Circassian language continuum, with a large diaspora in Turkey and the Levant.
Afar
also: Qafar
A Cushitic Afroasiatic language of the Afar people, spoken across the Afar Triangle of north-eastern Ethiopia, southern Eritrea, and Djibouti. Around 2 million speakers.
Afrikaans
A West Germanic language descended from 17th-century Cape Dutch, with substantial substrate from Khoisan, Malay, and Bantu languages. One of the official languages of South Africa and widely spoken in Namibia.
Ainu
A language isolate of the indigenous Ainu people of Hokkaido and historically Sakhalin and the Kuril Islands. Critically endangered with only a handful of native speakers; subject of an active revitalisation effort.
Akan / Twi
also: Akan
A Kwa Niger-Congo language and the largest indigenous language of Ghana. The Asante Twi variety, centred on Kumasi, is the most widely spoken and the basis of Akan-language broadcasting.
Asháninka
An Arawakan language spoken in the central Peruvian Amazon and adjacent western Brazil. The most-spoken Amazonian Arawakan language, with around 100,000 speakers.
Avar
also: Avar macʻ
A Northeast Caucasian language of Dagestan, the largest of the Daghestanian languages. Used as a regional lingua franca across many of Dagestan's small mountain communities.
Bahamian Creole
also: Bahamian Dialect
An English-based creole spoken across the Bahamas. Closely related to Gullah and to the English-based creoles of the wider Caribbean.
Bakhtiari
A southwestern Iranian language of the Bakhtiari tribal confederation in the central Zagros mountains. Sister to Luri, with a strong oral tradition tied to the seasonal migrations of pastoral life.
Bambara
also: Bamanankan
The largest Mande Niger-Congo language and the lingua franca of Mali. Closely related to Dyula and Maninka in a wider Manding dialect continuum across West Africa.
Bashkir
also: Başqortsa
A Kipchak Turkic language of Bashkortostan in the southern Urals. Closely related to Tatar but with distinct phonology and lexicon shaped by the Bashkir steppe heritage.
Basque
also: Euskara
A language isolate of the western Pyrenees, with no proven relatives anywhere in the world. The only pre-Indo-European language to survive in Western Europe; co-official in the Basque Country and Navarre.
Beja
also: Bedawi
The northernmost Cushitic language, spoken across the Red Sea hills of Sudan, Eritrea, and Egypt. Sometimes classified as its own North Cushitic branch within the Afroasiatic family.
Bemba
also: Chibemba, Icibemba
A Bantu language and the most-spoken indigenous language of Zambia, used across the Northern, Luapula, and Copperbelt provinces. A major lingua franca on the Zambian Copperbelt.
Blackfoot
also: Siksiká
A Plains Algonquian language of the Blackfoot Confederacy across southern Alberta and northern Montana. Around 5,000 speakers; one of the most divergent Algonquian languages.
Bosnian
also: Bosanski
The Sarajevo-based standard of Bosnian. A Štokavian-based BCMS standard, distinguished by retention of Turkish, Arabic, and Persian loanwords from Ottoman rule and the use of both Latin and Cyrillic scripts.
Breton
also: Brezhoneg
A Brittonic Celtic language of Brittany in north-western France. The only continental Celtic survivor; brought from south-western Britain in the 5th-6th centuries.
Burushaski
A language isolate spoken in the Hunza, Yasin, and Nagar valleys of Gilgit-Baltistan, northern Pakistan. Around 100,000 speakers; the most-spoken language isolate of Asia.
Buryat
also: Buryat-Mongolian, Buryad
A northern Mongolic language spoken across the Buryat Republic in Russia, northern Mongolia, and parts of north-eastern China. Traditional Buddhist literary language of the Trans-Baikal region.
Cape Verdean Creole
also: Kriolu, Kabuverdianu
A Portuguese-based creole and the everyday language of Cape Verde, used by the entire population alongside Portuguese. The oldest still-spoken creole language, with a continuous documented history since the 16th century.
Catalan
also: Català
A Western Romance language, official in Andorra and co-official across Catalonia, the Balearic Islands, and the Valencian Community of Spain. Around 10 million speakers, with strong literary tradition since the medieval period.
Cebuano
also: Bisaya, Binisayâ
A Visayan Austronesian language of the central and southern Philippines. The most widely-spoken language of the Visayas and Mindanao, often counted as the largest Philippine language by L1 speakers.
Central Alaskan Yupʼik
also: Yugtun
The largest Yupʼik language, spoken across south-western Alaska. A sister language to Inuktitut within the Eskaleut family, with around 10,000 speakers.
Chechen
also: Noxçiyn mott
A Northeast Caucasian (Nakh) language and the official language of the Chechen Republic in Russia. The largest Northeast Caucasian language, with around 1.5 million speakers.
Cheyenne
also: Tsėhésenėstsestȯtse
A Plains Algonquian language of the Northern and Southern Cheyenne tribes, in Montana and Oklahoma. Around 1,700 speakers; subject of substantial linguistic documentation since the 19th century.
Chichewa
also: Nyanja, Chinyanja
A Bantu language and the national language of Malawi, also widely spoken in eastern Zambia and northern Mozambique under the name Nyanja. Around 14 million speakers.
Chickasaw
also: Chikashshanompa
A Muskogean language of the Chickasaw Nation of Oklahoma. Closely related to Choctaw; critically endangered with active revitalisation through the Chickasaw Nation Language Department.
Choctaw
also: Chahta anumpa
A Muskogean language with two surviving communities — the Mississippi Band of Choctaw Indians and the Choctaw Nation of Oklahoma, descendants of the Trail-of-Tears removal. Around 10,000 speakers.
Chukchi
also: Lygʼoravetlʼan
A Chukotko-Kamchatkan language of the Chukotka Peninsula in north-eastern Siberia. Around 5,000 speakers; one of the few languages with male and female speech-form distinctions.
Chuvash
also: Çăvaşla
The sole surviving member of the Oghur branch of Turkic, spoken in Chuvashia in the Volga region. Linguistically the most divergent living Turkic language, with features lost everywhere else in the family.
Comorian
also: Shikomori, Shikomoro
A Bantu language cluster of the Comoros archipelago and Mayotte, closely related to Swahili. Four mutually intelligible varieties, one per major island; around 1 million speakers.
Crimean Tatar
also: Qırımtatarca
A mixed Turkic language of the Crimean peninsula, descended from contact between Kipchak and Oghuz varieties. Critically endangered after the 1944 Soviet deportation; subject of an active revitalisation effort.
Dakota
also: Dakhótiyapi, Santee-Sisseton
The eastern Siouan language closely related to Lakota. Spoken by the Santee, Sisseton-Wahpeton, and Yankton communities across Minnesota, the Dakotas, and southern Manitoba.
Daur
also: Dagur
A divergent Mongolic language spoken in Heilongjiang and Inner Mongolia in north-eastern China. Geographically isolated from the rest of the family and shaped by long contact with Manchu and Mandarin.
Dhivehi
also: Maldivian
An Indo-Aryan language and the official language of the Maldives. Closest living relative of Sinhala, with which it shares a 12th-century separation date.
Diné Bizaad
also: Navajo
A Southern Athabaskan (Na-Dené) language spoken across the Navajo Nation in Arizona, New Mexico, and Utah. The most-spoken Indigenous language north of Mexico, with around 170,000 speakers.
Dinka
also: Thuɔŋjäŋ
A Western Nilotic language and the largest language of South Sudan. Internally varied, with Padang, Bor, Agar, Rek, and Twic as major dialect groups.
Dogon
also: Dogoso cluster, Toro So, Tomo Kan, Jamsay
A Niger-Congo branch of around twenty deeply differentiated varieties on and around the Bandiagara escarpment in central Mali, with smaller populations in north-western Burkina Faso. Around 600,000 speakers total; descriptive work since the 2000s treats several Dogon varieties (Toro So, Tomo Kan, Jamsay, others) as separate languages rather than dialects of a single Dogon language.
Dungan
also: Hui Chinese (Central Asia)
A Sinitic language descended from late-19th-century Hui Muslim refugees who fled into the Russian Empire after the Qing-era Hui revolts. Written in Cyrillic, with substantial Russian, Persian, and Turkic loans.
Dyula
also: Jula, Dioula
A Manding Niger-Congo language used as a trade lingua franca across northern Côte d'Ivoire, western Burkina Faso, and adjoining areas. Closely related to Bambara and Maninka.
Dzongkha
A Tibetic language and the official language of Bhutan, derived from Old Tibetan. Closely related to Standard Tibetan but increasingly diverging through political separation.
Elfdalian
also: Övdalian, Älvdalska
A heavily archaic North Germanic variety spoken in Älvdalen in northern Dalarna, Sweden. Often classified as a separate language; preserves Old Norse features lost in all other living varieties.
Enga
A Trans-New Guinea language and the largest Papuan language by speakers, used across Enga Province in the western highlands of Papua New Guinea. Around 250,000 speakers.
Estonian
also: Eesti
A Finnic Uralic language and the official language of Estonia. Closely related to Finnish but distinct in vocabulary and a partially restructured case system.
Evenki
A Tungusic language of indigenous reindeer-herding communities across central and eastern Siberia and northern China. The most geographically dispersed Tungusic language.
Ewe
also: Eʋegbe
A Kwa Niger-Congo language spoken across south-eastern Ghana and southern Togo. Closely related to Fon and the major language of the Ewe homeland on both sides of the Volta.
Faroese
also: Føroyskt
The North Germanic language of the Faroe Islands. Closely related to Icelandic and Western Norwegian; co-official with Danish in the islands.
Fijian
also: Vosa Vakaviti
An Austronesian language and an official language of Fiji. Often used alongside Fiji English and Fiji Hindi, the country's three official languages.
Finnish
also: Suomi
A Finnic Uralic language and an official language of Finland and the EU. Famous for its agglutinative grammar and vowel harmony, with a rich oral tradition recorded in the Kalevala.
Fon
also: Fongbe, Fon-Gbe
A Gbe Niger-Congo language and the largest language of Benin. Closely related to Ewe; the historical language of the Kingdom of Dahomey and a major source of Vodun religious vocabulary.
Fulani
also: Fula, Fulfulde, Pulaar
An Atlantic Niger-Congo language spoken by Fulani (Peul) communities across the Sahel from Senegal to Sudan. One of the most geographically spread African languages, with 25-30 million speakers.
Gagana Sāmoa
also: Samoan
The Polynesian language of the Samoan Islands. The official language of Samoa and one of the official languages of American Samoa, with sizeable diaspora communities in New Zealand and the United States.
Gagauz
also: Gagauz dili
An Oghuz Turkic language spoken by the Christian Gagauz people of southern Moldova. The official language of the autonomous Gagauz region (Gagauz Yeri), distinct from neighbouring Romanian and Bulgarian.
Galician
also: Galego
A Western Romance language closely related to Portuguese, co-official in Galicia in north-western Spain. The two were a single language as late as the 13th century.
Garífuna
An Arawakan language with major African and Carib influence, spoken along the Caribbean coast of Central America. Descended from communities deported from St. Vincent in 1797; a UNESCO Masterpiece of Intangible Heritage.
Georgian
also: Kartuli
The largest Kartvelian language and the official language of Georgia. Written in the unique Mkhedruli script, with a literary tradition dating back to the 5th century.
Gilaki
A northwestern Iranian language spoken across Iran's Caspian province of Gilan around Rasht. Distinct from Persian in case marking and a richer verb system retaining older Iranian features.
Hadza
also: Hadzane
A click-language isolate spoken by the Hadza hunter-gatherers around Lake Eyasi in northern Tanzania. Fewer than 1,000 speakers; no demonstrated genetic relationship to any other language.
Haitian Creole
also: Kreyòl ayisyen
A French-based creole that emerged on Saint-Domingue plantations. Co-official with French in Haiti since 1987, and the everyday language of nearly all Haitians.
Hakka
also: Kèjiā huà
A Sinitic language spoken across pockets of southern China and Taiwan by the Hakka diaspora communities. Around 35 million speakers, with major communities in Guangdong, Fujian, and Taiwan.
Halkomelem
also: Hulʼqʼumiʼnumʼ
A Coast Salishan language of the lower Fraser river valley and southeastern Vancouver Island. Around 200 fluent speakers across Upriver, Downriver, and Island dialect groups.
Hmong
also: Miao
A cluster of Hmong-Mien languages spoken across the highlands of southern China, northern Vietnam, Laos, and Thailand. Carried abroad through the post-Vietnam-War Hmong diaspora to the United States, France, and Australia.
Hopi
A Northern Uto-Aztecan language of the Hopi people of north-eastern Arizona. Around 5,000 speakers; the language famously analysed by Whorf in his work on linguistic relativity.
Hungarian
also: Magyar
A Ugric Uralic language, the largest of the Uralic family by speakers and the official language of Hungary. Reached its current location through the Magyar migrations of the 9th century.
ʻŌlelo Hawaiʻi
also: Hawaiian
The indigenous Polynesian language of the Hawaiian Islands. Co-official with English in Hawaiʻi and the focus of an ongoing revitalisation movement after near-extinction in the 20th century.
Icelandic
also: Íslenska
The most conservative living North Germanic language, retaining a complex inflectional system close to Old Norse. Modern Icelanders can still read 13th-century sagas with limited difficulty.
Igbo
A Volta-Niger language of south-eastern Nigeria. One of the three major languages of Nigeria, with a literary standard codified in the late 20th century from a heavily dialect-fragmented base.
Ilocano
also: Iloko, Ilokano
An Austronesian language of north-western Luzon, the third most-spoken language of the Philippines. Used widely as a regional lingua franca across northern Luzon.
Irish
also: Gaeilge
A Goidelic Celtic language and the first official language of Ireland. Daily-spoken communities (the Gaeltachtaí) survive in pockets along the western seaboard, around Galway, Donegal, and Kerry.
Jamaican Patois
also: Patwa, Jamaican Creole
An English-based creole that emerged on Jamaican plantations from contact between English and West African languages. Native to most Jamaicans and used widely in music, literature, and everyday life.
Javanese
also: Basa Jawa
The largest Austronesian language by speakers, used across central and eastern Java and as a regional lingua franca. Distinguished by an intricate honorific system (krama / madya / ngoko).
Juhuri
also: Judeo-Tat, Judeo-Tati, Mountain Jewish, Juhuro, Juwuri
A Southwestern Iranian language of the Mountain Jews of the eastern Caucasus, historically classed as a Judeo-Tat variety. Spoken in Krasnaya Sloboda (Quba) in Azerbaijan and Derbent in Dagestan, with substantial Hebrew, Aramaic, and Azerbaijani influence. Written historically in Hebrew script, then Cyrillic from the Soviet period onward.
Juǀʼhoan
also: Juǀʼhoansi, !Kung
A Kxʼa click language of the Kalahari, spoken by Juǀʼhoan San communities across north-eastern Namibia and north-western Botswana. The most thoroughly documented of the surviving San languages.
Kaingang
The largest southern Macro-Jê language, spoken across communities in Paraná, Santa Catarina, and Rio Grande do Sul. Around 30,000 speakers, with literary materials in Kaingang since the late 19th century.
Kalmyk
also: Kalmyk Oirat, Khalimag
The only Mongolic language of Europe, spoken in the Republic of Kalmykia by the descendants of 17th-century Oirat migrants from Dzungaria. Endangered but officially recognised.
Kamba
also: Kikamba
A Bantu language of south-eastern Kenya, closely related to Kikuyu. Around 4 million speakers across Machakos, Kitui, and Makueni counties.
Kanuri
A Saharan Nilo-Saharan language and the historical court language of the Kanem-Bornu empire. Around 13 million speakers across north-eastern Nigeria, southern Niger, eastern Chad, and northern Cameroon.
Karakalpak
also: Qaraqalpaqsha
A Kipchak Turkic language closely related to Kazakh, spoken in the autonomous Karakalpakstan region of Uzbekistan around the Aral Sea.
Karelian
also: Karjala
A Finnic Uralic language spoken across the Republic of Karelia in Russia and adjacent parts of eastern Finland. Closely related to Finnish but distinct enough to be classified as a separate language.
Kashmiri
also: Koshur
A Dardic Indo-Aryan language of the Kashmir valley, traditionally written in a Perso-Arabic script (Naskh) but also in Devanagari and the Sharada heritage script.
Ket
The last living Yeniseian language, spoken along the middle Yenisei in central Siberia. Around 200 elderly speakers; recently linked to the Na-Dené family of North America in the controversial Dene-Yeniseian hypothesis.
Khmer
also: Cambodian
An Austroasiatic language and the official language of Cambodia. The largest non-Tai-Kadai language of mainland Southeast Asia, with a literary tradition dating back to the 7th century.
Khoekhoe
also: Nama, Damara
The largest Khoisan language, spoken across Namibia and parts of Botswana and South Africa. The most-documented click language; one of the official languages of Namibia.
Kikongo
also: Kongo
A Bantu language of the Kongo people, spoken across Angola, the DRC, and the Republic of the Congo. Carried abroad in the Atlantic slave trade and a major substrate in Caribbean and Brazilian creoles.
Kikuyu
also: Gikuyu, Gĩkũyũ
A Bantu language and the largest indigenous language of Kenya, spoken across the central highlands around Mount Kenya. Around 8 million speakers.
Kimbundu
also: North Mbundu
A Bantu language of north-western Angola including the Luanda hinterland. Around 4 million speakers; a major substrate in Brazilian Portuguese vocabulary through the Atlantic slave trade.
Kinyarwanda
The Bantu national language of Rwanda. Mutually intelligible with Kirundi (Burundi) and one of the few African languages spoken by the entire population of a country.
Komi
also: Komi-Zyrian
A Permic Uralic language of the Komi Republic in northern European Russia. One of the oldest written Uralic languages, with a literary tradition going back to the 14th-century missionary Stephen of Perm.
Koryak
also: Nymylan
A Chukotko-Kamchatkan language of the Kamchatka peninsula. Spoken by Koryak reindeer herders and coastal communities; sister language to Chukchi within the Chukotian sub-branch.
Krio
also: Sierra Leonean Creole
An English-based creole and the lingua franca of Sierra Leone, used by virtually the entire population. Descended from the speech of liberated Africans resettled in Freetown in the 19th century.
Kriol
also: Australian Kriol
An English-based creole spoken across northern Australia by Aboriginal communities, especially in the Northern Territory and the Kimberley. Native to many speakers and used in bilingual schools and broadcasting.
Kuman
also: Simbu
A Trans-New Guinea language of the Chimbu / Simbu Province in the central highlands of Papua New Guinea. Around 80,000 speakers.
Kwakʼwala
also: Kwakiutl
The largest Wakashan language, of the Kwakwakaʼwakw peoples on northern Vancouver Island and the adjacent BC mainland. Famous to anthropology through Franz Boas's field documentation.
Kyrgyz
also: Kyrgyzcha
A Kipchak Turkic language and the official language of Kyrgyzstan. Carries the Manas oral epic, one of the longest oral epic traditions in world literature.
Kʼicheʼ
also: Quiché
The largest Mayan language of Guatemala, spoken across the western highlands. The language of the Popol Vuh, the foundational K'ichean creation epic.
Lakota
also: Lakȟótiyapi, Teton Sioux
A Western Siouan language spoken across the Sioux nations of the Great Plains, with major communities at Pine Ridge, Rosebud, and Standing Rock. Distinct from Dakota in core vocabulary and pronouns.
Lao
A Tai-Kadai language and the official language of Laos. Closely related to Thai Isan; written in a script derived from the same source as Thai but distinct in several letters and vowel signs.
Lezgian
also: Lezgi
A Northeast Caucasian language of southern Dagestan and northern Azerbaijan. Around 800,000 speakers; one of the most studied Daghestanian languages.
Lingala
A Bantu language used as a lingua franca across the western DRC and the Republic of the Congo, with Kinshasa as its largest urban centre. The language of much of Congolese popular music.
Luba-Kasai
also: Tshiluba, Western Luba
A Bantu language of the Kasai region in the southern DRC and one of the four national languages of the country. Around 6 million speakers.
Luganda
also: Ganda
A Bantu language of central Uganda, the most-spoken Ugandan Bantu language and the historical court language of the Buganda kingdom. Widely used as a lingua franca in Kampala.
Luo
also: Dholuo
A Western Nilotic language of the Luo people of western Kenya around Lake Victoria. Related to Acholi and to the Dinka and Nuer of South Sudan.
Luri
also: Lori
A southwestern Iranian language of the Lur people in Iran's western Zagros region. Closely related to Persian but linguistically distinct, with several million speakers across Lorestan, Khuzestan, and Ilam.
Lushootseed
also: Puget Salish
A Salishan language of the Coast Salish peoples around Puget Sound in western Washington. Critically endangered; subject of an active revitalisation through the Tulalip and Muckleshoot tribes.
Maasai
also: Maa, Olmaa
A Nilotic language spoken by the Maasai people across southern Kenya and northern Tanzania. Closely related to Samburu and Camus within the Maa cluster.
Macushi
also: Makushi
A Cariban language spoken across the Roraima savannas of northern Brazil, southwestern Guyana, and adjacent Venezuela. Closely related to Pemón.
Makhuwa
also: Emakhuwa, Macua
A Bantu language and the largest indigenous language of Mozambique, spoken across the northern provinces of Nampula, Cabo Delgado, and Niassa. Around 7 million speakers.
Malagasy
The Austronesian language of Madagascar, descended from settlers who crossed the Indian Ocean from Borneo around 1,500 years ago. The westernmost outpost of the Austronesian family.
Maltese
also: Malti
A Semitic language descended from medieval Maghrebi Arabic, with extensive Italian and English influence. The only Semitic language written in Latin script and the only one official in the EU.
Mam
A Mamean Mayan language spoken across the western highlands of Guatemala and adjacent parts of Chiapas, Mexico. Around 600,000 speakers.
Manchu
A Tungusic language and the historical court language of the Qing dynasty (1644-1912). Now critically endangered, with only a handful of native speakers in Heilongjiang.
Maninka
also: Malinke, Mandinka
A Manding Niger-Congo language of Upper Guinea and adjacent Mali. The historical language of the Mali Empire and the closest living relative of the Mandinka of Senegambia.
Manx
also: Gaelg
A Goidelic Celtic language of the Isle of Man. Functionally extinct in 1974 with the death of its last native speaker; revived through education and now spoken as a second language by hundreds.
Mapudungun
also: Mapuche, Araucanian
The language of the Mapuche people, spoken across south-central Chile and adjacent Argentina. The largest Indigenous language of Chile and one of South America's most documented language isolates.
Mauritian Creole
also: Kreol Morisien
A French-based creole and the everyday language of Mauritius, spoken by virtually the entire population alongside English and French. Closely related to Seychellois and Réunion creoles.
Mazandarani
also: Tabari
A northwestern Iranian language of Iran's Caspian province of Mazandaran, around Sari and Babol. Closely related to Gilaki, with around three million speakers.
Mende
A Mande Niger-Congo language and the largest indigenous language of Sierra Leone. Notable for the Kikakui syllabary, an indigenous 19th-century writing system.
Mien
also: Yao
A cluster of Hmong-Mien languages spoken by the Yao peoples across southern China and adjacent Southeast Asia. Sister branch to Hmong, with around 800,000 speakers across the diaspora.
Min Nan (Hokkien)
also: Hokkien, Taiwanese
The largest Min Chinese variety, spoken in Fujian and Taiwan and across the Hokkien-speaking diaspora of Southeast Asia. The basis of Taiwanese Hokkien and a major contributor to Singaporean Hokkien.
Mingrelian
also: Margaluri
A Kartvelian language of western Georgia around Zugdidi. Closely related to Georgian but linguistically distinct; primarily an oral language with limited written use.
Miʼkmaq
also: Mikmawisimk
An Eastern Algonquian language of the Miʼkmaq Nation across the Maritime provinces of Canada, Quebec, and Newfoundland, plus eastern Maine. Around 8,000 speakers, with a long oral and hieroglyphic tradition.
Modern Hebrew
also: Ivrit
A Northwest Semitic language, revived from a primarily liturgical use into a full living language between the late 19th and mid-20th centuries. Now the official language of Israel.
Mon
An Austroasiatic language of Lower Burma and adjacent Thailand. Once the dominant cultural language of mainland Southeast Asia, displaced over centuries by Burmese and Thai expansion.
Mongo
also: Lomongo, Nkundo
A Bantu language cluster of the central Congo basin, spoken across the Cuvette and Équateur provinces of the DRC. Around 6 million speakers across a wide dialect chain.
Montenegrin
also: Crnogorski
The most recently codified BCMS national standard, declared official in Montenegro in 2007. Distinguished from Serbian by the recognition of two additional Montenegrin-specific letters and a number of lexical preferences.
Mòoré
also: Mossi, Moshi
A Gur Niger-Congo language and the largest language of Burkina Faso, spoken by the Mossi people on the Ouagadougou plateau. Around 8 million speakers.
Muscogee (Creek)
also: Mvskoke
A Muskogean language of the Muscogee (Creek) and Seminole nations of Oklahoma and Florida. The historical lingua franca of the Southeastern Indigenous confederacy before forced removal.
Nahuatl
also: Mexicano, Nawatlahtolli
The largest Uto-Aztecan language, spoken across central Mexico by around 1.6 million people. The language of the Aztec Empire, with literary records in Latin script since the 16th century.
Nanai
also: Goldi
A Tungusic language of the lower Amur basin, spoken in Russia's Khabarovsk Krai and adjacent China. Around a thousand speakers, all elderly.
Ndebele
also: Northern Ndebele, isiNdebele
A Nguni Bantu language of south-western Zimbabwe, descended from a 19th-century offshoot of Zulu. Around 2 million speakers, centred on Bulawayo and Matabeleland.
Nepali
also: Nepāli
An Indo-Aryan language and the official language of Nepal. Also a scheduled language of India, with substantial communities in Sikkim, West Bengal, and Bhutan.
Newar
also: Nepal Bhasa, Newari
A Tibeto-Burman language of the Newar people of the Kathmandu valley. The historical lingua franca of medieval Nepal, with an extensive literary tradition.
Nheengatu
also: Lingua Geral Amazônica, Modern Tupi
A Tupi-Guaraní language descended from old Tupinambá; the historical Lingua Geral of Amazonia. Co-official in São Gabriel da Cachoeira (Brazil), the only municipality in Brazil with Indigenous co-official languages.
Nigerian Pidgin
also: Naijá, Pidgin English
An English-based creole used as a lingua franca across Nigeria and the wider West African coast. Estimated 75-100 million speakers; rapidly expanding as a first language in urban Nigeria.
Nivkh
also: Gilyak
A language isolate of Sakhalin and the lower Amur basin in Russia. Critically endangered; one of the few Paleo-Siberian languages with no demonstrated relatives.
Nobiin
also: Mahas-Fadicca
A Nile Nubian language spoken along the Nile from southern Egypt into northern Sudan. Heir to Old Nubian, the medieval Christian-era written language of the Nubian kingdoms.
Northern Sami
also: Davvisámegiella
The largest Sami language, spoken by the indigenous Sami people across northern Norway, Sweden, and Finland. The most documented and standardised of the Sami languages.
Nuer
also: Thok Naath
A Western Nilotic language and the second-largest language of South Sudan after Dinka. Around 1.7 million speakers across the Greater Upper Nile and Gambela.
Nuu-chah-nulth
also: Nootka
A Wakashan language of the western coast of Vancouver Island. Internally diverse across 14 Nuu-chah-nulth First Nations; central to the early study of Pacific Northwest art and music.
Occitan
also: Lenga dʼòc, Provençal
A Romance language of southern France, the Aran valley in Catalonia, and a few alpine valleys of Italy. The language of the medieval troubadours; now endangered in everyday use.
Oirat
also: Western Mongolian, Dörbed
A western Mongolic language spoken across western Mongolia (Khovd, Bayan-Ölgii) and parts of Xinjiang in China. Sister to Kalmyk, with a separate writing tradition based on the Clear Script (Todo Bichig).
Oromo
also: Afaan Oromoo
A Cushitic Afroasiatic language and the most-spoken language of Ethiopia, with around 37 million speakers. The official working language of the Oromia Region, written in Latin-based Qubee.
Ossetian
also: Iron, Digor
An eastern Iranian language of the North Caucasus, the only living descendant of the ancient Scythian-Sarmatian languages. Two main varieties: Iron (the standard) and the more divergent Digor.
Papiamento
also: Papiamentu
An Iberian-based creole of the ABC Islands (Aruba, Bonaire, Curaçao). Co-official with Dutch on Curaçao and Aruba; one of the few Caribbean creoles with a fully developed literary standard.
Pashayi
also: Pashai
A cluster of Indo-Aryan languages spoken in eastern Afghanistan, especially Kapisa, Laghman, and Nangarhar. Linguistically a Dardic language, sister to Kashmiri rather than to Hindi.
Pemón
also: Arekuna, Kamarakoto
A Cariban language of the Gran Sabana and adjacent highlands of Venezuela, Guyana, and northern Brazil. Around 30,000 speakers; the language of the Pemón people of Mount Roraima.
Pitjantjatjara
also: Western Desert language, Anangu
A Western Desert language of central Australia, spoken by the Anangu communities around Uluṟu and the APY Lands. One of the most-spoken Aboriginal Australian languages, with active bilingual education.
Plautdietsch
also: Mennonite Low German
A Low German variety preserved by Mennonite diaspora communities since the 16th century. Spoken across colonies in Mexico, Paraguay, Bolivia, Belize, and the Canadian Prairies.
Qʼeqchiʼ
also: Kekchi
A K'ichean Mayan language spoken in central Guatemala and southern Belize. The fastest-growing Mayan language, expanding across former Q'eqchi'-monolingual territory.
Romansh
also: Rumantsch
A Rhaeto-Romance language and the fourth official language of Switzerland, spoken across the canton of Graubünden. Around 60,000 speakers; the federal standard (Rumantsch Grischun) bridges five regional varieties.
Sakha
also: Yakut, Sakha tyla
The Siberian Turkic language of the Sakha Republic in north-eastern Russia. The northernmost Turkic language and one of the most linguistically divergent, with extensive Mongolic and Tungusic influence.
Sango
A Ngbandi-derived contact language and the official lingua franca of the Central African Republic. Used by virtually the entire population alongside French; structurally simplified through extensive multilingual contact.
Sara
also: Sara-Bagirmi cluster, Ngambay, Sar, Mbay
A cluster of closely related Central Sudanic (Nilo-Saharan) varieties across southern Chad and adjoining parts of the Central African Republic. The Sara group is the largest indigenous ethnolinguistic complex of Chad; Ngambay alone counts around 1.5 million speakers, with Sar, Mbay, and other varieties extending the dialect chain.
Scottish Gaelic
also: Gàidhlig
A Goidelic Celtic language closely related to Irish, spoken across the Outer Hebrides and parts of the Highlands of Scotland. Around 60,000 speakers, with sustained revitalisation through Gaelic-medium education.
Serbian
also: Srpski, Standard Serbian
The Belgrade-based standard of Serbian. Part of the Štokavian dialect continuum that underlies all four BCMS national standards (Bosnian, Croatian, Montenegrin, Serbian); written in both Cyrillic and Latin scripts.
Setswana
also: Tswana
A Sotho-Tswana Bantu language and the national language of Botswana. Closely related to Sesotho, with substantial speaker communities in South Africa's North West and Northern Cape provinces.
Shipibo-Konibo
also: Shipibo
The largest Panoan language, of the Ucayali river basin in the Peruvian Amazon. Around 30,000 speakers; well known internationally through Shipibo-Konibo art and the ayahuasca traditions.
Shona
also: chiShona
The largest Bantu language of Zimbabwe and an official language of the country. Internally varied, with Karanga, Zezuru, and Manyika as the main dialect groups.
Shoshone
A Numic Uto-Aztecan language of the Great Basin, spoken by Shoshone communities across Wyoming, Idaho, Nevada, Utah, and California.
Shughni
also: Shughni-Rushani
The largest Pamir Iranian language, spoken across Gorno-Badakhshan in Tajikistan and the Afghan Pamir. Forms a dialect cluster with Rushani, Bartangi, and Roshorvi.
Sidamo
also: Sidaama Afoo
A Cushitic Afroasiatic language of southern Ethiopia, spoken by the Sidama people around Hawassa. Around three million speakers; written in a Latin-based orthography.
Sindhi
An Indo-Aryan language of Sindh province in Pakistan, with substantial communities in India after Partition. Around 25 million speakers; written in both Perso-Arabic and Devanagari scripts.
Sinhala
also: Sinhalese
An Indo-Aryan language and the majority language of Sri Lanka, descended from settlers from northern India in the 1st millennium BCE. Geographically far removed from its closest Indo-Aryan relatives.
Somali
also: Soomaali
A Cushitic Afroasiatic language and the official language of Somalia. Also widely spoken in Somaliland, Djibouti, the Somali Region of Ethiopia, and north-eastern Kenya.
Songhay
also: Songhai, Sonrai
A Nilo-Saharan language cluster of the Niger river bend, the language of the medieval Songhay empire. Around 4 million speakers across Mali, Niger, and Burkina Faso.
Soninke
A Mande Niger-Congo language of the historical Ghana Empire region. Spoken across Mali, Mauritania, Senegal, and the Gambia, with a long oral and Ajami-script tradition.
Sranan Tongo
also: Sranan
An English-based creole of Suriname, used as the country's lingua franca alongside Dutch. Strong substrate from Gbe and Akan languages of the West African slave trade.
Standard Tibetan
also: Lhasa Tibetan, Bhö-skad
The Lhasa-based standard of Tibetan, used across the Tibet Autonomous Region of China, parts of Sichuan, Qinghai, and the Tibetan diaspora. Tibetan Buddhism's primary liturgical language for centuries.
Sukuma
also: Kisukuma
A Bantu language and the largest indigenous language of Tanzania by first-language speakers, used across the Lake Victoria region around Mwanza. Around 8 million speakers.
Sundanese
also: Basa Sunda
A Malayo-Polynesian language of western Java, the second-largest regional language of Indonesia after Javanese. Around 40 million speakers, with its own honorific register system.
Susu
also: Soso, Sosoxui
A Mande Niger-Congo language and the main lingua franca of coastal Guinea, including the capital Conakry. Around 1.5 million first-language speakers.
Svan
also: Lushnu Nin
The most divergent Kartvelian language, spoken in the high-mountain Svaneti region of north-western Georgia. Considered to have split from Proto-Kartvelian several thousand years ago.
Tahitian
also: Reo Tahiti
A Polynesian language and one of the official languages of French Polynesia. The largest of the Tahitic languages, with around 70,000 speakers across French Polynesia and the diaspora.
Talysh
A northwestern Iranian language straddling the Iran-Azerbaijan border in the Lankaran region. The largest northern variety is in Azerbaijan; the southern variety in Iran is more conservative.
Tatar
also: Tatarça, Volga Tatar
A Kipchak Turkic language of the Volga-Ural region of Russia, centred on Tatarstan and Kazan. The largest minority language of European Russia.
Tigre
also: Xasa
A Semitic Afroasiatic language of the Eritrean lowlands and adjoining eastern Sudan. Around 1 million speakers; closely related to Tigrinya but mutually unintelligible.
Tigrinya
An Ethiopic Semitic language and the most-spoken language of Eritrea, also widely used in northern Ethiopia (Tigray). Written in the Ge'ez script.
Tiwi
A non-Pama-Nyungan language of the Tiwi Islands north of Darwin, the only living member of its family. One of the most morphologically complex languages on record, with a famously elaborate noun-classification system.
Tlingit
also: Lingít
A Na-Dené language of the Tlingit people of south-eastern Alaska and adjacent British Columbia and Yukon. Famous for its complex consonant system, including ejective fricatives.
Tongan
also: Lea Faka-Tonga
A Polynesian language and the official language of Tonga. Distinct among Polynesian languages for retaining a definite/indefinite article system inherited from Proto-Polynesian.
Torres Strait Creole
also: Yumplatok, Broken
An English-based creole spoken across the Torres Strait Islands and adjacent communities of far northern Queensland. Closely related to Tok Pisin and Bislama within the Melanesian Pidgin family.
Toubou
also: Tubu, Tebu, Tibbu, Teda, Daza, Gorane
The Saharan Nilo-Saharan cluster of the central Sahara: Teda in the Tibesti and southern Libya, Daza further south in Borkou and Kanem. Sister branch to Kanuri within Saharan; commonly treated as a single Toubou cluster of two closely related varieties, with around half a million speakers across northern Chad, north-eastern Niger, and south-eastern Libya.
Tshivenda
also: Venda, Luvenda
A Bantu language of the Soutpansberg region in Limpopo, South Africa, and adjoining south-western Zimbabwe. One of the eleven official languages of South Africa; around 1.3 million speakers.
Tsonga
also: Xitsonga
A Bantu language of the Limpopo lowveld, spoken across north-eastern South Africa, southern Mozambique, and south-eastern Zimbabwe. Around 7 million speakers.
Tucano
also: Tukano, Dahsea
A Tukanoan language of the Vaupés region on the Brazil-Colombia border. Used as a regional lingua franca; the most-spoken Eastern Tukanoan variety.
Turkmen
also: Türkmençe
An Oghuz Turkic language spoken across Turkmenistan and adjacent communities in Iran and Afghanistan. The southernmost Central Asian Turkic language.
Tuvan
also: Tyva dyl
A Siberian Turkic language of the Tuva Republic in southern Siberia. Famous outside Russia for the throat-singing tradition of khoomei, deeply tied to the language's pitch and consonant system.
Umbundu
also: South Mbundu
A Bantu language and the most-spoken indigenous language of Angola, used across the central highlands around Huambo. Around 6 million speakers.
Urdu
The Hindustani national language of Pakistan and one of India's scheduled languages. Linguistically the same spoken language as Standard Hindi, distinguished by its Perso-Arabic script and Persian-Arabic literary register.
Uyghur
also: Uyghurche
A Karluk Turkic language of Xinjiang in north-western China. Closely related to Uzbek; written in a Perso-Arabic-based script and the medium of a long literary tradition along the Silk Road.
Uzbek
also: Oʻzbekcha
A Karluk Turkic language and the most-spoken Turkic language after Turkish. The official language of Uzbekistan, with substantial communities across Central Asia, Afghanistan, and the Caucasus.
Wakhi
also: Wakhi Pamir
A Pamir Iranian language spoken across the Wakhan Corridor where Afghanistan, Tajikistan, Pakistan, and China meet. One of the most archaic Iranian languages, retaining features of Old Iranian lost everywhere else.
Warlpiri
A Pama-Nyungan language of the Tanami Desert in the Northern Territory. Spoken across communities including Yuendumu and Lajamanu; the subject of substantial linguistic documentation.
Wayampi
also: Wajãpi
A Tupi-Guaraní language of the Oyapock river basin, straddling French Guiana and northern Brazil. Around 1,500 speakers; one of French Guiana's recognised regional languages.
Wayuu
also: Wayuunaiki, Guajiro
The largest Arawakan language, spoken across the La Guajira peninsula straddling Colombia and Venezuela. Around 400,000 speakers, with active revitalisation and bilingual education.
Welsh
also: Cymraeg
A Brittonic Celtic language and the most-spoken Celtic language. Co-official in Wales, with around 900,000 speakers and a strong contemporary literary and broadcast presence.
Western Apache
also: Ndee biyáti
A Southern Athabaskan (Na-Dené) language of east-central Arizona, closely related to Navajo. Around 14,000 speakers across the White Mountain and San Carlos Apache reservations.
Wolaita
also: Wolaytta, Welayta
An Omotic Afroasiatic language of southern Ethiopia, spoken around Wolaita Sodo. Around 2 million speakers; one of the larger Omotic languages, a family found nowhere outside Ethiopia.
Wolof
An Atlantic Niger-Congo language and the lingua franca of Senegal and the Gambia. The most-spoken African language of urban Senegal, used widely alongside French in Dakar.
Wu (Shanghainese)
also: Shanghai Wu, Shanghainese
The largest Wu Chinese variety, spoken across Shanghai and the lower Yangtze. The second-largest Sinitic language by speakers after Mandarin, distinct enough that it is mutually unintelligible with Mandarin.
Xavante
also: A'uwẽ
A Macro-Jê language of the Xavante people of central Brazil, in the cerrado of Mato Grosso. Around 20,000 speakers, with active bilingual education in Xavante communities.
Xhosa
also: isiXhosa
A Nguni Bantu language of the Eastern Cape and one of the official languages of South Africa. Famous for its 18 distinct click consonants borrowed from neighbouring Khoisan languages.
Yaghnobi
also: Yagnobi
An eastern Iranian language of the Yaghnob valley in Tajikistan. The only living descendant of Sogdian, the lingua franca of the medieval Silk Road; critically endangered.
Yanomami
also: Yanomamö
A cluster of Yanomaman languages spoken across the Brazil-Venezuela Amazonian border. Around 30,000 speakers, with internal varieties (Yanomami, Sanumá, Ninam, Yanam) often considered separate languages.
Yiddish
A High German language with substantial Hebrew, Aramaic, and Slavic vocabulary, written in the Hebrew alphabet. The historical vernacular of Ashkenazi Jewry; now spoken mainly by Hasidic communities in the United States, Israel, and Western Europe.
Yolŋu Matha
also: Dhuwal-Dhuwala, Yolngu languages
A cluster of closely-related Pama-Nyungan languages spoken across Yolŋu communities of north-east Arnhem Land. Centred on Yirrkala and Galiwinʼku.
Yucatec Maya
also: Maya, Mayaʼ tʼaan
The most-spoken Mayan language, used across the Yucatán peninsula by around 800,000 people. Direct linguistic descendant of the language used in the Maya hieroglyphic inscriptions of the Classic period.
Zarma
also: Djerma, Zerma, Zaberma
The Songhay-Zarma variety of south-western Niger, centred on Niamey and the Tillabéri–Dosso belt. The second national language of Niger after Hausa, with around 3.7 million speakers; treated by most descriptive work as a distinct standardized language within the wider Songhay cluster.