Languages & Dialects

Every language family carries many dialects — each with its own vocabulary, script, and community. Browse what Dialect Atlas currently documents, and watch the directory grow as more communities contribute.

English

21 dialects

Aboriginal Australian English

also: AAE

A distinct Australian English variety used widely across Aboriginal communities. Shows substrate influence from many Aboriginal languages and forms a continuum with Kriol in northern Australia.

African American Vernacular English

also: AAVE, Black English, Ebonics

A widespread English variety with a coherent grammatical system, including aspectual habitual "be" and the zero copula. Spoken across the African American community throughout the United States.

Appalachian English

The English of the Appalachian mountains across West Virginia, Kentucky, Tennessee, and western North Carolina. Preserves several archaic Scotch-Irish features lost in other US varieties.

Australian English

also: General Australian

The mainstream variety of Australian English, used by the majority of speakers nationally. The General Australian sociolect, sitting between Broad and Cultivated extremes.

Broad Australian

also: Strine

The most strongly-marked sociolect of Australian English, traditionally rural and outback in association. Famously caricatured as "Strine"; recognisable by its diphthongal vowels and broad nasalisation.

Cajun English

The English of the Cajun communities of southern Louisiana, descended from generations of French-English bilingualism. Distinguished by intonation, vowel realisations, and lexical borrowings from Cajun French.

Canadian English

The English of English-speaking Canada. Closer to General American than to British English in most features, but with its own vocabulary and the well-known Canadian raising.

Chicano English

also: Mexican-American English

A native English variety spoken in Mexican-American communities of the US Southwest, particularly Los Angeles. Despite its name, most speakers are monolingual English speakers.

Cultivated Australian

The historically prestigious Australian English variety, modelled on British Received Pronunciation. Once dominant in broadcasting and law; now rare in everyday speech.

Eastern New England English

also: Boston English

The traditional English of Boston and eastern New England. Non-rhotic in conservative speech, with the cot-caught merger and the famously Boston-broad short-a.

General American

also: Standard American English

The broad accent associated with national US broadcasting and most of the inland north and west. A reference point rather than a single regional variety.

Geordie

also: Tyneside English

The dialect of Newcastle and the Tyneside region. Retains a number of Old English and Norse-derived features lost elsewhere in England.

Hawaiian Pidgin

also: Hawaii Creole English, Pidgin

An English-based creole that emerged on the plantations of Hawaiʻi in the late 19th century. Now native to a substantial share of the Hawaiʻi-born population, with influences from Hawaiian, Portuguese, Cantonese, and Japanese.

Hiberno-English

also: Irish English

The English of Ireland, shaped by long contact with Irish Gaelic. Distinctive grammar, intonation, and vocabulary set it apart from British and American varieties.

Indian English

The pluricentric variety of English used across India. Shaped by long contact with Hindi and other Indian languages, with its own pronunciation, grammar, and lexical norms.

New York English

also: NYC English, New Yorkese

The English of New York City and surrounding areas. Marked by the famous tense-lax split of short-a, raised /ɔ/, and (historically) non-rhoticity.

Newfoundland English

also: Newfie English

The most distinctive variety of Canadian English, descended from 17th- and 18th-century West Country and Hiberno-English settlers. Strongly divergent from mainland Canadian English in vowels and grammar.

Received Pronunciation

also: RP, BBC English, Standard Southern British

The traditional prestige accent of southern England. Long carried by national broadcasting and higher education, but no longer dominant in everyday British speech.

Scottish English

The variety of English spoken across Scotland. Distinct from Scots (which is sometimes counted as a separate language) but heavily influenced by it.

Singlish

also: Singapore Colloquial English

The colloquial English of Singapore, mixing English with grammatical features and lexicon drawn from Hokkien, Malay, Cantonese, and Tamil.

Southern American English

also: Southern Drawl

The dialects of the US South, characterised by the Southern Vowel Shift, the pin-pen merger, and a distinct lexical tradition.

Spanish

17 dialects

Andalusian Spanish

also: Andaluz

The Spanish of southern Spain, marked by aspiration of /s/, the loss of final consonants, and a strong influence on the formation of Latin American varieties via the Atlantic ports.

Andean Spanish

The Spanish of the Andean highlands across Peru, Bolivia, and Ecuador. Heavily influenced in pronunciation, grammar, and lexicon by Quechua and Aymara.

Bolivian Spanish

also: Cochabambino, Camba (Lowland Bolivian)

The Spanish of Bolivia, internally split between Andean (La Paz, Cochabamba) and Lowland Camba (Santa Cruz) varieties. Heavy contact features from Quechua and Aymara in the highlands.

Canarian Spanish

also: Canario

The Spanish of the Canary Islands. Linguistically a bridge between Andalusian and Caribbean varieties, with shared seseo and weakened final consonants.

Caribbean Spanish

The Spanish of Cuba, the Dominican Republic, and Puerto Rico. Shares with Andalusian the aspiration of /s/ and the loss of final consonants, and carries a distinctive lexicon shaped by Taino and African languages.

Castilian Spanish

also: Castellano, Peninsular Spanish

The standard variety of Spain, centred on Madrid and the historical kingdom of Castile. Distinguished from American varieties by the distinción between /θ/ and /s/.

Central American Spanish

The Spanish of Central America, centred on Guatemala City. Linguistically transitional between Mexican and Caribbean varieties, with widespread voseo verb forms.

Chicano Spanish

also: Mexican-American Spanish, US Spanish

The Spanish of Mexican-American communities across the US Southwest. Diverse and bilingual; shaped by long contact with English and characterised by code-switching (Spanglish).

Chilean Spanish

The Spanish of Chile. Known for rapid speech, heavy aspiration of /s/, distinct second-person verb forms, and a rich slang lexicon.

Colombian Spanish

also: Bogotano

The Spanish of Colombia, internally varied by region. The Andean Bogotá variety is widely cited for its conservative phonology and is often described as one of the clearest spoken varieties in the Americas.

Cuban-American Spanish

also: Miami Spanish

The Spanish of the Cuban-American community of Florida, especially Miami. A Caribbean Spanish variety with substantial English contact features and a distinctive bilingual code-switching register.

Ecuadorian Spanish

also: Costeño Ecuadoriano

The Spanish of coastal Ecuador, centred on Guayaquil. Distinct from the Andean Quito variety, with stronger Caribbean Spanish features including /s/-aspiration.

Mexican Spanish

The most widely spoken Spanish variety in the world. Heavily shaped by Nahuatl and other indigenous languages, especially in lexicon, and exported widely through Mexican film and music.

Northern Mexican Spanish

also: Norteño Spanish

The Spanish of northern Mexico, centred on Monterrey and Chihuahua. Distinct from central Mexican Spanish in vowel realisation, intonation, and a distinct lexicon shaped by long contact with US English.

Rioplatense Spanish

also: Río de la Plata Spanish

The Spanish of Argentina and Uruguay. Distinctive for its voseo verb forms, the sheísmo / zheísmo realisation of <ll>/<y>, and a strong Italian-influenced intonation.

Venezuelan Spanish

The Spanish of Venezuela, centred on Caracas. A Caribbean Spanish variety closely related to Cuban and Dominican Spanish, with strong /s/-aspiration and a distinctive lexicon.

Yucatec Spanish

also: Yucatecan Spanish, Español yucateco

The Spanish of the Yucatán peninsula. Heavily shaped by contact with Yucatec Maya, with distinctive intonation, glottalised stops, and a substantial Mayan-derived lexicon.

Arabic

16 dialects

Algerian Arabic

also: Dziri, Darja

The Maghrebi Arabic of Algeria. Internally varied between urban Tellian, Saharan, and Hilalian rural varieties; carries deep Berber substrate and French contact lexicon.

Chadian Arabic

also: Shuwa Arabic, Baggara Arabic

Spoken across Chad and adjoining parts of Sudan, north-eastern Nigeria, and northern Cameroon. The main Sahelian Arabic variety, used as a lingua franca well beyond Arab communities.

Egyptian Arabic

also: Masri

The most widely understood Arabic dialect, carried across the region by cinema and music.

Gulf Arabic

also: Khaliji

Spoken across the Arabian Peninsula. Retains classical features alongside modern urban slang.

Hassaniya Arabic

A Bedouin-descended Arabic variety spoken across Mauritania, Western Sahara, and adjacent parts of the Sahel. Carries deep Berber substrate and a strong oral poetic tradition.

Hejazi Arabic

also: Hijazi

The Arabic of western Saudi Arabia, centred on Mecca, Medina, and Jeddah. Sits between Egyptian and Najdi varieties and is widely understood across the peninsula.

Iraqi Arabic

also: Mesopotamian Arabic

Spoken across Iraq and parts of eastern Syria and southwestern Iran. Carries strong Aramaic and Turkic influence and splits internally between gilit (southern) and qeltu (northern) varieties.

Levantine Arabic

also: Shami

Spoken across Syria, Lebanon, Palestine, and Jordan. Known for soft consonants and a shared cultural vocabulary.

Libyan Arabic

also: Sulaimitian Arabic

The Maghrebi Arabic of Libya, transitional between western Maghrebi and Egyptian. Tripolitanian and Cyrenaican varieties differ noticeably in vocabulary and phonology.

Maghrebi Arabic

also: Darija

Spoken across Morocco, Algeria, Tunisia, and Libya. Shaped by Berber, French, and Spanish contact.

Moroccan Darija

also: Darija, Moroccan Arabic

The Maghrebi Arabic of Morocco, with heavy Berber substrate and substantial French and Spanish loanwords. Distinct enough from Mashreqi Arabic to function as a separate everyday language.

Najdi Arabic

The Arabic of central Saudi Arabia around Riyadh. Closely related to Gulf Arabic but with distinct vowel and consonant features rooted in the bedouin Najd plateau.

Saidi Arabic

also: Upper Egyptian Arabic

The Arabic of Upper Egypt, from Beni Suef south to Aswan. Conservative relative to Cairene Egyptian Arabic, with retained interdentals and a distinct rural prestige.

Sudanese Arabic

The Arabic of Sudan, retaining a number of older features lost in other dialects and shaped by long contact with Nubian and other African languages.

Tunisian Arabic

also: Tounsi, Derja

The Maghrebi Arabic of Tunisia. Notable for vowel reductions, a rich set of French and Italian loanwords, and a literary written tradition unusual among colloquial Arabic varieties.

Yemeni Arabic

A cluster of conservative Arabic varieties across Yemen. Often cited as preserving features close to Classical Arabic that have shifted elsewhere.

French

11 dialects

Acadian French

The French of the Maritime provinces, especially New Brunswick. Preserves several 17th-century features lost in both Quebec and Metropolitan French.

Belgian French

also: Français de Belgique

The French of Wallonia and Brussels. Maintains a number of vocabulary and number forms (septante, nonante) that the Paris standard has dropped.

Cajun French

also: Louisiana French, Français cadien

The French of the Cajun communities of southern Louisiana, descended from 18th-century Acadian settlers expelled from the Maritimes. Endangered, with active revitalisation through CODOFIL and the Louisiana French immersion programme.

Franco-Ontarian

also: Ontario French, Français ontarien

The French of the historic Francophone communities of Ontario, especially around Sudbury, Ottawa, and the Northern shore. Closely related to Quebec French but with stronger English contact features.

Maghrebi French

The French of North Africa, used as a second language across Algeria, Tunisia, and Morocco. Marked by code-switching with Maghrebi Arabic and Berber.

Meridional French

also: Southern French, Français méridional

The southern French of Marseille, Toulouse, and the broader Occitan-speaking belt. Marked by realisation of mute vowels and a distinct intonation contour.

Métis French

also: Mitchif Français, Prairie French

The French variety of the Métis Nation of the Canadian Prairies. Distinct from Quebec and Acadian French; not to be confused with Michif, the mixed Cree-French language of the same communities.

Metropolitan French

also: Parisian French, Standard French

The Paris-area standard of European French. Carried nationally by education and broadcasting; the reference point for most French taught abroad.

Quebec French

also: Québécois, Joual

The dominant French variety of Canada. Strongly distinct from European French in vowel realisation, lexicon, and the colloquial Joual register.

Swiss French

also: Romand French, Suisse romand

The French of western Switzerland. Like Belgian French, retains older numerals (septante, huitante, nonante) and shows long contact with German-Swiss varieties.

West African French

The French of Francophone West Africa — used in administration, education, and media. Strong contact features from Wolof, Bambara, and other regional languages.

Portuguese

10 dialects

Angolan Portuguese

The Portuguese of Angola, used as a national lingua franca alongside Bantu languages such as Kimbundu and Umbundu, which have shaped both pronunciation and vocabulary.

Azorean Portuguese

also: Açoriano

The Portuguese of the Azores archipelago. Highly distinctive vowels and a strong island-by-island variation, often cited as one of the more divergent European varieties.

Carioca

also: Rio de Janeiro Portuguese, Fluminense

The Portuguese of Rio de Janeiro. Recognisable for its palatalised /s/ in coda position — a feature shared with Lisbon — and a distinctive intonation.

European Portuguese

also: Portuguese (Portugal), Lisbon Portuguese

The standard variety of Portugal, centred on Lisbon. Distinguished from Brazilian Portuguese by reduced unstressed vowels and a denser consonant cluster system.

Gaúcho

also: Sulista, Southern Brazilian Portuguese

The Brazilian Portuguese of Rio Grande do Sul and southern Brazil. Distinct from northern Brazilian varieties, with strong Spanish (Rioplatense) and Italian/German immigrant contact features.

Mineiro

also: Belo Horizonte Portuguese

The Brazilian Portuguese of Minas Gerais and the surrounding interior. Distinguished by characteristic vowel reductions, monophthongisation of diphthongs, and a distinctive lexicon.

Mozambican Portuguese

The Portuguese of Mozambique. A national lingua franca shaped by long contact with Bantu languages including Makhuwa, Sena, and Tsonga.

Nordestino

also: Northeastern Brazilian Portuguese

The Portuguese of Brazil's Northeast region around Recife, Salvador, and Fortaleza. Internally varied, with a distinct lexicon shaped by African and indigenous languages.

Northern Portuguese

also: Portuense, Norte de Portugal

The Portuguese of the north of Portugal around Porto. Preserves several older features lost in the standard, including the bilabial /β/ and the betacism that distinguishes <b> from <v>.

Paulistano

also: São Paulo Portuguese

The Portuguese of São Paulo and the surrounding state. The most populous Brazilian variety and a common reference for the broader Brazilian standard.

Russian

10 dialects

Far Eastern Russian

also: Dalnevostochny govor

The Russian of the Russian Far East around Vladivostok and Khabarovsk. The youngest major regional variety, formed through 19th- and 20th-century settlement; close to standard Russian with regional vocabulary tied to Pacific life.

Kazakhstani Russian

The Russian of Kazakhstan, used by ethnic Russians and as a second language by many Kazakhs. Based on standard Russian but with regional vocabulary, code-switching, and Kazakh contact features.

Moscow Russian

also: Standard Russian, Moskovskoye

The Moscow-area standard of Russian. The reference variety in education, broadcasting, and most second-language teaching.

Northern Russian

The northern Russian dialect group around Vologda and Arkhangelsk. Preserves okanye (clear /o/ in unstressed positions) lost in standard varieties.

Pomor Russian

also: Pomorsky govor

The Russian of the Pomor coastal communities along the White Sea. A North Russian variety with a distinct lexicon shaped by maritime life, fishing, and contact with Karelian and Saami.

Saint Petersburg Russian

also: Petersburg pronunciation

The traditional educated speech of Saint Petersburg. Distinct from Moscow Russian in clearer enunciation of unstressed vowels and a number of lexical preferences.

Siberian Russian

also: Sibirskie govory

The Russian varieties of Western and Central Siberia, descended from northern-Russian settlers. Distinguished by retained okanye, a conservative lexicon, and contact features from indigenous Siberian languages.

Southern Russian

The southern Russian dialect group spanning Voronezh, Rostov, and the broader steppe. Marked by fricative /ɣ/ and a distinct verb-ending system.

Ural Russian

also: Uralsky govor

The Russian of the Ural region around Yekaterinburg and Perm. Mixes Northern and Central Russian features with contact influence from local Finno-Ugric and Turkic languages.

Volga Russian

also: Vladimir-Volga dialects, Middle Russian

The Central Russian dialects of the middle Volga, around Nizhny Novgorod and Vladimir. A transitional zone between the Northern and Southern dialect groups.

German

9 dialects

Bairisch

also: Bavarian, Bavarian-Austrian

A large Upper German dialect group spoken across Bavaria and most of Austria, with distinctive vowel shifts and vocabulary far removed from Standard German.

Berlinerisch

also: Berlin German, Berlinisch

The urban dialect of Berlin and surrounding Brandenburg. Mixes Low German substrate with Standard German, famous for its dry humour and clipped delivery.

Hessisch

also: Hessian

A Central German dialect group covering Hesse around Frankfurt. The Frankfurt variety is widely recognisable from German television and theatre.

Kölsch

also: Cologne German, Ripuarian

A Ripuarian Central Franconian dialect spoken in Cologne and the surrounding Rhineland. Distinctive for its melodic intonation and rich local vocabulary.

Plattdüütsch

also: Low German, Niederdeutsch, Plattdeutsch

The Low German varieties of northern Germany and the eastern Netherlands. Linguistically closer to Dutch and English than to Standard German in several core features.

Sächsisch

also: Saxon, Upper Saxon

An East Central German dialect spoken in Saxony around Dresden and Leipzig, characterised by softened plosives and a distinctive intonation.

Schwäbisch

also: Swabian

An Alemannic dialect spoken across Baden-Württemberg and parts of Bavaria. Recognisable for its diminutive "-le" suffix and softened consonants.

Schwyzerdütsch

also: Swiss German, Schweizerdeutsch

A cluster of High Alemannic varieties spoken across German-speaking Switzerland. Mutually unintelligible with Standard German for most non-Swiss listeners.

Wienerisch

also: Viennese

The urban Bavarian dialect of Vienna, layered with Yiddish, Czech, and Italian loanwords from the city's long Habsburg history.

Greek

8 dialects

Cappadocian Greek

The Greek of inner Anatolia, isolated for centuries before the 1923 population exchange brought its speakers to Greece. Heavily Turkic-influenced; once thought extinct but rediscovered in central Greece in the 2000s.

Cretan Greek

The Greek of Crete. The literary medium of the late-medieval Cretan Renaissance and the basis of a long oral and musical tradition.

Cypriot Greek

also: Kypriaki

The Greek of Cyprus, distinct from Standard Modern Greek in pronunciation, grammar, and vocabulary. Preserves a number of medieval features.

Griko

also: Italiot Greek, Salentino Greek

The Greek of southern Italy, spoken in two enclaves in Salento and Calabria. A direct descendant of the Magna Graecia varieties; one of the oldest continuously-spoken Greek varieties outside Greece.

Mariupolitan Greek

also: Rumeika, Azov Greek

The Greek of the Mariupol region in Ukraine, descended from communities resettled from Crimea by Catherine the Great in 1778. Endangered; structurally close to Pontic Greek.

Pontic Greek

also: Romeyka, Pontiaka

The Greek variety historically spoken along the Pontic coast of the Black Sea, brought to mainland Greece by refugees of the 1923 population exchange. Its Anatolian-resident form (Romeyka) survives among Muslim communities near Trabzon.

Standard Modern Greek

also: Demotic Greek

The Athens-based standard of modern Greek, derived from the Demotic vernacular and codified in the late 20th century.

Tsakonian

also: Tsakonika

The only living descendant of ancient Doric Greek, spoken in a few villages of the Peloponnese around Leonidio. Critically endangered; structurally distinct from all other Greek varieties, which descend from Attic-Ionic.

Japanese

8 dialects

Hakata-ben

also: Fukuoka dialect

The dialect of Fukuoka and the Hakata district on northern Kyushu. Recognisable for sentence-final particles like "to" and "tai".

Hiroshima-ben

also: Geibi dialect

The Chūgoku-region dialect of Hiroshima and surrounding Geibi area. Known to many Japanese viewers through Hiroshima-set film and television.

Hokkaido-ben

The variety spoken across Hokkaido. A relatively young dialect shaped by 19th-century settlement, mostly close to Standard Japanese but with Tōhoku and coastal influences.

Kansai-ben

also: Kansai dialect, Kinki dialect, Osaka-ben

The major dialect group of the Kansai region, with Osaka as its modern centre. Strongly distinct from Tokyo Japanese in pitch accent, copula forms, and idiom. Kyoto-ben is treated separately here.

Kyoto-ben

also: Kyō-kotoba, Kyoto dialect

The Kyoto variety of Kansai Japanese. Historically the prestige speech of the imperial court, marked by softer cadence, distinct honorific forms, and a long literary heritage.

Nagoya-ben

also: Owari dialect

The dialect of Nagoya and the Owari plain. Sits between the Tokyo and Kansai regions and shares features with both, while keeping a distinct vowel inventory.

Tōhoku-ben

also: Tōhoku dialect, Zūzū-ben

The dialects of north-eastern Honshu around Sendai. Distinguished by reduced vowel contrasts and merged sibilants — historically caricatured as "zūzū-ben".

Tokyo Japanese

also: Hyōjungo, Standard Japanese

The Tokyo-area variety that forms the basis of Standard Japanese (Hyōjungo). Carried nationwide by broadcasting, education, and government use.

Romanes

8 dialects

Arli

A Balkan Romani variety spoken widely across North Macedonia, Kosovo, southern Serbia, and Bulgaria.

Gurbeti

A Balkan Romani variety with communities across Bosnia, Serbia, and surrounding countries.

Kalderash

A Vlax Romani variety with a long history in Romania and the Balkans. Traditionally associated with coppersmith trades.

Lovari

A Vlax Romani variety with strong Hungarian and Slavic influence. Spoken across Central Europe.

Polish-Baltic Romani

also: Polska Roma, Xaladitka Roma

A northern Romani variety historically rooted in Poland and the Baltic region, shaped by long contact with Slavic and Baltic languages.

Sinte Manouche

also: Sinti, Manush, Sinté

A northwestern Romanes variety spoken by Sinti and Manouche communities across Germany, France, the Benelux, and northern Italy. Shaped by long contact with Germanic and Romance languages.

Slovak Romani

also: Central Slovak Romani, Servika Roma

A Central European Romanes variety spoken by Romanes communities in Slovakia, with strong contact-influenced vocabulary from Slovak, Hungarian, and neighbouring languages.

Zargari

also: Zargari Romani, Zargari Romanes, Zargar Romani

A highly endangered Romani variety spoken in and around the village of Zargar in Qazvin province of north-western Iran, west of Tehran. The community is traditionally said to have been settled in the Safavid period and is one of the easternmost Romani-speaking populations on record; the dialect retains a Romani lexical core but shows heavy contact influence from Persian and Azerbaijani Turkish in phonology, vocabulary, and grammar. Speaker numbers are small and intergenerational transmission has weakened sharply.

Hindi

7 dialects

Italian

7 dialects

Tamazight

7 dialects

Ukrainian

7 dialects

Persian

6 dialects

Domari

5 dialects

Inuit

5 dialects

Korean

5 dialects

Kurdish

5 dialects

Mandarin

5 dialects

Kazakh

4 dialects

Swahili

4 dialects

Turkish

4 dialects

Albanian

3 dialects

Bengali

3 dialects

Cantonese

3 dialects

Cree

3 dialects

Dutch

3 dialects

Guarani

3 dialects

Hausa

3 dialects

Indonesian

3 dialects

Norwegian

3 dialects

Punjabi

3 dialects

Quechua

3 dialects

Tagalog

3 dialects

Tamil

3 dialects

Thai

3 dialects

Vietnamese

3 dialects

Yoruba

3 dialects

Amharic

2 dialects

Aymara

2 dialects

Azerbaijani

2 dialects

Balochi

2 dialects

Bulgarian

2 dialects

Burmese

2 dialects

Cherokee

2 dialects

Croatian

2 dialects

Czech

2 dialects

Danish

2 dialects

Gujarati

2 dialects

Macedonian

2 dialects

Māori

2 dialects

Marathi

2 dialects

Mongolian

2 dialects

Ojibwe

2 dialects

Pashto

2 dialects

Polish

2 dialects

Romanian

2 dialects

Slovenian

2 dialects

Sorbian

2 dialects

Sotho

2 dialects

Swedish

2 dialects

Telugu

2 dialects

Zulu

2 dialects

Other languages

210 dialects

Abkhaz

also: Apswa byzshwa

A Northwest Caucasian language of Abkhazia. Famous for one of the smallest vowel inventories of any natural language (just two phonemic vowels) and a correspondingly large consonant system.

Acholi

also: Acoli, Lwo

A Western Nilotic language of the Luo cluster, spoken across northern Uganda and adjacent South Sudan. Closely related to Lango and to the Luo of western Kenya.

Adyghe

also: West Circassian

A Northwest Caucasian language of the Adygea Republic in Russia. Together with Kabardian, makes up the Circassian language continuum, with a large diaspora in Turkey and the Levant.

Afar

also: Qafar

A Cushitic Afroasiatic language of the Afar people, spoken across the Afar Triangle of north-eastern Ethiopia, southern Eritrea, and Djibouti. Around 2 million speakers.

Afrikaans

A West Germanic language descended from 17th-century Cape Dutch, with substantial substrate from Khoisan, Malay, and Bantu languages. One of the official languages of South Africa and widely spoken in Namibia.

Ainu

A language isolate of the indigenous Ainu people of Hokkaido and historically Sakhalin and the Kuril Islands. Critically endangered with only a handful of native speakers; subject of an active revitalisation effort.

Akan / Twi

also: Akan

A Kwa Niger-Congo language and the largest indigenous language of Ghana. The Asante Twi variety, centred on Kumasi, is the most widely spoken and the basis of Akan-language broadcasting.

Asháninka

An Arawakan language spoken in the central Peruvian Amazon and adjacent western Brazil. The most-spoken Amazonian Arawakan language, with around 100,000 speakers.

Avar

also: Avar macʻ

A Northeast Caucasian language of Dagestan, the largest of the Daghestanian languages. Used as a regional lingua franca across many of Dagestan's small mountain communities.

Bahamian Creole

also: Bahamian Dialect

An English-based creole spoken across the Bahamas. Closely related to Gullah and to the English-based creoles of the wider Caribbean.

Bakhtiari

A southwestern Iranian language of the Bakhtiari tribal confederation in the central Zagros mountains. Sister to Luri, with a strong oral tradition tied to the seasonal migrations of pastoral life.

Bambara

also: Bamanankan

The largest Mande Niger-Congo language and the lingua franca of Mali. Closely related to Dyula and Maninka in a wider Manding dialect continuum across West Africa.

Bashkir

also: Başqortsa

A Kipchak Turkic language of Bashkortostan in the southern Urals. Closely related to Tatar but with distinct phonology and lexicon shaped by the Bashkir steppe heritage.

Basque

also: Euskara

A language isolate of the western Pyrenees, with no proven relatives anywhere in the world. The only pre-Indo-European language to survive in Western Europe; co-official in the Basque Country and Navarre.

Beja

also: Bedawi

The northernmost Cushitic language, spoken across the Red Sea hills of Sudan, Eritrea, and Egypt. Sometimes classified as its own North Cushitic branch within the Afroasiatic family.

Bemba

also: Chibemba, Icibemba

A Bantu language and the most-spoken indigenous language of Zambia, used across the Northern, Luapula, and Copperbelt provinces. A major lingua franca on the Zambian Copperbelt.

Blackfoot

also: Siksiká

A Plains Algonquian language of the Blackfoot Confederacy across southern Alberta and northern Montana. Around 5,000 speakers; one of the most divergent Algonquian languages.

Bosnian

also: Bosanski

The Sarajevo-based standard of Bosnian. A Štokavian-based BCMS standard, distinguished by retention of Turkish, Arabic, and Persian loanwords from Ottoman rule and the use of both Latin and Cyrillic scripts.

Breton

also: Brezhoneg

A Brittonic Celtic language of Brittany in north-western France. The only continental Celtic survivor; brought from south-western Britain in the 5th-6th centuries.

Burushaski

A language isolate spoken in the Hunza, Yasin, and Nagar valleys of Gilgit-Baltistan, northern Pakistan. Around 100,000 speakers; the most-spoken language isolate of Asia.

Buryat

also: Buryat-Mongolian, Buryad

A northern Mongolic language spoken across the Buryat Republic in Russia, northern Mongolia, and parts of north-eastern China. Traditional Buddhist literary language of the Trans-Baikal region.

Cape Verdean Creole

also: Kriolu, Kabuverdianu

A Portuguese-based creole and the everyday language of Cape Verde, used by the entire population alongside Portuguese. The oldest still-spoken creole language, with a continuous documented history since the 16th century.

Catalan

also: Català

A Western Romance language, official in Andorra and co-official across Catalonia, the Balearic Islands, and the Valencian Community of Spain. Around 10 million speakers, with strong literary tradition since the medieval period.

Cebuano

also: Bisaya, Binisayâ

A Visayan Austronesian language of the central and southern Philippines. The most widely-spoken language of the Visayas and Mindanao, often counted as the largest Philippine language by L1 speakers.

Central Alaskan Yupʼik

also: Yugtun

The largest Yupʼik language, spoken across south-western Alaska. A sister language to Inuktitut within the Eskaleut family, with around 10,000 speakers.

Chechen

also: Noxçiyn mott

A Northeast Caucasian (Nakh) language and the official language of the Chechen Republic in Russia. The largest Northeast Caucasian language, with around 1.5 million speakers.

Cheyenne

also: Tsėhésenėstsestȯtse

A Plains Algonquian language of the Northern and Southern Cheyenne tribes, in Montana and Oklahoma. Around 1,700 speakers; subject of substantial linguistic documentation since the 19th century.

Chichewa

also: Nyanja, Chinyanja

A Bantu language and the national language of Malawi, also widely spoken in eastern Zambia and northern Mozambique under the name Nyanja. Around 14 million speakers.

Chickasaw

also: Chikashshanompa

A Muskogean language of the Chickasaw Nation of Oklahoma. Closely related to Choctaw; critically endangered with active revitalisation through the Chickasaw Nation Language Department.

Choctaw

also: Chahta anumpa

A Muskogean language with two surviving communities — the Mississippi Band of Choctaw Indians and the Choctaw Nation of Oklahoma, descendants of the Trail-of-Tears removal. Around 10,000 speakers.

Chukchi

also: Lygʼoravetlʼan

A Chukotko-Kamchatkan language of the Chukotka Peninsula in north-eastern Siberia. Around 5,000 speakers; one of the few languages with male and female speech-form distinctions.

Chuvash

also: Çăvaşla

The sole surviving member of the Oghur branch of Turkic, spoken in Chuvashia in the Volga region. Linguistically the most divergent living Turkic language, with features lost everywhere else in the family.

Comorian

also: Shikomori, Shikomoro

A Bantu language cluster of the Comoros archipelago and Mayotte, closely related to Swahili. Four mutually intelligible varieties, one per major island; around 1 million speakers.

Crimean Tatar

also: Qırımtatarca

A mixed Turkic language of the Crimean peninsula, descended from contact between Kipchak and Oghuz varieties. Critically endangered after the 1944 Soviet deportation; subject of an active revitalisation effort.

Dakota

also: Dakhótiyapi, Santee-Sisseton

The eastern Siouan language closely related to Lakota. Spoken by the Santee, Sisseton-Wahpeton, and Yankton communities across Minnesota, the Dakotas, and southern Manitoba.

Daur

also: Dagur

A divergent Mongolic language spoken in Heilongjiang and Inner Mongolia in north-eastern China. Geographically isolated from the rest of the family and shaped by long contact with Manchu and Mandarin.

Dhivehi

also: Maldivian

An Indo-Aryan language and the official language of the Maldives. Closest living relative of Sinhala, with which it shares a 12th-century separation date.

Diné Bizaad

also: Navajo

A Southern Athabaskan (Na-Dené) language spoken across the Navajo Nation in Arizona, New Mexico, and Utah. The most-spoken Indigenous language north of Mexico, with around 170,000 speakers.

Dinka

also: Thuɔŋjäŋ

A Western Nilotic language and the largest language of South Sudan. Internally varied, with Padang, Bor, Agar, Rek, and Twic as major dialect groups.

Dogon

also: Dogoso cluster, Toro So, Tomo Kan, Jamsay

A Niger-Congo branch of around twenty deeply differentiated varieties on and around the Bandiagara escarpment in central Mali, with smaller populations in north-western Burkina Faso. Around 600,000 speakers total; descriptive work since the 2000s treats several Dogon varieties (Toro So, Tomo Kan, Jamsay, others) as separate languages rather than dialects of a single Dogon language.

Dungan

also: Hui Chinese (Central Asia)

A Sinitic language descended from late-19th-century Hui Muslim refugees who fled into the Russian Empire after the Qing-era Hui revolts. Written in Cyrillic, with substantial Russian, Persian, and Turkic loans.

Dyula

also: Jula, Dioula

A Manding Niger-Congo language used as a trade lingua franca across northern Côte d'Ivoire, western Burkina Faso, and adjoining areas. Closely related to Bambara and Maninka.

Dzongkha

A Tibetic language and the official language of Bhutan, derived from Old Tibetan. Closely related to Standard Tibetan but increasingly diverging through political separation.

Elfdalian

also: Övdalian, Älvdalska

A heavily archaic North Germanic variety spoken in Älvdalen in northern Dalarna, Sweden. Often classified as a separate language; preserves Old Norse features lost in all other living varieties.

Enga

A Trans-New Guinea language and the largest Papuan language by speakers, used across Enga Province in the western highlands of Papua New Guinea. Around 250,000 speakers.

Estonian

also: Eesti

A Finnic Uralic language and the official language of Estonia. Closely related to Finnish but distinct in vocabulary and a partially restructured case system.

Evenki

A Tungusic language of indigenous reindeer-herding communities across central and eastern Siberia and northern China. The most geographically dispersed Tungusic language.

Ewe

also: Eʋegbe

A Kwa Niger-Congo language spoken across south-eastern Ghana and southern Togo. Closely related to Fon and the major language of the Ewe homeland on both sides of the Volta.

Faroese

also: Føroyskt

The North Germanic language of the Faroe Islands. Closely related to Icelandic and Western Norwegian; co-official with Danish in the islands.

Fijian

also: Vosa Vakaviti

An Austronesian language and an official language of Fiji. Often used alongside Fiji English and Fiji Hindi, the country's three official languages.

Finnish

also: Suomi

A Finnic Uralic language and an official language of Finland and the EU. Famous for its agglutinative grammar and vowel harmony, with a rich oral tradition recorded in the Kalevala.

Fon

also: Fongbe, Fon-Gbe

A Gbe Niger-Congo language and the largest language of Benin. Closely related to Ewe; the historical language of the Kingdom of Dahomey and a major source of Vodun religious vocabulary.

Fulani

also: Fula, Fulfulde, Pulaar

An Atlantic Niger-Congo language spoken by Fulani (Peul) communities across the Sahel from Senegal to Sudan. One of the most geographically spread African languages, with 25-30 million speakers.

Gagana Sāmoa

also: Samoan

The Polynesian language of the Samoan Islands. The official language of Samoa and one of the official languages of American Samoa, with sizeable diaspora communities in New Zealand and the United States.

Gagauz

also: Gagauz dili

An Oghuz Turkic language spoken by the Christian Gagauz people of southern Moldova. The official language of the autonomous Gagauz region (Gagauz Yeri), distinct from neighbouring Romanian and Bulgarian.

Galician

also: Galego

A Western Romance language closely related to Portuguese, co-official in Galicia in north-western Spain. The two were a single language as late as the 13th century.

Garífuna

An Arawakan language with major African and Carib influence, spoken along the Caribbean coast of Central America. Descended from communities deported from St. Vincent in 1797; a UNESCO Masterpiece of Intangible Heritage.

Georgian

also: Kartuli

The largest Kartvelian language and the official language of Georgia. Written in the unique Mkhedruli script, with a literary tradition dating back to the 5th century.

Gilaki

A northwestern Iranian language spoken across Iran's Caspian province of Gilan around Rasht. Distinct from Persian in case marking and a richer verb system retaining older Iranian features.

Hadza

also: Hadzane

A click-language isolate spoken by the Hadza hunter-gatherers around Lake Eyasi in northern Tanzania. Fewer than 1,000 speakers; no demonstrated genetic relationship to any other language.

Haitian Creole

also: Kreyòl ayisyen

A French-based creole that emerged on Saint-Domingue plantations. Co-official with French in Haiti since 1987, and the everyday language of nearly all Haitians.

Hakka

also: Kèjiā huà

A Sinitic language spoken across pockets of southern China and Taiwan by the Hakka diaspora communities. Around 35 million speakers, with major communities in Guangdong, Fujian, and Taiwan.

Halkomelem

also: Hulʼqʼumiʼnumʼ

A Coast Salishan language of the lower Fraser river valley and southeastern Vancouver Island. Around 200 fluent speakers across Upriver, Downriver, and Island dialect groups.

Hmong

also: Miao

A cluster of Hmong-Mien languages spoken across the highlands of southern China, northern Vietnam, Laos, and Thailand. Carried abroad through the post-Vietnam-War Hmong diaspora to the United States, France, and Australia.

Hopi

A Northern Uto-Aztecan language of the Hopi people of north-eastern Arizona. Around 5,000 speakers; the language famously analysed by Whorf in his work on linguistic relativity.

Hungarian

also: Magyar

A Ugric Uralic language, the largest of the Uralic family by speakers and the official language of Hungary. Reached its current location through the Magyar migrations of the 9th century.

ʻŌlelo Hawaiʻi

also: Hawaiian

The indigenous Polynesian language of the Hawaiian Islands. Co-official with English in Hawaiʻi and the focus of an ongoing revitalisation movement after near-extinction in the 20th century.

Icelandic

also: Íslenska

The most conservative living North Germanic language, retaining a complex inflectional system close to Old Norse. Modern Icelanders can still read 13th-century sagas with limited difficulty.

Igbo

A Volta-Niger language of south-eastern Nigeria. One of the three major languages of Nigeria, with a literary standard codified in the late 20th century from a heavily dialect-fragmented base.

Ilocano

also: Iloko, Ilokano

An Austronesian language of north-western Luzon, the third most-spoken language of the Philippines. Used widely as a regional lingua franca across northern Luzon.

Irish

also: Gaeilge

A Goidelic Celtic language and the first official language of Ireland. Daily-spoken communities (the Gaeltachtaí) survive in pockets along the western seaboard, around Galway, Donegal, and Kerry.

Jamaican Patois

also: Patwa, Jamaican Creole

An English-based creole that emerged on Jamaican plantations from contact between English and West African languages. Native to most Jamaicans and used widely in music, literature, and everyday life.

Javanese

also: Basa Jawa

The largest Austronesian language by speakers, used across central and eastern Java and as a regional lingua franca. Distinguished by an intricate honorific system (krama / madya / ngoko).

Juhuri

also: Judeo-Tat, Judeo-Tati, Mountain Jewish, Juhuro, Juwuri

A Southwestern Iranian language of the Mountain Jews of the eastern Caucasus, historically classed as a Judeo-Tat variety. Spoken in Krasnaya Sloboda (Quba) in Azerbaijan and Derbent in Dagestan, with substantial Hebrew, Aramaic, and Azerbaijani influence. Written historically in Hebrew script, then Cyrillic from the Soviet period onward.

Juǀʼhoan

also: Juǀʼhoansi, !Kung

A Kxʼa click language of the Kalahari, spoken by Juǀʼhoan San communities across north-eastern Namibia and north-western Botswana. The most thoroughly documented of the surviving San languages.

Kaingang

The largest southern Macro-Jê language, spoken across communities in Paraná, Santa Catarina, and Rio Grande do Sul. Around 30,000 speakers, with literary materials in Kaingang since the late 19th century.

Kalmyk

also: Kalmyk Oirat, Khalimag

The only Mongolic language of Europe, spoken in the Republic of Kalmykia by the descendants of 17th-century Oirat migrants from Dzungaria. Endangered but officially recognised.

Kamba

also: Kikamba

A Bantu language of south-eastern Kenya, closely related to Kikuyu. Around 4 million speakers across Machakos, Kitui, and Makueni counties.

Kanuri

A Saharan Nilo-Saharan language and the historical court language of the Kanem-Bornu empire. Around 13 million speakers across north-eastern Nigeria, southern Niger, eastern Chad, and northern Cameroon.

Karakalpak

also: Qaraqalpaqsha

A Kipchak Turkic language closely related to Kazakh, spoken in the autonomous Karakalpakstan region of Uzbekistan around the Aral Sea.

Karelian

also: Karjala

A Finnic Uralic language spoken across the Republic of Karelia in Russia and adjacent parts of eastern Finland. Closely related to Finnish but distinct enough to be classified as a separate language.

Kashmiri

also: Koshur

A Dardic Indo-Aryan language of the Kashmir valley, traditionally written in a Perso-Arabic script (Naskh) but also in Devanagari and the Sharada heritage script.

Ket

The last living Yeniseian language, spoken along the middle Yenisei in central Siberia. Around 200 elderly speakers; recently linked to the Na-Dené family of North America in the controversial Dene-Yeniseian hypothesis.

Khmer

also: Cambodian

An Austroasiatic language and the official language of Cambodia. The largest non-Tai-Kadai language of mainland Southeast Asia, with a literary tradition dating back to the 7th century.

Khoekhoe

also: Nama, Damara

The largest Khoisan language, spoken across Namibia and parts of Botswana and South Africa. The most-documented click language; one of the official languages of Namibia.

Kikongo

also: Kongo

A Bantu language of the Kongo people, spoken across Angola, the DRC, and the Republic of the Congo. Carried abroad in the Atlantic slave trade and a major substrate in Caribbean and Brazilian creoles.

Kikuyu

also: Gikuyu, Gĩkũyũ

A Bantu language and the largest indigenous language of Kenya, spoken across the central highlands around Mount Kenya. Around 8 million speakers.

Kimbundu

also: North Mbundu

A Bantu language of north-western Angola including the Luanda hinterland. Around 4 million speakers; a major substrate in Brazilian Portuguese vocabulary through the Atlantic slave trade.

Kinyarwanda

The Bantu national language of Rwanda. Mutually intelligible with Kirundi (Burundi) and one of the few African languages spoken by the entire population of a country.

Komi

also: Komi-Zyrian

A Permic Uralic language of the Komi Republic in northern European Russia. One of the oldest written Uralic languages, with a literary tradition going back to the 14th-century missionary Stephen of Perm.

Koryak

also: Nymylan

A Chukotko-Kamchatkan language of the Kamchatka peninsula. Spoken by Koryak reindeer herders and coastal communities; sister language to Chukchi within the Chukotian sub-branch.

Krio

also: Sierra Leonean Creole

An English-based creole and the lingua franca of Sierra Leone, used by virtually the entire population. Descended from the speech of liberated Africans resettled in Freetown in the 19th century.

Kriol

also: Australian Kriol

An English-based creole spoken across northern Australia by Aboriginal communities, especially in the Northern Territory and the Kimberley. Native to many speakers and used in bilingual schools and broadcasting.

Kuman

also: Simbu

A Trans-New Guinea language of the Chimbu / Simbu Province in the central highlands of Papua New Guinea. Around 80,000 speakers.

Kwakʼwala

also: Kwakiutl

The largest Wakashan language, of the Kwakwakaʼwakw peoples on northern Vancouver Island and the adjacent BC mainland. Famous to anthropology through Franz Boas's field documentation.

Kyrgyz

also: Kyrgyzcha

A Kipchak Turkic language and the official language of Kyrgyzstan. Carries the Manas oral epic, one of the longest oral epic traditions in world literature.

Kʼicheʼ

also: Quiché

The largest Mayan language of Guatemala, spoken across the western highlands. The language of the Popol Vuh, the foundational K'ichean creation epic.

Lakota

also: Lakȟótiyapi, Teton Sioux

A Western Siouan language spoken across the Sioux nations of the Great Plains, with major communities at Pine Ridge, Rosebud, and Standing Rock. Distinct from Dakota in core vocabulary and pronouns.

Lao

A Tai-Kadai language and the official language of Laos. Closely related to Thai Isan; written in a script derived from the same source as Thai but distinct in several letters and vowel signs.

Lezgian

also: Lezgi

A Northeast Caucasian language of southern Dagestan and northern Azerbaijan. Around 800,000 speakers; one of the most studied Daghestanian languages.

Lingala

A Bantu language used as a lingua franca across the western DRC and the Republic of the Congo, with Kinshasa as its largest urban centre. The language of much of Congolese popular music.

Luba-Kasai

also: Tshiluba, Western Luba

A Bantu language of the Kasai region in the southern DRC and one of the four national languages of the country. Around 6 million speakers.

Luganda

also: Ganda

A Bantu language of central Uganda, the most-spoken Ugandan Bantu language and the historical court language of the Buganda kingdom. Widely used as a lingua franca in Kampala.

Luo

also: Dholuo

A Western Nilotic language of the Luo people of western Kenya around Lake Victoria. Related to Acholi and to the Dinka and Nuer of South Sudan.

Luri

also: Lori

A southwestern Iranian language of the Lur people in Iran's western Zagros region. Closely related to Persian but linguistically distinct, with several million speakers across Lorestan, Khuzestan, and Ilam.

Lushootseed

also: Puget Salish

A Salishan language of the Coast Salish peoples around Puget Sound in western Washington. Critically endangered; subject of an active revitalisation through the Tulalip and Muckleshoot tribes.

Maasai

also: Maa, Olmaa

A Nilotic language spoken by the Maasai people across southern Kenya and northern Tanzania. Closely related to Samburu and Camus within the Maa cluster.

Macushi

also: Makushi

A Cariban language spoken across the Roraima savannas of northern Brazil, southwestern Guyana, and adjacent Venezuela. Closely related to Pemón.

Makhuwa

also: Emakhuwa, Macua

A Bantu language and the largest indigenous language of Mozambique, spoken across the northern provinces of Nampula, Cabo Delgado, and Niassa. Around 7 million speakers.

Malagasy

The Austronesian language of Madagascar, descended from settlers who crossed the Indian Ocean from Borneo around 1,500 years ago. The westernmost outpost of the Austronesian family.

Maltese

also: Malti

A Semitic language descended from medieval Maghrebi Arabic, with extensive Italian and English influence. The only Semitic language written in Latin script and the only one official in the EU.

Mam

A Mamean Mayan language spoken across the western highlands of Guatemala and adjacent parts of Chiapas, Mexico. Around 600,000 speakers.

Manchu

A Tungusic language and the historical court language of the Qing dynasty (1644-1912). Now critically endangered, with only a handful of native speakers in Heilongjiang.

Maninka

also: Malinke, Mandinka

A Manding Niger-Congo language of Upper Guinea and adjacent Mali. The historical language of the Mali Empire and the closest living relative of the Mandinka of Senegambia.

Manx

also: Gaelg

A Goidelic Celtic language of the Isle of Man. Functionally extinct in 1974 with the death of its last native speaker; revived through education and now spoken as a second language by hundreds.

Mapudungun

also: Mapuche, Araucanian

The language of the Mapuche people, spoken across south-central Chile and adjacent Argentina. The largest Indigenous language of Chile and one of South America's most documented language isolates.

Mauritian Creole

also: Kreol Morisien

A French-based creole and the everyday language of Mauritius, spoken by virtually the entire population alongside English and French. Closely related to Seychellois and Réunion creoles.

Mazandarani

also: Tabari

A northwestern Iranian language of Iran's Caspian province of Mazandaran, around Sari and Babol. Closely related to Gilaki, with around three million speakers.

Mende

A Mande Niger-Congo language and the largest indigenous language of Sierra Leone. Notable for the Kikakui syllabary, an indigenous 19th-century writing system.

Mien

also: Yao

A cluster of Hmong-Mien languages spoken by the Yao peoples across southern China and adjacent Southeast Asia. Sister branch to Hmong, with around 800,000 speakers across the diaspora.

Min Nan (Hokkien)

also: Hokkien, Taiwanese

The largest Min Chinese variety, spoken in Fujian and Taiwan and across the Hokkien-speaking diaspora of Southeast Asia. The basis of Taiwanese Hokkien and a major contributor to Singaporean Hokkien.

Mingrelian

also: Margaluri

A Kartvelian language of western Georgia around Zugdidi. Closely related to Georgian but linguistically distinct; primarily an oral language with limited written use.

Miʼkmaq

also: Mikmawisimk

An Eastern Algonquian language of the Miʼkmaq Nation across the Maritime provinces of Canada, Quebec, and Newfoundland, plus eastern Maine. Around 8,000 speakers, with a long oral and hieroglyphic tradition.

Modern Hebrew

also: Ivrit

A Northwest Semitic language, revived from a primarily liturgical use into a full living language between the late 19th and mid-20th centuries. Now the official language of Israel.

Mon

An Austroasiatic language of Lower Burma and adjacent Thailand. Once the dominant cultural language of mainland Southeast Asia, displaced over centuries by Burmese and Thai expansion.

Mongo

also: Lomongo, Nkundo

A Bantu language cluster of the central Congo basin, spoken across the Cuvette and Équateur provinces of the DRC. Around 6 million speakers across a wide dialect chain.

Montenegrin

also: Crnogorski

The most recently codified BCMS national standard, declared official in Montenegro in 2007. Distinguished from Serbian by the recognition of two additional Montenegrin-specific letters and a number of lexical preferences.

Mòoré

also: Mossi, Moshi

A Gur Niger-Congo language and the largest language of Burkina Faso, spoken by the Mossi people on the Ouagadougou plateau. Around 8 million speakers.

Muscogee (Creek)

also: Mvskoke

A Muskogean language of the Muscogee (Creek) and Seminole nations of Oklahoma and Florida. The historical lingua franca of the Southeastern Indigenous confederacy before forced removal.

Nahuatl

also: Mexicano, Nawatlahtolli

The largest Uto-Aztecan language, spoken across central Mexico by around 1.6 million people. The language of the Aztec Empire, with literary records in Latin script since the 16th century.

Nanai

also: Goldi

A Tungusic language of the lower Amur basin, spoken in Russia's Khabarovsk Krai and adjacent China. Around a thousand speakers, all elderly.

Ndebele

also: Northern Ndebele, isiNdebele

A Nguni Bantu language of south-western Zimbabwe, descended from a 19th-century offshoot of Zulu. Around 2 million speakers, centred on Bulawayo and Matabeleland.

Nepali

also: Nepāli

An Indo-Aryan language and the official language of Nepal. Also a scheduled language of India, with substantial communities in Sikkim, West Bengal, and Bhutan.

Newar

also: Nepal Bhasa, Newari

A Tibeto-Burman language of the Newar people of the Kathmandu valley. The historical lingua franca of medieval Nepal, with an extensive literary tradition.

Nheengatu

also: Lingua Geral Amazônica, Modern Tupi

A Tupi-Guaraní language descended from old Tupinambá; the historical Lingua Geral of Amazonia. Co-official in São Gabriel da Cachoeira (Brazil), the only municipality in Brazil with Indigenous co-official languages.

Nigerian Pidgin

also: Naijá, Pidgin English

An English-based creole used as a lingua franca across Nigeria and the wider West African coast. Estimated 75-100 million speakers; rapidly expanding as a first language in urban Nigeria.

Nivkh

also: Gilyak

A language isolate of Sakhalin and the lower Amur basin in Russia. Critically endangered; one of the few Paleo-Siberian languages with no demonstrated relatives.

Nobiin

also: Mahas-Fadicca

A Nile Nubian language spoken along the Nile from southern Egypt into northern Sudan. Heir to Old Nubian, the medieval Christian-era written language of the Nubian kingdoms.

Northern Sami

also: Davvisámegiella

The largest Sami language, spoken by the indigenous Sami people across northern Norway, Sweden, and Finland. The most documented and standardised of the Sami languages.

Nuer

also: Thok Naath

A Western Nilotic language and the second-largest language of South Sudan after Dinka. Around 1.7 million speakers across the Greater Upper Nile and Gambela.

Nuu-chah-nulth

also: Nootka

A Wakashan language of the western coast of Vancouver Island. Internally diverse across 14 Nuu-chah-nulth First Nations; central to the early study of Pacific Northwest art and music.

Occitan

also: Lenga dʼòc, Provençal

A Romance language of southern France, the Aran valley in Catalonia, and a few alpine valleys of Italy. The language of the medieval troubadours; now endangered in everyday use.

Oirat

also: Western Mongolian, Dörbed

A western Mongolic language spoken across western Mongolia (Khovd, Bayan-Ölgii) and parts of Xinjiang in China. Sister to Kalmyk, with a separate writing tradition based on the Clear Script (Todo Bichig).

Oromo

also: Afaan Oromoo

A Cushitic Afroasiatic language and the most-spoken language of Ethiopia, with around 37 million speakers. The official working language of the Oromia Region, written in Latin-based Qubee.

Ossetian

also: Iron, Digor

An eastern Iranian language of the North Caucasus, the only living descendant of the ancient Scythian-Sarmatian languages. Two main varieties: Iron (the standard) and the more divergent Digor.

Papiamento

also: Papiamentu

An Iberian-based creole of the ABC Islands (Aruba, Bonaire, Curaçao). Co-official with Dutch on Curaçao and Aruba; one of the few Caribbean creoles with a fully developed literary standard.

Pashayi

also: Pashai

A cluster of Indo-Aryan languages spoken in eastern Afghanistan, especially Kapisa, Laghman, and Nangarhar. Linguistically a Dardic language, sister to Kashmiri rather than to Hindi.

Pemón

also: Arekuna, Kamarakoto

A Cariban language of the Gran Sabana and adjacent highlands of Venezuela, Guyana, and northern Brazil. Around 30,000 speakers; the language of the Pemón people of Mount Roraima.

Pitjantjatjara

also: Western Desert language, Anangu

A Western Desert language of central Australia, spoken by the Anangu communities around Uluṟu and the APY Lands. One of the most-spoken Aboriginal Australian languages, with active bilingual education.

Plautdietsch

also: Mennonite Low German

A Low German variety preserved by Mennonite diaspora communities since the 16th century. Spoken across colonies in Mexico, Paraguay, Bolivia, Belize, and the Canadian Prairies.

Qʼeqchiʼ

also: Kekchi

A K'ichean Mayan language spoken in central Guatemala and southern Belize. The fastest-growing Mayan language, expanding across former Q'eqchi'-monolingual territory.

Romansh

also: Rumantsch

A Rhaeto-Romance language and the fourth official language of Switzerland, spoken across the canton of Graubünden. Around 60,000 speakers; the federal standard (Rumantsch Grischun) bridges five regional varieties.

Sakha

also: Yakut, Sakha tyla

The Siberian Turkic language of the Sakha Republic in north-eastern Russia. The northernmost Turkic language and one of the most linguistically divergent, with extensive Mongolic and Tungusic influence.

Sango

A Ngbandi-derived contact language and the official lingua franca of the Central African Republic. Used by virtually the entire population alongside French; structurally simplified through extensive multilingual contact.

Sara

also: Sara-Bagirmi cluster, Ngambay, Sar, Mbay

A cluster of closely related Central Sudanic (Nilo-Saharan) varieties across southern Chad and adjoining parts of the Central African Republic. The Sara group is the largest indigenous ethnolinguistic complex of Chad; Ngambay alone counts around 1.5 million speakers, with Sar, Mbay, and other varieties extending the dialect chain.

Scottish Gaelic

also: Gàidhlig

A Goidelic Celtic language closely related to Irish, spoken across the Outer Hebrides and parts of the Highlands of Scotland. Around 60,000 speakers, with sustained revitalisation through Gaelic-medium education.

Serbian

also: Srpski, Standard Serbian

The Belgrade-based standard of Serbian. Part of the Štokavian dialect continuum that underlies all four BCMS national standards (Bosnian, Croatian, Montenegrin, Serbian); written in both Cyrillic and Latin scripts.

Setswana

also: Tswana

A Sotho-Tswana Bantu language and the national language of Botswana. Closely related to Sesotho, with substantial speaker communities in South Africa's North West and Northern Cape provinces.

Shipibo-Konibo

also: Shipibo

The largest Panoan language, of the Ucayali river basin in the Peruvian Amazon. Around 30,000 speakers; well known internationally through Shipibo-Konibo art and the ayahuasca traditions.

Shona

also: chiShona

The largest Bantu language of Zimbabwe and an official language of the country. Internally varied, with Karanga, Zezuru, and Manyika as the main dialect groups.

Shoshone

A Numic Uto-Aztecan language of the Great Basin, spoken by Shoshone communities across Wyoming, Idaho, Nevada, Utah, and California.

Shughni

also: Shughni-Rushani

The largest Pamir Iranian language, spoken across Gorno-Badakhshan in Tajikistan and the Afghan Pamir. Forms a dialect cluster with Rushani, Bartangi, and Roshorvi.

Sidamo

also: Sidaama Afoo

A Cushitic Afroasiatic language of southern Ethiopia, spoken by the Sidama people around Hawassa. Around three million speakers; written in a Latin-based orthography.

Sindhi

An Indo-Aryan language of Sindh province in Pakistan, with substantial communities in India after Partition. Around 25 million speakers; written in both Perso-Arabic and Devanagari scripts.

Sinhala

also: Sinhalese

An Indo-Aryan language and the majority language of Sri Lanka, descended from settlers from northern India in the 1st millennium BCE. Geographically far removed from its closest Indo-Aryan relatives.

Somali

also: Soomaali

A Cushitic Afroasiatic language and the official language of Somalia. Also widely spoken in Somaliland, Djibouti, the Somali Region of Ethiopia, and north-eastern Kenya.

Songhay

also: Songhai, Sonrai

A Nilo-Saharan language cluster of the Niger river bend, the language of the medieval Songhay empire. Around 4 million speakers across Mali, Niger, and Burkina Faso.

Soninke

A Mande Niger-Congo language of the historical Ghana Empire region. Spoken across Mali, Mauritania, Senegal, and the Gambia, with a long oral and Ajami-script tradition.

Sranan Tongo

also: Sranan

An English-based creole of Suriname, used as the country's lingua franca alongside Dutch. Strong substrate from Gbe and Akan languages of the West African slave trade.

Standard Tibetan

also: Lhasa Tibetan, Bhö-skad

The Lhasa-based standard of Tibetan, used across the Tibet Autonomous Region of China, parts of Sichuan, Qinghai, and the Tibetan diaspora. Tibetan Buddhism's primary liturgical language for centuries.

Sukuma

also: Kisukuma

A Bantu language and the largest indigenous language of Tanzania by first-language speakers, used across the Lake Victoria region around Mwanza. Around 8 million speakers.

Sundanese

also: Basa Sunda

A Malayo-Polynesian language of western Java, the second-largest regional language of Indonesia after Javanese. Around 40 million speakers, with its own honorific register system.

Susu

also: Soso, Sosoxui

A Mande Niger-Congo language and the main lingua franca of coastal Guinea, including the capital Conakry. Around 1.5 million first-language speakers.

Svan

also: Lushnu Nin

The most divergent Kartvelian language, spoken in the high-mountain Svaneti region of north-western Georgia. Considered to have split from Proto-Kartvelian several thousand years ago.

Tahitian

also: Reo Tahiti

A Polynesian language and one of the official languages of French Polynesia. The largest of the Tahitic languages, with around 70,000 speakers across French Polynesia and the diaspora.

Talysh

A northwestern Iranian language straddling the Iran-Azerbaijan border in the Lankaran region. The largest northern variety is in Azerbaijan; the southern variety in Iran is more conservative.

Tatar

also: Tatarça, Volga Tatar

A Kipchak Turkic language of the Volga-Ural region of Russia, centred on Tatarstan and Kazan. The largest minority language of European Russia.

Tigre

also: Xasa

A Semitic Afroasiatic language of the Eritrean lowlands and adjoining eastern Sudan. Around 1 million speakers; closely related to Tigrinya but mutually unintelligible.

Tigrinya

An Ethiopic Semitic language and the most-spoken language of Eritrea, also widely used in northern Ethiopia (Tigray). Written in the Ge'ez script.

Tiwi

A non-Pama-Nyungan language of the Tiwi Islands north of Darwin, the only living member of its family. One of the most morphologically complex languages on record, with a famously elaborate noun-classification system.

Tlingit

also: Lingít

A Na-Dené language of the Tlingit people of south-eastern Alaska and adjacent British Columbia and Yukon. Famous for its complex consonant system, including ejective fricatives.

Tongan

also: Lea Faka-Tonga

A Polynesian language and the official language of Tonga. Distinct among Polynesian languages for retaining a definite/indefinite article system inherited from Proto-Polynesian.

Torres Strait Creole

also: Yumplatok, Broken

An English-based creole spoken across the Torres Strait Islands and adjacent communities of far northern Queensland. Closely related to Tok Pisin and Bislama within the Melanesian Pidgin family.

Toubou

also: Tubu, Tebu, Tibbu, Teda, Daza, Gorane

The Saharan Nilo-Saharan cluster of the central Sahara: Teda in the Tibesti and southern Libya, Daza further south in Borkou and Kanem. Sister branch to Kanuri within Saharan; commonly treated as a single Toubou cluster of two closely related varieties, with around half a million speakers across northern Chad, north-eastern Niger, and south-eastern Libya.

Tshivenda

also: Venda, Luvenda

A Bantu language of the Soutpansberg region in Limpopo, South Africa, and adjoining south-western Zimbabwe. One of the eleven official languages of South Africa; around 1.3 million speakers.

Tsonga

also: Xitsonga

A Bantu language of the Limpopo lowveld, spoken across north-eastern South Africa, southern Mozambique, and south-eastern Zimbabwe. Around 7 million speakers.

Tucano

also: Tukano, Dahsea

A Tukanoan language of the Vaupés region on the Brazil-Colombia border. Used as a regional lingua franca; the most-spoken Eastern Tukanoan variety.

Turkmen

also: Türkmençe

An Oghuz Turkic language spoken across Turkmenistan and adjacent communities in Iran and Afghanistan. The southernmost Central Asian Turkic language.

Tuvan

also: Tyva dyl

A Siberian Turkic language of the Tuva Republic in southern Siberia. Famous outside Russia for the throat-singing tradition of khoomei, deeply tied to the language's pitch and consonant system.

Umbundu

also: South Mbundu

A Bantu language and the most-spoken indigenous language of Angola, used across the central highlands around Huambo. Around 6 million speakers.

Urdu

The Hindustani national language of Pakistan and one of India's scheduled languages. Linguistically the same spoken language as Standard Hindi, distinguished by its Perso-Arabic script and Persian-Arabic literary register.

Uyghur

also: Uyghurche

A Karluk Turkic language of Xinjiang in north-western China. Closely related to Uzbek; written in a Perso-Arabic-based script and the medium of a long literary tradition along the Silk Road.

Uzbek

also: Oʻzbekcha

A Karluk Turkic language and the most-spoken Turkic language after Turkish. The official language of Uzbekistan, with substantial communities across Central Asia, Afghanistan, and the Caucasus.

Wakhi

also: Wakhi Pamir

A Pamir Iranian language spoken across the Wakhan Corridor where Afghanistan, Tajikistan, Pakistan, and China meet. One of the most archaic Iranian languages, retaining features of Old Iranian lost everywhere else.

Warlpiri

A Pama-Nyungan language of the Tanami Desert in the Northern Territory. Spoken across communities including Yuendumu and Lajamanu; the subject of substantial linguistic documentation.

Wayampi

also: Wajãpi

A Tupi-Guaraní language of the Oyapock river basin, straddling French Guiana and northern Brazil. Around 1,500 speakers; one of French Guiana's recognised regional languages.

Wayuu

also: Wayuunaiki, Guajiro

The largest Arawakan language, spoken across the La Guajira peninsula straddling Colombia and Venezuela. Around 400,000 speakers, with active revitalisation and bilingual education.

Welsh

also: Cymraeg

A Brittonic Celtic language and the most-spoken Celtic language. Co-official in Wales, with around 900,000 speakers and a strong contemporary literary and broadcast presence.

Western Apache

also: Ndee biyáti

A Southern Athabaskan (Na-Dené) language of east-central Arizona, closely related to Navajo. Around 14,000 speakers across the White Mountain and San Carlos Apache reservations.

Wolaita

also: Wolaytta, Welayta

An Omotic Afroasiatic language of southern Ethiopia, spoken around Wolaita Sodo. Around 2 million speakers; one of the larger Omotic languages, a family found nowhere outside Ethiopia.

Wolof

An Atlantic Niger-Congo language and the lingua franca of Senegal and the Gambia. The most-spoken African language of urban Senegal, used widely alongside French in Dakar.

Wu (Shanghainese)

also: Shanghai Wu, Shanghainese

The largest Wu Chinese variety, spoken across Shanghai and the lower Yangtze. The second-largest Sinitic language by speakers after Mandarin, distinct enough that it is mutually unintelligible with Mandarin.

Xavante

also: A'uwẽ

A Macro-Jê language of the Xavante people of central Brazil, in the cerrado of Mato Grosso. Around 20,000 speakers, with active bilingual education in Xavante communities.

Xhosa

also: isiXhosa

A Nguni Bantu language of the Eastern Cape and one of the official languages of South Africa. Famous for its 18 distinct click consonants borrowed from neighbouring Khoisan languages.

Yaghnobi

also: Yagnobi

An eastern Iranian language of the Yaghnob valley in Tajikistan. The only living descendant of Sogdian, the lingua franca of the medieval Silk Road; critically endangered.

Yanomami

also: Yanomamö

A cluster of Yanomaman languages spoken across the Brazil-Venezuela Amazonian border. Around 30,000 speakers, with internal varieties (Yanomami, Sanumá, Ninam, Yanam) often considered separate languages.

Yiddish

A High German language with substantial Hebrew, Aramaic, and Slavic vocabulary, written in the Hebrew alphabet. The historical vernacular of Ashkenazi Jewry; now spoken mainly by Hasidic communities in the United States, Israel, and Western Europe.

Yolŋu Matha

also: Dhuwal-Dhuwala, Yolngu languages

A cluster of closely-related Pama-Nyungan languages spoken across Yolŋu communities of north-east Arnhem Land. Centred on Yirrkala and Galiwinʼku.

Yucatec Maya

also: Maya, Mayaʼ tʼaan

The most-spoken Mayan language, used across the Yucatán peninsula by around 800,000 people. Direct linguistic descendant of the language used in the Maya hieroglyphic inscriptions of the Classic period.

Zarma

also: Djerma, Zerma, Zaberma

The Songhay-Zarma variety of south-western Niger, centred on Niamey and the Tillabéri–Dosso belt. The second national language of Niger after Hausa, with around 3.7 million speakers; treated by most descriptive work as a distinct standardized language within the wider Songhay cluster.