Vulgar Latin

Vulgar Latin, as in this political graffito at Pompeii, was the speech of ordinary people of the Roman Empire — different from Latin as written by the Roman elites.
Vulgar Latin (in Latin, sermo vulgaris, "folk speech") is a blanket term covering the popular dialects and sociolects of the Latin language which diverged from each other in the early Middle Ages, evolving into the Romance languages by the 9th century. The terms Vulgar Latin and Late Latin are often used synonymously. Vulgar Latin can also refer to vernacular speech from other periods, including the Classical period, in which case it may also be called Popular Latin.

This spoken Latin came to differ from literary Latin in its pronunciation, vocabulary, and grammar, though some of its features did not appear until the late Empire. Other features are likely to have been present much earlier in spoken Latin.

During the Middle Ages, Vulgar Latin coexisted with a more cultivated form of the language used by scholars, scribes and the clergy in formal settings, but lacking any native speakers, called Medieval Latin.

What was Vulgar Latin?

The Cantar de Mio Cid (Song of my Cid) is the earliest text of reasonable length that exists in Medieval Spanish, and marks the beginning of this language as distinct from Vulgar Latin.
The name "vulgar" simply means "folk", derived from the Latin word vulgaris, meaning "of people". "Vulgar Latin" has a variety of meanings:

  1. Variation within Latin (socially, geographically, and chronologically) that differs from the Classical literary standard in an age when most people were illiterate and the primary method of language transmission between people was oral. This typically excludes the language of the more educated upper classes, which, although it does include variation, comes closest to the standard.
  2. The spoken Latin of the Roman Empire. Classical Latin represents the literary register of Latin, based on the model of ancient literary Greek. It represented a selection from a variety of spoken forms. The Latin brought by Roman soldiers to Gaul, Iberia or Dacia was not identical to the Latin of Cicero, and differed from it in vocabulary, and later in syntax and grammar as well. By this definition, Vulgar Latin was a spoken language and Classical Latin was used for writing, with the later style of literary Latin being slightly different, showing greater influence from the vulgar dialects, compared to the earlier "classical" standards.
  3. The hypothetical ancestor of the Romance languages ("Proto-Romance"), which cannot be directly known, except from a few graffiti inscriptions. Proto-Romance is a hypothetical vernacular derived from Latin that had undergone important and varying sound shifts and other changes which can be reconstructed from the changes evident in its descendants, the Romance languages.
  4. "Vulgar Latin" is sometimes used to describe the grammatical changes found in some Late Latin texts, such as the 4th-century Itinerarium Egeriae, Egeria's account of her journey to Palestine and Mt. Sinai; or the works of St Gregory of Tours. Since written documentation of Vulgar Latin forms is scarce, these works are invaluable to philologists, mainly because of the occasional presence of variations or errors in spelling, providing some evidence of spoken usage during the period in which they were written.

Most definitions of "Vulgar Latin" define it as the spoken, rather than written, language. It is important to remember that "Vulgar Latin" is an abstract term, not the name of any particular dialect. The term itself predates the field of sociolinguistics, and research into the history of Vulgar Latin was in some ways a precursor to sociolinguistics. The latter studies language variation associated with social variables, and tends not to view variation as a strict standard–non-standard dichotomy (for example, Classical–Vulgar Latin) but as a large pool of variations. In light of fields such as sociolinguistics, dialectology, and historical linguistics, Vulgar Latin can be seen as nearly synonymous to "language variation in Latin" (socially, geographically, and chronologically) that excludes the speech, and especially writings, of the more educated upper classes. It is because there are so many types of variation that definitions of Vulgar Latin differ so much.


Because the daily speech of Latin speakers was not transcribed, Vulgar Latin can only be studied indirectly through other methods. Our knowledge of Vulgar Latin comes from three chief sources. First, the comparative method reconstructs the underlying forms from the attested Romance languages, and notes where they differ from Classical Latin. Second, various prescriptive grammar texts from the Late Latin period condemn linguistic errors that Latin speakers were liable to commit, giving us an idea of how Latin speakers spoke. Third, the solecisms and non-Classical usages that occasionally are found in Late Latin texts also reveal, in part, the author's spoken language.

Some literary works with a lower register of Latin also provide a glimpse into the world of early Vulgar Latin. The works of Plautus and Terence, being comedies with many characters who were slaves, preserve some early basilectal Latin features, as does the recorded speech of freedmen in the Cena Trimalchionis by Petronius Arbiter.

For many centuries after the fall of the Roman Empire in the West, Vulgar Latin continued to coexist with a written form of Late Latin, nowadays referred to as Medieval Latin; for when speakers of Romance vernaculars set out to write with correct grammar and spelling, they attempted to emulate the norms of Classical Latin. This scholarly Latin, "frozen" by Justinian's codifications of Roman law on the one hand, and by the Catholic Church on the other, was eventually unified by the medieval copyists; it continued to exist as a Dachsprache in the Middle Ages, and a lingua franca well beyond them.

Vulgar Latin developed differently in the various provinces of the Roman Empire, gradually giving rise to such modern languages as French, Catalan, Italian, Spanish, Portuguese, and Romanian. Although the official language in these areas was Latin, Vulgar Latin was popularly spoken until the new localized forms diverged sufficiently from Latin, thus emerging as separate languages. However, throughout the imperial era and until the 8th century CE, the widening gulf between spoken and written Latin was not yet great enough to make them mutually unintelligible. József Herman states:

It seems certain that in the sixth century, and quite likely into the early parts of the seventh century, people in the main Romanized areas could still largely understand the biblical and liturgical texts and the commentaries (of greater or lesser simplicity) that formed part of the rites and of religious practice, and that even later, throughout the seventh century, saints' lives written in Latin could be read aloud to the congregations with an expectation that they would be understood. We can also deduce however, that in Gaul, from the central part of the eighth century onwards, many people, including several of the clerics, were not able to understand even the most straightforward religious texts

József Herman, Vulgar Latin

Indeed, at the third Council of Tours in 813, priests were ordered to preach in the vernacular language — either in the rustica lingua romanica (Vulgar Latin), or in the Germanic vernaculars — since the common people could no longer understand formal Latin. Within a generation, the Oaths of Strasbourg (842), a treaty between Charlemagne's grandsons Charles the Bald and Louis the German, was proffered and recorded in a language that was already distinguished from Latin. Consider the excerpt below:

Extract of the Oaths
Pro Deo amur et pro christian poblo et nostro commun salvament, d'ist di in avant, in quant Deus savir et podir me dunat, si salvarai eo cist meon fradre Karlo et in ajudha et in cadhuna cosa, si cum om per dreit son fradra salvar dift, in o quid il me altresi fazet, et ab Ludher nul plaid numquam prindrai, qui, meon vol, cist meon fradre Karle in damno sit.

For the love of God and for Christendom and our common salvation, from this day onwards, as God will give me the wisdom and power, I shall protect this brother of mine Charles, with aid or anything else, as one ought to protect one's brother, so that he may do the same for me, and I shall never knowingly make any covenant with Lothair that would harm this brother of mine Charles.

From this point on, the Latin vernaculars began to be treated as separate languages in practice, developing local norms and orthographies of their own, and "Vulgar Latin" ceases to be a useful term.


Classical Latin | Vulgar Latin | English
bellum | *guerra | war
cogitare | pensare | think
edere | manducare | eat
emere | comparare | buy
equus | caballus | horse
feles | catta | cat
hortus | *gardinus | garden
ignis | focus | fire
ludere | iocari | play
omnis | totus | all
os | bucca | mouth
pulcher | bellus | beautiful
urbs | civitas | city
verbum | parabola | word
vesper | sera | evening

Certain words from Classical Latin were dropped from the vocabulary. Classical equus, "horse", was consistently replaced by caballus "nag" (but note Romanian iapă, Sardinian èbba, Spanish yegua, Catalan euga and Portuguese égua all meaning "mare" and deriving from Classical equa).

A sample of words that are exclusively Classical, and those that were productive in Romance, is to be found in the table above.

The vocabulary changes affected even the basic grammatical particles of Latin; there are many that vanish without a trace in Romance, such as an, at, autem, donec, enim, ergo, etiam, haud, igitur, ita, nam, postquam, quidem, quin, quod, quoque, sed, utrum and vel.

Verbs with prefixed prepositions frequently displaced simple forms. The number of words formed by such suffixes as -bilis, -arius, -itare and -icare grew apace. These changes occurred frequently to avoid irregular forms or to regularise genders.

On the other hand, since Vulgar Latin and Latin proper were for much of their history different registers of the same language, rather than different languages, some Romance languages preserve Latin words that were lost in most others. For example, Italian ogni ("each/every") preserves Latin omnes. Other languages use cognates of totus for the same meaning; for example tutto in Italian, tudo/todo in Portuguese, todo in Spanish, tot in Catalan, tout in French and tot in Romanian.

Sometimes, a classical Latin word was kept alongside a Vulgar Latin word. In Vulgar Latin, classical caput, "head", yielded to testa (originally "pot") in some forms of western Romance, including French and Italian. But Italian, French and Catalan kept the Latin word under the form capo, chef, and cap which retained many metaphorical meanings of "head", including "boss". The Latin word with the original meaning is preserved in Romanian cap, together with țeastă, both meaning 'head' in the anatomical sense. Southern Italian dialects likewise preserve capo as the normal word for "head". Spanish and Portuguese have cabeza/cabeça, derived from *capetia, a modified form of caput, while in Portuguese testa was retained as the word for "forehead".

Frequently, words borrowed directly from literary Latin at some later date, rather than evolved within Vulgar Latin, are found side by side with the evolved form. The presence or absence of the expected phonetic developments is a clue as to whether a word has been borrowed. For example, Vulgar Latin fungus, "fungus, mushroom", which became Italian fungo, Catalan fong, and Portuguese fungo, became hongo in Spanish, showing the f > h shift that was common in early Spanish (cf. filius > Spanish hijo, "son", facere > Spanish hacer, "to do"). But Spanish also had fungo, which by its lack of the expected sound shift shows that it was borrowed directly from Latin.

Vulgar Latin contained a large number of words of foreign origin not present in literary texts. Many works on medicine were written and distributed in Greek, and words were often borrowed from these sources. For example, gamba ("knee joint"), originally a veterinary term only, replaced the classical Latin word for leg (crus) in most Romance languages (cf. Fr. jambe, It. gamba). Cooking terms were also often borrowed from Greek sources. One calque based on a Greek term was ficatum (iecur) ("goose's liver fattened with figs"), with the participle ficatum becoming the common word for liver in Vulgar Latin (cf. Sp. hígado, Fr. foie, Pt. fígado, It. fegato, Romanian ficat). Important religious terms were also drawn from religious texts written in Greek, such as episcopus (bishop), presbyter (priest), martyr etc. Words borrowed from Gaulish include caballus (horse) and carrus (chariot).

The Reichenau Glosses

Insight into the vocabulary changes of late Vulgar Latin in France can be seen in the Reichenau Glosses, written on the margins of a copy of the Vulgate Bible, suggesting that the 4th-century Vulgate words were no longer readily understood in the 8th century, when the glosses were likely written. These glosses demonstrate typical vocabulary changes in Gallo-Romance.

The Reichenau Glosses show vocabulary replacement:

  • arena > sabulo (French sable, Italian sabbia, "sand"; but cf. Spanish arena, Galician area, Portuguese areia, Italian rena)
  • canere > cantare (Portuguese/Galician/Spanish/Catalan cantar, French chanter, Italian cantare, Romanian cânta, "to sing", frequentative of canere)
  • mares (nom. mas) > masculi (French mâle, Italian maschio, Spanish macho, "male", diminutive of mas)
  • liberos > infantes (Catalan infants, "children"; French enfants, "children"; Italian infante, "infant"; Portuguese infante, "prince"; Spanish infante, "child" but as a literary word also "prince")
  • hiems > hibernus (French hiver, Italian inverno, Spanish invierno, Portuguese inverno, Catalan invern, Romanian iarnă, "winter", adjective of hiems)
  • forum > mercatum (French marché, Italian mercato, Spanish mercado; "market")
  • lamento > ploro (French pleurer, Spanish llorar, Portuguese chorar, Catalan plorar, "to weep")
  • ager > campus (French champ, Italian/Spanish/Portuguese campo, Romanian câmp)
  • caseum > formaticum (French fromage, Italian formaggio, Catalan formatge, "cheese", post-classical, from formare, "to form"; but cf. Italian cacio, Portuguese queijo, Spanish queso, Romanian caş)
  • flare > suflare (French souffler, Italian soffiare, Romanian sufla, Spanish soplar, "to blow", from flare with prefix sub)
  • ita > sic (Italian sì, Spanish sí, Portuguese sim, "yes")
  • pulcra > bella (French beau, Italian/Spanish bello, Portuguese belo, Catalan bell, "beautiful", diminutive of bonus, "good")
  • umo > terra (French terre, Italian/Portuguese terra, Spanish tierra, "ground")
  • lebes > chaldaria (French chaudière, Italian calderone, Spanish caldera, "cauldron", from calidus, "warm")
  • necetur > occidetur (Italian uccidere, Romanian ucide, "to kill")
  • pingues > grassi (French gros, Italian grasso, Romanian gras, "fat", post-classical, of uncertain origin)
  • ungues > ungulas (French ongle, Italian unghia, Spanish uña, Portuguese unha, Romanian unghie, Catalan ungla, "fingernail", diminutive of unguis)
  • vim > fortiam (French force, Italian forza, Spanish fuerza, Portuguese força, "force", post-classical, from fortis, "strong")
  • si vis > si voles (French tu veux, Italian tu vuoi, Catalan tu vols, "you want", 2nd person singular of *volere, "to want", regularized form of velle)
  • oppidis > civitatibus (French cité, Italian città, Portuguese cidade, Spanish ciudad, Catalan ciutat, Romanian cetate, "city")

Grammatical changes:

  • optimos > meliores (Portuguese melhor, Galician mellor, Spanish mejor, Catalan millor, French meilleur, Italian migliore, "best", originally "better"; but cf. Spanish óptimo, Portuguese ótimo, Italian ottimo, French optimal, with the sense of "excellent" or "optimal")
  • saniore > plus sano (French plus sain, Italian più sano)

Germanic loan words:

  • turbas > fulcos (French foule, Italian folla, "mob"; but cf. Spanish/Portuguese/Catalan turba)
  • cementariis > mationibus (French maçon, Spanish masón, "stonemason")
  • galea > helme (French heaume, Italian/Portuguese elmo, Catalan elm, Spanish yelmo, "helmet")
  • coturnix > quaccola (French caille, Italian quaglia, "quail"; but cf. Spanish/Portuguese codorniz)
  • furvus > brunus (French/Romanian brun, Catalan bru, Portuguese/Italian bruno, "brown/dark")

And words whose meaning has changed:

  • in ore (nom. os) > in bucca (Portuguese/Spanish/Catalan boca, French bouche, Italian bocca, "mouth", originally "cheek")
  • emit > comparavit (Italian comprare, Spanish comprar, Portuguese comprar, Romanian cumpăra, Catalan comprar, "to buy", originally "to arrange, settle")
  • rerum (nom. res) > causarum (French chose, Italian/Spanish/Catalan cosa, Portuguese coisa/cousa, "thing", originally "cause")
  • rostrum > beccus (French bec, Italian becco, Catalan bec, Spanish pico, Portuguese bico, "beak", post-classical, from Gaulish; but cf. Spanish/Galician rostro, and Portuguese rosto, "face")
  • femur > coxa (Portuguese, Galician and Old Spanish coxa, French cuisse, Italian coscia, Catalan cuixa, Romanian coapsă, "thigh", originally "hip", first attested in Silver Latin)


Evidence of changes

Evidence of phonological changes can be seen in the late 3rd century Appendix Probi, a collection of glosses prescribing correct classical Latin forms for certain vulgar forms. These glosses describe:

  • a process of syncope, the loss of unstressed vowels ("masculus non masclus");
  • the merger between long /e/ and short /i/ ("vinea non vinia");
  • the levelling of the distinction between /o/ and /u/ ("coluber non colober") and /e/ and /i/ ("dimidius non demedius");
  • regularization of irregular forms ("glis non glirus");
  • regularization and emphasis of gendered forms ("pauper mulier non paupera mulier");
  • levelling of the distinction between /b/ and /v/ between vowels ("bravium non brabium");
  • the substitution of diminutives for unmarked words ("auris non oricla, neptis non nepticla");
  • the loss of syllable-final nasals ("mensa non mesa") or their inappropriate insertion as a form of hypercorrection ("formosus non formunsus");
  • the loss of /h/, both initially ("hostiae non ostiae", although note that this is a hypercorrection) and within the word ("adhuc non aduc").

Many of the forms castigated in the Appendix Probi proved to be the productive forms in Romance; oricla (Classical Latin auricula) is the source of French oreille, Catalan orella, Spanish oreja, Italian orecchia, Romanian ureche, Portuguese orelha, "ear", not the Classical Latin form.
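
The "X non Y" pattern of these glosses lends itself to a simple tabulation. The following Python sketch is purely illustrative: the function names `parse_gloss` and `is_syncope` are hypothetical, the glosses are only the handful quoted above, and the syncope test is a crude orthographic check (one vowel letter deleted), not a phonological analysis.

```python
# Illustrative sketch only: tabulate a few Appendix Probi glosses of the
# form "prescribed non condemned". Names here are hypothetical helpers.

VOWELS = set("aeiou")


def parse_gloss(entry):
    """Split an 'X non Y' gloss into (prescribed, condemned)."""
    prescribed, sep, condemned = entry.partition(" non ")
    if not sep:
        raise ValueError(f"not an 'X non Y' gloss: {entry!r}")
    return prescribed.strip(), condemned.strip()


def is_syncope(prescribed, condemned):
    """True if deleting exactly one vowel letter from the prescribed
    form yields the condemned form (a rough spelling-level test)."""
    return any(
        ch in VOWELS and prescribed[:i] + prescribed[i + 1:] == condemned
        for i, ch in enumerate(prescribed)
    )


GLOSSES = [
    "masculus non masclus",
    "vinea non vinia",
    "auris non oricla",
    "mensa non mesa",
    "adhuc non aduc",
]

PAIRS = [parse_gloss(g) for g in GLOSSES]

# Look up the prescribed Classical form from the condemned spoken form.
CONDEMNED_TO_PRESCRIBED = {bad: good for good, bad in PAIRS}

for good, bad in PAIRS:
    label = "syncope" if is_syncope(good, bad) else "other change"
    print(f"{bad:10} (for {good}): {label}")
```

Of the five glosses listed, only masclus is classified as syncope by this check; mesa and aduc lose a consonant rather than a vowel, which is why they fall under "other change".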


Significant sound changes affected the consonants of Vulgar Latin:

  • Final -t, which occurred frequently in verb conjugations, and final -s, in nouns, were dropped.
  • The scansion in Latin poetry suggests that the letter -m may have been pronounced very softly in classical Latin, being either voiceless or merely a silent letter that marked the nasalisation of the vowel which preceded it. It continued, however, to be consistently written in the literary language. In Vulgar Latin, these nasal vowels disappeared completely (the nasal vowels of French and Portuguese developed from other sources).
  • Palatalisation ("softening") of Latin c, t and often g before the front vowels e and i was almost universal in Vulgar Latin. The initial results were likely affricates ([ts], [tʃ], [dz], [dʒ]), possibly after a palatal intermediate stage, but these eventually became plain fricatives in most languages. Thus Latin caelum (sky, heaven), pronounced [kailu(m)] with an initial [k], became Italian cielo, [tʃɛlo], Romanian cer [tʃer], Spanish cielo, [θjelo] or [sjelo], French ciel, [sjɛl], Catalan cel, [sɛɫ], and Portuguese céu, [sɛu]. The only Romance languages that were not affected were Dalmatian and some varieties of Sardinian.

Several other consonants were "softened", especially in intervocalic position (an instance of diachronic lenition):

  • Single voiceless plosives became voiced: -p-, -t-, -c- > -b-, -d-, -g-. In a few languages such as Spanish, these were further lenited to approximants, [β̞], [ð̞], [ɣ].
  • The plain sibilant -s- [s] was also voiced to [z] between vowels, although in many languages its spelling has not changed. (In Spanish, intervocalic [z] was later devoiced back to [s].)
  • The double plosives became single: -pp-, -tt-, -cc-, -bb-, -dd-, -gg- > -p-, -t-, -c-, -b-, -d-, -g- in most languages. Some languages of Italy have retained the distinction between double and single consonants, although they have also tended to add to the number of geminates. In French spelling, double consonants are merely etymological.
  • The double sibilant -ss- [sː] also became phonetically single [s], although in many languages its spelling has not changed.
  • The voiced labial consonants -b- and -v- (pronounced [b] and [w], respectively, in classical Latin) both shifted to the fricative [β] between vowels. This is clear from the orthography; in medieval times, the spelling u/v is often used for what had been a b in classical Latin, or the two spellings are used interchangeably. In many Romance languages (Italian, French, etc.), this fricative later developed into a [v] sound; but in others (Spanish, Catalan, etc.) it eventually merged back with [b] into a common phoneme.
  • The approximant j, which in Latin was merely an allophone of the vowel i, became a fricative, and in most languages split into an independent phoneme.
  • In Western Romance, an epenthetic vowel was inserted at the beginning of any word that began with s and another consonant: thus Latin spatha (sword) becomes Portuguese and Spanish espada, Catalan espasa, French épée. Italian preserved euphony rules by adding the epenthesis in the preceding article when necessary instead, so it preserves feminine spada as la spada, but changes the masculine *il spaghetto to lo spaghetto.
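
Three of the changes listed above (intervocalic voicing, simplification of double consonants, and the Western Romance prothetic vowel) can be sketched as ordered rewrite rules. The Python below is a toy model operating on plain spellings, not on phonemes; the function names are hypothetical, the rule order is an assumption (voicing is applied before degemination so that former geminates are not voiced), and palatalisation, vowel changes and all language-specific developments are ignored.

```python
import re

# Toy model, assuming plain lowercase Latin spelling as input.
VOICED = {"p": "b", "t": "d", "c": "g"}


def voice_intervocalic(word):
    """-p-, -t-, -c- > -b-, -d-, -g- when single and between vowels."""
    return re.sub(
        r"(?<=[aeiou])([ptc])(?=[aeiou])",
        lambda m: VOICED[m.group(1)],
        word,
    )


def degeminate(word):
    """Double plosives and -ss- become single."""
    return re.sub(r"([ptcbdgs])\1", r"\1", word)


def add_prothetic_vowel(word):
    """Western Romance: prefix e- to words starting with s + consonant."""
    return "e" + word if re.match(r"s[^aeiou]", word) else word


def lenite(word):
    # Voicing must precede degemination: otherwise -tt- > -t- would feed
    # the voicing rule and wrongly produce -d-.
    return add_prothetic_vowel(degeminate(voice_intervocalic(word)))


for latin in ["vita", "amica", "gutta", "strata"]:
    print(latin, ">", lenite(latin))
```

The outputs (vida, amiga, guta, estrada) are intermediate shapes produced by these three rules alone; actual Romance forms such as Spanish gota also reflect the vowel changes described in the following sections.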

Stressed vowels

Evolution of the stressed vowels in Vulgar Latin

Classical Acad.1 | Roman | Classical IPA | Vulgar IPA | Vulgar Acad.1
ă | short A | [a] | [a] | a
ā | long A | [aː] | [a] | a
ĕ | short E | [ɛ] | [ɛ] | ę
ē | long E | [eː] | [e] | ẹ
ĭ | short I | [ɪ] | [e] | ẹ
ī | long I | [iː] | [i] | i
ŏ | short O | [ɔ] | [ɔ] | ǫ
ō | long O | [oː] | [o] | ọ
ŭ | short V | [ʊ] | [o] | ọ
ū | long V | [uː] | [u] | u
y̆ | short Y | [y] > [ɪ] | [e] | ẹ
ȳ | long Y | [yː] > [iː] | [i] | y, i
æ | AE | [ai] > [ɛ] | [ɛ] | ę
œ | OE | [oi] > [e] | [e] | ẹ
au | AV | [au] | [au] > [o] | au, ọ

1 Traditional academic transcription in Latin and Romance studies, respectively.

One profound change that affected Vulgar Latin was the reorganisation of its vowel system. Classical Latin had five short vowels, ă, ĕ, ĭ, ŏ, ŭ, and five long vowels, ā, ē, ī, ō, ū, each of which was an individual phoneme (see the table above for their likely pronunciation in IPA), and three diphthongs, ae, oe, and au (four according to some authors, who include ui). There were also long and short versions of y, representing the rounded vowel [y(ː)] in Greek borrowings, which however probably came to be pronounced [i(ː)] even before the Romance vowel changes started.

There is evidence that in the imperial period all the short vowels except a differed by quality as well as by length from their long counterparts. So, for example, ē was pronounced close-mid /eː/ while ĕ was pronounced open-mid /ɛ/, and ī was pronounced close /iː/ while ĭ was pronounced near-close /ɪ/. The diphthongs ae and oe, pronounced /ai/ and /oi/ in earlier Latin, had also begun their monophthongisation to /ɛ/ and /e/, respectively. Oe was always a rare diphthong in Classical Latin, since Old Latin oi had regularly become ū, as in oinos > unus ("one").

As Vulgar Latin evolved, two main changes occurred in parallel. First, length distinctions were lost, so that for instance ă and ā came to be pronounced the same way. Second, the near-close vowels ĭ and ŭ became more open in most varieties of Vulgar Latin, merging with the long vowels ē and ō, respectively. As a result, Latin pĭra "pear" and vēra "true" came to rhyme in most of the daughter languages: Italian and Spanish pera, vera; Old French poire, voire. Similarly, Latin nŭcem (accusative singular of nux, "nut") and vōcem (accusative singular of vōx, "voice") became Italian noce, voce, Portuguese noz, voz, and French noix, voix. (In some cases the quality of the vowel later changed again, because of regularising tendencies or other extraneous influences.)

There must have been some regional variation in pronunciation, since the Eastern Romance languages and the Southern Romance languages evolved differently. In Sardinian, for instance, ĭ and ŭ became more close, merging with their long counterparts ī and ū. Apart from Sardinian, which preserved the position of the Classical Latin vowels but lost phonemic vowel length, what happened to Vulgar Latin can be summarized as in the table above. More precisely, these mergers happened in most of western Europe, yielding the seven-vowel system of Italo-Western Romance.

In general, though, the ten-vowel system of Classical Latin (not counting the Greek letter y), which relied on phonemic vowel length, was remodelled into one in which vowel length distinctions lost phonemic importance, and qualitative distinctions of height became more prominent. (Exceptions were Friulian, and some dialects of French, which have retained a contrast between long and short vowels.)
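
The mergers described above amount to two different mappings from the ten Classical vowels onto smaller inventories. The following Python sketch encodes them as lookup tables; the dictionary and function names are hypothetical, the Sardinian outcomes are written with the plain letters a, e, i, o, u as a simplification, and the helper assumes that only the stressed vowel of a word carries a length mark.

```python
# Sketch of the stressed-vowel mergers described in the text.
# Italo-Western: length lost; short i/u merge with long e/o.
ITALO_WESTERN = {
    "ă": "a", "ā": "a",
    "ĕ": "ɛ",
    "ē": "e", "ĭ": "e",
    "ī": "i",
    "ŏ": "ɔ",
    "ō": "o", "ŭ": "o",
    "ū": "u",
}

# Sardinian: length lost, but each short vowel merges with its own
# long counterpart (plain letters used here as a simplification).
SARDINIAN = {
    "ă": "a", "ā": "a",
    "ĕ": "e", "ē": "e",
    "ĭ": "i", "ī": "i",
    "ŏ": "o", "ō": "o",
    "ŭ": "u", "ū": "u",
}


def merge_vowels(word, system=ITALO_WESTERN):
    """Replace each length-marked vowel by its outcome; leave other
    letters untouched."""
    return "".join(system.get(ch, ch) for ch in word)


print(len(set(ITALO_WESTERN.values())))  # size of the Italo-Western inventory
print(len(set(SARDINIAN.values())))      # size of the Sardinian inventory
print(merge_vowels("pĭra"), merge_vowels("vēra"))
```

Under the Italo-Western mapping pĭra and vēra come out as pera and vera, which rhyme; under the Sardinian mapping they remain distinct (pira, vera).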

In Vulgar Latin, the stress on accented syllables became much more pronounced than in Classical Latin. This tended to cause unaccented syllables to become less distinct, while working further changes on the sounds of the accented syllables. The results of short o and e in stressed position proved to be unstable in several of the Romance languages, with a tendency to break up into diphthongs. Focus, "fireplace", became the general word in Vulgar Latin for "fire" (replacing ignis), but its short o sound became a diphthong — a different diphthong — in many languages:

  • Italian: fuoco
  • Spanish: fuego
  • French: feu (now no longer a diphthong but [fø])

In French and Italian, these changes occurred only in open syllables. Spanish, however, diphthongised in all circumstances, resulting in a simple five-vowel system in both stressed and unstressed syllables. Romanian shows diphthongisation of short e (fier from Latin ferrum, "iron") but not of short o (foc). In Portuguese, no diphthongisation occurred at all (ferro, fogo).

Some languages experienced further mergers, reducing the number of stressed vowels down from seven (to six in Romanian, to five in Sardinian and Spanish). On the other hand, later monophthongisations led to new vowel phonemes in some languages (such as [y], [œ], and [ø] in French), while nasalisation produced new phonemic nasal vowels in French and Portuguese.

Latin au was under some pressure to change in the Roman Republican period; a number of populist politicians adopted the spelling Clodius for the well-known Roman name Claudius, but this change was not universal, and it was marked as basilectal well into the early Empire. Au was initially retained, but was eventually reduced in many languages to [o]. (Portuguese evolved only as far as [ou] until much more recently; Occitan and Romanian preserve [au] to this day.) The results of Latin ae were also subject to at least some early variation; French proie (spoils) presumes [e] rather than [ɛ] from Classical Latin praeda.

Unstressed vowels

There was more variability in the result of the unstressed vowels. Two main paths can be distinguished:

  • Languages like Italian or Spanish have largely retained the system of five unstressed vowels of Vulgar Latin, with pronunciations still close to what they would have been in Classical Latin, except for length.
  • In French, Portuguese, or Occitan, there has been more instability, with unstressed vowels changing pronunciation significantly (unstressed o, a > [u], [ɔ] in Occitan; unstressed o, e > [u], [i]/[ɨ] in Portuguese), some being reduced to a kind of schwa (unstressed final a > e [ə] in French).

In Catalan, the process was similar to that of Portuguese in that the short Latin o turned into an open vowel, but short e eventually turned into a closed [e] in Western dialects (opposite to the pattern in the other Italo-Western languages), and a schwa in the Eastern ones. This schwa slowly evolved towards an open [ɛ], although in most of the Balearic Islands the schwa is maintained even today. Eastern dialects have some vocalic instability similar to that of Portuguese as well: unstressed [e] and [a] turn into a schwa (at some point of the evolution of the language, this change did not affect [e] in pre-stressed position, a pronunciation that can still be heard in part of the Balearics), and, except in most of Majorca, unstressed [o] and [u] merge into [u].


The Romance articles

It is difficult to date the rise of the definite article, which is absent in Latin but present in some form in all of the Romance languages, largely because the highly colloquial speech in which it arose was seldom written down until the daughter languages had strongly diverged. Most surviving texts in early Romance show the articles fully developed.

Definite articles were formerly demonstrative pronouns or adjectives; compare the fate of the Latin demonstrative adjective ille, illa, (illud) in the Romance languages, becoming French le and la, Catalan and Spanish el and la, and Italian il and la. The Portuguese articles o and a are ultimately from the same source. Sardinian went its own way here also, forming its article from ipse, ipsa (su, sa); some Catalan and Occitan dialects have articles from the same source. While most of the Romance languages put the article before the noun, Romanian puts the article after the noun, e.g. lupul ("the wolf") and omul ("the man"), from lupus ille and *homo ille, a result of its membership in the Balkan linguistic union.

This demonstrative is used in a number of contexts in some early texts in ways that suggest that the Latin demonstrative was losing its force. The Vetus Latina Bible contains a passage Est tamen ille dæmon sodalis peccati ("The devil is a companion of sin"), in a context that suggests that the word meant little more than an article. The need to translate sacred texts that were originally in Greek, which had a definite article, may have given Christian Latin an incentive to choose a substitute. Aetheria uses ipse similarly: per mediam vallem ipsam ("through the middle of the valley"), suggesting that it too was weakening in force.

Another indication of the weakening of the demonstratives can be inferred from the fact that at this time, legal and similar texts begin to swarm with prædictus, supradictus, and so forth (all meaning, essentially, "aforesaid"), which seem to mean little more than "this" or "that". Gregory of Tours writes, Erat autem ... beatissimus Anianus in supradicta civitate episcopus ("Blessed Anianus was bishop in that city."). The original Latin demonstrative adjectives were felt no longer to be specific enough. In less formal speech, reconstructed forms suggest that the inherited Latin demonstratives were made more forceful by being compounded with ecce (originally an interjection: "behold!"), which also spawned Italian ecco. This is the origin of Old French cil (*ecce ille), cist (*ecce iste) and ici (*ecce hic); Spanish aquel and Portuguese aquele (*eccu ille); Italian questo (*eccu iste), quello (*eccu ille) and obsolescent codesto (*eccu tibi iste); Spanish acá and Portuguese cá (*ecce hic), Portuguese acolá (*ecce illic) and aquém (*ecce inde); Romanian acest (*ecce iste) and acel (*ecce ille); and many other forms.

On the other hand, even in the Oaths of Strasbourg, no demonstrative appears even in places where one would clearly be called for in all the later languages (pro Deo amur, "for the love of God"). Using the demonstratives as articles may have still been too slangy for a royal oath in the 9th century. Considerable variation exists in all of the Romance vernaculars as to their actual use: in Romanian, the articles can be suffixed to the noun, as in other members of the Balkan linguistic union and the North Germanic languages.

The numeral unus, una (one) supplies the indefinite article in all cases. This is anticipated in Classical Latin; Cicero writes cum uno gladiatore nequissimo ("with a most immoral gladiator"). This suggests that unus was beginning to supplant quidam in the meaning of "a certain" or "some" by the 1st century BCE.

Gender: loss of the neuter

The genders
The three grammatical genders of Classical Latin were replaced by a two-gender system in most Romance languages. In Latin, gender is partly a matter of inflection, i.e. there are different declensional paradigms associated with the masculine, the feminine, and the neuter, and partly a matter of agreement, i.e. nouns of a certain gender require forms of the same gender in adjectives and pronouns associated with them.

The loss of final -s and -m led to a remodelling of the gender system. In Classical Latin, the endings -us and -um distinguished masculine from neuter nouns in the second declension; with both -s and -m gone, the neuters merged with the masculines, a process that is complete in Romance. By contrast, some neuter plurals such as gaudia, "joys", were re-analysed as feminine singulars. The loss of final -m seems to have begun by the time of the earliest monuments of the Latin language. The epitaph of Lucius Cornelius Scipio Barbatus, who was consul in 298 BCE, reads TAVRASIA CISAVNA SAMNIO CEPIT, which in Classical Latin would be Taurāsiam, Cisaunam, Samnium cēpit, "He captured Taurasia, Cisauna, and Samnium". (Note that in the Latin alphabet, the letters u and v, i and j were not distinguished until the early modern period. Upper-case u and j did not exist, while lower-case j and v were only graphic variations of i and u, respectively.)

The neuter gender of Classical Latin was in most cases absorbed by the masculine, both syntactically and morphologically. The syntactical confusion is already visible in the Pompeian graffiti, e.g. cadaver mortuus for cadaver mortuum "dead body" and hoc locum for hunc locum "this place" (-us was normally a masculine ending, and -um a neuter ending). The morphological confusion shows primarily in the adoption of the nominative ending -us (-Ø after -r) in the o-declension: in Petronius Arbiter, we find balneus for balneum "bath", fatus for fatum "fate", caelus for caelum "heaven", amphiteatrus for amphitheatrum "amphitheatre" and conversely the nominative thesaurum for thesaurus "treasure".

In modern Romance languages, the nominative s-ending has been abandoned and all substantives of the o-declension have an ending derived from -UM > -u/-o/-Ø: MURUM > Italian and Spanish muro, Catalan and French mur; CAELUM > Italian and Spanish cielo, French ciel, Catalan cel, Sardinian kelu. Old French still had -s in the nominative and no ending in the oblique case in both original genders (nominative murs, ciels; oblique mur, ciel).

For some neuter nouns of the third declension, the oblique stem was the productive form in Romance; for others, the nominative/accusative form, which was identical in Classical Latin, was the one that survived. Evidence suggests that the neuter gender was under pressure well back into the imperial period. French (le) lait, Catalan (la) llet, Spanish (la) leche, Portuguese (o) leite, Italian (il) latte, and Romanian lapte(le) ("milk") all derive from the non-standard but attested Latin nom./acc. neut. lacte or acc. masc. lactem. Note also that in Spanish and Catalan the word became feminine, while in French, Portuguese and Italian it became masculine (in Romanian it remained neuter, lapte/lăpturi). Other neuter forms, however, were preserved in Romance; Catalan and French nom, Portuguese nome, and Italian nome ("name") all preserve the Latin nominative/accusative nomen, rather than the oblique stem form *nominem (which nevertheless produced Spanish nombre).

Typical Italian endings
        Nouns                   Adj. & determiners
        sing.       plur.       sing.       plur.
m       giardino    giardini    buono       buoni
f       donna       donne       buona       buone
n       uovo        uova        buono       buone

Most neuter nouns had plural forms ending in -A or -IA; some of these were reanalysed as feminine singulars, such as gaudium ("joy"), plural gaudia; the plural form lies at the root of the French feminine singular (la) joie, as well as of Catalan and Occitan (la) joia (Italian la gioia is a borrowing from French). The same holds for lignum ("wood stick"), plural ligna, which gave rise to the Catalan feminine singular noun (la) llenya and Spanish (la) leña. Some Romance languages still have a special form derived from the ancient neuter plural which is treated grammatically as feminine: e.g. BRACCHIUM : BRACCHIA "arm(s)" > Italian (il) braccio : (le) braccia, Romanian braț(ul) : brațe(le). Cf. also Merovingian Latin ipsa animalia aliquas mortas fuerant.

Alternations such as l'uovo fresco ("the fresh egg") / le uova fresche ("the fresh eggs") in Italian are usually analysed as masculine in the singular and feminine in the plural, with an irregular plural in -a (heteroclisis). However, it is also consistent with their historical development to say that uovo is simply a regular neuter noun (< ovum, plural ova) and that the characteristic ending for words agreeing with these nouns is -o in the singular and -e in the plural. Thus, neuter nouns can arguably be said to persist in Italian, and also Romanian.

These formations were especially common when they could be used to avoid irregular forms. In Latin, the names of trees were usually feminine, but many were declined in the second declension paradigm, which was dominated by masculine or neuter nouns. Latin pirus ("pear tree"), a feminine noun with a masculine-looking ending, became masculine in Italian (il) pero and Romanian păr(ul); in French and Spanish it was replaced by the masculine derivations (le) poirier, (el) peral; and in Portuguese and Catalan by the feminine derivations (a) pereira, (la) perera. Fagus ("beech"), another feminine noun ending in -us, is preserved in some languages as a masculine, e.g. Romanian fag(ul) or Catalan (el) faig; other dialects have replaced it with its adjectival forms fageus or fagea ("made of beechwood"), whence Italian (il) faggio, Spanish (el) haya, and Portuguese (a) faia.

As usual, irregularities persisted longest in frequently used forms. From the fourth declension noun manus ("hand"), another feminine noun with the ending -us, Italian and Spanish derived (la) mano, Catalan (la) mà, and Portuguese (a) mão, which preserve the feminine gender along with the masculine appearance.

Except for the Italian and Romanian heteroclitic nouns, the other major Romance languages have no trace of neuter nouns, but all have vestigial, semantically neuter pronouns. French: celui-ci, celle-ci, ceci ("this"); Spanish: éste, ésta, esto ("this"); Italian: gli, le, ci ("to him", "to her", "to it"); Catalan: ho, açò, això, allò ("it", "this", "this/that", "that over there"); Portuguese: todo, toda, tudo ("all of him", "all of her", "all of it"); Venetian: 'sto qua, 'sta qua, questo (meaning "this") and qûeło là, qûeła là, queło=queła (meaning "that").

In Spanish, a three-way contrast is also made with the definite articles el, la, and lo. The last is used with nominalized adjectives and adverbs denoting abstract categories: lo bueno, literally "the good" or "that which is good", from bueno ("good"); lo importante, "that which is important". ¿Sabes lo tarde que es?, literally "Do you know 'the late' that it is?", or more idiomatically "Do you know how late it is?", from tarde ("late"). This is traditionally interpreted as the existence of a neuter gender in Spanish, although no morphological distinction is made anywhere else but in the singular definite article.

Some varieties of Astur-Leonese maintain distinct endings for the three genders, as follows: bonu, bona, bono ("good").

The loss of the noun case system

              Classical Latin    Vulgar Latin
Nominative:   rosa               rosa
Accusative:   rosam              rosa
Genitive:     rosae              rose
Dative:       rosae              rose
Ablative:     rosā               rosa

The sound changes that were occurring in Vulgar Latin made the noun case system of Classical Latin harder to sustain, and ultimately spelled doom for the system of Latin declensions. As a result of the untenability of the noun case system after these phonetic changes, vulgar Latin moved from being a markedly synthetic language to a more analytic language where word order is a necessary element of syntax. Consider what the loss of final /m/, the loss of phonemic vowel length, and the sound shift of ae from /ai/ to /ɛ/ entailed for a typical first declension noun (see table).
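The effect of these three changes on the paradigm can be traced mechanically. The following Python sketch is only an illustration: it works on spellings rather than phonemes, writes the outcome of ae as e, and all function and variable names are invented for this example.

```python
import unicodedata

# Classical first-declension singular of rosa, as spelled in the table.
CLASSICAL = {
    "nominative": "rosa",
    "accusative": "rosam",
    "genitive": "rosae",
    "dative": "rosae",
    "ablative": "rosā",
}


def drop_final_m(form: str) -> str:
    """Loss of final /m/: remove a word-final m."""
    return form[:-1] if form.endswith("m") else form


def drop_vowel_length(form: str) -> str:
    """Loss of phonemic vowel length: strip the macron from long vowels."""
    decomposed = unicodedata.normalize("NFD", form)
    without_macron = "".join(ch for ch in decomposed if ch != "\u0304")
    return unicodedata.normalize("NFC", without_macron)


def monophthongize_ae(form: str) -> str:
    """Shift of ae from /ai/ to /ɛ/, written here simply as e."""
    return form.replace("ae", "e")


def to_vulgar(form: str) -> str:
    """Apply the three changes to one spelled form."""
    for change in (drop_final_m, drop_vowel_length, monophthongize_ae):
        form = change(form)
    return form


VULGAR = {case: to_vulgar(form) for case, form in CLASSICAL.items()}
DISTINCT_FORMS = sorted(set(VULGAR.values()))

if __name__ == "__main__":
    for case, form in VULGAR.items():
        print(f"{case}: {CLASSICAL[case]} > {form}")
    print("distinct forms left:", DISTINCT_FORMS)
```

Five case slots collapse into two distinct forms, which is why word order and prepositions had to take over the work the endings once did.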

The complete elimination of case happened only gradually. Old French still maintained a nominative/oblique distinction (called cas-sujet/cas-régime); this disappeared in the course of the 12th or 13th century, depending on the dialect. Old Occitan also maintained a similar distinction, as did many of the Rhaeto-Romance languages until only a few hundred years ago. Romanian still preserves a separate genitive/dative case along with vestiges of a vocative case.

The distinction between singular and plural was marked in two ways in the Romance languages. North and west of the La Spezia-Rimini line, which runs through northern Italy, the singular was usually distinguished from the plural by means of final -s, which was present in the old accusative plurals in masculine and feminine nouns of all declensions. South and east of the La Spezia-Rimini Line, the distinction was marked by changes of final vowels, as in contemporary standard Italian and Romanian. This preserves and generalizes distinctions that were marked on the nominative plurals of the first and second declensions.

Prepositions multiply

Loss of a productive noun case system meant that the syntactic purposes it formerly served now had to be performed by prepositions and other paraphrases. These particles increased in number, and many new ones were formed by compounding old ones. The descendant Romance languages are full of grammatical particles such as Spanish donde, "where", from Latin de + unde; French dès, "since", from de + ex (the equivalent Spanish and Portuguese desde is de + ex + de); and French dans, "in", from de intus, "from the inside". Spanish después and Portuguese depois, "after", represent de + ex + post. Some of these new compounds appear in literary texts during the late empire; French dehors, Spanish de fuera and Portuguese de fora ("outside") all represent de + foris (Romanian afară represents ad + foris), and we find St Jerome writing si quis de foris venerit ("if anyone comes from outside").

As Latin was losing its case system, prepositions started to move in to fill the void. In colloquial Latin, the preposition ad followed by the accusative was sometimes used as a substitute for the dative case.

Classical Latin:

Jacōbus patrĭ librum dat. "James is giving his father a/the book."

Vulgar Latin:

´Jacọmọs ´lẹvrọ a ´patre dat. "James is giving a/the book to his father."

Just as with the disappearing dative case, colloquial Latin sometimes replaced the disappearing genitive case with the preposition de followed by the ablative.

Classical Latin:

Jacōbus mihi librum patris dat. "James is giving me his father's book."

Vulgar Latin:

´Jacọmọs mẹ ´lẹvrọ dẹ ´patre dat. "James is giving me the book of (belonging to) his father."


´Jacọmọs ´lẹvrọ dẹ ´patre a ´mẹ dat. "James is giving the book of (belonging to) his father to me."


Classical Latin had a number of different suffixes that made adverbs from adjectives: carus, "dear", formed care, "dearly"; acer, "fierce", formed acriter, "fiercely"; creber, "frequent", formed crebro, "often". All of these derivational suffixes were lost in Vulgar Latin, where adverbs were invariably formed by a feminine ablative adjective modifying mente, which was originally the ablative of mens ("mind"), so that the phrase meant "with a _____ mind". Thus velox ("quick"), instead of velociter ("quickly"), gave veloce mente (originally "with a quick mind", "quick-mindedly"). This originally separate word became a suffix in Romance, which explains the widespread rule for forming adverbs in many Romance languages: add the suffix -ment(e) to the feminine form of the adjective. The change was well under way as early as the 1st century BCE, and the construction appears several times in Catullus, for example in Catullus 8, line 11: sed obstinata mente perfer, obdura, "but carry on obstinately [obstinate-mindedly]: get over it!"
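The Romance rule can be sketched in a few lines of Python, here for Spanish. This is a deliberately simplified illustration: it assumes that adjectives ending in -o form their feminine in -a and that all other adjectives are invariable, which ignores several adjective classes, and the function names are invented for the example.

```python
def feminine_singular(adjective: str) -> str:
    """Simplified Spanish rule: -o becomes -a; other adjectives are left unchanged."""
    if adjective.endswith("o"):
        return adjective[:-1] + "a"
    return adjective


def adverb_in_mente(adjective: str) -> str:
    """Attach -mente to the feminine singular form of the adjective."""
    return feminine_singular(adjective) + "mente"


if __name__ == "__main__":
    for adjective in ("lento", "veloz", "obstinado"):
        print(adjective, ">", adverb_in_mente(adjective))
```

The feminine form is required because mente was a feminine noun, so the adjective once had to agree with it.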


The verb forms were much less affected by the phonetic losses that eroded the noun case systems; indeed, an active verb in Spanish (or other modern Romance language) will still strongly resemble its Latin ancestor. One factor that gave the system of verb inflections more staying power was the fact that the strong stress accent of Vulgar Latin, replacing the light stress accent of Classical Latin, frequently caused different syllables to be stressed in different conjugated forms of a verb. As such, although the word forms continued to evolve phonetically, the distinctions among the conjugated forms did not erode (much).

For example, in Latin the words for "I love" and "we love" were, respectively, amō and amāmus. Because a stressed A gave rise to a diphthong in some environments in Old French, that daughter language had (j')aime for the former and (nous) amons for the latter. Though several phonemes have been lost in each case, the different stress patterns helped to preserve distinctions between them, if perhaps at the expense of irregularising the verb. Regularising influences have countered this effect in some cases (the modern French form is nous aimons), but some modern verbs have preserved the irregularity, such as je viens ("I come") versus nous venons ("we come").

Another set of changes already underway by the 1st century CE was the loss of certain final consonants. A graffito at Pompeii reads quisque ama valia, which in Classical Latin would read quisquis amat valeat ("may whoever loves be strong/do well"). In the perfect tense, many languages generalized the -aui ending most frequently found in the first conjugation. This led to an unusual development; phonetically, the ending was treated as the diphthong /au/ rather than containing a semivowel /awi/, and the /w/ sound was in many cases dropped; it did not participate in the sound shift from /w/ to /β̞/. Thus Latin amaui, amauit ("I loved; he/she loved") in many areas became proto-Romance *amai and *amaut, yielding for example Portuguese amei, amou. This suggests that in the spoken language, these changes in conjugation preceded the loss of /w/.

Another major systemic change was to the future tense, remodelled in Vulgar Latin with auxiliary verbs. This may have been due to phonetic merger of intervocalic /b/ and /w/, which caused future tense forms such as amabit to become identical to perfect tense forms such as amauit, introducing unacceptable ambiguity. A new future was originally formed with the auxiliary verb habere, *amare habeo, literally "to love I have". This was contracted into a new future suffix in Western Romance forms which can be seen in the following modern examples of "I will love":

  • French: j'aimerai < aimer ["to love"] + ai ["I have"].
  • Portuguese and Galician: amarei < amar ["to love"] + [h]ei ["I have"].
  • Spanish and Catalan: amaré < amar ["to love"] + [h]e ["I have"].
  • Italian: amerò < amare ["to love"] + [h]o ["I have"].
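As a rough illustration of how transparent the fusion still is, the Python sketch below concatenates each infinitive with the first-person form of "to have" (dropping the silent h) and compares the result with the attested future. The function name and the data layout are invented for this example; only the word forms come from the list above.

```python
def naive_fusion(infinitive: str, auxiliary: str) -> str:
    """Concatenate infinitive and auxiliary, dropping a silent initial h."""
    if auxiliary.startswith("h"):
        auxiliary = auxiliary[1:]
    return infinitive + auxiliary


# (language, infinitive, "I have", attested "I will love")
EXAMPLES = [
    ("French", "aimer", "ai", "aimerai"),
    ("Portuguese", "amar", "hei", "amarei"),
    ("Spanish", "amar", "he", "amaré"),
    ("Italian", "amare", "ho", "amerò"),
]

# Languages where plain concatenation already yields the modern spelling.
EXACT_MATCHES = [
    language
    for language, infinitive, auxiliary, attested in EXAMPLES
    if naive_fusion(infinitive, auxiliary) == attested
]

if __name__ == "__main__":
    for language, infinitive, auxiliary, attested in EXAMPLES:
        fused = naive_fusion(infinitive, auxiliary)
        print(f"{language}: {infinitive} + {auxiliary} = {fused} (attested: {attested})")
```

In this sketch French and Portuguese come out exactly, Spanish differs only by the written accent, and Italian shows further changes to the stem vowel and the ending.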

An innovative conditional (distinct from the subjunctive) also developed in the same way (infinitive + conjugated form of habere). The fact that the future and conditional endings were originally independent words is still evident in Portuguese, which in these tenses allows clitic object pronouns to be incorporated as infixes between the root of the verb and its ending: "I will love" (eu) amarei, but "I will love you" amar-te-ei, from amar + te ["you"] + (eu) hei = amar + te + [h]ei = amar-te-ei.
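The Portuguese pattern can be modelled by splitting the future form back into infinitive and ending and placing the pronoun between them. The sketch below is a minimal illustration with an invented function name; it assumes a regular verb whose future form begins with the full infinitive and does not handle irregular stems.

```python
def insert_clitic(future_form: str, infinitive: str, pronoun: str) -> str:
    """Place a clitic pronoun between the infinitive and the future ending."""
    if not future_form.startswith(infinitive):
        raise ValueError("future form is not built on the full infinitive")
    ending = future_form[len(infinitive):]
    return f"{infinitive}-{pronoun}-{ending}"


if __name__ == "__main__":
    # amarei ("I will love") with te ("you")
    print(insert_clitic("amarei", "amar", "te"))
```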

In contrast to the millennia-long continuity of much of the active verb system, the synthetic passive voice was entirely lost in Romance. It was replaced with periphrastic verb forms (composed of the verb "to be" plus a passive participle) or impersonal reflexive forms (composed of a verb and a passivizing pronoun).

Apart from the grammatical and phonetic developments, there were many cases of verbs merging, as complex distinctions in Latin were reduced to simpler sets of verbs in Romance. A classic example is the set of verbs expressing the concept "to go". Consider three Classical Latin verbs expressing concepts of "going": ire, vadere, and ambulare. In Spanish and Portuguese, ire and vadere merged into the verb ir, which derives some conjugated forms from ire and some from vadere, while andar was maintained as a separate verb derived from ambulare. Italian instead merged vadere and ambulare into the verb andare. At the extreme, French merged all three Latin verbs, with, for example, the present tense deriving from vadere and ambulare and the future tense deriving from ire. Similarly, the Romance distinction between the two verbs for "to be", essere and stare, was lost in French as these merged into the verb être.


The copula (that is, the verb signifying "to be") of Classical Latin was esse. This evolved to *essere in Vulgar Latin by attaching the common infinitive suffix -re to the classical infinitive; this produced Italian essere and French être through Proto-Gallo-Romance *essre and Old French estre, as well as Spanish and Portuguese ser (Romanian a fi derives from fieri, which means "to become"). However, in Vulgar Latin a second copula developed using the verb stare, which originally meant (and is cognate with) "to stand", to denote a more temporary meaning. That is, *essere signified the essence, while stare signified the state. Stare evolved to Spanish and Portuguese estar and Old French ester (both through *estare), while Italian retained the original form.

The semantic shift that underlies this evolution is more or less as follows. A speaker of Classical Latin might have said (hypothetically, since Classical Latin was almost entirely restricted to writing and formal rhetoric): vir est in foro, meaning "the man is at the marketplace". The same sentence in Vulgar Latin would have been *(h)omo stat in foru, "the man stands at the marketplace", replacing est (from esse) with stat (from stare), because "standing" was perceived as what the man was actually doing. The use of stare in this case was still literally correct, assuming that it meant "to stand", but soon the shift from essere to stare became more widespread, and in the end essere denoted only natural qualities that would not change. (It might be objected that in sentences like Spanish la catedral está en la ciudad, "the cathedral is in the city", the location is also unlikely to change; but all locations are expressed through estar in Spanish, as this usage originally conveyed the sense of "the cathedral stands in the city".)

In French, the evolved forms of the two verbs, estre and ester, merged in the late Middle Ages. As the "s" disappeared from words beginning in est-, the two became Modern French être and an obscure form *éter, which eventually merged.

