Proto-Indo-European language
The Proto-Indo-European language (PIE) is the hypothetical common ancestor of the Indo-European languages. It is believed to have been spoken in the 4th millennium BC on the Pontic-Caspian steppe (according to the Kurgan hypothesis), or as early as the 7th millennium BC in Anatolia (according to the Anatolian hypothesis). The existence of such a language is generally accepted by linguists, though many specific details remain debated.
History
The formative phase of the field falls in the 18th and early 19th centuries, culminating in Franz Bopp's Comparative Grammar of 1833. The classical phase of Indo-European comparative linguistics leads from Bopp to August Schleicher's 1861 Compendium and up to Karl Brugmann's Grundriss, published from the 1880s. Brugmann's junggrammatische (Neogrammarian) re-evaluation of the field and Ferdinand de Saussure's development of the laryngeal theory may be considered the beginning of "contemporary" Indo-European studies. The Indo-European proto-language as described in the early 1900s is, in its main aspects, still accepted today. The work done in the 20th century has consisted of refinement and systematization, together with the incorporation into the Indo-European framework of new language material, notably the Anatolian and Tocharian branches, which were unknown in the 19th century.
Notably, the laryngeal theory, discussed in its early forms since the 1880s, became mainstream after Jerzy Kuryłowicz's 1927 discovery that at least some of these hypothetical phonemes survive in Anatolian.
Julius Pokorny published his Indogermanisches Etymologisches Wörterbuch in 1959, giving an overview of the lexical knowledge accumulated up to the early 20th century, but neglecting then-recent trends in morphology and phonology and largely ignoring Anatolian and Tocharian.
The generation of Indo-Europeanists active in the last third of the 20th century, such as Calvert Watkins, Jochem Schindler and Helmut Rix, developed a better understanding of morphology and, in the wake of Kuryłowicz's 1956 Apophonie, of ablaut. From the 1960s, knowledge of Anatolian became certain enough to allow it to influence the image of the proto-language (see also Indo-Hittite).
Method
There is no direct evidence of PIE, because it was never written down. All PIE sounds and words are reconstructed from later Indo-European languages using the comparative method and the method of internal reconstruction. The asterisk is used to mark reconstructed PIE words, such as * "water", * "dog", or * "three (masculine)". Many of the words in the modern Indo-European languages seem to have derived from such "protowords" via regular sound changes (e.g., Grimm's law).
As the Proto-Indo-European language broke up, its sound system diverged as well, according to various sound laws in the daughter languages. Notable among these are Grimm's law and Verner's law in Proto-Germanic, loss of prevocalic *p- in Proto-Celtic, loss of prevocalic *s- in Proto-Greek, and Brugmann's law in Proto-Indo-Iranian. Grassmann's law and Bartholomae's law may or may not already have operated in common Indo-European.
Relationship to other language families
Many higher-level relationships between PIE and other language families have been proposed. Because of the great time depths involved, these proposals necessarily rest on a great deal of speculation and are very controversial. Perhaps the most widely accepted proposal is that of an Indo-Uralic family, encompassing PIE and Uralic. The evidence usually cited in its favor is the proximity of the proposed Urheimaten of the two families, the typological similarity between them, and a number of apparent shared morphemes. Frederik Kortlandt, while advocating a connection, concedes that "the gap between Uralic and Indo-European is huge", while Lyle Campbell, an authority on Uralic, denies that any relationship exists.
Other proposals, reaching further back in time (and correspondingly less accepted), model PIE as a branch of Indo-Uralic with a Caucasian substratum; link PIE and Uralic with Altaic and certain other families in Asia, such as Korean, Japanese, Chukotko-Kamchatkan and Eskimo-Aleut (representative proposals are Nostratic and Joseph Greenberg's Eurasiatic); or link some or all of these to Afro-Asiatic, Dravidian, etc., and ultimately to a single Proto-World family (nowadays mostly associated with Merritt Ruhlen). Various other proposals, met with varying levels of skepticism, join some subset of the putative Eurasiatic language families and/or some of the Caucasian language families, such as Uralo-Siberian, Ural-Altaic (once widely accepted but now largely discredited), and Proto-Pontic.
Proto-Indo-European is conjectured to have used the following phonemes. See Indo-European languages for a summary of how these sounds evolved in the various Indo-European languages.
Consonants
The table gives the most common notation in modern publications. Variant transcriptions are given below. Raised stands for aspiration. According to the glottalic theory, the "voiced stops" of the system as described above were glottalic, perhaps ejectives, while the "voiced aspirated stops" may not have been voiced.
* Proto-Celtic, Proto-Balto-Slavic, Albanian, and Proto-Iranian merged the voiced aspirated series with the plain voiced series . (However, Proto-Celtic did not merge - the first became g while the second became b).
* Proto-Germanic underwent Grimm's law, changing voiceless stops into fricatives, devoicing unaspirated voiced stops, and de-aspirating voiced aspirates.
* Grassmann's law ( > , e.g. > ) and Bartholomae's law ( > , e.g. > ) describe the behaviour of aspirates in particular contexts in some early daughter languages.
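The three-way shift attributed to Grimm's law above can be sketched as a simple lookup table. This is a minimal illustration, not a full model of Proto-Germanic: the segment inventory, the `GRIMM_SHIFT` table and the `apply_grimm` helper are assumptions introduced here, words are given as pre-segmented lists, vowels are left untouched, and exceptions such as Verner's law are ignored.

```python
# Hypothetical sketch of Grimm's law as a one-step consonant mapping.
# All names here are illustrative and do not come from the article.
GRIMM_SHIFT = {
    # voiceless stops become voiceless fricatives
    "p": "f", "t": "θ", "k": "x", "kʷ": "xʷ",
    # unaspirated voiced stops are devoiced
    "b": "p", "d": "t", "g": "k", "gʷ": "kʷ",
    # voiced aspirates lose their aspiration
    "bʰ": "b", "dʰ": "d", "gʰ": "g", "gʷʰ": "gʷ",
}


def apply_grimm(segments):
    """Apply the shift once to each segment; other segments pass through."""
    return [GRIMM_SHIFT.get(seg, seg) for seg in segments]


if __name__ == "__main__":
    # Schematic inputs: only the consonant outcomes are meaningful.
    print(apply_grimm(["p", "e", "d"]))
    print(apply_grimm(["bʰ", "e", "r"]))
```

Because every segment is looked up exactly once, the output of one rule (for example b > p) is not fed into another (p > f), which mirrors the chain-shift character of the law.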
Labials
, grouped with the cover symbol P. was a very rare phoneme, which is one argument in favor of the glottalic theory: languages that have ejective stops tend not to have an ejective labial stop .
Coronals/Dentals
The standard reconstruction identified three coronal/dental stops: . They are symbolically grouped with the cover symbol T.
Some theorists conclude that consonant clusters of the form TK would undergo a metathesis in the proto-language, resulting in , compare Hittite dagan "earth" with Greek khthōn "earth", from , from earlier ; Hittite hartagas "monster", Greek arktos "bear" from from earlier . Both metathetized and unmetathetized forms survive in different ablaut grades of the root "burn" (cognate to dagaz, day) in Sanskrit, "is being burnt" < and "burns" < .
Dorsals
Direct comparison, informed by the Centum-Satem isogloss, yields the reconstruction of three rows of dorsal consonants in PIE.
*Palatovelars, (also transcribed or or ). These were - or -like sounds which underwent a characteristic change in the Satem languages; they were possibly palatalized velars (, ) in Proto-Indo-European.
*Pure velars, .
*Labiovelars, (also transcribed ). Raised stands for labialization, or lip-rounding accompanying the articulation of velar sounds ( is a sound similar to English qu in queen).
The centum group of languages merged the palatovelars with the plain velars , while the satem group of languages merged the labiovelars with the plain velars .
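The two mergers just described can be written out as a small function. This is only a sketch under stated assumptions: the series labels and the `reflex_series` helper are hypothetical names, and the function models nothing beyond the merger itself (for instance, it does not model the characteristic change that the palatovelars underwent in the satem languages).

```python
# Hypothetical labels for the three reconstructed dorsal series.
PALATOVELAR = "palatovelar"
PLAIN_VELAR = "plain velar"
LABIOVELAR = "labiovelar"


def reflex_series(series, group):
    """Return the series a PIE dorsal falls into after the group's merger.

    group is "centum" or "satem"; any other value raises ValueError.
    """
    if group == "centum":
        merged = PALATOVELAR   # centum: palatovelars join the plain velars
    elif group == "satem":
        merged = LABIOVELAR    # satem: labiovelars join the plain velars
    else:
        raise ValueError(f"unknown group: {group!r}")
    return PLAIN_VELAR if series == merged else series
```

Either way, each group is left with two contrasting dorsal series instead of the three reconstructed for the proto-language.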
The existence of the plain velars as phonemes separate from the palatovelars and labiovelars has been disputed. In most circumstances they appear to be allophones resulting from the neutralization of the other two series in particular phonetic circumstances. It is difficult to pinpoint exactly what the circumstances of the allophony are, although it is generally accepted that neutralization occurred after and , and often before . Most PIE linguists believe that all three series were distinct by late Proto-Indo-European, although a minority, including Frederik Kortlandt, believe that the plain velar series was a later development of certain satem languages; this view was originally articulated by Antoine Meillet in 1894. Those who support a threefold distinction in PIE cite evidence from Albanian (Holger Pedersen, KZ 36 (1900) 277-340; Norbert Jokl, Mélanges linguistiques offerts à M. Holger Pedersen (1937) 127-161) and Armenian (Vittore Pisani, Ricerche Linguistiche 1 (1950) 165ff.) that these languages treated plain velars differently from the labiovelars in at least some circumstances, as well as the fact that Luwian apparently has distinct reflexes of all three series: * > z (probably ); * > k; * > ku (probably ) (Craig Melchert, Studies in Memory of Warren Cowgill (1987) 182-204). Kortlandt, however, disputes the significance of this evidence (Recent developments in historical phonology (1978) 237-243). Ultimately, this dispute may be irresoluble: analogical developments tend to quickly obscure the original distribution of allophonic variants that have been phonemicized, and the time frame is too great and the evidence too meager to draw definite conclusions as to when exactly this phonemicization happened.
Fricatives
(with the voiced allophone ). The "laryngeals" may have been fricatives, but there is no consensus as to their phonetic realization. There were also fricative allophones of , usually transcribed .
Laryngeals
The symbols , with cover symbol (or and ), stand for three hypothetical "laryngeal" phonemes. There is no consensus as to what these phonemes were, but it is widely accepted that was probably uvular or pharyngeal, and that was labialized. Commonly cited possibilities are and ; there is some evidence that may have been two consonants, and , that fell together.
The schwa indogermanicum symbol is commonly used for a laryngeal between consonants.
Nasals and Liquids
, with vocalic allophones , grouped with the cover symbol R.
Semivowels
(also transcribed ) with vocalic allophones .
Vowels
* Short vowels
* Long vowels ; a colon (:) is sometimes employed to indicate vowel length instead of the macron sign (a:, e:, o:).
* Diphthongs
* Vocalic allophones of consonantal phonemes: .
Other long vowels may have appeared already in the proto-language by compensatory lengthening: .
It is often suggested that all sounds (short and long) were earlier derived from an preceded or followed by , but Mayrhofer (1986: 170 ff.) has argued that PIE did in fact have and phonemes independent of .
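The colon notation for vowel length mentioned above can be converted mechanically to macron notation. The sketch below assumes plain lowercase vowel letters followed by an ASCII colon; `colon_to_macron` is a hypothetical helper name, and only Python's standard `re` and `unicodedata` modules are used.

```python
import re
import unicodedata

COMBINING_MACRON = "\u0304"


def colon_to_macron(text):
    """Rewrite a:, e:, i:, o:, u: as the corresponding macron vowels."""
    def lengthen(match):
        # Compose vowel + combining macron into a single precomposed letter.
        return unicodedata.normalize("NFC", match.group(1) + COMBINING_MACRON)
    return re.sub(r"([aeiou]):", lengthen, text)


if __name__ == "__main__":
    print(colon_to_macron("a:, e:, o:"))
```

A colon that does not directly follow one of the listed vowels is left unchanged, so ordinary punctuation in running text is not affected unless it happens to follow a vowel letter.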
All Indo-European languages are inflected languages (although many modern Indo-European languages, including Modern English, have lost much of their inflection). Comparative reconstruction makes it quite likely that at least the latest stage of the common PIE mother language (Late PIE) was an inflectional language that was more suffixing than prefixing. However, on the basis of internal reconstruction and morphological (re-)analysis of the reconstructed, seemingly most ancient PIE word forms, it has been argued to be very probable that at a more distant stage (Early PIE) the language was root-inflected, as was Proto-Semitic; that is, PIE may once have been of the root-and-pattern morphological type. [Pooth (2004): "Ablaut und autosegmentale Morphologie: Theorie der uridg. Wurzelflexion", in: Arbeitstagung "Indogermanistik, Germanistik, Linguistik" in Jena, Sept. 2002.]
Ablaut
Indo-European had a characteristic general ablaut sequence that contrasted the vowel phonemes through the same root.
Noun
Nouns were declined for eight cases (nominative, accusative, genitive, dative, instrumental, ablative, locative, vocative). There were three genders: masculine, feminine, and neuter.
There are two major types of declension, thematic and athematic. Thematic nominal stems are formed with a suffix -o- (in the vocative -e) and the stem does not undergo ablaut. The athematic stems are more archaic, and they are classified further by their ablaut behaviour (acro-dynamic, protero-dynamic, hystero-dynamic and holo-dynamic, after the positioning of the early PIE accent (dynamis) in the paradigm).
Case endings:
Pronoun
PIE pronouns are difficult to reconstruct due to their variety in later languages. This is especially the case for demonstrative pronouns.
PIE had personal pronouns in the first and second person, but not the third person, where demonstratives were used instead. The personal pronouns had their own unique forms and endings, and some had two distinct stems; this is most obvious in the first person singular, where the two stems are still preserved in English I and me. According to Beekes (1995), there were also two varieties for the accusative, genitive and dative cases, a stressed and an enclitic form.
As for demonstratives, Beekes (1995) tentatively reconstructs a system with only two pronouns: "this, that" and "the (just named)" (anaphoric). He also postulates three adverbial particles "here", "there" and "away, again", from which demonstratives were constructed in various later languages.
There was also an interrogative/indefinite pronoun with the stem (adjectival ), and probably a relative pronoun with the stem . A third-person reflexive pronoun (acc.), (gen.), (dat.), parallel to the first and second person singular personal pronouns, also existed, as well as possessive pronominal adjectives.
PIE had a separate set of endings for pronouns; many of these were later borrowed as nominal endings.
Verb
The Indo-European verb system is complex and exhibits a system of ablaut, as is still visible in the Germanic languages (among others). For example, the vowel in the English verb to sing varies according to the conjugation of the verb: sing, sang, and sung.
The system is clearly represented in Ancient Greek and Vedic Sanskrit, two of the most completely attested of the early daughter languages of Proto-Indo-European.
Verbs have at least four moods (indicative, imperative, subjunctive and optative, as well as possibly the injunctive, reconstructible from Vedic Sanskrit), two voices (active and mediopassive), three persons (first, second and third) and three numbers (singular, dual and plural). Verbs are conjugated in at least three "tenses" (present, aorist, and perfect), which actually have primarily aspectual value. Indicative forms of the imperfect and (less likely) the pluperfect may have existed. Verbs were also marked by a highly developed system of participles, one for each combination of tense and voice, and an assorted array of verbal nouns and adjectival formations.
A number of secondary forms could be created, such as the causative, intensive and desiderative; technically these were part of the derivational system rather than the inflectional system, as they existed only for certain verbs and did not necessarily have completely predictable meanings (compare the remnants of causative constructions in English: to fall vs. to fell, to sit vs. to set, to rise vs. to raise and to rear). The above-mentioned verbal nouns and adjectives were likewise part of the derivational system (compare the formation of verbal nouns in English, using -tion, -ence, -al, etc.), and it appears that the same originally applied to the different verb tenses.
Some verbs in Ancient Greek still have perfect tenses with unpredictable meanings: from histēmi "I set, I cause to stand", hestēka "I am standing"; from mimnēiskō "I remind", memnēmai "I remember"; from peithō "I persuade", pepoitha "I trust" as well as pepeika "I have persuaded"; from phūō "I produce", pephūka "I am (by nature)". The present tense in Ancient Greek and in Sanskrit is formed by the unpredictable addition of one of a number of suffixes (at least 10 in Sanskrit; at least 6 in Greek) to the verbal root; the aorist and perfect are likewise formed, each from its own set of suffixes (7 for the Sanskrit aorist, at least 3 for the Greek aorist), with little or no relation between the suffixes used in one tense and in another. (The perfect tense in Latin is likewise unpredictable, formed in one of at least six ways.) Sometimes more than one suffix can be applied to the same root, producing different present, aorist and/or perfect stems for the same verb, sometimes with the same meaning, sometimes with different meanings (see the above example with the Greek verb peithō).
All of this suggests that the various tenses were originally independent lexical formations, much as verbal nouns in English are formed unpredictably from different suffixes, sometimes with two or more formations that may differ in meaning: reference vs. referral, transference vs. transferral vs. transfer, recitation vs. recital, delivery vs. deliverance, etc. (This is more understandable if one considers that the original meaning of these tenses was aspectual.) Only later, and gradually, were these various forms combined into a single set of inflectional paradigms. Vedic Sanskrit had still not completed the process, and even Ancient Greek has places where the old unorganized system still shows through. (As a result, verbs in Vedic Sanskrit have the appearance at first glance of a fantastically complex and disorganized system, with numerous redundancies combined with inexplicable holes. The system of PIE must have looked even more strongly like this.)
The primary distinction in verbs between the different ways of forming the present tenses was between thematic () classes, with a "thematic" vowel or before the endings, and athematic () classes, with endings added directly to the root. The endings themselves differed somewhat, at the very least in the first-person singular, with the endings as indicated ( vs. ). Traditional accounts say that this is the only form where the endings differed, except for the presence or absence of the thematic vowel; but some newer researchers, e.g. Beekes (1995), have proposed a totally different set of thematic endings, based primarily on Greek and Lithuanian. These proposals are still controversial, however.
| | | Buck 1933 | | Beekes 1995 | |
| | | Athematic | Thematic | Athematic | Thematic |
| Singular | 1st | | | | |
| | 2nd | | | | |
| | 3rd | | | | |
| Plural | 1st | | | | |
| | 2nd | | | | |
| | 3rd | | | | |
The original meanings of the past tenses (aorist, perfect and imperfect) are often assumed to match their meanings in Greek. That is, the aorist represents a single action in the past, viewed as a discrete event; the imperfect represents a repeated past action or a past action viewed as extending over time, with the focus on some point in the middle of the action; and the perfect represents a present state resulting from a past action. This corresponds, approximately, to the English distinction between "I ate", "I was eating" and "I have eaten", respectively. (Note that the English "I have eaten" often has the meaning, or at least the strong implication, of "I am in the state resulting from having eaten", in other words "I am now full". Similarly, "I have sent the letter" means approximately "The letter is now (in the state of having been) sent". However, the Greek, and presumably PIE, perfect more strongly emphasizes the state resulting from an action, rather than the action itself, and can shade into a present tense.)
Note that in Greek the difference between the present, aorist and perfect tenses when used outside of the indicative (that is, in the subjunctive, optative, imperative, infinitive and participles) is almost entirely one of grammatical aspect, not of tense. That is, the aorist refers to a simple action, the present to an ongoing action, and the perfect to a state resulting from a previous action. An aorist infinitive or imperative, for example, does not refer to a past action, and in fact for many verbs (e.g. "kill") would likely be more common than a present infinitive or imperative. (In some participial constructions, however, an aorist participle can have either a tensal or aspectual meaning.) It is assumed that this distinction of aspect was the original significance of the PIE "tenses", rather than any actual tense distinction, and that tense distinctions were originally indicated by means of adverbs, as in Chinese. However, it appears that by late PIE, the different tenses had already acquired a tensal meaning in particular contexts, as in Greek, and in later Indo-European languages this became dominant.
The meanings of the three tenses in the oldest Vedic Sanskrit, however, differ somewhat from their meanings in Greek, and thus it is not clear whether the PIE meanings corresponded exactly to the Greek meanings. In particular, the Vedic imperfect had a meaning that was close to the Greek aorist, and the Vedic aorist had a meaning that was close to the Greek perfect. Meanwhile, the Vedic perfect was often indistinguishable from a present tense (Whitney 1924). In the moods other than the indicative, the present, aorist and perfect were almost indistinguishable from each other. (The lack of semantic distinction between different grammatical forms in a literary language often indicates that some of these forms no longer existed in the spoken language of the time. In fact, in Classical Sanskrit, the subjunctive dropped out, as did all tenses of the optative and imperative other than the present; meanwhile, in the indicative the imperfect, aorist and perfect became largely interchangeable, and in later Classical Sanskrit, all three could be freely replaced by a participial construction. All of these developments appear to reflect changes in spoken Middle Indo-Aryan; among the past tenses, for example, only the aorist survived into early Middle Indo-Aryan, which was later displaced by a participial past tense.)
Numbers
The numbers are generally reconstructed as follows:
| | Sihler 1995, 402-24 | Beekes 1995, 212-16 |
| one | * | * |
| two | * | * |
| three | * (full grade)/* (zero grade) | * |
| four | * (o-grade)/* (zero grade), see also the | * |
| five | * | * |
| six | *; originally perhaps * | * |
| seven | * | * |
| eight | *, * or *, * | * |
| nine | * | * |
| ten | * | * |
| twenty | *; originally perhaps * | * |
| thirty | *; originally perhaps * | * |
| forty | *; originally perhaps * | * |
| fifty | *; originally perhaps * | * |
| sixty | *; originally perhaps * | * |
| seventy | *; originally perhaps * | * |
| eighty | *; originally perhaps * | * |
| ninety | *; originally perhaps * | * |
| hundred | *; originally perhaps * | * |
| thousand | *, * | * |