R1a and R1b branches - Indo-European languages

Introduction

The relation between migration and the Indo-European languages has been known for several years. On this page I analyse the main Y-DNA branches that are related to these languages. The Time Maps show the presence of branches in different time periods. The histogram shows the population sizes of these branches, both in absolute numbers and in relative percentages. Population jumps are often visible in the histogram with relative percentages.

Below I discuss the following languages and branches in separate sections:

link to phylogenetic tree: R-PF7562
link to phylogenetic tree: R-Z2103
link to phylogenetic tree: R-Z2118
link to phylogenetic tree: R-A8053
link to phylogenetic tree: R-S1194
link to phylogenetic tree: R-U106
link to phylogenetic tree: R-P312
link to phylogenetic tree: R1a
link to phylogenetic tree: R-Z283
link to phylogenetic tree: R-Z93

R1b - R-L754 - R-P297 - R-M269 - R-L23 - R-Z2103
R1b - R-L754 - R-P297 - R-M269 - R-L23 - R-L51 - R-Z2118
R1b - R-L754 - R-P297 - R-M269 - R-L23 - R-L51 - R-L52 - R-L151 - R-A8053
R1b - R-L754 - R-P297 - R-M269 - R-L23 - R-L51 - R-L52 - R-L151 - R-S1194
R1b - R-L754 - R-P297 - R-M269 - R-L23 - R-L51 - R-L52 - R-L151 - R-U106
R1b - R-L754 - R-P297 - R-M269 - R-L23 - R-L51 - R-L52 - R-L151 - R-P312
R1a
R1a - R-M459 - R-M198 - R-M417 - R-Z645 - R-Z283
R1a - R-M459 - R-M198 - R-M417 - R-Z645 - R-Z93

link to phylogenetic tree: R-L238
link to phylogenetic tree: R-DF99
link to phylogenetic tree: R-DF19
link to phylogenetic tree: R-DF27
link to phylogenetic tree: R-S461
link to phylogenetic tree: R-U152

yfull-links: R-PF6538 - R-L151 - R-P312 - R-L238
yfull-links: R-PF6538 - R-L151 - R-P312 - R-DF99
yfull-links: R-PF6538 - R-L151 - R-P312 - R-DF19
yfull-links: R-PF6538 - R-L151 - R-P312 - R-DF27
yfull-links: R-PF6538 - R-L151 - R-P312 - R-S461
yfull-links: R-PF6538 - R-L151 - R-P312 - R-U152

yfull-links: R-Z2103
yfull-links: R-Y482 - R1 - R1b - R-L754 - R-L389 - R-P297 - R-M269 - R-L23 - R-L51 - R-Z2118
yfull-links: R-PF6538 - R-L151 - R-A8053
yfull-links: R-PF6538 - R-L151 - R-S1194
yfull-links: R-PF6538 - R-L151 - R-U106
yfull-links: R-PF6538 - R-L151 - R-P312
yfull-links: R1a
yfull-links: R1a - R-M459 - R-M735 - R-M198 - R-M417 - R-Z645 - R-Z283
yfull-links: R1a - R-M459 - R-M735 - R-M198 - R-M417 - R-Z645 - R-Z93

Figure caption: The histograms above show the total number of branches in each time period. The resolution is chosen to be 100 years, close to the time of one SNP mutation (144 years in yfull or about 80 years in most tests). If sufficient statistics are available, one can see the changes in chronological order. The first increase is that of R1b-Z2103, which brought the Tocharian language and the Indo-European languages of Anatolia (Hittite) and the Balkans: the increase of cyan between 5300 and 5000 ybp. The next step was a migration of R-Z2118 to Italy (red). Next we see the increase of R1a-Z283 in eastern Europe (light blue). This branch also had a much later second expansion in the Slavic period (2000 ybp). The next expanding branch was R1b-S1194 (grey) in western Europe, followed by the Germanic R-U106 branch and then the R-P312 branch. Within the P312 family (diagram above) one can see that the R1b-DF27 and R1b-U152 branches (Spain and Italy) expanded earlier than R1b-L21 (in R-S461) in Britain. The expansion of R1a-Z93, which led to the Median and Indic languages, was probably slightly later than the arrivals in Europe, and its spread over such a large area may have taken longer.
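
The binning behind such a histogram can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: each sub-branch is represented as a (parent branch, formation age in ybp) pair, the "relative percentage" is taken to be the share of each parent branch within one 100-year bin, and the function names `bin_branches` and `relative_shares` as well as the sample ages are hypothetical, not taken from yfull.

```python
from collections import defaultdict

BIN_WIDTH = 100  # years; the resolution named in the figure caption


def bin_branches(records, bin_width=BIN_WIDTH):
    """Count sub-branches per time bin and parent branch.

    records: iterable of (parent_branch, age_ybp) pairs (assumed format).
    Returns {bin_start_ybp: {parent_branch: count}}.
    """
    bins = defaultdict(lambda: defaultdict(int))
    for branch, age in records:
        start = (age // bin_width) * bin_width  # lower edge of the bin
        bins[start][branch] += 1
    return {start: dict(counts) for start, counts in bins.items()}


def relative_shares(bins):
    """Convert absolute counts into percentages of each bin's total."""
    shares = {}
    for start, counts in bins.items():
        total = sum(counts.values())
        shares[start] = {b: 100.0 * n / total for b, n in counts.items()}
    return shares


# Hypothetical sample data, for illustration only.
sample = [
    ("R-Z2103", 5300), ("R-Z2103", 5250), ("R-Z2103", 5210),
    ("R-Z2118", 5240), ("R-Z283", 4900), ("R-U106", 4850),
]

absolute = bin_branches(sample)
relative = relative_shares(absolute)

for start in sorted(absolute, reverse=True):  # oldest bin first
    print(f"{start + BIN_WIDTH - 1}-{start} ybp", absolute[start], relative[start])
```

With real data, a population jump would show up as a bin in which one parent branch suddenly takes a much larger share than in the preceding bins.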

Origin

The Indo-Europeans had their Y-DNA origin in the Eurasian Steppe. They had a strong paternal line, and they brought their Y-DNA and the Indo-European languages over a large geographical area. In this map you can see the region of the Eurasian Steppe (image from Wikipedia).

In the histogram you can see the sequence of the distribution: R1b-Z2103, R1b-Z2118 (Hittite and Tocharian), R1a-Z283 (eastern Europe), R1b-U106 (northern Europe), R1b-P312 (western Europe) and R1a-Z93 (India). The population in eastern Europe was initially lower and grew with the expansion of the Slavic languages (2000 ybp).

It is quite possible that R-PF7562 was also an Indo-European branch. R-PF7562 and R1b-L23 are the two branches below R-M269 (no branch split is known between 13300 and 5700 ybp).

Horse domestication (and the related horse burial) was thought by many to be a component in the spread of the Indo-European peoples. Recently it became clear that the shared ancestor of the domesticated horse lived about 2000-2200 BCE. Apparently the domesticated horse spread across many tribes as a cultural phenomenon, and did not play a role in the migration of this branch to Europe. Another characteristic of the Indo-European tribes was probably the concept of social inequality. This kinship-based social inequality was found in different groups of early Indo-Europeans, and probably originated in the steppe. Evidence for it is found in ancient DNA from Bronze Age Europe (Germany), in the origin of the Indian caste populations and in the laws of historic Greece. Three levels were present in the Gortyn law code in historic Greece: free, slave, foreign. Two varnas were present in the Early Vedic period (1500-1000 BCE) in India.

Hittite

The oldest documented Indo-European language is Hittite in Anatolia. The known documentation is from a somewhat later period than the arrival of the Y-DNA of R1b-Z2103 and R1b-Z2118. One can see the population growth in the Time Maps and histogram just before 5250 ybp.

Tocharian language

The Tocharian language was spoken in the border region of north-west China and Russia, so east of the R1a descendants. The ancient Y-DNA found in the Afanasievo culture (2900-2650 BCE) shows that R1b-M269 migrated to the north-eastern side of the Satem languages. These ancient samples from Khakassia and Altay (Russia) and Batys (Kazakhstan) in R-Z2103 are reported in version 9.0 of yfull. Tocharian belongs to the Centum language group (see below).

Centum and Satem languages

Speculating the position of the Baltic languages

Three Baltic languages are spoken today: Lithuanian, Latvian and Latgalian.

For a long time, two possible relations of the Baltic languages have been mentioned in linguistic research:

On the Slavic page I suggest that the Slavic language expansion is a result of the expansion of the I-Y3120 branch. This means that its origin is likely in the western part of the regions where Slavic languages are presently spoken. I therefore suspect, as was previously suggested by Oleg Balanovsky, that the similarities between the Slavic and Baltic languages are the result of a pre-Slavic substrate in East Slavic that consists of a Baltic-related language, see Baltic languages.

Relation between R1a, R1b and Centum and Satem languages

The distribution of the Centum and Satem languages looks very similar to the distribution of the R1b-M269 and R1a branches. This fits for most of the branches. For the Slavic languages it fits if we follow the above suggestion (Slavic is an expansion of an I-Y3120 Satem language into a Baltic (Centum) substrate, so the second option above). Using Y-DNA genetics, the easiest scenario for the Centum (R1b) and Satem (R1a) division is that the R1a branches spoke a different dialect of PIE than the R1b-M269 branches. After both migrations the languages in the regions changed strongly, and only a few characteristics of these early dialects are still found. These might explain the similarities between the Baltic languages and Sanskrit (Satem) on one side, and the other Indo-European languages (R1b) on the other side.

The difference between R1b-M269 and R1a fits with the following division:

Greek is generally classified as Centum, which suggests that the originating E-V13 ancestor learned the Indo-European language in a Centum language region, which would fit with an Anatolian or European origin.

The classification of Albanian and Armenian is debated and less clear. The Y-DNA haplogroups do not help in this respect.

The relation also holds for the Tocharian language, despite its large geographic distance from the other Centum languages.

Difference in historic context

The difference between the two groups (Centum=R1b-M269 and Satem=R1a) could reflect two different dialects of PIE in different parts of the Eurasian Steppe. The most likely scenario would be that the R1b ancestor lived further west, and possibly earlier, than the R1a ancestor.

Among the steppe cultures, several that probably influenced each other are candidates for the ancestors of these branches. Sredny Stog (on the Ukrainian side), Maykop (on the Caucasus side) and Yamna (on the Don-Volga side) are all candidates for the ancestral lines. Sredny Stog seems the easiest fit for the Centum=R1b-M269 lineage, given its location relative to the Anatolian Indo-Europeans and the origin of the domestication of horses. Yamna seems an easier fit for the Satem=R1a lineage, since R1a was found in the ancient DNA of these peoples. A YouTube presentation on the Indo-Europeans shows that a lot of ancient DNA data has been found and that the distribution was probably much more complex than a simple grouping into two or three cultures. A difference between two dialects, one spoken by the R1b-M269 branches and one by the R1a branches, seems likely.

Hellenic and Albanian languages

The languages in the environment near 0 CE

Greek and Albanian are two old Indo-European languages in Europe, parallel to the Celtic, Germanic, Slavic, Baltic and Romance languages. In the Roman and Greek period several other languages were still present in Europe, but we have limited knowledge of the languages spoken. Little is known about the Pre-Indo-European languages. Basque is still spoken, and there are reports of the Indo-European languages Illyrian, Dacian and Thracian. Inscriptions are found in the Tartessian, Iberian, Etruscan and Rhaetian languages. Small numbers of inscriptions are found in Sicani and North Picene, and in Lemnian and Minoan on the islands in the Mediterranean Sea. The disappeared languages often have only tiny pieces of script (duplicated from other scripts) and the knowledge is extremely limited. Language specialists have debated whether these languages were Celtic, otherwise Indo-European, or isolates. There is hardly any consensus among the language specialists on these old languages. The only consensus seems to be that many languages were spoken, and that at least part of them were Indo-European. Some have also suggested a shared origin of Pre-Indo-European languages in a language family (Tyrrhenian), including Rhaetic, Etruscan and Lemnian. Again there is no consensus among specialists. Most language specialists consider Etruscan and Basque to be non-Indo-European. At least one person (Gianfranco Forni) wrote a scientific article and a book arguing that Basque is an Indo-European language, as well as a draft article on Etruscan being Indo-European. This would fit more easily with the R1b-DF27 (e.g. R-Z214) and R1b-L2 characteristics in the Basque and Etruscan populations respectively. In ancient DNA from the Etruscan region only R1b (in the majority) and G (in the minority) were found in Posth et al. (2021).

A first scenario for Greek and Albanian

The population growth of E-V13 (E-L618) fits with a population jump in Europe, close in time to the other Indo-European arrivals. There is one difference with the other branches: it would require that the ancestor of E-V13 spoke Indo-European and had an Indo-European culture. Given the mixing of populations, I would suggest that Anatolia is the easiest location for this exchange of Y-DNA and language, which would fit with the influence of the early Hittite language in Anatolia. Alternatives (the Levant or the Balkans) are more complex as an explanation.

The pattern for J2b-L283 is similar to that of E-V13: again a population growth like that of the Indo-European groups in Europe, but a Y-DNA outside the R1a and R1b branches. Also in this case Anatolia is probably the easiest area for the mix between Indo-European culture (and language) and J2b Y-DNA. Albanian is sometimes classified as Satem, and sometimes it is argued that it should have a special history. If the J2 branch learned the language from an R1b-Z2103 Hittite line, it seems reasonable that it has both Satem characteristics and some characteristics of the language of the original J2b lineage.

Note: the oldest population jump of J2b-L283 and E-V13 is the jump of E-CTS1273, which is at 4500 ybp.

A second scenario for Greek and Albanian

A second scenario could be related to the R1b-Z2110 branch (descendant of R-Z2103). This branch is part of the R1b-M269 branch that brought the Indo-European languages to many areas. In the time maps one can see that this branch arrived near 5300 ybp in the Balkan-Italy area and expanded (it is in the population jump list at 5300). In this period other branches had also arrived in this area (e.g. E-V13 and J2b-L283). We have seen that many different haplogroups had already mixed in the Middle East for a longer period, so this is not unreasonable. All three Y-DNA branches expanded. At the moment the two largest Y-DNA branches are E-V13 and J2b-L283, and the two remaining Indo-European languages in this region are Albanian and Greek. In this scenario R-Z2110 brought the Indo-European language to the Balkan area, and E-V13 and J2b-L283 also arrived in the Balkans and started to use the Indo-European language that was brought by the R-Z2110 branch. Notice that we find only seven J-L283 samples in Anatolia, the oldest of which might have arrived 3100 ybp, and all arrived from the west. The distribution of E-V13 is similar, but slightly less significant. If these branches had spent a substantial period in Anatolia to pick up the language and then migrated to the Balkans, one would expect to find descending lines in Anatolia. In R1b-Z2103 Z2110+ we see the same pattern as in J2b-L283 and E-V13 (almost no branch descendants in Anatolia), while R1b-Z2103 Z2110- has older branches in Anatolia (and Armenia).

Greek

The Greek language can be traced from a shared language in Mycenae (using the Linear B script) before the 14th century BCE to the later Greek city-states (such as Athens and Sparta), the Empire of Alexander the Great and modern Greek. Cultural continuity is reported from 3200 BCE to the later Greek period. The first population jump that might be related is the Z2110 branch split at 5300 ybp. Later branch splits of the descending branches R-Y18959 (4900 ybp) and R-Y19469 (3800 ybp) could also have contributed to the spread, as well as J2b-L283 and E-V13, which were also present and both have population jumps in this later period in the same region.

Albanian

The history of Albanian is much more recent. A discussion is found on the Albania page, where it is argued that the R-Z2705 branch (subbranch of R-Z2103) was important in the history of the Albanian language.

Preferred scenario

The second scenario (E-V13 and J-L283 adopted the language in the Balkans, where it had been brought by R1b branches) seems easier than the first scenario (where E-V13 and J-L283 adopted the language in Anatolia).

Armenian

Armenian has been argued to be a Satem language, but also a Centum language. If we follow the above idea (Y-DNA branches indicate language spread), it would fit if Armenian was a Centum language (R-Z2103) that was heavily influenced by the Iranian Satem language, as has been suggested several times.

Scandinavian branch

The I1 branch had its fast expansion in Scandinavia. Two scenarios have been suggested for the origin of I1. In the first, I1 was in western Europe when the Indo-Europeans arrived. In the second, I1 was in the western part of the Eurasian Steppe, e.g. in the region of present-day Ukraine. In both cases its carriers were an early part of the Indo-European tribe and part of the Indo-European expansion. Their fast growth (I-DF29) was slightly later than the R1b-U106 expansion.

Median language

The Median language probably arrived in the same migration as the one that led to the Indic languages, about 4000 ybp. The present countries of Iran, Afghanistan, Pakistan and India were populated by R1a-Z93 people in this period.

Indic languages

The Indic languages probably arrived in the same migration as the one that led to the Median language, about 4000 ybp. The present countries of Iran, Afghanistan, Pakistan and India were populated by R1a-Z93 people in this period.