In these diagrams the location of a branch is determined by the average location of its descending branches. This is done for the period between "formed" and "TMRCA" in yfull. This is useful for periods of expanding branches. It is less useful for narrow branches with a recent migration (e.g. J2a in western Europe). I will use the term "time maps" on several pages.
In the maps below you can see the evolution of a small branch I-Y56203 (I2a). In the yfull dataset it has 7 samples. One sample has no origin, so it is not used for these time maps.
The two samples below I-Y134582 have a shared origin and therefore a shared location at 175ybp. So the two samples share a location between 60 and 175ybp, at the location of SRB [RS-28].
Below I-Y134578 we have three locations for descendants: samples YF71371, YF15014 and the ancestor I-Y134582. We position the branch I-Y134578 at the average of these three descendant locations. This position applies at 1050ybp, the TMRCA of I-Y134578. The three descending branches are interpolated on the map between the location of I-Y134578 at 1050ybp and the arrival locations and times (YF71371, 60ybp, SRB [RS-17]; YF15014, 60ybp, MNE [ME-11] and I-Y134582, 175ybp, location of I-Y134582).
Below I-FT25907 we have two locations for descendants: samples YF72419 and YF19234. The location (at time 1100ybp) is determined as the average location of the two samples. The two descending branches are interpolated on the map between the location of I-FT25907 at 1100ybp and the locations of the descending samples at 60ybp (YF72419, [BA-SRP] and YF18234 [ME-04]).
Below I-Y56203 we have two descending branches: I-FT25907 and I-Y134578. The location (at time 1600ybp) is determined as the average location of the descending branches. The descending branches are interpolated on the map.
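The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, under two assumptions that are not stated on this page: that "average location" means the plain arithmetic mean of latitude and longitude, and that the interpolation between a branch and its descendant is linear in time. The names Point, mean_position and interpolate, and all coordinates, are hypothetical placeholders and not the real positions of the samples.

```python
from dataclasses import dataclass


@dataclass
class Point:
    lat: float   # latitude in degrees
    lon: float   # longitude in degrees
    ybp: float   # years before present at which this position applies


def mean_position(children, ybp):
    """Place a branch at the arithmetic mean of its descendants' coordinates
    (assumption: a plain mean of latitude and longitude)."""
    lat = sum(c.lat for c in children) / len(children)
    lon = sum(c.lon for c in children) / len(children)
    return Point(lat, lon, ybp)


def interpolate(parent, child, ybp):
    """Position on the line from parent (older) to child (younger) at time ybp
    (assumption: linear interpolation in time)."""
    if not child.ybp <= ybp <= parent.ybp:
        raise ValueError("ybp lies outside the time span of this branch")
    if parent.ybp == child.ybp:
        return Point(child.lat, child.lon, ybp)
    fraction = (parent.ybp - ybp) / (parent.ybp - child.ybp)
    return Point(
        parent.lat + fraction * (child.lat - parent.lat),
        parent.lon + fraction * (child.lon - parent.lon),
        ybp,
    )


# Two living samples (60ybp) with made-up coordinates.
sample_a = Point(44.0, 18.0, 60)
sample_b = Point(42.0, 20.0, 60)

# Their common branch, placed at its TMRCA of 1100ybp.
branch = mean_position([sample_a, sample_b], ybp=1100)

# Position of the line towards sample_a halfway in time (580ybp).
halfway = interpolate(branch, sample_a, ybp=580)
print(branch, halfway)
```

For larger trees the same two functions would be applied from the samples upwards: first all branches that only have samples below them, then their parents, and so on.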
The data was collected from the archived yfull website (9.04). The samples indicated with "new" were removed, since many of them were not yet fully positioned in the tree. Some recent samples were also removed, since they are the result of recent distant migrations, which are not of interest on this website. The selection is reported in the paragraph on Artifacts. Had they been kept in the dataset, these recent distant migrations would have added too much noise and disrupted the pattern of the old branches.
In case an ancient sample was reported with C14 age information, the time period ends at the age of the sample. For kits where living people were measured, the value 60ybp was used (as is the case on the yfull website). The best value of the time estimate was used. In case the coverage of samples is insufficient and yfull does not report a time estimate (e.g. 1000 genomes project), the number of SNPs is used to determine a minimal time length of the branch, using 80 years per SNP (see NGS Statistics). In four cases this resulted in negative time estimates. For the location I used a central position in the country in case only the country was given. In case a province was reported, the central position of the province was used. The countries and positions can be seen on the Positions page.
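The rules in this paragraph can be illustrated with a short Python sketch. It only covers the three rules as they are stated here (end time, minimal branch length, position lookup) and does not show how they are combined in the actual processing. All function names, the dictionaries and the coordinates in them are hypothetical placeholders.

```python
YEARS_PER_SNP = 80   # from the text: 80 years per SNP
LIVING_YBP = 60      # from the text: value used for living people


def end_time_ybp(c14_age_ybp=None):
    """End of the time period of a sample: the C14 age for an ancient
    sample, otherwise the fixed value for living people."""
    return LIVING_YBP if c14_age_ybp is None else c14_age_ybp


def minimal_branch_length(n_snps):
    """Minimal time length of a branch when no time estimate is reported."""
    return n_snps * YEARS_PER_SNP


# Placeholder lookup tables (latitude, longitude); not the real Positions page.
COUNTRY_CENTRES = {"XX": (10.0, 20.0)}
PROVINCE_CENTRES = {"XX-01": (11.0, 21.0)}


def resolve_position(country, province=None):
    """Use the centre of the province when one is reported,
    otherwise fall back to the centre of the country."""
    if province is not None and province in PROVINCE_CENTRES:
        return PROVINCE_CENTRES[province]
    return COUNTRY_CENTRES[country]


print(end_time_ybp())               # living person
print(end_time_ybp(3500))           # ancient sample with a C14 age
print(minimal_branch_length(5))     # branch with 5 SNPs
print(resolve_position("XX", "XX-01"))
print(resolve_position("XX"))
```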
Artifacts
The technique used may produce several artifacts.
In some regions and cultures the number of tested persons is very high, while it is low in others. The effect can be seen in the rich parts of Arabia. Several lines originate elsewhere, but because few people were tested elsewhere, the maps suggest that these lines have been in Arabia for a long period, while in reality the origin was elsewhere and a few people migrated in a later period.
Some lines are narrow and long. This means that the moment of migration is highly uncertain. This is probably the case for several migrations of old narrow branches (e.g. E-V22, which originated in Egypt) and for a migration in the Roman period with descendants in Western Europe.
The location of an originating branch can be incorrect if the number of tested people in a specific region is very small. Fairly few people with E-V22 were measured in Africa, although it is the location of their founding father. With the present dataset, this method gives an incorrect location for their origin.
The location of an originating branch can be incorrect if the migrations went in one direction. This can be seen in the E-PF2431 branch. Descending branches went to Europe and were tested. Because of the Sahara barrier, no (or very few) migrations went east or south, while the descendants of the migration to the north, e.g. in Roman times, are present in the tests.
Recent migrations for which no genealogical origin is reported might influence the pattern. This is a real phenomenon, but it disrupts the visual patterns from an earlier period. Therefore these data were omitted, where possible. This includes:
Non-Q members in America
Members outside C, O, M and S in Australia and Polynesia
European R1b, R1a, I1, I2 and G (recent European branches) from Southern Africa, Australia and the Philippines.
Haplogroup G in Indonesia (which is a relatively recent result of Yemenite merchants).
Jewish branches from jewishdna.net and publications of Behar et al.
Peculiarities that were probably caused by errors or recent migrations: US-SC in GBR, Chinese Q in Peru. NEO247 was omitted since the required software correction was too time-consuming.
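The haplogroup and region rules in the list above could be expressed as a simple filter. The sketch below is only an illustration and not the selection code that was actually used. It assumes that each sample carries a haplogroup label such as "R1b" or "I2a" and a broad region name; these labels, the region names and the function is_omitted are hypothetical, since the yfull data use branch names and country or province codes. The Jewish branches and the individual peculiarities are not covered.

```python
# Regions where only certain haplogroups are kept.
ALLOWED_PREFIXES_BY_REGION = {
    "America": ("Q",),
    "Australia": ("C", "O", "M", "S"),
    "Polynesia": ("C", "O", "M", "S"),
}

# Regions where certain (recent) haplogroups are dropped.
RECENT_EUROPEAN = ("R1b", "R1a", "I1", "I2", "G")
EXCLUDED_PREFIXES_BY_REGION = {
    "Southern Africa": RECENT_EUROPEAN,
    "Philippines": RECENT_EUROPEAN,
    "Indonesia": ("G",),
}


def is_omitted(haplogroup, region):
    """Return True when a sample falls under one of the omission rules."""
    allowed = ALLOWED_PREFIXES_BY_REGION.get(region)
    if allowed is not None and not haplogroup.startswith(allowed):
        return True
    # Australia is already handled by the allowed-list rule above.
    excluded = EXCLUDED_PREFIXES_BY_REGION.get(region, ())
    return haplogroup.startswith(excluded)


samples = [("R1b", "America"), ("Q", "America"), ("G", "Indonesia"), ("I2a", "Serbia")]
kept = [s for s in samples if not is_omitted(*s)]
print(kept)
```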
The position of samples has a limited accuracy, since only the province is indicated. This means that the end points in the time maps fall on a limited number of positions (61ybp). As a consequence, maps for moments close to this time (e.g. 100 or 200ybp) show a starlike pattern.
Some groups have members with a traceable heritage from a distant location. They should be removed. Examples are Lemba, Syrian Christians in India.
The accuracy of location is limited. The resolution is limited by two elements: the options in yfull, which result in provinces, and the accuracy that the reporter gives. In e.g. India (with the largest population in the world) fewer than 20 provinces can be selected in yfull. Some reporters give no information, some give a country and some give a province. In some countries a scientific publication of data can add significantly to the number of samples but, due to privacy limitations, only the country is reported. This will have a systematic effect on the location of branches within a country. A large Turkish research dataset has this characteristic.
I first created the diagrams to better understand the early period of Indo-Europeans in Anatolia. If one uses the diagrams critically, they can also be used for other questions.