Saturday, February 18, 2017

Spain, Past and Present

A lot of changes arose in Spanish mtDNA after the Neolithic. mtDNA sequenced from Iron age Spain indicates those changes had occurred by circa 300 BC.

Modern and ancient Spanish mHG frequencies are included in the following spreadsheets, along with mHG frequencies of other ancient and modern Europeans:
Spain
mHG Frequencies
JT, N1, U5

For convenience, here is a picture comparing modern and ancient Spanish mHG frequencies. mHGs with significantly different frequencies in modern and ancient Spain are highlighted. Notice that Iron age Spain is similar to modern Spain.

The two mHGs which differ in frequency the most are H and K.

Neolithic Spain had a whopping 27-30% K and an unimpressive 20-25% H. Modern and Iron age Spain has/had a whopping 40-45% H and an unimpressive 7% K.

mHG frequencies in modern Spain are basically indistinguishable from mHG frequencies in most of modern Europe. mHG frequencies in Neolithic Spain were basically indistinguishable from mHG frequencies in Neolithic Germany and Hungary.

The same mtDNA changes which occurred in Spain after the Neolithic occurred in much of Europe. I think a mixture of migration from “Asia” (including, mostly, “Eastern Europe”) and natural selection caused those changes (I gave my reasons in this post).

Furthermore there are many mHGs which exist in modern Spain at over 1% or just under 1% but are absent or nearly absent from Neolithic Spain so far:
L(xM, N): Modern Spain(2%), Neolithic Spain(0.5%)
M1: Modern Spain(0.7%), Neolithic Spain(0%)
N1b1: Modern Spain(0.5%), Neolithic Spain(0%)
R1: Modern Spain(0.3-0.5%), Neolithic Spain(0%)
U6: Modern Spain(2%), Neolithic Spain(0%)
U5a: Modern Spain(2-3%), Iron age Spain(4%), Neolithic Spain(0.5%)
U8b1, U8a1a: Modern Spain(0.5-1%), Neolithic Spain(0%)
U9a: Modern Spain(0.5%), Neolithic Spain(0%)
T1a: Modern Spain(1.5-2%), Iron age Spain(2%), Neolithic Spain(0%)
T2c1: Modern Spain(1-1.5%), Neolithic Spain(0.5%)
I: Modern Spain(2-3%), Neolithic Spain(0%)
H6: Modern Spain(1-2%), Neolithic Spain(0%)
HV6-24: Modern Spain(1.5-2%), Neolithic Spain(0%)
W: Modern Spain(1%), Iron age Spain(6%), Neolithic Spain(0%)

U6, L(xM, N), and M1 indicate modern Spanish have maternal ancestry from Africa which Neolithic Spanish did not have. The L(xM, N) mHGs modern Spanish belong to are mostly the same L(xM, N) mHGs NorthWest Africans belong to. The two most common L(xM, N) mHGs in both locations are L1b and L2a1. The single L(xM, N) from Neolithic Spain belonged to L1b.

The other mHGs don’t conclusively indicate maternal ancestry from any particular region. U5a, T1a, T2c1, I, R1, W, HV6-24, and H6 are all present in ancient mtDNA from Central and Eastern Europe at high frequencies. It’s possible that region lent Spain those mHGs. But I and T1a are also frequent in the Middle East.

Along with differences there are also noticeable similarities between Neolithic and Modern Spanish mtDNA. Both have a higher frequency of J2b1a, J2a1a, T2a1b, U5b, U5b3, and U5b1i than ancient and modern Europeans from other regions.

Nuclear DNA confirms considerable genetic changes took place in Spain after the Neolithic age. Here’s how modern Spanish come out when they’re modeled as a mixture of Neolithic Spanish and other ancient and modern humans.

Middle Neolithic Spain: 48%
Eastern Europe(Yamnaya): 24%
Near East(Cypriot): 22%
Africa(Mozabite): 6%

So according to nuclear DNA, Spain probably received migration from Eastern Europe(Yamnaya), the Near East(Cypriot), and Africa(Mozabite) after the Neolithic. mtDNA is pretty consistent with this.

Friday, February 3, 2017

New mtDNA from Stone age Eastern Europe(Latvia, Ukraine)

Yesterday Jones et al. 2017 published genome-wide data, including mtDNA, from 8 ancient individuals from Latvia and Ukraine. Three are Mesolithic Latvians, one is a Mesolithic Ukrainian, one is an Early Neolithic Ukrainian, two are Middle Neolithic Latvians, and one is a Late Neolithic Corded Ware Latvian.

Here's a link to Jones et al. 2017's Figure 1, which displays the mtDNA results of these 8 Stone age Eastern Europeans. I added the new ancient mtDNA to my European Hunter Gatherer and Bronze age Northern European spreadsheets.

These Stone age Eastern Europeans can potentially give detailed insight into the origins of modern Europeans. Because of other ancient DNA we know all modern Europeans are mostly a mixture of the "Steppe", "EEF", and "WHG" populations but we don't know which "Steppe", etc. populations contributed to which modern Europeans.

Maybe people similar to these Stone age Eastern Europeans specifically gave modern Eastern Europeans a lot of their "WHG" ancestry. Maybe other Europeans got a lot of their "WHG" from WHGs who lived in other parts of Europe.

Mesolithic Baltic(Sweden, Latvia, Lithuania) HGs can be labeled as WHG or at least very similar. Their mtDNA makeup though is different from Western European WHGs.

[Figure: mtDNA makeup of Western Europe HGs compared with Baltic HGs]

Now let's look at who in Europe today has the most and least U5b, U5a, and U4.

Most U5b...

Andalusia Spain: 6.5%
Galicia Spain: 6%
North Poland: 5.4%

Least U5b...

North Italy: 2%
South Italy: 2.2%

Most U5a...

East Baltic: 11.3%

Least U5a...

Andalusia Spain: 2.5%
South Italy: 2.6%
Galicia Spain: 3.5%

Most U4...

NW Russia: 6%
East Germany: 5.6%
East Baltic: 5.2%

Least U4...

North Italy: 0%
SW France: 1.2%
Andalusia Spain: 1.7%
South Italy: 1.8%

U5b, U5a, and U4 frequencies in modern Europe have geographic trends. U5a and especially U4 peak in Eastern Europe. Is this because Eastern European hunter gatherers had a lot of U4 and U5a? Does U5b peak in Iberia because Western European hunter gatherers had a lot of U5b? These are all just hypotheses; we'll have to wait for more data to confirm them.

I've recently gathered a lot of new European data and will make a post about haplogroup frequencies in Europe soon so stay tuned. 

Wednesday, January 25, 2017



For the first time I have added Finland to my mtDNA database. My first collection of Finnish mtDNA is 287 mitogenomes. In this post I’ll give an intro to Finnish mtDNA.

Here are links to my analysis of Finnish mtDNA.
Finland: mHG frequencies, Founder Effects
mtDNA matches: A list of unique and rare haplogroups found in Finland and where else in the world I’ve found those haplogroups.

Summary: 50% of Finns belong to founder effect mHGs unique to the region Finland is in. Only 1% of Finns have Siberian mtDNA. Finland's mtDNA is super European. Finland's mHG frequencies are similar to their neighbors in NorthEastern Europe. Finnish H1/H3 is similar to Danish and White American H1/H3 but not similar to Basque H1/H3. A modern Finnish U5a1* shares a mutation with a Mesolithic Swedish U5a1*.

Dominated by Founder Effects

As you can see in the Finland spreadsheet about 50% of Finns belong to founder effect mHGs. I listed all of the founder effect haplogroups and the mutations which make them unique in the spreadsheet. All of these founder effect mHGs are unique to Finland and its neighbors. They are either nonexistent or very rare outside of Finland and its neighbors.

These are the most common Founder Effects in Finland.
U5b1b1a(5.20%), H3h1(3.1%), H1a[10](2.8%), H1f1(2.8%), K1c1c(2.2%), H1n4(2.1%), U5b1b2(2.1%).

Because of the high frequency of Finnish-specific mtDNA in Finland I can confirm you have a Finnish maternal lineage if you email me your mtDNA at

Close Affinity to Karelia

When comparing Finnish mtDNA to other populations I discovered Finland shares several unique mHGs with its neighbor Karelia. Many of the founder effect mHGs in Finland I just discussed are also found in Karelia. Here’s a list of mHGs unique to Finland and Karelia and probably nearby peoples.

H1a(10), H1a(11), H1f1, H2a1(o), H3h1, V7a, U5a2a1a, U5b1b1a, U1b, D5a3a1a.

The H1 mHGs above and H3h1 take up 10% or more of Finnish and Karelian mtDNA. That’s a sizable fraction.

Archetypal European mtDNA

Only 1% of Finnish mtDNA is Siberian. The rest is West Eurasian. 99%(or whatever the actual percentage is) of it belongs to archetypal European mHGs. Finland’s mHG frequencies are pretty similar to its neighbors in NorthEastern Europe; see European mHG frequencies here.

Below are the European-specific mHGs found in Finland. They make up 65% of Finnish mtDNA. The true number is certainly higher because not all European mHGs have been discovered.
H1, H3, H11a, V, HV0(xV), HV6-17, T1a1, T2b, T2a1b1a, J1c, J2a1a, J2b1a, K1a4a, K1a1b, K1c1, U5, U4, U2e, U8a1a, I

Some of the above European-specific mHGs are too broad to be informative: J1c, H1, H3, U5, and so on. I thoroughly compared Finland's J1c, H1, etc. subclades to those of other Europeans. In most of those mHGs, Finns belong either to subclades typical of all Europeans or to subclades found only in populations geographically close to Finland.

Here are the primary subclades of some of those European mHGs in Finland.
H1: H1a, H1b, H1c, H1f, H1n
H3: H3h, H3b.
V: V7a, V1a, V(29)
J1c: J1c2, J1c3
U5: U5b1b, U5a1b1.
U4: U4a2a, U4d1
I: I5a, I1a1

My collection possesses an abundance of mitogenomes from only a few West Eurasian populations which I can compare to Finland: Iran, Denmark, Druze, Caucasus, Basque, and White Americans.

Finland’s H1 and H3 are closely related to White American and Danish H1/H3. However they are pretty unrelated to Basque H1/H3. Also you can see in the mtDNA Matches document that Finland matches most often with Danish and White Americans. I must warn you not to misinterpret the matching, because almost half of my mitogenomes are from White Americans and Danish.

Siberian mtDNA: D5a3a1a, G3a1

Three of the Finnish mitogenomes belonged to East Asian mHGs: two Ds and one G. More specifically the Ds were D5a3a1a and the G was G3a1. In my database these mHGs only exist in non-Slavic Russians, Karelians, and Siberians. Therefore it’s safe to assume they travelled from Siberia to Finland at some point.

U5b1b1a, U5b1b2, U5b1e1, U5a1*

I think it's pretty likely that these U5 lineages unique to Finland and its neighbors (U5b1b1a, U5b1b2, U5b1e1, and U5a1*) are descended from ancient NorthEastern European Hunter Gatherers. Finns have extra European hunter gatherer ancestry which can't be explained by Corded Ware or Funnel Beaker, see here.

The single U5a1* Finnish sample I'm referring to has already found a match with a NorthEastern European Hunter Gatherer.

Modern Finn:
U5a1*. Extra mutations: 195C 5237A 5460A 6267A 13651G
Mesolithic Swede(Motala3).
U5a1*. Extra mutations: G5460A, G8860A, A9389G, C16519T
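
The comparison above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for this post: the helper names `normalize` and `shared_mutations` are made up, and the sketch assumes that 5460A and G5460A describe the same change (same position, same derived base), just written with and without the reference base.

```python
import re


def normalize(mutation):
    """Drop a leading reference base so 'G5460A' and '5460A' compare equal.

    Assumption: the position plus the derived base is enough to identify
    a mutation for this comparison.
    """
    cleaned = mutation.strip().rstrip(",")
    return re.sub(r"^[ACGT](?=\d)", "", cleaned)


def shared_mutations(list_a, list_b):
    """Return the mutations present in both whitespace-separated lists."""
    set_a = {normalize(m) for m in list_a.split()}
    set_b = {normalize(m) for m in list_b.split()}
    return set_a & set_b


# Extra mutations exactly as listed in the post.
modern_finn = "195C 5237A 5460A 6267A 13651G"
motala3 = "G5460A, G8860A, A9389G, C16519T"

shared = shared_mutations(modern_finn, motala3)
print(shared)  # {'5460A'}
```

A single shared mutation is a weak signal on its own, so a match like this is a hint worth following up rather than proof of a shared lineage.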

Saturday, January 14, 2017

North Africa's West Eurasian mtDNA

North Africans are a mixture of people related to Sub Saharan Africans and West Eurasians. In this post I’ll examine mostly North African mtDNA, but also Y DNA and autosomal DNA, to gain insight into who their West Eurasian ancestors were.



In my opinion NorthWest African mtDNA is mostly descended from Neolithic Near Easterners closely related to Neolithic Europeans. Whether it passed through Europe before arriving in NorthWest Africa is impossible to detect at this point. Egyptian mtDNA is mostly descended from ancient Near Easterners as well, but from Near Easterners related to Natufians and Neolithic Levantines, not to Neolithic Europeans.


The spreadsheet below displays the frequency of Eurasian mHGs, mHG H subclades, mHG JT subclades, and European vs Middle Eastern vs Eurasian African mHG frequencies in North Africa.

West Eurasian HG frequencies

NorthWest Africans’ Eurasian mtDNA, especially in Berbers, mostly belongs to European-specific and Northern African-specific lineages. Egyptian mtDNA though belongs mostly to Middle Eastern-specific lineages.

The frequency of European-specific mHGs is on average 40% in NorthWest Africa but at only 4% in Egypt. The frequency of Middle Eastern-specific mHGs in Egypt is 51.2% but at only about 10% in NorthWest Africa.

The region of the Middle East which Egyptian mtDNA is most similar to is Arabia. R0a, T1a(xEuropean and Iranian T1a1), and H2a peak in Egypt and Arabia.

Here are some graphs visually displaying the links between NorthWest Africa and Europe, and between Egypt and the Middle East.

NorthWest Africa=Algeria, Tunisia, Morocco.
Europe=Spain, Ukraine.
Middle East=Syria, Arabia, Turkey.

U6, M1: Eurasian mHGs most popular in Africa

U6 and M1 are branches of the Eurasian mHGs U and M, and they are most popular in Africa. U6 peaks in NorthWest Africa at about 8%. U6 exists in SouthWest Asia and Egypt at about 1-3%. It exists in Iberia and Southern Italy at about 0.5-1%. M1 is at about 7-10% in NorthWest Africa and Egypt. M1 exists in SouthWest Asia at about 1-3%. I don't have a lot of Eastern or Western African mtDNA data yet but I do know U6 and M1 exist in both of those regions as well.

Neolithic European-related Ancestry in NorthWest Africa?

I wrote “Neolithic European-related” not “Neolithic European” because the similarity in mtDNA between NorthWest Africa and Neolithic Europe could have been caused by common ancestry from the Near East not direct descent.

H1, H3, and HV0 are more frequent in NorthWest Africa than Europe itself(if you don’t count Sub Saharan mtDNA). Ancient mtDNA indicates HV0, H1, and H3 in modern Europe descend from Neolithic European farmers. They were especially frequent in Neolithic Iberians and French(See here).

NorthWest African mtDNA isn’t completely consistent with having lots of Neolithic European ancestry. T2b, K1a, and J1c made up nearly 40% of Neolithic European mtDNA but aren’t frequent in modern NorthWest Africans(5-10%). NorthWest African JT is, however, more similar to Neolithic and modern European JT than to Middle Eastern JT.

Neolithic Middle Eastern-related Ancestry in Egypt?

There’s a decent amount of Middle Eastern-specific mHGs in all of North Africa, not just Egypt, which are unheard of in Europe. R0a, HV(xHV0), U1, U3, and J2a2 exist in both NorthWest Africa and Egypt. J1b(xJ1b1a1), J1d1, T1a7, and U7 are Middle Eastern-specific lineages only found in Egypt. So in no way can NorthWest African mtDNA be a simple European+Sub Saharan mix.

There’s much less published ancient Middle Eastern mtDNA than European but important matches already exist between ancient Middle Eastern mtDNA and NorthWest Africa.

7722-7541 BC. Jordan. R0a(20.3% in Egypt).
11840-9760 BC. Israel. J2a2.
7446-7058 BC. Jordan. T1a.
3956-3796 BC. Iran. U7(Only exists in Egypt).
11000 BC. Israel. N1b.

Autosomal DNA Reflection

Here are results modelling North Africans as a mixture of ancient West Eurasians and Sub Saharan Africans. I used D-stats provided by David Wesolowski at Eurogenes blog to produce these results.

NorthWest Africans
Levant Neolithic: 60.05%
Anatolia Neolithic: 0%
Caucasus Paleolithic: 5.4%
Iran Neolithic: 7.95%
Europe Mesolithic: 9.45%
Sub Saharan African: 17.15%

Egyptians
Levant Neolithic: 60%
Anatolia Neolithic: 0%
Caucasus Paleolithic: 0%
Iran Neolithic: 23%
Europe Mesolithic: 10%
Sub Saharan African: 7.05%

So basically autosomal DNA indicates NorthWest Africans are mostly Neolithic Near Eastern and Sub Saharan African with a little Neolithic Iranian. Egyptians are the same mixture with more Neolithic Iranian and less Sub Saharan African.

Keep in mind Neolithic Near Easterners were closely related to Neolithic Europeans. Therefore the above results are consistent with the high amount of Neolithic European-like mtDNA in NorthWest Africa.

Y DNA Reflection

Like mtDNA and autosomal DNA, Y DNA connects Northern Africans mostly with ancient Near Easterners. Both Egyptian and NorthWest African Y DNA are mostly E1b1b and J1. NorthWest African E1b1b is specifically M81, with few exceptions. Ancient DNA indicates J1 is from the Mesolithic Caucasus and Iran and documents the presence of E1b1b in Mesolithic Israel(Natufians).

Sunday, January 8, 2017

The First mtDNA from Mesolithic Greece is K1*

Theopetra Cave, the cave in Greece in which Theo5 and Theo1 were found.

Last June Hofmanova et al. 2016 published the first mtDNA results from Mesolithic Greece for two samples named Theo5 and Theo1. They labelled the two Mesolithic Greeks as K1c. Ian Logan has posted the rCRS mutations of these two samples on his site. So with that information I checked if the K1c label given by Hofmanova et al. 2016 is accurate and discovered it isn't. Instead these samples are best labelled as K1*.


What K1 in Mesolithic Greece says about their affinity to modern and ancient populations
K1 is a West Eurasian haplogroup. About 5% of people in West Eurasia belong to K1. About 20% of Neolithic Europeans and Anatolians belonged to K1. Essentially 100% of modern K1 and all ancient K1 tested so far belong to K1a, K1b, or K1c. K1*s like the ones found in Mesolithic Greece are rare. Until we get autosomal DNA from Mesolithic Greece we can't be confident about their affinities to other ancient populations, but their K1s do suggest they were closely related to Neolithic Anatolians and Near Easterners.


Here's a list of the haplogroups leading to K1, the subclades of K1, and their rCRS mutations. Mutations the two Mesolithic Greeks possessed are highlighted in green and mutations they didn't possess are highlighted in red.

U: A11467G  A12308G  G12372A
U2'3'4'7'8'9: A1811G 
U8: T9698C 
U8b'c:  A3480G
U8b: G9055A  C14167T
K: A10550G  T11299C  T14798C  T16224C  T16311C!
K1: T1189C  A10398G!

K1 subclades.
K1a: C497T  (T16093C)
K1b: G5913A
K1c: T146C!  T152C!  C498d
K1d'e'f: T16362C

Extra Mutations.
KU171094: T146C A215G C3107T T6351Y G6446A C13967T T16249C
KU171095: T146C A215G C3107T T6351Y G6446A C13967T T8462C A10113G

These Mesolithic Greeks possessed only one of K1c's three defining mutations (T146C). Therefore they can't confidently be labeled as K1c. They're best labelled as K1*.
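
The check behind that conclusion can be written out as a small Python sketch. It is an illustration rather than the method actually used here: the function name `classify` is made up, the "!" markers are dropped from the mutation names, the parenthesised T16093C of K1a is left out, and it assumes the "Extra Mutations" lists above contain every relevant mutation the two samples carry beyond K1.

```python
# Defining mutations of the K1 subclades, as listed in the post.
K1_SUBCLADES = {
    "K1a": {"C497T"},
    "K1b": {"G5913A"},
    "K1c": {"T146C", "T152C", "C498d"},
    "K1d'e'f": {"T16362C"},
}

# Extra mutations of the two Mesolithic Greek samples, as listed in the post.
SAMPLES = {
    "KU171094": set("T146C A215G C3107T T6351Y G6446A C13967T T16249C".split()),
    "KU171095": set("T146C A215G C3107T T6351Y G6446A C13967T T8462C A10113G".split()),
}


def classify(mutations):
    """Return the first K1 subclade whose defining mutations are all present.

    Falls back to 'K1*' when no subclade is fully supported.
    """
    for clade, defining in K1_SUBCLADES.items():
        if defining <= mutations:  # every defining mutation is present
            return clade
    return "K1*"


for name, mutations in SAMPLES.items():
    k1c_hits = sorted(K1_SUBCLADES["K1c"] & mutations)
    print(name, classify(mutations), "K1c mutations present:", k1c_hits)
```

Requiring every defining mutation is a deliberately strict rule. A looser rule that accepted partial matches would call both samples K1c on the strength of T146C alone.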

Both Mesolithic Greeks shared many unique mutations. They form a unique K1 subclade. I checked my collection of 1,000s of ancient and modern Ks, including over 1,000 K mitogenomes, for matches with this Mesolithic Greek K1 lineage.

I could only find HVR1 matches for KU171094. I found the 16249C mutation KU171094 had in two Romanian Ks, a Punjab Northern Indian K, and three Neolithic LBK Ks from Germany. 16249C is a rare mutation. These individuals probably do/did in fact belong to the same K1 lineage as these Mesolithic Greeks did.

It wouldn't be surprising if Mesolithic Greek mtDNA or mtDNA from close relatives of Mesolithic Greeks existed in modern Romania. It would be a surprise however if the same was true for modern India. As I discussed in my previous post, modern Indians' West Eurasian ancestry is primarily from ancient Iran and Russia. Their ancient Russian ancestry though may have included some ancestry from Neolithic Europeans, who may have had ancestry from Mesolithic Greeks, which could explain a K in modern India being related to a K from Mesolithic Greece.

Monday, January 2, 2017

South Asia's West Eurasian mtDNA

South Asians are a two way mixture of West Eurasians and a people(s) distantly related to East Asians known as ASI. ASI is more likely to be the native population of South Asia and West Eurasians probably arrived in multiple waves from the NorthWest. A decisive majority of India's mtDNA is ASI but a big chunk is West Eurasian. In this post I’ll look at West Eurasian mtDNA in South Asia to gain insight into who their maternal West Eurasian ancestors were.

Bronze age European Steppe+Neolithic Iran

Lazaridis 2016 successfully modeled South/Central Asians as a mixture of Neolithic Iran, Bronze age Steppe, and East Asians(as a proxy for ASI). The results they got are below.

Y DNA so far is consistent with this idea. The most common West Eurasian yHGs in South Asia are R2, R1a-Z93, J2, and G. R2, J2, and G have been found in ancient Iran. R1a-Z93 has been found in the Bronze age European Steppe.

mtDNA is also consistent with this model. A large percentage, almost 50%, of South Asia’s West Eurasian mtDNA belongs to mHGs found in remains from Neolithic Iran and the Bronze age Steppe. I still think we should be open to more complex origins of South Asians’ West Eurasian ancestry though.

Mostly Middle Eastern with a dose of European?

mHG Frequencies: Haplogroup Frequencies of West Eurasian mtDNA in India and a few other South Asians.
Clade Origins: Deep subclades of West Eurasian haplogroups found in India and where in West Eurasia they’re most common.

The region of West Eurasia India shares the most mtDNA with is first Iran and second the rest of the Middle East. U7, U1, HV2, HV14, R0a, R2, J1b1b, J1b3, J1d are Middle Eastern-specific lineages found in India. Most of them peak around Iran. U7 is by far the most common. It’s more common, when not counting ASI, than anywhere in the Middle East.

A string of European-like mtDNA exists in South Asia as well, especially in Afghanistan and Pakistan. U5a, U4, U2e, J1b1a1, T1a1, J2b1a, J1c, and T2b all have a consistent presence throughout South Asia. U5a1a1 and U5a1b are the main U5a1 clades in Europe as well as South Asia. U5a and U4, which are two of the three most common European-like mtDNAs in South Asia, today peak in NorthEast Europe, Siberia, Scandinavia, and Yugoslavia.

On average about 18% of India’s West Eurasian mtDNA belongs to those European-specific subclades and about 44% belongs to the Middle Eastern-specific subclades I listed earlier. SC Asia(Afghanistan, Pakistan) has significantly less Middle Eastern-specific mtDNA and slightly more European-specific mtDNA.

South Asian mtDNA also has trends and haplogroups which aren’t comparable to anything in West Eurasia. There are subclades of West Eurasian haplogroups specific to South Asia; H2b, U7a3b, U7a7, U7a6, U7c, U5a1g, and several W subclades. mHG X, which has a presence in all of West Eurasia and parts of North America, is non existent in South Asia. In contrast mHG W is extraordinarily more popular than anywhere in West Eurasia.

Matches with Ancient West Eurasians

There are many interesting examples of modern South Asian mtDNA belonging to the same subclades as ancient West Eurasians. Examples are listed at the bottom.

Kalash mtDNA in particular has lots of matches with the Bronze age European Steppe. It’s mostly made up of four founder effects. 3 of the 4 are typical of the Bronze age European Steppe. The only two Kalash mtDNAs I found that weren’t a part of these founder effects belonged to U4b1a4 and T2a1a, both of which have been found in the Bronze age European Steppe.

Two other interesting matches were found between the Bronze age European Steppe and modern SC Asia. I possess over 1,000 H mito-genomes and the only H2bs in my collection are from an ancient Yamnaya individual and several modern SC Asian individuals.  U5a1g today is mostly in Iran and SC Asia but an ancient Corded Ware individual from Germany belonged to it as well.

R2: 1.5% in India, 6% in SC Asia.
 Neolithic Iran, 8000-7700 BC.

HV2: 2.4% in India, 1% in SC Asia.
 Neolithic Iran, 9100-8600 BC.

R0a: 1% in India, 2% in SC Asia.
 Neolithic Levant, 7722-7541 BC

U7a: 20% in India, 9% in SC Asia
 Chalcolithic Iran 3956-3796 BC

I1c: Existent in all of SC Asia
 Chalcolithic Iran, 3972-3800 BC

U5a1a1: India existent but unknown %, 1% in SC Asia.
 Yamnaya Russia 3000 BC
 Afanasievo Siberia 3322-2923 BC
 Bell Beaker Germany 2300 BC

U5a1b1: Existent in India.
 Corded Ware Germany 2400 BC
 Bell Beaker Germany 2500–2050 BC
 Bell Beaker Spain 2492-2334 BC
 Xijing China(Tarim Mummy) 2000 BC
 Unetice Poland 1885-1693 BC

T1a1: India existent but unknown %, 3% in SC Asia.
 Potapovka Russia 2125-2044 BC
 Srubnaya Russia 1850-1200 BC
 Germany 2570-2471 BC
 Hungary 2000 BC
 Sweden 1200 BC

T2a1a: India existent but unknown %, 1% in SC Asia.
 Yamnaya Russia 2887-2634 BC
 Sweden 1300 BC

U2e1h: Found in Kalash and Hazara
 Potapovka Russia 2200-1900 BC
 Sintashta Russia 1960-1756 BC

U4b1a2: Found in Kalash.
 Catacomb Russia 2700-2500 BC

U4a1: Found in all South Asian populations.
 Neolithic Hungary 5500 BC
 Yamnaya Russia 3000 BC
 Catacomb Russia 2500-2000 BC
 Andronovo Siberia 1746-1626 BC
 Corded Ware Germany 2400 BC

H2b: Existent in India and SC Asia
 Yamnaya Russia 3000 BC

Friday, December 30, 2016

Natural Selection Did It!

In my previous post I hypothesized that natural selection changed mHG frequencies in Europe after 3000 BC. Since then I’ve tested that hypothesis by analysing modern and ancient European mtDNA. I’m presenting my findings in this post.

The data I used is presented below in links. Only this handful of ancient European populations have enough mtDNA published to compare to moderns.

Mesolithic Western Europeans. 11000-7000 BC. N=36
Early Neolithic Hungary, Germany. 7500-6500 BC. N=450
Neolithic Iberia, France. 7000-4500 BC. N=171
Middle Neolithic (mostly)Germany. 6000-4000 BC. N=151
Late Neolithic/Chalcolithic Iberia, France. 3500-2800 BC. N=143
Chalcolithic Pontic Caspian Steppe. 4000-3000 BC. N=98

Haplogroup Frequencies: Modern and Ancient.
H Subclade Frequencies: Modern and Ancient
JT, U5, N1, Subclade Frequencies: Modern and Ancient.

The ancient European populations I included in that spreadsheet can be broken up into three groups defined by mHGs mostly specific to them.

Mesolithic Western Europeans: U5b.
Neolithic Europeans: K, H1, H3, N1a1a, T2, J1c, HV0.
Pontic Caspian Steppe Folk: U5a, U4, T1a, H6, I

I easily distinguished European and West Asian mtDNA in this post. I can not however distinguish the mtDNA of different European populations to any significant degree using higher coverage mtDNA data. Uniformity is the word which best describes European mtDNA. Why is it so uniform? Common ancestry is one reason. Natural selection is another, and I give the reasons why I think that in this post.

Autosomally, modern Europeans can be successfully fitted as a mixture of the three ancient populations listed above, with some extra stuff added for some. If mtDNA from these ancestors was passed down to modern Europeans with no natural selection affecting mHG frequencies, then (1) modern European mtDNA diversity would follow the same trends as autosomal diversity, and (2) modern European mtDNA could fit as a mixture of the mtDNA of those ancient populations.

So is that the case? Mostly no, and a little yes.

1. Do Autosomal and mtDNA Correlate?

Correlation mtDNA/Autosomal

The first method I used to learn whether natural selection has affected European mtDNA was to see how well autosomal DNA and mtDNA correlate. I did this by comparing the frequencies of typical Neolithic, Steppe, and Mesolithic mHGs with Neolithic, Steppe, and Mesolithic ancestry according to autosomal DNA.

U5b shows little correlation with Mesolithic ancestry.

The most typical Neolithic mHGs have terrible correlation with Neolithic ancestry. T2b and J1c might peak in Northern Europe, including Lithuania for T2b, where Neolithic ancestry is lowest. I didn’t include H1 in the above spreadsheet but it too has no correlation with Neolithic ancestry. There is some correlation for mHGs HV0 and K. K is lowest in NorthEast Europe where Neolithic ancestry is lowest (but is high in Ireland and Scandinavia). Both HV0 and K peak in Iberia and Southwest France where Neolithic ancestry peaks.

Typical Steppe mHGs U4 and U5a do correlate well with Steppe ancestry. Both peak in the NorthEast where Steppe ancestry peaks. The next strongest presence for both might be in Yugoslavia and Scandinavia. T1a and I(N1a1b1) don’t correlate well with Steppe ancestry. H6 however might.

So there’s some correlation between mHG frequency and ancestry but not a lot. The frequencies of U5b, K, T2, J1c, HV0, H1, H3, U5a, U4, T1a in Europe overall aren’t consistent with being passed down from ancient populations with no natural selection affecting frequencies.
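
One simple way to put a number on "how well do they correlate" is a Pearson correlation between an mHG's frequency and an ancestry proportion across populations. The Python sketch below shows the idea only. The function name `pearson` is my own, and the numbers in the example are hypothetical placeholders, not values from the spreadsheet.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL placeholder values, one entry per population:
# autosomal Steppe ancestry share and U4 frequency.
steppe_ancestry = [0.50, 0.45, 0.35, 0.25]
u4_frequency = [0.06, 0.05, 0.03, 0.02]

r = pearson(steppe_ancestry, u4_frequency)
print(f"r = {r:.2f}")
```

A value near 1 means the mHG tracks the ancestry component closely, a value near 0 means it doesn't track it at all, and a negative value means the mHG is more common where that ancestry is lower.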

2. Steppe+Neolithic+Mesolithic mtDNA=Modern European mtDNA?

The second method I used to test the hypothesis that natural selection has affected European mHG frequencies was to check whether European mtDNA can fit as a mixture of Neolithic, Steppe, and Mesolithic mtDNA.

Here’s an explanation of how I did this. Let's say Poles have 40% Steppe mtDNA. Using simple math I can calculate what effect a 40% contribution of mtDNA from published ancient Steppe people would have on Polish mtDNA. Then I can create a “zombie”: the mHG frequencies which the other 60% of the Polish mtDNA gene pool would need to have. If the zombie’s mHG frequencies are similar to those of Neolithic Europeans, natural selection hasn’t affected Polish mtDNA. If the zombie’s mHG frequencies are radically different, then natural selection has probably affected Polish mtDNA.
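
The "simple math" can be sketched in Python. If a modern population is a mix where a share s comes from a known source, then modern = s × source + (1 − s) × zombie, so zombie = (modern − s × source) / (1 − s) for each mHG. The function name `zombie_frequencies` and the round numbers in the example are my own illustrative assumptions, not the values in the spreadsheet.

```python
def zombie_frequencies(modern, source, share):
    """Frequencies the non-source part of the gene pool would need to have.

    modern, source: dicts mapping mHG -> frequency (as fractions of 1).
    share: fraction of the modern gene pool assumed to come from `source`.
    """
    if not 0 <= share < 1:
        raise ValueError("share must be in the range [0, 1)")
    haplogroups = sorted(set(modern) | set(source))
    return {
        hg: (modern.get(hg, 0.0) - share * source.get(hg, 0.0)) / (1 - share)
        for hg in haplogroups
    }


# HYPOTHETICAL round numbers for illustration only.
modern = {"H": 0.44, "K": 0.07, "other": 0.49}
neolithic = {"H": 0.22, "K": 0.28, "other": 0.50}

zombie = zombie_frequencies(modern, neolithic, share=0.6)
for hg, freq in zombie.items():
    print(f"{hg}: {freq:.1%}")

# A negative frequency means no real population could fill the gap.
impossible = [hg for hg, freq in zombie.items() if freq < 0]
print("impossible mHGs:", impossible)
```

With these made-up inputs the zombie needs far more H than either input population has and a negative K frequency, which is the kind of impossible result described in this section.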

I did similar calculations for four European populations; Sweden, South Poland, SouthWest France, and North Spain. For each population their non-Steppe and non-Neolithic zombies’ mtDNA were pretty different from actual Steppe and Neolithic peoples’ mtDNA. Check out the Results in this spreadsheet.


If Swedes and Poles have 60% Middle Neolithic mtDNA then their non-Neolithic mtDNA would have to be 63-73% H, 0% to -2% J, and -4% to -10% K. Negative frequencies are literally impossible, let alone very different from Steppe mHG frequencies. When they are given 70% or 80% or more Middle Neolithic mtDNA the results for their non-Neolithic zombie’s mtDNA get crazier and crazier (80% H, -20% K, etc.).

If SouthWest French and North Spanish have 65% Middle Neolithic mtDNA then their non-Neolithic mtDNA would have to be 80% H, -6% U5b, -5% J. If their Middle Neolithic share is higher, then their non-Neolithic side’s mtDNA gets crazier and crazier.

What other explanation is there for this but natural selection?

One explanation is that there was a population in Europe with high frequencies of H, low frequencies of K, etc. who swept across the continent. I tested this hypothesis. It doesn’t work. As far as I can see it’s impossible.

Refer back to the results from the spreadsheet I just discussed (mtDNA=/=Steppe+MN). If the mtDNA of the Europeans I tested in that spreadsheet is modelled as anything over 50% Middle Neolithic or Steppe, the other 50% (“mtDNA zombie”) comes out with negative mHG frequencies.

I know from results I didn’t present in the spreadsheet that negative frequency results only disappear when Steppe or Middle Neolithic make small contributions. The smaller their contribution, the more reasonable the zombie results become, because the zombie becomes more and more like the modern European population. For example, if I modelled Poles as having 1% Steppe mtDNA, their 99% other mtDNA zombie would come out looking like Polish mtDNA. The only way to make the non-Neolithic or non-Steppe zombie of European mtDNA have reasonable results is if Steppe and Middle Neolithic make small contributions to European mtDNA while an unknown zombie population with mtDNA like modern Europeans makes a huge contribution.

In other words the only possible scenario in which natural selection didn’t affect European mtDNA is if there was an ancient population with mHG frequencies like modern Europeans that swept across Europe and replaced 60%+ of the preexisting mtDNA making everyone in Europe have similar mHG frequencies. Sounds crazy right?