Friday, December 30, 2016

Natural Selection Did It!

In my previous post I hypothesized that natural selection changed mitochondrial haplogroup (mHG) frequencies in Europe after 3000 BC. Since then I’ve tested that hypothesis by analysing modern and ancient European mtDNA. I’m presenting my findings in this post.

The data I used is linked below. Only this handful of ancient European populations has enough published mtDNA to compare with modern Europeans.

Mesolithic Western Europeans. 11000-7000 BC. N=36
Early Neolithic Hungary, Germany. 7500-6500 BC. N=450
Neolithic Iberia, France. 7000-4500 BC. N=171
Middle Neolithic (mostly) Germany. 6000-4000 BC. N=151
Late Neolithic/Chalcolithic Iberia, France. 3500-2800 BC. N=143
Chalcolithic Pontic Caspian Steppe. 4000-3000 BC. N=98

Haplogroup Frequencies: Modern and Ancient.
H Subclade Frequencies: Modern and Ancient
JT, U5, N1, Subclade Frequencies: Modern and Ancient.

The ancient European populations I included in that spreadsheet can be broken up into three groups defined by mHGs mostly specific to them.

Mesolithic Western Europeans: U5b.
Neolithic Europeans: K, H1, H3, N1a1a, T2, J1c, HV0.
Pontic Caspian Steppe Folk: U5a, U4, T1a, H6, I

I easily distinguished European and West Asian mtDNA in this post. I cannot, however, distinguish the mtDNA of different European populations to any significant degree using higher coverage mtDNA data. Uniform is the adjective which best describes European mtDNA. Why is it so uniform? Common ancestry is one reason. Natural selection is another, and I give my reasons for thinking so in this post.

Autosomally, modern Europeans can be successfully fitted as a mixture of the three ancient populations listed above, with some extra ancestry added for some populations. If mtDNA from these ancestors was passed down to modern Europeans with no natural selection affecting mHG frequencies, then (1) modern European mtDNA diversity would follow the same trends as autosomal diversity, and (2) modern European mtDNA could be fitted as a mixture of the mtDNA of those ancient populations.

So is that the case? Mostly no, and a little yes.

1. Do Autosomal and mtDNA Correlate?

Correlation mtDNA/Autosomal

The first method I used to learn whether natural selection has affected European mtDNA was to see how well autosomal DNA and mtDNA correlate. I did this by comparing the frequencies of typical Neolithic, Steppe, and Mesolithic mHGs with the proportions of Neolithic, Steppe, and Mesolithic ancestry estimated from autosomal DNA.
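One way to put a number on "how well they correlate" is a Pearson correlation coefficient computed across populations. The sketch below is only an illustration of that idea, not the calculation behind the spreadsheet: the function name `pearson` is hypothetical, and every number in the two lists is a placeholder chosen to show the mechanics.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Placeholder values only, NOT taken from the spreadsheet.
# One entry per population: autosomal Steppe ancestry and U5a frequency.
steppe_ancestry = [0.50, 0.45, 0.30, 0.25]
u5a_frequency = [0.10, 0.09, 0.06, 0.05]

r = pearson(steppe_ancestry, u5a_frequency)
print(f"r = {r:.2f}")
```

A value near 1 would mean the mHG tracks the ancestry component closely, a value near 0 would mean it does not, and a negative value would mean the mHG is most common where that ancestry is lowest.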

U5b shows little correlation with Mesolithic ancestry.

The most typical Neolithic mHGs correlate terribly with Neolithic ancestry. T2b and J1c might peak in Northern Europe (including Lithuania for T2b), where Neolithic ancestry is lowest. I didn’t include H1 in the above spreadsheet, but it too has no correlation with Neolithic ancestry. There is some correlation for mHGs HV0 and K. K is lowest in Northeast Europe, where Neolithic ancestry is lowest (but it is high in Ireland and Scandinavia). Both HV0 and K peak in Iberia and Southwest France, where Neolithic ancestry peaks.

Typical Steppe mHGs U4 and U5a do correlate well with Steppe ancestry. Both peak in the Northeast, where Steppe ancestry peaks. The next strongest presence for both might be in the former Yugoslavia and Scandinavia. T1a and I (N1a1b1) don’t correlate well with Steppe ancestry. H6, however, might.

So there’s some correlation between mHG frequency and ancestry but not a lot. The frequencies of U5b, K, T2, J1c, HV0, H1, H3, U5a, U4, T1a in Europe overall aren’t consistent with being passed down from ancient populations with no natural selection affecting frequencies.

2. Steppe+Neolithic+Mesolithic mtDNA=Modern European mtDNA?

The second method I used to test the hypothesis that natural selection has affected European mHG frequencies was to check whether European mtDNA can be fitted as a mixture of Neolithic, Steppe, and Mesolithic mtDNA.

Here’s an explanation of how I did this. Let's say Poles have 40% Steppe mtDNA. Using simple math I can calculate what effect a 40% contribution of mtDNA from published ancient Steppe people would have on Polish mtDNA. Then I can create a “zombie”: the mHG frequencies the other 60% of the Polish mtDNA gene pool would need to have. If the zombie’s mHG frequencies are similar to those of Neolithic Europeans, natural selection hasn’t affected Polish mtDNA. If the zombie’s mHG frequencies are radically different, then natural selection has probably affected Polish mtDNA.
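The "simple math" can be sketched as follows. This is a minimal illustration, assuming the modern gene pool is a plain weighted average of the known source and the zombie; the function name `zombie_frequencies` is hypothetical and the percentages are placeholders, not real Polish or Steppe data.

```python
def zombie_frequencies(modern, source, share):
    """Return the mHG frequencies the rest of the gene pool must have.

    Solves modern = share * source + (1 - share) * zombie for zombie,
    one haplogroup at a time. Frequencies are percentages.
    """
    if not 0 <= share < 1:
        raise ValueError("share must be in [0, 1)")
    haplogroups = set(modern) | set(source)
    return {
        hg: (modern.get(hg, 0.0) - share * source.get(hg, 0.0)) / (1 - share)
        for hg in haplogroups
    }


# Hypothetical placeholder percentages, NOT real Polish or Steppe data.
polish = {"H": 45.0, "U5a": 6.0, "K": 5.0}
steppe = {"H": 20.0, "U5a": 20.0, "K": 5.0}

zombie = zombie_frequencies(polish, steppe, share=0.40)

# A negative frequency means the assumed share cannot be right.
impossible = sorted(hg for hg, freq in zombie.items() if freq < 0)
print(zombie, impossible)
```

Any haplogroup that ends up in `impossible` is one where the assumed source share would contribute more of that mHG than the modern population actually has.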

I did similar calculations for four European populations: Sweden, South Poland, Southwest France, and North Spain. For each population, the mtDNA of the non-Steppe and non-Neolithic zombies was pretty different from actual Steppe and Neolithic peoples’ mtDNA. Check out the results in this spreadsheet.


If Swedes and Poles have 60% Middle Neolithic mtDNA, then their non-Neolithic mtDNA would have to be 63 to 73% H, 0 to -2% J, and -4 to -10% K. Negative frequencies are literally impossible, let alone very different from Steppe mHG frequencies. When they are given 70%, 80%, or more Middle Neolithic mtDNA, their non-Neolithic zombie’s mtDNA gets crazier and crazier (80% H, -20% K, etc.).

If Southwest French and North Spanish have 65% Middle Neolithic mtDNA, then their non-Neolithic mtDNA would have to be 80% H, -6% U5b, and -5% J. If their Middle Neolithic share is higher than that, their non-Neolithic side’s mtDNA gets crazier and crazier.

What other explanation is there for this but natural selection?

One explanation is that there was a population in Europe with high frequencies of H, low frequencies of K, etc. who swept across the continent. I tested this hypothesis. It doesn’t work. As far as I can see it’s impossible.

Refer back to the results from the spreadsheet I just discussed (mtDNA=/=Steppe+MN). If the mtDNA of the Europeans I tested in that spreadsheet is modelled as anything over 50% Middle Neolithic or Steppe, the remainder (the “mtDNA zombie”) comes out with mHG frequencies in the negatives.

I know from results I didn’t present in the spreadsheet that negative frequency results only disappear when Steppe or Middle Neolithic make small contributions. The smaller their contribution, the more reasonable the zombie results become, because the zombie becomes more and more like the modern European population. For example, if I modelled Poles as having 1% Steppe mtDNA, their 99% other mtDNA zombie would come out looking like Polish mtDNA. The only way to give the non-Neolithic or non-Steppe zombie of European mtDNA reasonable results is if Steppe and Middle Neolithic make small contributions to European mtDNA while an unknown zombie population with mtDNA like modern Europeans makes a huge contribution.
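The point at which negative frequencies disappear can be worked out directly. The sketch below is an illustration under assumptions: it uses only the H, K, and T2 percentages quoted for Middle Neo/Chal Germany and Modern Poland in the December 23 post further down, not the spreadsheet, so it will not reproduce the spreadsheet's numbers. The function name `max_feasible_share` is hypothetical, and it checks only that no zombie frequency goes negative, not that the zombie's frequencies add up sensibly.

```python
def max_feasible_share(modern, source):
    """Largest source share that keeps every zombie frequency >= 0.

    The zombie frequency (modern - share * source) / (1 - share) stays
    non-negative only while share <= modern / source, so the tightest
    haplogroup sets the limit. The result is capped at 1.0.
    """
    limits = [modern.get(hg, 0.0) / freq for hg, freq in source.items() if freq > 0]
    return min([1.0] + limits)


# H, K and T2 percentages quoted in the December 23 post.
# N1a1a is left out: its modern frequency of 0% would force the limit to 0.
middle_neo_germany = {"H": 26.5, "K": 16.5, "T2": 14.0}
modern_poland = {"H": 45.2, "K": 3.4, "T2": 9.4}

limit = max_feasible_share(modern_poland, middle_neo_germany)
print(f"largest share without negative frequencies: {limit:.1%}")

# K in the zombie as the assumed Middle Neolithic share shrinks.
for share in (0.6, 0.4, 0.2, 0.01):
    k = (modern_poland["K"] - share * middle_neo_germany["K"]) / (1 - share)
    print(f"share={share:.2f} zombie K={k:.1f}%")
```

As the assumed share approaches zero, the zombie's K frequency approaches the modern Polish value, which is the behaviour described above.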

In other words, the only possible scenario in which natural selection didn’t affect European mtDNA is one where an ancient population with mHG frequencies like modern Europeans swept across Europe and replaced 60%+ of the preexisting mtDNA, making everyone in Europe have similar mHG frequencies. Sounds crazy, right?

Friday, December 23, 2016

What the heck happened to European mtDNA?

The number of mtDNA samples from Neolithic/Chalcolithic Europe (roughly 5500-3000 BC) has grown to over 500. The principal locations the samples come from are Germany, Spain, and Hungary.

I've been studying European mtDNA from this era for two years. What has been blatantly obvious to me is that since the Neolithic/Chalcolithic the frequencies of mHGs in Europe have changed pretty dramatically. mHGs K and T2 are about half as common as they once were. In contrast, mHG H is about twice as common as it once was.

Data from Germany, Spain, and Hungary all tell the same story. Here are a few stats to demonstrate this change:

Frequency of Haplogroup H, K, T2, N1a1a
Early Neo Germany/Hungary: 20%, 18.7%, 25.4%, 9.4%
Early Neo Spain: 17%, 30%, 12.7%, 0.8%
Middle Neo/Chal Germany: 26.5%, 16.5%, 14%, 4.6%
Chalcolithic Spain: 23%, 22%, 4.3%, 0%
Modern Spain: 35%, 6.5%, 4%, 0%
Modern Poland: 45.2%, 3.4%, 9.4%, 0%
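The size of the change can be read off as a ratio of modern to ancient frequency. The sketch below does that for the Spanish rows of the list above; the function name `fold_change` is hypothetical, and the only inputs are the percentages already quoted.

```python
def fold_change(ancient, modern):
    """Modern frequency divided by ancient frequency for each haplogroup.

    Returns None where the ancient frequency is 0, since the ratio is
    undefined there.
    """
    return {
        hg: (modern[hg] / ancient[hg]) if ancient[hg] > 0 else None
        for hg in ancient
    }


# Percentages copied from the list above.
early_neo_spain = {"H": 17.0, "K": 30.0, "T2": 12.7, "N1a1a": 0.8}
chalcolithic_spain = {"H": 23.0, "K": 22.0, "T2": 4.3, "N1a1a": 0.0}
modern_spain = {"H": 35.0, "K": 6.5, "T2": 4.0, "N1a1a": 0.0}

since_early_neo = fold_change(early_neo_spain, modern_spain)
since_chalcolithic = fold_change(chalcolithic_spain, modern_spain)

for hg, ratio in since_early_neo.items():
    print(hg, "undefined" if ratio is None else f"x{ratio:.2f}")
```

A ratio above 1 means the haplogroup is more common today than in the ancient sample, and a ratio below 1 means it is less common.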

mHG frequencies in Spain and Germany/Hungary weren't identical back then and mHG frequencies among Europeans aren't identical today, but consistent mHG frequency trends exist in each era no matter the location in Europe. 

What caused mHG frequencies in Europe to change?

If my life depended on it, I would guess natural selection is the answer. I would guess that certain mHGs affected post-Neolithic European women's mitochondrial DNA in ways which helped them have more daughters than women of other mHGs. I don't believe migration from a population with scary high frequencies of H and scary low frequencies of K and T2 is the answer. Genome-wide DNA from prehistoric Europe tells us migration into Europe after the Neolithic came primarily from the Pontic Caspian Steppe, which had pretty low frequencies of H. I discussed how migration from the Pontic Caspian Steppe affected European mtDNA here.

Neolithic and Chalcolithic European mtDNA belongs almost exclusively (95%+) to what today are European-specific haplogroups. So they are important ancestors of modern European mtDNA without a doubt. Genome-wide DNA is consistent with the idea that they're an ancestor of modern Europeans; actually, it suggests they're the most important ancestor. Haplogroup H was infrequent in Neolithic/Chalcolithic Iberia, but about 90% of it was European-specific H1 and H3. Any theories that European-specific mHGs, like H1, arrived from an unknown source are invalidated by ancient mtDNA. Ancient Europeans belonged to the same mHGs, but at different frequencies.