Friday, December 30, 2016

Natural Selection Did It!

In my previous post I hypothesized that natural selection changed mtDNA haplogroup (mHG) frequencies in Europe after 3000 BC. Since then I’ve tested that hypothesis by analysing modern and ancient European mtDNA. I’m presenting my findings in this post.

The data I used are linked below. Only this handful of ancient European populations has enough published mtDNA to compare with modern populations.

Mesolithic Western Europeans. 11000-7000 BC. N=36
Early Neolithic Hungary, Germany. 7500-6500 BC. N=450
Neolithic Iberia, France. 7000-4500 BC. N=171
Middle Neolithic (mostly) Germany. 6000-4000 BC. N=151
Late Neolithic/Chalcolithic Iberia, France. 3500-2800 BC. N=143
Chalcolithic Pontic Caspian Steppe. 4000-3000 BC. N=98

Haplogroup Frequencies: Modern and Ancient
H Subclade Frequencies: Modern and Ancient
JT, U5, N1 Subclade Frequencies: Modern and Ancient

The ancient European populations I included in that spreadsheet can be broken up into three groups defined by mHGs mostly specific to them.

Mesolithic Western Europeans: U5b.
Neolithic Europeans: K, H1, H3, N1a1a, T2, J1c, HV0.
Pontic Caspian Steppe Folk: U5a, U4, T1a, H6, I.

I easily distinguished European and West Asian mtDNA in this post. I cannot, however, distinguish the mtDNA of different European populations to any significant degree using higher-coverage mtDNA data. Uniformity is the word that best describes European mtDNA. Why is it so uniform? Common ancestry is one reason. Natural selection is another, and in this post I give my reasons for thinking so.

Autosomally, modern Europeans can be successfully fitted as a mixture of the three ancient populations listed above, with some extra ancestry added for some. If mtDNA from these ancestors was passed down to modern Europeans with no natural selection affecting mHG frequencies, then (1) modern European mtDNA diversity would follow the same trends as autosomal diversity, and (2) modern European mtDNA could be fitted as a mixture of the mtDNA of those ancient populations.

So is that the case? Mostly no, and a little yes.

1. Do Autosomal and mtDNA Correlate?

Correlation mtDNA/Autosomal

The first method I used to learn whether natural selection has affected European mtDNA was to see how well autosomal DNA and mtDNA correlate. I did this by comparing the frequencies of typical Neolithic, Steppe, and Mesolithic mHGs with Neolithic, Steppe, and Mesolithic ancestry according to autosomal DNA.
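The post doesn't say which statistic was used for this comparison. One common choice is the Pearson correlation coefficient, computed across modern populations between an mHG's frequency and the matching autosomal ancestry share. Here is a minimal sketch of that idea. The function name and the numbers are hypothetical placeholders for illustration only; they are not taken from the spreadsheet.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical placeholder values, one entry per modern population.
steppe_ancestry = [0.50, 0.45, 0.30, 0.25]  # autosomal Steppe share
u5a_frequency = [0.10, 0.09, 0.05, 0.04]    # mtDNA U5a frequency

r = pearson(steppe_ancestry, u5a_frequency)
print(f"r = {r:.2f}")
```

A value near 1 would mean the mHG tracks the ancestry component closely, while a value near 0 would mean it doesn't track it at all.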

U5b shows little correlation with Mesolithic ancestry.

The most typical Neolithic mHGs correlate terribly with Neolithic ancestry. T2b and J1c might peak in Northern Europe (including Lithuania for T2b), where Neolithic ancestry is lowest. I didn’t include H1 in the above spreadsheet, but it too has no correlation with Neolithic ancestry. There is some correlation for mHGs HV0 and K. K is lowest in NorthEast Europe, where Neolithic ancestry is lowest (but it is high in Ireland and Scandinavia). Both HV0 and K peak in Iberia and SouthWest France, where Neolithic ancestry peaks.

Typical Steppe mHGs U4 and U5a do correlate well with Steppe ancestry. Both peak in the NorthEast, where Steppe ancestry peaks. The next strongest presence for both might be in Yugoslavia and Scandinavia. T1a and I (N1a1b1) don’t correlate well with Steppe ancestry. H6, however, might.

So there’s some correlation between mHG frequency and ancestry but not a lot. The frequencies of U5b, K, T2, J1c, HV0, H1, H3, U5a, U4, T1a in Europe overall aren’t consistent with being passed down from ancient populations with no natural selection affecting frequencies.

2. Steppe+Neolithic+Mesolithic mtDNA=Modern European mtDNA?

The second method I used to test the hypothesis that natural selection has affected European mHG frequencies was to check whether European mtDNA can be fitted as a mixture of Neolithic, Steppe, and Mesolithic mtDNA.

Here’s an explanation of how I did this. Let's say Poles have 40% Steppe mtDNA. Using simple math I can calculate what effect a 40% contribution of mtDNA from published ancient Steppe people would have on Polish mtDNA. Then I can create a “zombie”: the mHG frequencies of the other 60% of the Polish mtDNA gene pool. If the zombie’s mHG frequencies are similar to those of Neolithic Europeans, natural selection hasn’t affected Polish mtDNA. If the zombie’s mHG frequencies are radically different, then natural selection has probably affected Polish mtDNA.
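The “simple math” isn't spelled out in the post, but the natural reading is a two-source mixture: modern = share × source + (1 − share) × zombie, solved for the zombie. A minimal sketch under that assumption follows. The function name and all frequencies are hypothetical placeholders, not values from the spreadsheet.

```python
def residual_frequencies(modern, source, share):
    """Solve modern = share * source + (1 - share) * zombie for the zombie.

    modern, source: dicts mapping haplogroup -> frequency (each summing to 1).
    share: assumed fraction of the modern gene pool derived from `source`.
    """
    if not 0 <= share < 1:
        raise ValueError("share must be in [0, 1)")
    haplogroups = sorted(set(modern) | set(source))
    return {
        hg: (modern.get(hg, 0.0) - share * source.get(hg, 0.0)) / (1 - share)
        for hg in haplogroups
    }

# Hypothetical placeholder frequencies for illustration only.
modern_pop = {"H": 0.45, "U5a": 0.10, "K": 0.05, "other": 0.40}
steppe_pop = {"H": 0.20, "U5a": 0.20, "K": 0.05, "other": 0.55}

zombie = residual_frequencies(modern_pop, steppe_pop, share=0.40)
for hg, freq in zombie.items():
    print(f"{hg}: {freq:.1%}")
```

Any haplogroup whose zombie frequency comes out negative signals that the assumed share is too large for that source to be a plain, unselected contributor.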

I did similar calculations for four European populations: Sweden, South Poland, SouthWest France, and North Spain. For each population, the mtDNA of the non-Steppe and non-Neolithic zombies was pretty different from that of actual Steppe and Neolithic peoples. Check out the results in this spreadsheet.


If Swedes and Poles have 60% Middle Neolithic mtDNA, then their non-Neolithic mtDNA would have to be 63-73% H, 0% to -2% J, and -4% to -10% K. Negative frequencies are literally impossible, let alone very different from Steppe mHG frequencies. When they are given 70%, 80%, or more Middle Neolithic mtDNA, their non-Neolithic zombie’s mtDNA gets crazier and crazier (80% H, -20% K, etc.).

If SouthWest French and North Spanish have 65% Middle Neolithic mtDNA, then their non-Neolithic mtDNA would have to be 80% H, -6% U5b, and -5% J. If their Middle Neolithic share is higher, then their non-Neolithic side’s mtDNA gets crazier and crazier.

What other explanation is there for this but natural selection?

One explanation is that there was a population in Europe with high frequencies of H, low frequencies of K, etc. who swept across the continent. I tested this hypothesis. It doesn’t work. As far as I can see it’s impossible.

Refer back to the results in the spreadsheet I just discussed (mtDNA=/=Steppe+MN). If the mtDNA of the Europeans I tested in that spreadsheet is modelled as anything over 50% Middle Neolithic or Steppe, the remainder (the “mtDNA zombie”) comes out with negative mHG frequencies.

I know from results I didn’t present in the spreadsheet that the negative frequencies only disappear when Steppe or Middle Neolithic make small contributions. The smaller their contribution, the more reasonable the zombie results become, because the zombie becomes more and more like the modern European population. For example, if I modelled Poles as having 1% Steppe mtDNA, their 99% other-mtDNA zombie would come out looking like Polish mtDNA. The only way to make the non-Neolithic or non-Steppe zombie of European mtDNA give reasonable results is if Steppe and Middle Neolithic make small contributions to European mtDNA while an unknown zombie population with mtDNA like modern Europeans makes a huge contribution.
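Under a simple two-source mixture model (my assumption about the math, not something stated in the post), this threshold can be computed directly. A zombie frequency stays non-negative only while share × source ≤ modern for every haplogroup, so the largest workable share is the smallest modern/source ratio. A minimal sketch, with a hypothetical function name and placeholder numbers:

```python
def max_feasible_share(modern, source):
    """Largest source share for which no residual frequency is negative.

    For each haplogroup, (modern - share * source) >= 0 requires
    share <= modern / source, so the binding limit is the smallest ratio.
    The result is capped at 1.0 because a share cannot exceed 100%.
    """
    ratios = [
        modern.get(hg, 0.0) / freq
        for hg, freq in source.items()
        if freq > 0  # haplogroups absent from the source impose no limit
    ]
    return min(ratios + [1.0])

# Hypothetical placeholder frequencies for illustration only.
modern_pop = {"H": 0.45, "K": 0.06, "other": 0.49}
neolithic_pop = {"H": 0.25, "K": 0.20, "other": 0.55}

limit = max_feasible_share(modern_pop, neolithic_pop)
print(f"largest share without negative frequencies: {limit:.0%}")
```

In this made-up example K is the binding haplogroup: it is much rarer in the modern population than in the source, so it caps how much the source could have contributed without selection or some other process changing frequencies.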

In other words, the only possible scenario in which natural selection didn’t affect European mtDNA is one where an ancient population with mHG frequencies like modern Europeans swept across Europe and replaced 60%+ of the preexisting mtDNA, giving everyone in Europe similar mHG frequencies. Sounds crazy, right?


  1. I suppose I should have left my comment here...

    How much do you know about RH factor incompatibility? Rhesus negative blood types peak in Europeans and in Basques among Europeans. Given the distribution local to Europe it's fair to assume that Mesolithic Western Europeans were high or perhaps homozygous for the Rhesus negative blood type.

    If a Rhesus negative woman conceives a child with a Rhesus positive man then typically she would not be able to have more than 1 child due to RH factor incompatibility. Heterozygotes are RH positive so there would be no Rhesus factor incompatibility for a homozygous RH negative man conceiving children with a homozygous RH positive woman.

    If we have a situation in Neolithic Western Europe where incoming (homozygous) RH positive farmer groups are exchanging wives with local RH negative Hunter Gatherer groups, then we would see a significant decrease in fertility for EEF Male / HG Female pairings. Especially in relation to HG Male / EEF Female pairings. As RH positive alleles spread through HG groups this would further reduce fertility among Rhesus negative women in HG groups.

    Unfortunately none of the ancient genomes we have so far have been typed for the Rhesus factor SNPs/gene (I've checked). But given modern distributions, the peak of RH negative blood types in Basques almost certainly comes from some Meso or Paleolithic founder effect in a refugium around the Pyrenees.

    The effect of this compounded over thousands of years would explain the disappearance of both mtDNA U5b (local HG mtDNA) and Y DNA G2a. Natural selection in effect, as you suggested.

  2. I've read that mtDNA H has a certain amount of resistance to sepsis. This obviously would be an advantage in ancient cultures. Also, I've read that yDNA R1b has more male offspring. I imagine that would contribute to certain mtDNA lineages vanishing. And the RH factor mentioned above would contribute to natural selection. And I sure wish you could answer the question regarding mtDNA H6. I believe mtDNA H6 is in modern populations in Russia & Ukraine: H6a1b in Yamnaya, H6a2 in Poltavka, H6a1a in Corded Ware Culture, H6a1a in Srubnaya; then, in 950 AD, H6a1b & H6a1a again in a Magyar (came from the Caspian Steppes) cemetery in Northern Hungary. My H6a1a line ended up in the Balkans in approx. 700 AD. jv

  3. What about bubonic plague? I've read that Yamnaya Culture populations may have carried this into Europe. jv

  4. Is there any way you can run the autosomal DNA of any mtDNA H6a sample? I would love to see the admix. Thank you, jv