E1b1a clusters may signal new haplogroups

  • E1b1a clusters may signal new haplogroups

    To see this post with the original pics, you're better off going here:

    Last year, Lynn Sims did an important study that found a new SNP (U209) that reassigns more than half of the former E1b1a* samples (ex-E3a*) into the new haplogroup E1b1a8 (ex-E3a8).

    E1b1aSims2007.gif (see attachments below)

    I found 4 clusters within E1b1a, and 2 of them might indicate new haplogroups. I think the YCAII=19/19 cluster is E1b1a8, the YCAII=19/21 cluster is mostly E1b1a7 (former E3a7), the YCAII=21/21 cluster is either E1b1a1 (former E3a1) or a new haplogroup, and the DYS388=13 cluster is a new haplogroup.

    E1b1aclusters.gif (see attachments below)

    The above graph shows only 40 samples, but I originally used 500 samples from ysearch to detect these clusters, half of which had 25- and 37-STR haplotypes. In the graph I circled the obvious differences between clusters, but there are half a dozen more in each cluster, such as DYS19 and DYS389B to separate 19/19 from 19/21.
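As a rough illustration of the cluster definitions above, here is a minimal sketch that assigns a haplotype to one of the four clusters using only the defining markers named in the post (YCAII and DYS388). The function name, the dictionary layout, and the choice to check DYS388=13 before YCAII are assumptions made for the example. A real assignment would also weigh the other distinguishing markers mentioned above, such as DYS19 and DYS389B.

```python
# Hypothetical helper: labels a haplotype by the defining markers only.
YCAII_CLUSTERS = {
    (19, 19): "19/19",
    (19, 21): "19/21",
    (21, 21): "21/21",
}


def assign_cluster(haplotype):
    """Return a cluster label for a dict of marker name -> value.

    YCAII is expected as a pair of repeat counts, DYS388 as an int.
    Checking DYS388 first is an assumption of this sketch.
    """
    if haplotype.get("DYS388") == 13:
        return "388=13"
    ycaii = haplotype.get("YCAII")
    if ycaii is None:
        return "unassigned"
    # Sort the pair so (21, 19) and (19, 21) are treated alike.
    pair = tuple(sorted(ycaii))
    return YCAII_CLUSTERS.get(pair, "unassigned")
```

Anything that matches none of the defining values comes back as "unassigned" rather than being forced into a cluster.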

    I looked at thousands of samples from Africans in, and I was able to calculate the frequencies of these clusters throughout Africa. There were some coincidences and some misses with the results from the Sims study.

    E1b1aclusterfrequencies.gif (see attachments below)

    [continued in next post]
    Attached Files
    Last edited by argiedude; 8 September 2008, 10:08 PM.

  • #2
    Cluster 19/21
    This cluster makes up 50% to 55% of the E1b1a of African Americans. Sims found that in African Americans, E1b1a is made up of 40% E1b1a7 and 18% E1b1a*. I think the 19/21 cluster corresponds to both haplogroups, because of the coincidence in percentages and the fact that the 19/21 cluster has the highest diversity of the 4 clusters. Also, 19/21 has a "generalized" haplotype which gives it an equal genetic distance to the other 3 clusters, while the other 3 clusters are each closer to 19/21 than to each other. In other words, 19/21's modal haplotype is close to the root of E1b1a. Also, a study of Guinea-Bissau found very few cases of haplogroup E1b1a7, but looking at the accompanying haplotypes of the study I estimate that 50% of the E1b1a belongs to the 19/21 cluster.
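The "equal distance to the other 3 clusters" argument can be sketched in code. The example below assumes a simple stepwise genetic distance (sum of absolute repeat-count differences) and uses made-up toy modal haplotypes with placeholder marker names M1 to M3. None of these values are real cluster modals, and the post does not say which distance measure was actually used.

```python
def genetic_distance(a, b):
    """Sum of absolute repeat-count differences over shared markers."""
    return sum(abs(a[m] - b[m]) for m in a.keys() & b.keys())


def most_central(modals):
    """Name of the modal haplotype with the smallest total distance to the rest."""
    totals = {
        name: sum(
            genetic_distance(hap, other)
            for other_name, other in modals.items()
            if other_name != name
        )
        for name, hap in modals.items()
    }
    return min(totals, key=totals.get)


# Toy data only: "hub" differs from each of the others by one step,
# while the others differ from each other by two steps.
TOY_MODALS = {
    "hub": {"M1": 10, "M2": 10, "M3": 10},
    "a": {"M1": 11, "M2": 10, "M3": 10},
    "b": {"M1": 10, "M2": 11, "M3": 10},
    "c": {"M1": 10, "M2": 10, "M3": 11},
}
```

With the toy data, `most_central(TOY_MODALS)` picks "hub", which is the pattern described for 19/21: the other clusters are each closer to it than to one another.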

    After becoming convinced this cluster encompasses 2 haplogroups, I looked at it carefully to find if there were any sub-clusters within it, and despite trying as hard as I could, I found absolutely nothing. But I don't think this invalidates that the cluster could belong to 2 haplogroups, because an identical situation can be seen in the many haplogroups of R1b1b2, all of them virtually indistinguishable from each other in their haplotypes.

    In ysearch there are 550 samples listed under E1b1a, but only 15 are listed under E1b1a7. Aside from the obvious fact that this must mean most of the E1b1a7 in ysearch is listed under E1b1a, the significant thing is that 14 of the 15 samples listed under E1b1a7 belong to the 19/21 cluster. I suppose these are samples that have been SNP-tested, which is how their haplogroup is known. So it's yet another indication that this cluster corresponds mostly to E1b1a7 (and a significant minority to E1b1a*).

    Cluster 19/19
    This cluster makes up 40% of E1b1a in African Americans. Sims found that E1b1a8 was 40% of the E1b1a of African Americans. It's the only option left, and the percentages fit perfectly, so I'm guessing cluster 19/19 is haplogroup E1b1a8. It has a somewhat lower diversity than 19/21 and, like 19/21 again, it's remarkably uniformly distributed across all of sub-Saharan Africa, at a lower frequency of 40% of the E1b1a samples (not 40% of the y-dna).

    Cluster 21/21
    This cluster makes up 3% of E1b1a in many populations of African Americans. E1b1a1 (former E3a1) also usually appears at a rate of 3% of the E1b1a in Africans. You would think this settles it. Unfortunately, I found a study (Butler 2002) which showed an extended haplotype for 2 SNP-tested samples of E1b1a1, and they definitely did not belong to the 21/21 cluster. Both samples were from southern Africa, where the 21/21 cluster is found at the same rate as in other places, so it's unlikely the southern African E1b1a1 has a different haplotype than elsewhere. But...

    I realized something at the last moment that tilted the balance in favor of 21/21 being equal to haplogroup E1b1a1 after all. The Butler E1b1a1 samples have no close matches at all in ysearch, but it's a fact that E1b1a1 makes up 3% of the E1b1a of African Americans, who make up the vast majority of the ysearch samples, so the likely explanation is that the Butler E1b1a1 samples are a deviant regional haplotype. With these 2 stumbling blocks out of the way, the other similarities make me think that the 21/21 cluster is in fact haplogroup E1b1a1 (ex-E3a1). Damn, I would've liked a new haplogroup.

    The cluster's diversity is pretty low, despite being found from Guinea-Bissau to South Africa.

    [continued in next post]
    Last edited by argiedude; 8 September 2008, 10:09 PM.


    • #3
      Cluster 388=13
      This is the most obvious cluster. It can be easily identified, even with a 12-marker haplotype. It has 2 extremely rare values: DYS390=22 and DYS388=13.

      This cluster is the only one that has an uneven distribution. It's centered in the Guinean region, and is likely very rare in the rest of West Africa and virtually nonexistent beyond. It makes up about 15% of the E1b1a of Guinea-Bissau, and 4% in African Americans. Its frequency in African Americans fits very well with the composition of their ancestors, which further leads me to believe this cluster is extremely rare in the rest of West Africa and virtually nonexistent outside it.
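The frequency argument here is a simple mixture calculation, which can be sketched as follows. The sketch takes the post's figures (about 15% of E1b1a in Guinea-Bissau, 4% in African Americans) and adds two simplifying assumptions that are not in the post: that the cluster is entirely absent outside the source region, and that Guinea-Bissau's rate is representative of that whole region. The function names are hypothetical.

```python
def expected_frequency(shares, freqs):
    """Expected cluster frequency in a mixed population.

    shares: region -> fraction of ancestry from that region (sums to 1).
    freqs:  region -> cluster frequency within that region's E1b1a.
    """
    return sum(shares[region] * freqs[region] for region in shares)


def implied_share(observed, source_freq):
    """Ancestry share from the source region implied by an observed frequency,
    assuming the cluster is absent everywhere else (a simplification)."""
    if source_freq <= 0:
        raise ValueError("source_freq must be positive")
    return observed / source_freq


# Figures from the post: 4% in African Americans, about 15% in Guinea-Bissau.
share = implied_share(0.04, 0.15)
```

Under those assumptions `share` comes out a little above one quarter (4/15). If the cluster does occur at a low rate elsewhere, the implied share from the Guinean region would be lower.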

      It doesn't correspond with any known haplogroup. E1b1a1 is uniformly present in all of Africa, its haplotype is extremely different (whether you take the Butler E1b1a1 samples or the 21/21 cluster), and I have good reason to believe E1b1a1 is already spoken for and belongs to cluster 21/21. Haplogroups E1b1a2 to E1b1a6 are either private haplogroups or very, very rare amongst African Americans, which doesn't fit with the 4% presence of this cluster amongst the E1b1a of African Americans. And E1b1a7 and E1b1a8 are already accounted for and have frequencies close to 50%, much higher than this cluster.

      So cluster DYS388=13 seems to be a great candidate for a new, yet-to-be-identified haplogroup within E1b1a.

      EDIT: I made an initial estimate of the variance using 35 STRs. Now I did a 2nd estimate removing all multi-copy and fast markers, and there was an important difference. The 2nd estimate, using 18 STRs, is much more likely to be closer to the reality of the situation. It's interesting that 388=13 has the same variance as 19/19, given that 19/19 is a huge fraction of E1b1a and exists everywhere, while 388=13 is probably limited to the Guinean region (western 1/3 of West Africa) and even there isn't very abundant.

      E1b1avariance.gif (see attachments below)
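For anyone wanting to repeat the variance comparison, here is a minimal sketch of the idea: compute the variance of repeat counts per marker within a cluster, average across markers, and leave out whichever markers are treated as multi-copy or fast. The post does not give the exact formula or the list of excluded markers, so the use of population variance, the function name, and the placeholder marker names are all assumptions.

```python
from statistics import pvariance


def mean_str_variance(haplotypes, exclude=frozenset()):
    """Average per-marker variance of repeat counts across haplotypes.

    haplotypes: list of dicts, marker name -> single integer repeat count.
    exclude:    markers to skip (e.g. multi-copy or fast-mutating ones).
    Only markers present in every haplotype are used.
    """
    if not haplotypes:
        raise ValueError("need at least one haplotype")
    markers = set.intersection(*(set(h) for h in haplotypes)) - set(exclude)
    if not markers:
        raise ValueError("no markers left after exclusion")
    per_marker = [pvariance([h[m] for h in haplotypes]) for m in sorted(markers)]
    return sum(per_marker) / len(per_marker)


# Toy data with placeholder marker names, not real samples.
TOY_CLUSTER = [
    {"M1": 10, "M2": 20, "FAST1": 30},
    {"M1": 12, "M2": 20, "FAST1": 36},
]
```

Calling the function once with no exclusions and once with `exclude={"FAST1"}` shows how much a single fast marker can inflate the estimate, which is the effect described in the edit above.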
      Last edited by argiedude; 8 September 2008, 10:08 PM.