I recently received results for a BigY700 kit I manage for a relative. Comparing these results to the block trees at FTDNA, the Big Tree and YFull, I'm starting to get an appreciation for some of the complexities in estimating Most Recent Common Ancestors ("MRCAs"). I think the biggest problem for outside projects is ensuring that kits of different resolutions are fully comparable, but even the testing companies themselves sometimes seem to get stymied by no-calls, etc. One universal problem that I don't think is going to be simply and definitively resolved any time soon is arriving at an average mutation rate. Unless we're talking about a relatively recent MRCA with a large number of matching donors, age estimates are necessarily going to remain somewhat speculative.
FGC23343 has yet another weird problem that I'm not sure I completely understand. One subclade (i.e., FGC28370) has an abnormally large number of mutations recorded within its defining blocks, something like twice as many as the next largest branch. How could this happen? I can only guess that it has something to do with a difference in test resolution among individual kits, but the magnitude still seems astounding to me. I know there was one SNP (i.e., FGC62822) that was recently reallocated from FGC28370 up to FGC23343, probably due to some kind of low-coverage-induced no-call situation, but I can't imagine too many more such "corrections" to the tree in the near future.
But I think I've come up with a solution to these dating problems that produces some pretty reasonable estimates. Below are my current guesses, based on a normalization algorithm that adjusts the reported number of SNPs based on the average number of SNPs among the lineages under FGC23343, and an assumed mutation rate of one SNP per 85 years.
Est. birth year | Clade |
250 A.D. | FGC23343 |
304 A.D. | FT372222 |
537 A.D. | BY97678 |
1430 A.D. | FGC28370 |
1462 A.D. | Y74676 |
1647 A.D. | FGC28369 |
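For anyone who wants to experiment, here is a minimal sketch of how a normalization like this could work. It is only one possible reading of the approach, not the exact calculation behind the table: it assumes each lineage's SNP count is rescaled by the ratio of the average total under FGC23343 to that lineage's own total. The lineage names, SNP counts and the 1950 reference birth year are all hypothetical placeholders, not real kit data.

```python
from statistics import mean

YEARS_PER_SNP = 85      # the assumed rate mentioned in the post
REFERENCE_YEAR = 1950   # hypothetical average birth year of the testers


def normalized_snps(snps_below_clade, lineage_total, all_lineage_totals):
    """Rescale a lineage's SNP count below a clade.

    The scale factor is (average total SNPs across lineages) divided by
    (this lineage's total), so an over-counted lineage is shrunk and an
    under-counted one is stretched.
    """
    scale = mean(all_lineage_totals) / lineage_total
    return snps_below_clade * scale


def estimated_birth_year(snps_below_clade, lineage_total, all_lineage_totals,
                         years_per_snp=YEARS_PER_SNP,
                         reference_year=REFERENCE_YEAR):
    """Convert the normalized SNP count to an MRCA birth-year estimate."""
    age = normalized_snps(snps_below_clade, lineage_total,
                          all_lineage_totals) * years_per_snp
    return round(reference_year - age)


# Hypothetical totals of SNPs below FGC23343 for three lineages.
totals = {"lineage_A": 20, "lineage_B": 10, "lineage_C": 15}

# lineage_A carries 8 of its 20 SNPs below some subclade of interest.
print(estimated_birth_year(8, totals["lineage_A"], list(totals.values())))
```

With these made-up numbers, lineage_A's 8 SNPs are scaled by 15/20 to 6, which at 85 years per SNP gives 510 years and a birth year of 1440. The point is only that a branch with an inflated SNP count gets pulled back toward the average before the rate is applied.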
Luckily, some of these dates are recent enough, and well enough supported by STR profiles, to allow some corroborating tests. I haven't yet exhausted all such possibilities, but preliminarily I can say that the midpoints of my normalized SNP-based curve and the STR curve for FGC28370's MRCA are only 30 years apart, per the McGee Y Utility and the McDonald Y DNA calculator. The confidence intervals for some of these estimates can be pretty shocking, but the MRCA dates seem to hold together with a coherence that I haven't seen in any other estimates. Seems like a good step forward in analyzing a tricky case.
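To give a feel for why the confidence intervals get so wide, here is a rough sketch that treats the SNP count on a branch as a Poisson variable and computes an exact 95% interval for it. This is a generic statistical illustration under that assumption, not what the McGee or McDonald tools actually compute, and the count of 6 SNPs is a hypothetical example.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def bisect_decreasing(f, lo, hi, iters=100):
    """Find the root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def poisson_interval(n, alpha=0.05):
    """Exact two-sided interval for a Poisson mean, given n observed SNPs."""
    # Upper bound: the mean at which seeing n or fewer has probability alpha/2.
    upper = bisect_decreasing(
        lambda lam: poisson_cdf(n, lam) - alpha / 2,
        n, n + 10 * math.sqrt(n + 1) + 10)
    # Lower bound: the mean at which seeing n or more has probability alpha/2.
    if n == 0:
        lower = 0.0
    else:
        lower = bisect_decreasing(
            lambda lam: poisson_cdf(n - 1, lam) - (1 - alpha / 2),
            0.0, n)
    return lower, upper


YEARS_PER_SNP = 85  # the assumed rate mentioned in the post

low, high = poisson_interval(6)  # hypothetical branch with 6 SNPs
print(f"SNPs: {low:.2f} to {high:.2f}")
print(f"Years: {low * YEARS_PER_SNP:.0f} to {high * YEARS_PER_SNP:.0f}")
```

For a branch with only 6 SNPs, the interval runs from a bit over 2 to about 13 SNPs, which at 85 years per SNP spans very roughly 190 to 1,110 years. That is before adding any uncertainty in the mutation rate itself, so wide ranges on young branches are to be expected.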