I've been looking for a reliable estimate of average mutation rates for Big Y 700. To date I've seen only crude, unsupported estimates, ranging from one mutation every 81 years to one every 160 years. This was kind of troubling because I noted an enormous variation in the number of mutations since the MRCA for the FGC23343+ group (i.e., from 13 to 34). I really wanted some semi-technical explanation.

I think I've finally found a good one, but I'd appreciate some input from knowledgeable users to see how far off my understanding is. Anyhow, here's the situation as I currently understand it: The base mutation rate is about 2.3*10^(-8), or 1 per 23,255,814 base pairs tested.

https://pubmed.ncbi.nlm.nih.gov/19716302/

This estimate can be confidently converted into TMRCA only if you know the # of base pairs that have been ***reliably*** tested for a specific donor, and this is where it gets a bit tricky. Standard product specifications can only give us a good estimate of the absolute # of base pairs tested--about 23.6 million for Big Y 700.

https://blog.familytreedna.com/wp-co...compressed.pdf

But that's NOT the same thing as the number of base pairs ***reliably*** tested because of flukey things outside of the company's control, like the quality of the sample, etc., which could prevent some regions from receiving an adequate number of high quality scans to be considered reliable. The sample for the kit I co-admin is nearly 13 years old, so I wouldn't be surprised if sample quality alone accounted for our less-than-expected number of SNPs--although all in all, we actually fell within the average range for the FGC23343+ people, so I'm not disappointed. But I would like to better understand what the other variables could be.

Anyhow, assuming 100% fully expected coverage for the tested areas, I think the conversion to an estimate of average years per mutation should look like this:

23,255,814 / 15,000,000 * 33 = 51.16, i.e., 1 mutation every 51.16 years

The 23 million number is the base odds discussed above, and the 33 years is an estimate of the number of years between generations. The 15 million number is an estimate of the population-average number of base pairs receiving reliable scans per product specs. I want to see some kind of support for THAT number. To date all I've seen is a flat statement from the DNAeXplained blog, without any technical specifics.
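
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conversion above. The function and variable names are hypothetical, and the inputs are just the figures quoted in this post, not independently verified values. It also flags one thing worth double-checking against the paper: 23,255,814 is the reciprocal of 4.3*10^(-8), whereas the reciprocal of 2.3*10^(-8) is about 43,478,261, so the quoted rate and the "1 per N" figure don't quite match.

```python
def years_per_mutation(bp_per_mutation, covered_bp, years_per_generation):
    """Average years between mutations over the covered region.

    bp_per_mutation: reciprocal of the per-base-pair, per-generation rate.
    covered_bp: base pairs assumed to be reliably scanned.
    years_per_generation: assumed generation length in years.
    """
    generations_per_mutation = bp_per_mutation / covered_bp
    return generations_per_mutation * years_per_generation


BP_PER_MUTATION = 23_255_814  # "1 per N base pairs" figure used in the post
YEARS_PER_GENERATION = 33     # generation length assumed in the post

# 15 million reliably scanned base pairs (the figure still lacking support)
print(round(years_per_mutation(BP_PER_MUTATION, 15_000_000, YEARS_PER_GENERATION), 2))  # 51.16

# 23.6 million base pairs (the spec figure, if every base pair were reliable)
print(round(years_per_mutation(BP_PER_MUTATION, 23_600_000, YEARS_PER_GENERATION), 2))  # 32.52

# Consistency check between the quoted rate and the "1 per N" figure
print(round(1 / 4.3e-8))  # 23255814
print(round(1 / 2.3e-8))  # 43478261
```

The point of the second call is only to show how sensitive the answer is to the coverage figure: the same rate gives about 51 years per mutation at 15 million base pairs and about 33 years at 23.6 million.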

https://dna-explained.com/category/big-y-700/

I'm not questioning the veracity of the number. It hits the nail right on the head for two sets of donors under FGC23343. I just want to know how accurate this understanding is. I know there is a lot of discussion about whether particular mutations can be considered true SNPs based on whether they are located in or outside of the so-called "comBED regions", but I'm assuming that is not a relevant variable in this particular case because the FTDNA white paper specifies the region tested, and I'm assuming that FTDNA would not test it unless it fully met the "comBED region" criteria. I'm more interested in why there should be such a wide variety of reported SNPs since the MRCA among Big Y 700 donors.
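
On the spread in SNP counts, part of it may simply be counting noise. As a rough sketch, if mutations are treated as a Poisson process (an assumption on my part, not something the sources above state), the count for any one line over a fixed time span will scatter quite widely around its expected value. The expected count of 22 below is a made-up illustrative number, not a measured group average, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with the given mean."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def poisson_interval(mean, coverage=0.95):
    """Smallest and largest counts bounding the central `coverage` probability mass."""
    tail = (1.0 - coverage) / 2.0
    cumulative = 0.0
    low = None
    k = 0
    while True:
        cumulative += poisson_pmf(k, mean)
        if low is None and cumulative > tail:
            low = k  # first count at which the lower tail is exceeded
        if cumulative >= 1.0 - tail:
            return low, k  # first count at which the upper tail is reached
        k += 1


EXPECTED_MUTATIONS = 22.0  # hypothetical expected count since the MRCA

low, high = poisson_interval(EXPECTED_MUTATIONS)
print(f"central 95% range for a mean of {EXPECTED_MUTATIONS}: {low} to {high}")
```

This says nothing about coverage or sample quality; it only shows how much scatter chance alone would produce under that assumption, which can then be compared against the observed 13-to-34 range.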

And as a post-script, I think I may have inadvertently found a good workaround for age estimates within my wider FGC23343 group: using an average of 85 years per mutation and adjusting each branch for the ratio of its intra-clade average number of mutations to the average for the entire group. I arrived at that number subjectively by taking a survey of recommendations from internet posts, without any technical support or reasoning, but it seems to work pretty well with the calculated overall 51.16 year average for the "most active" subclades. It also jibes pretty well with STR results for one numerous subclade with divergent branches whose MRCAs were born in the 1600s and the 1400s. For that reason I'm pretty happy with it, although I doubt 85 years could be very useful as a population-wide generalization unless typical coverage is actually significantly lower than that 15 million base pair number implies. Hard to know without seeing where that number came from.
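
To make the post-script concrete, here is a rough Python sketch of one way to read that adjustment: scale the 85-year figure by the ratio of the group-wide average mutation count to the clade's own average. That reading, the function names, and the group average of 22 are illustrative assumptions; only 85, 33, 23,255,814, and the 13-to-34 range come from this post. The last line backs out the coverage that an 85-year average would imply.

```python
def adjusted_years_per_mutation(base_years, clade_avg_mutations, group_avg_mutations):
    """Scale the group-wide years-per-mutation figure for one clade.

    A clade with more mutations than the group average gets fewer years
    per mutation, and vice versa (assumed interpretation of the workaround).
    """
    return base_years * group_avg_mutations / clade_avg_mutations


def implied_coverage_bp(bp_per_mutation, years_per_generation, years_per_mutation):
    """Base pairs that would have to be reliably scanned to give this average."""
    return bp_per_mutation * years_per_generation / years_per_mutation


BASE_YEARS = 85   # subjective group-wide average from the post
GROUP_AVG = 22.0  # hypothetical group-wide average mutation count

# Clades at the two ends of the observed 13-to-34 range
print(adjusted_years_per_mutation(BASE_YEARS, 34, GROUP_AVG))            # 55.0
print(round(adjusted_years_per_mutation(BASE_YEARS, 13, GROUP_AVG), 1))  # 143.8

# Coverage implied by an 85-year average, using the post's rate and generation length
print(round(implied_coverage_bp(23_255_814, 33, 85)))  # 9028728
```

With these inputs, an 85-year average corresponds to roughly 9 million reliably scanned base pairs, well under the 15 million figure.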

## Comment