
Family Finder Advanced Topics Advanced discussion about Family Tree DNA's Family Finder Product.

Old 11th July 2017, 10:40 AM
Frederator Frederator is offline
FTDNA Customer
Join Date: Jul 2010
Posts: 754
DNA segments as a component of ancestral contribution rather than genetic clock

Help a layman achieve a more correct understanding of recombination and how it relates to the retention of ancestral contribution.

As I understand it, the cM (centimorgan) is not a unit of physical measurement but a statistical one: a stretch of DNA spans 1 cM if a crossover falls within it about 1% of the time per generation. So the cM value assigned to a fixed range of base pairs on a specific chromosome is really an estimate of that region's propensity to recombine.

Recombination is not completely random, but it isn't uniform either. Women recombine at a higher rate than men on average, and certain regions within chromosomes (recombination hotspots) are far more prone to crossovers than others. But recombination does not happen at a fixed rate that would make it analogous to a genetic clock.

So designating a block of, let's say, 10 million base pairs as 30 cM, while a different block of 30 million base pairs gets only 10 cM, is just a reflection of the fact that the first block lies in a physical location where crossovers with the homologous chromosome copy are particularly frequent, compared to the physical location of the second block.

In other words, a contiguous block of 66 cM is no more likely to recombine than a set of three distinct blocks of 22 cM each spread evenly across the genome. It's not as if that 66 cM block were somehow 2/3 of the way to its next scheduled recombination while each of the 22 cM blocks was just shy of 1/4 of the way. It's closer to the truth to say that the next recombination event is as likely to happen somewhere in the three 22 cM blocks, collectively, as it is in the single 66 cM block.
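A quick sanity check on that intuition. This is my own sketch, modeling crossovers as a Poisson process along the genetic map with no interference (an assumption for illustration, not anything from FTDNA):

```python
from math import exp

# Under a Poisson crossover model (no interference), the chance that at
# least one crossover lands somewhere in a set of blocks depends only on
# their total length in cM, not on whether they are contiguous.

def p_hit(cm):
    """Probability of at least one crossover in a block of `cm` centimorgans."""
    return 1 - exp(-cm / 100.0)

p_one_block    = p_hit(66)                  # one contiguous 66 cM block
p_three_blocks = 1 - (1 - p_hit(22)) ** 3   # three independent 22 cM blocks
print(p_one_block, p_three_blocks)          # identical, about 0.483
```

Both come out the same, which is exactly the point: under this model only total cM matters, not how it's packaged.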

Sorry if this seems a little obtuse, but I'm trying. Somebody wrote something in a different thread that just didn't sit right with me intuitively.

My interpretation of their comment was that a single contiguous segment with a large cM count was somehow more prone to complete replacement through recombination than an array of individually smaller segments, of collectively equal cM, spread more evenly over the 22 autosomes.

My first thought was that they were claiming that the factors that lead to large block size also lead to relatively larger areas being affected by the typical recombination event. But, based on what I know now, I don't think that is the case. The size, in cM or base pairs, of the regions affected by a given recombination event has nothing to do with the recombination rate in general. If that is wrong, please correct me.

My second thought was that they were claiming that each contiguous segment of ancestral contribution functions as a genetic clock, with its vulnerability to loss through recombination directly proportional to its size in cM. This logic had a superficial appeal, but only if I forgot what the concept of cM actually represents, and that the likelihood of a given event occurring increases with the number of trials.

What we're measuring when we calculate the probability that two cousins match each other isn't the likely loss of particular randomly selected segments through recombination, but rather the likely loss of any part of a given ancestor's contribution through recombination. Therefore it doesn't matter whether the ancestor's contribution sits in a single contiguous block or in three smaller blocks distributed evenly across the 22 autosomes, provided the sum total of the blocks in cM is equal.

If that really is the case, then the probability that cousins match is a straightforward function of the number of recombination events in the specific chain of descent from the common ancestors. That in turn correlates strongly with the sex of the specific ancestors in the chain of descent, since males have a predictably different recombination rate from females. The faster the recombination rate, the faster the ancestral contribution is reduced below the target matching threshold in cM.
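To make that compounding concrete, here is a rough sketch under the same idealized Poisson model (my simplification, not FTDNA's published method): a block of s cM passes through one meiosis unbroken with probability about exp(-s/100), and the carrying homolog is transmitted with probability 1/2. Sex-specific maps would simply scale the per-meiosis crossover rate up (female) or down (male).

```python
from math import exp

# Probability that a block of `s_cm` centimorgans survives `g` meioses
# completely intact: each meiosis must transmit the right homolog (1/2)
# and place no crossover inside the block (exp(-s/100), Poisson model).

def p_intact(s_cm, g):
    return (0.5 * exp(-s_cm / 100.0)) ** g

for g in (2, 4, 6):
    print(g, p_intact(30, g))
```

The survival probability drops steeply with each added meiosis, which is why the number of recombination events in the chain of descent dominates.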

Again, sorry if this seems obtuse, but the reliability of a model I'm using is at stake. To date my model has agreed very closely with the published figures from FTDNA, so I'm reluctant to make unilateral alterations to its fundamental logic.

Last edited by Frederator; 11th July 2017 at 10:46 AM.
Old 12th July 2017, 10:45 AM
John McCoy John McCoy is offline
FTDNA Customer
Join Date: Nov 2013
Posts: 516
Yes, recombination is the most fascinating mystery of classical genetics! The development of the genetic map based on the frequency of recombination was a huge conceptual breakthrough. However, it soon became apparent that the idea of mapping by the probability of a crossover had some complications. One of them is the way multiple crossovers affect the measured frequency of recombination as marker pairs get farther apart. The observed frequency of recombination (at least in a "well-behaved" laboratory system) tops out at 50%, even for markers separated by far more than 50 cM on the genetic map.

But once we have a genetic map based on recombination frequency as measured for markers that are, say, no more than about 20 cM apart (so that the distortions caused by multiple crossovers are minimized), is that the end of the story, or are there additional constraints on the location and/or frequency of additional crossovers? What is it about a particular neighborhood on a chromosome that leads to an increased or reduced frequency of single or multiple crossovers? The literature, stretching back about a century, is full of observations about the phenomenological details of recombination - and most of them have nothing whatever to do with the number of base pairs, or even the existence of DNA. (DNA explains a lot, of course, but for the most part, the phenomena were discovered and rationalized before the double helix.)

One of the features of autosomal matching algorithms is that they only work down to a (somewhat fuzzy) lower limit. The matching algorithms must eventually fail to detect segments below that limit, whether it is set by the number of SNPs, the degree of variation present within the segment, or some guessed-at minimum size expressed in cM. As a result, that part of a parental or grandparental contribution - the segments that are too short - becomes undetectable, and our statistics are therefore biased to some unknown extent.

The idea of the "biological clock" comes up in many contexts throughout biology, but I think it is frequently applied to situations where it is not appropriate. There is always an average mutation rate, or some other rate, but the leap from the mean rate of something (the number of events divided by the elapsed time) to the individual steps in a particular case is too often taken without considering whether the "law" of large numbers will really reveal the truth, and not just deceive us. The idea that matching segment length "decays" with each passing generation at a uniform and predictable rate, like some radioactive isotope, may be one of these situations. It has to be tested before we start basing conclusions on it! At least one company claims the shorter matching segments represent more remote common ancestry, but where are the statistics?
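A back-of-the-envelope version of that "decay" claim, under an idealized Poisson crossover model (my simplification, not anything the testing companies have published): a surviving shared segment separated by n meioses has an expected length of roughly 100/n cM, but the spread around that mean is wide, which is exactly why the claim needs testing rather than assuming.

```python
# Expected length (in cM) of a surviving shared segment when the two
# matches are separated by n meioses, under a Poisson crossover model:
# segment lengths are roughly exponential with mean 100/n cM, so the
# mean shrinks with distance but individual segments vary a lot.

def expected_segment_cm(n_meioses):
    return 100.0 / n_meioses

for n, label in ((4, "1st cousins"), (8, "3rd cousins"), (12, "5th cousins")):
    print(label, expected_segment_cm(n))
```

The means do fall with distance, but because the distribution is so skewed, any individual short segment says much less about the number of generations than the average does.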

For me, there are too many confounding factors and unknowns. We want to push genetic genealogy much farther than the facts warrant. The probabilities for relationships beyond about second or third cousins seem to be based on a combination of theory with statistics that may be biased in some way. There is room for a lot of exploration and discussion!
Old 12th July 2017, 11:10 AM
georgian1950 georgian1950 is online now
FTDNA Customer
Join Date: Jun 2012
Posts: 617
Mainstream matching methodology has gotten a lot wrong. For one, the idea that a small segment is likely to be false is just plainly mistaken. If the chance of getting at least a half match for a particular SNP is 80%, then the chance of randomly matching 100 SNPs in a row is a near impossibility. So what about all of these false positives, pseudo-segments, etc., which some experts have posited? A huge source of recent common ancestry exists which has been missed by genealogists. Small segments that pop up in a child, and segments that grow in the child relative to the parents, are caused by the parents sharing this common ancestry; they are not proof that some segments are false.
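To put a number on that (the 80% per-SNP figure and the independence between neighboring SNPs are illustrative assumptions here, not measured values):

```python
# If each SNP independently has an 80% chance of at least a half match,
# the chance of 100 such matches in a row by pure luck is tiny.

p_chance_match = 0.8 ** 100
print(p_chance_match)   # about 2.0e-10
```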

If you could filter out the matching segments which are caused by this common ancestry, the remaining segments may well uniquely identify a distant common ancestor.

Until we get the basics figured out, a lot of the higher-level analysis being done is just wasted effort.

Jack Wyatt
Old 12th July 2017, 12:18 PM
Frederator Frederator is offline
FTDNA Customer
Join Date: Jul 2010
Posts: 754
Originally Posted by John McCoy
. . .For me, there are too many confounding factors and unknowns. We want to push genetic genealogy much farther than the facts warrant. The probabilities for relationships beyond about second or third cousins seem to be based on a combination of theory with statistics that may be biased in some way. There is room for a lot of exploration and discussion!

But that's just the nature of complex systems. There are no straight lines in nature, and there is no such thing as an absolute 100% probability.

Any attempt to move forward in any discipline requires incremental improvement of your understanding of the overall probabilistic environment. That's what progress is: pushing those boundaries further.

On the other hand, not everyone's got what it takes to be a pioneer. At least not in every conceivable discipline. Most of the time only a very vague understanding is good enough.

I set out using genetic genealogy because I'd exhausted the available information from conventional genealogy. I have what would be considered a small number of matches, but by being very diligent I've been able to accomplish a few impressive feats. I've definitively broken through the brick wall of more than one of my matches.

I still haven't definitively accomplished my own primary goal, but I think I'm getting closer. I'll have to pay attention to any clue, no matter how subtle, in order to get there.

