shrinking shared cM


  • #1

    Does anyone know anything about FTDNA revising the Family Finder algorithm and so sharply reducing the shared cM for many of my matches? This seems to have happened sometime in the past year, perhaps even more recently than that.

  • #2
    Not surprising, although I don't recall specific documentation on this. There is not an absolute break point such that matching segments over a certain length are valid while segments below that length are invalid; rather, shorter segments are simply less likely to be valid. There are also sections of some chromosomes where apparently matching segments don't behave properly and are widely believed to be so problematic that they should be disregarded entirely (e.g., "pile-up regions"). Another question is whether to include matching X-chromosome segments in the overall total. It is a matter of art and experience where to set the lower limit of a "valid" segment and which sections to exclude, and FTDNA revisits the issue from time to time as more data accumulate. The total shared cM reported for the same raw autosomal DNA file on different web sites (which use different algorithms to determine shared cM) almost always differs, but in my experience rarely by enough to lead to different genealogical conclusions.


    • #3
      Thank you, Mr. McCoy. I asked FTDNA and learned that they did adjust their algorithm on July 1, 2021. (I don't recall them making any sort of announcement.) I haven't sifted through all of my notes, but it appears that the change had little or no effect on close relatives (first and second cousins); beyond that, the changes were significant. For example: 64 cM became 20 cM, 54 became 21, 61 became 29, 81 became 33, 61 became 36, 77 became 31, 75 became 32, 61 became 42, 68 became 44, 66 became 55, 65 became 48, and 58 became 36. In general, the effect is to move many matches from what I'd call strong third cousins down to weak third cousins or more distant. I can only hope that the new algorithm is more reliable, but I regret the time and enthusiasm I spent investigating matches that now appear to be not very promising.


      • #4
        I believe you are referring to how FTDNA calculated the cM totals, now reported in the 'Shared DNA' column (previously the 'Shared Centimorgans' column).

        Prior to July 1, if you had a match that consisted of 22 cM, 7 cM, 4 cM and 5 segments of 2 cM each, the total was 22 + 7 + 4 + 5*2, and was reported as 43 cM.

        Since July 1, segments of less than 6 cM are no longer included in the total. In my previous example, the total would now be 29 cM.
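The before/after totals described above amount to a simple threshold sum. A minimal sketch (the function and variable names are my own, not FTDNA's):

```python
def shared_cm_total(segment_lengths, min_cm=0.0):
    """Sum segment lengths in cM, counting only segments of at least min_cm."""
    return sum(s for s in segment_lengths if s >= min_cm)

# The example match above: 22 cM, 7 cM, 4 cM, and five 2 cM segments.
segments = [22, 7, 4, 2, 2, 2, 2, 2]
old_total = shared_cm_total(segments)             # pre-July-2021: count everything
new_total = shared_cm_total(segments, min_cm=6)   # post-July-2021: drop segments under 6 cM
print(old_total, new_total)  # → 43 29
```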

        In more extreme examples, you might have a match with a total of 60 cM, built from an accumulation of many tiny segments, reduced to one modest segment of perhaps 8 cM, or even dropped from your match list entirely. On closer matches, dropping from 250 to 225 was less dramatic.

        The absolute minimum for a match is now at least one segment over 7 cM. Before, there were a few ways to qualify as a match, with a 7.69 cM minimum segment (and a total of 20 cM) being the most common. Since the minimum dropped from 7.69 cM to 7 cM, my accounts generally gained a net of about 10% more matches last July 1.
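The qualification rule described here (at least one segment over 7 cM) can be sketched like so; the function name and the strict-inequality boundary are my assumptions, not FTDNA's published specification:

```python
def qualifies_as_match(segment_lengths, min_longest_cm=7.0):
    """A match requires at least one segment strictly over min_longest_cm."""
    return any(s > min_longest_cm for s in segment_lengths)

print(qualifies_as_match([7.5, 3, 2]))  # one segment over 7 cM → True
print(qualifies_as_match([6.9, 6.5]))   # no segment over 7 cM → False
```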

        Most people found the old algorithm, which included segments as low as 1 cM in the totals, rather misleading. In general, the "new and improved" algorithm after July 1, 2021 was seen positively, but there are certainly some people who valued the potential of the very small segments. Hopefully, those people downloaded their Segment Data prior to July 1, 2021.

        FTDNA published a white paper showing that segments below 6 cM become increasingly unreliable as you drop to lower cM values.

        No other testing site counts segments below 5 cM. GEDmatch allows you to look at segments down to 3 cM (or 2 cM if using Q matching), but it usually doesn't include these in the totals, so you don't get the misleading totals that FTDNA was showing before the update.


        • #5
          Not sure I would call small segments misleading, since the DNA you share with even very close family members is likely to dip into single digits on occasion. My nephew and I, for example, share everything from a small 6 cM segment right up to segments over 150 cM. When you are matching people five generations back, the small segments very much come into play when looking at your range of possible relationships, in my view. Someone sharing a lone 9 cM segment would be less important to me than someone else sharing 9 cM plus five small 2 cM segments on various other chromosomes. I'm not saying they should be used for matching, or even for viewing in the chromosome browser, but at least count them in an overall total.


          • #6
            From the FTDNA white paper 5.0, p. 15 (FFM_WhitePaper_V6.docx):

            cM    False positives (%)
             1    99.9
             2    99.6
             3    97.0
             4    81.5
             5    47.8
             6    19.7

            I think you can make a case for 5 cM segments being added to the total, but below 5 cM, the incidence of false positives increases excessively.

            These results are consistent with some other studies, although some of those put the half-way point between IBS and IBD at 6 to 7 cM.

            I prefer the total to reflect higher-confidence segments, but that's just my opinion. I didn't care for predictions of 3rd cousins based upon an accumulation of numerous very small segments.
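One way to read that table is as a per-segment confidence filter. A minimal sketch (the rates come from the white paper excerpt above; the 50% cutoff, the names, and treating segments of 7 cM and up as reliable are my own illustrative assumptions):

```python
# False-positive percentages by segment length in cM (white paper table above).
FALSE_POSITIVE_PCT = {1: 99.9, 2: 99.6, 3: 97.0, 4: 81.5, 5: 47.8, 6: 19.7}

def confident_total(segment_lengths, max_fp_pct=50.0):
    """Sum only segments whose estimated false-positive rate is below max_fp_pct.

    Segments of 7 cM and above are assumed reliable (rate treated as 0).
    """
    total = 0.0
    for s in segment_lengths:
        rate = FALSE_POSITIVE_PCT.get(int(s), 0.0)
        if rate < max_fp_pct:
            total += s
    return total

# 5 cM is kept (47.8% < 50%); 3 cM is dropped (97.0% >= 50%).
print(confident_total([22, 7, 5, 3]))  # → 34.0
```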


            • #7
              For me, totals including very small segments are indeed misleading, because there is no warning that the total shared cM might not be comparable to shared cM computed under other algorithms. The amounts of shared autosomal DNA widely used to infer possible relationships are based on data collected under the usual constraint of counting only segments above a certain minimum value, usually around 7 cM. The result of also including the smaller segments is that relationships may sometimes appear closer than they actually are. A workaround would be to label the shared autosomal DNA values clearly, such as "total of shared autosomal DNA segments over 7 cM", and to provide a separate total of "smaller segments" falling in a defined range.
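The labeling workaround proposed here could look something like this (the function name, labels, and the 7 cM cutoff are illustrative, not any vendor's actual output):

```python
def labeled_totals(segment_lengths, cutoff_cm=7.0):
    """Report two clearly labeled totals instead of one ambiguous number."""
    over = sum(s for s in segment_lengths if s > cutoff_cm)
    under = sum(s for s in segment_lengths if s <= cutoff_cm)
    return {
        "total of shared autosomal DNA segments over 7 cM": over,
        "total of smaller segments (7 cM and under)": under,
    }

print(labeled_totals([22, 9, 7, 4, 2, 2]))
```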


              • #8
                Everyone's matching methodologies are greatly flawed. The match sizes are big enough for close relationships that they get those right in spite of themselves. However, a lot of bad assumptions underlie the search for more distant relationships.


                • #9
                  It seems there may be more to this than just the problem with small segments and population segments.

                  I noticed a drop in the shared DNA between my half-brother and his mother when the new algorithm went into effect last July. Looking at the chromosome browser, they do not match at all on chromosome 19, and small segments are missing on some other chromosomes. When I compare the files at GEDmatch, there is a full half-match on all chromosomes, as expected. I reported this to FTDNA support and they agreed there is a problem. They submitted a ticket to the development team, but nothing ever changed. It is probably time to ping them about it again.

                  Interestingly, they each match with her mother's first cousin on most of chromosome 19. It is strange that they would then not match with each other. The problem could be with the new imputation algorithm, since both files were imported from Ancestry, but I would expect that to affect the cousin matching as well. It could also be a problem with the new endogamy algorithm: the family lines for the cousin likely have a high amount of endogamy. However, the families were also very prolific, with a high number of descendants doing DNA testing, which could be creating a bias in the algorithm.

                  It would be useful to know if others who have done parent/child testing are seeing similar issues.


                  • #10
                    An interesting observation! Here we see the need for honest, unfiltered display of genetic testing results. Problems in vendor algorithms can easily go undetected unless customers can see the unaltered results and alert the vendor to discrepancies. When each such illogical result is found, the vendor also needs to add data quality tests or edits, so that future algorithm changes can be tested for inconsistencies prior to implementation.