Centimorgans, Block and X-Match


  • Centimorgans, Block and X-Match

    I'm sorry. I've searched and read a few posts on this and I still don't understand it. So here's an example. It's a real entry on my Family Finder list:

    Shared centimorgans: 188
    Longest block: 15
    X-Match: X-Match

    What does that mean? Are there ranges in the first two that are more significant than others? Is there an interrelationship among the three?

  • #2
    Originally posted by markk View Post
    I'm sorry. I've searched and read a few posts on this and I still don't understand it. So here's an example. It's a real entry on my Family Finder list:

    Shared centimorgans: 188
    Longest block: 15
    X-Match: X-Match

    What does that mean? Are there ranges in the first two that are more significant than others? Is there an interrelationship among the three?
    Shared centimorgans is the total amount of DNA shared between you and your match; the total is usually made up of many individual segments/blocks of shared DNA. Longest block is the longest individual segment/block of shared DNA between you and your match. See how you have a large number for the shared centimorgans and a very small number for the longest block? If you and your match are from an endogamous group that would be the reason why. For you I would focus on the Longest Block category and sort your list based on who shares the longest block of DNA. My guess is you would want to focus on those that have a longest block of probably at least 20 cM if not higher.

    As for the X-Match, it just signifies that the two of you share some DNA on the X chromosome. An X match usually doesn't mean much unless you share 20 cM or more on the X. You have to look at the chromosome browser to see how much is shared on the X; most of the time the X match will be very small and not worth your time investigating.


    • #3
      The total centiMorgans (cM) shared, as measured by the major vendors by their own criteria (which normally means that only segments above some minimum size are included in the total), turns out to be highly predictive of the actual relationship between two people, out to about 3rd cousins. For more remote relationships, ranges overlap too much to be helpful. Total cM should NOT include any matching segments on the X chromosome. There is a handy chart of the average and range of shared cM found for various relationships. The chart is now scattered all over the internet; do a Google search on "DNA Detectives Autosomal Statistics Chart".

      188 cM puts you in the range of second cousins (and half a dozen other possibilities involving half relationships and "removed" cousins). The longest block is a nice-to-have statistic, not particularly useful in itself. The existence of an X match is often not useful at all, unless the relationship as measured by total cM is very close and the matching segment on the X chromosome is a large one, at least 10 cM, and some people recommend at least 20 cM. As a practical matter, even a large X match without close relationship as measured by total autosomal cM is rarely helpful, because it does not point you in any particular direction for discovering the identity of the shared ancestor. Rather, if you can work out the actual connection from the autosomal match with yourself and known relatives, the extra X match can sometimes provide additional validation.


      • #4
        Originally posted by mattn View Post
        See how you have a large number for the shared centimorgans and a very small number for the longest block? If you and your match are from an endogamous group that would be the reason why. For you I would focus on the Longest Block category and sort your list based on who shares the longest block of DNA.
        Originally posted by John McCoy View Post
        The total centiMorgans (cM) shared, ...turns out to be highly predictive of the actual relationship between two people, out to about 3rd cousins.
        *** The longest block is a nice-to-have statistic, not particularly useful in itself.
        This is where I get lost. One of you says cM shared is highly predictive of an actual relationship between two people; the other says it may simply reflect that I am part of a large ethnic group which historically tended to avoid intermarriage.

        One says the longest block is important; the other that it's not particularly useful.


        • #5
          I don't see how the leap to assigning both you AND your match to an "endogamous group" got into the conversation, with no background to suggest that that is actually the case for either of you. It is the total shared cM that is the most informative metric for most of us, and that is why the various charts and tables showing ranges of shared cM for different relationships have proved so useful and so popular among genetic genealogists. If there is a useful table based on the longest segment, I haven't found it. I would start with the total shared cM, and see where it leads.


          • #6
            Originally posted by John McCoy View Post
            I don't see how the leap to assigning both you AND your match to an "endogamous group" got into the conversation, with no background to suggest that that is actually the case for either of you. It is the total shared cM that is the most informative metric for most of us, and that is why the various charts and tables showing ranges of shared cM for different relationships have proved so useful and so popular among genetic genealogists. If there is a useful table based on the longest segment, I haven't found it. I would start with the total shared cM, and see where it leads.
            And for that you may want to read Autosomal DNA statistics https://isogg.org/wiki/Autosomal_DNA_statistics

            You will find a nice table from The Shared cM Project there. Click to enlarge it. You may want to print it or bookmark it (or both). You will be referencing it a lot when working with Family Finder results.



            Mr. W.
            Last edited by dna; 22 April 2018, 05:56 PM.


            • #7
              Originally posted by John McCoy View Post
              I don't see how the leap to assigning both you AND your match to an "endogamous group" got into the conversation, with no background to suggest that that is actually the case for either of you. It is the total shared cM that is the most informative metric for most of us, and that is why the various charts and tables showing ranges of shared cM for different relationships have proved so useful and so popular among genetic genealogists. If there is a useful table based on the longest segment, I haven't found it. I would start with the total shared cM, and see where it leads.
              It is common knowledge in genetic genealogy that a large amount of total shared DNA made up of many small segments is oftentimes indicative of endogamy. It is by no means a "leap" as you say, and I said "If you and your match are from an endogamous group". The charts you follow are fine, but they don't work in every situation, so it is more of a leap on your part to assume they will help without finding out the situation here.

              markk, how many 2nd to 3rd cousins do you have in Family Finder, and how many cousins in your list share more than 100 cM in total? If you look at your myOrigins, do you have Jewish Diaspora such as Ashkenazi assigned? Or are you from any ancestral groups that you know of, such as French Canadian, Amish or Polynesian perhaps?


              • #8
                Originally posted by mattn View Post
                It is common knowledge in genetic genealogy that a large amount of total shared DNA made up of many small segments is oftentimes indicative of endogamy. It is by no means a "leap" as you say, and I said "If you and your match are from an endogamous group". The charts you follow are fine, but they don't work in every situation, so it is more of a leap on your part to assume they will help without finding out the situation here.
                I second that. It's common knowledge within the genetic genealogy community that a rather large total shared amount combined with a small longest block (largest segment) is an indicator of an endogamous population.

                I have pages of matches at 100 cM and larger where the longest block is about 10 cM to 12 cM. Certainly, the predicted relationship based on the total shared will be completely off.

                Here's a good example of what my cousin shows with a person who shares no geographical ties at all. I identified the most distant matches in this graph with "4,000+ miles", indicating that the match comes from an island just over 4,000 miles away. They have been isolated on their island for 8 centuries, just as we have for 8 centuries, and even after European contact none of the descendants has yet crossed into the other's island nation.
                Attached Files


                • #9
                  I viewed a video webinar by Blaine Bettinger (of "The Genetic Genealogist" blog) a month or so ago, at Legacy Family Tree Webinars. It's no longer available for non-subscribers, though. For all companies, he advised:
                  • ignoring company estimates for relationships (i.e., 2nd cousin, 4th-distant, etc.) and using the total DNA you share instead.
                  • then checking a cM relationship chart, such as his own "Shared cM Project" chart, for which a link has been given earlier in this thread.
                  • being careful when drawing your conclusions.

                  He had a special warning for Family Tree DNA, though, for which he advised removing all segments less than 5 cM, before proceeding with the above. This is because FTDNA includes many such small segments which raise the total amount of cM misleadingly. Doing this may leave just a few segments, or only a single larger segment, but if so, go ahead and use that with a cM relationship chart. You can see all the segments shared using the Chromosome Browser; using the link on that page to view the results in a table can be helpful for seeing all the small segments.

                  You really only want to use segments with a minimum of 7 cM, but even better 10 or 15 cM in length, to see matches that are real.
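
The filtering step described above can be sketched in Python. This is only an illustration under stated assumptions: the segment lengths are invented values, not data from any real match, and `summarize_segments` and `min_cm` are hypothetical names, not part of any vendor tool.

```python
def summarize_segments(segments_cm, min_cm=5.0):
    """Drop segments shorter than min_cm, then return (total, longest) in cM."""
    kept = [s for s in segments_cm if s >= min_cm]
    total = sum(kept)
    # default=0.0 covers the case where no segment survives the filter
    longest = max(kept, default=0.0)
    return total, longest


# Hypothetical segment lengths in cM (invented for illustration only).
example_segments = [15.0, 9.5, 7.2, 4.1, 3.3, 2.8, 1.9, 1.2]

# Compare the totals at a few minimum-segment thresholds.
for threshold in (1.0, 5.0, 7.0):
    total, longest = summarize_segments(example_segments, min_cm=threshold)
    print(f"min {threshold} cM: total {total:.1f} cM, longest {longest:.1f} cM")
```

The filtered total, rather than the headline total, is what would then be looked up in a cM relationship chart; the longest block is unaffected as long as it is above the threshold.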

                  Mr. Bettinger's books are highly recommended, particularly "The Family Tree Guide to DNA Testing and Genetic Genealogy."
                  Last edited by KATM; 18 June 2018, 07:19 AM. Reason: added a sentence


                  • #10
                    Originally posted by KATM View Post
                    He had a special warning for Family Tree DNA, though, for which he advised removing all segments less than 5 cM, before proceeding with the above. This is because FTDNA includes many such small segments which raise the total amount of cM misleadingly. Doing this may leave just a few segments, or only a single larger segment, but if so, go ahead and use that with a cM relationship chart. You can see all the segments shared using the Chromosome Browser; using the link on that page to view the results in a table can be helpful for seeing all the small segments.

                    You really only want to use segments with a minimum of 7 cM, but even better 10 or 15 cM in length, to see matches that are real.

                    Mr. Bettinger's books are highly recommended, particularly "The Family Tree Guide to DNA Testing and Genetic Genealogy."
                    Going up against a renowned expert like him is difficult, but he is so wrong on that recommendation.

                    Jack Wyatt


                    • #11
                      Originally posted by georgian1950 View Post
                      Going up against a renowned expert like him is difficult, but he is so wrong on that recommendation.

                      Jack Wyatt
                      Which recommendation was wrong? Removing the small segments less than 7cM?


                      • #12
                        Originally posted by mamoahina View Post
                        Which recommendation was wrong? Removing the small segments less than 7cM?
                        That is part of it, but the whole methodology has bad assumptions underlying it. Family Finder uses enough SNPs with 1.0 cM segments that a false positive is unlikely. The idea that a distant, large matching segment is IBD is absurd. From a logic and probability viewpoint, smaller matching segments are more likely to be IBD than larger segments.

                        The large and many smaller matching segments come from a huge amount of common ancestry that we had within the last 300 years. The matches are built up by both kits having several paths back to the same common ancestor. I am working on a detailed proof of this and hope to have something out in a few weeks.

                        Jack Wyatt


                        • #13
                          http://www.statisticshowto.com/how-t...ring-together/

                          "Or"--two small segments, not necessarily located adjacent to one another

                          0.05 + 0.05 = 0.1 or 1/10

                          "And"--two small segments, located immediately adjacent to one another

                          0.05 * 0.05 = 0.0025 or 1/400

                          Clearly, the probability of two randomly selected people registering an IBS match on a single 8 cM segment is much lower than them registering as IBS matches on two separate 4 cM segments.
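
The two rules quoted above can be checked numerically. A minimal sketch, assuming the two events are independent and reusing the illustrative value of 0.05 from the arithmetic above; that value is not a measured IBS rate. For independent events the exact "or" probability is slightly below the simple sum, because the overlap is subtracted once; the plain sum is exact only for mutually exclusive events.

```python
# Illustrative per-segment probabilities taken from the arithmetic above
# (assumed values, not measured IBS rates).
p_a = 0.05
p_b = 0.05

# Multiplication rule: both independent events occur together.
p_and = p_a * p_b

# Addition rule for independent events: at least one of the two occurs.
p_or = p_a + p_b - p_and

print(f"P(A and B) = {p_and:.4f}")
print(f"P(A or B)  = {p_or:.4f}")
```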

                          Multiplication is basic math here. Nothing fancy. Why do so many people get this wrong? Why do they keep doubling down on their error even after they've been repeatedly proven wrong? My guess is that it's embarrassing to have messed up such simple math.

                          I can only offer the additional observation that in this case, repeating the error only compounds it. This is a branch of the physical sciences. This isn't a social "science", like politics, where the only thing that matters is whether you are believed, not whether the statement is factually correct.


                          • #14
                            Originally posted by Frederator View Post
                            Clearly, the probability of two randomly selected people registering an IBS match on a single 8 cM segment is much lower than them registering as IBS matches on two separate 4 cM segments.

                            Multiplication is basic math here. Nothing fancy. Why do so many people get this wrong? Why do they keep doubling down on their error even after they've been repeatedly proven wrong? My guess is that it's embarrassing to have messed up such simple math.

                            I can only offer the additional observation that in this case, repeating the error only compounds it. This is a branch of the physical sciences. This isn't a social "science", like politics, where the only thing that matters is whether you are believed, not whether the statement is factually correct.
                            Thanks for the analysis. I am not quite sure why you changed the discussion from IBD to IBS, but IBS is another concept pushed by the establishment which has no merit. Some analysis that I did, which needs to be revisited and expanded upon, found about a 0.8 probability of getting at least a half match when comparing the same SNP on two individuals. The chance of matching 100 SNPs in a row at random is 0.8 to the hundredth power, or essentially non-existent. For practical purposes, IBS can be ignored.

                            Jack Wyatt
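
The arithmetic in the post above is easy to reproduce. A minimal sketch, taking the 0.8 per-SNP figure from the post at face value and assuming, as the post does, that each SNP matches independently of its neighbours; nearby SNPs may not satisfy that assumption in practice.

```python
# Per-SNP probability of at least a half match, as stated in the post
# (taken at face value, not verified here).
p_single = 0.8

# Number of consecutive SNPs that must all match.
run_length = 100

# Under the independence assumption, the run probability is a simple power.
p_run = p_single ** run_length

print(f"P(run of {run_length} matches at one location) = {p_run:.3e}")
```

This is the chance for one specific starting position and one specific pair of people; it says nothing on its own about how often such a run turns up somewhere across many positions or many pairs.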


                            • #15
                              IBS and IBD are complementary concepts within the wider concept of AtDNA matches. Knowing the probability of one has meaningful implications for the probability of the other.

                              It is impossible to directly calculate the probability of IBD. There is no practical way to assign meaningful values to the relevant variables (e.g., the specific MRCAs, the # of generations separation from each donor, the # of eligible donors at this level of relationship still living, the subjective likelihood of these donors to test, then to test with the same company, etc., etc.). More than this, it would be impossible to make a meaningful population-wide generalization about IBD probabilities based on the idiosyncratic set of data pertaining to individual lineages.

                              On the other hand, it is simple to calculate the probability of IBS. For each location, there are only two possible values. That's just the way G-A-T-C works. These assumptions are always equally valid for all genetically normal human beings.

                              The probability of a sustained IBS match along a contiguous region decreases incrementally with each additional SNP and cM in a very defined way.

                              All anyone can do is either:

                              1. Make general inferences about the aggregate IBD function through examining the mechanics of the complementary IBS function.

                              OR

                              2. Waste their time trying to derive a direct IBD function from the IBS variables.


                              So I'm not even a little curious as to how you derived that number. Given that you seem to think you have been working on a direct IBD function, there is zero likelihood that you have achieved a meaningful result.

                              We've been over this before. The odds that 2 individuals from a randomly selected pool of 37 will report a share of at least 100 contiguous IBS SNPs for any given location is practically 100%. For a population of over 7 billion people, that is meaningless.

                              http://forums.familytreedna.com/show...2&postcount=13

                              Actually, that calculation radically understated the situation. That calculation was with respect to only one specific location. Given that the typical commercially available test looks at somewhere around 750,000 SNPs, the probability that any 2 randomly selected individuals will share at least one reported IBS segment of at least 100 contiguous SNPs is 7,500%.

                              Absolutely meaningless.
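
A probability cannot exceed 100%, so a figure such as 7,500% reads more naturally as an expected number of matching segments than as a probability. A minimal sketch of the distinction, assuming independent and equally likely locations; `p_per_location` and `n_locations` are hypothetical values chosen only so that their product is 75, and they are not taken from the linked calculation.

```python
def expected_count(p, n):
    """Expected number of matching locations among n independent ones."""
    return n * p


def prob_at_least_one(p, n):
    """Probability that at least one of n independent locations matches."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical inputs, for illustration only.
p_per_location = 0.01
n_locations = 7500

print("expected count:", expected_count(p_per_location, n_locations))
print("P(at least one):", prob_at_least_one(p_per_location, n_locations))
```

The expected count can grow without limit as locations are added, while the probability of at least one match only approaches 1.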
