
Are small cM segments valid?


  • Are small cM segments valid?

    I am very interested in knowing whether the small cM segments (some of which go all the way down to 1 cM) are valid. My interest relates to using these small segments to find out whether relatives might share genetic traits with me.

  • #2
    Originally posted by DNAfinder View Post
    I am very interested in knowing whether the small cM segments (some of which go all the way down to 1 cM) are valid. My interest relates to using these small segments to find out whether relatives might share genetic traits with me.
    In genetic genealogy (as in life generally), we're always dealing with probabilities. The smaller the shared segment, the lower the probability that it is Identical by Descent (IBD) ... or "valid". A 1 cM segment may be IBD/valid, but that is highly unlikely. On the other hand, a shared segment of at least 7 cM is highly likely to be IBD ... and the larger the segment the higher the likelihood. All of that said, if you find small shared segments which are completely "nested" within the range of large shared segments, the probability of the small segment being IBD increases.
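
    The "nested" check described above can be sketched in a few lines of Python. This is only an illustration: the Segment fields, the base-pair coordinates, and the example values are invented for the sketch and do not come from any testing company's file format.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: str
    start: int   # base-pair position where the shared segment begins
    end: int     # base-pair position where the shared segment ends
    cm: float    # genetic length in centimorgans


def is_nested(small: Segment, large: Segment) -> bool:
    """Return True if `small` lies entirely inside `large` on the same chromosome."""
    return (
        small.chromosome == large.chromosome
        and large.start <= small.start
        and small.end <= large.end
    )


# Invented example values, not real match data.
large = Segment("7", 10_000_000, 30_000_000, 18.2)
inside = Segment("7", 12_000_000, 13_000_000, 1.1)
straddling = Segment("7", 29_500_000, 31_000_000, 1.4)

print(is_nested(inside, large))      # True
print(is_nested(straddling, large))  # False
```

    A small segment that only partly overlaps the large one (like the second example) does not count as nested under this check.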

    Comment


    • #3
      Originally posted by hfp43 View Post
      In genetic genealogy (as in life generally), we're always dealing with probabilities. The smaller the shared segment, the lower the probability that it is Identical by Descent (IBD) ... or "valid". A 1 cM segment may be IBD/valid, but that is highly unlikely. On the other hand, a shared segment of at least 7 cM is highly likely to be IBD ... and the larger the segment the higher the likelihood. All of that said, if you find small shared segments which are completely "nested" within the range of large shared segments, the probability of the small segment being IBD increases.
      Here is the IBD definition:

      http://isogg.org/wiki/Identical_by_descent

      Basically the segment comes from an ancestor intact without being recombined with other parents in the line.

      Whether a 1 cM matching segment is likely to be false depends on the number of SNPs that it contains. With only 100 SNPs, there is a fair chance of it being bad. With 300 SNPs, it is very unlikely to be bad. However, at that level it really does not matter. A match between two kits proves nothing as to where the relationship is. Triangulate it with another kit that is not closely related, and the match is almost certainly good and you can start to figure out where the lines intersect in the space-time continuum.

      Family Finder uses over 300 SNPs, so you don't have to worry about 1 cM segments with it.
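
      As a rough sketch, the two rules of thumb mentioned so far in this thread (7 cM for length, 300 SNPs for density) can be applied as a simple sorting step. The thresholds are just the posters' numbers passed in as defaults, not established cutoffs, and the Segment fields and example values are made up for illustration. Sorting a segment into a bucket says nothing about whether it is actually IBD.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    label: str   # hypothetical identifier for the match
    cm: float    # genetic length in centimorgans
    snps: int    # number of tested SNPs inside the segment


def bucket(seg: Segment, long_cm: float = 7.0, dense_snps: int = 300) -> str:
    """Sort a segment into a rough bucket using the thread's rules of thumb."""
    if seg.cm >= long_cm:
        return "long"
    if seg.snps >= dense_snps:
        return "short, SNP-dense"
    return "short, SNP-sparse"


# Invented example segments, not real match data.
segments = [
    Segment("match A", 12.4, 2900),
    Segment("match B", 1.0, 3000),
    Segment("match C", 1.2, 100),
]

for seg in segments:
    print(seg.label, "->", bucket(seg))
```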

      Warning: my thinking is out of the mainstream, so expect to see a lot of dissenting views.

      Jack Wyatt

      Comment


      • #4
        Thank you very much for your responses!

        I would really love for FTDNA or one of the other DNA providers to step up and provide full exome (or genome) sequencing! We could then move beyond cMs entirely.

        I noticed that a fair number of my matches who only had a 1 cM shared segment had 1000 or even 3000 shared SNPs! Which would be more important: more cM or more SNPs?

        Does FTDNA let you see what is happening under the hood with these calculations (that is, would there be a way of knowing what the calculated probability of these low cM matches might be)?

        Have these low cM matches been seen to occur in computer simulations of genetic recombination?

        Comment


        • #5
          Originally posted by DNAfinder View Post
          Thank you very much for your responses!

          I would really love for FTDNA or one of the other DNA providers to step up and provide full exome (or genome) sequencing! We could then move beyond cMs entirely.

          I noticed that a fair number of my matches who only had a 1 cM shared segment had 1000 or even 3000 shared SNPs! Which would be more important: more cM or more SNPs?

          Does FTDNA let you see what is happening under the hood with these calculations (that is, would there be a way of knowing what the calculated probability of these low cM matches might be)?

          Have these low cM matches been seen to occur in computer simulations of genetic recombination?
          You might want to look at this about CentiMorgans:

          http://isogg.org/wiki/CentiMorgan

          The number of SNPs in a centimorgan varies to keep the probability of a crossover event constant for each centimorgan.

          I am leaning toward SNPs being more important, but I have not given that topic much deep thought. Most software for genetic genealogy won't let you go below 1.0 cM anyhow, so in your example the matching segment has a lot of SNPs behind it, certainly more than would be needed to say that a false segment is unlikely.

          Jack

          Comment


          • #6
            Thank you again.

            I am wondering whether these 1 cM matches with 1000-3000 matching SNPs might simply be quiet areas of the genome.

            In some sections of our SNP files there are long runs of homozygosity. There can be lots and lots of SNPs, though they do not provide much information.

            It seems quite possible that these 1 cM segments might also be quiet regions with little genetic variation. These SNPs could pass from generation to generation with little change. If so, then even 3000 SNPs might not have much information content.

            Phasing might come in handy with these segments. Does FTDNA perform phasing if parent offspring DNA is provided?

            Comment


            • #7
              23AndMe does parent/child phasing, but I don't think FamilyTreeDNA does.

              Comment


              • #8
                Thank you everyone!

                This topic is of great significance to me and I am glad to have found a forum where I can talk things out.

                I have been contacted by relatives who share a trait of interest with me, and I had thought that testing with 23andMe would have been best because 23andMe specifically reports on traits and genetic illnesses.

                However, from what I now understand from this thread, it would likely be best to test with FTDNA. 23andMe only reports large segments, usually greater than 5 cM, while FTDNA reports right down to 1.0 cM. For most of my matches these small segments are the majority of the total cM.

                With 23andMe I would probably have wound up looking in the wrong part of the genome for the trait of interest, in the largest segment that they would report. FTDNA reports even small segments, so I will not be as easily misled. It is, of course, possible, though not likely, that the trait of interest might even reside in a region that was below the 1 cM cutoff.

                Comment


                • #9
                  Originally posted by DNAfinder View Post
                  Thank you everyone!

                  However, from what I now understand from this thread, it would likely be best to test with FTDNA. 23andMe only reports large segments, usually greater than 5 cM, while FTDNA reports right down to 1.0 cM. For most of my matches these small segments are the majority of the total cM.
                  You're welcome.

                  You could load your kits on GEDmatch and look at segments at the 1.0 cM level there. If you have an older 23andMe kit with the V3 chipset, it is pretty much what you get from FF or, until recently, Ancestry DNA. The 23andMe V4 chipset kits have a lot fewer SNPs, and for looking at smaller segments they are not very compatible with the other kits.

                  Jack

                  Comment


                  • #10
                    Originally posted by DNAfinder View Post
                    Thank you everyone!

                    This topic is of great significance to me and I am glad to have found a forum where I can talk things out.

                    I have been contacted by relatives who share a trait of interest with me, and I had thought that testing with 23andMe would have been best because 23andMe specifically reports on traits and genetic illnesses.

                    However, from what I now understand from this thread, it would likely be best to test with FTDNA. 23andMe only reports large segments, usually greater than 5 cM, while FTDNA reports right down to 1.0 cM. For most of my matches these small segments are the majority of the total cM.

                    With 23andMe I would probably have wound up looking in the wrong part of the genome for the trait of interest, in the largest segment that they would report. FTDNA reports even small segments, so I will not be as easily misled. It is, of course, possible, though not likely, that the trait of interest might even reside in a region that was below the 1 cM cutoff.
                    Most of the small segments at FTDNA are not true segments. I have an experimental file I uploaded with my son's phased results. It shows 158 matches in common with me. Those 158 matches have 1694 short segments in the unphased kit, but only 192 (11.3%) persisted in the phased kit. There was even a kit with 19 small segments, none of which persisted. It is likely that the drop-off would be even greater if the other party to the match also had phased results.
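
                    For anyone who wants to check the arithmetic behind those counts, here is a short sketch. The three input numbers are the ones quoted above; the variable names are only for illustration.

```python
# Counts quoted in the post above.
matches_in_common = 158
short_segments_unphased = 1694
short_segments_phased = 192

# Share of short segments that survived phasing.
persistence_rate = short_segments_phased / short_segments_unphased
# Short segments that disappeared once the phased kit was used.
dropped = short_segments_unphased - short_segments_phased
# Average short segments per match before and after phasing.
per_match_unphased = short_segments_unphased / matches_in_common
per_match_phased = short_segments_phased / matches_in_common

print(f"persisted: {persistence_rate:.1%}")                     # persisted: 11.3%
print(f"dropped: {dropped}")                                    # dropped: 1502
print(f"per match, unphased: {per_match_unphased:.1f}")         # per match, unphased: 10.7
print(f"per match, phased: {per_match_phased:.1f}")             # per match, phased: 1.2
```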

                    If these relatives are reasonably close, you can expect to find the trait is embedded within the long segments shared with them. I went through a similar process for my family

                    *please no links to outside company websites
                    Last edited by Ann Turner; 4 July 2016, 10:14 PM. Reason: add link about trait research

                    Comment


                    • #11
                      Originally posted by georgian1950 View Post
                      The number of SNPs in a centimorgan varies to keep the probability of a crossover event constant for each centimorgan.

                      I am leaning toward SNPs being more important, but I have not given that topic much deep thought. Most software for genetic genealogy won't let you go below 1.0 cM anyhow, so in your example the matching segment has a lot of SNPs behind it, certainly more than would be needed to say that a false segment is unlikely.

                      Jack
                      The cM measure is indeed related to the probability of a cross-over, but that has nothing to do with how many SNPs are tested. The only reason we care about SNPs is to ensure that we've sampled enough data points to uncover any contradictions that would break up the segment.
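
                      To illustrate the link between cM and cross-over probability, here is a minimal sketch under Haldane's no-interference model. That model is an assumption brought in for this example, not something specified in the thread: it treats the number of cross-overs in a stretch of d cM as Poisson with mean d/100 per meiosis. The function name and parameters are hypothetical. Note that the SNP count does not appear anywhere in the calculation.

```python
import math


def prob_no_crossover(cm: float, meioses: int = 1) -> float:
    """Probability that a stretch of `cm` centimorgans passes through
    `meioses` generations with no cross-over inside it (Haldane model)."""
    # Expected cross-overs per meiosis is cm / 100; the Poisson probability
    # of zero events over all meioses is exp(-total expected count).
    return math.exp(-cm * meioses / 100.0)


print(f"{prob_no_crossover(1.0):.3f}")      # 0.990 (1 cM, one meiosis)
print(f"{prob_no_crossover(7.0, 10):.3f}")  # 0.497 (7 cM, ten meioses)
```

                      This only describes how likely a given stretch is to be passed down unbroken. It does not say whether a reported match on that stretch is real, which is where the SNP sampling and phasing come in.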

                      Comment


                      • #12
                        Originally posted by Ann Turner View Post
                        Most of the small segments at FTDNA are not true segments. I have an experimental file I uploaded with my son's phased results. It shows 158 matches in common with me. Those 158 matches have 1694 short segments in the unphased kit, but only 192 (11.3%) persisted in the phased kit. There was even a kit with 19 small segments, none of which persisted. It is likely that the drop-off would be even greater if the other party to the match also had phased results.

                        If these relatives are reasonably close, you can expect to find the trait is embedded within the long segments shared with them. I went through a similar process for my family

                        *please no links to outside company websites*

                        Ann, any chance that you and the son's father are related say within the last 10 generations?

                        Jack

                        Comment


                        • #13
                          Originally posted by Ann Turner View Post
                          Most of the small segments at FTDNA are not true segments. I have an experimental file I uploaded with my son's phased results. It shows 158 matches in common with me. Those 158 matches have 1694 short segments in the unphased kit, but only 192 (11.3%) persisted in the phased kit. There was even a kit with 19 small segments, none of which persisted. It is likely that the drop-off would be even greater if the other party to the match also had phased results.

                          If these relatives are reasonably close, you can expect to find the trait is embedded within the long segments shared with them. I went through a similar process for my family

                          *please no links to outside company websites
                          My link to a blog post at 23andMe was removed. If you Google the phrase "tracking down a trait" it should pop up.

                          Comment


                          • #14
                            Originally posted by georgian1950 View Post
                            Ann, any chance that you and the son's father are related say within the last 10 generations?
                            Jack
                            I gather you want to ascribe the missing small segments in the phased file to the paternal side? Vaguely possible, but unlikely given the geographic areas where our ancestors lived. Even moderately large segments can fall apart if you use phased data, as I illustrate in this blog post:

                            http://segmentology.org/2015/10/02/a...n-ibs-segment/

                            Comment
