Interpreting near miss segments

  • Interpreting near miss segments

    Ann Turner and others not only counsel us to ignore segments <5cM but also to regard with great caution segments that are just >5cM, and even those that are >7.5cM but not by much. In other words, even some of the segments that qualify for a match here on FF and also on RF should be looked at skeptically.

    I understand and appreciate this - but can't stop being curious! Specifically I'm curious about this: Is there any difference between a segment that doesn't qualify based on cMs versus a segment that doesn't qualify based on SNPs?

    My wife shares a 27cM segment with someone who is not a match in FamilyFinder. I saw this on Gedmatch. Curious, looking further, I see that the segment spans only 360 SNPs! 27cM seems large; in fact my wife only has one other match (among 55 total matches) with a segment larger than that. It just seems odd to toss it out, but I understand: if it doesn't meet the SNP threshold, it doesn't meet the threshold!

    I've seen other segments via Gedmatch that are as many as 1500 SNPs (or even more - way above the usual thresholds of 500 or 700) but may only be 4.5 cM (or even less).

    Are those two segments equally not worth looking at? (Equally meaningless insofar as reckoning a relationship)?

    As always, thanks for your time & thoughts --

    Dwight
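
To make the two kinds of "near miss" concrete, here is a minimal sketch that checks a segment against a cM cutoff and a SNP cutoff separately. The defaults (7.5 cM and 500 SNPs) are simply numbers mentioned in the post above, not the official rules of any testing company, and the `Segment` class and `classify` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str
    cm: float   # genetic length in centimorgans
    snps: int   # number of SNPs the segment spans


def classify(seg, min_cm=7.5, min_snps=500):
    """Report which cutoff(s) a segment misses (cutoffs are assumptions)."""
    cm_ok = seg.cm >= min_cm
    snps_ok = seg.snps >= min_snps
    if cm_ok and snps_ok:
        return "passes both"
    if cm_ok:
        return "fails on SNP count"
    if snps_ok:
        return "fails on cM"
    return "fails both"


# The two example segments described in the post.
segments = [
    Segment("long but SNP-poor", 27.0, 360),
    Segment("SNP-rich but short", 4.5, 1500),
]
for seg in segments:
    print(f"{seg.label}: {classify(seg)}")
```

The sketch only labels which threshold a segment misses; it says nothing about whether the match is genealogically real.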

  • #2
    Can you find a paper trail connecting to this person? If you can, then don't ignore it. I have a cousin with whom I share four 5th great-grandparents. We match on FF, but not well enough to meet Illumina standards, so she is not listed as a cousin match. I do, however, match people I can't connect to on paper.

    • #3
      Originally posted by dwight
      Ann Turner and others not only counsel us to ignore segments <5cM but also to regard with great caution segments that are just >5cM, and even those that are >7.5cM but not by much. In other words, even some of the segments that qualify for a match here on FF and also on RF should be looked at skeptically.

      I understand and appreciate this - but can't stop being curious! Specifically I'm curious about this: Is there any difference between a segment that doesn't qualify based on cMs versus a segment that doesn't qualify based on SNPs?

      My wife shares a 27cM segment with someone who is not a match in FamilyFinder. I saw this on Gedmatch. Curious, looking further, I see that the segment spans only 360 SNPs! 27cM seems large; in fact my wife only has one other match (among 55 total matches) with a segment larger than that. It just seems odd to toss it out, but i understand if it doesn't meet the SNP threshold, it doesn't meet the threshold!

      I've seen other segments via Gedmatch that are as many as 1500 SNPs (or even more - way above the usual thresholds of 500 or 700) but may only be 4.5 cM (or even less).

      Are those two segments equally not worth looking at? (Equally meaningless insofar as reckoning a relationship)?

      As always, thanks for your time & thoughts --

      Dwight

      I think that SNPs are more valuable and informative than cM distance. If it's an area of few SNPs it might have a large segment in cM but not really mean much if there's only several hundred SNPs in that area.

      cM does not correspond directly to the number of SNPs; it's more about the actual number of bases/nucleotides. SNPs, on the other hand (single nucleotide polymorphisms), are specific bases/nucleotides that have been found to be variable within a species -- remember humans share 99.9% of their DNA, so it's that 0.1% that makes all the difference.

      Some regions of different chromosomes are found to have more SNPs than others. That's why a segment of 27cM in some places might have thousands of SNPs whereas in another place might only have a few hundred.

      The reverse is also true. You can have regions that are very SNP-heavy, and you might have a segment <5cM that has several thousand SNPs. I have a segment of 3.94cM but 4200 SNPs!!

      • #4
        Originally posted by Linds
        I think that SNPs are more valuable and informative than cM distance. If it's an area of few SNPs it might have a large segment in cM but not really mean much if there's only several hundred SNPs in that area.

        cM does not correspond directly to the number of SNPs, its more about the actual number of bases/nucleotides. SNPs in the other hand (single nucleotide polymorphisms) are specific bases/nucleotides that have been found to be variable among a specific species -- remember humans share 99.9% of their DNA, so it's that 0.1% that makes all the difference.

        Some regions of different chromosomes are found to have more SNPs than others. That's why a segment of 27cM in some places might have thousands of SNPs whereas in another place might only have a few hundred.

        The reverse is also true. You can have regions that are very SNP-heavy, and you might have a segment >5cM that might have several thousand SNPs. I have a segment of 3.94cM but 4200 SNPs!!
        So interesting! So looking at things the way Dwight originally described, how many SNPs are enough to view without skepticism?
        v

        • #5
          I have matches far beyond the 5 generation level and the LEAST I have showing is 32.01cM. The most I have is 64.56. On the other hand, the shortest BLOCK that I have is 12.51 and the longest is 29.48.

          I am trying to understand what your question is as my magnitudes are so different than the ones you reference.

          Originally posted by dwight
          Ann Turner and others not only counsel us to ignore segments <5cM but also to regard with great caution segments that are just >5cM, and even those that are >7.5cM but not by much. In other words, even some of the segments that qualify for a match here on FF and also on RF should be looked at skeptically.

          I understand and appreciate this - but can't stop being curious! Specifically I'm curious about this: Is there any difference between a segment that doesn't qualify based on cMs versus a segment that doesn't qualify based on SNPs?

          My wife shares a 27cM segment with someone who is not a match in FamilyFinder. I saw this on Gedmatch. Curious, looking further, I see that the segment spans only 360 SNPs! 27cM seems large; in fact my wife only has one other match (among 55 total matches) with a segment larger than that. It just seems odd to toss it out, but i understand if it doesn't meet the SNP threshold, it doesn't meet the threshold!

          I've seen other segments via Gedmatch that are as many as 1500 SNPs (or even more - way above the usual thresholds of 500 or 700) but may only be 4.5 cM (or even less).

          Are those two segments equally not worth looking at? (Equally meaningless insofar as reckoning a relationship)?

          As always, thanks for your time & thoughts --

          Dwight

          • #6
            Originally posted by vivianruth
            So interesting! So looking at things the way Dwight originally described, how many SNPs are enough to view without skepticism?
            v
            I really have no idea. I think it's a complex thing, looking at not just a single segment but all segments, and if you have other shared matches and if those shared matches also share similar segments, etc.

            In some ways all of these different criteria are arbitrary. Many members have shown how off-base FTDNA's suggested relationships can be. Everything is based on theory and on averages. Criteria and thresholds cannot take into account individuals and unique circumstances. We all inherit different amounts of DNA from each of our ancestors. Sometimes (perhaps most times) we inherit a lopsided amount of DNA from one line or another. Not all of our DNA follows the typical recombination rates, so we might have a high percentage of our DNA identical to our g-grandparents and beyond, which can throw off relationships and more.

            Bottom line, at least IMHO, is that a general threshold might not be the answer, but rather a big-picture, complex look at all the different contributing factors and individuals.

            • #7
              Originally posted by Linds
              I really have no idea. I think it's a complex thing, looking at not just a single segment but all segments, and if you have other shared matches and if those shared matches also share similar segments, etc.

              In some ways all of these different criteria are arbitrary. Many members have shown how off-base FTDNA's suggested relationships can be. Everything is based on theory and on averages. Criteria and thresholds cannot take into account individuals and unique circumstances. We all inherit different amounts of DNA from each of our ancestors. Sometimes (perhaps most times) we inherit a lopsided amount of DNA from one line or another. Not all of our DNA follows the typical recombination rates, so we might have a high percentage of our DNA identical to our g-grandparents and beyond, which can throw off relationships and more.

              Bottom line, at least IMHO is that a general threshold might not be the answer but rather a big-picture complex look at all the different contributing factors and individuals.
              Sigh! Linda, what you say makes sense. I currently manage three FF accounts, including my own, and each of us has about 600 matches (because we are Ashkenazim). Despite having not-bad paper trails for the past few generations (not great, either), we haven't been able to confirm a single one, even predicted 2nd cousins. I've taken to focusing on matches with at least two >10 cM blocks and looking at the shared pattern of blocks (or not) among the three accounts (because we are all closely related on one side and more distantly on another).

              I pine for the ability to download large amounts of chromosomal information (rather than having to go five at a time). I also pine for the ability to have more sophisticated filter criteria, not only on the matches, but in the chromosome browser. Moreover, I pine to be able to be logged into more than one account at the same time on the same computer. That would make these more holistic sorts of analyses much easier.

              Thanks, Linda.
              Last edited by vivianruth; 15 July 2011, 09:37 AM. Reason: two typos corrected

              • #8
                Originally posted by vivianruth
                Sigh! Linda, what you say makes sense. I currently manage three FF accounts, including my own, and each of us has about 600 matches (because we are Ashkenazim). Despite having not-bad paper trails for the past few generations (not great, either), we haven't been able to confirm a single one, even predicted 2nd cousins. I've taken to focusing on matches with at least two >10 cM blocks and looking at the shared pattern of blocks (or not) among the three accounts (because we are all closely related on one side and more distantly on another).

                I pine for the ability to download large amounts of chromosomal information (rather than having to go five at a time). I also pine for the ability to have more sophisticated filter criteria, not only on the matches, but in the chromosome browser. Moreover, I pine to be able to be logged into more than one account at the same time on the same computer. That would make these more holistic sorts of analyses much easier.

                Thanks, Linda.
                Have you uploaded the DNA results to GedMatch.com? You could do the comparisons there.

                • #9
                  Originally posted by vivianruth
                  I pine for the ability to download large amounts of chromosomal information (rather than having to go five at a time). I also pine for the ability to have more sophisticated filter criteria, not only on the matches, but in the chromosome browser. Moreover, I pine to be able to be logged into more than one account at the same time on the same computer. That would make these more holistic sorts of analyses much easier.

                  Thanks, Linda.
                  I agree those improvements would be welcome additions. But I believe you can be logged on to more than one account at a time if you log into each account through a different browser (IE, Chrome, Firefox).

                  • #10
                    Originally posted by vivianruth
                    Sigh! Linda, what you say makes sense. I currently manage three FF accounts, including my own, and each of us has about 600 matches (because we are Ashkenazim). Despite having not-bad paper trails for the past few generations (not great, either), we haven't been able to confirm a single one, even predicted 2nd cousins. I've taken to focusing on matches with at least two >10 cM blocks and looking at the shared pattern of blocks (or not) among the three accounts (because we are all closely related on one side and more distantly on another).

                    I pine for the ability to download large amounts of chromosomal information (rather than having to go five at a time). I also pine for the ability to have more sophisticated filter criteria, not only on the matches, but in the chromosome browser. Moreover, I pine to be able to be logged into more than one account at the same time on the same computer. That would make these more holistic sorts of analyses much easier.

                    Thanks, Linda.
                    It's Lindsay, not Linda, but no worries, I answer to just about anything (my mother constantly calls me by the dogs' names...)

                    I think someone already suggested it, but try GedMatch. You'd have to upload each of your data files, but it allows you to compare an unlimited number of matches. My advice on GedMatch is to set the thresholds to AT LEAST 10cM when you search the entire database, especially for Ashkenazi. Otherwise you're going to get an astronomical number of matches, and many are going to be well beyond what FTDNA sets as a limit. The plus is it will also allow you to find matches who tested with 23andMe.

                    Just understand that GedMatch's thresholds are not as strict, and I've found several FF 5th-distant cousins listed as much closer, which concerns me.

                    The great thing is it allows you to focus on specific chromosomes and see who shares what segments on certain chromosomes. It's a really awesome tool. Once you get to your list of matches (by searching the whole database with your kit# - FF adds an F in front), you click the boxes for "matrix" of the individuals you want to compare. Then you can compare either autosomal DNA (tells you the amount of shared DNA between all the individuals in the matrix) or the chromosome browser comparison (tells you what segments on each chromosome those people share with "you" (whoever's kit # you're searching) and all the other selected matches).

                    • #11
                      Originally posted by dwight
                      Ann Turner and others not only counsel us to ignore segments <5cM but also to regard with great caution segments that are just >5cM, and even those that are >7.5cM but not by much. In other words, even some of the segments that qualify for a match here on FF and also on RF should be looked at skeptically.

                      I understand and appreciate this - but can't stop being curious! Specifically I'm curious about this: Is there any difference between a segment that doesn't qualify based on cMs versus a segment that doesn't qualify based on SNPs?

                      My wife shares a 27cM segment with someone who is not a match in FamilyFinder. I saw this on Gedmatch. Curious, looking further, I see that the segment spans only 360 SNPs! 27cM seems large; in fact my wife only has one other match (among 55 total matches) with a segment larger than that. It just seems odd to toss it out, but i understand if it doesn't meet the SNP threshold, it doesn't meet the threshold!

                      I've seen other segments via Gedmatch that are as many as 1500 SNPs (or even more - way above the usual thresholds of 500 or 700) but may only be 4.5 cM (or even less).

                      Are those two segments equally not worth looking at? (Equally meaningless insofar as reckoning a relationship)?

                      As always, thanks for your time & thoughts --

                      Dwight
                      I'd say they are equally meaningless (sorry!). Some chromosomal regions are just quirky.

                      For instance, Gedmatch may be detecting a region that spans the centromere of chromosome 9, which has very few SNPs (not enough to rule out coincidental matches).

                      Another region (HLA on chromosome 6) is the opposite -- it has a very high SNP density, but a low recombination rate. A match in that region is thus not as impressive as a region that breaks up more quickly with each generation.

                      • #12
                        The Value of HLA

                        Given that FF is limited to 5 generations, I consider a region that is low on recombination to be useful when it is combined with other factors.

                        I did not know HLA had a low recombination rate, but it makes sense for a factor used in transplants. I also see that HLA is being used in population migration hypotheses that are part of peer-reviewed papers. I predict haplogroups of HLA may be used as an intermediate between the results of mtDNA/yDNA haplogroups and more personal haplotypes.

                        I do not understand what you mean by "high SNP density" unless you mean the SNPs in that area are highly documented. Every base pair is like a weed until you have documentation on how it is used. That is if, "Every plant is a weed until you find a use for it."

                        I look forward to your response.

                        Originally posted by Ann Turner
                        Another region (HLA on chromosome 6) is the opposite -- it has a very high SNP density, but a low recombination rate. A match in that region is thus not as impressive as a region that breaks up more quickly with each generation.

                        • #13
                          Originally posted by JohnLloydScharf
                          I do not understand what you mean by "high SNP density" unless you mean the SNPs in that area are highly documented.
                          The P in SNP (Single Nucleotide Polymorphism) refers to known variants. You could do a sliding window or running average calculation on your raw data download for number of SNPs per megabase.
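
The sliding-window idea suggested above could be sketched roughly as follows. This is only an illustration: it assumes the SNP positions for one chromosome have already been pulled out of the raw data download into a plain list of base-pair coordinates (the file layout differs between vendors, so parsing is left out), and the function name `snp_density` and its parameters are hypothetical.

```python
from bisect import bisect_left


def snp_density(positions, window_bp=1_000_000, step_bp=500_000):
    """Count SNPs in sliding windows along one chromosome.

    Returns a list of (window_start, snp_count) pairs. Each window
    covers [window_start, window_start + window_bp), so with the
    default 1 Mb window the count is SNPs per megabase.
    """
    positions = sorted(positions)
    if not positions:
        return []
    results = []
    start = positions[0]
    last = positions[-1]
    while start <= last:
        lo = bisect_left(positions, start)              # first SNP inside the window
        hi = bisect_left(positions, start + window_bp)  # first SNP past the window
        results.append((start, hi - lo))
        start += step_bp
    return results


# Toy positions only; real input would be the tested SNP coordinates.
toy_positions = [100, 200, 300, 1_500_000]
for window_start, count in snp_density(toy_positions, step_bp=1_000_000):
    print(f"window starting at {window_start}: {count} SNPs")
```

Windows with unusually low counts would flag SNP-poor stretches, and unusually high counts would flag SNP-dense ones.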

                          • #14
                            Originally posted by Ann Turner
                            I'd say they are equally meaningless (sorry!). Some chromosomal regions are just quirky.

                            For instance, Gedmatch may be detecting a region that spans the centromere of chromosome 9, which has very few SNPs (not enough to rule out coincidental matches).

                            Another region (HLA on chromosome 6) is the opposite -- it has a very high SNP density, but a low recombination rate. A match in that region is thus not as impressive as a region that breaks up more quickly with each generation.
                            For the record, Ann, the segment in question is on Chromosome 19,
                            start 282,028, end 9,112,220, 27.9 cM, 360 SNPs
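
As a rough worked example, the figures quoted above can be turned into densities. The helper `segment_stats` is a hypothetical name, and the calculation is plain arithmetic on the start, end, cM and SNP values given in the post.

```python
def segment_stats(start_bp, end_bp, cm, snps):
    """Basic density figures for one shared segment."""
    length_mb = (end_bp - start_bp) / 1_000_000
    return {
        "length_mb": length_mb,
        "snps_per_mb": snps / length_mb,
        "cm_per_mb": cm / length_mb,
        "snps_per_cm": snps / cm,
    }


# Chromosome 19 segment from the post: start 282,028, end 9,112,220,
# 27.9 cM, 360 SNPs.
stats = segment_stats(282_028, 9_112_220, 27.9, 360)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

This works out to a segment of roughly 8.83 Mb with about 41 SNPs per Mb and about 3.2 cM per Mb. Since a commonly quoted genome-wide average is on the order of 1 cM per Mb, the segment is long in cM mostly because the region recombines often, while the number of SNPs actually compared is small.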

                            • #15
                              I guess I do not understand that. Any single nucleotide can have a polymorphism if there is a change at any base pair, regardless of whether there is documentation of that SNP.

                              Originally posted by Ann Turner
                              The P in SNP (Single Nucleotide Polymorphism) refers to known variants. You could do a sliding window or running average calculation on your raw data download for number of SNPs per megabase.
