Grouping kits in Y-DNA results chart

  • TwiddlingThumbs
    FTDNA Customer
    • Jan 2016
    • 155


    As a surname project administrator, I think grouping the kits in the Y-DNA results chart in a useful manner may be one of the more important tasks. Grouping by haplogroup does not seem sufficient because in most cases the haplogroup will be so broad that it includes groups of people who are not descended from a common male ancestor within the genealogical time frame.

    Ideally, I think the kits should be put into groups of testers whose STR results show that they are all probably descended from a common male ancestor within the genealogical time frame, based on FTDNA genetic distance guidelines. See:






    I have written a software program that does just that. If you are interested in seeing what the results of such an analysis look like for your surname project, I would be happy to run your project data through my program. Please send me a message if you are interested.

    For an example of a surname project that used the program to group the kits, see the Ashley project chart at: https://www.familytreedna.com/public...frame=yresults
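For readers who want to picture this kind of grouping, here is a minimal Python sketch. It is not the poster's program, and it does not reproduce FTDNA's genetic distance rules: it assumes a plain stepwise count (the sum of absolute differences over markers both kits have tested) and links any two kits whose distance is at or below a cutoff. The kit IDs, marker values, and the `max_distance` cutoff are all made up for illustration.

```python
from itertools import combinations


def genetic_distance(a, b):
    """Simple stepwise distance: sum of absolute differences over shared markers.

    Assumption: no special handling of multi-copy markers, dropped markers,
    or DYS389i/DYS389ii, so this will not always agree with FTDNA's figure.
    """
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def group_kits(kits, max_distance):
    """Single-linkage grouping: link every pair within max_distance (union-find)."""
    parent = {k: k for k in kits}

    def find(k):
        # Follow parent links to the group representative, compressing the path.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for x, y in combinations(kits, 2):
        if genetic_distance(kits[x], kits[y]) <= max_distance:
            parent[find(x)] = find(y)

    groups = {}
    for k in kits:
        groups.setdefault(find(k), []).append(k)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical kits with only four markers each, purely for illustration.
kits = {
    "K1": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11},
    "K2": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10},
    "K3": {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10},
    "K4": {"DYS393": 14, "DYS390": 22, "DYS19": 15, "DYS391": 10},
}

print(group_kits(kits, max_distance=2))  # [['K1', 'K2', 'K3'], ['K4']]
```

One design caveat: single-linkage grouping can chain kits together, so two kits in the same group may be further apart than the cutoff if a third kit sits between them. A stricter variant would require every pair in a group to be within the cutoff.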
  • georgian1950
    FTDNA Customer
    • Jun 2012
    • 989

    #2
    Like it!

    While I am not a project administrator or even a Y-DNA expert, to me most project results are not very well organized. Looks like you have made a big effort to change that.

    Jack Wyatt

    Comment

    • Martin Potter
      Project Administrator
      • Sep 2007
      • 12

      #3
      Grouping by haplogroup is good and useful, provided that you get your members to test for their terminal SNP. As the discovery of new SNPs progresses, this successive testing can become a never ending process, but it is by such a goal that major discoveries are made.

      I group all my members by haplogroup, as far as that can be determined.

      ... Martin
      (Foad, Huntsman, and Mugford projects)

      Comment

      • TwiddlingThumbs
        FTDNA Customer
        • Jan 2016
        • 155

        #4
        "Grouping by haplogroup is good and useful, provided that you get your members to test for their terminal SNP"

        That's a big proviso. Maybe someday we will get there, but we are not there yet. Currently, if you group only by haplogroup, you end up with big groups that contain lots of families that you KNOW, based on STR results, are not related within the genealogical time frame. You need to group by STRs to separate them out. A good hybrid approach is to group by STR results but indicate in the group name which haplogroup the members of the group belong to. If the STR results indicate that the members of the group are related, it is safe to say that they share the same haplogroup even if they haven't all tested yet.
        Last edited by TwiddlingThumbs; 4 July 2017, 09:41 PM.

        Comment

        • Martin Potter
          Project Administrator
          • Sep 2007
          • 12

          #5
          Originally posted by TwiddlingThumbs
          If the STR results indicate that the members of the group are related, it is safe to say that they share the same haplogroup even if they haven't all tested yet.
          It has been demonstrated a number of times that men can match quite closely in STR values and yet belong to different haplogroups. This happens due to 'convergence', where random mutation brings the STR values of quite different populations close together.

          The true test of relatedness is sharing of SNPs along a branch of the haplotree right down to the terminal one. It is not enough to be R1b-M343+, as predicted by the lab. You have to test successively:
          R1b-M343+ → L11+ → P312+ → L21+ → DF13+ → L513+ → S6365+ → BY17+ etc., down to the most recently discovered SNPs in that sub-branch. If the TMRCA of the terminal SNP is less than a few hundred years (most such SNPs have not been discovered yet, but will be eventually), then you are reasonably assured of a common, shared paternal-line ancestor.

          STR matches can never be more than approximate.

          Comment

          • TwiddlingThumbs
            FTDNA Customer
            • Jan 2016
            • 155

            #6
            I agree that SNPs are more definitive, but in most surname projects, only a very small minority of members have taken a SNP test. The haplogroup assigned to the vast majority of kits in most surname studies is just a "predicted" haplogroup, which (i) is based solely on their STR results and (ii) is too general to separate all the kits into separate related families. (I note that your Foad, Huntsman and Mugford projects are very unusual in that virtually every member has had SNP testing. Most surname studies are a sea of red "predicted haplogroups".)

            Since every kit on the YDNA results page has STR results, while, in most cases, only a small minority have SNP results, basing groups on STR results is the only option for most surname projects.

            Also, while I agree that unrelated men can have matching or close STR results, particularly on the lower STR tests (e.g., 37 or lower), the chances of men with the same surname (as assumed by the FTDNA genetic distance guidelines) having close STR results and yet being unrelated are fairly small. Therefore, in the context of a surname project, using STRs to group kits is very reasonable and sound, provided you don't include in the groups any kits with different surnames (unless there is reason to believe that they are biologically descended from someone with that surname).

            Comment

            • Martin Potter
              Project Administrator
              • Sep 2007
              • 12

              #7
              I grant what you say, and I especially agree with
              Originally posted by TwiddlingThumbs
              Most surname studies are a sea of red "predicted haplogroups"
              It is, I think, a sad commentary.

              I have been very fortunate in convincing almost all of my project members to undertake SNP testing. The possible long-term benefits of SNP testing and its potential impact on genetic genealogy are not obvious to many people. Some of my project members were initially quite skeptical.

              But the benefits are real and I am very aware of, and thankful for, the work done by the various haplogroup projects and their researchers. There is a symbiotic relationship between surname projects and haplogroup projects that is not understood by everyone. The surname projects help people with their genealogies and provide "fodder" for the haplogroup projects, while the haplogroup projects do real science which eventually feeds back deep ancestry to the surname projects. As a surname project admin, I am glad to be able to contribute in a small way to both sides of this process.

              You mentioned the STR 37-marker and lower threshold. In my projects, we generally ignore all matches at 12 and 25 markers. Matches at 37 markers are generally taken as an indication of the need for testing more markers. At 39 USD a shot, the cost of SNP testing can be significant and I am glad that FTDNA has introduced the "Pack" tests to cover a bunch of SNPs all at the same time for a pretty reasonable price. But the Pack tests don't cover everything, so continuing commercial developments are necessary.

              Onward and upward ...

              Comment

              • TwiddlingThumbs
                FTDNA Customer
                • Jan 2016
                • 155

                #8
                My kit grouping software program has now been beta tested by a number of the larger surname projects starting with the letter A. The program has also been revised to incorporate (i) the changes in genetic distance calculations with respect to dropped markers and multi-copy markers that FTDNA made last year (see https://dna-explained.com/2016/07/27...etic-distance/ ) and (ii) the special way FTDNA calculates genetic distance with respect to DYS389i and DYS389ii (see http://forums.familytreedna.com/showthread.php?t=29826 and http://www.johnbrobb.com/Content/DNA/TMRCA&GD.pdf ). I believe the genetic distance calculations used by the program to suggest kit groupings are now 100% consistent with FTDNA's. Please send me a message if you are interested in seeing what kit groupings the program suggests for your surname project.

                Comment

                • Wheal
                  FTDNA Customer
                  • Apr 2017
                  • 41

                  #9
                  As a newbie to genetics, I think it would be more useful to me, and I am sure to others, to include an "LMS" (last major subclade) for grouping. Since my dad does not match anyone in his surname project (and yes, I understand that could point to an NPE), has "Your Confirmed Haplogroup is R-Y7378", and is, as far as I can tell, the only member with it, it would be most helpful to simply say something like LMS Z2 Terminal Y7378. It would give us non-educated searchers at least a starting point that made sense. For now I search for: U106>Z381>L48>Z9>Z30>Z349>Z2>S15510>Y7378 in groups to try to locate my assignment.

                  I know this now, but had to look up on his haplotree to see what was his last major subclade and backtrack from there.

                  Those of us that are Heinz 37's are lost.

                  Comment

                  • TwiddlingThumbs
                    FTDNA Customer
                    • Jan 2016
                    • 155

                    #10
                    Originally posted by Wheal
                    include a "LMS" last major subclade for grouping
                    Since the vast majority of the members of most surname projects have not done SNP testing, most surname projects can't create subgroups based on SNP testing. For groups based on STR results that contain members who have had SNP testing, the results of those SNP tests should be checked to make sure they are consistent with each other. Assuming they are, it makes sense to list the SNP hierarchy in the group heading. I'm not sure how listing "last major subclade" really helps, however, since if there is no match on the terminal SNP, they aren't related within the genealogical time frame even if they share a higher subclade.
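As a rough illustration of the consistency check described above, here is a minimal Python sketch. It assumes each kit's positive SNP results are written as a root-to-tip path in the same `A>B>C` notation used earlier in the thread, and it treats two paths as consistent when one is a prefix of the other (the shorter one simply having been tested less deeply). The first path is the one quoted in the thread; the other two kits and all function names are hypothetical.

```python
def parse_path(path):
    """Split a 'U106>Z381>...' string into an ordered list of SNP names."""
    return [snp.strip() for snp in path.split(">")]


def last_shared_snp(a, b):
    """Return the most downstream SNP the two paths have in common, or None."""
    shared = None
    for x, y in zip(a, b):
        if x != y:
            break
        shared = x
    return shared


def is_consistent(a, b):
    """True if neither path contradicts the other (one is a prefix of the other)."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


deep = parse_path("U106>Z381>L48>Z9>Z30>Z349>Z2>S15510>Y7378")
shallow = parse_path("U106>Z381>L48>Z9>Z30")   # hypothetical kit, tested less deeply
other = parse_path("U106>Z381>Z156")           # hypothetical kit on another branch

print(is_consistent(deep, shallow))   # True
print(is_consistent(deep, other))     # False
print(last_shared_snp(deep, other))   # Z381
```

A consistent pair is only "not contradicted"; as the post notes, a match within the genealogical time frame still depends on sharing the terminal SNP.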

                    BTW, I don't think that not matching anyone in a surname project suggests an NPE. In most surname projects, 20-33% of members are unmatched. The most probable reason is that no one else in the same line of that surname has submitted test results to the project yet. From looking at different surname projects, most seem to have 5-20 different groups that are not related within the genealogical time frame, and I know from looking at ones related to my own genealogy, that there are lots of other unrelated surname branches that are not yet represented in the projects -- probably at least as many as are in the projects.
                    Last edited by TwiddlingThumbs; 14 July 2017, 02:11 PM.

                    Comment

                    • Martin Potter
                      Project Administrator
                      • Sep 2007
                      • 12

                      #11
                      STR vs SNP testing

                      For those who might be interested :



                      a transcript of an article ("Advanced Y-DNA Testing for the Acree One-Name Study") about a surname project which has replaced Y-STR testing with Y-SNP testing, thus "reducing cost" and "eliminating ambiguity", according to the author, Charles Acree.

                      The article has been or is being published in the "Journal of One-Name Studies" (Jul-Sep 2017), which I have not yet seen.

                      ... Martin

                      Comment

                      • TwiddlingThumbs
                        FTDNA Customer
                        • Jan 2016
                        • 155

                        #12
                        Originally posted by Martin Potter
                        An interesting article. However, the kit grouping in the Acree project is identical to what one would get based purely on their current STR results (based on their public data, which I ran through my program), so while the SNP testing provided definitiveness for those who took the test and matched, I don't think it saved any money or made any difference for grouping the kits.

                        Big Y and similar tests are expensive. If you just take a terminal SNP test with someone who has done the testing, it's a bit of a shot in the dark. I tried that route and it was a miss. Then I took a SNP panel test and found my terminal SNP, but no one else with my surname who had done SNP testing had that SNP. Then I took a 67 STR test, which showed I was not related to the guy who had taken the Big Y test but was related to another guy who we established, through genealogical evidence, shared a common ancestor with me.

                        STR test results are still useful and currently still have a number of advantages over SNP tests. In order to determine if you match with someone, you need to take a test that provides results that you can compare with the other person's results. A lot of people have taken 12, 25, 37, 67 and 111 STR tests. A match on any of these tests gives some affirmative evidence of a possible relationship. While a match at 12 or 25 is weak evidence, it's still worth a look, and the results of 37, 67 and 111 tests are quite suggestive. With SNP testing, you can only base a match on a SNP that probably originated within the genealogical time frame, which means that both you and the other person need to have done Big Y or tested for the terminal SNP. There are a lot fewer people out there who have done that testing, so you are currently much less likely to match with other people. I guess the choice is between finding more people who you might be a match with or finding a lot fewer or no people who you are definitely a match with.
                        Last edited by TwiddlingThumbs; 21 July 2017, 11:40 AM.

                        Comment

                        • Martin Potter
                          Project Administrator
                          • Sep 2007
                          • 12

                          #13
                          Originally posted by TwiddlingThumbs
                          ... the vast majority of the members of most surname projects have not done SNP testing ...
                          Are you sure of that? I don't have any figures to prove you wrong but you have to consider the large number of people who belong to the many haplogroup projects. They *all* came from surname projects. And all of them have done at least some SNP testing, while many have done extensive SNP testing, not to mention the ones who do Big-Y and other genomic type tests in order to provide the leading edge for haplogroup researchers to work on. I think the field is larger than you imagine.
                          ... Martin

                          Comment

                          • TwiddlingThumbs
                            FTDNA Customer
                            • Jan 2016
                            • 155

                            #14
                            Originally posted by Martin Potter
                            Are you sure of that? I don't have any figures to prove you wrong but you have to consider the large number of people who belong to the many haplogroup projects. They *all* came from surname projects. And all of them have done at least some SNP testing, while many have done extensive SNP testing, not to mention the ones who do Big-Y and other genomic type tests in order to provide the leading edge for haplogroup researchers to work on. I think the field is larger than you imagine.
                            ... Martin
                            I've looked at maybe 100 of the larger surname groups, and in all of them the number of people with green haplogroups (indicating that they have been confirmed by SNP testing) is a distinct minority. "Vast majority" and "distinct minority" are subjective, of course. Just to quantify, I took a look at a couple of projects. For the Adams project, about 135/550 had their haplogroup confirmed by SNP testing. For the Adkins group, the number was about 43/150. So it is pretty consistent that only about 1/4 to 1/3 of kits with STR results had done SNP testing. Based on those numbers, I would estimate that there are 3-4 times as many people who have done STR testing as have done SNP testing. Moreover, the number of people who have done Big Y (or have otherwise done testing that establishes their terminal SNP) is a small subset of those who have done SNP testing, and only that kind of testing is useful for affirmatively matching people within the genealogical time frame.
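The arithmetic behind that estimate can be checked with a few lines of Python. This is just a worked calculation on the two approximate counts quoted in the post; the function names are made up for the sketch.

```python
# Approximate counts quoted in the post: (SNP-confirmed kits, kits with STR results).
projects = {"Adams": (135, 550), "Adkins": (43, 150)}


def snp_share(confirmed, total):
    """Fraction of kits whose haplogroup is confirmed by SNP testing."""
    return confirmed / total


def str_per_snp(confirmed, total):
    """How many STR-tested kits there are for each SNP-confirmed kit."""
    return total / confirmed


for name, (confirmed, total) in projects.items():
    print(f"{name}: {snp_share(confirmed, total):.1%} confirmed, "
          f"{str_per_snp(confirmed, total):.1f} STR kits per confirmed kit")
# Adams: 24.5% confirmed, 4.1 STR kits per confirmed kit
# Adkins: 28.7% confirmed, 3.5 STR kits per confirmed kit
```

Both shares fall between one quarter and one third, and the ratios of roughly 3.5 and 4.1 match the "3-4 times" figure, at least for these two projects.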

                            Not all members of haplogroup projects have done SNP testing. Most FTDNA haplogroup projects accept members based on "predicted haplogroup", which is based purely on their STR results.

                            Comment

                            • Jim Barrett
                              R-BY55907
                              • Apr 2003
                              • 2990

                              #15
                              Originally posted by TwiddlingThumbs
                              Big Y and similar tests are expensive. If you just take a terminal SNP test with someone who has done the testing, it's a bit of a shot in the dark. I tried that route and it was a miss. Then I took a SNP panel test and found my terminal SNP, but no one else with my surname who had done SNP testing had that SNP. Then I took a 67 STR test, which showed I was not related to the guy who had taken the Big Y test but was related to another guy who we established, through genealogical evidence, shared a common ancestor with me.
                              The SNP Pack you ordered probably did not find your terminal SNP. It only found the most downstream SNP included in that SNP Pack.

                              Comment
