Please comment - SNP vs STR


  • Please comment - SNP vs STR

    Below is a segment from a book I am preparing about my one-name study. I would be interested in any comments you might have regarding these ideas:

    If SNP testing could result in projects where samplers could see surname pools as clearly as STR testing, then I would recommend nothing but SNP testing.

    For another opinion, here is a paraphrase of a recommendation from a SNP analyst, whose name I withhold. It illustrates the divergence between an expert who focuses on haplogroups and a researcher who focuses on male-to-male lineage:

    Haplogroup predictions from a Y-67 STR test are no different from those of a far cheaper 12- or 25-marker STR test, which is why I have been recommending that men test 12 STRs to get their haplogroup and then focus on testing SNPs, either with an appropriate SNP Pack or by sequencing a Y genome.

    From the standpoint of the analyst, this is perfectly logical, from both a cost and data point-of-view. But it means that interpretation is entirely in the hands of the analyst who is almost certainly not interested in your surname or family history.

    Comments about yDNA STR testing as being a waste of money or somehow “second-class” coming from those who are biased toward SNP testing are not helpful: the idea that surname students should leave DNA research to the experts who can interpret SNPs totally misses the point. Estimates of aging and branching from a particular haplogroup are possible with large-scale SNP testing, but it is questionable whether even the most extensive SNP sequencing will lead to anything of interest to those of us who are tracing males for genealogical surname purposes. A 2012 British publication about genealogical “brick walls” actually advises against yDNA STR testing until costs come down for genomic sequencing. I consider this hare-brained advice. To discourage any STR testing in the hope of less expensive SNP sequencing in the future risks the loss of the sample altogether. Repeating myself: if SNP testing could result in projects where samplers could posit surname pools as clearly as with STR testing, then I would recommend nothing but SNP testing.

  • #2
    To see this issue in practice, compare Family Tree DNA with 23andMe. The former uses STRs and the latter only SNPs. I ask (rhetorically) which is better for surname research?



    • #3
      I do not have time until Friday or Saturday, so just a short comment since the issue is a very important one.

      I am acutely aware that this is not the best solution money-wise, but I am trying to get to a point when there is at least one man, but preferably two, with Y-DNA67 (or Y-DNA111) and Big Y results in each patriline I am researching.

      My views might have been different a couple of years ago, but the SNPs help us (my family) to find out who the real matches at Y-DNA67 level are. Some haplotypes are clearly shared among the lines that diverged thousands of years ago, the net result being many (for example more than 50!) matches with 67 markers or many thousands of matches with 12 markers.

      At the other end of the spectrum, there are men with no matches at all when using 12 markers (and while having no null values!). In such cases, Big Y offers some reassurance that one is not alone.


      So my recommendation is Y-DNA111 (as I am not aware of any false positives at that level), possibly supplemented with Big Y if either nothing comes up at first or, conversely, too many matches start appearing (and Big Y only when the budget permits such an investment).


      Mr. W.



      • #4
        More...

        How could SNPs alone possibly be used for a surname project today?

        Sometime in the future? That cannot be excluded. However, one should notice that FTDNA has just bundled Y-DNA111 with the original Big Y (but Y-DNA111 is still available alone). Did FTDNA think that SNPs have to be supplemented by STRs?


        Yes, SNPs can be very useful when investigating a possibly extremely distant cousin, such as one from 4000+ years ago. Otherwise, one could correctly(!) interpret the inherently limited TiP results as a non-zero chance of being related in the last 400 years (16-20 generations). With our SNPs tested (an SNP pack might be enough), just one glance tells us that our branches had their Most Recent Common Ancestor 4300-5300 years ago (95% confidence interval provided by YFull). But that is all in addition to STRs.


        Mr. W.


        P.S.
        The above is a real life example from the kits I manage here. Your mileage may vary. Past performance is not indicative of future results. Etc.



        • #5
          For what it is worth: I would emphasise that both SNPs and STRs are essential. SNP mutations place you firmly at specific locations on the phylogenetic tree, though not necessarily at your final position, as the tree sprouts branches and twigs as knowledge increases.

          Within each tree position the STR mutations help identification of possible genealogical relationships. The more markers tested the better the predictions are. However, be acutely aware of false positives with STR matching.

          One of the other replies indicates that he has not seen a false positive with 111 markers. I don't know how he defines a false positive, but on my YFull STR match page my 3rd and 4th closest matches, both 0.122 (45/370 and 28/229), are false positives in that they are 7-8 SNPs away from my sub-branch, i.e. they are useless in predicting any sort of genuine relationship.

          Good luck in writing your study.



          • #6
            @Svein Davidsen

            I meant specifically matches with FTDNA Y-DNA111.

            And how far are you from those two men when you limit yourself to the 111 markers used by the FTDNA Y-DNA111 test, counting the difference the way FTDNA counts it? (It is OK for some STRs not to be reported.)

            In my opinion, there is a very good chance that the matches you gave as examples would not show as matches with Y-DNA111 test.
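
A minimal sketch of the comparison asked about above: restrict two haplotypes to a fixed marker panel, skip any marker that either man has not had reported, and sum the repeat-count differences. The marker values, the four-marker panel, and the function name `genetic_distance` are illustrative assumptions. The simple step-wise sum used here is also an assumption; FTDNA's real counting rules (for example for multi-copy markers) may differ and are not reproduced.

```python
from typing import Dict, List, Optional, Tuple

# A haplotype maps marker name -> repeat count, or None when not reported.
Haplotype = Dict[str, Optional[int]]


def genetic_distance(a: Haplotype, b: Haplotype, panel: List[str]) -> Tuple[int, int]:
    """Return (distance, markers_compared) over the given panel.

    Markers missing or unreported in either haplotype are skipped,
    so they neither add to the distance nor count as compared.
    """
    distance = 0
    compared = 0
    for marker in panel:
        value_a = a.get(marker)
        value_b = b.get(marker)
        if value_a is None or value_b is None:
            continue  # not reported for at least one man
        compared += 1
        distance += abs(value_a - value_b)  # simple step-wise difference
    return distance, compared


# Hypothetical values, for illustration only.
panel = ["DYS393", "DYS390", "DYS19", "DYS391"]
man_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
man_2 = {"DYS393": 13, "DYS390": 23, "DYS19": None, "DYS391": 13}

distance, compared = genetic_distance(man_1, man_2, panel)
print(f"distance {distance} over {compared} compared markers")
```

Reporting the number of markers actually compared alongside the distance makes it harder to mistake a small distance over a few markers for a close match.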


            Mr. W.



            • #7
              You are quite correct of course, they don't show up on my STR 111 match page, but then again I don't know if they have taken the 111 test.

              In fact I don't have a single match that meets the FTDNA match criteria for 111 markers. One recent tester has the same terminal SNP as me, N-Y17416, but he has only taken the 67 marker test, so there is no result for 111 markers; yet neither does he show up on my 67, 37 or 25 marker comparison! We match 54/67, on my count, and as our ancestries are thousands of miles/kilometers apart we are not "kissing cousins"!

              EDIT
              Sorry, disregard the above paragraph - I mixed up two results. The 67 marker tester is 1 SNP downstream from me. This obviously is what is reflected in the STR difference between us!
              End Edit.

              With apologies to clintonslayton76 for hijacking his thread, does anyone know how FTDNA's roughly 450 new additional STR markers compare with the 400-500 markers used by YFull? Is there much of an overlap?
              Last edited by Svein Davidsen; 25th April 2018, 03:57 PM.



              • #8
                In the Y-DNA111 test, none of the STRs used is named FTY*. My guess is that the FTY* markers are STRs discovered at FTDNA and not necessarily read by other labs.

                Most of the new ones are FTY*. I counted 55 of the new STRs that are not FTY*:

                DYF392, DYF398A, DYF398B, DYF405, DYS389B,
                DYS453, DYS466, DYS474, DYS475, DYS476,
                DYS477, DYS480, DYS483, DYS484, DYS488,
                DYS489, DYS493, DYS499, DYS502, DYS507,
                DYS508, DYS512, DYS514, DYS516, DYS523,
                DYS530, DYS538, DYS539, DYS541, DYS542,
                DYS543, DYS544, DYS551, DYS569, DYS573,
                DYS574, DYS577, DYS580, DYS581, DYS583,
                DYS584, DYS585, DYS598, DYS602, DYS608,
                DYS615, DYS616, DYS618, DYS620, DYS623,
                DYS624, DYS631, DYS637, DYS642, DYS645

                I have no idea whether they happen to be the ones used by YFull.
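
A minimal sketch of how the FTY* versus non-FTY* split above could be reproduced from a list of marker names. The FTY names below are hypothetical placeholders, the function name `split_by_prefix` is illustrative, and how the names are obtained (for example from the header of a CSV download) is an assumption not covered here.

```python
from typing import Iterable, List, Tuple


def split_by_prefix(markers: Iterable[str], prefix: str = "FTY") -> Tuple[List[str], List[str]]:
    """Split marker names into (names starting with prefix, all others).

    Both lists are returned sorted alphabetically. The comparison is
    case-insensitive, so the prefix is upper-cased before matching.
    """
    names = list(markers)
    wanted = prefix.upper()
    matching = sorted(m for m in names if m.upper().startswith(wanted))
    others = sorted(m for m in names if not m.upper().startswith(wanted))
    return matching, others


# "FTY100" and "FTY12" are made-up names used only to exercise the filter.
sample = ["DYS453", "FTY100", "DYF405", "FTY12", "DYS389B"]
fty, non_fty = split_by_prefix(sample)
print(len(fty), "FTY markers;", len(non_fty), "others:", non_fty)
```

The same function with `prefix="DYF"` or `prefix="DYR"` would group markers by the other naming families discussed in this thread.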


                Mr. W.


                P.S.
                Please also find included the full list of STRs reported by FTDNA (from my CSV download). And the same STRs sorted alphabetically.


                • #9
                  Thank you dna/Mr.W.

                  I guess I should ask on the YFull FB page if there is a list of the YFull markers.



                  • #10
                    According to their FAQ https://yfull.com/faq/str-interpretations/ if you download your STR results with the option All you will get all of them, just some might be low confidence results.


                    Mr. W.


                    P.S.
                    A comparison might be easier than expected. It is possible that YFull has all or most of the STRs beyond the first 111 as either DYF* or DYR*. FTDNA does not have a single DYR and only the following DYFs: DYF392, DYF398A, DYF398B, DYF405.

                    My above guess is based on a low quality JPEG screenshot that does not include all of the YFull STRs.
                    Last edited by dna; 27th April 2018, 04:58 AM. Reason: +P.S.



                    • #11
                      Missing the point...

                      Originally posted by Svein Davidsen
                      With apologies to clintonslayton76 for hijacking his thread, does anyone know how FTDNA's roughly 450 new additional STR markers compare with the 400-500 markers used by YFull? Is there much of an overlap?
                      Originally posted by dna
                      It is possible that YFull has all or most of the STRs beyond the first 111 as either DYF* or DYR*. FTDNA does not have a single DYR and only the following DYFs: DYF392, DYF398A, DYF398B, DYF405.
                      This echoes my findings on my kit at YFull. The terminal SNP and most additional STRs from FTDNA have no counterparts on the ISOGG tree, so my terminal SNP at YFull and additional STRs have almost no points of comparison, at least at my (poor) level of understanding.

                      I have an older version of Office, and trying to use the CSV of 450+ STRs was too much like herding cats, even for simply recording them in my Project spreadsheet of 12-to-111 STRs for 50 men.

                      I have two other men who have done 111 and Big Y, but none of the three of us is from the same Y pool: one uses the name but is calculated to be misattributed, one (me) is from my pool, and one is from my wife's paternal ancestry. (We have two large pools of results that prove that our surnames are not from a common paternal ancestor within about 275 centuries, even though each family uses the same variant spellings.)

                      The tenor of my query is: I recognize the value of each type of testing, but question whether SNPs provide anything of value for pulling together subgroups of men who can see for themselves (or their sponsors) that their STRs match to the point of a worthwhile hypothesis. For me to encourage a cousin to spend $$$ for results (or for me to sponsor any more) that I myself have difficulty interpreting is at the nub of this issue.

                      If I followed the haplo advisor and suggested the least expensive 12-marker STR test for my members, I would place no confidence in any TiP or GD calculation based on just knowing the haplogroup predictions. I find anything less than 67 markers to be questionable without a lot of documentary support. My point is: the goals of the haplo advisor are not in the spirit of the goals of the family surname researcher.
                      Last edited by clintonslayton76; 27th April 2018, 11:09 AM. Reason: spelling, consolidation, emphasis



                      • #12
                        Here is a comparison of YFull (YF) and FTDNA calls for my STRs above 111. 30 calls matched; the 9 that did not are listed below (YF value first, then FTDNA value):

                        DYS502 YF 13 FTDNA 8
                        DYS514 YF 20 FTDNA 15
                        DYS516 YF 15 FTDNA 16
                        DYS538 YF 10 FTDNA 11
                        DYS542 YF 15 FTDNA 12
                        DYS543 YF 17 FTDNA 11
                        DYS544 YF 13 FTDNA 9
                        DYS623 YF 10 FTDNA 11
                        DYS631 YF 10 FTDNA 11

                        The rest appeared non-comparable.
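
A minimal sketch that tabulates the per-marker offset between the two labs for the nine mismatched calls listed above. It assumes the first number on each line is the YFull value and the second the FTDNA value; the names `MISMATCHED_CALLS` and `call_offsets` are illustrative.

```python
from collections import Counter
from typing import Dict, Tuple

# (YFull call, FTDNA call) per marker, copied from the list above.
# Assumption: the first number on each line is the YFull value.
MISMATCHED_CALLS: Dict[str, Tuple[int, int]] = {
    "DYS502": (13, 8),
    "DYS514": (20, 15),
    "DYS516": (15, 16),
    "DYS538": (10, 11),
    "DYS542": (15, 12),
    "DYS543": (17, 11),
    "DYS544": (13, 9),
    "DYS623": (10, 11),
    "DYS631": (10, 11),
}


def call_offsets(calls: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Return the YFull value minus the FTDNA value for each marker."""
    return {marker: yfull - ftdna for marker, (yfull, ftdna) in calls.items()}


offsets = call_offsets(MISMATCHED_CALLS)
offset_counts = Counter(offsets.values())  # how often each offset occurs

for marker, offset in sorted(offsets.items()):
    print(f"{marker}: {offset:+d}")
print(dict(offset_counts))
```

If the same marker showed the same offset across many kits, that would point to a difference in counting convention rather than a real difference between the men; a single kit cannot show that on its own.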



                        • #13
                          The above differences in the observed values might explain the following part of the FTDNA announcement:
                          Please note - STR Results may be different than other companies score.
                          At FamilyTreeDNA, we typically count only perfect repeats so some of the Big Y-500 Panel 6 Y-STR allele values will consistently differ from that of other companies. The difference will be offset by only a few values above or below the results of other companies.


                          Mr. W.


                          P.S. I understand that the values could be different between different labs. It was like that before. However, I was under the impression that there was a standard in place.



                          • #14
                            Originally posted by clintonslayton76
                            [----]
                            I have two other men who have done 111 and Big Y, but none of the three of us is from the same Y pool: one uses the name but is calculated to be misattributed, one (me) is from my pool, and one is from my wife's paternal ancestry. (We have two large pools of results that prove that our surnames are not from a common paternal ancestor within about 275 centuries, even though each family uses the same variant spellings.)

                            The tenor of my query is: I recognize the value of each type of testing, but question whether SNPs provide anything of value for pulling together subgroups of men who can see for themselves (or their sponsors) that their STRs match to the point of a worthwhile hypothesis. For me to encourage a cousin to spend $$$ for results (or for me to sponsor any more) that I myself have difficulty interpreting is at the nub of this issue.

                            If I followed the haplo advisor and suggested the least expensive 12-marker STR test for my members, I would place no confidence in any TiP or GD calculation based on just knowing the haplogroup predictions. I find anything less than 67 markers to be questionable without a lot of documentary support. My point is: the goals of the haplo advisor are not in the spirit of the goals of the family surname researcher.
                            This is the contentious point:
                            whether SNPs provide anything of value for pulling together subgroups of men who can see for themselves (or their sponsors) that their STRs match to the point of a worthwhile hypothesis

                            But there is no controversy. As in any research (or life in general), small probabilities are with us, for better or worse. Researchers chase down all the possibilities until completely sure that everything plausible has been eliminated. People play lotteries despite the odds. People fall in love against all the odds...

                            We have a case like yours, two branches with the same uncommon established name and MRCA thousands of years ago (according to SNPs). When running TiP in the GAP for Y-DNA67 results I get
                            20 generations 3.97%, 7.17%, 8.85%
                            24 generations 12.84%, 19.78%, 23.32%
                            I do not have Y-DNA111 results yet, but I do not expect TiP with them to be all 0.0%.

                            Without SNPs we would not have two separate branches.

                            The decision limit (either way: very small yes or very small no) seldom can be established with STRs alone. We need SNPs for some of the tested men.

                            Yes, you are 100% right: Y-DNA tests below 67 markers are not good for determining family branches (regardless of how distant they are). And yet, almost exactly as with Family Finder, some people might be more interested in their deep ancestry, that is, their SNP results.

                            Although sponsoring STR testing and leaving SNP testing to the project members themselves sounds like a simple solution, the complexity of social interactions suggests to me that it would not be workable.



                            Mr. W.



                            • #15
                              My take is that 67-marker STR testing is very effective for placing surname project members into groups of men who share a common ancestor within the genealogical time frame. Its primary shortcoming is that it cannot distinguish between a match due to convergent mutations and a match due to true relationship. False matches due to convergent mutations are generally not a problem in most surname groups. The two situations where it is most likely to be an issue are (1) clan-based surname projects, where lots of men could have ended up with variants of the same surname even though they were only related 1000+ years ago, and (2) matches with men who do not have a variant of the project surname and have no reason to believe they are descended from a male with that surname. In the first situation, convergent mutations can cause men with the same surname to appear to be more closely related than they are. This is one reason why you frequently find the biggest fans of SNP testing among people who have clan-based surnames (e.g., most of the Irish and Scottish surname groups). In the second situation, I would always recommend SNP testing, because there is a fairly high probability that a match between men with different surnames (and no genealogical reason to believe they are related) is a false match due to convergent mutation.
                              Last edited by TwiddlingThumbs; 29th April 2018, 11:35 AM.
