New free web app for use with surname projects

  • TwiddlingThumbs
    FTDNA Customer
    • Jan 2016
    • 155

    I just completed a free web app that allows people to quickly and easily (1) view appropriate family groupings for kits in surname projects and (2) compare a particular kit against all other kits in a project. Here's the link: http://www.ydnagroupingapp.com/ The app is useful not just to surname project admins, but also to anyone interested in a surname project. (My apologies to anyone who saw my similar post in the project admins forum.)

    The app has an FAQ that answers basic questions about it. I'd be interested in hearing any other questions, comments, problems or suggestions anyone might have.

    Chase Ashley
  • jbarry6899
    FTDNA Customer
    • Jun 2012
    • 638

    #2
    Just wanted to thank Chase for this tool and for making it available via the web. It works very well as an initial sorter for our surname project, with only a few results that we know to be different because of deep SNP testing. In all of those cases the individuals had unusual, random STR values that made them appear unrelated to a group, but later SNP testing revealed a connection.

    A very good resource for either starting out or getting a methodological check.

    Jim Barry
    Administrator, Barry DNA Project


    • rrrobins
      FTDNA Customer
      • Apr 2017
      • 5

      #3
      Interesting - worked well except for the Group column. Seemed generally OK for group 1, but there were rows with multiple group numbers listed (1,16 or 1,16,18,21 -- generally the first was correct). Beyond group 1 many numbers seemed to be missing or random.

      Perhaps this is something specific to the format of the table for the group I checked:


      • TwiddlingThumbs
        FTDNA Customer
        • Jan 2016
        • 155

        #4
        Thanks for your feedback. Hopefully, I can explain the results a bit. The app's grouping output is exactly what it is intended to be. Each kit that has the same group number is, under FTDNA's genetic distance guidelines, probably descended from a common ancestor with another kit with that group number. "Small kits" with only 12 or 25 markers are assigned multiple group numbers if (as is common) they match up with kits in different related groups. There are some higher group numbers above lower group numbers because, initially, the app forms groups based on "big kit" matches and then goes back and forms additional groups from small kits that match up only with other small kits. Any kit with no number assigned to it has too great a genetic distance from every other kit in the project to be deemed probably descended from a common ancestor with any other kit in the project. The details of the methodology are explained in the FAQ in the app.
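The two-pass grouping described above can be sketched roughly as follows. This is only an illustration and not the app's actual code: the function names, the thresholds in `MAX_GD`, and the toy 4- and 8-marker panels (standing in for real 12-, 25- and 37-marker kits) are all assumptions, and genetic distance is simplified to a plain sum of step differences. The app's FAQ describes the real methodology.

```python
# Hypothetical thresholds: the largest genetic distance still treated as a
# "match" at each comparison level. Toy panel sizes, not FTDNA's values.
MAX_GD = {4: 0, 8: 1}

def genetic_distance(a, b):
    """Return (gd, n): summed step differences over the n markers both kits tested."""
    n = min(len(a), len(b))
    return sum(abs(x - y) for x, y in zip(a[:n], b[:n])), n

def components(members, is_match):
    """Connected components (size > 1) of the 'is a match' graph over members."""
    seen, out = set(), []
    for start in members:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            cur = stack.pop()
            for other in members:
                if other not in comp and is_match(cur, other):
                    comp.add(other)
                    stack.append(other)
        seen |= comp
        if len(comp) > 1:
            out.append(comp)
    return out

def group_kits(kits, max_gd, big_min):
    def is_match(a, b):
        gd, n = genetic_distance(kits[a], kits[b])
        return gd <= max_gd[n]

    big = [k for k in kits if len(kits[k]) >= big_min]
    small = [k for k in kits if len(kits[k]) < big_min]
    groups = {k: set() for k in kits}
    next_id = 1

    # Pass 1: form groups from matches among big kits.
    for comp in components(big, is_match):
        for k in comp:
            groups[k].add(next_id)
        next_id += 1

    # Small kits take the number of every group containing a big kit they
    # match (simplification: a match to an ungrouped big kit is ignored).
    for s in small:
        for b in big:
            if is_match(s, b):
                groups[s] |= groups[b]

    # Pass 2: remaining small kits that match each other form later groups.
    leftover = [s for s in small if not groups[s]]
    for comp in components(leftover, is_match):
        for k in comp:
            groups[k].add(next_id)
        next_id += 1

    return {k: sorted(g) for k, g in groups.items()}

kits = {
    "A": [13, 24, 14, 11, 11, 14, 12, 12],
    "B": [13, 24, 14, 11, 11, 14, 12, 13],
    "C": [13, 24, 14, 11, 12, 15, 13, 12],
    "D": [13, 24, 14, 11, 12, 15, 13, 13],
    "S1": [13, 24, 14, 11],   # small kit consistent with both big groups
    "S2": [14, 22, 15, 10],
    "S3": [14, 22, 15, 10],
    "E": [12, 23, 15, 10, 10, 13, 11, 11],  # matches nothing
}
result = group_kits(kits, MAX_GD, big_min=8)
for kit, numbers in result.items():
    print(kit, numbers)
```

In this toy data the small kit S1 ends up with two group numbers, S2 and S3 form a group of their own only after the big-kit groups are numbered, and E gets no number at all, which mirrors the three situations described in the post.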

        I think the Robbins project is an excellent example of how the app can quickly show which kits, based on their STR results and FTDNA's genetic distance interpretation guidelines, probably have a common ancestor within the genealogical time frame (i.e., are a "match"). A quick look at the colorized chart indicates that most of the existing groups are based primarily on broad haplotypes and contain both subgroups that may have a common ancestor and lots of unrelated kits. https://www.familytreedna.com/public...ame=ycolorized The app confirms this and identifies which kits are, based on FTDNA's gd guidelines, probably related.

        In cases where there is a question about why a particular kit is or is not included (or close to being included) in a group, clicking on the kit number for that kit opens up the Relational Distance page for that kit and shows the genetic and relational distance between that kit and every other kit in the project, which shows why.

        Looking at what the app found with respect to each of the existing groups:

        Robbins 1 - The app confirmed that every kit in this group is a match. One 25-marker kit has a second number by it because it also matches with a kit that is not a match with any Robbins 1 kit. Further STR testing would determine which group it belongs in.

        Robbins 2 - This is identified in the project as consisting of kits that need further testing. 8 of the 9 kits are only 12-marker kits. The app shows most of these as matching with multiple groups, which just confirms that further testing is needed to determine which of those groups (if any) they belong in. One of the kits in Robbins 2, however, is a 37-marker kit. Clicking on the kit's number opens up the Relational Distance page for the kit and shows that the kit really does not come very close to matching any other kit with 37 or more markers. In fact, the closest it comes is a gd of 19 on a 37-marker comparison. Further testing will probably just confirm that this kit does not match any other kit in the project; what it needs instead is for more Robbins descendants to test so that a possible match can be found.

        Robbins 3 - This group is identified as R-P311. I'm not sure what the basis for that is since none of them seem to have had SNP testing and they are all predicted R-M269. In any event, their STR results indicate that none of them are a match for each other, although the app suggests that one of the kits (a 12 marker) matches up with 2 different groups and could benefit from additional testing to determine which.

        Robbins 4 - This group contains a bunch of R-M269 predicted kits and some R-P25 kits that happen to have a 13 in DYS 393. However, R-M269, R-P25 and a 13 on DYS 393 are all very common and not helpful as a DNA "signature", and in fact a lot of kits with that description are in other groups. Based on STR results and FTDNA's gd guidelines, as determined by the app, the group contains a bunch of small kits that could match up with multiple groups and need additional testing, a bunch of kits that do not match up with any other kit, two matching descendants of Thomas Robbins Sr. and, probably most interestingly, 3 kits that match with kits in the Robbins 6 group and should probably be grouped with them.

        Robbins 5 - The app confirms that these 2 kits are a match.

        Robbins 6 - This group is identified by the project as containing R-L21 kits. The app shows that the first 6 kits in this group match with 3 of the kits in Robbins group 4. None of the other three kits in this group match with any other kit in the project. Clicking on the kit numbers for these kits opens up the Relational Distance page for each kit and clearly shows they are not even close to matching those other kits and probably should not be grouped with them.

        Robbins 7 - This is a single-kit group. The app confirms the kit does not match with any other kit in the project.

        Robbins 8 - This is a small group consisting of E-L117 kits. The app shows, however, that the kits do not match. Clicking on one of the kits and opening up the Relational Distance page for the kit shows that they have a genetic distance of 5 on a 12 marker comparison.

        Robbins 9 - These are identified as G haplotype kits. A quick look at the colorized chart and a look at the Relational Distance page for any of the kits, however, shows that this group consists of 2 separate, unrelated groups.

        Robbins 10 - This is a large group consisting of I haplotype kits. The app shows that the group contains 4 unrelated groups of kits.

        Robbins 11 - This appears to be a small group that contains 2 kits: 1 public and 1 private. Curiously the group is designated as containing I haplotype kits, but the public kit is J predicted. The colorized chart strongly suggests that the 2 kits are not related.

        Ungrouped - Most of the kits in this group seem to be for unrelated surnames. The app confirms that none of these kits match any other kit, except a few small kits.

        Unrelated surname but matching Robbins 1 modal - The app confirms that 3 of these kits match Robbins 1 kits, but the last 3 do not.

        So, all in all, I think the app was pretty useful in analyzing the kit relationships in the Robbins project. One caveat is that the analysis based on the url link above is only for the public kits and it is possible that the result might be slightly different if a project administrator did an analysis based on full project results using a csv download of project data.


        • rrrobins
          FTDNA Customer
          • Apr 2017
          • 5

          #5
          Thanks for the feedback - there is clearly a lot more going on with this app than I had understood, and your explanations fit pretty well with what I have gleaned from the project admins. Some of that final group I know are only in there because they have the same FTDNA assigned R-M64 Y-DNA Haplogroup after Big Y as some of the Robbins 1 members - we have been working together through YFull to get the downstream branches and we now have two, including one for Robbins 1 with 3 members Big Y tested.

          I will pass on info on this app - now that I understand better I see how it can be very useful.


          • RERobbins
            FTDNA Customer
            • Nov 2014
            • 2

            #6
            Thanks for your analysis of the groups. As administrator of the Robbins group, I realize that the grouping needs work. Any recommendations for grouping?

            Originally posted by TwiddlingThumbs
            Thanks for your feedback. Hopefully, I can explain the results a bit. <snip>
            Last edited by RERobbins; 9 August 2017, 12:24 PM.


            • TwiddlingThumbs
              FTDNA Customer
              • Jan 2016
              • 155

              #7
              Originally posted by RERobbins
              Thanks for your analysis of the groups. As administrator of the Robbins group, I realize that the grouping needs work. Any recommendations for grouping?
              RERobbins - I would suggest running the app on a csv file for the project's results. The csv file will contain full project data, including private kit data. If you need instructions on how to get the csv file, they are in the FAQ in the app. I would then reorganize the groups based on the guidelines in the FAQ section in the app entitled "How should a surname group administrator use the results of the app?"


              • TwiddlingThumbs
                FTDNA Customer
                • Jan 2016
                • 155

                #8
                The Y-DNA Family Grouping App has just been upgraded to a new version that includes the following enhancements:
                - Addition of ability to use standard worldfamilies.net Y-DNA results tables as input
                - Inclusion of existing subgroup rows in the initial output table to more easily see how the groupings assigned by the app differ from existing groupings
                - Addition of a feature that allows you to see how the project kits would be organized if grouped in accordance with the app's results
                - Addition of a feature that allows you to see the relational distances and genetic distances between the modal values of each group identified by the app and all kits in the project
                - Colorized STR values results to show differences in values from the reference kit or group modal values
                - Use of vertical marker headings to allow more condensed presentation of marker value information (per FTDNA's charts)
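As a rough illustration of the group modal values mentioned in the list above, a modal haplotype can be computed by taking the most common value at each marker position. This is a minimal sketch under assumptions: `modal_haplotype` and `distance_to_modal` are hypothetical names, the distance is a simple sum of step differences, and the marker values are made up. It is not the app's implementation.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Most common value at each marker position across a group's kits.

    Kits may have tested different numbers of markers; each position uses
    only the kits that include it. Ties go to the value seen first.
    """
    longest = max(len(h) for h in haplotypes)
    modal = []
    for i in range(longest):
        values = [h[i] for h in haplotypes if len(h) > i]
        modal.append(Counter(values).most_common(1)[0][0])
    return modal

def distance_to_modal(kit, modal):
    """Summed step differences over the markers the kit and the modal share."""
    return sum(abs(x - y) for x, y in zip(kit, modal))

# Made-up marker values for a three-kit group.
group = [
    [13, 24, 14, 11],
    [13, 24, 15, 11],
    [13, 25, 14, 11, 12],
]
modal = modal_haplotype(group)
print(modal)
print([distance_to_modal(kit, modal) for kit in group])
```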
                Last edited by TwiddlingThumbs; 20 August 2017, 11:32 AM.


                • TwiddlingThumbs
                  FTDNA Customer
                  • Jan 2016
                  • 155

                  #9
                  New upgrade to Y-DNA Family Grouping App. Adds a feature that identifies more closely related subgroups within a group of related kits. The feature is most useful for those lucky projects that have one or more large groups of related kits. The feature is reached through the See Subgroups button on the Reorganized Table page.


                  • jmcgill
                    FTDNA Customer
                    • Nov 2017
                    • 9

                    #10
                    Would you consider having the app flag kits that match specific DNA signatures such as "Niall of the Nine Hostages", which is common in Ireland and Scotland?



                    There are probably similar STR clusters in other ethnic groups.

                    BTW, your app has been very helpful to me on the McGill project.


                    • TwiddlingThumbs
                      FTDNA Customer
                      • Jan 2016
                      • 155

                      #11
                      Hi McG, glad you've found the app useful. If I understand you correctly, you are talking about an app that would predict haplotype from STRs. Of course, FTDNA does that to some extent already, but maybe not down to the R-M222 level. I haven't checked them out, but I think there are already some independent apps that do that: https://isogg.org/wiki/Y-DNA_tools


                      • jmcgill
                        FTDNA Customer
                        • Nov 2017
                        • 9

                        #12
                        Thumbs,
                        Thanks for the answer. I am checking out those other resources.

                        I have another issue. We have a new kit on the McGill project (IN13215) that your app is not including in the results. I tried both as admin with csv and as regular user.

                        This is the first kit number that I've seen that is two letters followed by numbers. Could this be a parsing error? Or is it because it is a new kit?


                        • TwiddlingThumbs
                          FTDNA Customer
                          • Jan 2016
                          • 155

                          #13
                          It was probably because the kit number has 2 letters to start. I tweaked the code so it should now recognize that as a legit kit number. Let me know if it works. BTW, what company did the kit come from?
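For anyone curious what that kind of fix might look like, here is a hedged sketch of a kit-number check that accepts an optional prefix of one or two letters followed by digits. The pattern and the `is_kit_number` helper are assumptions for illustration only; the app's real parsing code is not shown in this thread and may differ.

```python
import re

# Hypothetical pattern: zero to two leading letters, then one or more digits.
KIT_NUMBER = re.compile(r"[A-Za-z]{0,2}\d+")

def is_kit_number(text):
    """True if the whole (stripped) string looks like a kit number."""
    return KIT_NUMBER.fullmatch(text.strip()) is not None

for candidate in ["123456", "N12345", "IN13215", "Robbins 1"]:
    print(candidate, is_kit_number(candidate))
```

Using `fullmatch` rather than `match` keeps a row label such as a subgroup heading from being mistaken for a kit number just because it contains digits.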
                          Last edited by TwiddlingThumbs; 27 February 2018, 08:38 AM.


                          • jmcgill
                            FTDNA Customer
                            • Nov 2017
                            • 9

                            #14
                            That works great for the .csv generated from admin account. But it doesn't work for using the url method.

                            The kit was from familytreedna. The tester lives in Northern Ireland, so I think the IN either stands for International or Ireland Northern.


                            • Fern
                              FTDNA Customer
                              • Mar 2017
                              • 168

                              #15
                              Originally posted by jmcgill
                              I think the IN either stands for International or Ireland Northern.
                              International. See #5 in the recent thread http://forums.familytreedna.com/show...ht=Kit+numbers
