Evaluating a Big Y match


  • #16
    The first thing I do with Big Y results (after requesting a BAM file) is download the SNP list, which includes known SNPs & novel variants, into Excel. I delete the known SNPs, keeping only the novel variants, and add a column called Shared With. Then I go to the match list & look at the shared novel variant drop-down list for each match. I mark each shared variant on the novel variant spreadsheet, recording the number of matches who share it. After I'm done, I assign 0 to all of those unshared & sort on the Shared With column. I count the number of these unshared novel variants; they will, in the future, prove to be the most important piece of data coming from the Big Y test.
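For anyone who would rather script the spreadsheet steps above, here is a minimal sketch in Python. It assumes you have already copied your novel variants and each match's shared novel variants out of the site by hand; the variant labels, match names, and the `tally_shared` helper are made up for illustration and are not anything FTDNA provides.

```python
from collections import Counter

def tally_shared(novel_variants, shared_by_match):
    """Return {variant: number of matches sharing it} for every novel variant."""
    wanted = set(novel_variants)
    counts = Counter()
    for shared in shared_by_match.values():
        # Count each match at most once per variant.
        counts.update(set(shared) & wanted)
    # Unshared variants get an explicit 0, like the "Shared With" column.
    return {v: counts.get(v, 0) for v in novel_variants}

# Made-up labels, not real positions or kits.
novel = ["nv1", "nv2", "nv3", "nv4", "nv5"]
shared_by_match = {
    "match_A": ["nv1", "nv2"],
    "match_B": ["nv2"],
    "match_C": ["nv2", "nv4"],
}

shared_with = tally_shared(novel, shared_by_match)
# Sort on the Shared With count, most widely shared first.
ranked = sorted(shared_with.items(), key=lambda kv: (-kv[1], kv[0]))
unshared = [v for v, n in shared_with.items() if n == 0]

for variant, n in ranked:
    print(variant, n)
print("unshared:", len(unshared))
```

The unshared list at the end corresponds to the rows that would carry a 0 in the Shared With column.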

    Timothy Peterman

    Comment


    • #17
      Originally posted by T E Peterman View Post
      The first thing I do with Big Y results (after requesting a BAM file) is download the SNP list, which includes known SNPs & novel variants, into Excel. I delete the known SNPs, keeping only the novel variants, and add a column called Shared With. Then I go to the match list & look at the shared novel variant drop-down list for each match. I mark each shared variant on the novel variant spreadsheet, recording the number of matches who share it. After I'm done, I assign 0 to all of those unshared & sort on the Shared With column. I count the number of these unshared novel variants; they will, in the future, prove to be the most important piece of data coming from the Big Y test.

      Timothy Peterman
      Thank you for sharing your process. I'm going to give it a try when I receive my BAM file. Kevin

      Comment


      • #18
        Originally posted by T E Peterman View Post
        The first thing I do with Big Y results (after requesting a BAM file) is download the SNP list, which includes known SNPs & novel variants, into Excel. I delete the known SNPs, keeping only the novel variants, and add a column called Shared With. Then I go to the match list & look at the shared novel variant drop-down list for each match. I mark each shared variant on the novel variant spreadsheet, recording the number of matches who share it. After I'm done, I assign 0 to all of those unshared & sort on the Shared With column. I count the number of these unshared novel variants; they will, in the future, prove to be the most important piece of data coming from the Big Y test.

        Timothy Peterman
        It would be easier if FTDNA posted the novel variants of a tester's Big Y results. It would save the tester a lot of work. When I take the test, the novel variants are the only thing I will be interested in.

        Comment


        • #19
          They do post the novel variants. But they don't tell you which ones are unshared.

          Timothy Peterman

          Comment


          • #20
            @1798

            Originally posted by T E Peterman View Post
            They do post the novel variants. But they don't tell you which ones are unshared.

            Timothy Peterman
            Big Y testers only see whether any of their matches shares novel variants with them.

            At the minimum, FTDNA should expand the Big Y list of known SNPs by including thousands of SNPs discovered through Big Y (the SNPs named BYxxx and BYxxxx).

            W. (Mr.)

            Comment


            • #21
              There is a tab where all novel variants are listed. If you download the CSV file, you will get a list of all novel variant values & all known SNP values.

              You can then check off the novel variants against the match list where novel variants are only reported if they are shared with someone. I've done this with all 5 Big Y tests I manage.
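The check-off step can be sketched roughly as below. The column names (`Type`, `Name`) and the `Novel Variant` label are assumptions made for this example, not the actual layout of the downloaded CSV, so adjust them to whatever headers your own file has. The helper names are hypothetical as well.

```python
import csv
import io

# Hypothetical layout: the real export's column names may differ.
SAMPLE_CSV = """Type,Name
Known SNP,snp_1
Novel Variant,nv1
Novel Variant,nv2
Known SNP,snp_2
Novel Variant,nv3
"""

def load_novel_variants(csv_text, type_col="Type", name_col="Name",
                        novel_label="Novel Variant"):
    """Return the names of rows labelled as novel variants, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[name_col] for row in reader if row[type_col] == novel_label]

def check_off(novel, reported_shared):
    """Split novel variants into those reported as shared and the rest."""
    reported = set(reported_shared)
    shared = [v for v in novel if v in reported]
    unshared = [v for v in novel if v not in reported]
    return shared, unshared

novel = load_novel_variants(SAMPLE_CSV)
# Variants that showed up in at least one match's shared list (made up here).
shared, unshared = check_off(novel, reported_shared=["nv2"])
print("shared:", shared)
print("unshared:", unshared)
```

For a real file you would pass the file's contents instead of `SAMPLE_CSV`; the split itself does not depend on the column names.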

              Timothy Peterman

              Comment


              • #22
                Originally posted by T E Peterman View Post
                There is a tab where all novel variants are listed. If you download the CSV file, you will get a list of all novel variant values & all known SNP values.

                You can then check off the novel variants against the match list where novel variants are only reported if they are shared with someone. I've done this with all 5 Big Y tests I manage.
                The problem is that Big Y testers only see whether any of their matches shares novel variants with them. There is no indication whether anybody else out of those who tested Big Y also has it.

                Let me go back to the grand uncle (U)/grand nephew (N) pair.

                U has two SNPs at positions 222xxxxx and 223xxxxx; N does not have them. I might have thought that these two (novel) SNPs are unique to U. But they are not! Although none of U's matches has them, one of N's matches does! I could see that only because I have access to both kits...
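To make the U/N situation concrete, here is a small sketch of the cross-check that becomes possible once more than one kit is available. Everything in it is illustrative: the labels `pos_A`, `pos_B`, `pos_C` are placeholders rather than the real positions, and the function names are hypothetical. It assumes you have somehow collected, for each other kit you manage, the set of variants seen among that kit's matches.

```python
def unshared_within_kit(own_novel, shared_with_own_matches):
    """Novel variants that none of this kit's own matches is reported to share."""
    return set(own_novel) - set(shared_with_own_matches)

def seen_via_other_kits(candidates, variants_by_other_kit):
    """Map each candidate variant to the other kits whose matches carry it."""
    found = {}
    for kit, variants in variants_by_other_kit.items():
        for v in sorted(set(candidates) & set(variants)):
            found.setdefault(v, []).append(kit)
    return found

# Placeholder data: U has three novel variants, none shared with U's matches.
u_novel = {"pos_A", "pos_B", "pos_C"}
u_shared = set()
# Variants observed among N's matches (only visible with access to kit N).
variants_by_other_kit = {"N": {"pos_A", "pos_B", "pos_X"}}

looks_private = unshared_within_kit(u_novel, u_shared)
seen_elsewhere = seen_via_other_kits(looks_private, variants_by_other_kit)
still_private = looks_private - set(seen_elsewhere)

print("seen elsewhere:", seen_elsewhere)
print("still apparently private:", sorted(still_private))
```

With only U's kit, all three variants would look private; adding N's view removes two of them.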

                W. (Mr.)

                Comment


                • #23
                  We can't compare data that doesn't exist. The only way we can get data on whether people outside Big Y have the novel variant is for them to take the Big Y test.

                  Timothy Peterman

                  Comment


                  • #24
                    The problem is that Big Y testers only see whether any of their matches shares novel variants with them.
                    Which begs the question just what does Family Tree consider a match?

                    Comment


                    • #25
                      Originally posted by Brunetmj View Post
                      Which begs the question just what does Family Tree consider a match?
                      Until yesterday, I thought I knew the answer. Last night I was looking at matches of N and U, and realized that either FTDNA is inconsistent or there is something going on behind the scenes, as far as Big Y matching is concerned.

                      Mr. W.

                      P.S.
                      FTDNA is inconsistent.
                      For example, if there is a set of rules that governs what SNPs are being placed on SNP certificates, and which ones are not, it must be full of special cases and exceptions.

                      Comment


                      • #26
                        Originally posted by dna View Post
                        Until yesterday, I thought I knew the answer. Last night I was looking at matches of N and U, and realized that either FTDNA is inconsistent or there is something going on behind the scenes, as far as Big Y matching is concerned.

                        Mr. W.

                        P.S.
                        FTDNA is inconsistent.
                        For example, if there is a set of rules that governs what SNPs are being placed on SNP certificates, and which ones are not, it must be full of special cases and exceptions.
                        It is because of FTDNA's inconsistency that professional analysis by YFull, FGC, and Alex Williamson (if P312) is recommended.

                        Comment


                        • #27
                          Originally posted by Brunetmj View Post
                          Which begs the question just what does Family Tree consider a match?
                          I am by no means an expert on this topic, but it has been my understanding that your BigY Match list is based on your set of Known SNPs. Anybody that matches you with no more than four differences is considered to be a match. The match list is normally sorted by the number of Known SNP Differences, and the actual Non-Matching Known SNPs are shown in the next column.

                          In my own case, I have 38 matches listed. Of these, 4 are exact matches, 11 have one mismatched SNP, 12 have 2 mismatched SNPs, 9 have 3 mismatches, and 2 have 4 mismatches. I happen to be in haplogroup I-L813. All but one of my matches are also in I-L813. The one exception is I-Z74, which is one step up the FTDNA haplotree. He does not have one of the SNPs that define the I-L813 haplogroup.
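As a rough illustration of the rule as I understand it (and only as I understand it), the sketch below treats a "difference" as a known SNP that appears in one result set but not the other, keeps anyone with no more than four such differences, and sorts by that count. That definition, the SNP labels, and the kit names are all assumptions for the example; it is not FTDNA's actual matching code, and no-calls are ignored. The last lines simply re-add the breakdown given above.

```python
MAX_DIFFERENCES = 4  # threshold as described in this post

def known_snp_differences(mine, theirs):
    """Known SNPs present in exactly one of the two result sets (assumed definition)."""
    return set(mine) ^ set(theirs)

def match_list(my_snps, others, max_diff=MAX_DIFFERENCES):
    """Return (difference count, kit, non-matching SNPs), sorted by count."""
    rows = []
    for name, snps in others.items():
        diff = known_snp_differences(my_snps, snps)
        if len(diff) <= max_diff:
            rows.append((len(diff), name, sorted(diff)))
    return sorted(rows)

# Made-up SNP labels and kits.
my_snps = {"s1", "s2", "s3", "s4", "s5"}
others = {
    "kit_1": {"s1", "s2", "s3", "s4", "s5"},              # exact match
    "kit_2": {"s1", "s2", "s3", "s4"},                    # 1 difference
    "kit_3": {"s1", "s2", "s3", "s4", "s5", "x1", "x2"},  # 2 differences
    "kit_4": {"s1", "x1", "x2", "x3", "x4", "x5"},        # 9 differences, dropped
}

matches = match_list(my_snps, others)
for count, name, diff in matches:
    print(count, name, diff)

# The breakdown quoted above: differences -> number of matches.
breakdown = {0: 4, 1: 11, 2: 12, 3: 9, 4: 2}
total = sum(breakdown.values())
print("total matches:", total)
```

The breakdown adds up to the 38 matches listed.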

                          All of the other Non-Matching Known SNPs that are listed appear to be private SNPs to either myself or my matches. For example six of my matches share the SNP known as F454. The other 34 matches are F454-. According to the ISOGG YBrowse utility, F454 was observed in haplogroup O3-M117. Not a reliable SNP from the looks of it, but it plays a significant part in the makeup of my match list.

                          By the way, there is a pull down near the top of your match report labelled "Filter matches by subclade". If you pull that down, you will see a list of all of the haplogroups (according to the FTDNA haplotree) that appear in your match list. You can select one of those (if you have more than one) as a filter. Since I only have one match that is not in my own Hg, I usually don't bother.

                          Comment


                          • #28
                            Originally posted by rrtipton1 View Post
                            I am by no means an expert on this topic, but it has been my understanding that your BigY Match list is based on your set of Known SNPs. Anybody that matches you with no more than four differences is considered to be a match. The match list is normally sorted by the number of Known SNP Differences, and the actual Non-Matching Known SNPs are shown in the next column. [----]
                            The same here, I am not an expert, and had an understanding like yours until last night... Now I have examples that do not fit the above rule...

                            Comment


                            • #29
                              The bottom line is that one should not rely on what FTDNA supplies in terms of Big-Y matches and information. The only useful information from the match list is whether there might be a new result in your general haplogroup region. FTDNA is comparing Big-Y raw results against about 40% of the known Y-SNPs, and it does not have an up-to-date internal tree to correctly identify and place called SNPs where they belong. This is where specific haplogroup analysis efforts and/or FGC/YFull analysis provide the real answer. The U106 project admins visited FTDNA last fall specifically to show FTDNA IT how our comparison was done to remove the upstream SNPs and to properly identify the inconsistent ones that are provided as part of the "novel" results. They got the picture that they had missed the boat on a number of items related to properly analyzing and comparing Big-Y files. But so far no change has occurred in what they are providing to us, the customers.

                              Comment


                              • #30
                                The only thing useful I've gotten from Family Tree DNA is the BAM file & the list of novel variants.

                                Send the BAM file to the project admins & also to Yfull. You will be helping to build a better tree.

                                The matches are simply those who differ on 4 or fewer known SNPs. In some cases, an important SNP may not have been read & thus isn't included. But in many cases the closest matches are those who share an MRCA maybe 1,500 years or more in the past; sometimes as far back as 4,000 years in the past. As someone else said, you can limit by terminal SNP.

                                Timothy Peterman

                                Comment
