Evaluating a Big Y match


  • MMaddi
    replied
    Originally posted by 1798
    I am not saying that I am right. Here is what Michal posted at anthrogenica.
    “The major conclusion remains unchanged, which means that U106 diverged most likely between 6500 and 5000 years ago, probably within the 6000-5500 BP time frame.”
    Well, that's wonderful, but what does it have to do with you disagreeing with my estimate that my 83/111 match who shares one of my Big Y singletons with me has a TMRCA with me of 1,500 years or maybe less? You could have left it at just saying that you may not be right. The rest has nothing to do with the question we're discussing.



  • 1798
    replied
    Originally posted by MMaddi
    The way I arrived at the figure of a TMRCA of 1,200-1,500 years is by comparing the 67 marker haplotypes of myself, the person who tested FGC13492+ and a closer match of 63/67 (now 104/111 with me). I used the McGee Y-DNA utility at http://www.mymcgee.com/tools/yutility.html. Many people regard the McGee utility as more accurate than FTDNA's TiP calculator.

    Since all three of us now have 111 markers, I reran the comparison. The McGee utility doesn't use all 111 markers. It only used 94. However, I got similar results as the comparison I'd done in the past with 67 markers. In fact, at this comparison level my 104/111 match has a TMRCA with the FGC13492+ match of 1,200 years (same as before) and I have a TMRCA of 1,350 years (not the 1,500 years I got with 67 markers) with the FGC13492+ match.

    I think you're using too simplistic a view of GD and TMRCA. I suspect that you're taking our GD of 28 at 111 markers and multiplying that by 100 to come up with a TMRCA of 2,800 years. That's simplistic because it doesn't take into account the different mutation rates of different markers, which is something a TMRCA calculator like FTDNA's TiP or the McGee utility does. Not all marker mismatches are the same, as you seem to assume. Plus, there is an element of probability involved and it may be that the TMRCA between the FGC13492+ match and me is an outlier on the close end.
    I am not saying that I am right. Here is what Michal posted at anthrogenica.
    “The major conclusion remains unchanged, which means that U106 diverged most likely between 6500 and 5000 years ago, probably within the 6000-5500 BP time frame.”



  • MMaddi
    replied
    Originally posted by 1798
    That is just an estimate, not a fact. A GD of 28 seems like a lot between two men who are recently related.
    The way I arrived at the figure of a TMRCA of 1,200-1,500 years is by comparing the 67 marker haplotypes of myself, the person who tested FGC13492+ and a closer match of 63/67 (now 104/111 with me). I used the McGee Y-DNA utility at http://www.mymcgee.com/tools/yutility.html. Many people regard the McGee utility as more accurate than FTDNA's TiP calculator.

    Since all three of us now have 111 markers, I reran the comparison. The McGee utility doesn't use all 111 markers. It only used 94. However, I got similar results as the comparison I'd done in the past with 67 markers. In fact, at this comparison level my 104/111 match has a TMRCA with the FGC13492+ match of 1,200 years (same as before) and I have a TMRCA of 1,350 years (not the 1,500 years I got with 67 markers) with the FGC13492+ match.

    I think you're using too simplistic a view of GD and TMRCA. I suspect that you're taking our GD of 28 at 111 markers and multiplying that by 100 to come up with a TMRCA of 2,800 years. That's simplistic because it doesn't take into account the different mutation rates of different markers, which is something a TMRCA calculator like FTDNA's TiP or the McGee utility does. Not all marker mismatches are the same, as you seem to assume. Plus, there is an element of probability involved and it may be that the TMRCA between the FGC13492+ match and me is an outlier on the close end.
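
To make the contrast concrete, here is a rough sketch of the two approaches. It is not what TiP or the McGee utility actually compute. The function names, the 30-year generation, and every mutation rate below are made-up assumptions for illustration, and the rate-aware version is only a simple moment estimate (expected GD is roughly 2 x generations x the sum of the per-marker rates, ignoring back-mutations).

```python
def naive_tmrca_years(genetic_distance, years_per_step=100):
    """Rule of thumb questioned in the thread: one step of GD ~ 100 years."""
    return genetic_distance * years_per_step


def rate_aware_tmrca_years(genetic_distance, marker_rates, years_per_generation=30):
    """Simplified moment estimate that uses per-marker mutation rates.

    marker_rates: mutation rate per generation for each compared marker
    (hypothetical values here, not published rates).
    Two lineages each accumulate mutations, hence the factor of 2.
    """
    expected_steps_per_generation = 2 * sum(marker_rates)
    generations = genetic_distance / expected_steps_per_generation
    return generations * years_per_generation


if __name__ == "__main__":
    # Hypothetical 111-marker panel: 80 slow markers and 31 fast ones.
    hypothetical_rates = [0.002] * 80 + [0.008] * 31
    print("naive:", naive_tmrca_years(28))
    print("rate-aware:", round(rate_aware_tmrca_years(28, hypothetical_rates)))
```

The point of the sketch is only that the "years per step" figure is not a constant: it falls out of whichever marker rates and generation length you assume, so a panel with more fast-mutating markers implies fewer years per step of GD.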
    Last edited by MMaddi; 16 July 2015, 09:37 AM.



  • 1798
    replied
    Originally posted by MMaddi
    This doesn't answer my question of how my novel variant from Big Y is shared by someone else. We are both CTS2509+, estimated by Iain McDonald to be 2,100 years old. We both agree (!) that Iain is very good at estimating subclade ages, based on Big Y results, which are more accurate than dating using STR results.

    So, I'll repeat the question. How can my novel variant (found in no other Big Y result from other CTS2509+ men), which has been named FGC13492, be 2,800 years old when it's downstream from CTS2509 which is only 2,100 years old? Did Mr. CTS2509 have a time machine that allowed him to go 700 years into the past and father his son? Or did Mr. FGC13492 have the time machine, which he used to go 700 years into the future to father his father?

    You can see why I'm confused, can't you? Maybe you can educate me.
    That is just an estimate, not a fact. A GD of 28 seems like a lot between two men who are recently related.



  • MMaddi
    replied
    Originally posted by 1798
    I have a GD of 39 at 111 markers to other testers with the same terminal SNP, and it is estimated to be 4,000 ybp (S5520, 2069 BC). I am in a U106 project at present.
    This doesn't answer my question of how my novel variant from Big Y is shared by someone else. We are both CTS2509+, estimated by Iain McDonald to be 2,100 years old. We both agree (!) that Iain is very good at estimating subclade ages, based on Big Y results, which are more accurate than dating using STR results.

    So, I'll repeat the question. How can my novel variant (found in no other Big Y result from other CTS2509+ men), which has been named FGC13492, be 2,800 years old when it's downstream from CTS2509 which is only 2,100 years old? Did Mr. CTS2509 have a time machine that allowed him to go 700 years into the past and father his son? Or did Mr. FGC13492 have the time machine, which he used to go 700 years into the future to father his father?

    You can see why I'm confused, can't you? Maybe you can educate me.



  • 1798
    replied
    Originally posted by MMaddi
    I see. So, you think that someone who shares one of the singletons from my Big Y results has a TMRCA of 2,800 years with me.

    The upstream subclade for both of us is CTS2509. According to Dr. McDonald, whom you seem to regard as knowledgeable (and he is), CTS2509 is about 2,100 years old. Can you explain to me how FGC13492 is 700 years older than its parent subclade?

    With all due respect, I'd have more interest in your commentary about Big Y results and the R1b-U106 Project if you had taken the Big Y test yourself and had not left the R1b-U106 Project. But I guess you'd rather speak from your personal opinion from the outside than have some involvement in the process.
    I have a GD of 39 at 111 markers to other testers with the same terminal SNP, and it is estimated to be 4,000 ybp (S5520, 2069 BC). I am in a U106 project at present.

    It might be a good idea to hold off on Big Y for a few months. There are hints that standard pricing will be reduced and more features added later this year.
    https://www.genomeweb.com/sequencing...e-y-chromosome
    Last edited by 1798; 15 July 2015, 01:00 AM.



  • MMaddi
    replied
    Originally posted by 1798

    A GD of 28 at 111 markers is more like 2,800 ybp.
    I see. So, you think that someone who shares one of the singletons from my Big Y results has a TMRCA of 2,800 years with me.

    The upstream subclade for both of us is CTS2509. According to Dr. McDonald, whom you seem to regard as knowledgeable (and he is), CTS2509 is about 2,100 years old. Can you explain to me how FGC13492 is 700 years older than its parent subclade?

    With all due respect, I'd have more interest in your commentary about Big Y results and the R1b-U106 Project if you had taken the Big Y test yourself and had not left the R1b-U106 Project. But I guess you'd rather speak from your personal opinion from the outside than have some involvement in the process.
    Last edited by MMaddi; 14 July 2015, 03:21 PM.



  • 1798
    replied
    Originally posted by MMaddi
    It seems that you haven't checked the project results pages for a long time - https://www.familytreedna.com/public...ction=yresults. We've incorporated new subclades, which are shared by two or more members, found in their Big Y results, as the members provide us with their bed and vcf files from Big Y. Those files are the raw data which are analyzed by the spreadsheet one of the project members developed to weed out false "novel variants" that FTDNA doesn't weed out. This is what wkauffman wrote about in his post, which you responded to.



    Read my post recently in another thread at http://forums.familytreedna.com/show...2&postcount=12. I wrote there what I regard as a success story: "I used FGC for analysis of my Big Y BAM file. I was given a list, based on their analysis, of my best quality novel variants, which they named - FGC13480-FGC13492. I then had 12 of the 13 made testable at YSEQ, which one of my semi-close matches at FTDNA tested. (We're an 83/111 match. My estimate of when our common ancestor lived is 1,200-1,500 years ago.) He was found to be FGC13492+, forming a new subclade of R-CTS2509." That new subclade is reflected in the project results page and our U106 haplotree.
    I have checked the results but it is the U106 Y-tree that has not been updated with all the Big Y SNPs.

    A GD of 28 at 111 markers is more like 2,800 ybp.



  • rrtipton1
    replied
    By the way, I have added the novel variants data for all 37 of my I-L813 matches to a spreadsheet for analysis. There are about 925 novel variants among the 38 of us. Of these, about 100 appear to be useful for defining subclades. A total of 375 are currently unique to individuals, for an average of about 10 unique variants per man. An additional 75 appear to be associated with haplogroups above I-L813 in the haplotree. This leaves about 375 of the 925 novel variants in the category I call flakey or inconsistent. This means that about 40% of the novel variants are what I call weeds that need to be removed from consideration.
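
For anyone who wants to repeat this kind of tally on their own project, the breakdown above can be checked with a few lines. This is just a sketch using the rounded counts from the post; the category labels and function names are my own, not anything from FTDNA's reports.

```python
# Approximate counts quoted in the post (38 men in I-L813).
NOVEL_VARIANT_COUNTS = {
    "subclade-defining": 100,
    "unique to one man": 375,
    "upstream of I-L813": 75,
    "flaky or inconsistent": 375,
}
MEN_TESTED = 38


def total_variants(counts):
    """Sum of all categories; should come back to about 925."""
    return sum(counts.values())


def share(counts, category):
    """Fraction of all novel variants that fall in one category."""
    return counts[category] / total_variants(counts)


def unique_per_man(counts, men):
    """Average number of currently unique variants per tested man."""
    return counts["unique to one man"] / men


if __name__ == "__main__":
    print("total:", total_variants(NOVEL_VARIANT_COUNTS))
    for name in NOVEL_VARIANT_COUNTS:
        print(f"{name}: {share(NOVEL_VARIANT_COUNTS, name):.1%}")
    print("unique per man:", round(unique_per_man(NOVEL_VARIANT_COUNTS, MEN_TESTED), 1))
```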



  • rrtipton1
    replied
    Originally posted by T E Peterman
    What counts is unshared novel variants. A lot of the 78 are shared with everyone else in your haplogroup. To be clear, novel doesn't mean unique, it just means new, as in just discovered.
    Timothy Peterman
    I agree with Timothy. Luckily, FTDNA gives you the list of Novel Variants that you have and your match doesn't and vice versa. Unluckily, they don't tell you why they are unmatched.

    As an example, one of my matches and I share a surname, although our MRCA is probably some time before 1650. We share 194 novel variants. I have 39 that he doesn't have and he has 28 that I don't have. I have spent many hours reconciling all of these. In a large number of cases, one or the other of us has been found to have "no-calls" for the variant in question.

    If you only look at the match reports or the exported .csv files for the two of us, you cannot determine whether it is a true mismatch or a no-call. For that, you need to look at the .bed or .vcf file. Unfortunately, you only get to see that if the other party is willing to send you a copy. Even a project admin cannot download the raw data for anybody but himself.
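
If you do get the other party's .bed file, the no-call check can be scripted. The sketch below assumes the .bed file lists the regions that were actually read, in standard BED form (chromosome, 0-based start, exclusive end), that the regions do not overlap, and that the chromosome is labelled "chrY"; those are assumptions, as are the function names. Positions are 1-based, as in a .vcf file.

```python
from bisect import bisect_right


def load_bed_intervals(lines, chrom="chrY"):
    """Parse BED lines into a sorted list of (start, end) for one chromosome.

    BED coordinates are 0-based and half-open: start is included, end is not.
    """
    intervals = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[0] == chrom:
            intervals.append((int(fields[1]), int(fields[2])))
    intervals.sort()
    return intervals


def is_covered(intervals, pos_1based):
    """True if a 1-based position falls inside one of the covered regions."""
    pos0 = pos_1based - 1  # convert to the 0-based BED convention
    # Index of the last interval whose start is <= pos0 (assumes no overlaps).
    i = bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


def classify_missing_variant(his_intervals, pos_1based):
    """Label a variant I have that is absent from his variant list."""
    if is_covered(his_intervals, pos_1based):
        return "possible true mismatch"
    return "no-call"


if __name__ == "__main__":
    # Hypothetical regions and positions, for illustration only.
    bed = ["chrY\t100\t200", "chrY\t300\t400"]
    his = load_bed_intervals(bed)
    for pos in (150, 250):
        print(pos, classify_missing_variant(his, pos))
```

A position inside a covered region is only a candidate mismatch; you would still want to look at the .vcf row and its quality flags before counting it.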

    After taking into account the no-calls, the novel variants that occur higher up the haplotree, and the otherwise flakey results, I have cut this down to five variants that I have and he doesn't and two that he has and I don't. That averages out to about 3.5. If you apply the 135-years value to that, it gives a TMRCA of about 470 years (3.5 × 135 ≈ 472), which puts the common ancestor around 1540. Of course with only two samples, the error-bands are quite large.

    Both my match and I have tested to 111 STR markers. The FTDNA TiP report indicates about a 70% probability of the common ancestor being within the last 14 generations. If you assume a generation as 33 years, that would be about 462 years ago. That puts things in the same ballpark.
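
The two back-of-envelope estimates can be laid side by side like this. It is only a sketch of the arithmetic in this post: 135 years per SNP and 33 years per generation are the values used here, not established constants, and the function names are made up.

```python
YEARS_PER_SNP = 135          # per-SNP value used in this thread
YEARS_PER_GENERATION = 33    # generation length assumed in the post


def snp_tmrca_years(private_to_me, private_to_him, years_per_snp=YEARS_PER_SNP):
    """Average the two private-variant counts and scale by years per SNP."""
    average_private = (private_to_me + private_to_him) / 2
    return average_private * years_per_snp


def str_tmrca_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a TiP-style generation count into years."""
    return generations * years_per_generation


if __name__ == "__main__":
    print("SNP-based:", snp_tmrca_years(5, 2), "years")
    print("STR-based (14 generations):", str_tmrca_years(14), "years")
```

With five and two private variants the SNP figure comes out near 470 years, against 462 years for 14 generations, so the two methods land within a decade or so of each other. With a sample this small that agreement could easily be luck.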



  • MMaddi
    replied
    Originally posted by 1798
    Why has the U106 project not updated their SNP tree from all of the Big Y results that they have?
    It seems that you haven't checked the project results pages for a long time - https://www.familytreedna.com/public...ction=yresults. We've incorporated new subclades, which are shared by two or more members, found in their Big Y results, as the members provide us with their bed and vcf files from Big Y. Those files are the raw data which are analyzed by the spreadsheet one of the project members developed to weed out false "novel variants" that FTDNA doesn't weed out. This is what wkauffman wrote about in his post, which you responded to.

    Originally posted by 1798
    I don't see how the Big Y can be a success at the current price. I don't see many success stories at all from the Big-Y at present.
    Read my post recently in another thread at http://forums.familytreedna.com/show...2&postcount=12. I wrote there what I regard as a success story: "I used FGC for analysis of my Big Y BAM file. I was given a list, based on their analysis, of my best quality novel variants, which they named - FGC13480-FGC13492. I then had 12 of the 13 made testable at YSEQ, which one of my semi-close matches at FTDNA tested. (We're an 83/111 match. My estimate of when our common ancestor lived is 1,200-1,500 years ago.) He was found to be FGC13492+, forming a new subclade of R-CTS2509." That new subclade is reflected in the project results page and our U106 haplotree.



  • 1798
    replied
    Originally posted by wkauffman
    The bottom line is that one should not rely on what FTDNA supplies in terms of Big-Y matches and information. The only viable information from the match list is whether there might be a new result in your general haplogroup region. FTDNA is comparing Big-Y raw results against about 40% of the known Y-SNPs. FTDNA does not have an up-to-date internal tree to correctly identify and place called SNPs where they belong. This is where specific haplogroup analysis efforts and/or FGC/YFULL analysis provides the real answer. The U106 project admins visited FTDNA last fall to specifically show FTDNA IT how our comparison was done to remove the upstream SNPs and to properly identify the inconsistent ones that are provided as part of the "novel" results. They got the picture that they had missed the boat on a number of items related to properly analyzing and comparing Big-Y files. But so far no change has occurred in what they are providing to us, the customers.
    Why has the U106 project not updated their SNP tree from all of the Big Y results that they have? I don't see how the Big Y can be a success at the current price. I don't see many success stories at all from the Big-Y at present.



  • T E Peterman
    replied
    The only thing useful I've gotten from Family Tree DNA is the BAM file & the list of novel variants.

    Send the BAM file to the project admins & also to Yfull. You will be helping to build a better tree.

    The matches are simply those who differ on 4 or fewer known SNPs. In some cases, an important SNP may not have read & thus isn't included. But in many cases the closest are those who share a MRCA maybe 1,500 years or more in the past; sometimes as far back as 4,000 years in the past. As someone else said, you can limit by terminal SNP.

    Timothy Peterman



  • wkauffman
    replied
    The bottom line is that one should not rely on what FTDNA supplies in terms of Big-Y matches and information. The only viable information from the match list is whether there might be a new result in your general haplogroup region. FTDNA is comparing Big-Y raw results against about 40% of the known Y-SNPs. FTDNA does not have an up-to-date internal tree to correctly identify and place called SNPs where they belong. This is where specific haplogroup analysis efforts and/or FGC/YFULL analysis provides the real answer. The U106 project admins visited FTDNA last fall to specifically show FTDNA IT how our comparison was done to remove the upstream SNPs and to properly identify the inconsistent ones that are provided as part of the "novel" results. They got the picture that they had missed the boat on a number of items related to properly analyzing and comparing Big-Y files. But so far no change has occurred in what they are providing to us, the customers.



  • dna
    replied
    Originally posted by rrtipton1
    I am by no means an expert on this topic, but it has been my understanding that your BigY Match list is based on your set of Known SNPs. Anybody that matches you with no more than four differences is considered to be a match. The match list is normally sorted by the number of Known SNP Differences, and the actual Non-Matching Known SNPs are shown in the next column. [----]
    The same here, I am not an expert, and had an understanding like yours until last night... Now I have examples that do not fit the above rule...

