The inconsistencies of the BigY test - How to Manage?

  • #16
    I don't know what YFull does but I assume it is all software reading the file(s). There is no reason that the whole thing cannot be automated. Any "judgement" calls must follow some rules which can be added to the software algorithms.

    If humans are reviewing the results it is only because everything is new and there may be mistakes or incomplete algorithms. Eventually humans will not be needed except to add bells and whistles to the software. An open source project would be useful so 100s/1000s can contribute to various aspects of the program(s).


    • #17
      As I understand the dynamic nature of calls on some SNPs, it has to do with the original reference sample. Some of the reference "base" calls were actually mutations. It will take time and a larger population of samples to determine the true reference base for all locations. You may be positive on a location but that means you are just like almost everyone else.


      • #18
        Great feedback here.

        Igmayka - most of the issues are with Novel SNPs not being indicated as positive due to the low quality scan method... where after review of BAM files those locations do seem to indicate a positive call. Then there are also no-calls of known SNPs. I belong to a project where a small group of us have BigY results that are logically impossible to reconcile without accepting that there is erroneous data (actually a fair amount of erroneous data). I am sure there are explanations for all of it... but I would prefer good results & reports over excuses for mediocrity. We are at the mercy of the project administrators (who are more qualified and knowledgeable but are spread very thin) to try to sort out all the inconsistencies... so I am just looking for a way to improve my BigY output so that we laymen can better participate in our personal goals of establishing Y tree branches downstream of what is on the radar today. The point is, why sell a test if the output can't be interpreted without expert assistance? I often feel that I am financing others' research - and that's ok provided I also get good output. In other words, could we please make these products commercial/consumer grade... OR include expert analysis.

        The comment about YFULL being driven by researchers just looking for data seems to be a practical explanation. But if that is the case, I would appreciate such a disclosure. "For a small fee, we offer insight in exchange for your test data." Maybe I'll ask them! Anyway, thanks for the introduction to YFULL, I think I'll give them a go.


        • #19
          The first month YFull did the analysis for free, so it sure seems like they are into research.

          It would of course be better if FTDNA could give us "the whole shebang", but the test without the deep analysis is better than no test. Getting expert knowledge into a program takes time, and the matching tool is at least one step in that direction from FTDNA.

          Saying that they can't sell a test without deep analysis is like a person without a driver's license saying that you shouldn't sell cars without a driver.


          • #20
            High-quality call non-existent

            Looking at the results, VCF, and BAM file for kit xxxxx, FTDNA has called position 22229242 (G->C) as a high quality variant in the kit results.

            The VCF indicates that there is a SNP present but with a lower quality reading.

            In the BAM file there is NO G->C SNP present in any of the reads in this region. This kit does not have that SNP.

            This seems to be a flub in calling, or else they have referential integrity problems in their internal databases.
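
For anyone who wants to repeat this kind of cross-check, here is a minimal sketch in Python. It assumes the bases covering the position have already been pulled out of the BAM file by some other tool; the function name, the threshold, and the example reads are hypothetical and are not taken from any FTDNA or YFull software.

```python
from collections import Counter

def check_call_against_reads(read_bases, alt, min_alt_reads=1):
    """Say whether the reads at one position support a reported variant.

    read_bases: bases observed at the position, one per read (assumed
                to be extracted from the BAM beforehand).
    alt: the variant allele the report claims, e.g. "C" for G->C.
    min_alt_reads: hypothetical threshold for calling it supported.
    """
    if not read_bases:
        return "no coverage"
    counts = Counter(base.upper() for base in read_bases)
    if counts[alt.upper()] >= min_alt_reads:
        return "supported"
    return "not supported"

# Hypothetical reads for illustration: every read shows the reference G,
# so a reported G->C variant has no support in the reads.
print(check_call_against_reads(["G"] * 8, alt="C"))
```

A report that says "high quality variant" while a check like this says "not supported" is exactly the kind of disagreement described in this post.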


            • #21
              Kit # 34795
              FTDNA reported a ? (no call) for Z382.

              The YFull report (YFO1850):
              Search in BAM file
              ChrY position: 3680849 (+strand)
              Reads: 5
              Position data: 5C
              Weight for C: 1.0
              Probability of error: 0.0 (0<->1)
              Sample allele: C
              Reference (hg19) allele: G
              Known SNPs at this position: Z382 (G->C)
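
The fields in a report like this can be approximated with a simple tally of the bases seen at the position. The sketch below is only an illustration: it assumes "Weight" is the fraction of reads showing the most common allele, which is a guess on my part, since the thread does not say how YFull actually computes it. The function and key names are hypothetical.

```python
from collections import Counter

def summarize_position(read_bases, reference):
    """Tally the bases at one position into report-style fields.

    The weight here is a plain fraction of reads (an assumption, not
    YFull's documented method).
    """
    counts = Counter(base.upper() for base in read_bases)
    total = sum(counts.values())
    if total == 0:
        return None  # no reads cover the position
    allele, allele_reads = counts.most_common(1)[0]
    return {
        "reads": total,
        # e.g. "5C", or "4C 1G" when the reads disagree
        "position_data": " ".join(f"{n}{b}" for b, n in counts.most_common()),
        "sample_allele": allele,
        "weight": allele_reads / total,
        "differs_from_reference": allele != reference.upper(),
    }

# Five reads, all C, against the hg19 reference G, as in the Z382 report.
print(summarize_position(["C"] * 5, reference="G"))
```

With five concordant C reads this gives 5 reads, position data "5C", sample allele C, and a weight of 1.0, which is consistent with reading the no call for Z382 as a positive.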