Big Y: 9203590 Reads VS 18638834 Reads


    Hello, my kit number is 213004. I was among the first Big-Y participants to receive results, on February 28th.
    Since then I requested the BAM file and had it analysed by the YFull team. Strangely, after project admins informed me about the varying sizes of the incoming raw BAM files, it turned out that my file was already considered the smallest among the haplogroup R1a samples (20-25% smaller in size and reads).

    After seeing Felix's blog post about his experience with BAM file interpretation at YFull, I was amazed to find out how big the difference in the number of reads between our two kits is.

    I also posted this in various Facebook interest groups and was advised to bring the topic to this forum:

    ...Felix's Big-Y BAM file: 0.88 GB, all reads 18638834, private SNPs 126, best quality 90. My Big-Y BAM file: 0.43 GB, all reads 9203590, private SNPs 42, best quality 16 (+1 INDEL). How can this difference of 100% in coverage be explained?

    As the numbers clearly show, Felix's BAM file is slightly more than twice as large as mine, his "all reads" count is also slightly more than twice mine, and his number of best-quality private SNPs is well over twice mine (90 vs. 16)!
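    The ratios can be checked with a few lines of Python. This is only a rough sketch: the dictionary keys and the `ratio` helper are names made up for illustration, and the figures are the ones from the two YFull reports quoted above.

```python
# Figures copied from the two YFull statistics quoted in this post.
felix = {"bam_gb": 0.88, "all_reads": 18638834, "private_snps": 126, "best_quality": 90}
mine = {"bam_gb": 0.43, "all_reads": 9203590, "private_snps": 42, "best_quality": 16}


def ratio(key):
    """How many times larger Felix's value is than mine (hypothetical helper)."""
    return felix[key] / mine[key]


# Print one ratio per statistic.
for key in felix:
    print(f"{key}: {ratio(key):.2f}x")
```

    File size and read count both come out at roughly 2x, private SNPs at 3x, and best-quality private SNPs at more than 5x.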

    I am attaching screenshots of the statistics from my YFull report and of those from Felix's blog post.

    Note #1: My Big-Y test was run on a sample that was more than two years old.

    It is important that official FTDNA lab staff look into this and, if possible, explain this odd and large discrepancy in total read counts.
    Last edited by Borowski; 30 March 2014, 11:30 AM.

  • #2
    Have you emailed FTDNA? If so, what did they say?


    • #3
      Case Solution and Conclusion

      All right, I would like to update the information regarding my question. After a long conversation with Vladimir Semargl and also with David Mittelman, both of whom looked deeper into the matter, I can say that the difference in the number of reads and in file size is due to a difference in read length. In the case of my sample, fewer reads of the same regions were required to reach the same total outcome. I received similar explanations independently from both Vladimir and David. So it seems that the varying total read counts between samples have no effect on quality or coverage range. I would also like to apologize if my question put anyone in an inconvenient position; in my opinion, verifying this matter was for the benefit of all sides.
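      To illustrate the read-length explanation: the total number of sequenced bases is roughly the read count times the mean read length, so a kit with fewer but longer reads can end up with a similar total. The sketch below is my own illustration under stated assumptions, not FTDNA's or YFull's method. `total_bases` is a hypothetical helper, and the 100 bp mean read length is a placeholder, since neither kit's actual read length is given in this thread.

```python
READS_MINE = 9203590    # "all reads" from my YFull report
READS_FELIX = 18638834  # "all reads" from Felix's blog post


def total_bases(reads, mean_read_length):
    """Approximate sequenced bases: read count times mean read length."""
    return reads * mean_read_length


# ASSUMPTION: placeholder mean read length for the larger file, not a reported value.
ASSUMED_LEN_FELIX = 100.0

# Mean read length my kit would need in order to yield the same number of bases.
required_len_mine = total_bases(READS_FELIX, ASSUMED_LEN_FELIX) / READS_MINE

print(f"read length ratio needed: {required_len_mine / ASSUMED_LEN_FELIX:.3f}")
```

      Whatever placeholder length is used, the ratio depends only on the two read counts: my reads would need to be about twice as long on average for both kits to contain the same number of bases.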


      • #4
        Thanks for the follow up. Much appreciated.