
Comparing FamilyFinder raw data files

  • Comparing FamilyFinder raw data files

    I want to compare two autosomal raw data files to see how many cM they share. My question is: how does each row of data (rsid, chromosome, position, genotype) relate to cM? Or, how do positions relate to cM? Forgive me if my question is a stupid one, but I am no geneticist.
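For anyone following along, here is a minimal sketch of reading such a file into memory before any comparison. The column order (rsid, chromosome, position, genotype) comes from the post above; the comma-separated layout with quoted fields and a header row starting with `RSID` is an assumption, and `read_raw_data` is a hypothetical helper name, so adjust it to whatever your own file looks like.

```python
import csv
import io


def read_raw_data(handle):
    """Return {rsid: (chromosome, position, genotype)} from a raw-data CSV."""
    snps = {}
    for row in csv.reader(handle):
        # Skip blank lines, comment lines, and the header row (assumed layout).
        if len(row) < 4 or row[0].startswith("#") or row[0].upper() == "RSID":
            continue
        rsid, chrom, pos, genotype = (field.strip() for field in row[:4])
        snps[rsid] = (chrom, int(pos), genotype)
    return snps


# Made-up rows for illustration only, not real genotype data.
sample = io.StringIO(
    'RSID,CHROMOSOME,POSITION,RESULT\n'
    '"rs0000001","1","1000000","AA"\n'
    '"rs0000002","1","1500000","AG"\n'
)
snps = read_raw_data(sample)
```

Keying the dictionary by rsid makes it easy to find the SNPs that two files have in common, which is the starting point for any comparison.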

  • #2
    The cM values have to be interpolated. A run of matching alleles, say 500 or 700, forms a shared segment. The segment is measured in base pairs (Ancestry) or in cM (FTDNA and 23andMe). For cM, the segment's start and end positions, the chromosome, and the genome build are taken into account, and the length is then interpolated to a cM value.

    Matt.
    Last edited by mkdexter; 5th July 2013, 01:38 AM.


    • #3
      Originally posted by felix
      I want to compare two autosomal raw data files to see how many cM they share. My question is: how does each row of data (rsid, chromosome, position, genotype) relate to cM? Or, how do positions relate to cM? Forgive me if my question is a stupid one, but I am no geneticist.
      The physical base positions can be converted to cM with a look-up table. I don't recall seeing a table explicitly for FTDNA SNPs, but it would probably be similar to the data in this file:

      http://www.openbioinformatics.org/pe...ss_hg19.pfb.gz

      You can also look up individual items here. The output you want is "sex average map position."

      http://compgen.rutgers.edu/RutgersMap/MapBrowser.aspx
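The look-up idea can be sketched as follows: find the two map entries on either side of a physical position and interpolate linearly between their cM values. The map values below are made up for illustration (real ones would come from a published genetic map), and `GENETIC_MAP`, `position_to_cm` and `segment_length_cm` are hypothetical names, not part of any existing tool.

```python
from bisect import bisect_right

# Hypothetical map for one chromosome: (base-pair position, cM), sorted by position.
# These numbers are invented for the example.
GENETIC_MAP = [(1_000_000, 0.0), (2_000_000, 1.5), (4_000_000, 2.0), (5_000_000, 4.0)]


def position_to_cm(position, genetic_map=GENETIC_MAP):
    """Linearly interpolate a cM value for a physical position."""
    positions = [bp for bp, _ in genetic_map]
    # Clamp positions that fall outside the mapped range.
    if position <= positions[0]:
        return genetic_map[0][1]
    if position >= positions[-1]:
        return genetic_map[-1][1]
    i = bisect_right(positions, position)
    (bp0, cm0), (bp1, cm1) = genetic_map[i - 1], genetic_map[i]
    return cm0 + (cm1 - cm0) * (position - bp0) / (bp1 - bp0)


def segment_length_cm(start, end, genetic_map=GENETIC_MAP):
    """Length of a segment in cM: difference of the interpolated end points."""
    return position_to_cm(end, genetic_map) - position_to_cm(start, genetic_map)
```

Note that in this toy map the stretch from 2 Mb to 4 Mb spans only 0.5 cM while the stretch from 4 Mb to 5 Mb spans 2 cM, which is why the same number of base pairs can give quite different cM values.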


      • #4
        Thanks, everyone. After getting my head around it, I think I've got the idea, but please correct me if my understanding of the logic is wrong.

        Let's take Chr 1 as example:
        • Chromosome 1 has 249 million base pairs. (ref)
        • 1 cM is equal to around 1 million base pairs. (ref)


        Since a SNP is a change, if both raw data files have the same genotype at a SNP, that SNP matches. The distance between the physical positions of the differing SNPs on either side of a run of matches gives the length of a single shared segment in base pairs. Dividing that number of base pairs by a million would give a value in cM. So if the threshold is 5 cM, there has to be a shared segment of more than 5 million base pairs.

        Is my understanding correct?


        • #5
          Originally posted by felix
          Thanks, everyone. After getting my head around it, I think I've got the idea, but please correct me if my understanding of the logic is wrong.

          Let's take Chr 1 as example:
          • Chromosome 1 has 249 million base pairs. (ref)
          • 1 cM is equal to around 1 million base pairs. (ref)


          Since a SNP is a change, if both raw data files have the same genotype at a SNP, that SNP matches. The distance between the physical positions of the differing SNPs on either side of a run of matches gives the length of a single shared segment in base pairs. Dividing that number of base pairs by a million would give a value in cM. So if the threshold is 5 cM, there has to be a shared segment of more than 5 million base pairs.

          Is my understanding correct?
          No.

          If using FTDNA data, chromosome 1 has 246 Mb, 267 cM and 59,444 SNPs. Note that in the FTDNA test 246 Mb does not equal 246 cM; it equals 267 cM. Also note that the FTDNA data is build 36, while the other companies use build 37, so the positions differ slightly.

          You can't say exactly that 1 cM = 1 Mb, because the cM value is an interpolated number that accounts for the start and stop positions of the segment, recombination and linkage.

          See the Rutgers maps @ http://compgen.rutgers.edu/maps
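To put numbers on that, here is a small worked calculation using only the chromosome 1 figures quoted above (246 Mb, 267 cM). It gives a chromosome-wide average and nothing more; the rate varies along the chromosome, so the average should not be used to convert an individual segment.

```python
# Figures quoted in this post for chromosome 1 in the FTDNA data.
CHR1_MB = 246
CHR1_CM = 267

# Chromosome-wide average rate, in cM per Mb.
avg_cm_per_mb = CHR1_CM / CHR1_MB

# Physical length that 5 cM would cover IF the rate were uniform (it is not).
mb_for_5_cm = 5 / avg_cm_per_mb

print(f"average: {avg_cm_per_mb:.3f} cM/Mb")                    # average: 1.085 cM/Mb
print(f"5 cM at the average rate: {mb_for_5_cm:.2f} Mb")        # 5 cM at the average rate: 4.61 Mb
```

So even on average, 5 cM on chromosome 1 corresponds to roughly 4.6 Mb rather than 5 Mb, and any particular 5 cM segment may be physically shorter or longer than that.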


          Matt.
          Last edited by mkdexter; 8th July 2013, 01:54 AM.


          • #6
            Originally posted by felix
            Thanks, everyone. After getting my head around it, I think I've got the idea, but please correct me if my understanding of the logic is wrong.

            Let's take Chr 1 as example:
            • Chromosome 1 has 249 million base pairs. (ref)
            • 1 cM is equal to around 1 million base pairs. (ref)


            Since a SNP is a change, if both raw data files have the same genotype at a SNP, that SNP matches. The distance between the physical positions of the differing SNPs on either side of a run of matches gives the length of a single shared segment in base pairs. Dividing that number of base pairs by a million would give a value in cM. So if the threshold is 5 cM, there has to be a shared segment of more than 5 million base pairs.

            Is my understanding correct?
            No, you can't convert from Mb to cM by a calculation. You have to look up experimental data.

            The Mb unit is a physical measure -- the position of a SNP along the chromosome.

            The cM unit is a probabilistic measure -- how frequently will SNPs at two different locations be broken up by recombination. It's an empirical number, based on observations in a particular dataset. It differs slightly from experiment to experiment, and differs dramatically between males and females.

            I think of the cM as being the "effective" distance between two markers. An analogy might be measuring off a mile on a map. That's the physical distance. How long does it take a person to get from point A to point B? It depends on the terrain. The "effective" distance is different for a smooth trail vs an obstacle course.
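One way to see the "probabilistic" nature of the unit is to turn a map distance into a recombination probability. The sketch below uses Haldane's map function, which is a standard textbook model that assumes crossovers occur independently; it is offered as an illustration of the concept, not as what any testing company uses, and the function name is just a label chosen here.

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Probability that two markers are separated by recombination in one
    generation, under Haldane's model (independent crossovers)."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


for cm in (1, 5, 50, 200):
    print(cm, "cM ->", round(haldane_recombination_fraction(cm), 4))
```

For short distances the result is close to 1% per cM, and for very long distances it levels off at 50%, which is the value for markers that are inherited independently.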

            CeCe Moore has some examples of how cM and Mb compare:

            http://www.yourgeneticgenealogist.co...tions-and.html


            • #7
              Yes! I finally made it. It was a good learning experience.

              It seems there is also an error radius within which non-matching SNPs are allowed. I was able to understand fully how matches are made from the JavaScript source code that David Pike made available in his tool. The example in this article, Ancesty DNA – Why Low Confidence Matches Matter, helped a lot in understanding how to convert Mb to cM.
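For readers who want to see the general shape of such a comparison, here is a simplified sketch. It is not David Pike's code and not the code in the download below. It assumes each file has been loaded as a dictionary `{rsid: (chromosome, position, genotype)}`, treats two genotypes as matching when they share at least one allele, counts no-calls (written here as `-`) as matches, and tolerates an isolated mismatch when enough matching SNPs precede it. The names and the default thresholds `min_snps` and `mismatch_gap` are hypothetical.

```python
def half_identical(g1, g2):
    """True if two unphased genotypes share at least one allele.
    No-calls ('-') are treated as matches (an assumption of this sketch)."""
    if "-" in g1 or "-" in g2:
        return True
    return bool(set(g1) & set(g2))


def shared_segments(snps_a, snps_b, chrom, min_snps=500, mismatch_gap=150):
    """Return [(start_pos, end_pos, matching_snp_count)] for one chromosome."""
    # SNPs present in both files on this chromosome, ordered by position.
    common = sorted(
        (snps_a[rsid][1], snps_a[rsid][2], snps_b[rsid][2])
        for rsid in snps_a.keys() & snps_b.keys()
        if snps_a[rsid][0] == chrom
    )
    segments = []
    start = end = None
    count = 0            # matching SNPs in the current run
    since_mismatch = 0   # matching SNPs since run start or last tolerated mismatch
    for pos, geno_a, geno_b in common:
        if half_identical(geno_a, geno_b):
            if start is None:
                start, count, since_mismatch = pos, 0, 0
            end = pos
            count += 1
            since_mismatch += 1
        elif start is not None and since_mismatch >= mismatch_gap:
            since_mismatch = 0  # tolerate an isolated mismatch, keep the run going
        else:
            if start is not None and count >= min_snps:
                segments.append((start, end, count))
            start = None
    if start is not None and count >= min_snps:
        segments.append((start, end, count))
    return segments


# Tiny made-up example: ten SNPs, one mismatch at position 600.
a = {f"rs{i}": ("1", i * 100, "AA") for i in range(1, 11)}
b = {f"rs{i}": ("1", i * 100, "GG" if i == 6 else "AG") for i in range(1, 11)}
print(shared_segments(a, b, "1", min_snps=3, mismatch_gap=4))
print(shared_segments(a, b, "1", min_snps=3, mismatch_gap=6))
```

With the looser setting the mismatch is absorbed and one long segment is reported; with the stricter setting the run is split in two. The start and end positions of each segment would then be converted to cM with a genetic map before applying a cM threshold.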

              Download: Autosomal Compare


              • #8
                close......

                Great start but there are some buglets to squash... e-mail me dna_wayne@yahoo.com


                • #9
                  Originally posted by wkauffman View Post
                  Great start but there are some buglets to squash... e-mail me dna_wayne@yahoo.com
                  Sure. I will fix them. Sending you an email about the details of the bug.


                  • #10
                    Just came back to say that all known bugs are fixed.

                    Your feedback and suggestions for improvements are most welcome.
