I want to compare two autosomal raw data files to see how many cM they share. My question is: how does each row of data (rsid, chromosome, position, genotype) relate to cM? Or how do positions relate to cM? Forgive me if my question is stupid, but I am no geneticist.
Comparing FamilyFinder raw data files

The cM values have to be interpolated. A run of matching alleles, say 500 or 700, forms a shared segment. The segment is measured in base pairs (Ancestry) or cM (FTDNA and 23andMe). For cM, the segment's start and end positions are taken together with the chromosome and the build, and the length is then interpolated to a cM value.
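To make the interpolation step concrete, here is a minimal sketch. It assumes you already have a genetic map for the chromosome as a list of (base-pair position, cumulative cM) pairs; the `GENETIC_MAP` values and the function names below are made up for illustration and are not from any company's actual pipeline.

```python
from bisect import bisect_right

# Hypothetical genetic map for one chromosome: (position in bp, cumulative cM).
# The numbers are invented for illustration. A real map (for example one of
# the Rutgers maps) must be on the same build as the raw data.
GENETIC_MAP = [
    (1_000_000, 0.0),
    (2_000_000, 1.5),
    (5_000_000, 3.0),
    (6_000_000, 6.0),
]


def bp_to_cm(position, genetic_map=GENETIC_MAP):
    """Linearly interpolate a cumulative cM value for a base-pair position."""
    positions = [bp for bp, _ in genetic_map]
    # Clamp positions outside the map to its first or last marker.
    if position <= positions[0]:
        return genetic_map[0][1]
    if position >= positions[-1]:
        return genetic_map[-1][1]
    i = bisect_right(positions, position)
    (bp_lo, cm_lo), (bp_hi, cm_hi) = genetic_map[i - 1], genetic_map[i]
    fraction = (position - bp_lo) / (bp_hi - bp_lo)
    return cm_lo + fraction * (cm_hi - cm_lo)


def segment_length_cm(start_bp, end_bp, genetic_map=GENETIC_MAP):
    """Segment length in cM: difference of the two interpolated endpoints."""
    return bp_to_cm(end_bp, genetic_map) - bp_to_cm(start_bp, genetic_map)
```

With this toy map, `segment_length_cm(1_500_000, 5_500_000)` gives 3.75 cM for a 4 Mb segment, which shows why the same number of base pairs can correspond to different cM values in different regions.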
Matt.
Last edited by mkdexter; 5th July 2013, 02:38 AM.

Originally posted by felix: I want to compare two autosomal raw data files to see how many cM they share. My question is: how does each row of data (rsid, chromosome, position, genotype) relate to cM? Or how do positions relate to cM? Forgive me if my question is stupid, but I am no geneticist.
http://www.openbioinformatics.org/pe...ss_hg19.pfb.gz
You can also look up individual items here. The output you want is "sex average map position."
http://compgen.rutgers.edu/RutgersMap/MapBrowser.aspx

Thanks everyone. After getting my head around it, I think I have the idea, but please correct me if my understanding of the logic is wrong.
Let's take Chr 1 as an example:
Since a SNP is a variation, if the two raw data files have the same genotype at a SNP, then the distance between the physical positions of the differing SNPs on either side gives me a single segment, whose length is the number of shared base pairs. Dividing this base-pair count by a million would give me a value in cM. If the threshold is 5 cM, then there has to be a shared segment of 5 million+ base pairs.
Is my understanding correct?

Originally posted by felix: Thanks everyone. After getting my head around it, I think I have the idea, but please correct me if my understanding of the logic is wrong.
Let's take Chr 1 as an example:
Since a SNP is a variation, if the two raw data files have the same genotype at a SNP, then the distance between the physical positions of the differing SNPs on either side gives me a single segment, whose length is the number of shared base pairs. Dividing this base-pair count by a million would give me a value in cM. If the threshold is 5 cM, then there has to be a shared segment of 5 million+ base pairs.
Is my understanding correct?
If using FTDNA data, chromosome 1 has 246 Mb, 267 cM and 59,444 SNPs. Note that in the FTDNA test 246 Mb does not equal 246 cM; it equals 267 cM. Also note that the FTDNA data is build 36, while the other companies use build 37, so positions differ slightly.
You can't say exactly that 1 cM = 1 Mb, because the cM value is an interpolated number that accounts for the start and stop positions of the segment, recombination, and linkage.
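As a quick check of the chromosome 1 numbers above, the arithmetic can be sketched like this. It only gives a chromosome-wide average; the constant and function names are just for illustration, and the local rate in any given region can be well above or below this average.

```python
# Chromosome 1 figures quoted above for the FTDNA test (build 36).
CHR1_MB = 246
CHR1_CM = 267

# Chromosome-wide average rate in cM per Mb.
avg_cm_per_mb = CHR1_CM / CHR1_MB


def average_cm(length_mb):
    """Rough cM estimate from the chromosome-wide average only."""
    return length_mb * avg_cm_per_mb


print(round(avg_cm_per_mb, 3))  # 1.085
print(round(average_cm(5), 2))  # 5.43
```

So on average a 5 Mb stretch of chromosome 1 is a bit more than 5 cM, but the only reliable way to get the cM for a specific segment is to interpolate its endpoints on a map.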
See the Rutgers maps @ http://compgen.rutgers.edu/maps
Matt.
Last edited by mkdexter; 8th July 2013, 02:54 AM.

Originally posted by felix: Thanks everyone. After getting my head around it, I think I have the idea, but please correct me if my understanding of the logic is wrong.
Let's take Chr 1 as an example:
Since a SNP is a variation, if the two raw data files have the same genotype at a SNP, then the distance between the physical positions of the differing SNPs on either side gives me a single segment, whose length is the number of shared base pairs. Dividing this base-pair count by a million would give me a value in cM. If the threshold is 5 cM, then there has to be a shared segment of 5 million+ base pairs.
Is my understanding correct?
The Mb unit is a physical measure: the position of a SNP along the chromosome.
The cM unit is a probabilistic measure: how frequently SNPs at two different locations will be broken up by recombination. It's an empirical number, based on observations in a particular dataset. It differs slightly from experiment to experiment, and differs dramatically between males and females.
I think of the cM as being the "effective" distance between two markers. An analogy might be measuring off a mile on a map. That's the physical distance. How long does it take a person to get from point A to point B? It depends on the terrain. The "effective" distance is different for a smooth trail vs an obstacle course.
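To put a number on the "how frequently" part, here is a small sketch using Haldane's map function, one standard textbook way to turn a map distance into a recombination probability. It assumes crossovers occur independently, which is a simplification, and the function name is just for illustration.

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Probability that two loci `distance_cm` apart are separated by
    recombination in a single meiosis, under Haldane's map function."""
    morgans = distance_cm / 100.0  # 1 Morgan = 100 cM
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


# At short distances, 1 cM is close to a 1% chance of recombination.
print(round(haldane_recombination_fraction(1), 4))  # 0.0099
```

For long distances the probability levels off towards 50%, which is why cM values and recombination percentages stop being interchangeable for widely separated markers.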
CeCe Moore has some examples of how cM and Mb compare:
http://www.yourgeneticgenealogist.co...tionsand.html

Yes! I finally made it. It was a good learning experience.
It seems there is also an error radius within which non-matching SNPs are allowed. I was able to understand fully how matches are made from the JavaScript source code made available by David Pike in his tool. The example in the article "Ancestry DNA – Why Low Confidence Matches Matter" helped a lot in understanding how to convert Mb to cM.
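For anyone following along, here is a rough sketch of the matching idea with a mismatch allowance. It is not David Pike's code or any company's algorithm; it assumes unphased genotypes as two-letter strings, treats no-calls as compatible, and simply counts mismatches per run instead of using a real error radius. The names and default thresholds are made up for a toy example (real tools use runs of hundreds of SNPs).

```python
def half_identical(geno_a, geno_b):
    """True if two unphased genotypes such as 'AG' and 'GG' share an allele."""
    if "-" in geno_a or "-" in geno_b:  # assumption: no-calls never break a run
        return True
    return bool(set(geno_a) & set(geno_b))


def find_segments(snps, min_snps=3, max_mismatches=1):
    """Find runs of half-identical SNPs on one chromosome.

    snps: list of (position, genotype_a, genotype_b), sorted by position.
    Returns a list of (start_position, end_position, matching_snp_count).
    """
    segments = []
    start = end = None  # first and last matching position of the open run
    matched = 0         # matching SNPs in the open run
    mismatches = 0      # mismatches tolerated since the run started

    for position, geno_a, geno_b in snps:
        if half_identical(geno_a, geno_b):
            if start is None:
                start = position
            end = position
            matched += 1
            continue
        if start is None:
            continue  # mismatch outside any run: nothing to close
        mismatches += 1
        if mismatches > max_mismatches:  # too many mismatches: close the run
            if matched >= min_snps:
                segments.append((start, end, matched))
            start = end = None
            matched = mismatches = 0

    if start is not None and matched >= min_snps:
        segments.append((start, end, matched))
    return segments


example = [
    (100, "AA", "AG"), (200, "CC", "CT"), (300, "GG", "AA"),
    (400, "TT", "TT"), (500, "AG", "GG"), (600, "CC", "TT"),
    (700, "CC", "TT"), (800, "AA", "AA"),
]
print(find_segments(example))  # [(100, 500, 4)]
```

The start and end positions of each segment are what you would then interpolate to cM on a genetic map before applying a threshold such as 5 cM.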
Download: Autosomal Compare