Zero step mtDNA match


  • #31
    Hi MrsB,
    thanks for your reply. Just for the record I did share the full maternal line, including names, dates and locations with the adoptee match as soon as he contacted me, along with an offer of any further assistance that I can give. I've also sent him a link to this thread. It would make me very happy if we could find the common ancestor......but for the moment she remains elusive.

    Comment


    • #32
      Originally posted by felix View Post
      No, no, I wasn't teasing but trying to explain in a different way. Me and my wife truly match at 0-steps in FMS. I was trying to say that people matching each other at 0 steps in FMS can belong to a haplogroup branch which is not necessarily a terminal one. For example, before 2010 my placement on the tree would have been just U, with "no additional mutations" mentioned in the tree.
      OK. I will try to find time later to walk through what I think Gaye is trying to say.

      BTW, the FAQ date has a confidence interval. Getting to 100% takes many, many more generations than getting to 95% or even 99%. As we Americans say, careful, the last step is a doozy.

      Comment


      • #33
        I actually do enjoy kidding - I have enough stress in my day job, so I'm here for fun. But when I get into major number crunching mode I probably seem too serious. So here comes another long post (following this one) with more numbers, but my conclusion is that people should always contact their FMS matches, regardless of the estimates in the FAQ, and if you don't have any 0-step matches, it could be worthwhile contacting 1, 2 or 3 step matches. If you don't have ANY FMS matches at all, and just a few HVR matches, it is also worthwhile contacting your HVR matches.

        I think there is a greatly increasing potential for mtDNA as the size of the database expands, but many people are not exploring the full potential. It would be great if we can find ways to encourage more people to share their maternal ancestry, contact their matches, and upgrade to the FMS.

        Comment


        • #34
          Originally posted by felix View Post
          JQ705953 is H2c1 because it has the defining marker C152T!!. No one can be of just plain H haplogroup and have all the mutations (approx. 55 mutations) differing from RSRS mapped in the path. There are several mutations for JQ705953 that do not exist in the path.
          Actually, JQ705953 is plain H with no extra mutations. Here is the rCRS sequence for JQ705953: A263G, 315+C, A750G, A1438G, A4769G, A8860G, A15326G, C16519T

          The rCRS sequence for H2c1 is: G1438A, G13708A, C13934T, C152T!!, C194T, G205A, A3334G, so there are several differences.

          The RSRS sequence for JQ705953 is: 73, 146, 152, 195, 247, 769, 825T, 1018, 2706, 2758, 2885, 3594, 4104, 4312, 7028, 7146, 7256, 7521, 8468, 8655, 8701, 9540, 10398, 10664, 10688, 10810, 10873, 10915, 11719, 11914, 12705, 13105, 13276, 13506, 13650, 14766, 16129, 16187, 16189, 16223, 16230, 16278, 16311

          JQ705953 has only 43 mutations relative to the RSRS (excluding 315+C which is usually ignored).

          This is consistent with the findings of Behar et al. 2012, and they discussed this in the section on "Violations of the Molecular Clock" that I referred to earlier, in which they give some statistics on distance from the RSRS:

          The mean distance is 57.1 substitutions, the median is 56 and the empirical standard deviation is 5.9. Widely different distances ranging from 41 substitutions in some L0d1a1 mitogenomes to 77 in some L2b1a mitogenomes are observed.
          So we have several consistent lines of evidence that all point to the same conclusion: the slow mutation rate of about 1 in 3600 years; high standard deviation in number of accumulated mutations reported by Behar et al.; some living people who match exactly with ancient samples that date to more than 5000 years ago (and in some cases, living people with fewer mutations, as in the 7000 year old La Brana sample); and large numbers of people who are at fairly old nodes in the Phylotree, for example, 36 people who are U5a1a1 (with an estimated age of about 6700 years) who have no extra mutations and who all match each other exactly.

          But I think the key point is that the rate at which mutations accumulate is highly variable (standard deviation of 5.9), so in some cases people could share a common maternal ancestor more recently than suggested in the FAQ, and in other cases, the common ancestor is more distant - it is difficult to create a table that is accurate for everyone.
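To put the 43 mutations in the context of those statistics, here is a rough Python sketch (not from Behar et al.; the variable names are purely illustrative) that counts the RSRS differences listed above for JQ705953 and expresses the count in standard deviations from the quoted mean of 57.1:

```python
# Differences from the RSRS for JQ705953, copied from the list above
# (315+C is excluded, as it is usually ignored).
rsrs_differences = (
    "73, 146, 152, 195, 247, 769, 825T, 1018, 2706, 2758, 2885, 3594, "
    "4104, 4312, 7028, 7146, 7256, 7521, 8468, 8655, 8701, 9540, 10398, "
    "10664, 10688, 10810, 10873, 10915, 11719, 11914, 12705, 13105, 13276, "
    "13506, 13650, 14766, 16129, 16187, 16189, 16223, 16230, 16278, 16311"
).split(", ")

n_mutations = len(rsrs_differences)

# Summary statistics quoted above from Behar et al. 2012.
mean_distance = 57.1
std_dev = 5.9

# Standardised distance: how many standard deviations from the mean.
z = (n_mutations - mean_distance) / std_dev

print(n_mutations, round(z, 2))
```

With these inputs the count is 43 and the distance works out to roughly 2.4 standard deviations below the mean, which fits the picture of a wide spread in accumulated mutations.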

          It would be a mistake to conclude that FMS matches are so distant that it is not worth contacting your matches (as some people have suggested in other discussion forums). I think it is always worthwhile to contact all of your matches and share information on your ancestry, and in some rare cases, even a 2 or 3 step FMS match can share a recent common ancestor.

          Comment


          • #35
            Originally posted by GST View Post
            Actually, JQ705953 is plain H with no extra mutations. Here is the rCRS sequence for JQ705953: A263G, 315+C, A750G, A1438G, A4769G, A8860G, A15326G, C16519T

            The rCRS sequence for H2c1 is: G1438A, G13708A, C13934T, C152T!!, C194T, G205A, A3334G, so there are several differences.

            The RSRS sequence for JQ705953 is: 73, 146, 152, 195, 247, 769, 825T, 1018, 2706, 2758, 2885, 3594, 4104, 4312, 7028, 7146, 7256, 7521, 8468, 8655, 8701, 9540, 10398, 10664, 10688, 10810, 10873, 10915, 11719, 11914, 12705, 13105, 13276, 13506, 13650, 14766, 16129, 16187, 16189, 16223, 16230, 16278, 16311

            JQ705953 has only 43 mutations relative to the RSRS (excluding 315+C which is usually ignored).

            This is consistent with the findings of Behar et al. 2012, and they discussed this in the section on "Violations of the Molecular Clock" that I referred to earlier, in which they give some statistics on distance from the RSRS:
            I believe the sequence given for H2c1 is relative to the RSRS. The RSRS sequence for JQ705953 does contain C152T as a double back mutation, which is why I believe it is H2c1.

            Originally posted by GST View Post
            So we have several consistent lines of evidence that all point to the same conclusion: the slow mutation rate of about 1 in 3600 years; high standard deviation in number of accumulated mutations reported by Behar et al.; some living people who match exactly with ancient samples that date to more than 5000 years ago (and in some cases, living people with fewer mutations, as in the 7000 year old La Brana sample); and large numbers of people who are at fairly old nodes in the Phylotree, for example, 36 people who are U5a1a1 (with an estimated age of about 6700 years) who have no extra mutations and who all match each other exactly.

            But I think the key point is that the rate at which mutations accumulate is highly variable (standard deviation of 5.9), so in some cases people could share a common maternal ancestor more recently than suggested in the FAQ, and in other cases, the common ancestor is more distant - it is difficult to create a table that is accurate for everyone.
            Given a slow mutation rate of 1 in every 3600 years, or 1 in every 180 generations (at 20 years per generation), have you considered that the rate is defined per generation and per single event? That is, the mutation rate per generation (1/180, or roughly 0.005) is the probability of a mutation occurring between a mother and a single child, not across all her children. The more children she has, the higher the probability that at least one of them carries a new mutation.

            To provide a better understanding, let us consider the simple example of a die with 6 faces. On each throw, the probability of a particular value occurring is 1/6, which means the value can occur once in every 6 throws. Now take 6 such dice, throw them simultaneously, and calculate the probability of a particular value occurring on any one of those 6 dice. The answer is 1, which means at least 1 die will always give that particular value on every throw.

            For a family of 2 offspring per generation and a mutation rate of 0.005 per generation, it will take approx. 8 generations for 1 mutation to occur. This is because it takes until 2^8 = 256 to exceed the 180 "generational" mother/child birth events needed; that is, it takes 180 generations (or mother/child birth events) for 1 mutation. However, the 9th generation alone can produce a new mutation, and from then on each generation can produce several new mutations. It means the entire mtDNA tree can be constructed (that is, all mutations can occur in the population) within the first 21 generations. This does not necessarily mean that a single person will have all the mutations.
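The doubling arithmetic in the paragraph above can be sketched as follows. This is only an illustration under the stated assumptions (exactly 2 offspring per mother, 1 mutation per 180 birth events); it also counts every child as passing on mtDNA, which is a simplification, and the names `events_needed` and `births` are made up for the example.

```python
events_needed = 180      # 1 mutation per 180 mother/child birth events (from the post)
children_per_mother = 2  # assumption from the post: 2 offspring per generation

generation = 0
births = 1  # start from a single founding mother

# Keep doubling until one generation alone contains at least 180 births.
while births < events_needed:
    generation += 1
    births *= children_per_mother

print(generation, births)
```

Under these assumptions the loop stops at generation 8, since 2^7 = 128 is below 180 and 2^8 = 256 is above it.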

            Which is why in my earlier post, I referred/questioned the ratio of mtDNA mutations in 100 million people born each year.

            Comment


            • #36
              Originally posted by felix View Post
              I believe the sequence given for H2c1 is relative to the RSRS. The RSRS sequence for JQ705953 does contain C152T as a double back mutation, which is why I believe it is H2c1.
              152 is a highly variable marker - T152C! is used to define the paragroup H2b and H2c, and then C152T!! is used again to define H2c1. So it is more reliable to look at the full set of other defining markers from H to H2c1, all of which are lacking in JQ705953.


              Originally posted by felix View Post
              To provide a better understanding, let us consider the simple example of a die with 6 faces. On each throw, the probability of a particular value occurring is 1/6, which means the value can occur once in every 6 throws. Now take 6 such dice, throw them simultaneously, and calculate the probability of a particular value occurring on any one of those 6 dice. The answer is 1, which means at least 1 die will always give that particular value on every throw.
              To calculate the probability of an event occurring at least once in multiple tries, you calculate it as one minus the probability of it not occurring on any of the n tries. So the probability of throwing a six on one try is 1/6 or 17%. For two tries, it would be 1 - (5/6)(5/6), or 31%, and for 6 tries it would be 1 - (5/6)^6, or a 67% chance of throwing at least one six. So if you throw 6 dice a large number of times, on average you should throw at least one 6 about 2 out of 3 tries.
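For anyone who wants to check the dice figures, here is a minimal Python sketch of the "one minus the chance of no hit" calculation. It assumes the tries are independent, and the function name `p_at_least_one` is just an illustrative choice.

```python
def p_at_least_one(p_single, tries):
    """Chance of at least one success in `tries` independent attempts."""
    return 1 - (1 - p_single) ** tries


# One, two and six throws of a fair die, looking for a six.
for n in (1, 2, 6):
    print(n, f"{p_at_least_one(1 / 6, n):.0%}")
```

This prints 17%, 31% and 67% for 1, 2 and 6 tries. The probability approaches 100% as the number of tries grows but never reaches it.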

              But you might need to do a large number of throws to get the proper average. I once tried to prove a statistical argument to a colleague empirically by flipping a coin several times, and I improbably flipped heads 8 times in a row (which has less than 1 in 100 chance of occurring, and did not help prove my point). To paraphrase Lady Bracknell: "sometimes the number of events seem to be considerably above the proper average that statistics have laid down for our guidance."
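The "large number of throws" point can also be tried empirically. The sketch below is a simple simulation (the seed and the number of trials are arbitrary choices for illustration) that throws 6 dice many times and compares the observed frequency of "at least one six" with the exact value. It also works out the chance of 8 heads in a row for a fair coin.

```python
import random

random.seed(0)      # fixed seed so the run is repeatable; any seed would do
trials = 100_000    # arbitrary, but large enough for a stable average

# Count the trials in which at least one of 6 dice shows a six.
hits = sum(
    any(random.randint(1, 6) == 6 for _ in range(6))
    for _ in range(trials)
)

freq = hits / trials          # observed frequency
exact = 1 - (5 / 6) ** 6      # exact probability, about 0.665

p_eight_heads = 0.5 ** 8      # 1 in 256 for a fair coin

print(f"simulated {freq:.3f}, exact {exact:.3f}, 8 heads in a row {p_eight_heads:.4f}")
```

With only a handful of trials the simulated figure can be well off the exact one, which is the same effect as the improbable run of heads.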

              But it is even more complicated for mtDNA because a normal probability distribution does not seem to apply to mtDNA mutations, so the above type of calculation will not be accurate either. There is some other process that affects the frequency of mtDNA mutations (perhaps environmental, disease, age, chemical or some other unknown stress?) so this might partly explain the clock violations that Behar et al. observed. But it's a mystery. As Behar et al. conclude: "We are currently unable to offer well-founded explanations for these findings, which remain the scope of future studies."
              Last edited by GST; 10 December 2013, 12:20 AM. Reason: typos

              Comment


              • #37
                Originally posted by GST View Post
                To calculate the probability of an event occurring at least once in multiple tries, you calculate it as one minus the probability of it not occurring on any of the n tries. So the probability of throwing a six on one try is 1/6 or 17%. For two tries, it would be 1 - (5/6)(5/6), or 31%, and for 6 tries it would be 1 - (5/6)^6, or a 67% chance of throwing at least one six. So if you throw 6 dice a large number of times, on average you should throw at least one 6 about 2 out of 3 tries.
                The events are mutually exclusive because all I am trying to find is when the first occurrence of a new mutation can occur (i.e., when a particular value can first occur).

                P(A or B or C.. ) = P(A) + P(B)+ ..

                So, the probability is 1/6+1/6+1/6+1/6+1/6+1/6 = 1.

                You can also try it out in practice. Click roll 6 times; all 6 rolls still count as 1 generation, because we are assuming 6 dice rolled simultaneously. When do you think a particular value occurs?

                Even with non-mutually exclusive events, it hits 90% in 12 tries and 99% in 24 tries.

                Calculating the probability for 1 in 180 generations (i.e., 0.005 / generation) and taking it as a mutually exclusive event (even though I think they aren't), you still get a new mutation in the 12th generation.


                In the table, the mistake many make is to calculate the probability per "generation", which is a misleading term because the mutation rate is quoted "per generation". The probability must be calculated on the number of birth (generational) events, which is =1-POWER(179/180,<no of generational/birth events>). As you can see, it takes just ~10 generations for a new mutation to occur, after which new mutations occur exponentially.
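For readers without the spreadsheet, the same formula can be written as a short Python sketch. It assumes independent birth events and exactly 2 children per mother, all of whom pass on mtDNA (a simplification). Since it is not clear here whether the table counts births in one generation or all births so far, the sketch prints both. The names `p_new_mutation` and `CHILDREN` are illustrative only.

```python
def p_new_mutation(birth_events, rate=1 / 180):
    """Python form of =1-POWER(179/180, birth_events)."""
    return 1 - (1 - rate) ** birth_events


CHILDREN = 2  # assumption: every mother has exactly two children

print("gen  births  cumulative  P(gen only)  P(cumulative)")
for generation in range(1, 13):
    births = CHILDREN ** generation  # births in this generation alone
    cumulative = sum(CHILDREN ** g for g in range(1, generation + 1))
    print(
        f"{generation:>3}  {births:>6}  {cumulative:>10}  "
        f"{p_new_mutation(births):>11.3f}  {p_new_mutation(cumulative):>13.3f}"
    )
```

Changing `CHILDREN` or `rate` shows how sensitive the "first new mutation" generation is to those two assumptions.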

                This doesn't mean a person's lineage will have a mutation every ~10 generations. It simply means that after 10 generations there is a possibility of 1 new mutation among all descendants, starting a new branch, and after that every generation will have exponentially more new mutations appearing, up to the point where the pedigree collapses and merges, depending on the group size.

                Originally posted by GST View Post
                But it is even more complicated for mtDNA because a normal probability distribution does not seem to apply to mtDNA mutations, so the above type of calculation will not be accurate either. There is some other process that affects the frequency of mtDNA mutations (perhaps environmental, disease, age, chemical or some other unknown stress?) so this might partly explain the clock violations that Behar et al. observed. But it's a mystery. As Behar et al. conclude: "We are currently unable to offer well-founded explanations for these findings, which remain the scope of future studies."
                I am not sure why you say a normal probability distribution does not apply, because it is the very thing on which the mutation rate itself is based. However, what is missed is that they calculate "assuming" a mother has one child (as one event), which misleads many into thinking a haplogroup is several thousand years old. This assumption is never mentioned in any academic paper nor considered in the calculations. I am not blaming anyone, but I honestly believe they overlooked this important factor.

                You are right that we do not know why a mutation occurs, but we do know it is simply a mistake. Irrespective of the mutation rate given in any academic paper, you always arrive at zero mutations for the first 10-12 generations, then an exponential increase in new branches (as shown in the table), which then settles depending on pedigree collapse. I doubt the figure of one new mutation every several thousand years, as it does not consider these factors.

                Also, a 0-step FMS match within ~500 years aligns with the above explanation when population growth is taken into consideration.
                Attached Files

                Comment


                • #38
                  I think discussing mutation rates with respect to populations here would be off-topic. I have posted a blog entry and created a new thread where it can be discussed. It is based on the scientific study "The Mutation Rate in the Human mtDNA Control Region", in which 3 mutations in HVR1 and HVR2 were found in just 15 generations.

                  Comment
