Heteroplasmy and Genetic Distance


  • Heteroplasmy and Genetic Distance

    Hi there,

    I recently took the mtFull test and, looking at my results, found a rare heteroplasmy at one of the common HVR2 mutations: C195Y

    Now I'm wondering how this affects my matches. For 'HVR1, HVR2, CODING REGIONS' matching, there are no matches with Genetic Distance of 0, but 9x with GD1, 23x GD2, ...

    I found two of the GD1 matches and three of the GD2 matches in a project, so I can see their HVR1/HVR2 values.

    For the two visible GD1 matches, the HVR1 and HVR2 mutations are the same as mine, except for the heteroplasmy, when ignoring the 309 and 315 mutations as suggested by FTDNA.

    The question that matters to me for understanding the matching rules: am I right to assume that the Coding Region, which I cannot see for matches in projects, must be exactly equal to mine, so that the GD1 comes from the heteroplasmy alone, as suggested by FTDNA?

    This is what ftdna say about it:
    "For those who have tested the mtDNA Full Sequence (mtFullSequence), three differences are allowed. These differences include cases of heteroplasmy. Two high frequency insertion/deletion locations are completely excluded from difference counts. These are mutations at positions 309 and 315."

    If this is the case, then to get a GD0 match, the match would also need to have the heteroplasmy detected at this position, right?

    For me this would mean that I should generally subtract 1 from the GD to get a value that is comparable to non-heteroplasmy matching (as long as the compared kit does not have the same heteroplasmy) ...
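To make the question concrete, here is a minimal sketch of the counting rule as I read the quoted FTDNA text. It assumes that every position with a differing call counts as one difference, a heteroplasmy included, and that 309 and 315 are skipped. The function names and the sample kits are made up for illustration; only C195Y and C195T come from this post, and nothing here is FTDNA's documented algorithm.

```python
import re

EXCLUDED_POSITIONS = {309, 315}  # excluded per the quoted FTDNA text
MAX_DIFFERENCES = 3              # "three differences are allowed"


def position_of(label: str) -> int:
    """Return the numeric position in a label such as 'C195Y' or '315.1C'."""
    found = re.search(r"\d+", label)
    if found is None:
        raise ValueError(f"no position in {label!r}")
    return int(found.group())


def calls_by_position(labels):
    """Group labels by position, dropping the excluded positions."""
    grouped = {}
    for label in labels:
        pos = position_of(label)
        if pos not in EXCLUDED_POSITIONS:
            grouped.setdefault(pos, set()).add(label)
    return grouped


def count_differences(kit_a, kit_b) -> int:
    """Count positions whose calls differ (assumed rule: a heteroplasmy
    is treated like any other differing call)."""
    a, b = calls_by_position(kit_a), calls_by_position(kit_b)
    return sum(1 for pos in a.keys() | b.keys() if a.get(pos) != b.get(pos))


# Hypothetical kits: A263G and the insertions are placeholders.
me = {"C195Y", "A263G", "309.1C", "315.1C"}
match = {"C195T", "A263G", "315.1C"}

distance = count_differences(me, match)
print(distance, distance <= MAX_DIFFERENCES)
```

Under this assumed rule the two hypothetical kits differ only at position 195, so the heteroplasmy alone would account for a distance of 1.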

    Thanks in Advance,
    Christian Kessler

  • #2
    Unfortunately I still haven't found the rules for how Genetic Distance is calculated, so I just asked FTDNA about this. Meanwhile, this is what I have found so far, and what got me thinking:


    Blaine Bettinger; The Family Tree Guide to DNA Testing and Genetic Genealogy: 'Heteroplasmy can affect mtDNA matching through Family Tree DNA. For example, if two individuals have identical mitochondrial genomes with the exception of a heteroplasmic mutation in one of them, they may show up as having a genetic distance of one.'

    ... or they may not - in which cases? I understand he speaks about the case that a heteroplasmy has been detected by FTDNA (as lots more heteroplasmy exists than gets detected by FTDNA). ... the text goes on:

    'Thus, if Bill has mutation 16230A and John has mutation 16230W (indicating an A or T at this location), they are not shown as exact matches.'

    ... and does that mean they are shown as Genetic Distance of 1? I have the heteroplasmy not in HVR1 but in HVR2, still very similar to the example.


    Roberta Estes; 2013/06/28: "And if the heteroplasmy is in the HVR1 or HVR2 region, it’s a mess." ... hopefully outdated


    FTDNA; mtDNA – Matches Page Questions / On the mtDNA - Matches page, are only exact matches shown?: "For those who have tested the mtDNA Full Sequence (mtFullSequence), three differences are allowed. These differences include cases of heteroplasmy." ... that just says that heteroplasmy is taken into account somehow. But how? There are many combinations of heteroplasmy and homoplasmy, some do match and some not...



    • #3
      Heteroplasmy

      I have two heteroplasmies, which were found when I ordered the full mtDNA genome test. My closest mtDNA match does match one of my heteroplasmies, so I have an extra heteroplasmy that he does not have. Our common ancestor lived in the 18th century. My understanding is that a heteroplasmy is not a fully formed mutation, so it is not counted as a one-step mutation the way fully formed mutations are.

      Best regards, Douglas W. Fisher(Adoptee)
      Confirmed paternal surname "Wells"
      Maternal grandmother "Moreman" V19 mtdna group
      Kit#122883



      • #4
        Hello Douglas W. Fisher,

        First, thanks for your help with this.

        Second, just to understand you right: by "I have two heteroplasmies" do you mean heteroplasmy at two different positions, for example 123Y (meaning 123C and 123T) and 456W (meaning 456A and 456T)?

        Or do you mean heteroplasmy in one position, that has two values; for example 123Y (meaning 123C and 123T)?

        By "My closest mtdna match does match one of my heteroplasmies" do I understand you right that your match doesn't have a heteroplasmy in the position, but his value there matches one of the two values of your heteroplasmy?

        This is the case for me and some of my matches, whose HVR1/HVR2 regions I can see in a project. I'm trying to figure out whether this is the reason why I have a Genetic Distance of 1 to these matches - meaning "Y" (C or T) vs "T" count as one step difference; or do "Y" vs "T" count as zero Genetic Distance (would be logical to me since they overlap), so that there must be another difference in the Coding Region, that I cannot see.
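The "overlap" idea can be sketched with the standard IUPAC ambiguity codes, where each letter stands for a set of bases. The two helper functions are hypothetical names for the two readings I am asking about; whether FTDNA uses either of them is exactly what I don't know.

```python
# IUPAC nucleotide codes: single bases and the two-base ambiguity codes.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"},
    "W": {"A", "T"}, "S": {"C", "G"},
    "K": {"G", "T"}, "M": {"A", "C"},
}


def calls_overlap(call_a: str, call_b: str) -> bool:
    """True if the two calls share at least one base (the lenient reading)."""
    return bool(IUPAC[call_a] & IUPAC[call_b])


def calls_identical(call_a: str, call_b: str) -> bool:
    """True only if both calls stand for exactly the same bases (the strict reading)."""
    return IUPAC[call_a] == IUPAC[call_b]


for mine, theirs in [("Y", "T"), ("Y", "Y"), ("Y", "A")]:
    print(mine, theirs, calls_overlap(mine, theirs), calls_identical(mine, theirs))
```

With the lenient reading, "Y" vs "T" would add nothing to the Genetic Distance; with the strict reading it would count as a difference.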

        My guess is that a perfect (same kind) heteroplasmy/heteroplasmy match would be additional strong evidence of a recent shared ancestor, but a heteroplasmy/homoplasmy match that overlaps should still be considered a 100% match and shouldn't increase the Genetic Distance. It is quite possible that a heteroplasmy exists but was not detected because it was below the limit that FTDNA can detect. The degree of heteroplasmy can vary a lot from mother to child: Y could be detected by FTDNA as Y, but also as T or C, because of the 20% detection threshold used by FTDNA (? ... I can't find it in the current FAQ). I've read that it takes many generations (like 150 ... really?) to stabilize and become homoplasmy.

        May I ask you at what "Genetic Distance" (for "HVR1/HVR2/FMS" comparison) your closest match shows up?

        All of this may not sound interesting to most people who tested at FTDNA, because of the low heteroplasmy detection rate, but it also affects mtDNA testers who have no heteroplasmy detected themselves: many might have small undetected heteroplasmies, and many more have matches with a heteroplasmy that might only appear to be distant because of the unclear distance handling.

        Christian Keßler



        • #5
          Two heteroplasmies

          I actually do have two different heteroplasmies: T7310Y and A11172R. My closest mtDNA match has only one of these, so I have the extra heteroplasmy. I compared family trees with him, and our common ancestor lived in the 18th century. This is why heteroplasmies are realistically not equal to a full mutation step difference. A fully mature mutation difference can correspond to a common maternal ancestor up to 1500 years ago.

          Best regards, Doug
          Kit#122883



          • #6
            Originally posted by chr View Post
            This is the case for me and some of my matches, whose HVR1/HVR2 regions I can see in a project. I'm trying to figure out whether this is the reason why I have a Genetic Distance of 1 to these matches. My estimate is that a perfect (same kind) heteroplasmy/heteroplasmy match would be an additional strong proof of a recent shared ancestor, but a heteroplasmy/homoplasmy match that overlaps should still be considered a 100% match and shouldn't increase the Genetic Distance.
            Yes, in your FTDNA full sequence match list the heteroplasmy counts as 1 step in the genetic distance. Heteroplasmies are more common at some markers than others, and I generally ignore common heteroplasmies. If you have an unusual heteroplasmy, for example at a coding region marker as Douglas described, this could be more useful for finding closer matches with other people who have a mutation at that marker.


            Originally posted by chr View Post
            This all may not sound interesting for most people who tested at FTDNA because of the low heteroplasmy detection rate, but it affects also mtDNA testers that don't have a heteroplasmy detected for themselves: many might have small undetected heteroplasmies and many more have matches with a heteroplasmy that might only appear to be distant because of the unclear distance handling.
            This is a problem that affects a large number of people, and the FTDNA treatment of heteroplasmies for matching is still "a mess" as Roberta described it. In some cases a very common heteroplasmy is counted as 2 steps, for example, if the reference marker is C16093T, your close match is T16093C, and you are C16093Y, you are considered a 2 step match to the close match because you both differ by a GD of 1 from the reference. Also, extremely common insertions or deletions (indels) at marker 523 are counted as 2 steps. As a result, it is unusual but possible that close relatives can have a genetic distance of 4 because of very common indels or heteroplasmies that really should be ignored. One solution would be to add an option for "relaxed" matching criteria that ignores differences at heteroplasmies and that also ignores common indels.



            • #7
              Thank you Doug and GST for clearing this up.

              Especially GST, what you share is very valuable and detailed info, and I would like to find info like this in the FTDNA mtDNA-FAQ.

              GST, when you say 'for example, if the reference marker is C16093T', does that mean the reference is the RSRS value, or would the reference be the haplogroup that best fits both me and my match? This could be what the so-called 'Smart-Matching' is about...?

              My haplogroup H86 (like almost every other haplogroup since L2) contains the mutation C195T. For me the heteroplasmy C195Y was detected, which is both "C195T" and "T195C!", meaning a tendency to back-mutate to the state of 100,000+ years ago ... I probably listened to Devo's "I'm a Potato" one time too often. The match has C195T, which is the haplogroup standard. The genetic distance from me to the match is 1. So this is because I differ from the haplogroup, but my match does not. I hope I have started to understand...



              • #8
                Originally posted by chr View Post
                GST, when you say 'for example, if the reference marker is C16093T', does that mean the reference is the RSRS value, or would the reference be the (both me and my matches best fitting) Haplogroup? This could be what the so-called 'Smart-Matching' is about...?
                My Haplogroup H86 (like almost any other Haplogroup since L2) contains the mutation C195T. For me the heteroplasmy C195Y was detected - which is "C195T" and "T195C!"
                It appears that the RSRS is always the reference used for the genetic distance, not the reference value of the marker for the haplogroup. I have listed two examples below for haplogroup H10, in which 16093 is extremely unstable and should be excluded from matching. Suppose you have 3 people in H10 with identical results except for marker 16093:

                H10
                A: C16093T (RSRS)
                B: T16093C
                C: C16093Y

                Person C is a 1 step distance from person A.
                Person C is a 2 step distance from person B.

                H10e includes the defining mutation T16093C
                A: T16093C (H10e reference)
                B: CT16093T! (person in H10e with reversion at 16093)
                C: T16093Y

                Person C is 2 step distance from person A.
                Person C is 1 step distance from person B.
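One way to read the two examples above is as an additive rule: each person's call is scored by its steps from the RSRS, and two differing calls are separated by the sum of those steps. The sketch below is only a reconstruction from these examples, not a documented FTDNA algorithm; the `Call` categories and the function name are invented for illustration.

```python
from enum import Enum


class Call(Enum):
    ANCESTRAL = "same as RSRS"
    DERIVED = "differs from RSRS"
    HETEROPLASMIC = "mixture of both"


# Assumed scoring: a heteroplasmy is one step from the RSRS, like a full mutation.
STEPS_FROM_RSRS = {Call.ANCESTRAL: 0, Call.DERIVED: 1, Call.HETEROPLASMIC: 1}


def additive_distance(a: Call, b: Call) -> int:
    """Distance at one marker: 0 for identical calls, else the summed steps from RSRS."""
    if a is b:
        return 0
    return STEPS_FROM_RSRS[a] + STEPS_FROM_RSRS[b]


# H10 example: A matches the RSRS, B carries the mutation, C is heteroplasmic.
h10 = {"A": Call.ANCESTRAL, "B": Call.DERIVED, "C": Call.HETEROPLASMIC}
# H10e example: A carries the defining mutation, B has reverted, C is heteroplasmic.
h10e = {"A": Call.DERIVED, "B": Call.ANCESTRAL, "C": Call.HETEROPLASMIC}

for name, group in (("H10", h10), ("H10e", h10e)):
    for other in ("A", "B"):
        print(name, "C vs", other, additive_distance(group["C"], group[other]))
```

With these assumptions the sketch reproduces the four step counts listed above, which is consistent with the RSRS, rather than the haplogroup, acting as the reference.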



                • #9
                  a blog on heteroplasmy

                  http://www.legalgenealogist.com/2017...th-of-the-gd0/



                  • #10
                    Thanks for sharing this.

                    Originally posted by mabrams View Post

                    Excellent read, thanks for sharing this.



                    • #11
                      Hello GST, thanks for your help with that. What you've written is the most detailed explanation I have found of how GD is calculated. I'm having a hard time understanding it, though; I tried a few times but needed a break because I was frustrated. It seems the calculation you described does not work for the match that I'm looking at. Probably I didn't understand you.

                      Originally posted by GST View Post
                      H10e includes the defining mutation T16093C
                      A: T16093C (H10e reference)
                      B: CT16093T! (person in H10e with reversion at 16093)
                      C: T16093Y

                      Person C is 2 step distance from person A.
                      Person C is 1 step distance from person B.
                      This looks similar to my example above, except I get GD1 instead of GD2 like in your quoted example.

                      H86 implies C195T (inherited from the much older group L2'3'4'6), so:

                      A: C195T (my match; just for this SNP: GD0 to H86 reference, GD1 to RSRS)
                      C: C195Y (me; just for this SNP: GD1 to H86 reference, GD1 to RSRS)

                      If I understand your examples right, this would mean for the calculated GD between us:
                      (1) each of us has like a "GD1" to RSRS for this position
                      (2) we also have a different value from each other in this position, so:
                      (3) our "GDs" to RSRS in this position are added, resulting in overall GD2 between us. Did I understand you right in that?

                      Unfortunately this calculation does not work for me, as I received GD1 (and not GD2!) for this match, with every other position that I can see (HVR1, HVR2) being the same apart from position 195. It barely makes sense to receive GD1 for this match, but GD2 would make even less sense.
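Written out as arithmetic for position 195, the mismatch looks like the sketch below. It compares two candidate rules: the additive reading of GST's examples, and a plain "any differing call is one step" rule. Both functions are my own guesses, and the RSRS base C at 195 is taken from the C195T / C195Y labels.

```python
def steps_from_reference(call: str, reference: str) -> int:
    """Assumed: any call other than the reference base is one step away."""
    return 0 if call == reference else 1


def additive_rule(call_a: str, call_b: str, reference: str) -> int:
    """Guess 1: differing calls are separated by the sum of their steps from the reference."""
    if call_a == call_b:
        return 0
    return steps_from_reference(call_a, reference) + steps_from_reference(call_b, reference)


def mismatch_rule(call_a: str, call_b: str) -> int:
    """Guess 2: any differing call counts as exactly one step."""
    return 0 if call_a == call_b else 1


RSRS_195 = "C"     # reference base implied by the labels C195T and C195Y
match_call = "T"   # my match: C195T
my_call = "Y"      # me: C195Y
OBSERVED_GD = 1    # what FTDNA shows for this match

print("additive:", additive_rule(my_call, match_call, RSRS_195))
print("mismatch:", mismatch_rule(my_call, match_call))
print("observed:", OBSERVED_GD)
```

The additive guess gives 2 and the plain mismatch guess gives 1, so only the second agrees with the GD1 I see; but the plain mismatch guess would not give the 2 steps in GST's examples, so neither guess explains both cases.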

                      I'm a bit disappointed that FTDNA provides no information about the actual rules of GD calculation. Maybe these unknown rules have also changed over time...?

                      Meanwhile, since I also didn't get any information from FTDNA about the actual degree of my heteroplasmy (for Y: the ratio between DNA with the C allele and DNA with the T allele), other than that there is a general 20% threshold for heteroplasmy reporting, I re-tested HVR1+2 at YSEQ and received the results last night. I'll write a post about that in the next few days.

                      Happy Christmas everyone!
