myOrigins white paper now available

  • myOrigins white paper now available

    The authors are Razib Khan and Rui Hu.



    I haven't read it yet myself -- I just noticed it while searching the Learning Center for something else.

  • #2
    Thank you, Ann.



    • #3
      Thanks Ann
      So would the Hardy-Weinberg Equilibrium be the reason that some populations are quite scarce and others have 3 times as many? Do larger populations need more reference samples?

      I chose the graduate program I completed for its LACK of statistics.



      • #4
        Two comments -

        1. From the white paper, the shape of the density plots is determined in part by the number of samples from each country. There are 124 samples from Spain compared to 17 samples from Germany. So the shapes of the density plots are not necessarily telling us very much about our country-specific European ancestry; rather, they are showing us the locations of the reference samples.

        2. The reference populations seem designed to show how we match the present-day distribution of populations, not the ancient origins of modern populations. Some people had speculated that the odd results reflected the fact that the maps indicated ancient origins, but this seems not to be the case.

        For people of primarily European ancestry, this would be a more useful tool with a much larger and more carefully selected set of reference populations. The current tool seems to be more useful for people with diverse ancestry from multiple regions around the world.
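
        To illustrate point 1, here is a minimal sketch of how sample counts alone can shape a density plot. It assumes a one-dimensional stand-in for geography and a plain Gaussian kernel; the white paper's actual density method is not reproduced here. Only the counts 124 and 17 come from the white paper. The positions, the bandwidth, and the helper name `kernel_density` are made up for illustration.

```python
import math


def kernel_density(x, sample_positions, bandwidth=1.0):
    """Gaussian kernel density estimate at x (hypothetical helper)."""
    norm = len(sample_positions) * bandwidth * math.sqrt(2 * math.pi)
    total = sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample_positions
    )
    return total / norm


# Hypothetical 1-D "map": Spain clustered near 0, Germany clustered near 10.
spain = [0.01 * i for i in range(124)]          # 124 samples (count from the white paper)
germany = [10.0 + 0.01 * i for i in range(17)]  # 17 samples (count from the white paper)
reference = spain + germany

# Evaluate the pooled density at the centre of each cluster.
density_spain = kernel_density(0.615, reference)
density_germany = kernel_density(10.08, reference)

# The peak over Spain is several times taller purely because of sample count.
ratio = density_spain / density_germany
print(f"Spain peak is about {ratio:.1f}x the Germany peak")
```

        In this toy setup the taller peak says nothing about anyone's ancestry; it only reflects where the reference samples are concentrated.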



        • #5
          Originally posted by GST
          Two comments -

          1. From the white paper, the shape of the density plots is determined in part by the number of samples from each country. There are 124 samples from Spain compared to 17 samples from Germany. So the shapes of the density plots are not necessarily telling us very much about our country-specific European ancestry; rather, they are showing us the locations of the reference samples.
          Thank you for that explanation.

          Originally posted by GST
          2. The reference populations seem designed to show how we match the present-day distribution of populations, not the ancient origins of modern populations. Some people had speculated that the odd results reflected the fact that the maps indicated ancient origins, but this seems not to be the case.
          Thanks for clarifying this.

          Originally posted by GST
          For people of primarily European ancestry, this would be a more useful tool with a much larger and more carefully selected set of reference populations. The current tool seems to be more useful for people with diverse ancestry from multiple regions around the world.
          This is essentially what I posted in another thread. There, I suggested that FTDNA invest some money in subsidized European testing. I also suggested that they test the aging children of Great Immigration arrivals who are still alive. Now I remember that some time in the last few years FTDNA attempted to survey its customer base regarding their ethnic origins. I remember there being a lot of uproar over that, but with a properly designed survey, they could probably identify hundreds of customers whose four grandparents all came from the same particular region, and who could then form the primary database for MO.



          • #6
            Sounds like the same thing they do at dnatribes.

            At dnatribes, the samples are from present-day people whose populations have lived in that area for 500 years. The others are native (Amerindian) and some diaspora populations.

            Then they compare how your DNA profile (STR) matches against those populations. For each population they show how much more likely you are to be part of that population compared to the others, and then a percentage score for how well you match within that population.

            Let's say your DNA is found in 300 populations (out of their database of 1200). They show you the top 20, and then you can order a "zoom" on the area of a population you want. My top 20 was European, so I asked for the Europe zoom, which broke down the European results even more. You can also ask for a zoom of Africa, Native American, Middle Eastern, etc.

            On the Middle Eastern panel, I had a faint match with non-Hasidic Jews from NY at 1.45 and 2%, which is pretty much nothing, as I am a confirmed descendant of Sephardim.

            I would think that dnatribes is way more detailed.

            The PF confirmed that the majority of my DNA is in the Russian region, but now I can't read myOrigins. It is showing Iberia (which is far down the list in my dnatribes results and does not even show on str-base.org), and the Russian area is very wide.








            • #7
              I love how four (Han, Basque, Melanesian and Mandenka) out of the five reference populations that are shown in figure 1 are not included in myOrigins, but were included in Population Finder.



              • #8
                Stat Question

                The white paper indicates that the Bayesian approach of 'Admixture' was employed rather than PCA. A recent post indicated that the MO percentages reflect PCA explained-variance determinations. Were both strategies used?



                • #9
                  My Error

                  That was my post on PCA, made before reading the white paper. I'll have to look at the references to get a clearer understanding of exactly how their Bayesian approach worked and how to interpret the percentages.

                  Jim



                  • #10
                    Originally posted by jbarry6899
                    That was my post on PCA, made before reading the white paper. I'll have to look at the references to get a clearer understanding of exactly how their Bayesian approach worked and how to interpret the percentages.

                    Jim
                    Thanks



                    • #11
                      Originally posted by Kathleen Carrow
                      Thanks Ann
                      So would the Hardy-Weinberg Equilibrium be the reason that some populations are quite scarce and others have 3 times as many? Do larger populations need more reference samples?

                      I chose the graduate program I completed for its LACK of statistics.
                      The Hardy-Weinberg Equilibrium (H-WE) assumption is that the allele frequencies seen in the reference populations today reflect the allele frequencies from some (unspecified) time in the past.

                      Allele frequencies can change over time for a variety of reasons, including random drift, but in a large population, the changes will be small.

                      Allele frequencies can also change with migration of a new population into the existing population.

                      The populations with a small number of samples are actually more likely to violate the H-WE assumption. I suspect the reason for small sample sizes is very mundane: they didn't have much to choose from.
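
                      For anyone who wants numbers, here is a small sketch using two textbook formulas, not anything taken from the white paper: the Hardy-Weinberg expected genotype frequencies (p², 2pq, q²) for an allele at frequency p, and the binomial standard error of an allele frequency estimated from n diploid samples, sqrt(p(1-p)/(2n)). The value p = 0.3 is arbitrary, the function names are hypothetical, and 17 and 124 are simply the sample counts mentioned earlier in this thread.

```python
import math


def hardy_weinberg_genotypes(p):
    """Expected genotype frequencies (AA, Aa, aa) for allele frequency p."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


def allele_frequency_standard_error(p, n_individuals):
    """Binomial standard error of p estimated from n diploid individuals.

    Each individual contributes two allele copies, hence the 2 * n.
    """
    return math.sqrt(p * (1.0 - p) / (2 * n_individuals))


p = 0.3  # arbitrary illustrative allele frequency
aa, ab, bb = hardy_weinberg_genotypes(p)
print(f"Expected genotypes: AA={aa:.2f}, Aa={ab:.2f}, aa={bb:.2f}")

# Sampling noise for a small versus a larger reference panel.
se_small = allele_frequency_standard_error(p, 17)
se_large = allele_frequency_standard_error(p, 124)
print(f"Standard error with 17 samples:  {se_small:.3f}")
print(f"Standard error with 124 samples: {se_large:.3f}")
```

                      The smaller panel gives a noticeably noisier estimate of the same underlying frequency, which is one reason to be cautious about populations with few reference samples.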



                      • #12
                        Have I understood correctly?

                        Have I understood correctly that myOrigins uses 377 AIMs (ancestry informative markers) to determine one's ancestry?



                        • #13
                          Originally posted by robe3b
                          Have I understood correctly that myOrigins uses 377 AIMs (ancestry informative markers) to determine one's ancestry?
                          What was I thinking? I've got it completely wrong, the actual number of SNPs used by myOrigins appears to be 290,874. Right?



                          • #14
                            Apparently another paper is in the works to answer the many questions raised about the populations and methodology used for MO. It will be posted at the FTDNA Learning Center.



                            • #15
                              From ISOGG FB: "...we run admixture 10 times, come up with average, check for high variance in result and remove outliers of runs. The run is supervised and everyone who isn't part of the reference is a combination of the references."
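
                              The procedure in that quote (run several times, drop outlier runs, average the rest) could look roughly like the sketch below. This is not FTDNA's code. The outlier rule (a run is dropped if any proportion is more than 0.05 from the per-population median), the population labels "A" and "B", the numbers, and the function name `average_runs` are all assumptions for illustration.

```python
from statistics import mean, median


def average_runs(runs, max_deviation=0.05):
    """Average per-population proportions across runs, dropping outlier runs.

    Assumed rule: a run is an outlier if any of its proportions differs from
    that population's median across runs by more than max_deviation.
    Returns the averaged proportions and the number of runs dropped.
    """
    populations = list(runs[0].keys())
    medians = {pop: median([run[pop] for run in runs]) for pop in populations}
    kept = [
        run
        for run in runs
        if all(abs(run[pop] - medians[pop]) <= max_deviation for pop in populations)
    ]
    averaged = {pop: mean([run[pop] for run in kept]) for pop in populations}
    return averaged, len(runs) - len(kept)


# Ten hypothetical runs for one person; the last one disagrees with the rest.
runs = (
    [{"A": 0.60, "B": 0.40}] * 4
    + [{"A": 0.62, "B": 0.38}] * 3
    + [{"A": 0.58, "B": 0.42}] * 2
    + [{"A": 0.90, "B": 0.10}]
)

averaged, dropped = average_runs(runs)
print(f"Dropped {dropped} outlier run(s)")
print({pop: round(value, 3) for pop, value in averaged.items()})
```

                              A median-based rule is used here only because it is simple and is not thrown off by the outlier itself; the quote does not say how outliers or "high variance" were actually defined.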
