You and I seem close on the 25 marker. Looks like I might come out E3b* as well. I think that's you anyway.
Rick
I noticed that, too. But there's no sure way of knowing until you get your results back. As you can see, the clustering pattern is broken by a couple of M123s: one near the E3b* and another near the E3b1.
Victor
p.s. Also, on the Fluxus diagram your haplotype appears on a separate vector and closer to some E3b1s.
I looked at the microsatellite graph you linked to in your post. Very interesting: the E3b1 Alpha sub-group looks predominantly European. The other cluster looks almost exclusively East African. So I suspect that 2 of the 4 regions (sub-clades?) will hold up even when additional markers are considered.
Notwithstanding the multiple historical migrations that we know took place in the Old World, there seems to be some correlation with the location and/or origin of the haplotypes listed, at least in the two larger groups that you noticed. But the larger group may contain more than one subclade, and there it will be harder to draw a dividing line, so to speak. Even in the smaller cluster I suspect that more than one subclade is represented. The SNP tests will show.
So I believe that you will be able to identify some sub-clades from STRs. I just wonder: if an individual haplotype is not a perfect match to the modal haplotype's STRs, how close does it have to be before an SNP test is worth considering to confirm the result?
Being close to the modal is not necessarily a guarantee that a haplotype is within a particular subclade; it only increases the likelihood. I would still encourage everyone to go ahead and make plans to have their SNPs tested when possible, to determine their subclade without a shadow of doubt. And if we get a "snip" result that breaks the pattern of the cladograms, it means that we have to go back to square one and redefine the parameters of the whole process. Learning by trial and error!
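To make "close to the modal" concrete, here is a minimal Python sketch. It assumes the modal haplotype is simply the most common repeat count at each marker and that closeness is the sum of absolute repeat differences; neither definition comes from the thread, and the marker values are made up for illustration. Multi-copy markers such as the 464 cluster would need special handling that this sketch does not attempt.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common repeat count at each marker position."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*haplotypes)]


def step_distance(a, b):
    """Sum of absolute repeat-count differences across markers."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical 5-marker haplotypes, for illustration only.
group = [
    [13, 24, 13, 10, 16],
    [13, 24, 13, 10, 17],
    [13, 25, 13, 10, 16],
    [13, 24, 13, 11, 16],
]
modal = modal_haplotype(group)
candidate = [13, 24, 14, 10, 18]
distance_to_modal = step_distance(candidate, modal)
print(modal, distance_to_modal)
```

A small distance only raises the likelihood of membership, as noted above; it does not replace the SNP test.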
Also, how far back in time do we estimate these splits to have occurred? Is there a way to measure, given that SNPs are on average over 5000 years old? I am just trying to get my head around the practical implications.
As far as the E3b haplogroup is concerned, a quick glance at Cruciani's study shows, for example, that the TMRCA for M81 (the youngest subclade) is estimated at 5.6 ky; M78, on the other hand, is listed as 23.2 ky old, with each of its subclusters splitting at a subsequent age. I could not readily find M123's age.
For example: I find a 23-out-of-25 match with another individual who is also R1b but does not share the same surname. Was it convergence, or is this guy more closely related to me than I would otherwise think? If STRs are that predictive with only 4 or 5 markers...
BTW: I really appreciate your feedback
IMO, a 23/25 non-surname match cannot and should not be automatically discarded. You have to consider the geographical and historical context of the lineage and genealogy of those being compared. Surname continuity from generation to generation is not reliable in many cases, especially the farther back we go in time. Conversely, if there is no geo-historical background in common, a case of convergence is certainly more likely at 25 markers than at 37. FTDNA's guidelines give a good explanation.
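For anyone unsure what a "23/25" figure means in practice, the following Python sketch counts exact matches between two haplotypes and lists the positions that differ. The 25 values are hypothetical placeholders rather than anyone's real results, and the sketch says nothing about whether a given mismatch reflects convergence or recent common ancestry.

```python
def count_matches(a, b):
    """Return (number of identical markers, indices of mismatched markers)."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    mismatched = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return len(a) - len(mismatched), mismatched


# Hypothetical 25-marker haplotype, for illustration only.
person_a = [13, 24, 13, 10, 16, 18, 11, 12, 12, 13, 11, 30, 15,
            9, 9, 11, 11, 26, 14, 20, 32, 14, 16, 17, 17]

# A second hypothetical haplotype that differs at two markers.
person_b = list(person_a)
person_b[3] += 1
person_b[19] -= 1

matches, mismatched = count_matches(person_a, person_b)
print(f"{matches}/{len(person_a)} match, differing at positions {mismatched}")
```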
I have no doubt that at the higher-level branches only a few STRs are required. I just think that when you start breaking it down further, to say the twig level, you will need more markers because the variability in individual marker values will decrease. The recent marketing of the Niall of the Nine Hostages hints at this when they compare the 12-marker vs 25-marker tests. You get a better picture if you include the 464 cluster.
That being said, I am not clear on how old some of the sub-branches are. I have to do some more reading.
I am personally interested in E3b because my maternal grandfather's family name, which traces back to Northern France in the late 1500s, appears to be E3b. I will need to get an uncle to participate!
I agree. A higher number of markers will always produce better discriminating resolution for certain clustering purposes. For example, a comparison based on the 37-marker panel is much better at defining the "twigs" in a large DNA surname project, where participants share a common ancestor within the last few centuries, or even, as in your example of the Niall of the Nine Hostages, one that dates back to the 5th (?) century.
On the other hand, my logic tells me (although I could be wrong) that in a haplogroup project like E3b, where the common ancestor dates back several thousand years, an intermediate number of selected markers may produce better clustering results than including a lot of fast-mutating markers, which could add a lot of noise to the calculations. Besides, the 37-marker haplotypes in our recordset are too few, so I cannot fully test that idea.
And as you know, these software tools allow us to do these learning exercises and help us visually grasp an otherwise unintelligible rosary of numbers. I hope everyone understands that no guarantee about the accuracy of the results can be given.
I have created again a couple of cladograms with the latest available recordset from the E3b Haplogroup Project.
One was generated with 12 markers and the other with 25 markers. Although, as we have previously discussed, a 12-marker cladogram is of limited value for correctly inferring subclades, I've decided to do it anyway for the benefit of those participants who have recently joined the project and only have a 12-marker haplotype.
There are currently 135 records in the project, of which 6 have an SNP result that I know of. But new SNP results are coming soon, and then we'll see if the clustering pattern still holds up. Also, if anyone else from the project already has a "snip" result not in the cladograms, I would appreciate it if you could post it here so we can learn more about our haplogroup.
So, for whatever it's worth, here are the links to the diagrams. And as always, your comments are welcome.
p.s. The data was processed in the Kitsch module of PHYLIP (the PHYLogeny Inference Package) and the diagrams were generated with Tree Explorer. The raw data was prepared with McGee's Y-DNA Tools.
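For readers who want to try something similar, the sketch below shows one way to turn haplotypes into a square distance matrix in the plain-text layout that PHYLIP's distance programs read: a first line with the number of taxa, then one row per taxon with the name padded to 10 characters. The distance measure (sum of absolute repeat differences) is an assumption, not necessarily what was used for the cladograms above, and apart from 10640 the labels and all marker values are hypothetical.

```python
def phylip_distance_matrix(names, haplotypes):
    """Format pairwise stepwise distances as a square PHYLIP distance matrix."""
    if len(names) != len(haplotypes):
        raise ValueError("need exactly one name per haplotype")
    lines = [f"{len(names):5d}"]
    for name, row in zip(names, haplotypes):
        # Assumed distance: sum of absolute repeat-count differences.
        distances = [
            sum(abs(x - y) for x, y in zip(row, other)) for other in haplotypes
        ]
        # PHYLIP expects the taxon name in the first 10 characters of the row.
        label = f"{name[:10]:<10}"
        lines.append(label + "".join(f"{d:8.4f}" for d in distances))
    return "\n".join(lines) + "\n"


# Hypothetical 4-marker haplotypes, for illustration only.
names = ["H10640", "H20001", "H20002"]
haplotypes = [
    [13, 24, 13, 10],
    [13, 25, 13, 11],
    [14, 24, 13, 10],
]
matrix_text = phylip_distance_matrix(names, haplotypes)
print(matrix_text, end="")
```

The resulting text would be saved as the input file for the distance program; how the real project file was prepared is not stated here.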
Sorry Victor, by duplicated I meant out on a similar limb, so to speak. I was not clear. I am trying to figure out what the meaning of the "perspective" would be, if any.
Yes, I would like the ID numbers for all the haplotypes in the diagram. Do you need my email address?
Thanks,
Rick
Rick,
Haplotype 10640 is indeed out on a similar limb, but the new diagram inverted the positions of the two main clusters. It is as if we were looking from the opposite side. As I explained, when new records are added to the haplotype dataset, the Fluxus software rearranges the diagram as it inserts the new data. Also, although it isn't very clear in the images, and unlike the plain phylogenetic trees or cladograms, the network in these diagrams is supposed to be a three-dimensional structure.
Please send me a private message with an email address where I can send you the file with the haplotype numbers.
Victor, I think I now understand the Fluxus diagram. I need the haplotype labels or at least mine.
Actually, 10640 is duplicated in the latest diagram, but the perspective is a little different. Every time new records are added, the Fluxus software rearranges the relative positions of each haplotype.
If you want, I can send you a small PDF file that shows the ID numbers of all haplotypes in the diagram. The same goes for anyone else interested.
I'd be interested in your deep clade results. There are some others of us who are also expecting to hear news in the coming weeks.
Victor
I touched on this point in another thread. IMHO the 12-marker level is insufficient data for building the cladogram. You would need an incredibly large sample for good results at only the 12-marker level.
I applaud your efforts, and your willingness to share. It is my hope that in the near future the Genographic Project will start putting out the same type of information for the global tree. It will be interesting to see if your results concur with theirs...
Right. I'm aware of discussions elsewhere about the limited value of 12-marker haplotypes for distinctly resolving the sub-branching of haplogroup E3b and/or other haplogroups.
However, it could also be that what matters isn't only the count of markers but which markers are selected for the analysis. For example, in the study "Phylogeographic Analysis of Haplogroup E3b (E-M215) Y Chromosomes Reveals Multiple Migratory Events Within and Out Of Africa", the researchers use only 11 markers, as the quote below shows, and still they are able to plot the structure of the haplogroups, even the four sub-clusters within E-M78.
We further typed 509 of the 515 E3b subjects for seven GATA STR (A7.1, A7.2, and A10 [White et al. 1999]; DYS19, DYS391, and DYS393 [Roewer et al. 1992, 1996]; and DYS439 [Ayub et al. 2000]) and four CA dinucleotide repeat (YCAIIa, YCAIIb, DYS413a, and DYS413b [Mathias et al. 1994; Malaspina et al. 1997]) polymorphisms.
Even selecting from the 37-marker panel (*) there would still be four markers missing. (Can someone verify if I got the equivalent names right?)
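As a rough illustration of checking a record against a chosen marker set, the Python sketch below restricts a haplotype to the 11 markers named in the quote and reports which ones are absent. The record and its values are hypothetical, and the sketch does no alias resolution, so a marker that a lab reports under an equivalent name would show up as missing here.

```python
# The 11 markers named in the quoted passage, in the order given there.
STUDY_MARKERS = [
    "A7.1", "A7.2", "A10", "DYS19", "DYS391", "DYS393", "DYS439",
    "YCAIIa", "YCAIIb", "DYS413a", "DYS413b",
]


def restrict_to_markers(record, wanted):
    """Split a marker->value record into the wanted markers present and those missing."""
    present = {marker: record[marker] for marker in wanted if marker in record}
    missing = [marker for marker in wanted if marker not in record]
    return present, missing


# Hypothetical record with made-up values, for illustration only.
record = {
    "DYS19": 13, "DYS391": 10, "DYS393": 13, "DYS439": 12,
    "YCAIIa": 19, "YCAIIb": 21, "DYS385a": 16,
}
present, missing = restrict_to_markers(record, STUDY_MARKERS)
print("usable:", list(present))
print("missing:", missing)
```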
As to the Genographic Project making their data available in some shape or form, that would be great, although I doubt they will spontaneously do it. That's why we're motivated (on a very small scale) to make our feeble attempts and try to find some sense in all of this as best we can.
I do not see it duplicated in later diagrams.
Rick
In other words, all our haplotypes are very similar at the 12-marker level regardless of what subclades we belong to. It is at a higher number of markers that the differences (or genetic distance) start to show.