Can someone confirm my suspicions about the sort order within subgroups on the Y-DNA results pages? It's automatic, right, with no way to change it?
I think it's doing a numeric sort on the columns from left to right, which would be simple and sensible, except for multi-copy markers, where it sadly and inconsistently seems to revert to lexicographical order.
E.g. looking at the FGC32899 subgroup under the M222 project (as it is now): in DYS439 the first kit is 11 and the next three are 12. So far so good. But the next difference is in DYS459b, where the first three kits are 9-10 and the fourth is 9-9. Numerically 9-9 should come before 9-10, but lexicographically "9-9" > "9-10", because strings are compared character by character and "9" sorts after "1".
Likewise, in the Kennedy project, I see:
29 | 17 | 9-10
29 | 17 | 9-10
29 | 17 | 9-11
29 | 17 | 9-9 ???
29 | 17 | 9-9
It might require some extra logic to compare these markers where the copy numbers differ, but sorting multicopy markers lexicographically is lazy, wrong, and inconsistent.
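To show what I mean, here is a minimal Python sketch of the two orderings, using the Kennedy rows above. The names marker_key and row_key are my own, and I am only guessing at how the site sorts. The rule for differing copy numbers (a shorter value sorts before a longer one that starts with the same copies) is just one possible choice.

```python
def marker_key(value):
    """Turn a marker value such as '9-10' into (9, 10) so copies compare as integers."""
    return tuple(int(part) for part in value.split("-"))


def row_key(row):
    """Build a sort key for a whole row by converting every column left to right."""
    return tuple(marker_key(cell) for cell in row)


rows = [
    ["29", "17", "9-10"],
    ["29", "17", "9-10"],
    ["29", "17", "9-11"],
    ["29", "17", "9-9"],
    ["29", "17", "9-9"],
]

# Plain string comparison reproduces the order seen on the results page.
lexicographic = sorted(rows)

# Comparing the copies as integers puts 9-9 ahead of 9-10 and 9-11.
numeric = sorted(rows, key=row_key)

for row in numeric:
    print(" | ".join(row))
```

With tuple keys, a value with fewer copies sorts ahead of a longer one that begins with the same numbers, so 15-15-17 would come before 15-15-17-17. A different rule for that case is easy to swap in.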
This would just be an annoyance with the results presentation and grouping, but it also makes me worry that genetic distance computations on multicopy markers are wrong too. I have one kit with an uncommon 6-copy DYS464 that has a distance 1 match when comparing 12 markers (for a 37-marker test), and that match doesn't show up at all under 37 markers.