Announcement

Collapse
No announcement yet.

Add Columns for Polynomic Markers

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Add Columns for Polynomic Markers

    A request to FTDNA, and if anyone has a workaround, I'd be grateful.

    In GAP, you can download .csv files for STR results. However, because the results for polynomic markers are in one cell, Excel converts many of those with two values to dates, e.g. 12-13 becomes 13-Dec. I've tried all of the conversion tricks that I could find online, and the only way I've found to correct this is by tediously going through each entry and making the changes by hand. This is not feasible for projects with hundreds of members or more.

    Would in be possible to convert the polynomic marker values to separate columns for the download?

    And again, if anyone has a trick for fixing this please let us know!

    Jim

  • #2
    Originally posted by jbarry6899 View Post
    And again, if anyone has a trick for fixing this please let us know!
    Five years ago, someone gave me an Excel macro to split those multivalue alleles into separate columns. The macro is called FTDNA_MCSplit_v2 and still works correctly in Excel 2016.

    If you can't find it online, I can give you the person's email address via PM.

    Comment


    • #3
      thanks, Larry. I couldn't find it online but did find some rootsweb messages from Bob May who seems to be the creator. I just sent him an email. I appreciate your help.

      Jim

      Comment


      • #4
        Originally posted by jbarry6899 View Post
        A request to FTDNA, and if anyone has a workaround, I'd be grateful.

        In GAP, you can download .csv files for STR results. However, because the results for polynomic markers are in one cell, Excel converts many of those with two values to dates, e.g. 12-13 becomes 13-Dec. I've tried all of the conversion tricks that I could find online, and the only way I've found to correct this is by tediously going through each entry and making the changes by hand. This is not feasible for projects with hundreds of members or more.

        Would in be possible to convert the polynomic marker values to separate columns for the download?

        And again, if anyone has a trick for fixing this please let us know!

        Jim
        @Jim, I had thought that I resolved that very issue a year ago, so I wanted to replicate it today and cannot see a page in my GAP where I could download STR results as CSV... Can you point me to the right URL?

        Mr W

        Comment


        • #5
          Originally posted by dna View Post
          @Jim, I had thought that I resolved that very issue a year ago, so I wanted to replicate it today and cannot see a page in my GAP where I could download STR results as CSV... Can you point me to the right URL?

          Mr W
          Go to GAP, Project Administration, Download Files. This is the url in our project: https://gap.familytreedna.com/download-files.aspx

          I have not been able to contact the creator of the macro and spent most of last evening banging my head against the wall trying every combination of formatting and importing that I could find. Any help on this would be gratefully appreciated.

          FYI, what I want to do is submit the marker values to the McGee utility to get a check on the clustering of results: http://www.mymcgee.com/tools/yutility.html

          Jim

          Comment


          • #6
            Originally posted by jbarry6899 View Post
            A request to FTDNA, and if anyone has a workaround, I'd be grateful.

            In GAP, you can download .csv files for STR results. However, because the results for polynomic markers are in one cell, Excel converts many of those with two values to dates, e.g. 12-13 becomes 13-Dec. I've tried all of the conversion tricks that I could find online, and the only way I've found to correct this is by tediously going through each entry and making the changes by hand. This is not feasible for projects with hundreds of members or more.

            Would in be possible to convert the polynomic marker values to separate columns for the download?

            And again, if anyone has a trick for fixing this please let us know!

            Jim
            Originally posted by jbarry6899 View Post
            Go to GAP, Project Administration, Download Files. This is the url in our project: https://gap.familytreedna.com/download-files.aspx

            I have not been able to contact the creator of the macro and spent most of last evening banging my head against the wall trying every combination of formatting and importing that I could find. Any help on this would be gratefully appreciated.

            FYI, what I want to do is submit the marker values to the McGee utility to get a check on the clustering of results: http://www.mymcgee.com/tools/yutility.html

            Jim
            I have no Excel macro. As it was easy for me I have a crude script that can be run on Linux, Terminal on Mac, or using Cygwin under Windows. I will share it in the next post. It takes a CSV file and makes another CSV file that has all markers values in separate columns.

            About imports, there is only a tiny chance that we have the same Excel versions, however since at least Office 2003 you can properly import entries such as "12-19" as text. Open a blank sheet. Select import data. Then on one of the screen you have to manually select each of the 7 column separately and designate that it contains Text and not anything else. There is no global setting (at least in your Excel version and mine).

            FTDNA actually has an article in the Learning Center describing why they are going to list multi-copy markers in single columns: https://www.familytreedna.com/learn/...single-column/ In GAP 2.0, why are multi-copy STR markers shown in a single column?

            Mr W

            P.S.
            If you had found the macro, please share with us, since to most people that would be the easiest route.

            Comment


            • #7
              Code:
              #!/bin/sh
              new_name="`echo $1 | sed -e 's/\./_single-column./'`"
              
              cut '-d"' -f1-13 < $1 > /tmp/$$_file1_$$
              
              cut '-d"' -f14- < $1 | sed \
               -e s/Y-GATA-H4/Y_GATA_H4/ \
               -e s/Y-GATA-A10/Y_GATA_A10/ \
               -e s/Y-GGAAT-1B07/Y_GGAAT_1B07/ \
               -e 's/DYS385/DYS385a","DYS385b/' \
               -e 's/DYS459/DYS459a","DYS459b/' \
               -e 's/DYS464/DYS464a","DYS464b","DYS464c","DYS464d/' \
               -e 's/YCAII/YCAIIa","YCAIIb/' \
               -e 's/CDY/CDYa","CDYb/' \
               -e 's/DYF395S1/DYF395S1a","DYF395S1b/' \
               -e 's/DYS413/DYS413a","DYS413b/' \
               | sed -e 's/-/","/g' \
               -e s/Y_GATA_H4/Y-GATA-H4/ \
               -e s/Y_GATA_A10/Y-GATA-A10/ \
               -e s/Y_GGAAT_1B07/Y-GGAAT-1B07/ > /tmp/$$_file2_$$
              
              paste /tmp/$$_file1_$$ /tmp/$$_file2_$$ | tr -d '\011' > $new_name
              
              rm /tmp/$$_file1_$$ /tmp/$$_file2_$$
              The code is very crude since, for example, it
              • assumes file names with no space and weird characters
              • does not count how many copies of STRs are present
              • has no error reporting
              • does not display of the output file name (the same as original with _single-column added)


              The file needs to be made executable, and takes a single CSV file as an argument.

              Mr W

              P.S.
              I have developed it last year to count differences between arbitrary results and had not used it since, but thanks to you pointing out the McGee utility, I am using it again. Thanks for the pointer!

              Comment


              • #8
                Originally posted by jbarry6899 View Post
                A request to FTDNA, and if anyone has a workaround, I'd be grateful.

                In GAP, you can download .csv files for STR results. However, because the results for polynomic markers are in one cell, Excel converts many of those with two values to dates, e.g. 12-13 becomes 13-Dec. I've tried all of the conversion tricks that I could find online, and the only way I've found to correct this is by tediously going through each entry and making the changes by hand. This is not feasible for projects with hundreds of members or more.

                Would in be possible to convert the polynomic marker values to separate columns for the download?

                And again, if anyone has a trick for fixing this please let us know!

                Jim
                I use excel 2013
                Download excel file (.xml file instead of the .csv file) and save to your computer
                Open Excel
                In excel go to File > Open > then browse to saved file

                the xml file does not covert the polynomic markers dates

                Once opened in Excel, You can either save it as an excel spreadsheet or leave it as an .xml file

                If you would like to have these values in their own separate column, insert the appropriate number of columns beside column of interest, highlight multi copy marker column and go to Data>Text to Columns>delimited>next
                select other and enter -
                This will put each value in its own column
                Last edited by prairielad; 19 December 2015, 05:14 PM.

                Comment


                • #9
                  Originally posted by prairielad View Post
                  I use excel 2013
                  Download excel file (.xml file instead of the .csv file) and save to your computer
                  Open Excel
                  In excel go to File > Open > then browse to saved file

                  the xml file does not covert the polynomic markers dates

                  Once opened in Excel, You can either save it as an excel spreadsheet or leave it as an .xml file

                  I
                  File won't open using Excel 2007 for Windows. Get error message that there is a problem with style. Opening it directly from the download folder yields this:

                  This XML file does not appear to have any style information associated with it. The document tree is shown below.
                  <?mso-application progid='Excel.Sheet'?>
                  <s:Workbook xmlns:x="urn:schemas-microsoft-comffice:excel" xmlns="urn:schemas-microsoft-comfficeffice" xmlns:s="urn:schemas-microsoft-comffice:spreadsheet">
                  <s:Styles>
                  <s:Style s:ID="HeaderStyle">
                  <s:Alignment s:Horizontal="Center"/>
                  <s:Font s:Bold="1"/>
                  </s:Style>
                  </s:Styles>
                  <s:Worksheet s:Name="Barry_YDNA_Results">
                  <s:Table>
                  <s:Row>

                  Etc.

                  I'll try running the program suggested by dna.

                  Jim

                  Comment


                  • #10
                    Originally posted by jbarry6899 View Post
                    File won't open using Excel 2007 for Windows. Get error message that there is a problem with style. Opening it directly from the download folder yields this:

                    This XML file does not appear to have any style information associated with it. The document tree is shown below.
                    <?mso-application progid='Excel.Sheet'?>
                    <s:Workbook xmlns:x="urn:schemas-microsoft-comffice:excel" xmlns="urn:schemas-microsoft-comfficeffice" xmlns:s="urn:schemas-microsoft-comffice:spreadsheet">
                    <s:Styles>
                    <s:Style s:ID="HeaderStyle">
                    <s:Alignment s:Horizontal="Center"/>
                    <s:Font s:Bold="1"/>
                    </s:Style>
                    </s:Styles>
                    <s:Worksheet s:Name="Barry_YDNA_Results">
                    <s:Table>
                    <s:Row>

                    Etc.

                    I'll try running the program suggested by dna.

                    Jim
                    True, you can not open file directly, unless you can set excel as the default program for opening XML files. Right clicking file and choosing Open with may give the option of Excel.

                    For 2007 excel try the following

                    Open blank workbook
                    Go to Data Tab
                    In Get External Data box (upper left side) select from Other Sources
                    Select From XML Data Import
                    Select your downloaded XML file
                    click OK on "the specified XML source..." message
                    select "Existing Workbook"

                    Code:
                    https://www.youtube.com/watch?v=r0S2CQd46C4
                    Last edited by prairielad; 19 December 2015, 06:27 PM.

                    Comment


                    • #11
                      Originally posted by prairielad View Post
                      True, you can not open file directly, unless you can set excel as the default program for opening XML files. Right clicking file and choosing Open with may give the option of Excel.

                      For 2007 excel try the following

                      Open blank workbook
                      Go to Data Tab
                      In Get External Data box (upper left side) select from Other Sources
                      Select From XML Data Import
                      Select your downloaded XML file
                      click OK on "the specified XML source..." message
                      select "Existing Workbook"

                      Code:
                      https://www.youtube.com/watch?v=r0S2CQd46C4
                      Unfortunately neither of those options worked. I'm giving up for now. Thanks a lot for your persistence.

                      Comment


                      • #12
                        only thing I can think of offhand is maybe the file is corrupt, did you try deleting file and downloading a fresh copy?

                        Maybe someone else with Excel 2007 can see if they can open xml file successfully....

                        Comment


                        • #13
                          Originally posted by prairielad View Post
                          only thing I can think of offhand is maybe the file is corrupt, did you try deleting file and downloading a fresh copy?

                          Maybe someone else with Excel 2007 can see if they can open xml file successfully....
                          Yes, I've tried several downloads and different browsers. But thanks again.

                          Comment


                          • #14
                            the other option is to edit the csv file in notepad/Programers Notepad (ctrl+H)


                            Find "DYS385" and Replace with "DYS385",""
                            Find "DYS459" and Replace with "DYS459",""
                            Find "DYS464" and Replace with "DYS464","","","" (number of ,"" depends on how many values present in multicopy marker values, this example is for 4 values ie 14-15-15-17)
                            Find "YCAII" and Replace with "YCAII",""
                            Find "CDY" and Replace with "CDY",""
                            Find "DYF395S1" and Replace with "DYF395S1",""
                            Find "DYS413" and Replace with "DYS413",""

                            Change following marker names temporarily so you can edit the - in multicopy marker values
                            Find Y-GATA-H4 and Replace with YrGATArH4
                            Find Y-GATA-A10 and Replace with YrGATAr10
                            Find Y-GGAAT-1B07 and Replace with YrGGAATr1B07

                            Replace All - with ","

                            Change back above markers
                            Find YrGATArH4 and replace with Y-GATA-H4
                            Find YrGATAr10 and replace with Y-GATA-10
                            Find YrGGAATr1B07 and replace with Y-GGAAT-1B07

                            Due to replacing - , Haplogroup assignments will also be separated in to 2 columns ie R-Z283 will place R in one column and Z283 in another.

                            You will have to also Find "Haplogroup" and Replace with "Haplogroup",""

                            You can edit Haplogroup assignment back using a simple excel formula after if so wish,ie Concatenate Formula =concatenate(cell 1,"-",cell 2)

                            Save file and open in excel
                            may have to adjust the end values for the IsHaploPredicted and HaploShort columns

                            Bit of work.....

                            I am not familiar with macros, so will be no help there.
                            Last edited by prairielad; 19 December 2015, 08:51 PM.

                            Comment


                            • #15
                              above will not work if there are other - in the document as it will mess up spacing by separating whats before and after - into adjacent columns

                              ie in b.c.1775-1795 Tyrone, Ireland of kit holders description it will separate 1775 and 1795 shifting all values in that row right taking them out of alignment.

                              So it depends on what your csv consists of to whether editing is feasible.

                              Comment

                              Working...
                              X