Problem with FASTA file on mtdna page

  • Problem with FASTA file on mtdna page

    I am having trouble with the FASTA file of the results of my mtdna full sequence, on the mtdna results page of my FTDNA site.

    My kit number is 103192, if any FTDNA staff happen to be monitoring.

    This is the entire code of a full sequence - all 16,500 or so base pairs, not the mutation list of HVR1 and HVR2.

    As far as I know, and I triple checked, a FASTA file is a simple text file containing a genetic sequence, in the case of mtdna the 16,500 or so base pairs that make up the DNA sequence of mtdna. It consists of a single line of descriptive information, followed by the genetic code itself, line after line, with carriage returns, and nothing else. It can be read and edited like any other text file.
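A minimal sketch of reading such a file, in Python. It assumes the usual FASTA convention that the descriptive line begins with `>` and that the file holds a single record. The function name `parse_fasta` is made up for illustration, and the file name in the usage note is a placeholder. Whitespace inside sequence lines is dropped, since pasted sequences sometimes pick up stray spaces.

```python
def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    # Drop blank lines and surrounding whitespace (including carriage returns).
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a header line starting with '>'")
    header = lines[0][1:].strip()
    # Join all sequence lines, removing any spaces inside a line.
    sequence = "".join("".join(line.split()) for line in lines[1:]).upper()
    return header, sequence


if __name__ == "__main__":
    demo = ">toy record\nGATC ACAG\nGTCT\n"
    header, sequence = parse_fasta(demo)
    print(header, len(sequence), sequence)
```

To use it on a downloaded file, read the file as ordinary text and pass the contents in, for example `parse_fasta(open("my_mtdna.fasta").read())`.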

    The mtdna code in FASTA format on the Genbank web site corresponds to the CRS except for the mutations.

    All instructions I've seen involving FASTA files assume that the letters are the actual code, and not some coding of the code that can only be read with a specialized program or something. For one thing, instructions on how to prepare a FASTA file as part of the process of preparing a Genbank submission say to identify the positions where mutations are, change the letter to the mutation, change the color of that letter to red, and things of that sort.

    The FASTA file I downloaded from my mtdna results page contains nearly the same number of base pairs as the complete sequence - give or take about ten characters, and it consists entirely of the right four letters, but it bears no resemblance whatever to the CRS. Considering whether the actual code might somehow start in a different place did not make a difference.

    Does FTDNA use some strange format, did they screw up the file, or do I need to do something to the file that I didn't know about?

    Here are the first five lines of it, followed by the first few lines of the CRS.
    ATTCTAATTTAAACTATTCTCTGTTCTTTCATGGGGAAGCAGATTTGGGT ACCACCCAAGTATTGACTCACCCATCAACA
    ACCGCTATGTATTTCGTACATTACTGCCAGCCACCATGAATATTGTACGG TACCATAAATACTTGACCACCTGTAGTACA
    TAAAAACCCAATCCACATCAAAACCCCCTCCCCATGCTTACAAGCAAGTA CAGCAATCAACCCTCAACTATCACACATCA
    ACTGCAACTCCAAAGCCACCCCTCACCCACTAGGATACCAACAAACCTAC CCACCCTTAACAGTACATAGTACATAAAGC
    CATTTACCGTACATAGCACATTACAGTCAAATCCCTTCTCGTCCCCATGG ATGACCCCCCTCAGATAGGGGTCCCTTGAC

    First few lines of CRS:

    GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCAT TTGGTATTTTCGTCTGGGGG
    GTATGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC GCAGTATCTGTCTTTGATTC
    CTGCCTCATCCTATTATTTATCGCACCTACGTTCAATATTACAGGCGAAC ATACTTACTAAAGTGTGTTA
    ATTAATTAATGCTTGTAGGACATAATAATAACAATTGAATGTCTGCACAG CCACTTTCCACACAGACATC
    ATAACAAAAAATTTCCACCAAACCCCCCCTCCCCCGCTTCTGGCCACAGC ACTTAAACACATCTCTGCCA

    Yours,
    Dora Smith

  • #2
    Dora,

    The first sequence is identical to the CRS between positions 16001 and 16400. The second one is the CRS too, covering positions 1-350.
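One way to check this kind of thing without eyeballing it is to search the downloaded sequence for the opening bases of the reference. The sketch below is only an illustration: `find_origin` and `probe_len` are hypothetical names, the strings are toy data rather than real mtdna, and an exact search will miss if a mutation falls inside the probe, in which case a probe taken from elsewhere in the reference would be needed.

```python
def find_origin(sample, reference, probe_len=20):
    """Return the 0-based index in `sample` where the start of `reference`
    appears, or -1 if it is not found. The sample is treated as circular."""
    probe = reference[:probe_len]
    # Append the head of the sample so a match that wraps past the end is found.
    doubled = sample + sample[:probe_len - 1]
    return doubled.find(probe)


if __name__ == "__main__":
    reference = "GATCACAGGTCTTTGG"        # toy reference
    sample = "TTGG" + "GATCACAGGTCT"      # same bases, tail block moved to the front
    print(find_origin(sample, reference, probe_len=8))
```

A result of 0 would mean the sample already starts where the reference does. Any other non-negative result is the number of leading bases that belong at the end.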

    • #3
      Someone answered my question. The answer is that positions 16001 (I think he said) through the end were listed at the beginning and had to be moved back to the end.

      Yours,
      Dora Smith
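The fix described above, moving the leading block back to the end, can be sketched in a few lines of Python. This assumes the number of misplaced leading bases is already known. `rotate_to_origin` and `wrap` are made-up helper names, the 70-character line width is just a common choice, and the demo uses toy data.

```python
def rotate_to_origin(sequence, offset):
    """Move the first `offset` bases of `sequence` to the end."""
    if not sequence:
        return sequence
    offset %= len(sequence)
    return sequence[offset:] + sequence[:offset]


def wrap(sequence, width=70):
    """Split a sequence into fixed-width lines for writing back to a FASTA file."""
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


if __name__ == "__main__":
    toy = "TTGG" + "GATCACAG"            # the block "TTGG" belongs at the end
    fixed = rotate_to_origin(toy, 4)
    print(fixed)
    print("\n".join([">toy record"] + wrap(fixed, width=5)))
```

The total length never changes, so the result can be compared position by position against the CRS afterwards.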
