I am having trouble with the FASTA file of my mtDNA full sequence results, which I downloaded from the mtDNA results page of my FTDNA site.
My kit number is 103192, if any FTDNA staff happen to be monitoring.
This is the entire code of a full sequence - all 16,500 or so base pairs, not the mutation list of HVR1 and HVR2.
As far as I know, and I triple checked, a FASTA file is a simple text file containing a genetic sequence; in the case of mtDNA, that is the 16,500 or so base pairs that make up the mitochondrial DNA sequence. It consists of a single line of descriptive information, followed by the genetic code itself, line after line, with carriage returns, and nothing else. It can be read and edited like any other text file.
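To illustrate the layout described above, here is a minimal Python sketch that reads one FASTA record from plain text. The function name `read_fasta` and the header text `hypothetical_sample` are made up for this example. It assumes the descriptive line starts with `>`, which is the usual FASTA convention, and it strips any spaces inside sequence lines, since pasted sequences often pick them up.

```python
def read_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; stop after the first
            header = line[1:].strip()
        else:
            # Drop any internal whitespace and normalise to upper case.
            chunks.append("".join(line.split()).upper())
    return header, "".join(chunks)


# Hypothetical two-line record; the header text is an assumption.
example = """>hypothetical_sample mtDNA full sequence
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCAT TTGGTATTTTCGTCTGGGGG
GTATGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC
"""

header, seq = read_fasta(example)
print(header)
print(len(seq), "bases; only A/C/G/T:", set(seq) <= set("ACGT"))
```

Nothing here is specific to FTDNA or GenBank; it only shows that the file is ordinary text that can be joined into one continuous string of letters.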
The mtDNA code in FASTA format on the GenBank web site corresponds to the CRS except for the mutations.
All instructions I've seen involving FASTA files assume that the letters are the actual code, and not some encoding of the code that can only be read with a specialized program. For one thing, the instructions on how to prepare a FASTA file for a GenBank submission say to identify the positions where the mutations are, change the letter to the mutation, change the color of that letter to red, and things of that sort.
The FASTA file I downloaded from my mtDNA results page contains nearly the same number of base pairs as the complete sequence, give or take about ten characters, and it consists entirely of the right four letters, but it bears no resemblance whatever to the CRS. I also considered whether the sequence might simply start at a different position, but that did not make a difference.
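Because mtDNA is circular, a file could in principle hold the same sequence starting at a different position, or the opposite strand. The sketch below is one rough way to test for that in Python: it takes short exact "seeds" from the sample and looks for them in the reference, treating the reference as a circle, and repeats the search with the reverse complement. The function names, the seed length, and the toy sequences are assumptions for illustration only. This is exact string matching, not a real alignment, so insertions or deletions before a seed will shift the reported offset.

```python
def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_circular_offset(sample, reference, seed_len=20):
    """Return (strand, offset) where the sample appears to start in the
    circular reference, or None if no seed matches exactly."""
    # Append the start of the reference so matches can wrap around the end.
    circular = reference + reference[:seed_len - 1]
    candidates = (("forward", sample), ("reverse", reverse_complement(sample)))
    for strand, s in candidates:
        # Try several seeds so that a mutation inside one seed
        # does not hide an otherwise good match.
        for start in range(0, len(s) - seed_len + 1, seed_len):
            pos = circular.find(s[start:start + seed_len])
            if pos != -1:
                return strand, (pos - start) % len(reference)
    return None


# Toy data: a 40-base "reference" and the same bases rotated by 25.
reference = "GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCT"
sample = reference[25:] + reference[:25]
print(find_circular_offset(sample, reference, seed_len=10))
```

With real data, `sample` and `reference` would be the two full sequences with the header line and all whitespace removed. A result of `None` would only mean that none of the tried seeds matched exactly.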
Does FTDNA use some strange format, did they screw up the file, or do I need to do something to the file that I didn't know about?
Here are the first five lines of it, followed by the first few lines of the CRS.
ATTCTAATTTAAACTATTCTCTGTTCTTTCATGGGGAAGCAGATTTGGGT ACCACCCAAGTATTGACTCACCCATCAACA
ACCGCTATGTATTTCGTACATTACTGCCAGCCACCATGAATATTGTACGG TACCATAAATACTTGACCACCTGTAGTACA
TAAAAACCCAATCCACATCAAAACCCCCTCCCCATGCTTACAAGCAAGTA CAGCAATCAACCCTCAACTATCACACATCA
ACTGCAACTCCAAAGCCACCCCTCACCCACTAGGATACCAACAAACCTAC CCACCCTTAACAGTACATAGTACATAAAGC
CATTTACCGTACATAGCACATTACAGTCAAATCCCTTCTCGTCCCCATGG ATGACCCCCCTCAGATAGGGGTCCCTTGAC
First few lines of CRS:
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCAT TTGGTATTTTCGTCTGGGGG
GTATGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC GCAGTATCTGTCTTTGATTC
CTGCCTCATCCTATTATTTATCGCACCTACGTTCAATATTACAGGCGAAC ATACTTACTAAAGTGTGTTA
ATTAATTAATGCTTGTAGGACATAATAATAACAATTGAATGTCTGCACAG CCACTTTCCACACAGACATC
ATAACAAAAAATTTCCACCAAACCCCCCCTCCCCCGCTTCTGGCCACAGC ACTTAAACACATCTCTGCCA
Yours,
Dora Smith