SEQanswers

06-25-2012, 04:57 PM   #1
Applemelon
Junior Member
 
Location: China

Join Date: Jun 2012
Posts: 4
What is the specification of dbSNP?

I downloaded a file from NCBI dbSNP whose format resembles FASTA. Here is one example:

>gnl|dbSNP|ss244318098 ss=244318098|pos=394|len=894|handle="BGI"|subid="Gm01-394"|taxid=3847|mol="Genomic"|class=1|alleles="A/C"
GGTTTGGTGTTTGGGTTTTAGGTTTTAGGTTTTAGGTTTTACGGTTTAGGGTTTATGGTTTATGGTTTAGGGTTTAGGGT
TAGGAAATAATTTGGGTCTTTCATCTTTCAACAAAAAATTAAGGGATTTAGAGTAATTTTTAGGGTTTAGGGTTTAAGGT
TTTAGGTTTCGGGTTTGGGTTTTAGATTTTACGGCTTACGGTTTAAAGTTTAGGGGTTAGGGTTTAGGGTTTAGAAATAA
ATTTGAGTGTTTGACATTTGAACACAAAATTAAGGCATTTAGAGTCATTTTTAGGGTTTACGGTTTAGGGTTTAGCAAGA
AATTTCGGTGTTTCATCTTCGAACACAAAATTAAGGCAGTTAAAGTCTTTTTTTGGGTTTAGGGTTTAGGGTT
M
TTTGCCTGGGTGTGCCAGTGGCGTGAGCAAATGGAGGGCGGCCATTTCTCATGTTTGGACGTCAAAGAACCCATAAAAAA
TAGTCCTGTTCCCCGGTTTCGTCAACTAACACGTAAAAACAATGCCTTAACACAAAATTAAGGCATTTAGAGGCATTTTT
AGGGTTTACGGTTTAGGGTTTACCAAGAAATTTCGGTGTTTCATCTTTGAACACAAAATTAAGGCAGTTAAAGTCTTTTT

I am confused by "class=1" in the header and the "M" in the body. What do they mean? And how can I convert this format to VCF? I can't find a VCF file for soybean. Thank you very much.
06-25-2012, 11:29 PM   #2
ulz_peter
Senior Member
 
Location: Graz, Austria

Join Date: Feb 2010
Posts: 219

The lone "M" in the middle of the sequence is an IUPAC ambiguity code. These codes are used when more than one base can occur at the same position. In this case the letter M is short for aMino, which means the base at that position can be either A or C. That matches alleles="A/C" in the header.
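
To make the ambiguity code concrete, here is a minimal Python sketch of the standard IUPAC nucleotide table. The names IUPAC, expand and code_for are just illustrative, not part of any dbSNP tool; the sketch only shows that "M" and the header's alleles="A/C" describe the same pair of bases.

```python
# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand(code):
    """Return the set of bases that an IUPAC code stands for."""
    return set(IUPAC[code.upper()])


def code_for(alleles):
    """Return the IUPAC code for an allele string such as 'A/C'.

    Assumes every allele is a single base; indels are not handled.
    """
    wanted = set(alleles.upper().split("/"))
    for code, bases in IUPAC.items():
        if set(bases) == wanted:
            return code
    raise ValueError("no IUPAC code for alleles: " + alleles)


print(sorted(expand("M")))
print(code_for("A/C"))
```

The two print calls show the mapping in both directions: from the code in the sequence to its bases, and from the header's allele string back to the code.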

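I don't think it is possible to generate a VCF file from this record alone, since it states neither a chromosome nor a genomic position for the SNP, and both are necessary for creating a VCF. (The pos=394 in the header appears to be the offset of the SNP within the flanking sequence: 393 bases precede the "M" in your example.)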

If you've got a lot of time, you might align those sequences to the reference genome and then annotate the missing information to get a VCF file, but there might be easier solutions for that.
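
As a starting point for the alignment route, here is a rough Python sketch that splits one such record into its header fields and the two flanking sequences, which is the input an aligner would need. It assumes that pos is a 1-based offset into the concatenated sequence (consistent with the example, where 393 bases precede the "M") and that header values contain no "|" characters. parse_dbsnp_record is a made-up name, not an NCBI tool, and the record in the usage lines is a shortened toy example.

```python
def parse_dbsnp_record(text):
    """Split one dbSNP FASTA-like record into fields and flanks.

    Assumption: 'pos' is the 1-based offset of the variant within
    the concatenated sequence lines.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header, seq_lines = lines[0], lines[1:]
    if not header.startswith(">"):
        raise ValueError("record does not start with '>'")

    # The header is '>ID key=value|key=value|...'
    record_id, _, attr_text = header[1:].partition(" ")
    fields = {}
    for item in attr_text.split("|"):
        key, sep, value = item.partition("=")
        if sep:
            fields[key.strip()] = value.strip().strip('"')

    sequence = "".join(seq_lines)
    index = int(fields["pos"]) - 1  # convert to a 0-based index
    return {
        "id": record_id,
        "fields": fields,
        "left_flank": sequence[:index],
        "variant": sequence[index],
        "right_flank": sequence[index + 1:],
    }


# Toy record with the same layout as the one in the question.
example = '''>gnl|dbSNP|ss1 ss=1|pos=5|len=9|handle="BGI"|class=1|alleles="A/C"
GGTT
M
TTGC
'''
rec = parse_dbsnp_record(example)
print(rec["left_flank"], rec["variant"], rec["right_flank"])
print(rec["fields"]["alleles"])
```

Once a flank has been aligned to the soybean reference, the hit would supply the chromosome and coordinate that the VCF CHROM and POS columns need; the REF and ALT alleles still have to be checked against the reference base and strand at that location.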