SEQanswers


xiangwulu 11-23-2015 05:33 AM

Non A T G C base in RefSeq zebrafish mRNA sequence
 
I found some non-A/T/G/C bases in a RefSeq zebrafish mRNA sequence:
'Y', 'K', 'R', 'M', 'W'

I couldn't find out what they mean.

e.g.
NM_001012366.1
http://www.ncbi.nlm.nih.gov/nuccore/...5?report=fasta
at line:
CCCCCACAGTCCCTGCATTACGGGAATGTGCAGGCAAGAGGAAGCGGTCTCAGGGAGAGGAGGMCGAAGG

This is causing errors in some alignment tools, such as BFAST.

Thanks.

GenoMax 11-23-2015 05:41 AM

Look at IUPAC codes for degenerate nucleotides: https://en.wikipedia.org/wiki/Nucleic_acid_notation
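
For quick reference, the letters in question are standard IUPAC ambiguity codes: each stands for a set of possible bases (for example, M means A or C). Below is a minimal Python sketch that lists the codes and reports where they occur in a sequence. The names `IUPAC_AMBIGUITY` and `find_ambiguous` are made up for this illustration and do not come from any library.

```python
# IUPAC nucleotide ambiguity codes and the bases each one can stand for.
IUPAC_AMBIGUITY = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_ambiguous(seq):
    """Return (0-based index, code, possible bases) for each ambiguity code."""
    hits = []
    for i, base in enumerate(seq.upper()):
        if base in IUPAC_AMBIGUITY:
            hits.append((i, base, IUPAC_AMBIGUITY[base]))
    return hits


# The line quoted in the first post (from NM_001012366.1).
line = "CCCCCACAGTCCCTGCATTACGGGAATGTGCAGGCAAGAGGAAGCGGTCTCAGGGAGAGGAGGMCGAAGG"

for index, code, bases in find_ambiguous(line):
    # Report 1-based positions, as is usual for sequence coordinates.
    print(f"position {index + 1}: {code} = {' or '.join(bases)}")
```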

xiangwulu 11-23-2015 05:58 AM

Thanks very much.

Brian Bushnell 11-23-2015 08:13 AM

You can use Reformat with the iupacton flag to convert non-ACGT symbols to N, like this:

reformat.sh in=mrna.fasta out=reformatted.fasta iupacton
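
If BBTools is not available, the same idea can be sketched in a few lines of Python. This is only an illustration of the masking step, not how Reformat is implemented. It assumes a plain FASTA file with `>` header lines, leaves A/C/G/T/N (either case) untouched, and turns every other sequence character into N, including gap characters if present. The function names `mask_iupac` and `mask_fasta` are hypothetical.

```python
import re

# Anything that is not A, C, G, T or N (upper or lower case) gets masked.
_NON_ACGTN = re.compile(r"[^ACGTNacgtn]")


def mask_iupac(lines):
    """Yield FASTA lines with non-ACGTN sequence symbols replaced by N."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Header lines are passed through unchanged.
            yield line
        else:
            yield _NON_ACGTN.sub("N", line)


def mask_fasta(in_path, out_path):
    """Read a FASTA file and write a masked copy to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in mask_iupac(src):
            dst.write(line + "\n")
```

With the file names from the command above, the call would be `mask_fasta("mrna.fasta", "reformatted.fasta")`.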


All times are GMT -8.