SEQanswers

Old 01-26-2011, 10:06 AM   #1
rdu
Member
 
Location: USA

Join Date: Aug 2010
Posts: 29
"N"s in DNA sequence data

I found that there are many "N"s in some initial DNA genome sequence data, like:

CTGAAATCACTACTTTCCTTGTTAGGCTCGGCGCATGTGTTAAGTAGNNN
NTTATANNCNGNNNAGNATTTATNNNNNNNNCTTNNNNNNCGGTTATATG

What's the explanation for them? I ran a BLASTX alignment and no error messages popped up, so how does BLAST treat them? Thanks.
Old 01-27-2011, 10:55 AM   #2
apratap
Member
 
Location: Bay Area

Join Date: Jan 2009
Posts: 58

Hi rdu

I always see these messy N's in Illumina data. I think they could mean a variety of different things, but this is just my interpretation. Mainly, I think that when the base caller is not able to decide which nucleotide to call, it leaves an "N" behind. If you see a bunch of N's at the end of a read, it could relate to falling quality. However, if you see a long stretch in the middle, it could also mean complexity in that genomic region.

HTH
-Abhi
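
To see where the N's fall (at the end versus in the middle), something like the following could help. This is only a minimal sketch: the function name n_summary is made up for illustration, and the sequence is the example from the first post joined into one string, which assumes the two lines belong to the same sequence.

```python
import re

def n_summary(seq):
    """Return (fraction of N, list of N runs) for one sequence.

    Runs are (start, end) pairs, 0-based and end-exclusive.
    """
    seq = seq.upper()
    runs = [(m.start(), m.end()) for m in re.finditer(r"N+", seq)]
    n_count = sum(end - start for start, end in runs)
    frac = n_count / len(seq) if seq else 0.0
    return frac, runs

# Example sequence from the first post, assumed to be one contiguous sequence.
read = ("CTGAAATCACTACTTTCCTTGTTAGGCTCGGCGCATGTGTTAAGTAGNNN"
        "NTTATANNCNGNNNAGNATTTATNNNNNNNNCTTNNNNNNCGGTTATATG")

frac, runs = n_summary(read)
longest = max(runs, key=lambda r: r[1] - r[0])
print(f"fraction N: {frac:.2f}, runs: {len(runs)}, longest run: {longest}")
```

A run that reaches the last position of a read would fit the falling-quality explanation; runs in the middle would need another explanation.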
Old 01-27-2011, 11:59 AM   #3
rdu
Member
 
Location: USA

Join Date: Aug 2010
Posts: 29

Hi apratap,

Thank you. Your answer sounds reasonable. If I learn anything further about this, I will share it with you. Have a nice day.
Old 01-27-2011, 12:40 PM   #4
aloliveira
Member
 
Location: Brazil

Join Date: Aug 2010
Posts: 46

Hello,

BLAST ignores those sequences: there is a low-complexity filter on by default in BLAST to avoid miscalled or masked bases (normally N or X, respectively). You can turn off that parameter if you want.

Best regards,
André.
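
For anyone who wants to try switching the filter off, here is a rough sketch that builds the command line. It assumes the BLAST+ version of blastx, where the query filter for translated searches is controlled by the -seg option (the older blastall program used -F F instead). The function name and the file and database names are placeholders, not anything from this thread.

```python
def blastx_command(query, db, out, seg_filter=True):
    """Build a BLAST+ blastx command as a list of arguments.

    With seg_filter=False, "-seg no" is added to disable the SEG
    low-complexity filter on the translated query.
    """
    cmd = ["blastx", "-query", query, "-db", db, "-out", out]
    if not seg_filter:
        cmd += ["-seg", "no"]
    return cmd

# Placeholder file and database names, for illustration only.
cmd = blastx_command("reads.fasta", "nr", "hits.txt", seg_filter=False)
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd) on a machine where BLAST+ is installed. Comparing the hits with and without the filter should show how much it affects the N-rich regions.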