SEQanswers

Old 05-26-2010, 02:23 PM   #1
thinkRNA
Member
 
Location: Carlsbad,CA

Join Date: Jan 2010
Posts: 94
Considering quality scores of reads when aligning

My reads have quality scores on the phred64 scale. When I use the --solexa1.3-quals option in TopHat I get an error, which is a known one (Error: could not execute prep_reads).

So, since I can't use this option, does it mean my reads get aligned without Bowtie taking the quality scores into consideration?

Finally, how does one decide what is a good quality score? What if a read has really low-quality bases in the seed region (the beginning of the read) but good ones towards the end, giving it a high overall quality score? In that case, I would like to throw the read out.

Does anyone have thoughts on what threshold to use?
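
A minimal Python sketch of the kind of filter described above, assuming phred64-encoded quality strings. The function name, the 28-base seed length and the Q20 cutoff are all illustrative assumptions, not recommended values:

```python
def seed_quality_ok(quality, seed_len=28, min_q=20, offset=64):
    """Return True if every base in the first seed_len positions has Q >= min_q.

    quality  : the quality line of one FASTQ record
    seed_len : hypothetical seed length (illustrative)
    min_q    : hypothetical per-base cutoff (illustrative)
    offset   : ASCII offset of the encoding (64 for Illumina 1.3+, 33 for Sanger)
    """
    seed = quality[:seed_len]
    return all(ord(ch) - offset >= min_q for ch in seed)


# 'h' encodes Q40 and 'B' encodes Q2 on the phred64 scale.
good_seed = "h" * 28 + "B" * 8   # good start, poor 3' end
bad_seed = "B" * 5 + "h" * 31    # poor start, good afterwards

print(seed_quality_ok(good_seed), seed_quality_ok(bad_seed))
```

The second read has the higher average quality but fails the check, which is the case the question is about: a per-read average hides where the bad bases are.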

Last edited by thinkRNA; 05-26-2010 at 03:19 PM.
Old 05-31-2010, 06:53 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

I'm not sure what exactly TopHat would do in this situation.

You could try converting your reads from the Illumina 1.3+ FASTQ file format (aka phred64) to a Sanger FASTQ file (aka phred33).

There are lots of tools to do this conversion (search the forum). I'm biased, but would suggest EMBOSS seqret for a command-line tool or, for a script-based solution, Biopython (use the function Bio.SeqIO.convert for this), BioPerl, etc.
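
To show what the conversion amounts to, here is a standard-library-only sketch that shifts one quality string from ASCII offset 64 to offset 33. The function name is made up for illustration. With Biopython installed, the whole-file equivalent should be Bio.SeqIO.convert("in.fastq", "fastq-illumina", "out.fastq", "fastq-sanger"), where the file names are placeholders:

```python
def illumina13_to_sanger(quality):
    """Re-encode a phred64 quality string as phred33.

    The PHRED scores themselves are unchanged; only the ASCII offset moves
    from 64 to 33.
    """
    converted = []
    for ch in quality:
        q = ord(ch) - 64
        if q < 0:
            # A character below '@' cannot be a valid Illumina 1.3+ score.
            raise ValueError("not a phred64 quality character: %r" % ch)
        converted.append(chr(q + 33))
    return "".join(converted)


# 'h' = Q40, 'B' = Q2, '@' = Q0 in phred64
print(illumina13_to_sanger("hhhB@"))  # III#!
```

Only the quality line of each FASTQ record changes; the sequence and header lines stay as they are.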

Last edited by maubp; 06-01-2010 at 07:17 AM. Reason: corrected a typo
Old 06-01-2010, 08:40 AM   #3
mattanswers
Member
 
Location: Boston

Join Date: Oct 2009
Posts: 65

At the following URL there is a graph relating Q score (Sanger and Solexa) to the probability of a wrong base call: http://en.wikipedia.org/wiki/FASTQ_format
Basically, I think anything above a Q (Solexa) score of 20, i.e. an error probability of about 0.01, is very acceptable. Between 20 and 13 the error probability climbs from about 0.01 to about 0.05, so at a Q score of around 13 there is roughly a 0.05 chance of a bad call.
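
For reference, the numbers above come from the standard definitions: Q = -10 log10(p) for PHRED (Sanger) scores and Q = -10 log10(p / (1 - p)) for Solexa scores. A small sketch inverting both (the function names are just for illustration):

```python
def phred_error_prob(q):
    """Error probability for a PHRED (Sanger) score: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def solexa_error_prob(q):
    """Error probability for a Solexa score: p = 1 / (1 + 10^(Q/10))."""
    return 1 / (1 + 10 ** (q / 10))


for q in (13, 20, 30):
    print(q, round(phred_error_prob(q), 4), round(solexa_error_prob(q), 4))
```

At Q13 the two scales give about 0.050 and 0.048, and by Q20 they are both close to 0.01, so in this range it makes little difference which definition is used.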
All times are GMT -8.