SEQanswers

11-22-2013, 06:50 AM   #1
rnastar
Member
 
Location: Boston, MA

Join Date: Aug 2013
Posts: 13
Quick Interpretation of FastQC output

Dear all,

I have attached a plot of the average per-base quality of reads in my sample.

Surprisingly, the 5' ends of my reads appear to have lower quality on average than the 3' ends.

Does this indicate that I need to trim my reads at the 5' end? No adapter contamination appears to have been detected. Any thoughts?

Thank you!
Attached Images
File Type: jpg per_base_quality.jpg (71.6 KB, 47 views)
11-22-2013, 07:02 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,138

Nothing to worry about. This looks very typical of an Illumina run. There should be no need to trim from the 5' end.
11-22-2013, 07:04 AM   #3
rnastar
Member
 
Location: Boston, MA

Join Date: Aug 2013
Posts: 13

Thanks! Any idea why we see the drop-off in quality at the 5' end, or what could be causing it?
11-22-2013, 07:20 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,138

It would be technically correct to call it a drop-off if you compare it to the rest of the plot. But you are still above Q30 for all bases, which is excellent quality data.
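
For reference, a Phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q30 means roughly 1 miscalled base in 1,000. A minimal Python sketch of the conversion (the function names are just illustrative, not from any particular library):

```python
import math


def phred_to_error_prob(q):
    """Estimated base-call error probability for a Phred score Q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for an estimated error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# Print the error probability for a few common thresholds.
for q in (20, 30, 40):
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```

So a dip that stays above Q30 still means an expected error rate below 0.1% at those positions.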

As RTA (Illumina's Real Time Analysis software) ramps up in the first 10-12 cycles, it is calculating different metrics (color matrix, phasing, etc.).
11-22-2013, 12:43 PM   #5
FWOS
Epigenomics NGS Beast
 
Location: New Jersey

Join Date: Oct 2010
Posts: 17

All Illumina HiSeq reads start out lower. This mostly improves after template generation (cycle 4). It is partly due to lower-quality bases that result from the prep process, and partly to do with how the per-base sequence quality is assessed. Once template generation is complete and the phasing/prephasing data are taken into account, the overall Q scores should hover above Q30 until the read length becomes a factor in per-base sequence quality.
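
If you want to check this on your own data outside FastQC, here is a rough Python sketch that computes the mean quality at each read position and lists the cycles that fall below Q30. It assumes Phred+33 encoded quality strings (Illumina 1.8+); the function names and the toy quality strings are made up for illustration and are not from the attached plot.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Mean Phred score at each read position (cycle).

    Assumes Phred+33 encoding; reads may differ in length.
    """
    totals, counts = [], []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(totals):
                # First time this position is seen: extend the accumulators.
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def positions_below(means, threshold=30):
    """1-based cycle numbers whose mean quality is below the threshold."""
    return [i + 1 for i, m in enumerate(means) if m < threshold]


# Toy quality strings (hypothetical, for illustration only).
toy = ["<<?IIIII", ":=?IIIIH"]
means = mean_quality_per_position(toy)
print(positions_below(means))
```

In a real run you would feed in the fourth line of every FASTQ record. Note that FastQC's plot also shows medians and quartiles, which this sketch does not compute.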