SEQanswers

04-13-2012, 07:36 AM   #1
Rocketknight
Member
 
Location: Ireland

Join Date: Sep 2011
Posts: 86
Unexpected FastQC results

Hi all, I was wondering if anyone could suggest an explanation for the somewhat weird FastQC results we got on our last run (human exome samples, Nimblegen v2 enrichment kit, 10 samples per lane at 2x100bp on a HiSeq 2000 with Version 3 chemistry). These are paired-end reads from the same lane. All samples in that lane showed the same pattern, though all of those samples were prepped at the same time too. The first reads seem completely fine, but the second reads tank in quality around positions 15-19. We did all sample prep in-house but outsourced the sequencing (since we don't have a HiSeq on site).

The reads also showed some moderate sequence duplication (~21%) and an AT-bias, both of which I presume are caused by over-amplification during sample prep, so I'm going to tone that down in future. I still feel like that doesn't explain the weird quality drop here, though. Can anyone suggest what might be causing it?


Last edited by Rocketknight; 04-13-2012 at 07:38 AM.
04-13-2012, 09:00 AM   #2
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,178

rk,

In my experience results like this indicate a problem with the sequencing run, not the library(ies), and typically something pretty mundane like a (partial) blockage of the reagent flow to the lane in question.
04-14-2012, 01:11 AM   #3
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

I suspect that if you look at the per-sequence quality plots you'll see that this is a subset of sequences with consistently poor quality, and that a significant proportion of your library is OK. If you look at the per-base plots, the medians are OK on the second read, so it's a subset of reads that is dragging down the lower quartile.
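To put a number on that subset outside of FastQC, a rough sketch like the one below could count how many reads in a file have a low mean quality. It is only an illustration: the helper names (`read_fastq`, `mean_phred`, `fraction_low_quality`), the threshold of 20 and the Phred+33 offset are all assumptions, so check which quality encoding your pipeline version actually wrote.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a plain 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of one quality string (offset 33 is an assumption)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def fraction_low_quality(handle, threshold=20, offset=33):
    """Fraction of reads whose mean quality falls below the (arbitrary) threshold."""
    total = low = 0
    for _, _, qual in read_fastq(handle):
        total += 1
        if mean_phred(qual, offset) < threshold:
            low += 1
    return low / total if total else 0.0


# Tiny made-up example: one good read ('I' = Q40) and one bad read ('#' = Q2).
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n####\n")
print(fraction_low_quality(demo))
```

Run separately on the read 1 and read 2 files, this should show whether the problem is confined to a minority of second reads.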

If you quality filter your data you'll probably have enough data left to continue your analysis.
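As a minimal sketch of what such a filter might look like for paired data (not a recommendation of any particular tool or cutoff), the function below keeps a pair only when both mates pass a mean-quality threshold and sets aside the surviving mate when only one passes. The function names, the threshold of 20 and the Phred+33 offset are assumptions made for the example, and records are taken to be (header, sequence, quality) tuples in matching order in both files.

```python
def mean_phred(qual, offset=33):
    """Mean Phred score of one quality string (offset 33 is an assumption)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def split_pairs(read1_records, read2_records, threshold=20, offset=33):
    """Split paired records into intact pairs and orphaned mates.

    Assumes both inputs list the same fragments in the same order.
    Pairs where both mates fail are dropped entirely.
    """
    paired, single1, single2 = [], [], []
    for r1, r2 in zip(read1_records, read2_records):
        ok1 = mean_phred(r1[2], offset) >= threshold
        ok2 = mean_phred(r2[2], offset) >= threshold
        if ok1 and ok2:
            paired.append((r1, r2))
        elif ok1:
            single1.append(r1)  # read 2 failed, read 1 survives alone
        elif ok2:
            single2.append(r2)  # read 1 failed, read 2 survives alone
    return paired, single1, single2


# Made-up records: pair "a" is fine, pair "b" has a poor second read.
reads1 = [("@a/1", "ACGT", "IIII"), ("@b/1", "ACGT", "IIII")]
reads2 = [("@a/2", "ACGT", "IIII"), ("@b/2", "ACGT", "####")]
paired, single1, single2 = split_pairs(reads1, reads2)
print(len(paired), len(single1), len(single2))
```

Keeping the pairs and the orphaned mates in separate lists means the two input files never fall out of step, which most paired-end aligners require.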
04-14-2012, 03:37 AM   #4
Rocketknight
Member
 
Location: Ireland

Join Date: Sep 2011
Posts: 86

That's exactly what I see, yeah. Thanks to both of you for the explanation. Should I just remove those low-quality reads and align their pairs as singletons, or should I go ahead aligning them anyway in the hopes of extracting some information from them?

Also, I noticed the per-base N content looks pretty interesting. Would I be right in saying that there was a blockage or other mishap for a few cycles (around 15-19) and that as a result the affected clusters ended up out of sync and so no longer generated a clear signal for the rest of the run?
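For anyone wanting to check the same thing on the raw reads, the per-base N content can be approximated with a few lines of code. This is only a sketch of the idea, with a hypothetical function name, and it is not FastQC's own implementation.

```python
def per_base_n_fraction(sequences):
    """Return, for each read position, the fraction of reads with an N call there."""
    n_counts, totals = [], []
    for seq in sequences:
        for i, base in enumerate(seq):
            if i == len(totals):  # first read long enough to reach this position
                totals.append(0)
                n_counts.append(0)
            totals[i] += 1
            if base in "Nn":
                n_counts[i] += 1
    return [n / t for n, t in zip(n_counts, totals)]


# Made-up reads: N calls cluster at the second and third positions.
reads = ["ACGT", "ANGT", "ANNT", "ACG"]
print(per_base_n_fraction(reads))
```

A spike of N calls confined to a few cycles, with normal N rates before and after, would fit a short-lived problem at those cycles; whether the affected clusters then stay out of sync is something the per-sequence quality scores would need to confirm.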


