SEQanswers

03-17-2013, 12:57 PM   #1
abyss
Member
 
Location: NY

Join Date: Jan 2013
Posts: 17
MiSeq Read 2 quality much poorer! Why??

Hi All,
I am pretty new to sequencing on the MiSeq, so I am still learning the ropes with loading amounts and cluster generation on the flow cell.
I am currently doing ChIP-Seq experiments and multiplex my libraries using a 7 bp in-line barcode. To generate sequence complexity, I multiplex at least 6 experiments together and spike in 10% PhiX, or so I thought.
I quantified my libraries using KAPA qPCR and estimated the size distribution on an Agilent Bioanalyzer 2100 (DNA High Sensitivity). To be conservative for an initial run, I diluted each library separately to 12 pM (since some had lower starting amounts) and then mixed equal volumes of each for the final loading volume. I imagine the final concentration should stay the same, although each individual library gets diluted. To 900 µl of this combined sample library I added 100 µl of 12.5 pM PhiX as spike-in.
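The pooling arithmetic described above can be checked with a short calculation. This is only a sketch: the function name is made up for illustration, and it assumes that "10% PhiX" is meant as a molar fraction of the loaded material and that the six libraries contribute equally.

```python
def pool_summary(library_volume_ul, library_pm, phix_volume_ul, phix_pm, n_libraries):
    """Return (final loading pM, PhiX molar fraction, pM per individual library)."""
    # pM multiplied by µl gives attomoles, which is enough for ratios.
    library_amol = library_volume_ul * library_pm
    phix_amol = phix_volume_ul * phix_pm
    total_volume_ul = library_volume_ul + phix_volume_ul

    total_pm = (library_amol + phix_amol) / total_volume_ul
    phix_fraction = phix_amol / (library_amol + phix_amol)
    # Assumption: every library contributes an equal share of the pool.
    per_library_pm = library_amol / n_libraries / total_volume_ul
    return total_pm, phix_fraction, per_library_pm


# Numbers from the post: 900 µl of 12 pM pooled library plus 100 µl of 12.5 pM PhiX.
total_pm, phix_fraction, per_library_pm = pool_summary(900, 12.0, 100, 12.5, 6)
print(f"{total_pm:.2f} pM loaded, {phix_fraction:.1%} PhiX, {per_library_pm:.2f} pM per library")
```

With these inputs the loaded concentration works out to about 12.05 pM, with PhiX at roughly 10.4% by molarity and each library at 1.8 pM, so the mix itself matches the intended 10% spike-in.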
I have attached some of the SAV pics and Run stats as a ppt file for reference.
What I observed was that Read 1 was really good quality and I got great clusters (98.6% greater than Q30) at a cluster density of 373K/mm2 (lower than I expected). But as soon as the paired-end clusters were formed, the quality (% greater than Q30) dropped quite significantly (by 20%). The % aligned reads of the PhiX library drops from 18% in Read 1 to ~2% in Read 2. The most bizarre observation was that once the sequence starts reading the sample after the barcode, Read 1 shows a good constant AT:GC ratio, as one would expect, but Read 2 somehow has a weird C bias. Has anyone encountered this before? Is the run OK, or should I just consider the data from Read 1?
Please give me your input, as I am thoroughly confused by these Read 2 metrics.
Thanks.
Attached Files
File Type: ppt 3.15.13_Run_Details.ppt (681.0 KB, 210 views)
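
A C bias like the one described for Read 2 can also be checked directly from the FASTQ files rather than from the SAV plots. The following is a minimal sketch, assuming plain 4-line FASTQ records; the helper names and the toy records are hypothetical and only show the shape of the calculation.

```python
from collections import Counter


def read_sequences(fastq_lines):
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()


def base_fractions_by_cycle(sequences):
    """Return one dict per cycle mapping base -> fraction of reads showing that base."""
    counts = []
    for seq in sequences:
        for cycle, base in enumerate(seq):
            if cycle == len(counts):
                counts.append(Counter())
            counts[cycle][base] += 1
    return [
        {base: n / sum(c.values()) for base, n in c.items()}
        for c in counts
    ]


# Toy input (made up); a real run would pass an open Read 2 FASTQ file instead.
toy_fastq = """@r1
ACGT
+
IIII
@r2
CCGT
+
IIII
""".splitlines()

fractions = base_fractions_by_cycle(read_sequences(toy_fastq))
for cycle, per_base in enumerate(fractions, start=1):
    print(cycle, per_base)
```

Running the same function on Read 1 and Read 2 and comparing the C fraction cycle by cycle would show whether the bias is confined to the first cycles after the barcode or persists along the read.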

Last edited by abyss; 03-18-2013 at 07:59 AM.

03-17-2013, 07:49 PM   #2
danwiththeplan
Member
 
Location: Auckland

Join Date: Sep 2011
Posts: 72

I am not an expert at all but possibly this post contains some useful info?

http://pathogenomics.bham.ac.uk/blog...llumina-miseq/

It seems to suggest a higher level of PhiX spiking and mentions a problem that sounds related to yours (OK initial quality followed by a rapid drop-off).

03-18-2013, 07:43 AM   #3
abyss
Member
 
Location: NY

Join Date: Jan 2013
Posts: 17

Thanks for the feedback.
But I believe the problem arises when the clusters are flipped over for Read 2. I guess if I had run all my cycles as Read 1 only, my data would have been better.
I don't know exactly what happens when the clusters are flipped over.
But maybe my assumption is entirely wrong.

03-18-2013, 09:04 AM   #4
microgirl123
Senior Member
 
Location: New England

Join Date: Jun 2012
Posts: 192

I'm wondering if the problem is insufficient base balance in your indices. It looks like your Read 2 cleans up nicely in the middle of the run.

03-18-2013, 09:33 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,800

Low nucleotide diversity will throw the Q-scores off significantly. As microgirl123 pointed out, they seem to have recovered after some time, but the C bias is strange.

Have you run the self test on the machine to see if everything is ok (as far as valves/flow goes)?
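
The base-balance question raised in the last two posts can be tested on the in-line barcode set itself. The sketch below is only an illustration: the barcode sequences are placeholders (the actual 7 bp barcodes are not given in this thread), and the 50% cutoff is an arbitrary assumption, not an Illumina specification.

```python
from collections import Counter


def unbalanced_positions(barcodes, max_fraction=0.5):
    """Return (position, base, fraction) wherever one base exceeds max_fraction."""
    flagged = []
    # zip(*barcodes) walks the barcode set column by column, i.e. cycle by cycle.
    for pos, column in enumerate(zip(*barcodes)):
        base, n = Counter(column).most_common(1)[0]
        fraction = n / len(column)
        if fraction > max_fraction:
            flagged.append((pos, base, fraction))
    return flagged


# Placeholder 7 bp barcodes, NOT the ones used in the run discussed here.
hypothetical_barcodes = [
    "ACGTACG", "CGTACGT", "GTACGTA",
    "TACGTAC", "AACCGGT", "CCAATTG",
]
print(unbalanced_positions(hypothetical_barcodes))

# A deliberately poor set: the first two cycles and the last one show a single base.
print(unbalanced_positions(["AAGT", "AACT", "AATT"]))
```

Positions that come back flagged are cycles where most clusters show the same base, which is the low-diversity situation described above.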

03-18-2013, 10:58 AM   #6
abyss
Member
 
Location: NY

Join Date: Jan 2013
Posts: 17

One thing I forgot to mention is that, of the 6 multiplexed libraries, half had a size of around 200 bp and the other half around 300 bp.
I'm wondering whether the smaller libraries couldn't flip properly, destroying a lot of the complexity and not getting sequenced well either?

03-18-2013, 11:17 AM   #7
microgirl123
Senior Member
 
Location: New England

Join Date: Jun 2012
Posts: 192

Do you mean your insert size is 200 or 300 bp or the entire library (with adapters) is 200 or 300 bp? If it is your insert that is 200-300 bp, then it should have had no trouble flipping over.

I've looked at your run metrics again (I'm not familiar with ChIP-Seq). Are you using Illumina indexed adapters as well as your own individual indices? If so, are the Illumina indexed adapters properly balanced?

Last edited by microgirl123; 03-18-2013 at 11:19 AM.

03-18-2013, 12:00 PM   #8
abyss
Member
 
Location: NY

Join Date: Jan 2013
Posts: 17

The total size (including adapter sequence) is 200 bp or 300 bp.
I don't have Illumina's indexing on these adapters, just my own in-line indices.

04-26-2013, 10:41 AM   #9
agent99
Member
 
Location: San Francisco

Join Date: Jul 2010
Posts: 10
Low quality 2nd read from HiSeq too

We are seeing the same low quality for the first few bases of the 2nd read in paired end sequencing on the HiSeq. We have seen this from multiple sequencing centers and from at least two different library prep methods. It seems like there is a problem with chemistry on the sequencer.
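
For anyone who wants to compare their own data, the drop at the start of the 2nd read can be quantified as a mean Phred score per cycle. This is a minimal sketch, assuming Phred+33 quality encoding; the function name and the toy quality strings are made up for illustration.

```python
def mean_quality_by_cycle(quality_strings, phred_offset=33):
    """Return the mean Phred score at each cycle across all reads."""
    totals, counts = [], []
    for qual in quality_strings:
        for cycle, char in enumerate(qual):
            if cycle == len(totals):
                totals.append(0)
                counts.append(0)
            # Assumption: Phred+33 encoding, so '#' is Q2 and 'I' is Q40.
            totals[cycle] += ord(char) - phred_offset
            counts[cycle] += 1
    return [t / c for t, c in zip(totals, counts)]


# Toy quality lines (4th line of each FASTQ record); not real run data.
toy_qualities = ["#5II", "#?II"]
print(mean_quality_by_cycle(toy_qualities))
```

Computing this separately for the first and second read files makes it easy to see how many leading cycles of the 2nd read are affected.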

We were told by one sequencing center:
"It turned out that there was a NaOH problem… the protocol uses NaOH to strip off the index prior to sequencing the 2nd read. While the NaOH reagent sat on the machine, for some reason, it degraded and proper removal of the index was not achieved. Adding fresh NaOH a day before index2 did the trick. The Qscore looks amazing...."

Another sequencing center said:

"They [Illumina] have suggested that NaOH is not working well anymore to denature index1 away. If this is not complete, it will continue sequencing 7 dark cycles after index1 and then 8nt from the adapter, which is exactly the adapter sequence that is the top match in my 8mer analysis."

I'm attaching an image of what our poor quality scores look like for the 2nd end; the first end looks great. I'd like to hear if anyone else is seeing the same problem and whether you have a similar or different answer from Illumina or your sequencing centers.

Thanks!
Attached Images
File Type: png per_base_quality.png (11.2 KB, 102 views)
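
The "8mer analysis" mentioned in the second quote is not described in detail, but one simple way to do something similar is to count the k-mer at the start of every 2nd-end read and look at the most frequent ones. The sketch below is an assumption about what such an analysis could look like, with a hypothetical function name and placeholder reads.

```python
from collections import Counter


def top_leading_kmers(sequences, k=8, offset=0, n=5):
    """Count the k-mer starting at `offset` in each read; return the n most common."""
    counts = Counter(
        seq[offset:offset + k]
        for seq in sequences
        if len(seq) >= offset + k  # skip reads too short to hold a full k-mer
    )
    return counts.most_common(n)


# Placeholder reads, not real data from any of the runs discussed here.
toy_reads = ["AGATCGGAAGAGC", "AGATCGGAAGTTT", "TTTTCCCCGGGGA"]
print(top_leading_kmers(toy_reads))
```

If one 8-mer dominates the start of the 2nd read and matches the adapter, that would be consistent with the incomplete index removal described in the quote.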

04-29-2013, 08:34 AM   #10
HeinKey
Member
 
Location: Wageningen, Netherlands

Join Date: May 2009
Posts: 21
Primer hybridization?

Hello Agent99,
If I understand correctly, the poor read 3 is attributed to the index (read 2) still being present on your strands.
I can't understand how this would happen, since reclustering takes place after read 2.
So 14 cycles of turnaround would not remove the index read? I find this hard to believe.
I wonder whether the read 3 (reverse read) primer has hybridized correctly. If it did not, you would get low intensities and lower Q-scores.
Could this be causing your problems?

Regards,
Hein

04-29-2013, 08:53 AM   #11
agent99
Member
 
Location: San Francisco

Join Date: Jul 2010
Posts: 10

Thanks for the response, Hein.

That sounds plausible except that these data are coming from 6-7 different experiments where the libraries were produced and sequenced at 3 different sequencing centers around the country. The library production methods differed between the 3 centers, but the sequencing was all done on HiSeq2000s within a one month period. This sounds more like a systemic problem with chemistry on the sequencer to me, but I could be wrong. I'm not in the lab handling samples - just reporting what I'm seeing on the bioinformatics side and what Illumina has told my colleagues at sequencing centers. Hoping to hear if you have seen the same thing and whether Illumina recommends other solutions to the problem.

--Alisha

04-29-2013, 09:21 AM   #12
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,800

Alisha: You should have started a new thread for your observation since the sequencing times are vastly different on MiSeq and HiSeq.

Sticking this observation in a MiSeq thread is sure to get some folks (like me) confused.

All times are GMT -8.