SEQanswers

3000/4000 read 2 (http://seqanswers.com/forums/showthread.php?t=64775)

GW_OK 12-08-2015 07:37 AM

3000/4000 read 2
 
4 Attachment(s)
Anybody here with a 3000/4000 seen poor quality read 2's lately? I have attached example plots from 4 previous runs below. It seems to be library agnostic.

I've been led to understand that there is an escalating investigation internal to Illumina into possible chemistry issues. I do know that they're not comp-ing runs until lot numbers are given for the failed runs.

Just wondering how far the problem extends.

Brian Bushnell 12-08-2015 02:17 PM

That is consistent with my analysis of data from the HiSeq 4000 platform, provided to us by Illumina. The chemistry (as of a few months ago) was absolutely not up to 2x150bp runs IMO, and read 2 particularly had very low quality. I also analyzed some 2x100bp PhiX data that had decent quality for both read 1 and read 2. So, the platform may be adequate for 2x100bp runs.

GenoMax 12-09-2015 06:47 AM

So we have the usual "read2-gate" (like antenna-gate and bend-gate for new Apple products) for a new Illumina technology? :D

On a serious note, @GW_OK what kind of libraries are represented by these 4 examples?

GW_OK 12-09-2015 09:15 AM

Libraries given above are (in no order) a mix of:
Kapa Hyper Prep, Bioo Scientific Nextflex PCR-free, Illumina Nextera

We've got a Bioo Scientific Nextflex flowcell running now. I'll update once it completes.

Illumina is sending FSEs to pull our entire optics module and check alignments next week. There is hope that it will be less bad afterwards.

luc 12-09-2015 06:35 PM

Hi GWOK, thanks for starting this thread!

We are experiencing the same problems (as is another lab in the neighborhood). The quality of the PE150 reads was perfectly fine a few months ago. Illumina likely has reagent quality problems at the moment.
The quality scores are now dropping considerably faster than a few months ago.
The % at which the median Q30 score curve ends (e.g. at cycle 310) seems to be a good quantifier for the problems. The curve used to end at about 60%, now often under 20%, even under 10%.
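One way to put a number on this from the reads themselves, as a minimal sketch: the functions below compute the percentage of bases at or above Q30 for each cycle from FASTQ quality strings and report the value at the last cycle. This only approximates the Q30 curve shown in Illumina's run viewer. The Phred+33 encoding and the function names are assumptions for illustration.

```python
from typing import Iterable, List


def q30_percent_per_cycle(quality_strings: Iterable[str],
                          phred_offset: int = 33) -> List[float]:
    """Percent of bases with Q >= 30 at each cycle (read position).

    quality_strings: the quality line of each FASTQ record.
    phred_offset: ASCII offset of the encoding (33 is assumed here).
    """
    totals: List[int] = []   # bases seen at each cycle
    passing: List[int] = []  # bases with Q >= 30 at each cycle
    for qual in quality_strings:
        for cycle, char in enumerate(qual):
            if cycle == len(totals):  # first read that reaches this cycle
                totals.append(0)
                passing.append(0)
            totals[cycle] += 1
            if ord(char) - phred_offset >= 30:
                passing[cycle] += 1
    return [100.0 * p / t for p, t in zip(passing, totals)]


def end_of_curve(percentages: List[float]) -> float:
    """Value of the per-cycle curve at the final cycle."""
    return percentages[-1]


# Toy input: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
curve = q30_percent_per_cycle(["II#", "I##", "III"])
print(curve, end_of_curve(curve))
```

Running this separately on the read 1 and read 2 files gives two end-of-curve values that can be tracked from run to run.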

Illumina tech support did inform us that there are "NO reagent quality problems" because all reagents individually pass QC. However, they are investigating whether there is a similar phenomenon (a "non-problem"?) as with the MiSeq reagents, which have been generating reduced read quality scores for more than half a year now ( http://seqanswers.com/forums/showthread.php?t=59558 ).

Please keep us updated.

Brian Bushnell 12-09-2015 07:55 PM

Luc, which platform (and chemistry, and software) are you having trouble with?

FYI, our HiSeq 2000/2500 machines seem fine, which is one of the reasons I have recommended against "upgrading" to HS3000 or 4000 (from which I have never seen good data). As far as I know, our NextSeq is fine too, but our MiSeqs are outputting junk for 2x300bp amplicon runs.

GW_OK 12-10-2015 07:18 AM

4 Attachment(s)
Latest run has finished. Two large Bioo NextFlex pools across respective halves of the flowcell.

Read 2 still looks weak. One thing of note is that the top surface looks better than the bottom. That being said neither surface's read 2 look nearly as good as read 1.

luc, thanks for posting. I'm glad I'm not the only one seeing this.

I've also included a Q30 plot from Illumina's "HiSeq 4000: TruSeq PCR-Free (NA12878)" publicly shared run on BaseSpace as an example of what I imagine good runs are supposed to look like.

luc 12-10-2015 11:48 AM

Hi Brian,

We have seen the problems on the HiSeq 4000 for the last 6 weeks, and for quite a while on our MiSeqs (we are underclustering considerably on these to get any usable data).



Quote:

Originally Posted by Brian Bushnell (Post 186149)
Luc, which platform (and chemistry, and software) are you having trouble with?

FYI, our HiSeq 2000/2500 machines seem fine, which is one of the reasons I have recommended against "upgrading" to HS3000 or 4000 (from which I have never seen good data). As far as I know, our NextSeq is fine too, but our MiSeqs are outputting junk for 2x300bp amplicon runs.


GW_OK 12-18-2015 09:56 AM

Update.
Our FSE has been on site all week working on realigning the optical path and running several tests on the fluidics, hoping that this will mitigate the issue. Based on what the factory people told him from data sent off we were ever so slightly off in some sort of image correction model. I view the instrument tweaking as dubious but have been told that this must be done first before the chemistry is brought into question.

We'll have our first test run on Monday. I'll upload performance once it completes.

GW_OK 01-04-2016 06:30 AM

2 Attachment(s)
Update.
After running a mix of PhiX and other libraries Q30 plots are still not great but perhaps less terrible. Attached here are two plots showing a PhiX lane as well as a Nextera lane (PhiX is missing one of the two index reads).

While neither is bottoming out as before, there's certainly a steep downward trend in read after the intensity jump. This run did "meet spec" for a 3k run (75% >Q30) but Illumina still wants to send out an "instrument expert" to have a look at our machine.

GW_OK 01-04-2016 06:41 AM

2 Attachment(s)
Two more since then have given mixed results. The first was a Bioo NextFlex PCR-free and is perhaps one of the best runs we've had on this machine. The Q30 plot is attached (the one with black boxes).

After that run another, with a mix of library types, gave poorer performance. The Q30 plot for this is also attached, this time with blue boxes.

luc 01-08-2016 04:50 PM

Ours unfortunately tend to look like your last example. We have not seen significant differences between customer-prepped libraries, our own libraries, and PCR-free libraries.
At what concentration are you loading the PCR-free libraries, and how many clusters did you sequence?

Thanks!
Quote:

Originally Posted by GW_OK (Post 187139)
Two more since then have given mixed results. The first was a Bioo NextFlex PCR-free and is perhaps one of the best runs we've had on this machine. The Q30 plot is attached (the one with black boxes).

After that run another, with a mix of library types, gave poorer performance. The Q30 plot for this is also attached, this time with blue boxes.


GW_OK 01-11-2016 09:48 AM

A couple more runs that don't meet spec.

Here's the first. It's two pools of NextFlex PCR-free on respective halves. Loaded at 150pM.
Q30 tanks towards the end:
http://i.imgur.com/kJxodBx.jpg

We have a look at the %base and say "Ah Ha, sneaky adapter dimers slipping in!". Granted one should be able to clearly see adapter dimer traces on a Tapestation or Bioanalyzer but with PCR-free adapters things don't migrate properly and sizes are all off (we have tried denaturing and running it on a RNA tape, with some success).
http://i.imgur.com/wkcmNab.jpg
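
For anyone who wants to reproduce a %base plot from the reads rather than from the run viewer, here is a minimal sketch. It tallies base composition per cycle and lists cycles where a single base dominates, which is how a run of identical adapter-dimer reads tends to show up. The 60% threshold and the function names are hypothetical choices for illustration, not anything from Illumina.

```python
from collections import Counter
from typing import Dict, Iterable, List


def base_percent_per_cycle(reads: Iterable[str]) -> List[Dict[str, float]]:
    """Percent of A, C, G, T and N at each cycle across all reads."""
    counters: List[Counter] = []
    for read in reads:
        for cycle, base in enumerate(read.upper()):
            if cycle == len(counters):  # first read that reaches this cycle
                counters.append(Counter())
            counters[cycle][base] += 1
    result = []
    for counts in counters:
        total = sum(counts.values())
        result.append({b: 100.0 * counts[b] / total for b in "ACGTN"})
    return result


def skewed_cycles(percentages: List[Dict[str, float]],
                  threshold: float = 60.0) -> List[int]:
    """Cycle indices (0-based) where one base exceeds `threshold` percent.

    The threshold is an arbitrary illustrative value.
    """
    return [i for i, pct in enumerate(percentages)
            if max(pct.values()) > threshold]


composition = base_percent_per_cycle(["ACGT", "AGGT", "ATTT", "AAAT"])
print(skewed_cycles(composition))  # [0, 3]
```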

Apparently the ExAmp and ordered flowcells just go bonkers with short libraries and that's what's killing the end of our reads, right?

So we take the two pools, give them a 0.8x bead cleanup (as opposed to our typical 0.9x) and load them back up again. We load at 100pM this time to really knock down any possibility of polyclonal wells.

The %base plot shows no more adapter dimers:
http://i.imgur.com/qAKyPHE.jpg

And read 1 Q30 looks pretty good relative to what you can expect on a 3k, but read 2 Q30 is an almost straight slope down:
http://i.imgur.com/SUi5JS0.jpg

I'm given to understand there's yet another lab out there that's having troubles with a 3k/4k. They've decided to not even bother going out to 150bp and just truncate the runs to 120bp. If that's the way this ends up going here I better not have to pay for those extra cycles.

Brian Bushnell 01-11-2016 08:34 PM

From what I've seen, you'll get much better results doing 2x100 runs in the first place, rather than truncating the 2x150 runs...

GenoMax 01-12-2016 06:23 AM

1 Attachment(s)
Our initial 4000 run (2D x 151) was a mix of random samples from the past with a lane of PhiX. With the caveat that n=1, things do not look too bad. Still analyzing the data.

GW_OK 01-18-2016 12:06 PM

1 Attachment(s)
Another run. Not as bad this time. Still not 2500-level good (for read 2) but not as terrible as before.

A few things we've learned thus far (empirically and from talking to Illumina) regarding the 3k:

1) Anything short will take over the ordered flowcell with a vengeance. As best you can, DO NOT have library fragments <300bp (150bp insert). ALL adapter dimer must be gone. Do at least a 0.8x SPRI cleanup. This flowcell is much less forgiving than the 2500.

2) If you have libraries on the shorter side (200-300bp insert) it is better to OVER load than UNDER load.
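
The two rules above can be sketched as a small pre-loading check. The 150bp total adapter length is an assumption inferred only from the "300bp fragment = 150bp insert" figure in point 1; substitute the real value for your kit. The function name and messages are made up for illustration.

```python
# Assumption: adapters add about 150bp in total, inferred from point 1
# ("<300bp" fragment corresponding to a "150bp insert").
ADAPTER_TOTAL_BP = 150
MIN_FRAGMENT_BP = 300  # point 1: avoid library fragments shorter than this


def loading_advice(fragment_bp: int) -> str:
    """Rough advice for a library given its mean fragment size in bp."""
    insert_bp = fragment_bp - ADAPTER_TOTAL_BP
    if fragment_bp < MIN_FRAGMENT_BP:
        return "too short: clean up (at least 0.8x SPRI) before loading"
    if 200 <= insert_bp <= 300:
        # point 2: shorter inserts do better overloaded than underloaded
        return "shorter insert: err on the side of overloading"
    return "no special handling noted in this thread"


for size in (250, 400, 600):
    print(size, loading_advice(size))
```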

GenoMax 01-18-2016 12:57 PM

Perhaps this is the reason why HiSeq X was such a controlled release and for a specific application. Patterned flowcells generate a lot of data but the quality is good (in terms of duplicates) only when libraries fit a narrow/strict quality profile.

Bukowski 01-19-2016 08:28 AM

Timely article from James Hadfield reiterating some of the points from GW_OK

http://core-genomics.blogspot.co.uk/...d-to-know.html

luc 01-19-2016 10:01 AM

Thanks for the link. The article does not present anything new (yet)- James just received his HS4000.
We have been very happy with it, the exception being that the qualities of the PE150 reads are no longer as good as they were last spring and summer. The other mentioned caveats can be dealt with; long insert libraries need to be run on the Hiseq 2500 of course.

Quote:

Originally Posted by Bukowski (Post 187867)
Timely article from James Hadfield reiterating some of the points from GW_OK

http://core-genomics.blogspot.co.uk/...d-to-know.html


melop 01-19-2016 02:16 PM

Dear all. I'm an end user. Recently we got 2 HiSeq 4000 lanes from a local sequencing facility (150bp x 2 reads), PCR-free gDNA libraries made at the same facility. The quality looks quite bad, especially in read 2, only 60% and 70% of bases >Q30, which doesn't pass the Illumina specs. In this case, is it a common practice for the facility to redo the sequencing for us? We ask because it sounded like they don't want to redo it for us... How about the policy at your sequencing core?

GenoMax 01-19-2016 03:01 PM

Quote:

Originally Posted by melop (Post 187877)
Dear all. I'm an end user. Recently we got 2 HiSeq 4000 lanes from a local sequencing facility (150bp x 2 reads), PCR-free gDNA libraries made at the same facility. The quality looks quite bad, especially in read 2, only 60% and 70% of bases >Q30, which doesn't pass the Illumina specs. In this case, is it a common practice for the facility to redo the sequencing for us? We ask because it sounded like they don't want to redo it for us... How about the policy at your sequencing core?

Local policies at facilities differ and it is possible that your facility may not have a specific one for 4000 data due to the peculiarities you have been reading about in this thread.

That said, there is certainly enough reason to show your QC data to the facility and discuss your concerns/available options. If only your samples show this trend and other samples on the flowcell did not have a problem then that may be a reason why the facility may seem reluctant to redo your samples.

There is no harm (except a bit of additional work) in the facility opening a ticket with Illumina and having them look at the metrics of the run in question. If Illumina provides replacement reagents (that is, if they determine that there is a potential reagent/hardware issue) then you should be able to get your samples re-sequenced at no charge. If no free replacement kit is forthcoming then your options revert to getting the sample sequenced on a different sequencer (2500) at full and/or subsidized cost (since the facility made your libraries they have a responsibility to get you usable data).

melop 01-19-2016 03:13 PM

I see. But is it common practice to let the end user bear the loss even if the facility prepares the library knowing that it is going on a HiSeq 4000? I understand that if Illumina agrees to replace the reagents then there will be no problem, but I would be surprised if it became the end user's burden when such a performance issue is caused by inadequate library prep (say insert size not tight enough or negligence in adapter cleaning, as suggested by previous posts). One problem is that this core does not have any written policy regarding quality, and their policies all seem to be post hoc .....

Quote:

Originally Posted by GenoMax (Post 187880)
Local policies at facilities differ and it is possible that your facility may not have a specific one for 4000 data due to the peculiarities you have been reading about in this thread.

That said, there is certainly enough reason to show your QC data to the facility and discuss your concerns/available options. If only your samples show this trend and other samples on the flowcell did not have a problem then that may be a reason why the facility may seem reluctant to redo your samples. There is no harm (except a bit of additional work) for the facility to open a ticket with Illumina and have them look at the metrics of run in question.

If Illumina provides replacement reagents (if they determine that there is potential reagent issue) then you should be able to get your samples re-sequenced at no charge. If no free replacement kit is forthcoming then your options revert to getting the sample sequenced on a different sequencer (2500) at your cost.


GenoMax 01-19-2016 03:31 PM

Quote:

Originally Posted by melop (Post 187881)
I see. But is it common practice to let the end user bear the loss even if the facility prepares the library knowing that it is going on a HiSeq 4000? I understand that if Illumina agrees to replace the reagents then there will be no problem, but I would be surprised if it became the end user's burden when such a performance issue is caused by inadequate library prep (say insert size not tight enough or negligence in adapter cleaning, as suggested by previous posts). One problem is that this core does not have any written policy regarding quality, and their policies all seem to be post hoc .....

You are referring to two separate issues and they need to be addressed separately. Sounds like you are ok with the run metrics explanation.

If you have questions about the quality of the libraries that were prepared then ask the facility to share their library QC data with you. Ask them if they bead treated the libraries as recommended? If the libraries look suboptimal then they may need to clean them up and re-run your samples at no cost. Again things you should discuss in good faith with your facility.

There are current instances (e.g. MiSeq 2x300 kits) where dismal Q-scores seem to rule late in read 2 but that does not affect the actual sequence. That may apply in your case as well. You may have perfectly good sequence data that has a Q-score issue that people have been discussing in this thread.

Brian Bushnell 01-19-2016 07:34 PM

I encourage you to measure the quality-score accuracy of the runs, as per this thread (near the top of the first post), if you want hard data to show to your sequencing facility. As GenoMax mentioned, quality scores do not always correctly reflect accuracy - I've seen them both too high and too low.
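
The linked thread describes the actual procedure. As a generic sketch of the idea only: a Phred score Q claims an error probability of 10^(-Q/10), so the claim can be checked against the mismatch rate observed for bases carrying that score. The code assumes you already have, for each claimed score, counts of mismatches and total aligned bases against a known reference (for example a PhiX control lane). How those counts are produced, and the function names, are assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple


def phred_to_error_prob(q: float) -> float:
    """Error probability claimed by a Phred quality score."""
    return 10 ** (-q / 10)


def empirical_q(mismatches: int, total: int) -> float:
    """Phred-scaled quality implied by an observed mismatch rate."""
    if total <= 0:
        raise ValueError("total must be positive")
    if mismatches == 0:
        return math.inf  # no errors observed at this score
    return -10 * math.log10(mismatches / total)


def compare_claimed_vs_observed(
        counts: Dict[int, Tuple[int, int]]) -> List[Tuple[int, float, float]]:
    """Rows of (claimed Q, observed Q, observed minus claimed).

    counts maps claimed Q -> (mismatches, total aligned bases).
    A negative difference means the scores are too optimistic.
    """
    rows = []
    for claimed_q in sorted(counts):
        mismatches, total = counts[claimed_q]
        observed = empirical_q(mismatches, total)
        rows.append((claimed_q, observed, observed - claimed_q))
    return rows


# Made-up counts, purely to show the shape of the output.
rows = compare_claimed_vs_observed({20: (10, 1000), 30: (10, 1000)})
for claimed, observed, delta in rows:
    print(claimed, round(observed, 2), round(delta, 2))
```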

luc 01-19-2016 09:15 PM

It is possible (if the library was correctly sized and the run was not overloaded; unfortunately the latter is hard to diagnose on the HS4000 ) that the problem is due to the reagents as discussed in this thread. If we spend some time with the Illumina tech support on the phone, Illumina will usually replace the reagents for us in these cases.

We would definitely re-run your libraries (as they were generated by the core), but it is actually very difficult to assign responsibility/blame for failures on the HS4000 if only a single lane is run.


Quote:

Originally Posted by melop (Post 187877)
Dear all. I'm an end user. Recently we got 2 HiSeq 4000 lanes from a local sequencing facility (150bp x 2 reads), PCR-free gDNA libraries made at the same facility. The quality looks quite bad, especially in read 2, only 60% and 70% of bases >Q30, which doesn't pass the Illumina specs. In this case, is it a common practice for the facility to redo the sequencing for us? We ask because it sounded like they don't want to redo it for us... How about the policy at your sequencing core?


GW_OK 01-20-2016 06:44 AM

Quote:

Originally Posted by luc (Post 187896)
It is possible (if the library was correctly sized and the run was not overloaded; unfortunately the latter is hard to diagnose on the HS4000 ) that the problem is due to the reagents as discussed in this thread. If we spend some time with the Illumina tech support on the phone, Illumina will usually replace the reagents for us in these cases.

We would definitely re-run your libraries (as they were generated by the core), but it is actually very difficult to assign responsibility/blame for failures on the HS4000 if only a single lane is run.

I completely agree.

I am puzzled, though, why any Core would not immediately have had the run replaced by Illumina and reloaded if it did not meet spec.

neurula 01-25-2016 04:52 PM

Just to add another voice, we noticed this on our validation runs at install in late Nov/December. Q30s dropping after 100 cycles of read 2, ending the run at ~40%. Error rates up above 7-8% as well. It was actually most obvious in the control lanes of PhiX that we ran. Seems like it must be a reagent issue, probably with enzyme or cleavage mix.
This thread just prompted me to email our FAS about it again. We didn't have a very satisfactory response from tech support, they said they hadn't seen enough 4K runs to determine what was "normal", but that our runs meet spec.
My PI said he's heard through the grapevine that this problem is widespread though - at Yale, Cornell, NYGC etc.

melop 01-27-2016 09:00 AM

Quote:

Originally Posted by Brian Bushnell (Post 187890)
I encourage you to measure the quality-score accuracy of the runs, as per this thread (near the top of the first post), if you want hard data to show to your sequencing facility. As GenoMax mentioned, quality scores do not always correctly reflect accuracy - I've seen them both too high and too low.

This looks interesting and definitely will be useful for downstream analyses.
However I feel that the statistics should be measured in the same way Illumina does, since Illumina guarantees 75% bases >Q30 based on their software output.

melop 01-27-2016 09:07 AM

Quote:

Originally Posted by luc (Post 187896)
It is possible (if the library was correctly sized and the run was not overloaded; unfortunately the latter is hard to diagnose on the HS4000 ) that the problem is due to the reagents as discussed in this thread. If we spend some time with the Illumina tech support on the phone, Illumina will usually replace the reagents for us in these cases.

We would definitely re-run your libraries (as they were generated by the core), but it is actually very difficult to assign responsibility/blame for failures on the HS4000 if only a single lane is run.

So I have 2 lanes, both of which fall below 75% >Q30 (60% and 70%). As far as I know, the other 6 lanes are all borderline, right at 75% >Q30, so the average across the whole flowcell would still be below standard. Perhaps there's a combination of different problems. I'm not sure how successful they will be in negotiating with Illumina over this borderline case, though.
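
As a quick check of that arithmetic, a minimal sketch. It assumes all eight lanes yield the same number of bases unless yields are supplied (real lane yields differ) and treats "borderline" as exactly 75%. The function name is made up for illustration.

```python
from typing import List, Optional


def flowcell_pct_q30(lane_pct_q30: List[float],
                     lane_yields: Optional[List[float]] = None) -> float:
    """Yield-weighted %>Q30 across lanes (equal weights if no yields given)."""
    if lane_yields is None:
        lane_yields = [1.0] * len(lane_pct_q30)
    if len(lane_yields) != len(lane_pct_q30):
        raise ValueError("need one yield per lane")
    weighted = sum(p * y for p, y in zip(lane_pct_q30, lane_yields))
    return weighted / sum(lane_yields)


lanes = [60.0, 70.0] + [75.0] * 6  # two poor lanes, six borderline lanes
overall = flowcell_pct_q30(lanes)
print(overall)  # 72.5, below the 75% spec
```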

melop 01-27-2016 09:15 AM

Quote:

Originally Posted by GW_OK (Post 187929)
I completely agree.

I am puzzled, though, why any Core would not immediately have had the run replaced by Illumina and reloaded if it did not meet spec.

We are puzzled too. Last time I heard from them the core told us that they will only replace the run if Illumina replaces reagents for them. They were vague about what to do if Illumina doesn't. That was why I asked to see if this is a common practice among the other cores.

GenoMax 01-27-2016 09:44 AM

@melop: It is still not clear if you have talked with the facility about your data/concerns and what the result of that conversation was? Do you know if they contacted Illumina tech support about this run?

Based on your post in #29 it seems that other 6 lanes on this flowcell were very close to acceptable spec (Illumina spec only specifies 75% bases > Q30 for 2 x 150bp, which seems to refer to entire sequence output from that flowcell).

It is certainly possible that Illumina refused to provide free replacements. As with any customer service issue the kind of response one gets (in this case free replacements) may entirely depend on the support person you reach.

Ultimately your result may be due to the characteristics of the library. A re-run may produce a very similar result. The only option then would be to remake the library and try again.

melop 01-27-2016 10:20 AM

Quote:

Originally Posted by GenoMax (Post 188306)
@melop: It is still not clear if you have talked with the facility about your data/concerns and what the result of that conversation was? Do you know if they contacted Illumina tech support about this run?

Based on your post in #29 it seems that other 6 lanes on this flowcell were very close to acceptable spec (Illumina spec only specifies 75% bases > Q30 for 2 x 150bp, which seems to refer to entire sequence output from that flowcell).

It is certainly possible that Illumina refused to provide free replacements. As with any customer service issue the kind of response one gets (in this case free replacements) may entirely depend on the support person you reach.

Ultimately your result may be due to the characteristics of the library. A re-run may produce a very similar result. The only option then would be to remake the library and try again.

Yes, they said they will rerun if Illumina replaces the reagents for them. They did not say what happens if not. But from the message they seemed to imply that rerunning is contingent upon their success in the negotiations. They stopped replying to our emails after that.

GenoMax 01-27-2016 11:03 AM

That is unfortunate. If they don't intend to re-run your samples for free (due to not getting free replacements) then they should have informed you.

Since they made the libraries this is one of those tough situations (for them, in terms of cost) where they should at least do something to retain your business (either re-run the current libraries to see if the problem recurs or remake the libraries for free but then charge you for running them, only if the new data looks good).

NextGenSeq 01-28-2016 09:15 AM

Quote:

Originally Posted by melop (Post 187877)
Dear all. I'm an end user. Recently we got 2 HiSeq 4000 lanes from a local sequencing facility (150bp x 2 reads), PCR-free gDNA libraries made at the same facility. The quality looks quite bad, especially in read 2, only 60% and 70% of bases >Q30, which doesn't pass the Illumina specs. In this case, is it a common practice for the facility to redo the sequencing for us? We ask because it sounded like they don't want to redo it for us... How about the policy at your sequencing core?

Illumina will almost always send replacement reagents if you use a small amount of spike-in PhiX sequenced with your library AND the PhiX did not give good sequencing results. If the PhiX gives good sequencing results the problem is the library not the instrument. If both give poor results it is likely an instrument or reagent problem.

If the sequencing facility did not do this they are amateurs and you shouldn't use them again.
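
The PhiX rule of thumb above can be written down as a small decision function. Judging "good" by %>Q30 against the 75% figure quoted earlier in this thread is an assumption made for the sketch; in practice error rate against the PhiX reference would also be checked. The function names and messages are hypothetical.

```python
SPEC_PCT_Q30 = 75.0  # 2x150 spec figure quoted earlier in the thread


def meets_spec(pct_q30: float, spec: float = SPEC_PCT_Q30) -> bool:
    """True if the given %>Q30 reaches the spec value."""
    return pct_q30 >= spec


def triage(phix_pct_q30: float, library_pct_q30: float,
           spec: float = SPEC_PCT_Q30) -> str:
    """Apply the spike-in rule of thumb to PhiX and library metrics."""
    phix_ok = meets_spec(phix_pct_q30, spec)
    library_ok = meets_spec(library_pct_q30, spec)
    if phix_ok and library_ok:
        return "run meets spec"
    if phix_ok and not library_ok:
        return "library problem likely"
    if not phix_ok and not library_ok:
        return "instrument or reagent problem likely"
    # PhiX poor but library fine is not covered by the rule of thumb.
    return "not covered by the rule of thumb"


print(triage(phix_pct_q30=85.0, library_pct_q30=60.0))
print(triage(phix_pct_q30=55.0, library_pct_q30=60.0))
```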


All times are GMT -8.