SEQanswers

Old 05-12-2015, 05:50 PM   #81
kentawan
Member
 
Location: Singapore

Join Date: Apr 2014
Posts: 14
Default

Quote:
Originally Posted by TonyBrooks View Post
This upgrade really is a pain in the backside. I have no idea why Illumina couldn't make this back compatible and then use the RFID chips to determine whether to process the run as required.

I wonder what happens if someone comes wanting to compare to an older data set once we've switched to v2. They'll need to pay to re-run those old libraries on v2.
I agree. We wrote a feedback email to their engineering team too. As you might expect, the reply was disappointing. Lesson learnt: don't buy too many kits in one shot.
Old 05-12-2015, 05:56 PM   #82
kentawan
Member
 
Location: Singapore

Join Date: Apr 2014
Posts: 14
Default

Quote:
Originally Posted by rogerzzw View Post
Hi, Kentawan.

I am a beginner of NGS on Nextseq 500. I am primarily doing whole genome bisulfite sequencing with Nextseq 500. I saw your post on this forum and would like to learn from you.

1. based on what you said, you use V2 kit and got a very good cluster density. we used V2 but never get that high cluster density. Ours is about 140K. Can you share your experience on that? Appreciate!

2. We always had a big issue on Barcode reading, other metrics look very good. There are a high portion (35%) reads are reading as "GGGGGG", I think it is due to reading failure, but we do not know how to pin it down, we pooled two samples each time. Unfortunately, there was one time, we pooled with other 4 Chip-seq samples and still the same. Do you some idea how to fix it?

Thank you very much

Roger
Hi Roger,

We spiked in a good 20% freshly diluted PhiX library, as our library's first 6 bases are the same due to restriction enzyme processing.

As for your question 1, how do you quantify your libraries? We used KAPA's library quantification kits. You might want to do it the old-school way: clone the library into a T-vector or similar and Sanger sequence it to see whether the P5 and P7 sequences are correct and in good shape. I suspect your P5 and P7s are not working properly.

As for question 2, again, would it be possible for you to Sanger sequence your library (say, 50 clones would be good)? Illumina's P5 and P7 sequences are actually close to the annealing sites for the index primers. I suspect your P5 and P7 sequences and the nearby sequences are off, so the indexing primer fails to anneal and you subsequently get dark clusters (a.k.a. GGGGGG).

Hope this helps! Good luck and may the bases be with you!

Last edited by kentawan; 05-12-2015 at 06:01 PM.
Old 05-13-2015, 04:55 AM   #83
rogerzzw
Member
 
Location: Grand Rapids

Join Date: Mar 2015
Posts: 16
Default

@Kentawan

Hi, Kentawan

We used the KAPA quantification kit too, and we quantified very strictly. We first quantified each library individually and pooled them based on the measured concentrations. We then quantified again after diluting the pool to 20 pM, to make sure it really was 20 pM. I believe our quantification is good enough.
As for the cloning check, I do not know whether it is appropriate, because we want to do whole-genome sequencing. But I do agree that P5 and P7 might not be good enough, and that the problem may be due to index primer annealing. I just do not know how to confirm it. Do you have any ideas?
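The dilution step described above comes down to C1 x V1 = C2 x V2. A minimal sketch of that arithmetic follows; the 4 nM stock and 1000 uL final volume are hypothetical numbers chosen for illustration, not values from this post, and the function name is made up. It covers only the arithmetic, not denaturation or any other step of a loading protocol.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock_volume, diluent_volume) from C1 * V1 = C2 * V2.

    Concentrations must share one unit (e.g. pM); volumes come back
    in the same unit as final_volume.
    """
    if stock_conc <= 0 or target_conc <= 0:
        raise ValueError("concentrations must be positive")
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical example: a 4 nM (4000 pM) pool diluted to 20 pM in 1000 uL.
stock_ul, diluent_ul = dilution_volumes(4000, 20, 1000)
print(f"pool: {stock_ul} uL, diluent: {diluent_ul} uL")
```

Re-quantifying the diluted pool, as described above, is still the real check; the calculation only says what the concentration should be.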

Thank you very much

Roger
Old 05-13-2015, 05:26 AM   #84
rogerzzw
Member
 
Location: Grand Rapids

Join Date: Mar 2015
Posts: 16
Default

BTW, Kentawan

Another lab using the same kit and protocol does not have this kind of issue at all, which confuses me a lot.
Old 12-14-2015, 03:06 PM   #85
cmbetts
Senior Member
 
Location: Bay Area

Join Date: Jun 2012
Posts: 109
Default

Quote:
Originally Posted by williamhorne View Post
Using High output we are actually getting over 500 million reads per run. Unlike our GAII, and HighSeq, we actually have to pay very close attention to cluster density. The target cluster density for high quality samples is 1.75pM-2pM. Anything above and below will results in under/over clustering. So your samples need to be very exact with concentration.

These are solely made to be streamlined with the BaseSpace. Right now it only works with BaseSpace onsite, not in the cloud as they are having some majority broker issues that still are not resolved. Make sure you do your research in regards to output files and data in regards to basespace because it is not a visual machine. It gives you the output files and you must use 3rd party software on a different computer to view the results. Very annoying.

Overall very impressed with the NextSeq's, not so much BaseSapce.
Does anyone have any feedback on what density would be considered overclustered on a NextSeq using v2 chemistry? We just got data back from a collaborator with terrible error rates in read 2 with lots of random stretches of variable length polyGs. Comparing to the SAV files from another successful run with an identically constructed library by the same facility, the only obvious run metric that jumps out at me (besides the terrible read quality) is that the failed run had ~20% higher cluster density (240k/mm^2 70%PF vs 200k/mm^2 80%PF). I'm mostly used to looking at HiSeq and MiSeq data, so I'm not sure whether this is significant or not.
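One way to put the two runs above on the same footing is to multiply raw cluster density by %PF, which gives the density of clusters that actually passed filter. This is a rough sketch using only the figures quoted in the post; the function name is made up for illustration.

```python
def pf_density(raw_density_k_per_mm2, pf_percent):
    """Clusters passing filter per mm^2, in thousands."""
    return raw_density_k_per_mm2 * pf_percent / 100


failed_run = pf_density(240, 70)  # 240k/mm^2 at 70% PF
good_run = pf_density(200, 80)    # 200k/mm^2 at 80% PF
print(f"failed run: {failed_run:.0f}k PF clusters/mm^2")
print(f"good run:   {good_run:.0f}k PF clusters/mm^2")
```

On these numbers the failed run has about 168k PF clusters/mm^2 against 160k for the good run, so roughly 20% more raw clusters produced only about 5% more usable ones. That is consistent with over-clustering but does not prove it.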
Old 12-15-2015, 01:04 AM   #86
TonyBrooks
Senior Member
 
Location: London

Join Date: Jun 2009
Posts: 298
Default

Quote:
Originally Posted by cmbetts View Post
Does anyone have any feedback on what density would be considered overclustered on a NextSeq using v2 chemistry? We just got data back from a collaborator with terrible error rates in read 2 with lots of random stretches of variable length polyGs. Comparing to the SAV files from another successful run with an identically constructed library by the same facility, the only obvious run metric that jumps out at me (besides the terrible read quality) is that the failed run had ~20% higher cluster density (240k/mm^2 70%PF vs 200k/mm^2 80%PF). I'm mostly used to looking at HiSeq and MiSeq data, so I'm not sure whether this is significant or not.
We've run exomes that clustered at 259k/mm^2. The data still looked fine to us (92% >Q30, >90% alignment rates). The quality does begin to tail off when over-clustered though. 75bp reads are generally fine at that density, but reads >100bp begin to look really poor. We also use short paired reads for RNA-Seq (43bp paired end) and this tolerates over-clustering much better.

On another note, we regularly see poly-G reads, (fastqc shows around 2-3% of over-represented sequences) but curiously this tends to happen on read 2 only (failed resynthesis?)
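For anyone wanting to put a number on the poly-G reads mentioned above, here is a minimal sketch that counts reads containing a long run of G. On the NextSeq's 2-colour chemistry a cluster with no signal is called as G, so such runs usually mean the signal was lost. The `min_run` threshold and the function names are assumptions for illustration, not an established cutoff.

```python
import io
import re


def fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def polyg_fraction(seqs, min_run=20):
    """Fraction of reads containing a run of at least min_run G bases."""
    pattern = re.compile("G{%d,}" % min_run)
    total = hits = 0
    for seq in seqs:
        total += 1
        if pattern.search(seq):
            hits += 1
    return hits / total if total else 0.0


# Tiny in-memory example; real use would pass an open FASTQ file instead.
demo = io.StringIO(
    "@r1\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"
    "@r2\nACGTGGGGGGGG\n+\nIIIIIIIIIIII\n"
)
print(polyg_fraction(fastq_sequences(demo), min_run=8))
```

Running it separately on the read 1 and read 2 files would show whether the problem is confined to read 2, as described above.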
Old 01-15-2016, 06:01 AM   #87
AlexT
Junior Member
 
Location: Germany

Join Date: May 2012
Posts: 8
Default

I am not sure how much we can use the cluster densities as a measure of run quality. We had good runs with low and high cluster densities, as well as very poor runs with normal densities.
Compared to the MiSeq, our NextSeq is very fragile and the cluster densities go up and down without showing an obvious pattern. In case of the MiSeq we have very stable densities (but in this case usually prepared with Nextera).

Actually our highest clustered run performed very well and had the following specs:
clusters: 287-301k/mm^2
PF: 83,0-84,3
Q30: 87,9-90,1
also at 75bp
Old 11-17-2016, 06:14 AM   #88
cement_head
Senior Member
 
Location: Oxford, Ohio

Join Date: Mar 2012
Posts: 232
Default

I know this is an older thread, but now that more and more users of the NextSeq are out there - what is the consensus on the NextSeq data? Is it still problematic relative to MiSeq data?

Thanks
Old 11-17-2016, 09:47 AM   #89
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Unfortunately, Illumina's taken a turn for the worse again. I just analyzed some recent data from the NextSeq, HiSeq2500, and HiSeq 1T platforms of the same library. The NextSeq data is dramatically worse than last time I looked at it. Error rates are several times higher, there's a major A/T base frequency divergence in read 2, and the quality scores are inflated again at ~6 points higher than the actual quality. More disturbingly, the HiSeq quality scores are completely inaccurate now, as well, though the actual measured quality is still very high - average Q33 for read 1 and Q29 for read 2 for HiSeq2500, versus Q24 for read 1 and Q18 for read 2 on the NextSeq (those numbers are as measured by counting the match/mismatch rates from mapping, so essentially, NextSeq has roughly 10X the error rate of HiSeq). But the measured discrepancy between claimed and measured quality scores for the HiSeq2500 and HiSeq 1T are BOTH worse than the NextSeq, despite the NextSeq having binned quality scores, and as you can see there are large regions of quality scores simply missing from the HiSeq2500, such as Q3-Q11, Q17-Q21, and Q29. There are clearly major problems with Illumina's current base-calling software, as quality score assignment has drastically regressed since last time I measured it.
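The "roughly 10X" figure follows from the Phred definition, Q = -10 * log10(p), where p is the per-base error probability. A minimal sketch of the conversion, and of deriving a measured Q from match/mismatch counts as described above, is given below. The function names are made up for illustration, and treating an average Q as a single error rate is a simplification.

```python
import math


def phred_to_error(q):
    """Per-base error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def measured_phred(matches, mismatches):
    """Empirical Phred score from mapped-base match/mismatch counts."""
    total = matches + mismatches
    if total == 0 or mismatches == 0:
        raise ValueError("need at least one aligned base and one mismatch")
    return -10 * math.log10(mismatches / total)


# Average measured qualities quoted in the post above.
for read, hiseq_q, nextseq_q in (("read 1", 33, 24), ("read 2", 29, 18)):
    ratio = phred_to_error(nextseq_q) / phred_to_error(hiseq_q)
    print(f"{read}: NextSeq error rate is about {ratio:.1f}x HiSeq 2500")
```

A gap of 9 Phred points (read 1) works out to about 8x and a gap of 11 points (read 2) to about 13x, so "roughly 10X" is a fair summary of both.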

You can see the graphs in this Excel sheet that I've linked. "Raw" is the raw data, "Recal" is after recalibration (which changes the quality scores but nothing else). "NS" is NextSeq, "2500" is HiSeq2500, and "1T" is HiSeq 1T which unfortunately was only run at 2x101bp instead of 2x151bp on the other 2 platforms.

https://drive.google.com/file/d/0B3l...ew?usp=sharing

Last edited by Brian Bushnell; 11-17-2016 at 09:49 AM.
Old 11-17-2016, 09:57 AM   #90
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

HiSeq 1T = HiSeq 2500 HO mode?

Bottom line: If one has a different sequencer accessible walk away from a NextSeq?

Are Q-scores still important (other than for de novo or diagnostic analyses)?

Do you know what version of bcl2fastq is being used for your data?

Last edited by GenoMax; 11-17-2016 at 10:07 AM.
Old 11-17-2016, 10:06 AM   #91
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

I'm not sure; I think it's probably fine for quantification unless there's some bias issue, which I have not looked into. I wouldn't want to use it for variant-calling, particularly because a lot of the errors seem like systematic errors that cannot be overcome simply by sequencing deeper. We do use it for multiplexed single cells, because the NextSeq platform has shown lower rates of cross-talk than HiSeq or MiSeq and single-cell sequencing is greatly affected by even low levels of cross-talk. Also, I understand NextSeq is cheaper per base. But certainly, I would avoid the NextSeq (and HiSeq 3000/4000 which I suspect are similar) when possible, if you have access to Illumina's high quality platforms (HiSeq 2000/2500 or MiSeq).
Old 11-17-2016, 10:15 AM   #92
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

Perhaps what you are observing is differences in bcl2fastq v.1.8.4 and 2.18.x?

bcl2fastq v.2.x is required for processing data from NextSeq and HiSeq 3000/4000. It can be used to process data from all current Illumina sequencers. It does binned quality for reads as I recall.

Is your data processed with the same version of bcl2fastq in all cases or was 2500 data processed using bcl2fastq v.1.8.4?
Old 11-17-2016, 10:18 AM   #93
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

I'm not really sure. The HiSeq quality scores are not binned, though. I'm going to talk to the person who manages the Illumina software versions after gathering some more evidence, because we probably will want to roll back to an earlier version, once it's clear which earlier version was better.

Also, does anyone have experience with 3rd-party Illumina base-callers?

Edit: We are using 2.16 for NextSeq and 1.8.4 for everything else.

Last edited by Brian Bushnell; 11-17-2016 at 12:20 PM.
Old 11-17-2016, 10:24 AM   #94
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

Quote:
Originally Posted by Brian Bushnell View Post
I'm not really sure. The HiSeq quality scores are not binned, though.
That probably means they are using the older bcl2fastq (or CASAVA) v.1.8.4.

Quote:
I'm going to talk to the person who manages the Illumina software versions after gathering some more evidence, because we probably will want to roll back to an earlier version, once it's clear which earlier version was better.

Also, does anyone have experience with 3rd-party Illumina base-callers?
That is NOT an option for NextSeq and HiSeq 3000/4000 which require bcl2fastq v.2.1x for conversion. Perhaps you can ask the person in charge to reprocess HiSeq 2500 data using bcl2fastq v.2.18.

I don't know if there are any 3rd party callers for new data.
Old 11-17-2016, 10:56 AM   #95
AllSeq
Registered Vendor
 
Location: San Diego, CA

Join Date: Oct 2013
Posts: 138
Default

Quote:
Originally Posted by Brian Bushnell View Post
But certainly, I would avoid the NextSeq (and HiSeq 3000/4000 which I suspect are similar) when possible, if you have access to Illumina's high quality platforms (HiSeq 2000/2500 or MiSeq).
Why would the NextSeq and HiSeq 3000/4000 be similar? They use different chemistries and different flow cells. Wouldn't the 3000/4000 be most similar to the HiSeq X? (Or did you just mean they're similar in that they're both bad platforms, but for different reasons?)
__________________
AllSeq - The Sequencing Marketplace
info@AllSeq.com
www.AllSeq.com
Old 11-17-2016, 11:05 AM   #96
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

Quote:
Originally Posted by AllSeq View Post
Why would the NextSeq and HiSeq 3000/4000 be similar? They use different chemistries and different flow cells. Wouldn't the 3000/4000 be most similar to the HiSeq X? (Or did you just mean they're similar in that they're both bad platforms, but for different reasons?)
I think it comes down to the bcl2fastq version used for data processing (binned q-scores) for NextSeq and HiSeq 4000.

Hopefully @Brian will have some clarification once he has chased down that information.
Old 11-17-2016, 11:59 AM   #97
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by AllSeq View Post
Why would the NextSeq and HiSeq 3000/4000 be similar? They use different chemistries and different flow cells. Wouldn't the 3000/4000 be most similar to the HiSeq X? (Or did you just mean they're similar in that they're both bad platforms, but for different reasons?)
They use 2-color chemistry (IIRC). I don't know if the problem is the chemistry, the optics, or the software; but if it's the software, I'd expect the 3000/4000 to be more similar to the NextSeq than the 2500. Also, I've only looked at a single sample of HiSeq 4000 data, but the quality was low; similar to the NextSeq. Since I've seen both good and bad data from the same NextSeq machine, it's obviously possible to produce good data with 2-color chemistry and NextSeq optics. It would be nice if this was all a software issue.
Old 11-17-2016, 01:41 PM   #98
AllSeq
Registered Vendor
 
Location: San Diego, CA

Join Date: Oct 2013
Posts: 138
Default

The NextSeq uses 2 color chemistry, but the 3000/4000 uses the 'standard' 4 color chemistry (with patterned flow cells, just like the X).
__________________
AllSeq - The Sequencing Marketplace
info@AllSeq.com
www.AllSeq.com
Old 11-17-2016, 05:28 PM   #99
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by AllSeq View Post
The NextSeq uses 2 color chemistry, but the 3000/4000 uses the 'standard' 4 color chemistry (with patterned flow cells, just like the X).
Ah, my mistake.
Old 11-24-2016, 07:40 AM   #100
Michal2213
Junior Member
 
Location: UK

Join Date: Apr 2015
Posts: 4
Default NextSeq suitable for allele-specific analysis?

Do you think that the NextSeq would be suitable for allele-specific analysis? I am using mouse cells with a hybrid genome and sort the reads belonging to different alleles based on SNP content. So far I have been using a HiSeq 2000, which worked well. With the NextSeq I would get the data several times faster, but having read this whole thread I am not sure whether the NextSeq data quality will be good enough.
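To make the concern concrete, here is a rough sketch of sorting a read by SNP content. It is an illustrative assumption, not the pipeline used in this post: the function name, the position-to-base dictionaries and the strict "all informative SNPs must agree" rule are all made up for the example.

```python
def assign_allele(read_bases, snps):
    """Assign a read to an allele from the bases it shows at known SNPs.

    read_bases: dict mapping reference position -> base observed in the read
    snps:       dict mapping reference position -> (allele_a_base, allele_b_base)
    """
    votes_a = votes_b = 0
    for pos, (base_a, base_b) in snps.items():
        observed = read_bases.get(pos)
        if observed == base_a:
            votes_a += 1
        elif observed == base_b:
            votes_b += 1
    if votes_a and not votes_b:
        return "allele_a"
    if votes_b and not votes_a:
        return "allele_b"
    if votes_a and votes_b:
        return "conflicting"
    return "uninformative"


# Hypothetical SNP table with two informative positions.
snps = {101: ("A", "G"), 250: ("C", "T")}
print(assign_allele({101: "A", 250: "C"}, snps))
print(assign_allele({101: "A", 250: "T"}, snps))
print(assign_allele({50: "G"}, snps))
```

With a rule like this, a sequencing error at a SNP position either makes a read "conflicting" or, for reads covering a single SNP, silently puts it on the wrong allele, so a higher per-base error rate matters most for reads that overlap only one SNP.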
All times are GMT -8.