SEQanswers

Old 03-06-2015, 06:51 AM   #41
fanli
Senior Member
 
Location: California

Join Date: Jul 2014
Posts: 198
Default

Hi all,

Just wanted to add our data as well - this was from an RNA-seq library and I don't have paired HiSeq data. Still, you can see some ambitious reporting of quality scores albeit not nearly as bad as what aeonsim showed.

We're hopefully going to run a v2 kit soon and I'll update with those stats when I get them!
Attached Files
File Type: pdf recalibration_plots.pdf (165.7 KB, 75 views)
fanli is offline   Reply With Quote
Old 03-06-2015, 09:01 AM   #42
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by GenoMax View Post
The index should be ok. I think Brian is concatenating all chromosomes and then creating the index so that file is not a literal equivalent of human/mouse genome (file I have looks similar to yours).
That's correct. They're called chromosomes for legacy reasons (the chunks used to be one real chromosome each) but it's more efficient to pack them.
Brian Bushnell is offline   Reply With Quote
Old 03-08-2015, 04:09 AM   #43
aeonsim
Member
 
Location: Belgium

Join Date: Jun 2011
Posts: 45
Default

Quote:
Originally Posted by fanli View Post
Hi all,

Just wanted to add our data as well - this was from an RNA-seq library and I don't have paired HiSeq data. Still, you can see some ambitious reporting of quality scores albeit not nearly as bad as what aeonsim showed.

We're hopefully going to run a v2 kit soon and I'll update with those stats when I get them!
I'd actually say they're worse than what we had, considering you're using PE 80 bp and the first 10 or so bases of the forward reads show an average quality score drop of ~10 on the Phred scale (~30 to ~20).
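To put that drop in perspective, here is a minimal sketch of the standard Phred conversion, P = 10^(-Q/10). The function name `phred_to_perr` is only an illustrative helper, not part of any tool mentioned in this thread.

```shell
# Convert a Phred quality score Q to an error probability: P = 10^(-Q/10).
# phred_to_perr is a hypothetical helper name used only for this example.
phred_to_perr() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 10 ^ (-q / 10) }'
}

phred_to_perr 30   # Q30
phred_to_perr 20   # Q20
```

Going from Q30 to Q20 means the expected error rate rises from about 1 in 1,000 bases to about 1 in 100, i.e. a tenfold increase.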

However, our conclusion from our testing is that the NextSeq with V1 chemistry is OK for RNA-seq, as the reads still map fine and the coverage is high. It is not suitable for variant calling, though, especially when one is interested in de novo variants or is working with low coverage. As a result, we currently only use it internally for RNA-seq.

We will apparently get access to the V2 kits as soon as they're available, to see if that fixes the issue.

Last edited by aeonsim; 03-08-2015 at 04:26 AM.
aeonsim is offline   Reply With Quote
Old 03-09-2015, 01:54 AM   #44
Elsie
Member
 
Location: Australia

Join Date: Mar 2011
Posts: 85
Default

No error, just NaNs, which is why I think I am missing something:
reformat.sh in1=R1.fastq in2=R2.fastq out=interleaved
gzip interleaved
bbmap.sh ref=hg19.fa
bbduk.sh in=interleaved.gz out=trimmed.fq.gz ktrim=r k=23 hdist=1 mink=11 tpe tbo minlen=90 ref=truseq.fa.gz,nextera.fa.gz
bbmap.sh maxindex=200 in=trimmed.fq.fq mhist=mhist.txt bhist=bhist.txt qhist=qhist.txt qahist=qahist.txt

BBMap version 34.56
Set match histogram output to mhist.txt
Set base content histogram output to bhist.txt
Set quality histogram output to qhist.txt
Set quality accuracy histogram output to qahist.txt
Retaining first best site only for ambiguous mappings.
No output file.
Set genome to 1

Loaded Reference: 5.025 seconds.
Loading index for chunk 1-7, build 1
Generated Index: 6.192 seconds.
Analyzed Index: 7.512 seconds.
Cleared Memory: 0.461 seconds.
Processing reads in single-ended mode.
Started read stream.
Started 16 mapping threads.
Detecting finished threads: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

------------------ Results ------------------

Genome: 1
Key Length: 13
Max Indel: 200
Minimum Score Ratio: 0.56
Mapping Mode: normal
Reads Used: 0 (0 bases)

Mapping: 0.447 seconds.
Reads/sec: 0.00
kBases/sec: 0.00


Read 1 data:          pct reads   num reads   pct bases   num bases

mapped:                    NaN%           0        NaN%           0
unambiguous:               NaN%           0        NaN%           0
ambiguous:                 NaN%           0        NaN%           0
low-Q discards:            NaN%           0        NaN%           0

perfect best site:         NaN%           0        NaN%           0
semiperfect site:          NaN%           0        NaN%           0

Match Rate:                  NA          NA        NaN%           0
Error Rate:                NaN%           0        NaN%           0
Sub Rate:                  NaN%           0        NaN%           0
Del Rate:                  NaN%           0        NaN%           0
Ins Rate:                  NaN%           0        NaN%           0
N Rate:                    NaN%           0        NaN%           0

Total time: 19.975 seconds.

Any advice is greatly appreciated. Thanks.
Elsie is offline   Reply With Quote
Old 03-09-2015, 03:50 AM   #45
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,976
Default

@Elsie: Are you trying to analyze NextSeq500 data or just creating stats? This thread was originally about quality of NextSeq500 reads and the procedure that Brian had posted was to create stats files (not actual alignments).

If you are actually trying to analyze real data, then there is no need to create interleaved data sets. You can directly trim and then align the R1/R2 reads against the human genome. You need to specify an output file to store the aligned reads.

A minimal command line for doing the mapping would be the following. More examples are in the BBMap thread: http://seqanswers.com/forums/showthread.php?t=41057
Code:
$ bbmap.sh -Xmx30g in=trimmed.fq.gz path=/path_to_BBMap_index_top_folder_with_ref_directory/ out=sample_ID.sam qin=33
Change the path to the BBMap index to match your local setup. Add additional options (there are plenty) as needed, depending on the kind of experiment you are analyzing.
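For separate R1/R2 files, a rough sketch of the trim-then-map sequence could look like the dry run below, which only prints the two commands so they can be reviewed before running. The trimming flags are copied from the BBDuk command posted earlier in this thread; the function name `print_pe_commands`, the file-naming scheme and the use of `in2=`/`out1=`/`out2=` for paired files are assumptions on my part, so check them against the help text of your BBTools version.

```shell
# Dry run: print (do not execute) a trim-then-map command pair for R1/R2 files.
# $1 = sample name (placeholder), $2 = path to the BBMap index folder (placeholder).
# print_pe_commands and the *_R1/_R2 file names are hypothetical.
print_pe_commands() {
    echo "bbduk.sh in1=${1}_R1.fastq in2=${1}_R2.fastq out1=${1}_R1.trimmed.fq.gz out2=${1}_R2.trimmed.fq.gz ktrim=r k=23 hdist=1 mink=11 tpe tbo minlen=90 ref=truseq.fa.gz,nextera.fa.gz"
    echo "bbmap.sh -Xmx30g in=${1}_R1.trimmed.fq.gz in2=${1}_R2.trimmed.fq.gz path=${2} out=${1}.sam qin=33"
}

print_pe_commands sample_ID /path_to_BBMap_index_top_folder_with_ref_directory/
```

Printing first makes it easy to confirm that the output names of the trimming step match the input names of the mapping step.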
GenoMax is offline   Reply With Quote
Old 03-09-2015, 09:05 AM   #46
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

I think I see the problem - you're using the wrong file name for the reads:

bbduk.sh in=interleaved.gz out=trimmed.fq.gz ktrim=r k=23 hdist=1 mink=11 tpe tbo minlen=90 ref=truseq.fa.gz,nextera.fa.gz

bbmap.sh maxindex=200 in=trimmed.fq.fq mhist=mhist.txt bhist=bhist.txt qhist=qhist.txt qahist=qahist.txt

...

Reads Used: 0 (0 bases)

Normally, BBMap should throw an exception saying it can't find the input file if it does not exist, so I assume there is an empty file named "trimmed.fq.fq".
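A simple guard in the wrapper script can catch this kind of thing before the mapper starts. This is only a sketch using standard shell file tests; `check_input` is a made-up helper name, not a BBMap feature.

```shell
# Refuse to continue if a reads file is missing or zero bytes in size.
# check_input is a hypothetical helper, not part of BBTools.
check_input() {
    if [ ! -e "$1" ]; then
        echo "missing: $1" >&2
        return 1
    elif [ ! -s "$1" ]; then
        echo "empty: $1" >&2
        return 2
    fi
    echo "ok: $1"
}
```

In a pipeline script it could be chained as `check_input trimmed.fq.gz && bbmap.sh in=trimmed.fq.gz ...`, so the mapping step only runs when the file named on the command line really exists and has content.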
Brian Bushnell is offline   Reply With Quote
Old 03-09-2015, 11:32 AM   #47
Elsie
Member
 
Location: Australia

Join Date: Mar 2011
Posts: 85
Default

Hi Brian, hi Genomax,

thanks for the replies.
Brian - that was just a typo by me, it was the correct file.
Genomax - I am after stats.
I'm just on my way into work, 6.30am here, I'll log on again there and have another look at my commands.
thanks.
Elsie is offline   Reply With Quote
Old 03-09-2015, 11:39 AM   #48
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,976
Default

@Elsie: This may sound like a dumb question, but have you made sure that your interleaved.fq.gz and trimmed.fq.gz files have contents (i.e. that they are not zero bytes in size)?

Post 3-4 sequences from your interleaved and trimmed.fq.gz files (use zmore or zless).
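If it helps, here is a small sketch for both checks using only gzip, awk and head. The function names are made up for this example. Note that a gzipped file with no reads in it is still a few bytes long, so counting records is more reliable than looking at the file size alone.

```shell
# Count FASTQ records in a gzipped file (assumes 4 lines per record).
# count_fastq_records and peek_fastq are hypothetical helper names.
count_fastq_records() {
    gzip -dc "$1" | awk 'END { print int(NR / 4) }'
}

# Print the first N records (default 4) of a gzipped FASTQ file.
peek_fastq() {
    gzip -dc "$1" | head -n $(( ${2:-4} * 4 ))
}
```

For example, `count_fastq_records trimmed.fq.gz` prints the number of reads that survived trimming, and `peek_fastq trimmed.fq.gz 3` shows the first three of them.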
GenoMax is offline   Reply With Quote
Old 03-09-2015, 01:07 PM   #49
Elsie
Member
 
Location: Australia

Join Date: Mar 2011
Posts: 85
Default

Hi GenoMax,

Your dumb question was, unfortunately, spot on - interleaved is fine but trimmed is tiny - hard to map when there is nothing to map! I'll double check this, thank you.
Elsie is offline   Reply With Quote
Old 03-09-2015, 02:49 PM   #50
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

I'd be interested in seeing the stderr log of BBDuk... it's implausible that ALL of your reads are 100% adapter.
Brian Bushnell is offline   Reply With Quote
Old 03-09-2015, 02:50 PM   #51
Elsie
Member
 
Location: Australia

Join Date: Mar 2011
Posts: 85
Default

Thanks Brian. BBMap is currently running, I'll post the results. By the way, I always want to type bbduck...
thanks.
Elsie is offline   Reply With Quote
Old 03-30-2015, 12:50 PM   #52
Elsie
Member
 
Location: Australia

Join Date: Mar 2011
Posts: 85
Default

Hi Brian,
My plots look similar, slightly better than yours, but I have decided to postpone posting them as we are moving to V2 chemistry in a few weeks. I'll redo the plots then and post them. Thanks again for all your help, it is greatly appreciated.
Elsie is offline   Reply With Quote
Old 03-30-2015, 03:45 PM   #53
kentawan
Member
 
Location: Singapore

Join Date: Apr 2014
Posts: 14
Default

Hi guys,

Just note that the v2 chemistry requires NCS 1.4.

NCS 1.4 is NOT BACKWARD COMPATIBLE with v1 kits, so it is better to use up your v1 kits before upgrading to NCS 1.4. Illumina is making life difficult for us here: the machine is a shared asset in our institution, so we have to wait for the entire institute to use up its v1 kits before we can proceed with the upgrade.
kentawan is offline   Reply With Quote
Old 04-01-2015, 07:18 AM   #54
fanli
Senior Member
 
Location: California

Join Date: Jul 2014
Posts: 198
Default

You could technically dual boot to have both NCS versions.
fanli is offline   Reply With Quote
Old 04-15-2015, 02:22 PM   #55
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Hi all,

I now have the comparative results for the same library using NextSeq V1 and V2 chemistry, 2x150bp from a bacterium. V2 looks far better.

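V1:

[Attached images: NextSeq_V1_1.png, NextSeq_V1_2.png]

V2:

[Attached images: NextSeq_V2_1.png, NextSeq_V2_2.png]
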
For V2 chemistry, the measured quality is much higher (particularly for read 1), the quality scores are more accurate, and the base frequencies don't diverge as much. Also, 79% of the reads mapped error-free, up from 50% for V1.

I also looked at HiSeq2500 data for the same library and it is similar in quality to the NextSeq V2.
Attached Images
File Type: png NextSeq_V1_1.png (34.6 KB, 375 views)
File Type: png NextSeq_V1_2.png (60.1 KB, 344 views)
File Type: png NextSeq_V2_1.png (34.5 KB, 361 views)
File Type: png NextSeq_V2_2.png (55.5 KB, 351 views)
Brian Bushnell is offline   Reply With Quote
Old 04-15-2015, 02:42 PM   #56
ymc
Senior Member
 
Location: Hong Kong

Join Date: Mar 2010
Posts: 498
Default

Quote:
Originally Posted by Brian Bushnell View Post
Hi all,

I now have the comparative results for the same library using NextSeq V1 and V2 chemistry, 2x150bp from a bacterium. V2 looks far better.

V1:


V2:


For V2 chemistry, the measured quality is much higher (particularly for read 1), the quality scores are more accurate, and the base frequencies don't diverge as much. Also, 79% of the reads mapped error-free, up from 50% for V1.

I also looked at HiSeq2500 data for the same library and it is similar in quality to the NextSeq V2.
That's great news! Illumina should thank you for your report.
ymc is offline   Reply With Quote
Old 04-21-2015, 04:32 PM   #57
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

I finished a quality recalibration tool. It seems to work well on NextSeq data with binned quality scores.



The attached plot (Quality_Recalibration_small.png) shows two different recalibration methods. Even though method 2 looks more accurate, method 1 has a larger range and seems to give better results.
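For anyone wondering what "accuracy" of a quality score means here: the general idea is to compare each reported Q value with the empirical quality implied by the mismatches observed at bases carrying that score, Q_emp = -10 * log10(mismatches / bases). The sketch below illustrates only that comparison, not the recalibration tool itself. The three-column input (reported Q, bases, mismatches) is a hypothetical format and the numbers are invented purely for illustration.

```shell
# Empirical quality per reported-Q bin: Q_emp = -10 * log10(mismatches / bases).
# Hypothetical input columns: reported_Q  bases  mismatches (tab-separated).
# Bins with zero bases or zero mismatches are skipped to avoid log(0).
empirical_q() {
    awk '$2 > 0 && $3 > 0 {
        printf "%d\t%.1f\n", $1, -10 * log($3 / $2) / log(10)
    }'
}

# Toy numbers, invented for illustration only.
printf '14\t100000\t10000\n32\t1000000\t10000\n36\t1000000\t1000\n' | empirical_q
```

With these toy counts the reported scores of 14, 32 and 36 correspond to empirical qualities of 10.0, 20.0 and 30.0, i.e. every bin is over-reported. A recalibration step would map reported scores toward the empirical ones.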
Attached Images
File Type: png Quality_Recalibration_small.png (61.8 KB, 287 views)
Brian Bushnell is offline   Reply With Quote
Old 04-30-2015, 12:53 AM   #58
Zwerver
Junior Member
 
Location: North Europe

Join Date: Apr 2015
Posts: 1
Default

Hello,

We have one big project (RNA-seq and exome-seq) midway through, for which we have about half of the necessary NextSeq kits (v1). Do you think we can switch to v2 in the middle of the project, once the v1 kits have been used up? Or do you see a problem for the data analysis, in that the data would not actually be comparable because the v2 quality is higher?
Zwerver is offline   Reply With Quote
Old 04-30-2015, 02:37 AM   #59
TonyBrooks
Senior Member
 
Location: London

Join Date: Jun 2009
Posts: 298
Default

I asked the same question to Illumina Tech Support last week. Here's their response.

"Besides the changes in the software, there was a significant change in the v2 reagents. We changed the dyes of each base to improve the base call and overall sequencing data. This means that we have changed Q30 tables and other features in the software. Overall we do not recommend to compare data obtained between different versions of software.

Depending on which application, experimental design and data analysis workflow you may not have significant differences. It is your choice to set up the appropriate controls and technical/biological replicates.

The best test is to re-sequence a sample with v2 reagents that you have sequenced before with v1 reagents and compare directly the results."
TonyBrooks is offline   Reply With Quote
Old 04-30-2015, 03:19 AM   #60
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,976
Default

@Zwerver: This is going to be a tough decision.

You could buy a bunch of V1 reagents, but V1 and V2 require different versions of NCS, so until you have gone through the V1 reagents no one else will be able to use V2 chemistry on that machine.
GenoMax is offline   Reply With Quote
All times are GMT -8.