SEQanswers

Old 01-14-2016, 06:31 AM   #1
Sonderkar
Junior Member
 
Location: Denmark, Aalborg

Join Date: Jan 2009
Posts: 3
Default Obtaining cluster densities from a Hiseq2500 data set

Hi,
I'm making a wrapper for demultiplexing with bcl2fastq2 Conversion Software v2.17. After demultiplexing, I would like to collect various statistics from the run in a file.
It is fairly easy to get raw cluster counts, PF cluster counts, etc. However, I'm having trouble finding a file that contains the cluster density.
I know that it is stored in binary format in the InterOp folder and can be viewed with the Sequencing Analysis Viewer, but I would like to collect it in a single file for later use.

Any ideas?
Old 01-14-2016, 07:02 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,795
Default

This can be used for parsing the InterOp folder files: https://bitbucket.org/invitae/illuminate
Old 01-14-2016, 07:48 AM   #3
Jessica_L
Senior Member
 
Location: Washington, D.C. metro area

Join Date: Feb 2010
Posts: 116
Default

Oh, that's excellent. I was wondering just yesterday if something existed to parse from the InterOp binaries. Thanks!
Old 01-14-2016, 08:28 AM   #4
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,355
Default

I wrote the very first tiny part of that, and people at my company finished it! I still can't believe ILMN doesn't provide anything...grrrr
Old 01-14-2016, 09:52 AM   #5
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479
Default

Nice, our sequencing folks were asking me about programmatically storing stuff from those files just yesterday. Now I don't have to reinvent the wheel!
Old 01-14-2016, 10:24 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,795
Default

Not directly related to the original post, but with patterned flowcells, cluster number/density becomes irrelevant (since it is a fixed number). %PF is the thing to watch, and those numbers are in the demultiplexing report produced by bcl2fastq v2.17.x.

Last edited by GenoMax; 01-14-2016 at 01:17 PM.
Old 01-15-2016, 01:19 AM   #7
Sonderkar
Junior Member
 
Location: Denmark, Aalborg

Join Date: Jan 2009
Posts: 3
Default

First of all, thanks to GenoMax for pointing me to the illuminate program; it provides just what I needed. From inside the run folder I ran the command "illuminate --tile ." and got a quick summary:
TILE METRICS
------------
Mean Cluster Density: 829082
Mean PF Cluster Density: 497376
Total Clusters: 305632923
Total PF Clusters: 183352987
Percentage of Clusters PF: 59.991242
Aligned to PhiX: 0.000014
Read - PHASING / PRE-PHASING:
1 - 0.001078 / 0.000119
2 - 0.000000 / 0.000000
3 - 0.000955 / 0.000337

However, I needed to get the density per lane.
Adding --csv to the command ("illuminate --tile --csv . > tileinfo.csv") let me write the information for each tile to a CSV file. While searching for other parsers I found the R package savR, which is where I found out what the different codes mean:
100 Cluster Density
101 PF Cluster Density
102 Number of clusters
103 Number of PF clusters
400 Control lane

With that, it was fairly simple to filter the lines by code, sum up the numbers for each lane, and get the average cluster density per lane. I checked, and I got the same number as shown in the Summary tab of the Sequencing Analysis Viewer :-)
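
For anyone who wants to reproduce the per-lane step, here is a minimal Python sketch of the filtering and averaging described above. It is not the script used in this thread. The column names (lane, tile, code, value) and the sample rows are assumptions made up for illustration, so check the header of your own tileinfo.csv before relying on it. The codes 100 to 103 are the ones listed above; %PF per lane is computed as PF clusters divided by total clusters.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows. The column names are an ASSUMPTION about the CSV
# layout, not something confirmed by the illuminate documentation.
SAMPLE_CSV = """lane,tile,code,value
1,1101,100,800000
1,1101,101,480000
1,1101,102,2000000
1,1101,103,1200000
1,1102,100,850000
1,1102,101,510000
1,1102,102,2100000
1,1102,103,1260000
2,1101,100,900000
2,1101,101,450000
2,1101,102,2200000
2,1101,103,1100000
"""

# Tile metric codes as listed in the post above.
CODE_DENSITY = 100
CODE_PF_DENSITY = 101
CODE_CLUSTERS = 102
CODE_PF_CLUSTERS = 103


def _mean(values):
    """Arithmetic mean, or NaN when a lane has no rows for that code."""
    return sum(values) / len(values) if values else float("nan")


def summarize_lanes(handle):
    """Return {lane: {...}} with mean densities and %PF for each lane."""
    by_lane = defaultdict(lambda: defaultdict(list))  # lane -> code -> values
    for row in csv.DictReader(handle):
        code = int(float(row["code"]))  # tolerate "100" as well as "100.0"
        by_lane[int(row["lane"])][code].append(float(row["value"]))

    summary = {}
    for lane, by_code in sorted(by_lane.items()):
        clusters = sum(by_code[CODE_CLUSTERS])
        pf_clusters = sum(by_code[CODE_PF_CLUSTERS])
        summary[lane] = {
            "mean_density": _mean(by_code[CODE_DENSITY]),
            "mean_pf_density": _mean(by_code[CODE_PF_DENSITY]),
            "clusters": clusters,
            "pf_clusters": pf_clusters,
            "percent_pf": 100.0 * pf_clusters / clusters if clusters else float("nan"),
        }
    return summary


if __name__ == "__main__":
    # Swap io.StringIO(SAMPLE_CSV) for open("tileinfo.csv", newline="")
    # to run this on a real export.
    for lane, stats in summarize_lanes(io.StringIO(SAMPLE_CSV)).items():
        print(lane, stats)
```

Averaging the per-tile densities follows the description in the post. If your CSV has an extra index column or differently named headers, only the three row[...] lookups need to change.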
Old 10-13-2016, 11:50 PM   #8
angsm
Junior Member
 
Location: Singapore

Join Date: Oct 2016
Posts: 2
Default

I am using Illuminate, it's awesome, but it does not support the files from NextSeq... Does anyone have any recommendations?
Old 10-14-2016, 03:24 AM   #9
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,795
Default

Illumina makes a C++ library for parsing the contents of the InterOp folder available here: https://github.com/Illumina/interop It should be compatible with all extant Illumina sequencers.
Old 10-14-2016, 08:57 AM   #10
kcchan
Senior Member
 
Location: USA

Join Date: Jul 2012
Posts: 182
Default

Quote:
Originally Posted by angsm
I am using Illuminate, it's awesome, but it does not support the files from NextSeq... Anyone have any recommendations?

In my experience it works just fine on NextSeq runs. The intensity metrics look a little odd because of the 2-color chemistry, but other than that there shouldn't be any problems.
Old 10-16-2016, 10:51 PM   #11
angsm
Junior Member
 
Location: Singapore

Join Date: Oct 2016
Posts: 2
Default

Ohhh. I had not tried the Python library itself; it works! I was using the command line, and that version was older.

Thanks!

Tags
bcl2fastq2, cluster density
