  • Obtaining cluster densities from a Hiseq2500 data set

    Hi,
    I'm writing a wrapper for demultiplexing with bcl2fastq2 Conversion Software v2.17. After demultiplexing, I would like to collect various run statistics in a file.
    It is fairly easy to get raw cluster counts, PF cluster counts, etc. However, I'm having trouble finding a file that contains the cluster density.
    I know it is stored in binary format in the InterOp folder and can be viewed with the Sequencing Analysis Viewer, but I would like to collect it in a single file for later use.

    Any ideas?

  • #2
    This can be used for parsing the InterOp folder files: https://bitbucket.org/invitae/illuminate



    • #3
      Oh, that's excellent. I was wondering just yesterday whether something existed to parse the InterOp binaries. Thanks!



      • #4
        I wrote the very first tiny part of that, and people at my company finished it! I still can't believe ILMN doesn't provide anything...grrrr



        • #5
          Nice, our sequencing folks were asking me about programmatically storing stuff from those files just yesterday. Now I don't have to reinvent the wheel!



          • #6
            Not directly related to the original post, but with patterned flowcells, cluster number/density becomes irrelevant (since it is a fixed number). %PF is the thing to watch, and those numbers are in the demultiplexing report produced by bcl2fastq v2.17.x.
            Last edited by GenoMax; 01-14-2016, 02:17 PM.



            • #7
              First of all, thanks to GenoMax for pointing me to the illuminate program. It provides just what I needed. From inside the run folder I ran the command "illuminate --tile ." and got a quick summary:
              TILE METRICS
              ------------
              Mean Cluster Density: 829082
              Mean PF Cluster Density: 497376
              Total Clusters: 305632923
              Total PF Clusters: 183352987
              Percentage of Clusters PF: 59.991242
              Aligned to PhiX: 0.000014
              Read - PHASING / PRE-PHASING:
              1 - 0.001078 / 0.000119
              2 - 0.000000 / 0.000000
              3 - 0.000955 / 0.000337

              However, I needed the density per lane.
              Adding --csv to the command ("illuminate --tile --csv . > tileinfo.csv") let me write the information for each tile to a CSV file. While searching for other parsers I found the R package savR, which is where I found out what the different codes mean:
              100 Cluster Density
              101 PF Cluster Density
              102 Number of clusters
              103 Number of PF clusters
              400 Control lane

              Now it was fairly simple to filter lines based on the code, sum up the numbers for each lane, and get the average cluster density per lane. I checked, and I got the same number as shown in the Summary tab of the Sequencing Analysis Viewer :-)
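For anyone wanting to repeat the per-lane step described above, here is a minimal sketch in Python. It assumes the CSV has header columns named lane, code and value, with one row per lane/tile/code; those column names are an assumption about the illuminate output and may need adjusting to match the actual header. The function names are made up for illustration. The metric codes are the ones listed in the post.

```python
import csv
from collections import defaultdict

# Tile metric codes as listed in the post (found via the savR package).
CLUSTER_DENSITY = 100
PF_CLUSTER_DENSITY = 101
CLUSTER_COUNT = 102
PF_CLUSTER_COUNT = 103


def _mean(numbers):
    """Arithmetic mean, or None for an empty list."""
    return sum(numbers) / len(numbers) if numbers else None


def summarize_lanes(rows):
    """Return {lane: summary dict} from an iterable of dict-like rows.

    ASSUMPTION: each row has 'lane', 'code' and 'value' keys. These names
    are a guess at the CSV header, not a documented format.
    """
    # values[lane][code] -> list of per-tile values
    values = defaultdict(lambda: defaultdict(list))
    for row in rows:
        lane = int(float(row["lane"]))
        code = int(float(row["code"]))
        values[lane][code].append(float(row["value"]))

    summary = {}
    for lane, by_code in sorted(values.items()):
        clusters = sum(by_code.get(CLUSTER_COUNT, []))
        pf_clusters = sum(by_code.get(PF_CLUSTER_COUNT, []))
        summary[lane] = {
            # Densities are averaged over tiles; counts are summed.
            "mean_density": _mean(by_code.get(CLUSTER_DENSITY, [])),
            "mean_pf_density": _mean(by_code.get(PF_CLUSTER_DENSITY, [])),
            "clusters": clusters,
            "pf_clusters": pf_clusters,
            "percent_pf": 100.0 * pf_clusters / clusters if clusters else None,
        }
    return summary


def summarize_csv(path):
    """Read a tile-metrics CSV (hypothetical layout, see above) and summarize it."""
    with open(path, newline="") as handle:
        return summarize_lanes(csv.DictReader(handle))
```

Calling summarize_csv("tileinfo.csv") would then give one entry per lane. Rows with other codes (phasing, control lane, and so on) are simply ignored. The densities come out in the same units as the raw values, so they may need scaling (probably by 1000) to match the K/mm² figures the viewer displays.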



              • #8
                I am using Illuminate, it's awesome, but it does not support the files from NextSeq... Does anyone have any recommendations?



                • #9
                  Illumina provides a C++ library for parsing the contents of the InterOp folder, which should be compatible with all extant Illumina sequencers: https://github.com/Illumina/interop



                  • #10
                    Originally posted by angsm
                    I am using Illuminate, it's awesome, but it does not support the files from NextSeq... Does anyone have any recommendations?
                    In my experience it works just fine on NextSeq runs. The intensity metrics are a little funny because of the two-color chemistry, but other than that there shouldn't be any problems.



                    • #11
                      Ohhh. I had not tried the Python library itself, and it works! I was using the command line, and my version was older.

                      Thanks!
