
  • Run statistics on GA2 or GA2x...

    Hi everyone,
    I have thought for a while that it would be incredibly useful to get some idea of how other people's instruments are performing compared to mine. It would certainly give me some options when something looks suspicious with a run. It could also help spot reagent issues more quickly, since failures might crop up all over the place at the same time. This idea comes from an Affymetrix database that was set up at my last institute to compare Affy RPT QC files. I was also reinspired by the Sanger plots: http://www.sanger.ac.uk/Teams/Team117/#mpsa_error.

    To do this I would like to see some run metrics brought together in one place, in a format that is easy to download and compare. Wherever they are brought together, it would be best if anyone could upload files of run data easily. The easier it is to get data into a site, the more usable the database becomes, and it becomes self-perpetuating.

    Possible metrics would include some of the summary stats: yield, cpt, %PF, error, etc. Alongside this we would need to see run length, run type, instrument version and library type. It might also be handy to have some data that people might consider sensitive: genome, instrument location, operator, etc. I would be happy to publish most of this from my facility, and a quick question to the submitting scientist should release the rest.
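To make the idea concrete, here is a minimal sketch of what one shared run record and a downloadable CSV export might look like. Everything in it is an assumption for illustration: the field names, the `RunRecord` and `to_csv` names, and the reading of "cpt" as clusters per tile are hypothetical, not an Illumina file format or an existing database schema.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class RunRecord:
    # Summary statistics (hypothetical field names, not an Illumina format)
    yield_gbp: float
    clusters_per_tile: int  # assumes "cpt" means raw clusters per tile
    percent_pf: float
    error_rate_pct: float
    # Run description needed for a like-for-like comparison
    read_length_bp: int
    run_type: str       # e.g. "SE" or "PE"
    instrument: str     # e.g. "GA2" or "GA2x"
    library_type: str   # e.g. "mRNA" or "ChIP"


def to_csv(records):
    """Serialise records to CSV text with one header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(RunRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buf.getvalue()


def comparable(a, b):
    """True when two runs share read length, run type and library type."""
    key_a = (a.read_length_bp, a.run_type, a.library_type)
    key_b = (b.read_length_bp, b.run_type, b.library_type)
    return key_a == key_b
```

The `comparable` check is one possible way to avoid lining up runs that differ in read length, run type or library type. The sensitive fields (genome, location, operator) are left out here and could be added as optional columns.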

    It is getting difficult to find out how well systems are performing when I talk to people, as it is too easy to forget that we are not comparing identical runs: SE and PE, mRNA or ChIP, 35bp or 100bp, etc. What yield do you get from a 45bp SE run? We have had almost 6 Gbp, which I think is good, but I would like to know if we should be trying harder!

    So how do we get started? Who will take up the challenge, and how can we decide they can be trusted to build something reliable? Would we be happy asking Illumina to do this? Will anyone else out there upload data, with or without the more sensitive metadata? Will anyone look at it?

    James.

  • #2
    This would be most interesting -
    I have wondered for quite some time now why the machine I get my data from differs so much from the specs:
    GA2 (I guess - I'm just the data guy):
    PhiX Lane 5: 4,897,847 reads PE (so 9,795,694 reads in total), each 76bp long.

    To me this seems way below what is possible, doesn't it?

    Best
    -Jonathan

    • #3
      Jonathan,

      If the 4.9 million reads are PF reads then it is not unreasonably low, assuming the sequencing facility is not loading a high amount of the controls. You would really need to know the number of raw clusters per lane to see if the flow cell is being utilized to its fullest. With the new pipeline 1.4 software (or SCS 2.4 with RTA), Illumina recommends loading at a concentration which will achieve 180,000-220,000 raw clusters per tile. For a GAII (100 tiles per lane) this would be 18-22 million raw clusters per lane, and for the GAIIx (120 tiles per lane) it would be 21.6-26.4 million raw clusters per lane. Prior to the new software release the recommendation was for 120,000-150,000 clusters per tile. If the facility has not upped the density on their flow cells, I would not be terribly disappointed to see 5 million PF clusters in a lane.
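The per-lane figures above are just clusters per tile multiplied by tiles per lane. A small sketch of that arithmetic, using only the tile counts and density ranges quoted in this post (the function and dictionary names are made up for illustration):

```python
# Tiles per lane as quoted in the post above.
TILES_PER_LANE = {"GAII": 100, "GAIIx": 120}

# Recommended raw clusters per tile as quoted in the post above.
DENSITY_RANGES = {
    "pipeline 1.4 / SCS 2.4 with RTA": (180_000, 220_000),
    "earlier software": (120_000, 150_000),
}


def raw_clusters_per_lane(clusters_per_tile, instrument):
    """Raw clusters per lane = clusters per tile x tiles per lane."""
    return clusters_per_tile * TILES_PER_LANE[instrument]


for software, (low, high) in DENSITY_RANGES.items():
    for instrument in TILES_PER_LANE:
        lane_low = raw_clusters_per_lane(low, instrument) / 1e6
        lane_high = raw_clusters_per_lane(high, instrument) / 1e6
        print(f"{software}, {instrument}: "
              f"{lane_low:.1f}-{lane_high:.1f} million raw clusters per lane")
```

Note that these are raw cluster counts; the number of PF reads in a lane will be lower and depends on the pass-filter rate of the run.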

      • #4
        Hi, I'm interested in a metric relating cluster density to the percentage of usable sequences after the built-in QC functions are run on the machine itself. I've heard stories of people getting great cluster density (>150,000) but only being able to map (and therefore use) 30%-50% of the reads. This type of QC might not be relevant to ChIP-seq users, but resequencers and metagenomics people would like to know how efficient these machines are in practice.

        thanks for sharing your experiences,
        _der.

        • #5
          cluster density

          Has anyone ever seen a drop in intensities mid-run? We recently upgraded to the 1.4 pipeline, SCS 2.4 and the new PEM on the GA2x with the large reagent bottles. Most of our runs were in the 40-cycle range, but we recently ran a longer one of 76 cycles. At about cycle 40 we saw a huge drop in signal intensity, across both lasers and all 4 nucleotides. Our FAS is saying that the FC is overloaded (at about 220 raw clusters per tile) and that for longer reads RTA has issues at cycle 40. Has anyone else heard of this or seen anything similar? It sounds like BS to me, but I wanted to check.

          He's also suggesting a lower cluster number for longer reads. I'm not sure I understand why it would matter.

          Thanks,

          Sandy

          • #6
            I can see why cluster density should be reduced when fragment size increases: clusters of fragments are not static but dynamic and can "sway", potentially interfering with neighbouring clusters. But why read length should matter, I don't know.
