
  • base composition and base calling

    I've read that the Illumina base-calling software has problems calibrating itself if the base composition in the first few bases of the reads isn't a roughly equal mix of nucleotides. We're thinking of sequencing constructs that begin with our own barcode, and we were wondering what the parameters are for correct base calling:

    - how many positions are used to calibrate?
    - what are the bounds on acceptable nucleotide mixture? e.g., how far off from 25% each can you be?
    - I believe you can calibrate on something other than the first four bases. How far into the read can you wait to calibrate?
    - can different lanes in a run be calibrated differently? e.g., if our sample is one lane of a run, does that make this easier or harder for the sequencing facility?
    - does any of this vary between the GAII and HiSeq?

    Thanks!
    Alex

  • #2
    Hi Alex,

    To my knowledge, the Illumina pipeline by default performs its crosstalk matrix and phasing/prephasing calibration during the first 4 cycles, and this can be changed with --matrix-cycles=n. As with using many cycles for cluster detection, this probably means that the workstation PC has to store more images until the initial calibration calculations are done, at which point the real-time data analysis starts. Using a lot of cycles will cause a backlog on the workstation, but I would think this should be manageable for at least 10 or so cycles (at least on a GA; I'm not so sure about the HiSeq, as it generates so much more data).

    You can avoid these problems by specifying a control lane with a relatively normal base composition (--control-lane=..), such as a lane of PhiX or whole-genome shotgun sequencing. Alternatively, it is also possible not to perform calibration on the sample at all and to use a pre-formatted calibration table instead (probably slightly different ones for GA and HiSeq).

    Something else you should consider is that you might lose a certain amount of data because cluster detection does not work normally if you have low diversity at the start of the sequences, and this is completely independent of a skewed base composition. This depends mainly on the number of barcodes you have in your sample and on the cluster density. In summary, the fewer barcodes you have and the higher your cluster density, the more data you are likely to lose. Please refer to this post for more information (http://seqanswers.com/forums/showthr...light=bareback), or send me an email if you have any further questions.
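To get a feel for how skewed a given barcode design is before sequencing, the per-cycle base composition of the barcode set can be tabulated. The following is only a minimal sketch, not part of the Illumina pipeline: the function names, the example barcodes, and the choice of 4 cycles (the default calibration window mentioned above) are assumptions for illustration, and the 25% reference point is simply the even mix from the original question, not a documented acceptance threshold.

```python
from collections import Counter

BASES = "ACGT"

def base_composition(seqs, n_cycles=4):
    """Per-cycle fraction of A/C/G/T over the first n_cycles of each sequence.

    Bases other than A/C/G/T (e.g. N) are ignored; sequences shorter than
    a given cycle simply do not contribute to that cycle.
    """
    profile = []
    for cycle in range(n_cycles):
        counts = Counter(
            s[cycle].upper()
            for s in seqs
            if len(s) > cycle and s[cycle].upper() in BASES
        )
        total = sum(counts.values())
        profile.append(
            {b: (counts[b] / total if total else 0.0) for b in BASES}
        )
    return profile

def max_deviation(profile, expected=0.25):
    """Largest absolute departure from an even mix, one value per cycle."""
    return [max(abs(f - expected) for f in cycle.values()) for cycle in profile]

# Hypothetical barcode sets, for illustration only.
balanced = base_composition(["ACGT", "CGTA", "GTAC", "TACG"])
skewed = base_composition(["ACGT", "ACGA"])

print(max_deviation(balanced))
print(max_deviation(skewed))
```

A set in which every base appears equally often at every position gives a deviation of 0 at each cycle, whereas a set sharing the same leading bases gives large deviations in exactly the cycles used for calibration. Weighting each barcode by its expected share of the pool would be a natural refinement, since this sketch treats all barcodes as equally abundant.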



    • #3
      Thank you! This plus your paper is very helpful.
