  • nouse
    Member
    • Sep 2013
    • 11

    What is the official "first sequenced position" (for overlap calculation)?

    The question emerged after reviewing this older application note by Illumina:



    Their figure 1 indicates that the first position which is counted towards the targeted read length is the first base following the 3' end of the gene-specific primer.

    Hence, the 515F-806R system targeting the V4 region of the 16S rRNA gene is perfectly usable with a 2 x 150 bp MiSeq run, because of a 46 bp overlap within the 253 bp fragment covered between both gene-specific primers.

    However, if one assumes that the first base counted toward the target read length is the base following the sequencing primer, that obviously changes. If both PCR primers are counted in, the overlap is reduced to <10 bases, and it falls in the predictably worst part of both reads, quality-wise. Adding barcode(s) would even result in no overlap.

    I hope this example clarifies the question.

    The underlying task is to find a primer pair that is feasible with MiSeq 2 x 250 with very good coverage and HiSeq 2 x 150 with lower coverage but higher yields.

    I doubt that 2 x 150 bp HiSeq is a good system for this. However, it seems that it's OK to just use the forward read (according to Caporaso et al. 2011). What do you think?
    Last edited by nouse; 02-14-2017, 02:26 PM.
  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    For amplicons, my understanding is (and I would appreciate if someone closer to the wet-lab could confirm or deny this) that the molecules being sequenced are laid out like this:

    [adapter1][barcode1][more adapter1][sequencing primer1][pcr primer1] actual genomic sequence [pcr primer2][sequencing primer2][more adapter2][barcode2][adapter2]

    The sequencing primers are not (usually) part of the read, unless you are using staggered variable-length primers to increase library diversity, but in that case only a few bp of it get sequenced. The PCR primers are always part of the read. I think that whether the PCR primers are genomic or synthetic depends on the process; I've never really gotten a conclusive answer on that.
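To make the layout above concrete, here is a minimal sketch that walks the molecule from wherever the sequencing primer anneals and reports which segments end up in read 1. The segment order is taken from the layout above; every length is a hypothetical placeholder, and the function name `read_contents` is invented for illustration. It only models read 1 and ignores index reads.

```python
from typing import List, Tuple

Segment = Tuple[str, int]  # (name, length in bases)

# Order follows the layout in the post; all lengths are hypothetical placeholders.
MOLECULE: List[Segment] = [
    ("adapter1", 30),
    ("barcode1", 8),
    ("more adapter1", 10),
    ("sequencing primer1", 33),
    ("pcr primer1", 20),
    ("genomic", 250),
    ("pcr primer2", 20),
    ("sequencing primer2", 33),
    ("more adapter2", 10),
    ("barcode2", 8),
    ("adapter2", 30),
]


def read_contents(molecule: List[Segment], anneal_to: str, read_length: int) -> List[Segment]:
    """Return (segment, bases sequenced) pairs for a read that starts
    at the first base after the segment the sequencing primer anneals to."""
    names = [name for name, _ in molecule]
    start = names.index(anneal_to) + 1
    covered: List[Segment] = []
    remaining = read_length
    for name, length in molecule[start:]:
        if remaining <= 0:
            break
        used = min(length, remaining)
        covered.append((name, used))
        remaining -= used
    return covered


# Standard sequencing primer: the PCR primer is the first thing in the read.
standard = read_contents(MOLECULE, "sequencing primer1", 150)
# A sequencing primer that anneals to the PCR primer itself: the read starts in the target.
custom = read_contents(MOLECULE, "pcr primer1", 150)
print(standard)
print(custom)
```

With these placeholder lengths, a 150-cycle read from the standard priming site spends its first 20 cycles on the PCR primer, while a read primed from the PCR primer itself spends all 150 cycles on the target.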


    • nouse
      Member
      • Sep 2013
      • 11

      #3
      Thanks for your answer.
      From my experience with the HiSeq, the raw sequences I got included barcodes and PCR primers (which makes sense, since they have been sequenced after all).

      This indicates that figure 1 of the Illumina application note is either misleading or wrong, or they used their PCR primer regions as a target for another sequencing round.


      • nucacidhunter
        Jafar Jabbari
        • Jan 2013
        • 1250

        #4
        Brian’s explanation of the amplicon library structure is correct. However, I would add that if there are variable-length diversity nucleotides or barcodes at the 5’ end of either PCR primer, they will be sequenced as well, along with the PCR primers. If someone uses custom sequencing primers that bind to the PCR primers, then the PCR primers will not be sequenced (and sequencing primer sites will not need to be included in the adapter design). In this case, diversity nucleotides added to the 5’ end of the primers will not be useful because they cannot be sequenced.

        Fig 1 in Illumina’s note indicates that the hypervariable region is 254 bp and that the minimum length of the amplified region, including the conserved 5’ and 3’ flanking regions (used for priming), is 291 bp. So 2x150 will not be enough to provide a 46 bp overlap unless custom primers were used for sequencing. But the figure indicates that standard Illumina sequencing primers were used for sequencing, thus the figure is incorrect.
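The arithmetic behind those numbers can be sketched as follows. The 254 bp and 291 bp spans and the 2x150 read length come from the posts above; the 12 nt in-line barcode is a hypothetical value added only to illustrate the "no overlap" case, and the function name is invented.

```python
def paired_end_overlap(read_length: int, sequenced_span: int) -> int:
    """Bases covered by both reads of a pair.

    sequenced_span is the distance from the first base of read 1 to the
    first base of read 2 (inclusive). A negative result means a gap
    between the two reads rather than an overlap.
    """
    # If the span is shorter than one read, the overlap is the whole span.
    return min(2 * read_length - sequenced_span, sequenced_span)


# Reads start after the PCR primers (custom sequencing primers): target only.
print(paired_end_overlap(150, 254))       # 46
# Reads start at the PCR primers (standard sequencing primers).
print(paired_end_overlap(150, 291))       # 9
# Same, plus a hypothetical 12 nt in-line barcode on one primer.
print(paired_end_overlap(150, 291 + 12))  # -3, i.e. a 3 bp gap
# The same 291 bp span on a 2x250 run.
print(paired_end_overlap(250, 291))       # 209
```

So the 46 bp figure only holds if the first sequenced base is the first base after the PCR primer; counting the primers in leaves fewer than 10 bases, all of them at the low-quality ends of both reads.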


        • kmcarr
          Senior Member
          • May 2008
          • 1181

          #5
          Originally posted by Brian Bushnell:
          For amplicons, my understanding is (and I would appreciate if someone closer to the wet-lab could confirm or deny this) that the molecules being sequenced are laid out like this:

          [adapter1][barcode1][more adapter1][sequencing primer1][pcr primer1] actual genomic sequence [pcr primer2][sequencing primer2][more adapter2][barcode2][adapter2]

          The sequencing primers are not (usually) part of the read, unless you are using staggered variable-length primers to increase library diversity, but in that case only a few bp of it get sequenced. The PCR primers are always part of the read. I think that whether the PCR primers are genomic or synthetic depends on the process; I've never really gotten a conclusive answer on that.
          It is not always the case that the PCR primers are part of the read. In the two most cited 16S-V4 protocols (Caporaso & Knight, Kozich & Schloss), custom sequencing primers that match the target-specific PCR primers are added to the MiSeq run. This results in read data that starts immediately after the 3' ends of the PCR primers, so there is no PCR primer sequence to trim from your reads, and hence no wasted sequence.

          Caporaso, J. G., Lauber, C. L., Walters, W. A., Berg-Lyons, D., Lozupone, C. A., Turnbaugh, P. J., et al. (2011). Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences, 108 Suppl 1, 4516–4522. http://doi.org/10.1073/pnas.1000080107

          Kozich, J. J., Westcott, S. L., Baxter, N. T., Highlander, S. K., & Schloss, P. D. (2013). Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Applied and Environmental Microbiology, 79(17), 5112–5120. http://doi.org/10.1128/AEM.01043-13
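For contrast, when standard sequencing primers are used the PCR primer sits at the start of every read and has to be trimmed. A minimal sketch of that trimming step, assuming the commonly published 515F sequence with its degenerate M base (check it against your own protocol); the helper name `strip_primer` and the example reads are made up, and real tools also tolerate mismatches and indels, which this does not.

```python
# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Assumed forward primer (515F as commonly published); verify before use.
PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"


def strip_primer(read: str, primer: str) -> str:
    """Remove the primer from the 5' end of the read if it matches exactly,
    treating degenerate primer positions as sets of allowed bases.
    Reads that do not start with the primer are returned unchanged."""
    if len(read) < len(primer):
        return read
    for base, code in zip(read, primer):
        if base not in IUPAC[code]:
            return read
    return read[len(primer):]


# Made-up reads: one primed from the standard site, one from a custom primer.
with_primer = "GTGCCAGCAGCCGCGGTAA" + "TACGGAGGGTGC"
without_primer = "TACGGAGGGTGC"
print(strip_primer(with_primer, PRIMER_515F))     # TACGGAGGGTGC
print(strip_primer(without_primer, PRIMER_515F))  # TACGGAGGGTGC
```

With the custom sequencing primers described above, this step is unnecessary because the reads already begin at the first base after the primer.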


          • Brian Bushnell
            Super Moderator
            • Jan 2014
            • 2709

            #6
            Originally posted by kmcarr:
            In the two most cited 16S-V4 protocols (Caporaso & Knight, Kozich & Schloss) custom sequencing primers which match the target specific PCR primer are added to the MiSeq run. This results in read data which starts immediately after the 3' ends of the PCR primers so there is no PCR primer sequence to trim from your reads, and hence no wasted sequence.

            Oh, that's clever. I wonder if that caused some compromises that limit the diversity of organisms that will amplify? I guess I need to read the papers.
