Hi all,
I'd love some help sorting this out, as I'm new to this field. I want to start making 16S amplicon libraries for Illumina sequencing. As this is pretty new, there aren't well-established protocols yet, and I'm trying to pick through what's out there and come up with the easiest, most effective, most cost-conscious method.
Caporaso et al., PNAS 2011 (here) use a three-primer system, which I guess is what Illumina's expensive 12-maximum indexing system uses, as well as the Nextera system. My home core is nervous about the three-primer system, but could be convinced. I'm worried about the increased bioinformatic difficulty (is there any?), concerns I've seen here on SEQanswers about library quality, and the general increased complexity of this approach.
Bartram et al., AEM 2011 (here) use a different, seemingly simpler strategy with barcodes in line with the sequencing primer and Illumina adapters. I think this ends up with them sequencing through the primer, which "wastes" precious Illumina read length, but if it ends up being simpler, I'm intrigued by it. I don't understand, though, how the barcode gets read, as it seems to me to be on the wrong side of the Illumina sequencing primer?
Here's the description: Lowercase letters denote adapter sequences necessary for binding to the flow cell, underlined lowercase are binding sites for the Illumina sequencing primers, bold uppercase highlight the index sequences (the first 12 indexes were obtained from Illumina) and regular uppercase are the V3 region primers (341F for the forward primers and 518R for the reverse primers).
And the sequences:
V3_f_modified
aatgatacggcgaccaccgagatctacactctttccctacacgacgctcttccgatctNNNNCCTACGGGAGGCAGCAG
An example of a barcoded reverse primer:
aagcagaagacggcatacgagatCGTGATgtgactggagttcagacgtgtgctcttccgatctATTACCGCGGCTGCTGG
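Since the bold and underline formatting from the paper doesn't survive in plain text, here is a minimal Python sketch that splits the two primers above into their lowercase and uppercase runs, so the layout described in the paper is easier to see. The function names (`split_by_case`, `describe`) are made up for illustration, and the sketch only uses letter case, so it cannot tell the flow cell adapter apart from the sequencing primer binding site.

```python
from itertools import groupby

# Primer sequences copied from the post above.
V3_F_MODIFIED = "aatgatacggcgaccaccgagatctacactctttccctacacgacgctcttccgatctNNNNCCTACGGGAGGCAGCAG"
BARCODED_REVERSE = "aagcagaagacggcatacgagatCGTGATgtgactggagttcagacgtgtgctcttccgatctATTACCGCGGCTGCTGG"


def split_by_case(primer):
    """Split a primer into consecutive runs of lowercase and uppercase letters."""
    return ["".join(run) for _, run in groupby(primer, key=str.islower)]


def describe(name, primer):
    """Print each run with its length and whether it is lowercase or uppercase."""
    print(name)
    for segment in split_by_case(primer):
        # Case is the only information left in plain text (an assumption of this sketch).
        kind = "adapter (lowercase)" if segment.islower() else "index/primer (uppercase)"
        print(f"  {len(segment):>3} nt  {kind:<26} {segment}")


describe("V3_f_modified", V3_F_MODIFIED)
describe("barcoded reverse", BARCODED_REVERSE)
```

In the reverse primer, the short uppercase run between the two lowercase runs is the index from the description, and the final uppercase run is the 518R primer. In the forward primer, the four N's sit directly in front of 341F.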
Final questions:
- what sort of modifications do you need to order on these primers? phosphate, phosphorothioate, ...?
- the second paper mentions the improvement when using the four N's, so as to improve the complexity and help with cluster generation. Does anyone have experience with this?
- any other practical help would be really appreciated!
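To make the four-N question concrete, here is a rough Python sketch of what the N's do to base composition in the first sequencing cycles. It assumes an idealized case: the read starts at the first N, every NNNN combination is equally represented, and every template carries the same 341F primer. The helper names are hypothetical, and this says nothing about how much the effect matters on a real run.

```python
from itertools import product

V3_FORWARD = "CCTACGGGAGGCAGCAG"  # 341F portion of V3_f_modified, from the post


def read_starts(with_ns):
    """Idealized read starts: every NNNN combination plus the primer, or the primer alone."""
    if with_ns:
        return ["".join(n) + V3_FORWARD for n in product("ACGT", repeat=4)]
    return [V3_FORWARD]


def bases_per_cycle(reads, cycles):
    """Count how many distinct bases appear at each of the first `cycles` positions."""
    return [len({read[i] for read in reads}) for i in range(cycles)]


print("with NNNN:   ", bases_per_cycle(read_starts(True), 8))
print("without NNNN:", bases_per_cycle(read_starts(False), 8))
```

Under these assumptions, the first four cycles see all four bases when the N's are present, and only one base per cycle once the read enters the primer. Without the N's, every cycle over the primer sees a single base.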