I am aiming to sequence 72 amplicon libraries in one lane, and I plan to use in-line barcodes: 12 on the 5' end and 6 on the 3' end. As I understand it, the first 4 nucleotides of each read are crucial for cluster identification and should therefore be base-balanced. I checked the standard Illumina TruSeq barcode sequences, which are supposed to be well balanced, but it seems that they aren't (or I'm missing something).
See:
RPI01 ATCACG
RPI02 CGATGT
RPI03 TTAGGC
RPI04 TGACCA
RPI05 ACAGTG
RPI06 GCCAAT
RPI07 CAGATC
RPI08 ACTTGA
RPI09 GATCAG
RPI10 TAGCTT
RPI11 GGCTAC
RPI12 CTTGTA
At position 3, A is overrepresented (4 of 12) at the expense of G (2 of 12). Is there any reason for this? Is it better to leave the set as it is, or to order a better-balanced barcode, CGGTGT, in place of RPI02?
For the 3'-end barcodes I need 6 sequences, and as I understand it they should be base-balanced as well. There is no way to get equal amounts of each nucleotide across 6 barcodes, so at each position 2 nucleotides should appear twice and 2 should appear once.
The first 6 barcodes do not meet this requirement because A is overrepresented, so I came up with this set:
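For reference, the per-position composition of the 12 barcodes can be tallied column by column. This is only a minimal sketch: the helper name `position_counts` is made up for illustration, and the sequences are copied exactly as listed in this post.

```python
from collections import Counter

# The 12 barcodes exactly as listed in the post.
BARCODES = {
    "RPI01": "ATCACG", "RPI02": "CGATGT", "RPI03": "TTAGGC",
    "RPI04": "TGACCA", "RPI05": "ACAGTG", "RPI06": "GCCAAT",
    "RPI07": "CAGATC", "RPI08": "ACTTGA", "RPI09": "GATCAG",
    "RPI10": "TAGCTT", "RPI11": "GGCTAC", "RPI12": "CTTGTA",
}

def position_counts(seqs):
    """Return one Counter per position: how often each base occurs there."""
    # zip(*seqs) turns the list of barcodes into columns (one per position).
    return [Counter(column) for column in zip(*seqs)]

counts = position_counts(BARCODES.values())
for pos, c in enumerate(counts, start=1):
    # Counter returns 0 for bases that never occur at this position.
    print(pos, {base: c[base] for base in "ACGT"})
```

For position 3 this reports A=4, C=3, G=2, T=3. Swapping RPI02 for CGGTGT would change that position to A=3, G=3.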
RPI01 ATCACG
RPIUU CGGATT
RPI03 TTAGGC
RPI04 TGACCA
RPI08 ACTTGA
RPI09 GATCAG
I had to design a custom barcode, RPIUU, because none of Illumina's 48 barcodes fit.
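The criterion I used for the 6-barcode set (at every position each base appears once or twice) can be written as a small check. This is a sketch under that assumption only; `is_balanced` and its `low`/`high` parameters are hypothetical names, not part of any Illumina tool.

```python
from collections import Counter

# The proposed 3'-end set, including the custom RPIUU barcode.
CUSTOM_SET = ["ATCACG", "CGGATT", "TTAGGC", "TGACCA", "ACTTGA", "GATCAG"]

# The first six barcodes of the standard list, for comparison.
FIRST_SIX = ["ATCACG", "CGATGT", "TTAGGC", "TGACCA", "ACAGTG", "GCCAAT"]

def is_balanced(seqs, low=1, high=2):
    """True if, at every position, each base occurs between low and high times."""
    for column in zip(*seqs):
        c = Counter(column)
        if any(not (low <= c[base] <= high) for base in "ACGT"):
            return False
    return True

print(is_balanced(CUSTOM_SET))  # the custom set passes this criterion
print(is_balanced(FIRST_SIX))   # fails: A occurs 4 times at position 3
```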
My questions: Did I get the balancing requirements for the starting nucleotides of amplicons right, or are there some tricks that I missed?
Are my barcode sets well suited for multiplexing?