
**kmcarr** · 08-05-2016, 06:38 AM

Originally posted by Samarpana View Post

I am trying to replicate a protocol established in the labs of our collaborator. The problem is they have provided me with the adapter sequences for barcoding multiple samples but did not provide any further information. Nor are they responding to emails.

I have always used Illumina generated indexes (nextera) so can't figure out what are the index sequences in these adapters. Any help will be appreciated.

The sequences are:

Code:

a1: ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT GTC ATG A*T
a2: /5Phos/TCA TGA CAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T

a3: ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT ACA TCT C*T
a4: /5Phos/GAG ATG TAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T

* = phosphorothioate bond

Which are the index sequences to be used for demultiplexing the samples, once the run is over?

The indexes are the 7 bases immediately upstream of the T overhang. The read 1 sequencing primer anneals upstream of that, so your barcodes will be the first 7 bases of read #1 (followed by a T in every read) instead of the usual dedicated index reads. I have redrawn the primer pairs in the annealed state, with the bottom strand in its proper 3'->5' orientation. The barcode is set off by spaces, and the position of the R1 sequencing primer is shown.

Code:

5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT GTCATGA *T
3'-TGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGA CAGTACT -p

5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT ACATCTC *T
3'-TGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGA TGTAGAG -p

5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-->
Read 1 primer
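
The reasoning above can be checked with a short script. This is only a sketch: the adapter sequences are copied from the original question, while the helper names (`clean`, `reverse_complement`, `inline_barcode`) are made up for illustration. It confirms that each bottom strand anneals to its top strand minus the T overhang, then reports the 7 bases just upstream of that T.

```python
# Illustrative sketch: find the inline barcode in each adapter pair.
# Sequences come from the thread; all function names are hypothetical.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def clean(oligo: str) -> str:
    """Strip spaces, the 5' phosphate tag and the '*' bond marker."""
    return oligo.replace("/5Phos/", "").replace(" ", "").replace("*", "")


def inline_barcode(top: str, bottom: str, length: int = 7) -> str:
    """Return the `length` bases immediately upstream of the T overhang."""
    top, bottom = clean(top), clean(bottom)
    # The top strand should end in a single T overhang, and everything
    # before that T should pair with the bottom strand.
    if not top.endswith("T") or top[:-1] != reverse_complement(bottom):
        raise ValueError("strands do not anneal as expected")
    return top[:-1][-length:]


a1 = "ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT GTC ATG A*T"
a2 = "/5Phos/TCA TGA CAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T"
a3 = "ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT ACA TCT C*T"
a4 = "/5Phos/GAG ATG TAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T"

barcodes = {
    "a1/a2": inline_barcode(a1, a2),
    "a3/a4": inline_barcode(a3, a4),
}
print(barcodes)
```

The barcodes are read off the top strand in its 5'->3' orientation, which is the orientation in which they appear at the start of read 1.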

**Samarpana** · 08-05-2016, 08:31 AM

Thanks so much for explaining. So, when I enter it in the sample sheet for demultiplexing, should I use the complementary sequence of the 7 bases?

**thermophile** · 08-05-2016, 09:57 AM

Index 1 (the one on the P7 end) needs to be reverse complemented. Index 2 (the one on the P5 end) does not.

**kmcarr** · 08-05-2016, 10:32 AM

Originally posted by thermophile View Post

Index 1 (the one on the P7 end) needs to be reverse complemented. Index 2 (the one on the P5 end) does not.

That does not really apply in this case since the index is part of sequencing read 1, not a dedicated read.

In this case enter the barcode as it appears (do not reverse complement). That is the orientation it will appear in the sequencing read so that is the orientation which should be in the sample sheet.

If you are using Illumina's bcl2fastq to demultiplex your data, the correct --use-bases-mask setting would be "I7NY*" for single-end reads and "I7NY*,Y*" for paired-end reads. This tells bcl2fastq to use the first 7 bases of read 1 as the index, skip the "T" that follows the index, and output the remainder of read 1 as the sequence read.
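
To make the effect of that mask concrete, here is a minimal Python sketch of the same split applied to read 1: 7 index bases, 1 skipped base, the rest kept. It is not how bcl2fastq is implemented. The sample names and example reads are invented, and matching is exact for simplicity, whereas real demultiplexers usually tolerate some mismatches.

```python
# Sketch of the split described by "I7NY*" on read 1.
# Barcodes are from the thread; sample names and reads are made up.

BARCODES = {
    "GTCATGA": "sample_a1a2",  # hypothetical sample name
    "ACATCTC": "sample_a3a4",  # hypothetical sample name
}


def split_read1(seq: str, index_len: int = 7, skip: int = 1):
    """Return (index, insert): first 7 bases, then drop 1 base, keep the rest."""
    index = seq[:index_len]
    insert = seq[index_len + skip:]
    return index, insert


def assign(seq: str):
    """Assign a read to a sample by exact barcode match, as entered (no reverse complement)."""
    index, insert = split_read1(seq)
    return BARCODES.get(index, "undetermined"), insert


reads = [
    "GTCATGATACGTTGCA",  # barcode GTCATGA + T + insert
    "ACATCTCTGGGCATTA",  # barcode ACATCTC + T + insert
    "NNNNNNNTACGT",      # no recognisable barcode
]
results = [assign(r) for r in reads]
for sample, insert in results:
    print(sample, insert)
```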

One question: how are you adding the P5 and P7 sequences? As shown, these adapters will not produce full Illumina library fragments, since they contain only the sequencing primer site, not the P5 and P7 sequences needed for flow cell binding and bridge PCR. Is there a PCR step after ligation to add the additional sequences? That's the usual strategy.

**Samarpana** · 08-07-2016, 02:23 AM

Yes, you guessed correctly. After ligating these adapters, there is a PCR step that adds the P5 and P7 sequences to the DNA fragment, to allow flow cell binding and bridge amplification.

Thanks for guiding me through the demultiplexing procedure. I will try what you have suggested and let you know how it works out.

**thermophile** · 08-08-2016, 12:30 PM

Originally posted by kmcarr View Post

That does not really apply in this case since the index is part of sequencing read 1, not a dedicated read.

Sorry, my bad for not looking at the sequence closely enough.

**Samarpana** · 08-09-2016, 10:23 PM

Hi kmcarr,

I tried demultiplexing the way you suggested. I am now getting 3 files for each sample, as the command didn't work with just I7Y*, Y*.

Code:

/illumina/pipeline/bin/bcl2fastq \
  --runfolder-dir /illumina/runs/DAta/Data_220716_Exome/160723_SN963_0297_BC66FUACXX \
  --output-dir /illumina/runs/DAta/Data_220716_Exome/160723_SN963_0297_BC66FUACXX/HELP_gt \
  --sample-sheet Sample_Sheet_Samarpana_230716_HELP-GT.csv \
  --use-bases-mask I7Y*,Y*,Y*

What should be done with the 8-base read that follows read 1, which is normally used for the index in Illumina paired-end libraries (101,8,101)? Should I discard/mask it or treat it as part of read 1?

I finally got a response from our collaborators, and they mentioned using Picard for demultiplexing. Do you also recommend switching to Picard for this? I have previously used Picard only for SortSam and MarkDuplicates.

**kmcarr** · 08-10-2016, 04:26 AM

Since the run format was paired end with a dedicated index read, you will need to adjust --use-bases-mask accordingly. Assuming the dedicated index read gives you no useful information (your index is part of read 1), it is safe to ignore it. Note that I also recommended skipping the "T" that sits between your index and your actual read. Given the design of your libraries and the run format used by the sequencing center, try this:

Code:

--use-bases-mask "I7NY*,N*,Y*"
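
As a rough illustration of how that mask maps onto the 101,8,101 run mentioned above, the Python sketch below expands each comma-separated part into one letter per cycle. The function `expand_mask` is hypothetical and handles only a simplified subset of the mask syntax (Y, N or I, optionally followed by a count or a trailing `*`). It is not bcl2fastq's own parser.

```python
# Simplified illustration of expanding a bases mask per sequencing cycle.
# `expand_mask` is a hypothetical helper, not part of bcl2fastq.
import re

TOKEN = r"[YNIyni](?:\d+|\*)?"


def expand_mask(mask: str, read_lengths: list[int]) -> list[str]:
    """Expand e.g. "I7NY*,N*,Y*" into one string of Y/N/I letters per read."""
    parts = mask.split(",")
    if len(parts) != len(read_lengths):
        raise ValueError("mask must have one part per read in the run")
    expanded = []
    for part, n_cycles in zip(parts, read_lengths):
        if not re.fullmatch(f"(?:{TOKEN})+", part):
            raise ValueError(f"unsupported mask part: {part!r}")
        cycles = ""
        for letter, count in re.findall(r"([YNIyni])(\d+|\*)?", part):
            if count == "*":
                # '*' fills whatever cycles remain in this read.
                cycles += letter.upper() * (n_cycles - len(cycles))
            else:
                cycles += letter.upper() * int(count or 1)
        if len(cycles) != n_cycles:
            raise ValueError(f"mask part {part!r} does not cover {n_cycles} cycles")
        expanded.append(cycles)
    return expanded


result = expand_mask("I7NY*,N*,Y*", [101, 8, 101])
for read_number, cycles in enumerate(result, start=1):
    print(read_number, cycles)
```

Under these assumptions, read 1 expands to 7 index cycles, 1 skipped cycle and 93 sequence cycles, the 8-cycle index read is skipped entirely, and read 2 is kept in full. With "Y*" in the middle position, as in the earlier command, those 8 cycles would instead be treated as sequence, which would be consistent with getting a third file per sample.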

Index Sequence in adapters for multiplexing
