  • Index Sequence in adapters for multiplexing

    I am trying to replicate a protocol established in the labs of our collaborator. The problem is they have provided me with the adapter sequences for barcoding multiple samples but did not provide any further information. Nor are they responding to emails.

    I have always used Illumina generated indexes (nextera) so can't figure out what are the index sequences in these adapters. Any help will be appreciated.

    The sequences are:
    a1: ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT GTC ATG A*T
    a2: /5Phos/TCA TGA CAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T

    a3: ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT ACA TCT C*T
    a4: /5Phos/GAG ATG TAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T

    * = phosphorothioate bond

    Which are the index sequences to be used for demultiplexing the samples, once the run is over?

  • #2
    The indexes are the 7 bases immediately upstream of the T overhang. The Read 1 sequencing primer anneals upstream of that, so your barcodes will be the first 7 bases of read 1 (followed by a T in every read) instead of coming from the usual dedicated index reads. I have redrawn the primer pairs in the annealed state, with the bottom strand in its proper 3'->5' orientation. The barcode is enclosed in square brackets, and the position of the Read 1 sequencing primer is shown on the last line.

    Code:
    5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT[GTCATGA]*T
    3'-TGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGA CAGTACT -p
    
    5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT[ACATCTC]*T
    3'-TGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGA TGTAGAG -p
    
    5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-->
    Read 1 primer
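
To double-check the pairing described above, here is a minimal Python sketch. It assumes only the four oligo strings posted in this thread; the helper names (`clean`, `reverse_complement`, `inline_barcode`) are made up for illustration. It confirms that each 5'-phosphorylated oligo is the reverse complement of its partner minus the T overhang, and pulls out the 7 bases that sit just before that T.

```python
# Hypothetical helpers; the oligo strings are copied from the first post.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def clean(oligo: str) -> str:
    """Drop spaces, the /5Phos/ modification tag and the '*' bond marker."""
    return oligo.replace("/5Phos/", "").replace("*", "").replace(" ", "")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def inline_barcode(top_oligo: str, barcode_len: int = 7) -> str:
    """Return the bases immediately upstream of the 3' T overhang."""
    seq = clean(top_oligo)
    if not seq.endswith("T"):
        raise ValueError("expected a 3' T overhang")
    return seq[-(barcode_len + 1):-1]

a1 = "ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT GTC ATG A*T"
a2 = "/5Phos/TCA TGA CAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T"
a3 = "ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT ACA TCT C*T"
a4 = "/5Phos/GAG ATG TAG ATC GGA AGA GCG TCG TGT AGG GAA AGA GTG T"

for top, bottom in ((a1, a2), (a3, a4)):
    # The bottom strand pairs with everything except the single T overhang.
    assert reverse_complement(clean(bottom)) == clean(top)[:-1]
    print(inline_barcode(top))
```

The barcode is read straight off the top strand, which is the orientation in which it should appear at the start of read 1.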

    Comment


    • #3
      Thanks so much for explaining. So, when I enter the index in the sample sheet for demultiplexing, should I use the complementary sequence of the 7 bases?



      • #4
        Index 1 (the one on the P7 end) needs to be reverse complemented. Index 2 (the one on the P5 end) does not.
        Microbial ecologist, running a sequencing core. I have lots of strong opinions on how to survey communities, pretty sure some are even correct.



        • #5
          Originally posted by thermophile
          Index 1 (the one on the P7 end) needs to be reverse complemented. Index 2 (the one on the P5 end) does not.
          That does not really apply in this case since the index is part of sequencing read 1, not a dedicated read.

          In this case enter the barcode as it appears (do not reverse complement). That is the orientation it will appear in the sequencing read so that is the orientation which should be in the sample sheet.

          If you are using Illumina's bcl2fastq to demultiplex your data, the correct --use-bases-mask configuration would be "I7NY*" for single-end reads and "I7NY*,Y*" for paired-end reads. This tells bcl2fastq to use the first 7 bases of read 1 as the index, skip over the "T" following the index, and output the remainder of read 1 as the sequence read.
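
As a rough illustration of what that mask does to each read, here is a minimal Python sketch. It is not bcl2fastq's code: it only mimics the "7 index bases, 1 skipped base, rest is sequence" split. The function names, the sample names and the example reads are all hypothetical.

```python
# Barcodes as they appear in read 1 (not reverse complemented).
# The sample names are placeholders, not from the original protocol.
BARCODES = {"GTCATGA": "sample_A", "ACATCTC": "sample_B"}

def split_inline_read(read: str, barcode_len: int = 7, skip: int = 1):
    """Mimic the effect of an I7NY* mask on one read: (index, insert)."""
    if len(read) < barcode_len + skip:
        raise ValueError("read is shorter than the barcode plus skipped bases")
    return read[:barcode_len], read[barcode_len + skip:]

def assign_sample(read: str, barcodes=BARCODES):
    """Look the inline index up exactly as sequenced; no mismatches allowed here."""
    index, insert = split_inline_read(read)
    return barcodes.get(index, "undetermined"), insert

# Made-up read: barcode + T + insert
print(assign_sample("GTCATGATACGTACGTAC"))
```

Real demultiplexers usually tolerate a mismatch in the index; this sketch uses exact matching only to keep the idea visible.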

          I find myself wanting to ask how you are adding the P5 & P7 oligos. As shown, these adapters will not produce full Illumina library fragments, as they only contain the sequencing primer site, not the P5 & P7 sequences necessary for flow cell binding and bridge PCR. Is there a PCR step after ligation to add the additional sequences? That's the usual strategy.



          • #6
            Yes, you have guessed correctly. After ligating these adapters, there is a PCR step that adds the P5 and P7 sequences to the DNA fragment, to allow flow cell binding and bridge amplification.

            Thanks for guiding me through the demultiplexing procedure. I will try what you have suggested and let you know how it works out.



            • #7
              Originally posted by kmcarr
              That does not really apply in this case since the index is part of sequencing read 1, not a dedicated read.
              Sorry, my bad for not looking at the sequence closely enough.


              • #8
                Hi kmcarr,

                I tried demultiplexing the way you suggested. I am now getting 3 files for each sample, as the command didn't work with just I7Y*, Y*.

                Code:
                /illumina/pipeline/bin/bcl2fastq --runfolder-dir /illumina/runs/DAta/Data_220716_Exome/160723_SN963_0297_BC66FUACXX --output-dir /illumina/runs/DAta/Data_220716_Exome/160723_SN963_0297_BC66FUACXX/HELP_gt --sample-sheet Sample_Sheet_Samarpana_230716_HELP-GT.csv --use-bases-mask I7Y*,Y*,Y*
                Just wanted to know what is to be done with the 8 base sequence that lies after read 1, which is generally used for indexes in the Illumina paired end libraries (101,8,101)? Should I discard/mask it or consider it a part of read 1?

                I finally got a response from our collaborators and they mentioned using Picard for demultiplexing. Do you also recommend switching to Picard for this? So far I have only used Picard for SortSam and MarkDuplicates.



                • #9
                  Since the sequencing format was paired end with a dedicated index read, you will need to adjust the --use-bases-mask accordingly. Assuming the dedicated index read is not giving you any useful information (your index is part of read 1), it is safe to ignore it. Note that I also mentioned ignoring the "T" which sits between your index and your actual read. Given the design of your libraries and the run format used by the sequencing center, try this:

                  Code:
                  --use-bases-mask "I7NY*,N*,Y*"
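
To see how that mask lines up with a 101,8,101 run, here is a small Python sketch that expands each comma-separated segment into one letter per cycle. It is a simplified model of the mask syntax (only Y, I and N, each optionally followed by a count or `*`), not bcl2fastq's own parser, and the function names are hypothetical.

```python
import re

# One token = a letter (Y, I or N) optionally followed by a count or '*'.
TOKEN = re.compile(r"([YIN])(\d+|\*)?")

def expand_mask(mask: str, cycles: int) -> str:
    """Expand one mask segment, e.g. 'I7NY*' over 101 cycles."""
    tokens = TOKEN.findall(mask)
    if "".join(kind + n for kind, n in tokens) != mask:
        raise ValueError(f"unrecognised mask segment: {mask!r}")
    fixed = sum(int(n) if n else 1 for _, n in tokens if n != "*")
    stars = sum(1 for _, n in tokens if n == "*")
    if stars > 1 or fixed > cycles or (stars == 0 and fixed != cycles):
        raise ValueError(f"mask {mask!r} does not fit {cycles} cycles")
    # '*' takes whatever cycles the fixed-length tokens leave over.
    return "".join(
        kind * (cycles - fixed if n == "*" else int(n) if n else 1)
        for kind, n in tokens
    )

def expand_run(mask: str, read_cycles: list) -> list:
    """Expand a full comma-separated mask against the cycles of each read."""
    segments = mask.split(",")
    if len(segments) != len(read_cycles):
        raise ValueError("mask needs one segment per read in the run")
    return [expand_mask(s, c) for s, c in zip(segments, read_cycles)]

r1, idx, r2 = expand_run("I7NY*,N*,Y*", [101, 8, 101])
print(r1.count("I"), r1.count("N"), r1.count("Y"))  # cycles of read 1 by role
print(idx)                                          # dedicated index read, all masked
```

In this model, read 1 contributes 7 index cycles, 1 skipped cycle and 93 sequence cycles, the 8-cycle index read is ignored entirely, and read 2 is kept in full.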
