  • index contamination in multiplexed run

    Hi all,

    I have an issue with adapter/index contamination in a multiplexed lane:

    I multiplexed 8 samples in a single HiSeq run. Unfortunately, >90% of my reads are associated with an adapter/index that I did not use. The majority of these reads are from my organism (not a common one), so contamination from another user's sample is unlikely. I have not used this particular index in a year, however, and have sequenced samples since then without problems. I think this means that my library prep reagents are not contaminated. This leaves the adapters themselves, or... something else? Perhaps something about the indexing read?

    Thanks for any help!

  • #2
    Did you do the demultiplexing yourself or was it done using CASAVA? It may simply be a case of an incorrect samplesheet being used (easily fixable by a re-run of CASAVA).

    It is not clear from your post if all 8 samples are affected or just one.


    • #3
      Thanks for the response, GenoMax.

      I did not do the demultiplexing myself. Perhaps I can get the facility to re-run CASAVA. Though if it's an incorrect samplesheet it's still not clear this will fix my problem.

      >90% of the reads are associated with index #1 (an index I did not use). The remaining reads are split between my 8 samples. So in that regard, all 8 of my samples are affected.


      • #4
        So in this case the demultiplexing actually has worked, with a tag that you did not expect to be there.

        Did you make these libraries or did the facility make them? If you made them yourself, then this outcome does not make sense.


        • #5
          I guess the demultiplexing has worked. It's just given me an unexpected result.

          I made the libraries myself. And I agree it does not make sense! I'm struggling to figure out what happened. Like I said in the OP, index #1 is not one I used in this particular library prep (though I have used it in the past, nearly a year ago). I have had several successful Illumina runs in the meantime without this index #1 contamination. Most of the library kit reagents are new and therefore unlikely to be contaminated by index #1. So it kind of leaves me with contaminated adapters (some of which are new) as the likely source of the problem. But the contamination would have to be pretty severe to leave me with >90% reads associated with index#1. Perhaps several of the adapters I used contained index#1 by mistake? I quantified these samples by qPCR, so if they were loaded evenly by the facility then it's all the more confusing.


          • #6
            So was index 1 determined to be present based on the reads that ended up in the "undetermined" bin? It should not have been included in the normal samplesheet, since you had not used it. What is stranger is that, if someone else's sample contaminated your pool, those reads are aligning to your genome (unless the contaminating sample also happens to be from the same species).


            • #7
              You're correct. All of these index#1 reads are in the Undetermined bin.

              I'd be surprised if someone else was using this genome for NGS in our campus facility (I would think our lab would know as they would have likely acquired the organism from us). Contamination from someone else's sample would certainly explain my results but I'm thinking the real answer is elsewhere. I guess I could do some QC on the adapters I used. Though the company has agreed to send new adapters.


              • #8
                Check if your expected indexes are present in the fastq for index 1. I would guess someone made an error in the samplesheet, e.g. by using the wrong base 7 in the index reads.
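
One way to run this check, as a minimal sketch rather than anything CASAVA does internally: compare each index sequence seen in the Undetermined reads against the indexes you expected, and report the nearest one along with the number of mismatches. The sample names and index sequences below are hypothetical placeholders, not values from this run.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("index sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_expected(observed, expected):
    """Return (sample_name, mismatches) for the nearest expected index."""
    return min(
        ((name, hamming(observed, seq)) for name, seq in expected.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical samplesheet entries (placeholders for illustration only)
expected = {"sample_A": "CGATGT", "sample_B": "TTAGGC"}

# An observed tag that differs from sample_A's index at the last base
hit = closest_expected("CGATGA", expected)
print(hit)
```

If most of the unexpected tags sit one base away from an expected index, a samplesheet typo or a base-calling problem in the index read is a more likely explanation than a contaminating adapter.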


                • #9
                  I'm assuming that you provided a pool of your 8 libraries to the facility to sequence. It sounds as though your remaining 10% of total reads were evenly split between the expected 8 indexes. If you had a contamination issue with the index 1 adapter at the library stage, it would be highly unlikely you would be seeing >90% of the reads for this one index. I would more likely suspect a mix-up at the dilution step immediately prior to clustering. You can test this by remaking the pool and submitting again. The facility may be willing to test the new pool as a spike-in to the control lane. If the new pool shows no sign of index 1 and only your expected 8 indexes, you have your answer.
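
The arithmetic behind this can be sketched quickly. It assumes the 8 libraries were meant to be pooled evenly and treats the ">90%" figure from the thread as exactly 90%, so the per-sample share below is an upper bound.

```python
n_samples = 8
unexpected_fraction = 0.90  # ">90%" in the thread, taken here as exactly 90%

# Share of the lane each library should get in an even pool
expected_share = 1 / n_samples

# Share each library actually gets if the remainder is split evenly
observed_share = (1 - unexpected_fraction) / n_samples

fold_drop = expected_share / observed_share

print(f"expected per sample: {expected_share:.2%}")
print(f"observed per sample: {observed_share:.2%}")
print(f"fold reduction: {fold_drop:.1f}x")
```

Each expected library ends up with roughly a tenth of the reads it should have received, which is hard to explain by trace carry-over of one adapter.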


                  • #10
                    Hi Chipper. The demultiplexing should have pulled all reads with my expected indices so I don't think they'll be in the Undetermined bin (the fastq that contains, among other things, reads with index#1).

                    Hi MU Core. I provided individual samples. The facility runs Bioanalyzer as QC, then dilutes and pools. They have the individual libraries so re-pooling and sequencing is a possibility, as is sequencing the individual libraries singly. I'm leaning towards prepping new libraries using new barcoded adapters and seeing if I can spike a pooled sample in a control lane.


                    • #11
                      I wanted to follow up with some additional information. The data were re-run through CASAVA and the results were no different. The run statistics are attached as a PDF. Does anything here stand out as a potential cause? The '% Perfect Index Reads' values seem off, and the '% of raw clusters per lane' values seem low as well, though this may just reflect the low read numbers for most samples.

                      Would appreciate any thoughts. Still trying to make sense of all this.
                      Attached Files


                      • #12
                        The high percentage of one-mismatch index reads suggests a run or chemistry issue. You definitely need to sit down with the folks generating the sequence and review the run metrics. There may have been issues with focus, etc., that could explain what appears to be poor base calling for at least the index read.


                        • #13
                          Originally posted by btb:
                          "Would appreciate any thoughts. Still trying to make sense of all this."

                          This is where you look through the "undetermined" pile to see what tags are represented there. If there are N's then not much you can do but there could be tags that you just did not expect (but are there anyway).
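
A minimal sketch of that tally, assuming CASAVA 1.8-style FASTQ headers in which the index sequence is the last colon-separated field (for example `1:N:0:ATCACG`). Older header formats would need different parsing. The function name and the example reads are made up for illustration.

```python
from collections import Counter


def tally_index_tags(fastq_lines):
    """Count index sequences found in FASTQ header lines.

    Assumes CASAVA 1.8-style headers, where the part after the space
    ends with ':<index sequence>'.
    """
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 != 0:  # in a 4-line FASTQ record, only the first line is a header
            continue
        header = line.strip()
        if not header.startswith("@") or " " not in header:
            continue  # skip anything that does not look like the assumed format
        tag = header.rsplit(":", 1)[-1]
        counts[tag] += 1
    return counts


# Made-up records standing in for an Undetermined FASTQ file
example = [
    "@M1:1:FC:1:1:100:200 1:N:0:ATCACG", "ACGT", "+", "IIII",
    "@M1:1:FC:1:1:101:201 1:N:0:ATCACG", "ACGT", "+", "IIII",
    "@M1:1:FC:1:1:102:202 1:N:0:NNNNNN", "ACGT", "+", "IIII",
]
top_tags = tally_index_tags(example).most_common()
print(top_tags)
```

For a real file, the same function can be given an open file handle, for instance one from `gzip.open(path, "rt")` for a compressed FASTQ. Tags made up mostly of N's point to a failed index read, while a clean but unexpected tag points to an adapter that really was in the pool.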


                          • #14
                            Hi btb,
                            Did you figure out the source of the unused index contamination in your libraries? I ran into exactly the same issue and would really appreciate your input.
                            Last edited by parulagwl; 12-03-2014, 02:37 PM.


                            • #15
                              Can you confirm if you have exactly the same situation btb had (i.e. majority of the reads have unexpected barcodes and thus end up in the "undetermined" bin)?
