  • #16
    Originally posted by seb567
    Greetings,

    It is very likely that one or more errors occurred during library preparation.

    The demultiplexer that ships with CASAVA 1.8.2 is very stringent, precisely to avoid contamination like that you described above.

    CASAVA 1.8.2 allows 0 mismatches in each index by default. This can be changed to 1 -- 1 is the maximum number of mismatches in any index for bar-coded data.


    At our institution, we developed our own demultiplexer, called FastDemultiplexer, that allows more mismatches, which in turn increases yields and decreases the number of unclassified clusters.

    I hope you sort out this data confusion, although the information you provided indicates erroneous library preparation.


    Sébastien Boisvert
    ^^
    Sébastien,

    Technically the CASAVA pipeline will allow more mismatches (just pass a higher number with the --mismatches parameter), but it will fail if there is a collision between barcodes caused by allowing multiple mismatches per index. For example:

    TruSeq Index #18 == GTCCGC
    TruSeq Index #19 == GTGAAA

    Now suppose you encounter an index read which is GTCCAA. How do you resolve the conflict between this being #18 with GC->AA at the end and #19 with GA->CC in the middle? Both candidates are exactly two mismatches away. Unless you very carefully choose the mixture of barcodes, you are likely to encounter these types of collisions when allowing more than one error per index read.
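
The collision above can be checked with a short script. This is only a sketch: the function names (`hamming`, `candidates`, `max_safe_mismatches`) are hypothetical and it is not how CASAVA implements demultiplexing. It relies on the standard rule that a barcode set with minimum pairwise Hamming distance d can be assigned unambiguously with at most (d - 1) // 2 mismatches.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def candidates(index_read, barcodes, max_mismatches):
    """Names of every barcode within max_mismatches of the index read."""
    return [
        name
        for name, seq in barcodes.items()
        if hamming(index_read, seq) <= max_mismatches
    ]


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes.values(), 2))


def max_safe_mismatches(barcodes):
    """Largest mismatch allowance that can never match two barcodes at once."""
    return (min_pairwise_distance(barcodes) - 1) // 2


# The two indexes quoted in the post above.
barcodes = {"TruSeq18": "GTCCGC", "TruSeq19": "GTGAAA"}

print(min_pairwise_distance(barcodes))      # distance between #18 and #19
print(max_safe_mismatches(barcodes))        # safe mismatch allowance
print(candidates("GTCCAA", barcodes, 1))    # no match with 1 mismatch
print(candidates("GTCCAA", barcodes, 2))    # both match with 2 mismatches
```

Running `max_safe_mismatches` on the full set of indexes in a lane before demultiplexing shows whether a higher mismatch setting is safe for that particular mixture.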


    • #17
      Could you please expand on what you mean by "Indexes with specific DNA composition patterns"? We have been tearing our hair out recently because suddenly the quality of our index reads is horrible, leading to massive loss of sequence data because we can't determine the index sequence. This is happening on both our HiSeq2k and GAIIx. We have considered, and tentatively ruled out, cluster density and degree of barcode diversity as the source of the problem. Any findings you could share would be greatly appreciated.
      Hi kmcarr,
      What we noticed during some of our multiplex-runs (single-end) was that the sequencing of the actual read was great... high quality scores, several million reads, etc. The indexes on the other hand were saturated with N's. Some indexes had the occasional N but as you know, CASAVA has a flag to handle such situations.
      Nonetheless, we ran CASAVA demultiplexing on this dataset. The resultant CASAVA build had very few reads in it, simply because the index reads could not be matched to their barcodes (too many Ns). What we saw is that the samples that failed had indexes with a high T and/or G content. Whether this was index-sequencing error or whether these bases played a role in the failed samples is tough to say.
      Have you looked at the thumbnails and found any anomalies?
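
For anyone wanting to quantify the two observations above (Ns in the index reads, T/G-rich indexes), a rough sketch follows. The helper names and the example reads are made up for illustration; only the two TruSeq sequences come from this thread. It assumes the index reads have already been extracted as plain strings of equal length.

```python
def n_fraction(index_reads):
    """Fraction of index reads that contain at least one N."""
    reads = list(index_reads)
    if not reads:
        return 0.0
    return sum("N" in read.upper() for read in reads) / len(reads)


def n_rate_by_position(index_reads):
    """Per-cycle fraction of N calls across equal-length index reads."""
    reads = [read.upper() for read in index_reads]
    if not reads:
        return []
    length = len(reads[0])
    return [
        sum(read[i] == "N" for read in reads) / len(reads)
        for i in range(length)
    ]


def tg_content(barcode):
    """Fraction of bases in a barcode that are T or G."""
    barcode = barcode.upper()
    return sum(base in "TG" for base in barcode) / len(barcode)


# Hypothetical index reads, for illustration only.
example_reads = ["GTCCGC", "NNNNNN", "GTNAAA", "GTGAAA"]

print(n_fraction(example_reads))          # share of reads with any N
print(n_rate_by_position(example_reads))  # which cycles lose the most calls
print(tg_content("GTCCGC"))               # T/G share of TruSeq Index #18
```

Comparing `tg_content` of each expected barcode against the number of reads recovered for that sample would show whether the T/G pattern holds across runs.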
