
  • New NextSeq: First run. A few issues.

    Hi All,
    We're running standalone with our new NextSeq 500, and the very first user came in with 30 multiplexed NEBNext Ultra II libraries, built with BIOO single-index adapters (12 bp indices), for us to run as genomic DNA (2x150) on a Mid Output flow cell using an Illumina kit.
    We spent some time checking with NEB, then BIOO, and then Illumina before proceeding. We set up a plate with the 12 bp indices on the I7 sequence primer. In short, although slightly overclustered (233 K/mm2), the run looked okay (89% >Q30), and all samples demultiplexed. A few things not specifically discussed on these forums have shown up in FastQC and the flowcell summary.

    Newbie question: Roughly 10% of my reads went to "Undetermined". This is not far off from our 10% PhiX spike-in. Of those that were PhiX, at least 90% had "GGGGGGGGGGGG" as the barcode. Is it normal for bcl2fastq to keep these dark indices? It also kept over a million reads with a barcode of "NNNNNNNNNNNN". Is that normal?
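
A minimal sketch of how the barcodes in the "Undetermined" file could be tallied, assuming the usual Illumina FASTQ header layout in which the index is the last colon-separated field. The function names and the example records are made up for illustration; they are not part of bcl2fastq.

```python
import io
from collections import Counter

def index_from_header(header):
    """Return the index sequence from an Illumina-style FASTQ header.

    Assumes the header ends in '<read>:<filtered>:<control>:<index>'.
    """
    return header.rstrip().rsplit(":", 1)[-1]

def tally_indices(fastq_lines):
    """Count index sequences over FASTQ lines (4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # header line of each record
            counts[index_from_header(line)] += 1
    return counts

def summarize(counts):
    """Split the tally into dark (all G), unread (all N), and other indices."""
    summary = {"all_G": 0, "all_N": 0, "other": 0}
    for index, n in counts.items():
        if set(index) == {"G"}:
            summary["all_G"] += n
        elif set(index) == {"N"}:
            summary["all_N"] += n
        else:
            summary["other"] += n
    return summary

# Made-up records standing in for an Undetermined R1 FASTQ file.
example = io.StringIO(
    "@NS500:1:FC:1:11101:1000:1000 1:N:0:GGGGGGGGGGGG\nACGT\n+\nIIII\n"
    "@NS500:1:FC:1:11101:1001:1001 1:N:0:NNNNNNNNNNNN\nACGT\n+\nIIII\n"
    "@NS500:1:FC:1:11101:1002:1002 1:N:0:ACGTACGTACGT\nACGT\n+\nIIII\n"
)
print(summarize(tally_indices(example)))
```

For a real run, the `io.StringIO` object would be replaced by an open (possibly gzip-wrapped) file handle.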

    PROBLEMS:
    For the "good" demultiplexed sequences of ONE sample (~4.7M reads):
    1. In the read 1 file, ~42K reads consisted of a stretch of Ns exactly 35 bp long. Another >37K were PhiX reads ending in a long run of Gs. They showed the correct index.
    2. For the same sample, in the read 2 file, about 243 reads were mostly Ns, some of them a string of exactly 35 Ns. There were also about 1500 PhiX reads ending in long runs of Gs. They also showed the correct index.
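
Counts like these could be produced with a small classifier along the following lines. This is only a sketch: the 20-base poly-G threshold is an arbitrary choice for illustration, and the function names are hypothetical.

```python
import re
from collections import Counter

# A run of 20 or more Gs at the 3' end; the cutoff of 20 is an assumption.
POLY_G_TAIL = re.compile(r"G{20,}$")

def classify_read(seq):
    """Label a read as fully N-masked, poly-G tailed, or other."""
    if seq and set(seq) == {"N"}:
        return "all_N"
    if POLY_G_TAIL.search(seq):
        return "polyG_tail"
    return "other"

def classify_fastq(fastq_lines):
    """Count read classes over FASTQ lines (sequence is line 2 of each record)."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            counts[classify_read(line.strip())] += 1
    return counts
```

Running `classify_fastq` separately on the read 1 and read 2 files of one sample gives the two sets of numbers to compare.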

    How would this happen? It doesn't seem like a tile edge effect. I know that PhiX sometimes makes it into the "good" reads, and that Gs can be heavily overrepresented at the ends of reads, but why so many, and why so different between read 1 and read 2?

    Thanks.
    -p

  • #2
    Answering my own question: The long runs of Gs are likely adapter/primer dimers where the read 1 primer bound but the read 2 primer did not. Not yet sure if this is due to the combination of NEB + BIOO + Illumina kits. The strings of 35 Ns are adapter/primer dimers that were N-masked.
    -p


    • #3
      Illumina's PhiX is not indexed (if I recall correctly). The GGGG calls are basically the NextSeq saying "I see no signal", which it reports as G basecalls (2-color chemistry).
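
To make the "no signal = G" point concrete, here is a toy model of 2-color base calling, following Illumina's general description of the chemistry (both channels = A, red only = C, green only = T, neither = G). The function is purely illustrative and is not how the instrument software is implemented.

```python
def call_base(red, green):
    """Toy 2-color base call from the presence of red/green signal."""
    if red and green:
        return "A"
    if red:
        return "C"
    if green:
        return "T"
    return "G"  # no signal in either channel

# A 12-cycle index read on a cluster with no index (e.g. PhiX) is dark
# in every cycle, so every cycle is called as G.
dark_index = "".join(call_base(False, False) for _ in range(12))
print(dark_index)
```

This is why an unindexed PhiX cluster shows up with an all-G barcode on a 2-color instrument.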


      • #4
        I'm sure you are correct, thanks. The weirdness (among other things) was the numerical disparity between read 1 and read 2. Maybe because the read2 index in the plate file wasn't entered (which causes the read2 to be automatically 'masked'), and also the libraries had very low complexity (e.g., long runs of Gs). In the end, the run worked fine but may need some cleanup.

        After a couple of days with Illumina, they said:
        "I pulled the 47,000 sequences that gave Ns in R1 from R2 and it looks like pretty much all of them have 100% G calls. A G call in NextSeq is usually associated with dark cycles (or no image). This tells me that the R2 primer may not have bound correctly to these obviously adapter dimer sequences (since R1 is 35bp of Ns) and that is why we don't see the adapters in R2 region and hence no subsequent conversion to Ns."
        This is still a little murky to me, so I plan to draw out the whole NEB+12bp-BIOO+Illumina indexing on paper.
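
The check Illumina describes (take the reads that are all Ns in R1, pull their mates from R2, and look at the G content) could be sketched as follows. It assumes mates share the same read ID, taken here as the header text up to the first space; the function names are hypothetical.

```python
def read_fastq(fastq_lines):
    """Yield (read_id, sequence) pairs from FASTQ lines, 4 lines per record."""
    record = []
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # The read ID is the header up to the first space.
            yield record[0].split(" ", 1)[0], record[1]
            record = []

def g_fraction_of_mates(r1_lines, r2_lines):
    """For R1 reads that are entirely N, return the fraction of Gs in the R2 mate."""
    masked_ids = {
        rid for rid, seq in read_fastq(r1_lines) if seq and set(seq) == {"N"}
    }
    result = {}
    for rid, seq in read_fastq(r2_lines):
        if rid in masked_ids and seq:
            result[rid] = seq.count("G") / len(seq)
    return result
```

A result dominated by values at or near 1.0 would match what Illumina reported for this run.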


        • #5
          Originally posted by hoytpr
          I'm sure you are correct, thanks. The weirdness (among other things) was the numerical disparity between read 1 and read 2.
          I don't have any hands-on experience with the NextSeq, but there seems to be a tendency for NextSeq data to have more polyG sequences in R2 than in R1, possibly due to some inefficiency in the Read 2 resynthesis. Maybe this is worse for adapter dimers?

          Bridged amplification & clustering followed by sequencing by synthesis. (Genome Analyzer / HiSeq / MiSeq)


          By default, bcl2fastq will also mask with Ns any read that is shorter than 22 bases after adapter trimming, which includes adapter dimers.
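
A simplified model of that masking may help explain the strings of exactly 35 Ns. It assumes the 22-base masking threshold mentioned above, plus a minimum trimmed read length of 35 to which short reads are padded with Ns; as far as I recall these correspond to the bcl2fastq2 defaults for `--mask-short-adapter-reads` and `--minimum-trimmed-read-length`, but check the bcl2fastq documentation for your version. The function is hypothetical and uses exact adapter matching only, which is cruder than what bcl2fastq does.

```python
def mask_after_trimming(read, adapter, mask_below=22, min_length=35):
    """Toy model of adapter trimming followed by N-masking and N-padding."""
    cut = read.find(adapter)
    if cut == -1:
        return read  # no adapter found, read left untouched
    kept = read[:cut]  # bases before the adapter
    if len(kept) < mask_below:
        kept = "N" * len(kept)  # too short after trimming: mask everything
    # Pad with Ns up to the minimum trimmed read length.
    return kept + "N" * max(0, min_length - len(kept))

adapter = "AGATCGGAAGAGC"  # common start of Illumina TruSeq-style adapters
dimer_read = adapter + "A" * 137  # adapter dimer: adapter starts at base 1
print(mask_after_trimming(dimer_read, adapter))
```

Under this model an adapter dimer has no insert left after trimming, so it comes out as 35 Ns, which matches the reads seen in R1.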


          Maybe because the read2 index in the plate file wasn't entered (which causes the read2 to be automatically 'masked')
          If you don't enter an I5 index, then the sequencer will just read I7 and not I5. So for a paired-end, single-index run there will be reads for R1, I7, and R2. The I5 read will be skipped and the I7 read will be used to demux.
          Josh Kinman


          • #6
            Thanks very much for the links and clarification.

            -p
