  • RNA-seq of long RNAs (paired-end, stranded) -- fragmentation time

    I'm using the TruSeq Stranded Total RNA library preparation kit to generate libraries for 2x150 bp paired-end sequencing. Illumina's fragmentation protocol recommends 94°C for 8 minutes for intact, high quality RNA to generate insert lengths of 120-210 bp. If I am going to run a 2x150 bp sequencing run, I am concerned that 1) smaller fragments (< 150 bp) will preferentially bind and sequence, and 2) the read length may be overkill for this fragment population size and therefore be a waste of data. Illumina provides alternate fragmentation times for intact RNA, but I am unsure as to which is most efficient for a 2x150 bp (long RNA-seq) run. Can anyone provide feedback and/or experience with long RNA-seq? Thank you!!!

  • #2
    Sequencing a TruSeq RNA library at 2x150 bp will read into the adapters. To maximise useful sequencing output, you should use another kit such as NEBNext Ultra II RNAseq, which allows preparing libraries with inserts of up to 500 bp.



    • #3
      Thanks, nucacidhunter, I'm also concerned with reading into the adapter or through the adapter into the flow cell and crashing the run; however, we have generated libraries using the "0 minute" fragmentation time (instead of a 94°C incubation, incubate at 65°C for 5 minutes, followed by a 4°C hold) with library traces ranging from 300-800 bp (~450 bp inserts) on a TapeStation -- no size selection was performed. My concern is accurate qPCR quantification and downstream cluster generation.

      At this point, we will be using the TruSeq Stranded Total RNA kit, but I'm wondering if anyone has experience with altering fragmentation times using this kit to accommodate a 2x150 bp run on a NextSeq 500. Thanks.



      • #4
        If you can generate libraries of the desired size with TruSeq (anecdotally, some people have not managed to get above 250 bp), then qPCR should not be an issue. You just need to take the average library size in the 100-950 bp range (the upper limit depends on your qPCR extension time setting) from the Bioanalyzer or TapeStation trace to calculate the molar concentration. This is common practice with the TruSeq Nano DNA and Nextera kits, which produce libraries with fragment sizes extending above 1 kb.
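As a rough sketch of that size-based conversion: the snippet below assumes a mass concentration in ng/µL (for example from a fluorometric reading) and the usual approximation of 660 g/mol per base pair of double-stranded DNA. The function name and the example values are illustrative only, not taken from any kit protocol.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def library_molarity_nm(conc_ng_per_ul: float, avg_size_bp: float) -> float:
    """Convert a mass concentration (ng/uL) to nM using the average library size."""
    if avg_size_bp <= 0:
        raise ValueError("average library size must be positive")
    # ng/uL -> g/L is a factor of 1e-3; mol/L -> nmol/L is a factor of 1e9,
    # which leaves an overall factor of 1e6.
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * avg_size_bp)


# Illustrative values: 2 ng/uL library with a 450 bp average size from the trace
print(round(library_molarity_nm(2.0, 450), 2))  # 6.73
```

The same mass gives a lower molarity as the average size goes up, which is why the size range chosen on the trace matters for long-insert libraries.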

        Edit: Reading into adapters will not crash the run, but it will lower overall Q scores; the data should be fine after adapter trimming.
        Last edited by nucacidhunter; 09-19-2017, 05:59 PM.



        • #5
          Thanks for the input. I'm using the KAPA library quantification kit for qPCR.

          I will try altering the fragmentation time to 2 min at 94°C and doing a double-sided SPRI size selection for 350-500 bp fragments. I am trying to decide on an appropriate AMPure XP bead ratio; would 0.55X/0.15X be sufficient? My thinking is to target insert sizes of no less than 300 bp for a 2x150 bp run (Illumina recommends up to 550 bp inserts on the NextSeq).



          • #6
            If you have annealing and extension set to 45 s, then 950 bp will be the upper limit of amplification.

            0.55x/0.15x should select your target range, but the final library insert size will depend on the point at which size selection is done (probably best after second strand synthesis). Keep in mind that the size of the fragmented RNA is not the only factor affecting insert size, and the library insert will be shorter than the fragmented RNA.
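For the bench arithmetic behind a 0.55x/0.15x double-sided selection, here is a minimal sketch. It assumes the common convention that both ratios are relative to the original sample volume, so the second addition brings the cumulative ratio to 0.70x; some protocols quote the second number as the cumulative ratio instead, so check the convention before pipetting. The function name and the 50 µL volume are hypothetical.

```python
def double_sided_spri_volumes(sample_ul, first_ratio=0.55, added_ratio=0.15):
    """Bead volumes for a two-step (double-sided) SPRI size selection.

    Assumption: both ratios refer to the ORIGINAL sample volume.
    Step 1: add beads, keep the supernatant (largest fragments stay on the beads).
    Step 2: add more beads to the supernatant, keep the beads
            (smallest fragments stay in the supernatant).
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return {
        "first_bead_addition_ul": round(sample_ul * first_ratio, 2),
        "second_bead_addition_ul": round(sample_ul * added_ratio, 2),
        "cumulative_ratio": round(first_ratio + added_ratio, 2),
    }


# Hypothetical 50 uL reaction
print(double_sided_spri_volumes(50))
```

The actual size cut-offs at a given ratio depend on the bead lot and buffer, so a test selection checked on a TapeStation trace is still worthwhile.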



            • #7
              Yes, annealing/extension is set for 45 sec.

              I was under the assumption that the final library size to be sequenced was dependent on fragmentation and post-ligation size selection. Perhaps that is my misunderstanding, and I have never performed size selection following second strand synthesis.

              Ultimately, my overall goal is to avoid wasting valuable sequencing by reducing the overlap between read 1 and read 2 for a 2 x 150 bp run. Based on Table 20, Library Insert Fragmentation Time (page 117), of the TruSeq Stranded Total RNA protocol, a 2 minute fragmentation time should generate an average library size of 410 bp, i.e. a 290 bp insert (410 bp – 120 bp of adapters = 290 bp insert size). Sequencing 150 bp in both directions would then leave only a 10 bp overlap, increasing the amount of template actually sequenced and the sensitivity to detect positional information for splice variant analysis or novel isoform discovery. Am I thinking about this correctly? What would you suggest I do?

              (Of course, the expected library size ranges are only theoretical and are highly dependent on sample quality and accurate size selection.)

              TruSeq Stranded Total RNA
              Sample Preparation Guide



              • #8
                Yes, annealing/extension is set to 45 sec.

                I was under the assumption that final library insert size was dependent on fragmentation time and post-ligation size selection. Perhaps that is my misunderstanding. I have never performed size selection after second strand synthesis.

                Ultimately, my overall goal is to avoid wasting valuable sequencing by reducing the overlap between read 1 and read 2 for a 2 x 150 bp run. Based on Table 20, Library Insert Fragmentation Time, on page 117 of the TruSeq Stranded Total RNA protocol, a 2 minute fragmentation time should generate an average library size of 410 bp, i.e. a 290 bp insert (410 bp – 120 bp of adapters = 290 bp insert size). Sequencing 150 bp in both directions would then leave only a 10 bp overlap, increasing the amount of template actually sequenced and the sensitivity to detect positional information for splice variant analysis or novel isoform discovery. Am I thinking about this correctly? What would you suggest?

                (Of course, the expected library size ranges are highly dependent on sample quality and accurate size selection, among other things.)
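The arithmetic in this post can be written out as a small sketch. The 120 bp adapter length and the 410 bp library size are the figures quoted above; the function names are made up for illustration, and real libraries are a distribution of sizes rather than a single value.

```python
ADAPTER_BP = 120  # combined adapter length used in the post above


def insert_size(library_bp, adapter_bp=ADAPTER_BP):
    """Insert length once the adapters are subtracted from the library size."""
    return library_bp - adapter_bp


def read_overlap(insert_bp, read_len=150):
    """Bases covered by both reads; a negative value is an unsequenced gap."""
    return 2 * read_len - insert_bp


def adapter_readthrough(insert_bp, read_len=150):
    """Bases of each read that run past the insert into the adapter."""
    return max(0, read_len - insert_bp)


ins = insert_size(410)            # 2 min fragmentation, per the quoted table
print(ins)                        # 290
print(read_overlap(ins))          # 10
print(adapter_readthrough(ins))   # 0
print(read_overlap(200))          # 100 (a 200 bp insert read at 2 x 150)
print(adapter_readthrough(120))   # 30  (a 120 bp insert reads into the adapter)
```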



                • #9
                  According to the Illumina table, the largest average insert, even without fragmentation, is 200 bp (130-350 bp range). To enrich for larger fragments, you can clean up with a low bead ratio (not double-sided size selection) to remove smaller fragments (left-side size selection). This can be done after ligation or after PCR (the standard clean-up uses 0.9-1x beads). To cut off at a 300 bp insert, ligated fragments shorter than 420 bp should be removed. This will lower the yield; that is OK if the aim is transcript assembly or isoform detection, but it would affect differential expression analysis.
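To get a feel for the yield cost of such a left-side cut, here is a minimal sketch. It assumes 120 bp of adapters (which is how 300 bp of insert becomes a 420 bp ligated fragment) and treats the cut as a sharp threshold, which bead clean-ups are not. The trace values are invented purely for illustration.

```python
def ligated_cutoff(min_insert_bp, adapter_bp=120):
    """Smallest ligated fragment to keep for a given minimum insert size."""
    return min_insert_bp + adapter_bp


def fraction_retained(trace, cutoff_bp):
    """Share of total signal at or above the cutoff.

    trace: list of (fragment_size_bp, signal) pairs, e.g. binned from an
    electropherogram. Assumes an idealised, perfectly sharp size cut.
    """
    total = sum(signal for _, signal in trace)
    if total <= 0:
        raise ValueError("trace has no signal")
    kept = sum(signal for size, signal in trace if size >= cutoff_bp)
    return kept / total


cutoff = ligated_cutoff(300)
print(cutoff)  # 420

# Made-up binned trace: (size in bp, relative signal)
example_trace = [(250, 10), (320, 30), (420, 40), (550, 20)]
print(fraction_retained(example_trace, cutoff))  # 0.6
```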

                  For larger-insert libraries, a kit that can generate larger cDNA fragments should be used.

                  Edit: You can also do gel based size selection.
                  Last edited by nucacidhunter; 09-20-2017, 07:43 PM.



                  • #10
                    Given that the TruSeq kit is only capable of producing up to 200 bp inserts, and I intend to do a 2 x 150 bp run, overlap is obviously unavoidable. I do not want to introduce bias in the data analysis by removing smaller fragments OR by not fragmenting and possibly missing intact, mature mRNAs (>1 kb). Unfortunately, I don't have the option of switching to a different kit to generate larger fragments. The PI's goal is to "get as much data as possible from a 2 x 150 bp run", and I'm trying to figure out the best protocol conditions for the TruSeq kit to optimize data output.

                    The 0 minute fragmentation (see image) seems to introduce some fragmentation, but, again, I'm concerned about size selection and sequencing a good representation of the biology. Perhaps I should just stick with some form of fragmentation to ensure coverage of the entire population in library conversion and perform a right size selection to remove any larger fragments (> 670 bp ligated fragments) that would not cluster efficiently on the NextSeq. What do you think?

                    Thanks very much for your insight!
                    Attached Files



                    • #11
                      Library insert size is affected by RNA fragment size (user controllable) and by random priming: random primers can anneal anywhere along an RNA fragment, so each fragment is primed from multiple locations. That is why inserts are relatively small even without fragmentation. Some kits use less random primer, so library insert size can be controlled by fragmentation time.

                      You do not need to remove larger fragments: they cluster less efficiently and do not affect sequencing results.



                      • #12
                        Should some form of fragmentation be performed then? Does no fragmentation introduce bias by not capturing the larger mRNAs in the sample?



                        • #13
                          Take a look at Kapa's RNA-Seq documentation (https://www.kapabiosystems.com/produ...mrna-seq-kits/). They have very nice alternate fragmentation conditions worked out. They use the same chemistry as Illumina to fragment the RNA, so no worries there. I'd consider fragmenting at 85°C rather than 94°C. I also agree with nucacidhunter that a lower bead cleanup ratio (0.7X-0.8X) post-PCR will give you what you're looking for.



                          • #14
                            Thanks, jteeee2. I will definitely try this method with a lower SPRI clean-up ratio, as suggested by nucacidhunter. KAPA has always been a reliable NGS resource.



                            • #15
                              Originally posted by emily202:
                              Should some form of fragmentation be performed then? Does no fragmentation introduce bias by not capturing the larger mRNAs in the sample?

                              Larger mRNAs will be included in the final library, but perhaps with lower representation, because cDNA synthesis is primed by random primers. If you are comparing libraries prepared with the same method, bias is not an issue; all RNA-Seq library prep methods have their own bias.

                              To be sure, you can prepare two libraries from the same input RNA and compare the gene ranks; if the correlation is high, you can process the rest of your samples.
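One way to put a number on that rank comparison is a Spearman correlation of per-gene counts between the two libraries. The sketch below is plain Python with average ranks for ties; the count vectors are hypothetical, and in practice the same calculation would be run over all expressed genes (a statistics package would do the same job).

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least 2 genes")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical counts for the same five genes in two test libraries
lib_a = [120, 15, 300, 0, 48]
lib_b = [100, 20, 280, 2, 60]
print(spearman(lib_a, lib_b))  # 1.0 (identical gene ranking)
```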

                              If you are going to compare large-insert libraries with standard ones, the possible variation caused by insert size needs to be accounted for, and I would suggest consulting your bioinformatician about your experimental design.
