  • inexplicable rRNA content

    Hi guys,

    long time reader, first time poster here. So here's my problem:

    Did two RNAseq runs (NextSeq500, RNA Access Illumina prep, 20 samples per run).

    The first run had 20 samples of high-quality RNA (column-based Qiagen extraction) and showed 2-10% rRNA per sample (mapped with BWA against a fasta from https://www.ncbi.nlm.nih.gov/nuccore with search term txid9606[Organism:exp], then used samtools flagstat to count mapped reads).
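For anyone repeating this measurement, here is a minimal sketch of turning `samtools flagstat` text output into an rRNA percentage, assuming the reads were aligned to an rRNA-only reference as described above. The function name and the example counts are made up for illustration. Note that flagstat counts alignment records, so secondary and supplementary alignments from BWA are included in both numbers; newer samtools releases also print "primary" and "primary mapped" lines, which may be the better basis if they are available.

```python
import re

def rrna_percent_from_flagstat(flagstat_text):
    """Return the percent of QC-passed records that mapped, from flagstat text."""
    total = mapped = None
    for line in flagstat_text.splitlines():
        # flagstat lines look like: "<passed> + <failed> <label ...>"
        m = re.match(r"(\d+) \+ (\d+) (.*)", line.strip())
        if not m:
            continue
        passed, label = int(m.group(1)), m.group(3)
        if label.startswith("in total"):
            total = passed
        elif label.startswith("mapped ("):
            mapped = passed
    if not total or mapped is None:
        raise ValueError("could not find usable 'in total' and 'mapped' lines")
    return 100.0 * mapped / total

# Hypothetical flagstat output for one sample (counts invented).
EXAMPLE = """\
1000000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
62000 + 0 mapped (6.20% : N/A)
"""

print(f"{rrna_percent_from_flagstat(EXAMPLE):.2f}% rRNA")
```

Running the same function over every sample of both runs gives per-sample values that can be compared directly.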

    The second run contained 10 samples of high-quality RNA (same extraction as the first run) and 10 RNA samples from FFPE material, extracted with bead-based Promega technology. rRNA content ranged from 4-38% and was not associated with RNA extraction method or quality (low and high rRNA contents occurred in both types of samples; in total 14/20 samples had >10% rRNA).

    Does anyone have an explanation as to why my second run contains much more rRNA?

    Cheers
    mvheetve

  • #2
    Hi There-

    You did not mention the RNA-seq library prep method you used, nor whether you used any rRNA depletion method.

    What methods related to above were used?

    Olaf

    • #3
      RE: prep and depletion

      Hi Olaf,

      thanks for showing an interest. I did specify the RNA lib prep: they used Illumina RNA Access (https://support.illumina.com/content...15049525-b.pdf). So basically an RNA capture prep against coding regions.

      Apparently no rRNA depletion was done. The supervisor of the project deemed it unnecessary for a capture prep.

      Regards
      M

      • #4
        Hi-

        OK, that's fine. We know of researchers who do the rRNA depletion as a precursor to capture, as it is known that there are rRNA sequences similar to regions in mammalian RNA:

        Mauro, et al. PNAS 94:422-427 (January 1997):

        "rRNA-like sequences occur in diverse primary transcripts:
        Implications for the control of gene expression"

        Do your rRNA sequences actually map to the 5S/5.8S/18S/28S sequences for your organism in the reference genome you're using?
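One way to check this is to look at how the mapped reads are distributed over the individual reference sequences. The sketch below parses `samtools idxstats` output (tab-separated: reference name, length, mapped count, unmapped count) and reports each reference's share of the mapped reads. The reference names, lengths, and counts in the example are hypothetical, and the function name is my own.

```python
def mapped_share_by_reference(idxstats_text):
    """Map each reference name to its percent of all mapped reads."""
    counts = {}
    for line in idxstats_text.splitlines():
        fields = line.split("\t")
        # Skip malformed lines and the final "*" row (reads with no reference).
        if len(fields) != 4 or fields[0] == "*":
            continue
        counts[fields[0]] = int(fields[2])
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}

# Hypothetical idxstats output; names and numbers are invented.
EXAMPLE = (
    "rRNA_18S\t1869\t300\t0\n"
    "rRNA_28S\t5070\t600\t0\n"
    "rRNA_5_8S\t157\t100\t0\n"
    "*\t0\t0\t5000\n"
)

for name, share in mapped_share_by_reference(EXAMPLE).items():
    print(f"{name}\t{share:.1f}%")
```

If most of the signal sits on a handful of unexpected entries rather than on the main rRNA species, that would point towards spurious matches rather than genuine rRNA carry-over.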

        Olaf

        • #5
          Thanks,

          that's a welcome reference.

          Regarding the mapping: well I would think so yes. I used a fasta with all known human rRNA sequences (genome and mitochondrial) as reference in BWA. Come to think of it, I mapped the raw fastqs against the fasta and QC showed a significant amount of G-tails (NextSeq500 data) in the raw files. Might be that these reads mapped to G-repeats in the rRNA, explaining the problem (partially). I'll check that later and let you know.

          M

          • #6
            You should scan and trim the data before aligning if there are substantial poly-G tails present.
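To make the trimming step concrete, here is a deliberately simple sketch of poly-G tail removal on a single read. It is not how any particular trimmer implements it: it only strips an exact, uninterrupted run of G's at the 3' end, whereas real tools generally tolerate some mismatches. The function name and the `min_tail=10` threshold are assumptions chosen for illustration.

```python
def trim_poly_g_tail(seq, qual, min_tail=10):
    """Remove a trailing run of G's if it is at least `min_tail` bases long.

    `seq` and `qual` are the sequence and quality strings of one FASTQ record.
    Shorter G runs are left alone, since they may be genuine sequence.
    """
    stripped_len = len(seq.rstrip("G"))
    if len(seq) - stripped_len >= min_tail:
        return seq[:stripped_len], qual[:stripped_len]
    return seq, qual

# A read with a 12-base G tail gets trimmed; a read ending in "GG" does not.
print(trim_poly_g_tail("ACGTACGT" + "G" * 12, "I" * 20))
print(trim_poly_g_tail("ACGTGG", "IIIIII"))
```

In practice a dedicated trimmer is the better choice; the sketch is only meant to show what is being removed before alignment.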

            You can find the sequence of the human rDNA repeat here, if you want to check again.

            Is there any correlation between rRNA and FFPE in the second set? FFPE samples are generally of comparatively lower quality and it would not be surprising if they showed rRNA contamination.

            • #7
              We tend to see poly-G strings in Read 2. We find it under the %Abundant files on Basespace.

              • #8
                So I used fastp to remove low quality reads, adapters, reads that were too short, G-tails, etc... Then remapped using BWA, but this time against the fasta provided by GenoMax.

                For the data of the first run results were very similar to the earlier results (rRNA 2-10%, max difference between new and earlier results 0.63%).

                For the second run results were also very similar to earlier results for 16/20 samples (max ∆ 0.38%). For four samples however, percentages were 1-7% higher than before (which would mean that fastp has filtered out a significant amount of reads that didn't map to rRNA).
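The reasoning in parentheses is just a change of denominator, which a small worked example makes explicit. The read counts below are hypothetical and only chosen to land in the 1-7% range mentioned above: if filtering removes reads that do not map to rRNA while the rRNA reads survive, the rRNA percentage goes up even though the number of rRNA reads is unchanged.

```python
def rrna_percent(rrna_reads, total_reads):
    """Percent of reads that map to rRNA."""
    return 100.0 * rrna_reads / total_reads

# Hypothetical sample: 1,000,000 raw reads, 200,000 of them rRNA.
before = rrna_percent(200_000, 1_000_000)

# Suppose filtering drops 150,000 reads, none of which mapped to rRNA.
after = rrna_percent(200_000, 1_000_000 - 150_000)

print(f"before filtering: {before:.2f}%")          # 20.00%
print(f"after filtering:  {after:.2f}%")           # 23.53%
print(f"change: {after - before:+.2f} points")     # +3.53 points
```

Comparing the absolute number of rRNA-mapped reads before and after filtering, not only the percentages, would show whether this is what happened in those four samples.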

                So the question remains: why does run2 seem to contain more rRNA than run1 in the majority of samples? The only difference between the two protocols is that 10/20 samples on run2 were FFPE samples extracted with a different technology, but rRNA content is high for the majority of the high-quality samples as well. Both were carried out by the same lab tech.
                Also the supervisor just informed me that run2 reads contain UMIs whereas run1 didn't. Could this be affecting my output?

                M
