  • mvheetve
    Junior Member
    • Mar 2020
    • 6

    inexplicable rRNA content

    Hi guys,

    long time reader, first time poster here. So here's my problem:

    Did two RNAseq runs (NextSeq500, RNA Access Illumina prep, 20 samples per run).

    The first run had 20 samples of high-quality RNA (column-based Qiagen extraction) and showed 2-10% rRNA per sample (mapped with BWA against a fasta from https://www.ncbi.nlm.nih.gov/nuccore with search term txid9606[Organism:exp], then used samtools flagstat to count mapped reads).
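    For anyone wanting to reproduce the percentage, here is a minimal sketch of how flagstat output could be turned into an rRNA fraction. It assumes the usual `N + M label` line layout of samtools flagstat, and it subtracts secondary and supplementary alignments so that each read is counted once; that correction is an assumption, not necessarily what was done for the numbers above. The function name `rrna_percent` and the example counts are hypothetical.

```python
import re

# Hypothetical flagstat-style output; the counts are made up for illustration.
EXAMPLE = """\
1020 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
20 + 0 supplementary
0 + 0 duplicates
120 + 0 mapped (11.76% : N/A)
"""


def rrna_percent(flagstat_text: str) -> float:
    """Percent of reads with a primary alignment to the rRNA reference."""
    counts = {}
    for line in flagstat_text.splitlines():
        m = re.match(r"(\d+) \+ (\d+) (.+)", line.strip())
        if not m:
            continue
        passed = int(m.group(1))  # QC-passed count only
        label = m.group(3)
        if label.startswith("in total"):
            counts["total"] = passed
        elif label.startswith("secondary"):
            counts["secondary"] = passed
        elif label.startswith("supplementary"):
            counts["supplementary"] = passed
        elif label.startswith("mapped ("):
            counts["mapped"] = passed
    # Secondary/supplementary records are extra alignment lines, not extra reads.
    extra = counts.get("secondary", 0) + counts.get("supplementary", 0)
    reads = counts["total"] - extra
    if reads <= 0:
        raise ValueError("no reads found in flagstat output")
    return 100.0 * (counts["mapped"] - extra) / reads


print(round(rrna_percent(EXAMPLE), 2))
```

    With paired-end data this counts reads rather than read pairs, so it is worth checking that both runs are summarised the same way before comparing them.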

    The second run contained 10 samples of high-quality RNA (same extraction as the first run) and 10 samples of RNA from FFPE material, extracted with bead-based Promega technology. rRNA content ranged from 4-38% and was not associated with RNA extraction method or quality (low and high rRNA content occurred in both sample types; in total 14/20 samples had >10% rRNA).

    Does anyone have an explanation as to why my second run contains much more rRNA?

    Cheers
    mvheetve
  • olafblue1955
    Junior Member
    • Feb 2019
    • 8

    #2
    Hi There-

    You did not mention the RNA-seq library prep method you used, nor whether you used any rRNA depletion method.

    What methods related to above were used?

    Olaf


    • mvheetve
      Junior Member
      • Mar 2020
      • 6

      #3
      RE: prep and depletion

      Hi Olaf,

      thanks for showing an interest. I did specify the RNA library prep: they used Illumina RNA Access (https://support.illumina.com/content...15049525-b.pdf). So basically an RNA capture prep against coding regions.

      Apparently no rRNA depletion was done. The supervisor of the project deemed it unnecessary for a capture prep.

      Regards
      M


      • olafblue1955
        Junior Member
        • Feb 2019
        • 8

        #4
        Hi-

        OK, that's fine. We know of researchers who do rRNA depletion as a precursor to capture, as it is known that there are rRNA sequences similar to regions in mammalian RNA:

        Mauro, et al. PNAS 94:422-427 (January 1997):

        "rRNA-like sequences occur in diverse primary transcripts:
        Implications for the control of gene expression"

        Do your rRNA reads actually map to the 5S/5.8S/18S/28S sequences for your organism in the reference genome you're using?

        Olaf


        • mvheetve
          Junior Member
          • Mar 2020
          • 6

          #5
          Thanks,

          that's a welcome reference.

          Regarding the mapping: well I would think so yes. I used a fasta with all known human rRNA sequences (genome and mitochondrial) as reference in BWA. Come to think of it, I mapped the raw fastqs against the fasta and QC showed a significant amount of G-tails (NextSeq500 data) in the raw files. Might be that these reads mapped to G-repeats in the rRNA, explaining the problem (partially). I'll check that later and let you know.

          M


          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            You should scan and trim the data before aligning if there are substantial poly-G tails present.
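            To make the trimming idea concrete, here is a minimal sketch of 3' poly-G removal on a single read. It is deliberately simplified: it only strips an uninterrupted run of G at the end of the read and tolerates no mismatches, so it is not the algorithm of any particular trimming tool. The function name `trim_poly_g` and the minimum run length of 10 are assumptions for illustration.

```python
from typing import Tuple


def trim_poly_g(seq: str, qual: str, min_run: int = 10) -> Tuple[str, str]:
    """Remove a trailing run of G from a read if it is at least min_run long."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    stripped = seq.rstrip("G")
    run = len(seq) - len(stripped)  # length of the trailing G run
    if run < min_run:
        return seq, qual  # short G runs may be genuine sequence; keep them
    # Trim the quality string to the same length as the trimmed sequence.
    return stripped, qual[: len(stripped)]


read = "ACGTAC" + "G" * 12
print(trim_poly_g(read, "I" * len(read)))
```

            A read that consists only of G comes back empty here, so a length filter after trimming would still be needed.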

            You can find the sequence of human rDNA repeat here, if you want to check again.

            Is there any correlation between rRNA and FFPE in the second set? FFPE samples are generally of comparatively lower quality, and it would not be surprising if they showed rRNA contamination.


            • olafblue1955
              Junior Member
              • Feb 2019
              • 8

              #7
              We tend to see poly-G strings in Read 2. We find it under the %Abundant files on Basespace.


              • mvheetve
                Junior Member
                • Mar 2020
                • 6

                #8
                So I used fastp to remove low quality reads, adapters, reads that were too short, G-tails, etc... Then remapped using BWA, but this time against the fasta provided by GenoMax.

                For the data of the first run results were very similar to the earlier results (rRNA 2-10%, max difference between new and earlier results 0.63%).

                For the second run results were also very similar to earlier results for 16/20 samples (max ∆ 0.38%). For four samples however, percentages were 1-7% higher than before (which would mean that fastp has filtered out a significant amount of reads that didn't map to rRNA).
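                The before/after comparison described above can be sketched as follows. The sample names and percentages are placeholders, not the actual data from either run, and the helper name `compare_runs` and the 1-point threshold are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def compare_runs(
    before: Dict[str, float], after: Dict[str, float], threshold: float = 1.0
) -> Tuple[Dict[str, float], float, List[str]]:
    """Per-sample change in rRNA percentage between two analyses."""
    shared = sorted(set(before) & set(after))
    if not shared:
        raise ValueError("no samples in common")
    # Positive delta: rRNA percentage went up in the second analysis.
    deltas = {s: after[s] - before[s] for s in shared}
    max_abs = max(abs(d) for d in deltas.values())
    # Samples whose percentage moved by more than `threshold` points.
    flagged = [s for s in shared if abs(deltas[s]) > threshold]
    return deltas, max_abs, flagged


# Placeholder values only.
raw = {"S1": 5.0, "S2": 12.0, "S3": 30.0}
trimmed = {"S1": 5.25, "S2": 12.0, "S3": 34.0}
deltas, max_abs, flagged = compare_runs(raw, trimmed)
print(max_abs, flagged)
```

                A rise after filtering is what one would expect when the removed reads mostly did not map to rRNA: the rRNA count stays about the same while the total shrinks.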

                So the question remains: why does run2 seem to contain more rRNA than run1 in the majority of samples? The only difference between the two protocols is that 10/20 samples on run2 were FFPE samples extracted with a different technology, but rRNA content is high for the majority of the high-quality samples as well. Both runs were carried out by the same lab tech.
                Also the supervisor just informed me that run2 reads contain UMIs whereas run1 didn't. Could this be affecting my output?

                M
