  • tashia
    Junior Member
    • Oct 2018
    • 4

    Ribosomal RNA contamination

    Hi everyone,

    I have a question in relation to a differential expression analysis that I ran with rRNA contaminated samples.

    Essentially, my samples had 60-70% rRNA content, but I chose to go ahead with the analysis after removing these sequences, since I still had enough reads to work with (15-20M). Contamination was seen in both cases and controls.

    However, I am now wondering whether I can trust the results I am seeing. Is there any way that high rRNA content can affect patterns of expression? Comparing these results with those from cases and controls with no contamination, I see (1) more DEGs in the non-contaminated samples and (2) different DEGs (only 1 gene overlaps). Of course they are different people, but they do have the same disease and the controls are matched, so I am at a loss as to whether this makes sense.

    Hope this was clear.
    Many thanks.
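
    For anyone wanting to put a number on the overlap described above, here is a minimal Python sketch. The gene IDs and the `deg_overlap` helper are hypothetical placeholders, not results from this analysis; in practice the two lists would be the significant genes from each DESeq2 run.

```python
def deg_overlap(degs_a, degs_b):
    """Return the shared genes and the Jaccard index of two DEG lists."""
    a, b = set(degs_a), set(degs_b)
    shared = a & b
    union = a | b
    # Jaccard index: shared genes divided by all distinct genes in either list
    jaccard = len(shared) / len(union) if union else 0.0
    return sorted(shared), jaccard


# Hypothetical gene IDs, for illustration only
contaminated_degs = ["GENE_A", "GENE_B", "GENE_C"]
clean_degs = ["GENE_C", "GENE_D", "GENE_E", "GENE_F", "GENE_G"]

shared, jaccard = deg_overlap(contaminated_degs, clean_degs)
print(shared, round(jaccard, 3))
```

    A low Jaccard index on its own does not say which list is wrong; it only makes the "1 gene overlap" observation comparable across significance thresholds.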
  • tashia
    Junior Member
    • Oct 2018
    • 4

    #2
    Bumping my own post.


    • ddb
      Member
      • Feb 2012
      • 13

      #3
      Can you give some more details about your experimental setup and the methods / pipeline you are using? Is it possible that in your contaminated samples you do not have enough coverage to identify most of the differentially expressed genes?


      • tashia
        Junior Member
        • Oct 2018
        • 4

        #4
        Originally posted by ddb:
        "Can you give some more details about your experimental setup and the methods / pipeline you are using? Is it possible that in your contaminated samples you do not have enough coverage to identify most of the differentially expressed genes?"

        I should have enough coverage in the contaminated samples. I had approximately 75M reads before removing the rRNA sequences.

        The pipeline is: remove rRNA sequences with BBDuk, trim with Trimmomatic, align with STAR (latest hg38 version), count with HTSeq, and finally run DESeq2. About half of the samples were contaminated, but these were sequenced at a greater depth to achieve the same coverage as the non-contaminated samples after removing the rRNA content. A PCA plot shows that the contaminated samples are clearly distinct from the "good" batch. I have controlled for batch (and tried several tools), but the PCA plot still shows a clear distinction. It feels like this separation is due either to the samples being sequenced at two different times (but many people work with samples like this, and I think I should be able to control for it) or to the fact that they contained a lot of rRNA during sequencing.

        Do you think that expression levels can be affected by a high content of rRNA in the sample while sequencing?
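
        As a rough check of the read numbers above, this Python sketch estimates how many reads remain once the rRNA reads are removed. The function name is made up for illustration, and the 60-70% range is the rRNA content quoted at the start of the thread; further losses from trimming, alignment, and counting are not modelled.

```python
def reads_after_rrna_removal(total_reads, rrna_fraction):
    """Estimate the number of non-rRNA reads left in a library."""
    if not 0.0 <= rrna_fraction <= 1.0:
        raise ValueError("rrna_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - rrna_fraction))


total = 75_000_000  # approximate reads per sample before rRNA removal

worst_case = reads_after_rrna_removal(total, 0.70)
best_case = reads_after_rrna_removal(total, 0.60)
print(f"{worst_case:,} to {best_case:,} reads remain")
```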


        • GenoMax
          Senior Member
          • Feb 2008
          • 7142

          #5
          "Do you think that expression levels can be affected by a high content of rRNA in the sample while sequencing?"

          Not likely. To a sequencer it is just DNA; it does not matter whether it came from rRNA or some other gene. If the two sets of runs used different chemistries or different machines, then that may be something else to consider.

          If your samples were all treated the same, how come you have rRNA contamination in some samples but not others?


          • tashia
            Junior Member
            • Oct 2018
            • 4

            #6
            Originally posted by GenoMax:
            "Not likely. To a sequencer it is just DNA; it does not matter whether it came from rRNA or some other gene. If the two sets of runs used different chemistries or different machines, then that may be something else to consider.

            If your samples were all treated the same, how come you have rRNA contamination in some samples but not others?"

            Thanks for your reply.
            Unfortunately, the rRNA depletion failed in about half of the samples, and rather than redoing the libraries the sequencing facility sequenced them at a greater depth...

            They should have been run on the same machine; the only differences are that they were run at different times and, of course, the contamination. However, if the rRNA contamination shouldn't cause any problems, I don't know why I can't control for the batch effect that I see in the PCA plot. I have tried including batch in the model in DESeq2 and edgeR. I have also tried correcting with RUVSeq and ComBat. I am now looking into ImpulseDE2.
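
            One thing that can be checked independently of the correction tool is whether each batch contains both cases and controls. With two conditions, an additive batch + condition model can only estimate the disease effect from batches that contain both. A minimal Python sketch follows; the batch labels and sample counts are invented for illustration and are not the real design.

```python
from collections import Counter


def crosstab(samples):
    """Count samples for each (batch, condition) pair."""
    return Counter((batch, condition) for batch, condition in samples)


def batches_with_all_conditions(samples):
    """List the batches in which every condition is represented."""
    counts = crosstab(samples)
    batches = sorted({batch for batch, _ in samples})
    conditions = sorted({condition for _, condition in samples})
    return [
        batch
        for batch in batches
        if all(counts[(batch, condition)] > 0 for condition in conditions)
    ]


# Hypothetical design: (batch, condition) per sample
samples = [
    ("rRNA_high", "case"), ("rRNA_high", "case"),
    ("rRNA_high", "control"), ("rRNA_high", "control"),
    ("rRNA_low", "case"), ("rRNA_low", "case"),
    ("rRNA_low", "control"), ("rRNA_low", "control"),
]

table = crosstab(samples)
usable = batches_with_all_conditions(samples)
print(dict(table))
print("Batches containing both conditions:", usable)
```

            If the returned list were empty, batch and condition would be completely confounded, and no batch term or correction method could separate the two effects.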
