  • Analysing RNA seq data

    Hi,
    I am new to the field of RNA sequencing. I want to know the best way to analyse RNA sequencing data from Illumina without replication. I briefly describe my experiment below and would be grateful if someone could suggest ways to analyse the data.

    I work on forest trees, which are outbreeding like humans. I used 3 populations with 15 seedlings from each population. I grew the 15 seedlings from each population in a glasshouse for 4 months and then imposed water stress by giving them a limited amount of water for 2 months. I took leaf samples for RNA just before imposing stress. Ten seedlings received the stress treatment and the other five were well watered. I took further leaf samples one month and two months after the start of treatment. From the initial sampling, I bulked the RNA from the 10 seedlings which received the stress treatment and, separately, bulked the RNA from the other five seedlings. I did the same with the stress-treated seedlings and sequenced all five bulks (3 bulks from the stressed seedlings, including one from before imposing stress, and two from controls, C0). The five RNA bulks were sequenced using Illumina.

    Is it valid to compare expression from the ten seedlings before and after stress treatment? I don't have any replication. Could I use DESeq for analysing this data even though I don't have replicates?

  • #2
    Hi Balat,

    DESeq supports testing without replicates. See thread
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

    and the documentation on http://bioconductor.org/packages/rel...tml/DESeq.html



    • #3
      Hi Balat

      The Bioconductor package edgeR also supports DE analysis without replication - see the discussion at the link posted by joro. Find out more about edgeR here - I recommend having a look at the User's Guide to get a feel for what the package does and how to use it. The section on Poisson analysis is most relevant if you want to analyse data without replication.

      Best regards
      Davis



      • #4
        Hi,

        DESeq, when given data without replicates, will switch to a conservative mode that overestimates the variance, as I described in the post that joro cited. edgeR can do the same, but you have to tell it what dispersion estimate to use. Be sure to read Davis's post in the same thread. We both stress that setting the dispersion to 0 (Poisson test) will never give reliable results.
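To make the point about the Poisson test concrete, here is a minimal Python sketch comparing the standard deviation implied by a Poisson model (dispersion 0) with that of a negative binomial model, using the usual parameterisation variance = mean + dispersion * mean^2. The mean count of 100 and the dispersion of 0.1 are illustrative assumptions, not values from this experiment, and `nb_variance` is a hypothetical helper name rather than a DESeq or edgeR function.

```python
import math

def nb_variance(mean, dispersion):
    """Variance of a negative binomial count with the given mean and
    dispersion, using variance = mean + dispersion * mean**2.
    A dispersion of 0 reduces this to the Poisson variance (= mean)."""
    if mean < 0 or dispersion < 0:
        raise ValueError("mean and dispersion must be non-negative")
    return mean + dispersion * mean ** 2

# Illustrative values only (assumptions, not taken from the thread's data).
mean_count = 100.0
assumed_dispersion = 0.1

poisson_sd = math.sqrt(nb_variance(mean_count, 0.0))
nb_sd = math.sqrt(nb_variance(mean_count, assumed_dispersion))

print(f"Poisson sd: {poisson_sd:.1f}")
print(f"NB sd:      {nb_sd:.1f}")
```

With these assumed numbers the negative binomial standard deviation is more than three times the Poisson one, so a test that assumes Poisson noise would call differences significant that are well within sample-to-sample variability.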

        Why did you pool your data into bulk samples? I understand that sequencing each sample individually would have been too expensive, but you could have pooled the 10 stress samples into two pools of five each. Then you would have sequences for three pools, two of which would have been biological replicates, which is fully sufficient to get a good noise estimate. If you had used barcoded adapters, it might not even have cost more.

        Simon



        • #5
          Hi Simon,
          Thanks for the reply. Yes, I could have used two bulks of five each, but unfortunately I didn't. However, I have analysed my data using the sample clustering feature of DESeq. I have sequence data from two populations. I have denoted the control samples from each population as S0-P and S0-K, and similarly the samples from stress2 as S1-P and S1-K and from stress1 as S2-P and S2-K (there was an error in my labelling). The heat map clearly separates the treatments. Moreover, expression from the two populations within a treatment looks very similar, as it would for biological replicates. In that case, can I treat the two populations as biological replicates? I have attached the heat map here.
          Thank you very much.


          • #6
            Hi Davis,
            I just had a look at your reply to Sergio. I have used edgeR, treating S0-P and S0-K, and S1-P and S1-K, as biological replicates. I got a common dispersion estimate of 0.06. Based on this result and the heat map figure, is it OK to treat my two populations (P and K) as biological replicates?

            Thank you.



            • #7
              Originally posted by Balat
              In that case, can I treat the two populations as biological replicates?
              If they are two independently grown populations, you don't just treat them as biological replicates, they are biological replicates. So go ahead and use them that way.

              Simon



              • #8
                I concur with Simon - you either have biological replicates or you do not, based on the origin of your samples. It is not something that is determined in the analysis of the data.

                I have used edgeR by treating S0-P and S0-K and S1-P and S1-K as biological replicates. I got a common dispersion estimate of 0.06.
                By way of interpretation, the common dispersion estimate is the "squared coefficient of variation", which is a measure of the inter-library variability, distinct from the technical variability. Here the coefficient of variation is therefore approximately 0.24, which we would interpret as indicating that the true concentration of each gene (an unobservable quantity) varies up and down by 24% between libraries.

                The assumption here is that the coefficient of variation is more or less constant across all genes. Now, Simon would rightly point out that this assumption does not hold for all RNA-seq datasets, but it does give you some idea of the variability you see between sample replicates in your data.
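As a quick check of the arithmetic, here is a minimal Python sketch of the conversion from a common dispersion estimate to a coefficient of variation. The function name `biological_cv` is a hypothetical helper for illustration, not part of edgeR; the only input taken from the thread is the reported estimate of 0.06.

```python
import math

def biological_cv(common_dispersion):
    """Coefficient of variation implied by a common dispersion estimate,
    i.e. the square root of the dispersion."""
    if common_dispersion < 0:
        raise ValueError("dispersion must be non-negative")
    return math.sqrt(common_dispersion)

# 0.06 is the common dispersion estimate reported earlier in the thread.
cv = biological_cv(0.06)
print(f"coefficient of variation: {cv:.3f}")
```

The square root of 0.06 is about 0.245, which corresponds to the figure of roughly 24% quoted above.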

                Cheers
                Davis



                • #9
                  Thanks Simon and Davis.
                  The populations in my study are separately grown populations of the same species under common garden conditions. I was expecting that the gene expression patterns would be very different between the populations, but the clustering analysis in DESeq clearly shows that the expression patterns within each treatment are similar between the two populations.
