  • What's the best approach for a RNA seq project aiming at splicing and mRNA levels

    We are about to embark on an RNAseq project looking at the effects of a null mutant mouse lacking an RNA binding protein that affects mRNA stability/translation and splicing. So, we want to explore both mRNA levels as well as mRNA exon composition in mutants vs wild-type.


    1) Machine

    In theory, we have access to 454, SOLiD, and Illumina, although the last is the best established in our facility. Read length is obviously important for a splicing analysis (an argument for 454), but would paired-end reads on Illumina also do the trick?

    2) RNA isolation & library
    We were thinking of using a standard RNeasy Plus purification, which does not capture RNAs under 200 nt. Any downsides you can think of? Or is there a different isolation method you think is superior for our purpose?
    This would be followed by library generation in the facility, which as far as I understand is a typical oligo(dT)-primed cDNA synthesis. RNA quantity is not an issue, so we don't need error-prone amplification.

    3) Type of read
    Should we go for maximal read length & paired end or is less good enough?
    What total number of reads should we aim for to be able to do the analyses we want to do?
    Of course, long, paired, and more reads will give us better data but what's a good minimum starting point?

    Would be great to hear your opinions.

    Regards, j

  • #2
    For differential splicing analysis, number of reads may well be more important than length of reads. I don't think a 454 can give you enough reads to achieve statistical power. Even with Illumina, you may be better off with many shorter reads. What kind of effects do you expect your mutation may have?


    • #3
      I'm doing a similar type of analysis of splicing and expression levels. I think the new Illumina HiSeq machines are probably the way to go because coverage is so important, and they can produce hundreds of millions of 100 bp reads per lane. Depth is important if you want to see enough reads over splice junctions etc., and to improve statistical power for expression quantification (as mentioned in the response above).

      I've found that some genes are expressed at quite low levels making it difficult to assess expression and splicing. Although going with shorter reads will get you more coverage and probably be better for expression analysis, I think longer reads might be helpful in analyzing splicing, although it depends exactly what you're looking for there and what your transcriptome/genes look like. Also, you have to remember that you're probably going to trim off the last 10-15nt from your reads because they're of low quality and affect alignment/mapping.
      I went with single reads in order to increase my number of unique reads per lane (and per $), but if you have the funding paired-end couldn't hurt.
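The fixed-length 3' trimming mentioned above could be sketched roughly as follows. This is only an illustration, assuming reads are already available as sequence and quality strings of equal length; the function name `trim_3prime` and the default of 15 nt are made up for the example and are not taken from any particular tool.

```python
def trim_3prime(seq, qual, n_trim=15):
    """Drop the last n_trim bases (and their quality values) from one read.

    Hypothetical helper for illustration only. seq and qual are the
    sequence and quality strings of a single FASTQ record.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if n_trim < 0:
        raise ValueError("n_trim must be non-negative")
    # Never keep a negative number of bases if the read is shorter than n_trim.
    keep = max(len(seq) - n_trim, 0)
    return seq[:keep], qual[:keep]


if __name__ == "__main__":
    # Toy 10 nt read, trimming the last 3 positions.
    print(trim_3prime("ACGTACGTAC", "IIIIIIIIII", n_trim=3))
```

In practice a quality-aware trimmer is probably a better choice than a fixed cut, since the position where quality drops varies between runs.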

      What exactly are you trying to examine in terms of intron splicing? Alternative splicing or something else?


      • #4
        Hi Camg,

        I have recently been doing some research with RNA-seq as well.
        Do you know of any program or software that can identify alternative splice sites from assembled RNA-seq reads without a reference genome?
        Thanks.


        • #5
          Thanks to Simon and Camg for replying. Regarding your questions:

          Simon wrote: "What kind of effects do you expect your mutation may have?";
          Camg wrote: "What exactly are you trying to examine in terms of intron splicing? Alternative splicing or something else?"


          I'm expecting suppression of alternative exons to be relieved in the mutant, since the protein typically acts to exclude optional exons. In addition, it can affect overall mRNA levels by modifying their stability.

          So, I need to compromise between read numbers which, like Simon pointed out, would be better for differential gene expression, and read length which would be better for exon alternative splicing. The question is what's a good compromise?

          Any immediate suggestions regarding RNA isolation?

          Regards, j


          • #6
            Long reads are useful for identifying which transcript you have. Imagine your gene has two cassette exons with a constitutive exon in between. Your data show that each of the two exons is spliced out in half of the transcript molecules. You may be interested to know whether there is a correlation. It could be the case, e.g., that (a) a transcript either has both exons or neither of them, or (b) whenever one exon is present the other is absent, or (c) there is no correlation between the two exons and all four possibilities occur about equally often. Long reads help you decide between such possibilities, as they can span from one exon to the other.
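To make the three scenarios concrete, here is a minimal sketch that tabulates exon co-occurrence from reads long enough to cover both cassette exons. It assumes you have already decided, for each such read, whether it contains exon A and exon B; the function name `exon_cooccurrence` and the toy inputs are hypothetical. The phi coefficient is used here as one simple summary of the 2x2 table, not as the only possible choice.

```python
from collections import Counter
from math import sqrt


def exon_cooccurrence(reads):
    """Tabulate presence/absence of two cassette exons and return (table, phi).

    reads: iterable of (has_exon_a, has_exon_b) pairs, one pair per read
    that spans both exons. Hypothetical helper for illustration only.
    """
    table = Counter((bool(a), bool(b)) for a, b in reads)
    n11 = table[(True, True)]    # both exons present
    n10 = table[(True, False)]   # only exon A present
    n01 = table[(False, True)]   # only exon B present
    n00 = table[(False, False)]  # both exons skipped
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # phi near +1: exons appear together (case a); near -1: mutually
    # exclusive (case b); near 0: no correlation (case c).
    phi = (n11 * n00 - n10 * n01) / denom if denom else 0.0
    return table, phi


if __name__ == "__main__":
    both_or_none = [(True, True)] * 50 + [(False, False)] * 50
    exclusive = [(True, False)] * 50 + [(False, True)] * 50
    independent = [(a, b) for a in (True, False) for b in (True, False)] * 25
    for name, reads in [("a", both_or_none), ("b", exclusive), ("c", independent)]:
        print(name, exon_cooccurrence(reads)[1])
```

With short reads that cover only one of the two exons, the table cannot be filled in at all, which is the point of the argument above.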

            However, getting long reads is more expensive than getting short reads. And if you just want to know whether your treatment causes exons that are usually spliced out to be retained, you might not need them, as things are much easier then. Just count, for each sample, the number of reads falling onto the gene of interest and the number of reads among these that overlap the alternative exon. Does the fraction of reads from this gene that fall onto this exon increase significantly from control to treatment? If you just want to compare the number of reads mapping onto the exon with the number of reads mapping onto any other part of the gene, rather short reads are fully sufficient.
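The counting described above can be sketched as follows. This is a deliberately naive illustration: the function names and the read counts are made up, and the one-sided Fisher's exact test treats every read as an independent draw from pooled counts. It ignores variability between biological replicates, so it will tend to overstate significance and should not be mistaken for a proper differential exon usage test.

```python
from math import comb


def inclusion_fraction(exon_reads, gene_reads):
    """Fraction of a gene's reads that overlap the alternative exon."""
    if gene_reads <= 0 or not 0 <= exon_reads <= gene_reads:
        raise ValueError("need 0 <= exon_reads <= gene_reads and gene_reads > 0")
    return exon_reads / gene_reads


def fisher_one_sided(exon_trt, gene_trt, exon_ctl, gene_ctl):
    """Naive one-sided Fisher's exact test for an increased exon fraction.

    Returns P(X >= exon_trt) under the hypergeometric null, where X is the
    number of exon reads that land in the treatment sample. Illustration
    only: replicate-to-replicate variability is not modelled.
    """
    total = gene_trt + gene_ctl
    exon_total = exon_trt + exon_ctl
    upper = min(exon_total, gene_trt)
    # Sum the hypergeometric tail with exact integers, divide once at the end.
    tail = sum(
        comb(exon_total, k) * comb(total - exon_total, gene_trt - k)
        for k in range(exon_trt, upper + 1)
    )
    return tail / comb(total, gene_trt)


if __name__ == "__main__":
    # Made-up counts: (reads overlapping the exon, reads on the whole gene).
    control = (40, 400)
    treatment = (90, 450)
    print("control fraction:  ", inclusion_fraction(*control))
    print("treatment fraction:", inclusion_fraction(*treatment))
    print("naive one-sided p: ", fisher_one_sided(*treatment, *control))
```

Here `gene_reads` includes the exon reads, matching the description of counting reads on the gene and then the subset that overlap the exon.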

            So: In order to distinguish transcripts and see correlation between the usage of several facultative exons, read length helps a lot. But if you only want to know for each exon individually whether its usage in transcripts changes due to your treatment, better invest your money in read number than in read length.

            Calculating the ratios is simple; testing whether a difference is statistically significant is challenging. I say this because we are currently working on a tool to perform such an analysis. It should be ready for release soon.

            Finally: Please don't forget to do your experiment at least in duplicate.


            • #7
              Originally posted by edge
              Hi Camg,

              I have recently been doing some research with RNA-seq as well.
              Do you know of any program or software that can identify alternative splice sites from assembled RNA-seq reads without a reference genome?
              Thanks.
              I'm using TopHat, which requires a reference genome. You're going to need to do de novo assembly. I think there are versions of ABySS and SOAP that can do this, and probably some others. Since I'm not doing de novo assembly, I'm not really familiar with what you'll need to do, but it seems like identifying alternative splicing from a de novo transcriptome could be pretty tricky. Has anyone done this?
              I suppose if you have good coverage and you see both the spliced and unspliced versions of your transcripts, then it should be possible. Sorry I couldn't be of more help.
