
  • RNAseq and number of mapped reads

    I'm new to RNAseq and have a couple of questions I would love to get some help with. I want to perform RNAseq on small numbers of mouse and human cells and am considering using the Nugen Ovation kit. There doesn't seem to be much published data using this kit, but the posters on their website (e.g. http://www.nugeninc.com/tasks/sites/...Seq_poster.pdf) show that ~50% of total reads map uniquely and ~15% of total reads map to RefSeq. A recent Biotechniques paper (http://www.ncbi.nlm.nih.gov/pubmed?term=21486238) using the kit indicates that ~25% map to RefSeq. I have two questions:

    1) Is 15%-25% of RNAseq reads mapping to RefSeq what one would see using the standard Illumina procedure starting with mRNA? It seems surprisingly low.

    2) How many mapped reads are generally needed to get robust expression measurements? Specifically, how many mapped RefSeq (or equivalent) reads do people feel give sensitivity comparable to microarrays?

  • #2
    Originally posted by seqfan:
      1) Is 15%-25% of RNAseq reads mapping to RefSeq what one would see using the standard Illumina procedure starting with mRNA? It seems surprisingly low.

      2) How many mapped reads are generally needed to get robust expression measurements? Specifically, how many mapped RefSeq (or equivalent) reads do people feel give sensitivity comparable to microarrays?

    1) Even considering all of the reads (i.e., not just the mapped ones), I find it quite low too.

    2) I would say it depends on the expression level of the gene you consider. A low depth of coverage may be enough for highly expressed genes, but many genes often have few or no reads mapped to them, so I am not sure that the total number of genic reads is the best criterion. Plus, the distribution is not simple.
    Another factor you may want to consider is the number of replicates, which is important if you are looking for robustness.

    I am not being very helpful here. Any thoughts from the others?
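To make the point about expression level concrete, here is a minimal sketch. It assumes reads are sampled independently, so that the count for a gene is roughly Poisson distributed; that model and the example abundance (1e-7 of mapped reads) are illustrative assumptions, not figures from this thread. As noted above, the real distribution is not that simple.

```python
import math


def expected_reads(total_mapped_reads: int, transcript_fraction: float) -> float:
    """Expected read count for a gene that accounts for
    `transcript_fraction` of all mapped reads."""
    return total_mapped_reads * transcript_fraction


def prob_zero_reads(total_mapped_reads: int, transcript_fraction: float) -> float:
    """Probability of seeing no reads at all for that gene,
    under the assumed Poisson sampling model: P(0) = exp(-expected)."""
    return math.exp(-expected_reads(total_mapped_reads, transcript_fraction))


if __name__ == "__main__":
    rare_gene = 1e-7  # hypothetical: one read in ten million comes from this gene
    for depth in (10_000_000, 20_000_000, 100_000_000):
        print(
            f"{depth:>11,d} mapped reads: "
            f"expected {expected_reads(depth, rare_gene):.1f}, "
            f"P(zero reads) = {prob_zero_reads(depth, rare_gene):.3g}"
        )
```

Under these assumptions, a gene expected to contribute about one read at 10M mapped reads is missed entirely in roughly a third of libraries, which is why depth alone says little without knowing which expression range matters.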


    • #3
      Originally posted by steven:
        1) Even considering all of the reads (i.e., not just the mapped ones), I find it quite low too.

        2) I would say it depends on the expression level of the gene you consider. A low depth of coverage may be enough for highly expressed genes, but many genes often have few or no reads mapped to them, so I am not sure that the total number of genic reads is the best criterion. Plus, the distribution is not simple.
        Another factor you may want to consider is the number of replicates, which is important if you are looking for robustness.

        I am not being very helpful here. Any thoughts from the others?

      Thanks for your reply. I'm surprised that there aren't decent guidelines about what the number of mapped reads should be to reach roughly the same sensitivity as arrays. It will of course depend on the organism and cell type, but there must be a ballpark figure that would be likely to work for most things. Most importantly, I think knowing what parameters would give similar sensitivity to, say, Affy arrays would be very helpful for many people thinking of switching to RNA-seq.

      I have found a few papers looking at this, but none have really addressed the question appropriately. Most don't do any validation of genes called by one method but not the other to test which (if either) is more sensitive. When they do, they pick ~5 genes on each side, which is not enough.

      It would be great to get more discussion on this point here. Maybe I can phrase it another way to make it easier to start: How many reads are people targeting for an average mammalian RNA-seq experiment? 10M, 20M, 100M, something else?
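As a rough aid for the depth question, the sketch below converts total reads into RefSeq-mapped reads using the 15% and 25% fractions quoted in the first post and the 10M/20M/100M depths floated above. The function names and the 5M target in the last line are hypothetical, chosen only for illustration; nobody in this thread has proposed a target.

```python
import math


def refseq_reads(total_reads: int, refseq_fraction: float) -> int:
    """Approximate number of reads mapping to RefSeq for a given total."""
    return round(total_reads * refseq_fraction)


def total_reads_needed(target_refseq_reads: int, refseq_fraction: float) -> int:
    """Total reads to sequence so that about `target_refseq_reads`
    of them map to RefSeq (rounded up)."""
    if not 0 < refseq_fraction <= 1:
        raise ValueError("refseq_fraction must be in (0, 1]")
    return math.ceil(target_refseq_reads / refseq_fraction)


if __name__ == "__main__":
    for depth in (10_000_000, 20_000_000, 100_000_000):
        for fraction in (0.15, 0.25):
            print(f"{depth:>11,d} total reads at {fraction:.0%} -> "
                  f"{refseq_reads(depth, fraction):,d} RefSeq reads")
    # Hypothetical target of 5M RefSeq-mapped reads at a 25% RefSeq fraction.
    print(f"{total_reads_needed(5_000_000, 0.25):,d} total reads needed")
```

The arithmetic is trivial, but it shows how strongly the RefSeq fraction drives the sequencing budget: the same target needs considerably more total reads at 15% than at 25%.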


      • #4
        Here is the answer: 500 million.


        • #5
          I sent you a PM, seqfan, but I figured I would post here for others to read too.

          I generally get more reads mapping than that. I use a Trizol method which, according to that paper, yields more mRNA-mapping reads. Also, they used an Illumina CASAVA pipeline to map the reads. I'm a bit surprised at that, since none of the people I know doing RNA-Seq actually stick with Illumina tools for their analysis. When I align my RNA reads with Illumina's ELAND, for example, I get single-digit percentages that map. Switching to something like BWA, TopHat, or RSEM works much, much better. So as a comparison of the methods tested in that paper, using ELAND might be fine, but I wouldn't expect the same numbers for your samples if you use a better mapping method. However, I haven't tried the ssDNA nuclease method yet for my Nugen samples, but I will be doing so in the next 2-3 weeks.


          • #6
            I think pbluescript has a good point. You will see better mapping using something besides ELAND, and it will also depend on your annotation file. Try BWA or TopHat.

            We are still trying to figure out how many reads we need as well. Currently we like 10-20 million for mRNA-seq and much more for whole transcriptome, but we are still working it out. Luckily, we have a ton of microarray and qPCR data we can draw on for some statistical comparisons.


            • #7
              Going back to a part of the original post, but for any RNA-seq method:

              What percentage of reads mapping to RefSeq would classify a run as successful versus failed?

              20%, 40%, 60%, 80%?

              Assume that you don't have infinite sequencing resources, and that repeating a run or running additional lanes carries a huge penalty.


              • #8
                85% with Bowtie using default alignment criteria and a short read length of 50 bp
                95% with TopHat using default alignment criteria and a short read length of 50 bp

                Loosening the alignment criteria may add a percentage point or two.
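For anyone who wants to check their own numbers against percentages like these, here is a minimal sketch that tallies the mapped fraction from SAM-format alignment records using only the FLAG column. It assumes the input contains unmapped reads as well as mapped ones (some aligners write unmapped reads to a separate file, in which case this would overstate the rate), and it reports mapping to whatever reference was aligned against, not the RefSeq fraction specifically. The function name is illustrative.

```python
from typing import Iterable

UNMAPPED = 0x4         # SAM FLAG bit: segment unmapped
SECONDARY = 0x100      # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM FLAG bit: supplementary alignment


def mapping_rate(sam_lines: Iterable[str]) -> float:
    """Fraction of primary SAM records that are mapped.

    Header lines (starting with '@') are skipped, and secondary and
    supplementary records are ignored so each read is counted once.
    For paired-end data, each mate counts as one read.
    """
    total = mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    return mapped / total if total else 0.0
```

Usage would be along the lines of `with open("aligned.sam") as fh: print(mapping_rate(fh))`, where `aligned.sam` is a placeholder file name.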
