
  • RNA-Seq throughput

    Hello everyone! I have a basic question, but I couldn't find anything about it.
    I want to do paired-end RNA-Seq on a HiSeq 2000 with grape mRNA (to look at differential expression), but I don't know how many samples I can put per lane and still get good coverage.

    Whole genome size of grape (Vitis vinifera): 485,000,000 bp

    I asked someone and he told me: "if sequencing the 8 samples in 1 lane, I can take 3~3.5 Gb throughput"

    1.- What does "throughput" mean here?
    2.- Is that enough to see differential expression?
    3.- How can I calculate a good coverage?

    Thank you very much!!

  • #2
    1) Throughput here means ~3-3.5 Gb (gigabases) of sequence output per sample.
    2) You are on the low end. If your goal is differential expression, then the number of gigabases is not really that informative, because what you are interested in is the number of reads/fragments. For differential expression, a single accurately mapped 50 bp read gives about as much information as a 100 bp paired-end read. If you are also looking for alternative splicing, then the added bases help. I am assuming these are paired-end, 100 bp data? If so, the number of paired-end reads you will have is ~3 Gb / 200 bp = ~15,000,000 reads per sample. Yes, it's possible to detect differential expression at that level. But to give you an idea, I would say ~10,000,000 is the lower limit for Arabidopsis. Also, not all reads will map, or map correctly, so you should expect to lose some data. If you can get ~15,000,000 reads for each of 8 samples on a single lane, then I would go with two lanes of data and get ~30,000,000, which will give you a much better representation. Your ability to detect differential expression accurately depends heavily on read counts, and if you take the minimal number of reads, then lowly expressed genes will be a problem.
    3) I don't like calculating coverage for RNA-seq. Each gene is expressed at a different level, so how does one make sense of coverage from such data? For it to make sense, you would need to know a priori how many copies of each RNA you have, in which case there is no need to do the experiment.

    Also, make sure you have biological replicates. It is pointless if you do not have biological replicates.
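
The read-count arithmetic in point 2 can be sketched in a few lines of Python. This is only a rough estimate under the assumptions stated in the post (100 bp paired-end reads, ~3-3.5 Gb per sample per lane). The function names and the 0.8 mapping rate are hypothetical, chosen for illustration; the actual fraction of reads that map depends on the data and the aligner.

```python
import math


def read_pairs_per_sample(output_gb: float, read_length_bp: int = 100,
                          paired: bool = True) -> int:
    """Approximate reads (or read pairs) obtained from a per-sample yield in gigabases."""
    # A paired-end fragment contributes two reads' worth of bases.
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return round(output_gb * 1e9 / bases_per_fragment)


def usable_pairs(pairs: int, mapping_rate: float) -> int:
    """Read pairs left after mapping losses (mapping_rate is an assumed fraction)."""
    return round(pairs * mapping_rate)


def lanes_needed(target_pairs: int, pairs_per_sample_per_lane: int) -> int:
    """Whole lanes required to reach a target depth per sample."""
    return math.ceil(target_pairs / pairs_per_sample_per_lane)


low = read_pairs_per_sample(3.0)   # ~3 Gb / 200 bp per pair
high = read_pairs_per_sample(3.5)
print(f"{low:,} to {high:,} read pairs per sample per lane")
print(f"{usable_pairs(low, 0.8):,} pairs if 80% map (illustrative rate)")
print(f"{lanes_needed(30_000_000, low)} lanes for ~30,000,000 pairs per sample")
```

With 8 samples per lane at the low end of the quoted yield, this gives roughly 15,000,000 pairs per sample from one lane, so reaching ~30,000,000 takes two lanes.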
    Last edited by chadn737; 03-08-2013, 07:17 AM.



    • #3
      Thanks for such a quick response!!

      Originally posted by chadn737
      I am assuming these are paired-end, 100 bp data? If so, the number of paired-end reads you will have is ~3 Gb / 200 bp = ~15,000,000 reads per sample. Yes, it's possible to detect differential expression at that level. But to give you an idea, I would say ~10,000,000 is the lower limit for Arabidopsis. Also, not all reads will map, or map correctly, so you should expect to lose some data. If you can get ~15,000,000 reads for each of 8 samples on a single lane, then I would go with two lanes of data and get ~30,000,000, which will give you a much better representation.
      Yes, it is paired-end 100 bp data. There are really 4 samples, each with a biological replicate.
      The scheme is:
      RNA from:
      plant A: 2 grape clusters, each at 2 different times
      plant B: 2 grape clusters, each at 2 different times
      Each sample is kept separate, as follows:

      PLANT-TIME-CLUSTER
      A-T1-C1
      A-T1-C2
      A-T2-C1
      A-T2-C2
      B-T1-C1
      B-T1-C2
      B-T2-C1
      B-T2-C2

      So all the "C2" samples are the biological replicates. Thanks again!
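
One way to read the layout above is as four plant-by-time conditions with two clusters (C1 and its replicate C2) in each. A small Python sketch, assuming that reading, rebuilds the eight sample names and groups them by condition; the variable names are illustrative only.

```python
from collections import defaultdict
from itertools import product

plants = ["A", "B"]
times = ["T1", "T2"]
clusters = ["C1", "C2"]  # C2 is treated as the biological replicate of C1

# Build PLANT-TIME-CLUSTER labels in the same order as the list above.
samples = [f"{p}-{t}-{c}" for p, t, c in product(plants, times, clusters)]

# Group samples by (plant, time) so each condition lists its replicates.
replicates = defaultdict(list)
for sample in samples:
    plant, time, _cluster = sample.split("-")
    replicates[(plant, time)].append(sample)

for condition, members in replicates.items():
    print(condition, members)
```

Under this reading there are 8 libraries in total, with 2 per condition.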



      • #4
        It's good that you have biological replicates. I just checked, and the number of grape genes (~30,000) is not much more than in Arabidopsis and a lot less than in some of the other species I have worked with. You could get away with ~15,000,000 reads, but from experience, getting that extra lane of data and the increased sequencing depth makes a huge difference. So I would really encourage you to use at least 2 lanes of data.
