  • Coverage for duplicates

    Hi guys,

    I'm doing 100 bp paired-end (PE) sequencing on the human genome with 3 different conditions. I ran one set of samples on one lane, and I just finished looking at the data. The original plan was that if the data showed good differences (it does), I would run two more sets of samples on one more lane.

    However, now that the moment of truth is here, I'm wondering if this is the right move. Should I instead run only one more set of samples on one lane and just have my results in duplicate?

    Sorry, I'm such a newbie that I'm not sure which stats are pertinent, so let me just give what I have. The RNA is very good, and in the set that I have results for, I got 90% read alignment. After I ran cuffdiff, cummeRbund gave me 300 significantly differentially expressed genes. Of those 300, 30 had "values" under 1. However, about 40 of the 300 had more than a two-fold difference, and 25 of those 40 had a "value" under 1.

    For microarrays, a two-fold change is the minimum you can have to call it useful. If that's also the rule for sequencing, then I worry that if I essentially split my lane in half, I'll lose 25 of the 40 genes that were very significantly differentially expressed.

    Any thoughts or suggestions?

    Thanks so much!

  • #2
    I have some more info in case anyone searches and comes across this. First off, the "values" are FPKM values (I had a deadline yesterday, was running around like crazy, and didn't actually stop and THINK).

    To address this problem, see how many mappable reads you have. I had approximately 100 M per condition. If I run two sets of samples on one lane, I'll end up with about 50 M per condition. At an FPKM of 1, that works out to 50 fragments per kilobase of transcript. I spoke to an Illumina tech, who told me that the minimum they usually hear is still usable is in the 30-50 fragment range. Is this what everyone else has heard?
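
    That arithmetic can be sketched roughly as follows, assuming the usual FPKM definition (fragments per kilobase of transcript per million mapped fragments). The function name and the 1 kb default transcript length are illustrative choices, not something taken from cuffdiff or cummeRbund.

```python
def expected_fragments(fpkm, mapped_fragments, transcript_length_bp=1000):
    """Approximate fragment count implied by an FPKM value.

    Rearranges FPKM = fragments / (length in kb * mapped fragments in millions).
    The default transcript length of 1000 bp is an assumption for illustration.
    """
    length_kb = transcript_length_bp / 1_000
    mapped_millions = mapped_fragments / 1_000_000
    return fpkm * length_kb * mapped_millions


# Compare a full lane (~100 M mapped) with a lane split in two (~50 M mapped)
for depth in (100_000_000, 50_000_000):
    print(f"{depth:>11,} mapped -> {expected_fragments(1.0, depth):.0f} fragments at FPKM 1")
```

    Longer transcripts collect proportionally more fragments at the same FPKM, so the 50-fragment figure only applies to a transcript of about 1 kb.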

    Someone else seemed to wonder this as well:
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
    Last edited by billstevens; 04-06-2012, 11:22 AM.
