  • normalizing RNA-seq data to "unique transcript length" instead of "transcript length"

    I've clustered expression profiles from 20 experiments for a large group of highly related genes. I have the raw read counts and normalized this data using [total read counts uniquely matching to a gene]/([total counts in experiment]*[length of transcript]). However, because these genes have a large amount of non-unique sequence, I don't think that this method is correct. I'd like to try normalizing the expression data based on [length of unique k-mers within transcript] rather than [length of transcript]. Is there an existing tool that can calculate this?
    Thanks in advance for any help!
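
For reference, the two normalizations being compared can be written out as below. This is only a sketch of the question as posed: the symbols are introduced here for illustration, and the 10^9 factor is the conventional RPKM scaling (per kilobase, per million reads), which the post itself does not mention.

```latex
% C_g : reads uniquely matching gene g
% N   : total read count in the experiment
% L_g : full transcript length of gene g, in bases
% U_g : number of unique k-mer positions within the transcript of gene g
\mathrm{RPKM}_g = \frac{10^{9}\, C_g}{N \cdot L_g}
\qquad
\mathrm{RPKM}^{\mathrm{unique}}_g = \frac{10^{9}\, C_g}{N \cdot U_g}
```

Because U_g is at most L_g, the second form can only raise the value for genes that share sequence with their relatives, which is the correction being asked about.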

  • #2
    I'm not particularly impressed with the RPKM measure either, as it is still biased towards long transcripts (Oshlack and Wakefield, Biology Direct 2009). The method found in this paper (doi:10.1093/bioinformatics/btp692) seems to be a more intelligent way of addressing this issue, though I haven't yet tested it out. I'm not sure that [length of unique k-mers within transcript] will be any better than [length of transcript] at eliminating the bias you think is there.
    If you're set on doing it, though, I think you'll have to roll your own script to determine the length of unique k-mers, which may get to be fairly computationally intensive depending on how you go about doing that.
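
A minimal sketch of such a script, in Python, might look like the following. It rests on assumptions not stated in the thread: the "length of unique k-mers" is taken to mean the number of k-mer start positions in a transcript whose k-mer occurs exactly once across the whole transcript set, reverse complements are ignored, and the function name `unique_kmer_lengths` and the toy sequences are hypothetical.

```python
from collections import Counter


def unique_kmer_lengths(transcripts, k):
    """Count, per transcript, the k-mer start positions whose k-mer
    occurs exactly once across all transcripts (assumed definition)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")

    # First pass: count every k-mer occurrence across the whole set,
    # including repeats within the same transcript.
    counts = Counter()
    for seq in transcripts.values():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1

    # Second pass: for each transcript, count positions with a unique k-mer.
    unique = {}
    for name, seq in transcripts.items():
        seq = seq.upper()
        unique[name] = sum(
            1 for i in range(len(seq) - k + 1) if counts[seq[i:i + k]] == 1
        )
    return unique


if __name__ == "__main__":
    # Toy example: two related sequences sharing a common prefix.
    toy = {"geneA": "ACGTACGTTT", "geneB": "ACGTACGAAA"}
    print(unique_kmer_lengths(toy, k=4))
```

Holding every k-mer in memory is what makes this approach expensive on a full transcriptome; for larger inputs a dedicated k-mer counter such as Jellyfish would probably scale better, with k chosen to match the read length.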

    • #3
      Originally posted by lmc
      I've clustered expression profiles from 20 experiments for a large group of highly related genes. I have the raw read counts and normalized this data using [total read counts uniquely matching to a gene]/([total counts in experiment]*[length of transcript]). However, because these genes have a large amount of non-unique sequence, I don't think that this method is correct. I'd like to try normalizing the expression data based on [length of unique k-mers within transcript] rather than [length of transcript]. Is there an existing tool that can calculate this?
      Thanks in advance for any help!

      Although it may not remove all the biases, subtracting the non-unique k-mers from the total transcript length before normalization makes sense. You can find some precomputed data for this purpose in the "mappability" tracks of the UCSC Genome Browser.
      Another thing: because read coverage is biased at the ends of transcripts, it is common to disregard the initial and terminal exons and/or the UTRs.
      good luck,
      s.
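
The subtraction described in this reply can be sketched as follows. This is an illustration under stated assumptions, not a tested pipeline: the function name `normalize_with_mappable_length`, the input dictionaries, and the 1e9 RPKM-style scaling are all hypothetical choices, and the non-unique position counts are assumed to come from elsewhere (for example, a mappability track or a k-mer script).

```python
def normalize_with_mappable_length(read_counts, total_reads, lengths, non_unique):
    """RPKM-style normalization using (transcript length - non-unique positions)
    as the effective length. Genes with no unique sequence get None."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")

    normalized = {}
    for gene, count in read_counts.items():
        # Effective length: positions that a read can match uniquely.
        effective = lengths[gene] - non_unique.get(gene, 0)
        if effective <= 0:
            # Nothing unique to count reads against; leave undefined.
            normalized[gene] = None
            continue
        normalized[gene] = count * 1e9 / (total_reads * effective)
    return normalized


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    counts = {"geneA": 200, "geneB": 300, "geneC": 50}
    lengths = {"geneA": 1000, "geneB": 2000, "geneC": 800}
    non_unique = {"geneA": 600, "geneB": 500, "geneC": 800}
    print(normalize_with_mappable_length(counts, 1_000_000, lengths, non_unique))
```

Returning None for a gene with no unique sequence is a deliberate choice here: dividing by a zero or near-zero effective length would otherwise produce meaningless, inflated values for the most repetitive family members.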
