  • comparing Bowtie/DESeq and Top-Hat/Cufflinks results

    Dear all, I have 2 RNA-seq libraries (40 bp single-end) and a genome annotation, and I am interested in differential expression.

    I ran:
    1- Bowtie/DESeq at the gene level
    2- TopHat/Cufflinks at the transcript level.

    I got very different results (not only in terms of quantification, but also in the "direction" of changes); see the example below. I was expecting differences, but not this large.

    Which method do you think best suits the type of data I have?
    Is it appropriate to run TopHat with 40 bp single-end reads?
    The mean number of reads given by DESeq does not account for transcript length. Would this prevent comparison of transcript quantification levels within a library?

    thanks in advance for any reply,
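
On the transcript-length question: read counts grow with transcript length, so counts for different genes within one library are not directly comparable until they are scaled by length. A minimal sketch of a reads-per-kilobase (RPK) conversion, using three single-isoform rows from the Bowtie-DESeq table below. The function name is illustrative, and genes with several isoforms (5 and 10) are left out because the choice of length for a gene-level count is itself an assumption.

```python
def reads_per_kilobase(count, length_bp):
    """Scale a read count by transcript length expressed in kilobases."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return count / (length_bp / 1000.0)

# gene: (transcript length in bp, DESeq mean reads in condition A)
rows = {
    "1": (1590, 284.9435568),
    "4": (2034, 271.511874),
    "8": (1809, 303.1722692),
}

rpk = {gene: reads_per_kilobase(count, length) for gene, (length, count) in rows.items()}
for gene, value in sorted(rpk.items()):
    print(gene, round(value, 1))
```

RPK only corrects for length within a library; comparing across libraries still needs a library-size normalization such as the one DESeq applies.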


    Bowtie (raw number of reads)
    gene  transcript  length  conditionA  conditionB
    1     1           1590    297         242
    2     2           198     0           0
    3     3           2048    383         500
    4     4           2034    283         109
    5     5-a         788     86          137
    5     5-b         1268
    6     6           2087    303         640
    7     7           1656    0           0
    8     8           1809    316         335
    9     9           761     0           0
    10    10-a        735     658         386
    10    10-b        524

    TopHat-Cufflinks (FPKM)
    gene  transcript  length  FPKM-A    FPKM-B   ln(fold_ch) AvB
    1     1           1590    20.526    11.7229  0.560149
    2     2           198     45.8285   0        1.79769e+308
    3     3           2048    17.5533   9.35482  0.62935
    4     4           2034    28.2751   9.71151  1.06867
    5     5-a         788     6.67631   1.6504   1.39755
    5     5-b         1268    32.5      4.01143  2.09209
    6     6           2087    53.4758   3.36856  2.76474
    7     7           1656    0.110199  0        1.79769e+308
    8     8           1809    16.365    15.5165  0.0532368
    9     9           761     2.85777   0        1.79769e+308
    10    10-a        735     6.11169   3.07078  0.688272
    10    10-b        524     818.778   1315.66  -0.474281

    Bowtie-DESeq (mean reads)
    gene  transcript  length  MeanReadsA   MeanReadsB   Log2FC AvB
    1     1           1590    284.9435568  252.2394288  -0.175228351
    2     2           198     0            0            0
    3     3           2048    367.4524655  521.1558445  0.503001955
    4     4           2034    271.511874   113.6119741  -1.249561315
    5     5-a         788     82.50890871  142.7967014  0.784028564
    5     5-b         1268
    6     6           2087    290.6999923  667.079481   1.195534401
    7     7           1656    0            0            0
    8     8           1809    303.1722692  349.1744158  0.203185051
    9     9           761     0            0            0
    10    10-a        735     631.2890922  402.332312   -0.648615343
    10    10-b        524
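
Part of the apparent disagreement in direction may come from the two fold-change columns using different conventions. Going by the numbers above, the Cufflinks column matches ln(FPKM-A / FPKM-B) (gene 1: ln(20.526 / 11.7229) is about 0.560), while the DESeq column is close to log2(B / A) (gene 1: log2(252.24 / 284.94) is about -0.176). The sketch below puts the Cufflinks values on the DESeq scale before comparing signs. The function name and the 1e308 cutoff used to catch the 1.79769e+308 placeholder (which appears where FPKM-B is 0) are assumptions made for illustration.

```python
import math

SENTINEL_CUTOFF = 1e308  # assumed threshold for the 1.79769e+308 placeholder

def cufflinks_ln_to_log2_b_over_a(ln_a_over_b):
    """Convert ln(A/B) to log2(B/A); return None for the placeholder value."""
    if abs(ln_a_over_b) >= SENTINEL_CUTOFF:
        return None  # one FPKM is 0, so the ratio is undefined
    return -ln_a_over_b / math.log(2)

# transcript: (Cufflinks ln(fold_ch), DESeq Log2FC), copied from the tables
pairs = {
    "1": (0.560149, -0.175228351),
    "2": (1.79769e+308, 0.0),
    "3": (0.62935, 0.503001955),
    "4": (1.06867, -1.249561315),
    "6": (2.76474, 1.195534401),
    "8": (0.0532368, 0.203185051),
    "10-a": (0.688272, -0.648615343),
}

same_direction = {}
for name, (cuff_ln, deseq_log2) in pairs.items():
    converted = cufflinks_ln_to_log2_b_over_a(cuff_ln)
    if converted is None or deseq_log2 == 0:
        same_direction[name] = None  # no direction to compare
    else:
        same_direction[name] = (converted > 0) == (deseq_log2 > 0)
    print(name, converted, deseq_log2, same_direction[name])
```

After this conversion, transcripts 1, 4 and 10-a point the same way in both analyses, while 3, 6 and 8 still do not, so the sign convention would explain only part of the discrepancy.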

  • #2
    I don't get why you are using Bowtie here for the gene level analysis. Why not calculate read counts for the TopHat output and feed into DESeq? With Bowtie, you will miss splice junction spanning alignments. This is one reason why the two cases may not be comparable.

    • #3
      Thank you, Kopi-o, for your reply. I have 40 bp single-end reads. I used Bowtie because I am not sure that this type of read will be OK with TopHat. Will it? Thanks

      • #4
        I think that should be OK. To be honest, I haven't run TopHat on reads shorter than 1x50 bp, but those worked fine.

        • #5
          Originally posted by maryb:
          "Thank you, Kopi-o, for your reply. I have 40 bp single-end reads. I used Bowtie because I am not sure that this type of read will be OK with TopHat. Will it? Thanks"
          You can use TopHat with these reads, but be sure to change the --segment-length option to 20. With the default of 25, you'll have issues with reads that short.
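
A rough sketch of how such a call might be assembled from Python. `--segment-length` and `-o` are TopHat options; the index prefix, reads file and output directory are placeholder names, and the command is only built and printed here, not run.

```python
import shlex

def build_tophat_command(index_prefix, reads_fastq, out_dir, segment_length=20):
    """Return the argv list for a single-end TopHat run on short reads."""
    return [
        "tophat",
        "--segment-length", str(segment_length),  # 20 instead of the default 25
        "-o", out_dir,
        index_prefix,
        reads_fastq,
    ]

# Placeholder paths, not taken from the thread
cmd = build_tophat_command("genome_index", "conditionA.fastq", "tophat_conditionA")
print(" ".join(shlex.quote(part) for part in cmd))
```

If TopHat is installed, the list can be handed to `subprocess.run(cmd, check=True)`; keeping it as a list avoids shell-quoting problems with file names.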

          If you want an opinion on which method for DE analysis is better, you should settle in for a long read on this board. People debate it frequently. Personally, I see no clear winner that will work for every situation and generally end up using Cufflinks, DESeq, and RSEM. Whatever pops up consistently is what I take on for further analysis.
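
Taking "whatever pops up consistently" amounts to intersecting the significant-gene lists from each tool. A minimal sketch with made-up gene IDs (the lists are hypothetical, not results from this thread, and the function name is illustrative):

```python
def consensus_genes(calls_by_tool, min_tools=None):
    """Return genes called significant by at least min_tools tools (default: all)."""
    if min_tools is None:
        min_tools = len(calls_by_tool)
    counts = {}
    for genes in calls_by_tool.values():
        for gene in set(genes):  # count each gene once per tool
            counts[gene] = counts.get(gene, 0) + 1
    return sorted(g for g, n in counts.items() if n >= min_tools)

# Hypothetical significant-gene lists, one per tool
calls = {
    "cufflinks": ["geneA", "geneB", "geneC"],
    "deseq": ["geneB", "geneC", "geneD"],
    "rsem": ["geneC", "geneB", "geneD"],
}

print(consensus_genes(calls))               # genes reported by all three tools
print(consensus_genes(calls, min_tools=2))  # genes reported by at least two
```

Requiring all tools to agree is the strictest choice; relaxing `min_tools` trades specificity for a longer candidate list.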

          • #6
            thank you very much
