  • TOPHAT2 vs RNA-STAR 2X the mapped %.

    Hello all. I have been working with some very clean RNA-seq data generated from Drosophila and have come across a few issues. First of all, I am using our local install of Galaxy. Forgive me, but I am only a part-time bioinformaticist! That being said, I have used this interface for about five years now, so I am comfortable with it. With TopHat2 I can map the reads and generate differential gene expression output using the Bowtie/Cufflinks/Cuffmerge/Cuffdiff pipeline, but the mapping rate is VERY low (about 50-60% for any of the fastqsanger files). I am only using clean data trimmed to 200 bp or less (Ion Proton) with a median quality score >20. When I use RNA-STAR instead, I get multiple issues downstream in the pipeline (problems with Cufflinks that end up being problems in Cuffdiff, which I don't want to get into here). What I want to know is: why do I get >80% mapped reads (and higher) with RNA-STAR, but only about 50% mapped reads with TopHat2? Is there some reason for this?

    Thanks in advance. Working through this....

  • #2
    TopHat is not very tolerant of errors in data. I'd recommend avoiding all of the Tuxedo pipeline where possible; I've always found it to be slow and unstable. DESeq and edgeR seem to give more accurate results, anyway.

    • #3
      Brian - I am just talking about mapping right now. Even if I switched to DESeq or edgeR (which I could do on our local install), I still need a gapped aligner to map the sequence with first. Do you use RNA-STAR?

      • #4
        Here are the actual numbers on the same fastqsanger file (trimmed to 200bp QS>20).

        Tophat2:
        Reads:
        Input: 39248980
        Mapped: 20971700 (53.4% of input)
        of these: 3246961 (15.5%) have multiple alignments (5659 have >20)
        53.4% overall read alignment rate.

        RNA-STAR:
        Started job on | Feb 15 12:26:35
        Started mapping on | Feb 15 12:26:38
        Finished on | Feb 15 12:35:13
        Mapping speed, Million of reads per hour | 274.36

        Number of input reads | 39248980
        Average input read length | 166
        UNIQUE READS:
        Uniquely mapped reads number | 32172637
        Uniquely mapped reads % | 81.97%
        Average mapped length | 164.84
        Number of splices: Total | 7873494
        Number of splices: Annotated (sjdb) | 0
        Number of splices: GT/AG | 7824667
        Number of splices: GC/AG | 46067
        Number of splices: AT/AC | 2760
        Number of splices: Non-canonical | 0
        Mismatch rate per base, % | 0.47%
        Deletion rate per base | 0.14%
        Deletion average length | 1.17
        Insertion rate per base | 0.24%
        Insertion average length | 1.15
        MULTI-MAPPING READS:
        Number of reads mapped to multiple loci | 5162521
        % of reads mapped to multiple loci | 13.15%
        Number of reads mapped to too many loci | 59135
        % of reads mapped to too many loci | 0.15%
        UNMAPPED READS:
        % of reads unmapped: too many mismatches | 0.00%
        % of reads unmapped: too short | 4.42%
        % of reads unmapped: other | 0.30%
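
A note on comparing the two summaries above: the TopHat2 "Mapped" figure includes reads with multiple alignments, while the 81.97% reported by STAR counts uniquely mapped reads only. The sketch below recomputes the rates from the counts posted above so the two tools are compared on the same basis. The variable names are illustrative, and it assumes both tools were given the same 39248980 input reads, as the logs indicate.

```python
# Counts copied from the TopHat2 and RNA-STAR summaries in this post.
input_reads = 39_248_980

tophat_mapped = 20_971_700   # TopHat2 "Mapped" (includes multi-mapped reads)
star_unique = 32_172_637     # STAR "Uniquely mapped reads number"
star_multi = 5_162_521       # STAR "Number of reads mapped to multiple loci"


def pct(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


tophat_rate = pct(tophat_mapped, input_reads)
star_unique_rate = pct(star_unique, input_reads)
star_multi_rate = pct(star_multi, input_reads)
# Unique plus multi-mapped is the figure comparable to TopHat2's "Mapped".
star_total_rate = pct(star_unique + star_multi, input_reads)

print(f"TopHat2 mapped (unique + multi): {tophat_rate:.2f}%")
print(f"STAR uniquely mapped:            {star_unique_rate:.2f}%")
print(f"STAR multi-mapped:               {star_multi_rate:.2f}%")
print(f"STAR mapped (unique + multi):    {star_total_rate:.2f}%")
```

On a like-for-like basis the gap is therefore larger than the headline 53% vs 82% suggests, since STAR's unique and multi-mapped reads together account for roughly 95% of the input.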

        • #5
          Originally posted by tonup69
          Brian - I am just talking about mapping right now. Even if I switched to DESeq or edgeR (which I could do on our local install), I still need a gapped aligner to map the sequence with first. Do you use RNA-STAR?

          Nope, I use BBMap. I have heard good things about STAR, but I've never benchmarked it.

          • #6
            Hi Tonup69,

            check your alignments with rseqc, especially the clipping profile will be interesting.
            Maybe your alignment rate with STAR is very good, but the alignment-length might be quite short...
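
For readers without RSeQC available, here is a rough sketch of what a clipping check measures: the share of each read's bases that the aligner soft-clipped (CIGAR operation `S`) rather than aligned. This is not RSeQC's implementation. The function name and the example CIGAR strings are made up for illustration, and in practice the CIGAR strings would come from column 6 of the SAM output.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_fraction(cigar):
    """Return the fraction of read bases that are soft-clipped (S)."""
    ops = CIGAR_OP.findall(cigar)
    # Operations that consume bases stored in the read: M, I, S, =, X.
    read_len = sum(int(n) for n, op in ops if op in "MIS=X")
    clipped = sum(int(n) for n, op in ops if op == "S")
    return clipped / read_len if read_len else 0.0


# Hypothetical CIGAR strings for 166 bp reads (not taken from the data above).
example_cigars = ["166M", "20S146M", "50M1200N100M16S"]
for cigar in example_cigars:
    print(cigar, f"{clipped_fraction(cigar):.3f}")
```

A high clipped fraction across many reads would mean the mapping rate overstates how much of each read actually aligned. The STAR log above already gives a hint in the other direction: the average mapped length (164.84) is close to the average input read length (166) for uniquely mapped reads.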

            • #7
              Well, it's a nice thought, but we don't have either of those wrappers on our local install and I am not the admin for the box. I will bring it up, but I am limited to what we have on our server.
