
  • Different results produced by tophat?

    I use TopHat to map my strand-specific RNA-seq reads, and separately their reverse complements, to the genome, but the two results differ slightly.

    For example, one sample has 5,700,000 reads after sequencing, and TopHat maps 5,000,000 of them to the genome. For the reverse-complemented reads, about 4,998,000 are mapped.

    The command I used:
    $ tophat -o output1 --segment-mismatches 0 genome_index RNA-seq.fa

    Then I compared the mapped reads in the two results. 50,000 reads in result A (5,000,000 mapped, original strand) are not present in result B (4,998,000 mapped, reverse-complemented), and another 50,000 reads are present in result B but not in result A.
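
    A minimal sketch of this kind of comparison, assuming both results are available as SAM text and that each reverse-complemented read keeps the name of its original read. The function names are only illustrative, and the example records are made up to show the logic.

```python
def mapped_read_ids(sam_lines):
    """Collect the names of reads that have at least one mapped alignment."""
    ids = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:               # skip blank or malformed lines
            continue
        flag = int(fields[1])
        if flag & 0x4:                    # SAM flag 0x4: read is unmapped
            continue
        ids.add(fields[0])
    return ids


def compare_mapped(sam_a, sam_b):
    """Return (only in A, only in B, in both) as sets of read names."""
    a = mapped_read_ids(sam_a)
    b = mapped_read_ids(sam_b)
    return a - b, b - a, a & b


# Tiny made-up example: read3 is unmapped in A but mapped in B.
result_a = [
    "@HD\tVN:1.0\n",
    "read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\t*\n",
    "read2\t16\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\t*\n",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*\n",
]
result_b = [
    "read2\t0\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\t*\n",
    "read3\t0\tchr2\t300\t255\t4M\t*\t0\t0\tACGT\t*\n",
]
only_a, only_b, shared = compare_mapped(result_a, result_b)
print(sorted(only_a), sorted(only_b), sorted(shared))
```

    For real data, the two lists could be replaced by open file handles on the SAM files from each run.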

    Can anyone tell me why the results are different? Thanks.

  • #2
    I think it's probably due to quality values: the quality of each base usually decreases from left to right along a read. So if we reverse-complement a read, it will have low quality values in its first several bases, and it is therefore less likely to be mapped to the genome. You could check this by running Bowtie on the original reads and on the reverse-complemented reads, since TopHat is based on Bowtie.
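
    To make the point concrete, here is a small sketch of what reverse-complementing a read does to its per-base qualities. The helper name and the quality numbers are made up for illustration.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_read(seq, quals):
    """Reverse-complement the bases and reverse the per-base qualities."""
    return seq.translate(_COMPLEMENT)[::-1], quals[::-1]


seq = "AACCGGTN"
quals = [40, 38, 36, 30, 25, 20, 10, 2]   # quality drops toward the 3' end

rc_seq, rc_quals = reverse_complement_read(seq, quals)
print(rc_seq)     # NACCGGTT
print(rc_quals)   # [2, 10, 20, 25, 30, 36, 38, 40]
```

    After the operation the lowest-quality bases sit at the start of the read, which is where an aligner that seeds from the left end would look first.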

    As I understand it, Bowtie allows 0 to 3 mismatches in the seed of a read (the first 28 bp by default) and allows additional mismatches in the rest of the read depending on the sum of the quality values at the mismatched positions.
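
    A rough sketch of that policy, not Bowtie's actual code. The function name is hypothetical, the defaults (2 seed mismatches, 28 bp seed, quality sum of at most 70) are what I understand Bowtie's -n, -l and -e defaults to be, and any quality rounding or other details of the real aligner are ignored.

```python
def passes_seed_policy(read, ref, quals,
                       max_seed_mm=2, seed_len=28, max_qual_sum=70):
    """Simplified check of a seed-mismatch policy for an ungapped alignment.

    read, ref : equal-length strings, already aligned position by position
    quals     : Phred quality of each read base
    """
    if not (len(read) == len(ref) == len(quals)):
        raise ValueError("read, ref and quals must have the same length")
    mismatches = [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]
    seed_mm = sum(1 for i in mismatches if i < seed_len)
    qual_sum = sum(quals[i] for i in mismatches)   # over ALL mismatches
    return seed_mm <= max_seed_mm and qual_sum <= max_qual_sum


# Made-up 36 bp read with three low-quality mismatches near its 3' end.
ref = "ACGT" * 9
read_bases = list(ref)
for pos in (29, 31, 33):
    read_bases[pos] = "A"          # reference has C, T, C at these positions
read = "".join(read_bases)
quals = [40] * 28 + [20] * 8

forward_ok = passes_seed_policy(read, ref, quals)
# Reversing read, reference and qualities mimics aligning the
# reverse-complemented read: complementing both sequences does not change
# which positions mismatch, it only mirrors them into the seed.
reverse_ok = passes_seed_policy(read[::-1], ref[::-1], quals[::-1])
print(forward_ok, reverse_ok)      # True False
```

    In this toy case the same three mismatches are tolerated outside the seed but rejected once they fall inside it.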


    As for the 50,000-read difference in each direction between results A and B: TopHat divides unmapped reads into several segments to find novel junctions. Depending on the read length and the segment length, the set of segments from the original reads can differ from the set from the reverse-complemented reads, which yields a different set of putative junctions.
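
    The following sketch illustrates the idea only. It assumes reads are cut into fixed-length segments from the left end, with a shorter leftover piece at the right; how TopHat really handles the leftover is not something I am describing here. The function names are hypothetical, and 25 is used as an assumed segment length.

```python
def segment_bounds(read_len, seg_len=25):
    """Half-open (start, end) intervals from cutting a read left to right."""
    return [(s, min(s + seg_len, read_len))
            for s in range(0, read_len, seg_len)]


def segment_bounds_after_revcomp(read_len, seg_len=25):
    """Segments of the reverse-complemented read, expressed in the
    coordinates of the original read so the two cuts can be compared."""
    return sorted((read_len - e, read_len - s)
                  for s, e in segment_bounds(read_len, seg_len))


for length in (50, 60):
    fwd = segment_bounds(length)
    rev = segment_bounds_after_revcomp(length)
    print(length, fwd, rev, fwd == rev)
# 50 [(0, 25), (25, 50)] [(0, 25), (25, 50)] True
# 60 [(0, 25), (25, 50), (50, 60)] [(0, 10), (10, 35), (35, 60)] False
```

    When the read length is a multiple of the segment length the two cuts coincide; otherwise the segment boundaries shift, so different pieces of the read are searched against the genome.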

    Other than this, the new version of TopHat at <http://tophat.cbcb.umd.edu/index.html> supports strand-specific RNA-Seq if you want to try it.

    Thanks,
    Daehwan
    Last edited by Daehwan; 10-27-2010, 10:06 AM.



    • #3
      Thank you Daehwan.

      Contamination had already been removed before I mapped the reads to the genome with TopHat, so I use FASTA as the input to TopHat. I then used Bowtie to map the forward and reverse-complemented reads to the genome, and the results are the same.

      After checking the CIGAR field of the SAM file, I think you are right. All of the reads that differ between the forward and reverse results are unmapped reads. When I set the parameter -a to 4, the differences between the results were reduced.
      Originally posted by Daehwan:
      Regarding each 50,000 seq difference between the results A and B, TopHat divides unmapped reads into several segments to find novel junctions, and a set of segments from the original reads can be different from that of segments from the reverse-complemented reads depending on the read length and the segment length, rendering different set of putative junctions.
      I will also try the new version of TopHat. Thanks again.
