  • Tophat + htseq_count

    Hello, I'm performing RNAseq analyses and I've stumbled upon some puzzling results.
    I aligned some data with tophat2 (default settings). Since the results were disappointing (only about 5% of reads properly paired), I changed the -r and --mate-std-dev parameters and got to 60% (I know, still not very high). I then ran htseq_count on both sets of BAM alignments, and comparing the two results I see no differences.
    Am I missing something? Does htseq_count use the information about properly paired reads or not? From these results I am inclined to say no; I will check the code...
    Last edited by EGrassi; 01-14-2013, 05:20 AM.

  • #2
    A quick check of the htseq_count code tells me that it never uses the reads' "mate_aligned" attribute and simply counts all of the paired reads. Am I the only one who finds this behaviour strange? I don't see any check on whether the two reads fall at a sensible distance from each other to be reliably included in the counts.

    • #3
      The "mate_aligned" bit in the FLAG field indicates, in my reading of the SAM spec, that an alignment for the mate is given in the SAM file, not that this alignment is considered plausible. If TopHat really changes the mate_aligned field according to the distance, I'd consider this a very odd behaviour. In my opinion, it should set the alignment quality (5th field in the SAM file) to a low value to indicate that an alignment is reported but should not be trusted.
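To make the distinction between the FLAG bits concrete, here is a minimal sketch that decodes the pairing-related bits of a FLAG value. The bit values come from the SAM specification; the function name `describe_flag` and the dictionary keys are chosen for illustration only and are not HTSeq's API.

```python
# Bit values from the SAM specification's FLAG field.
FLAG_PAIRED = 0x1         # template has multiple segments (read is paired)
FLAG_PROPER_PAIR = 0x2    # aligner considers the pair properly aligned
FLAG_UNMAPPED = 0x4       # this read is unmapped
FLAG_MATE_UNMAPPED = 0x8  # the mate is unmapped


def describe_flag(flag):
    """Return the pairing-related bits of a SAM FLAG as a dict of booleans.

    Hypothetical helper for illustration; not part of HTSeq or samtools.
    """
    return {
        "paired": bool(flag & FLAG_PAIRED),
        "proper_pair": bool(flag & FLAG_PROPER_PAIR),
        "aligned": not (flag & FLAG_UNMAPPED),
        "mate_aligned": not (flag & FLAG_MATE_UNMAPPED),
    }


# 99 = 0x1 + 0x2 + 0x20 + 0x40: paired, proper pair, mate reversed, first in pair
print(describe_flag(99))
# 97 = 0x1 + 0x20 + 0x40: the same read, but not marked as a proper pair
print(describe_flag(97))
```

In both examples the mate counts as aligned; only the proper-pair bit differs. So a change in the proper-pair percentage does not imply any change in the mate-aligned bit.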

      htseq-count, by the way, filters by the alignment quality only if you use the -a option. I guess I should change this to be the default.
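As a rough illustration of what filtering on alignment quality amounts to, the sketch below keeps only SAM records whose MAPQ (5th column) reaches a threshold. It works on plain-text SAM lines and is not the htseq-count implementation; the function name `filter_by_mapq`, the threshold of 10, and the two records are made up for the example.

```python
def filter_by_mapq(sam_lines, min_mapq=10):
    """Yield SAM alignment lines whose MAPQ (5th column) is at least min_mapq.

    Hypothetical helper; the default threshold of 10 is an arbitrary choice.
    """
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) >= min_mapq:
            yield line


# Two made-up records, for illustration only.
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t99\tchr1\t100\t50\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII",
    "read2\t99\tchr1\t300\t1\t10M\t=\t5000\t4710\tACGTACGTAC\tIIIIIIIIII",
]
kept = list(filter_by_mapq(example))
print([line.split("\t")[0] for line in kept])
```

With a threshold of 0 nothing is filtered out, which is why low-quality alignments are counted unless a threshold is set.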

      • #4
        Originally posted by Simon Anders
        The "mate_aligned" bit in the FLAG field indicates, in my reading of the SAM spec, that an alignment for the mate is given in the SAM file, not that this alignment is considered plausible. If TopHat really changes the mate_aligned field according to the distance, I'd consider this a very odd behaviour. In my opinion, it should set the alignment quality (5th field in the SAM file) to a low value to indicate that an alignment is reported but should not be trusted.
        Since the samtools flagstat percentage of properly paired reads on accepted_hits changed when I set the -r tophat parameter, I assumed that the reads not reported as properly paired were present in the SAM file but should not be counted as aligned in the analyses.
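One way to see what actually differs between two alignment runs is to tally the proper-pair bit directly from the FLAG column. This is a rough sketch on plain-text SAM lines; the function name `proper_pair_fraction` and the records are invented for the example, and the result is not guaranteed to match samtools flagstat's exact counting rules.

```python
def proper_pair_fraction(sam_lines):
    """Fraction of paired, mapped reads that have the proper-pair bit (0x2) set.

    Hypothetical helper for illustration; not how samtools or HTSeq count.
    """
    paired = 0
    proper = 0
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        # Consider only reads that are paired (0x1) and mapped (0x4 not set).
        if (flag & 0x1) and not (flag & 0x4):
            paired += 1
            if flag & 0x2:
                proper += 1
    return proper / paired if paired else 0.0


# Made-up records: one proper pair (99/147) and one improper pair (97/145).
records = [
    "r1\t99\tchr1\t100\t50\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII",
    "r1\t147\tchr1\t200\t50\t10M\t=\t100\t-110\tACGTACGTAC\tIIIIIIIIII",
    "r2\t97\tchr1\t300\t50\t10M\t=\t9000\t8710\tACGTACGTAC\tIIIIIIIIII",
    "r2\t145\tchr1\t9000\t50\t10M\t=\t300\t-8710\tACGTACGTAC\tIIIIIIIIII",
]
print(proper_pair_fraction(records))
```

All four records are still reported as aligned with an aligned mate, so a counter that ignores the 0x2 bit would treat both pairs the same way.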

        (Filtering on quality only when an option is given is fine in my opinion, by the way.)
