  • Tophat + htseq_count

    Hello, I'm performing RNAseq analyses and I've stumbled upon some puzzling results.
    I aligned some data with tophat2 (default settings) and, since the results were disappointing (only about 5% of reads properly paired), I changed the -r and --mate-std-dev parameters and got to 60% (I know, still not very high). I then ran htseq_count on both sets of BAM alignments, and comparing the two outputs I see no differences.
    Am I missing something? Does htseq_count use the information about properly paired reads or not? From these results I am inclined to say no. I will check the code...
    Last edited by EGrassi; 01-14-2013, 05:20 AM.

  • #2
    A quick check of the htseq_count code tells me that it never uses the read's "mate_aligned" attribute and simply considers all of the paired reads. Am I the only one who finds this behaviour strange? I don't see a check anywhere on whether the two reads fall at a sensible distance from each other to be reliably included in the counts.
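    To make the kind of distance check meant here concrete, this is a minimal sketch in plain Python. It is not taken from htseq_count; the function name `pair_distance_ok` and the cutoff `max_tlen` are hypothetical, and it only reads the standard SAM columns (FLAG, RNAME, RNEXT, TLEN). For spliced RNA-seq alignments TLEN also spans introns, so any cutoff would be a rough guess.

```python
def pair_distance_ok(sam_line, max_tlen):
    """Return True if both mates are mapped to the same reference and the
    absolute template length (SAM column 9) is at most max_tlen.

    max_tlen is a hypothetical, user-chosen cutoff (assumption).
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname = fields[2]
    rnext = fields[6]          # "=" means the mate is on the same reference
    tlen = int(fields[8])

    paired = bool(flag & 0x1)         # read is paired in sequencing
    read_unmapped = bool(flag & 0x4)  # this read is unmapped
    mate_unmapped = bool(flag & 0x8)  # the mate is unmapped

    if not paired or read_unmapped or mate_unmapped:
        return False
    if rnext not in ("=", rname):
        return False
    return 0 < abs(tlen) <= max_tlen


# Two made-up alignment lines (SEQ and QUAL left as "*")
near = "r1\t99\tchr1\t100\t50\t50M\t=\t300\t250\t*\t*"
far = "r2\t65\tchr1\t100\t50\t50M\t=\t900000\t900050\t*\t*"

print(pair_distance_ok(near, 1000))
print(pair_distance_ok(far, 1000))
```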


    • #3
      The "mate_aligned" bit in the FLAG field indicates, in my reading of the SAM spec, that an alignment for the mate is given in the SAM file, not that this alignment is considered plausible. If TopHat really changes the mate_aligned field according to the distance, I'd consider this a very odd behaviour. In my opinion, it should set the alignment quality (5th field in the SAM file) to a low value to indicate that an alignment is reported but should not be trusted.

      htseq-count, by the way, filters by the alignment quality only if you use the -a option. I guess I should change this to be the default.
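      As an illustration of the two fields discussed above, here is a small sketch in plain Python that decodes the pairing-related FLAG bits as defined in the SAM spec and applies a minimum-quality cutoff to the 5th column. It is not the htseq-count implementation; `decode_flag`, `passes_mapq` and `min_mapq` are names made up for this example.

```python
# Pairing-related FLAG bits from the SAM specification
FLAG_BITS = {
    0x1: "paired",         # template has multiple segments
    0x2: "proper_pair",    # each segment properly aligned, per the aligner
    0x4: "unmapped",       # this segment is unmapped
    0x8: "mate_unmapped",  # next segment in the template is unmapped
}


def decode_flag(flag):
    """Map each pairing-related bit name to True/False for a FLAG value."""
    return {name: bool(flag & bit) for bit, name in FLAG_BITS.items()}


def passes_mapq(sam_line, min_mapq):
    """Return True if MAPQ (SAM column 5) is at least min_mapq.

    min_mapq is a hypothetical cutoff, analogous in spirit to a
    minimum alignment quality option.
    """
    mapq = int(sam_line.split("\t")[4])
    return mapq >= min_mapq


# A made-up alignment: paired, mate mapped, but not flagged as a proper pair
line = "r2\t65\tchr1\t100\t3\t50M\t=\t900000\t900050\t*\t*"
print(decode_flag(int(line.split("\t")[1])))
print(passes_mapq(line, 10))
```

      In this made-up record the mate is mapped (0x8 not set) while the proper-pair bit (0x2) is unset, which shows that the two bits carry different information.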


      • #4
        Originally posted by Simon Anders
        The "mate_aligned" bit in the FLAG field indicates, in my reading of the SAM spec, that an alignment for the mate is given in the SAM file, not that this alignment is considered plausible. If TopHat really changes the mate_aligned field according to the distance, I'd consider this a very odd behaviour. In my opinion, it should set the alignment quality (5th field in the SAM file) to a low value to indicate that an alignment is reported but should not be trusted.
        Since the percentage of properly paired reads that samtools flagstat reports for accepted_hits changed when I set the tophat -r parameter, I assumed that the reads reported as not properly paired were present in the SAM file but should not be treated as aligned in the analyses.

        (By the way, filtering on quality only when an option is given is fine in my opinion.)
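        For reference, the percentage in question can be approximated directly from the FLAG column. The following is a rough sketch in plain Python, under the assumption that "properly paired" means bits 0x1 and 0x2 are set and 0x4 is not, counted over primary records only; the exact rules used by samtools flagstat may differ, and `proper_pair_fraction` is a name invented for this example.

```python
def proper_pair_fraction(sam_lines):
    """Fraction of paired primary records whose proper-pair bit (0x2) is set.

    An approximation for illustration, not the samtools flagstat logic.
    """
    paired = 0
    proper = 0
    for line in sam_lines:
        if line.startswith("@"):      # skip SAM header lines
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x100 or flag & 0x800:
            continue                  # skip secondary/supplementary records
        if flag & 0x1:                # paired in sequencing
            paired += 1
            if (flag & 0x2) and not (flag & 0x4):
                proper += 1
    return proper / paired if paired else 0.0


# Made-up records: one proper pair (99/147) and one improper pair (65/129)
records = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t99\tchr1\t100\t50\t50M\t=\t300\t250\t*\t*",
    "r1\t147\tchr1\t300\t50\t50M\t=\t100\t-250\t*\t*",
    "r2\t65\tchr1\t100\t50\t50M\t=\t900000\t900050\t*\t*",
    "r2\t129\tchr1\t900000\t50\t50M\t=\t100\t-900050\t*\t*",
]
print(proper_pair_fraction(records))
```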
