  • all_your_base
    Member
    • Mar 2012
    • 40

    HTSeq not working with Bowtie2 .SAM

    Hi all,

    I am having a weird problem using my Bowtie2 .SAM output with HTSeq to count reads that correspond to genes in a .gff file.

    Usually, I can just feed my Bowtie1 .SAM into HTSeq using the following command:

    htseq-count -m union -s no -t gene -i ID -o myOutput.sam myInput.sam organism.gff


    However, after switching to Bowtie2 and running the same command, I get gigabytes of this:


    Warning: Read HWI-ST1234:350WK3ACXX:6:1101:1780:2126/1 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
    Warning: Read HWI-ST1234:350WK3ACXX:6:1101:1780:2126/2 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
    Warning: Read HWI-ST1234:350WK3ACXX:6:1101:1671:2238/1 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
    Warning: Read HWI-ST1234:350WK3ACXX:6:1101:1671:2238/2 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
    Warning: Read HWI-ST1234:350WK3ACXX:6:1101:2011:2134/1 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
    Warning: Read HWI-ST1234:350WK3ACXX:6:1101:2011:2134/2 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)


    According to other forums, this usually happens when the SAM isn't sorted by read ID, so HTSeq can't find the two halves of a paired-end read. However, I tried sorting my SAM in multiple ways, such as:

    sort -k1 myfile.sam > myfile_sorted.sam


    I still get the same error! Any help or suggestions are greatly appreciated.
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    If any of those reads are multimapped, then using the command-line sort command will not do what you want. Use samtools sort -n instead.


    • all_your_base
      Member
      • Mar 2012
      • 40

      #3
      @dpryan,

      Thanks for the reply, but can you please explain your answer? How does the samtools sort command differ from Unix sort?

      Also, since Bowtie2 produces a SAM file by default, to use SAMtools sort, do I have to first convert to BAM, then sort, then convert back to SAM?

      Thanks...
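
For the SAM/BAM question above, a minimal sketch, assuming samtools 1.3 or later is installed (those versions read SAM directly and can write SAM via -O sam, so no round trip through BAM should be needed). The function name and file names are placeholders, not anything posted in this thread:

```shell
# Hypothetical wrapper; assumes samtools >= 1.3 is on PATH.
# $1 = input SAM (or BAM), $2 = output SAM sorted by read name.
name_sort_sam() {
  # -n sorts by read name (QNAME) rather than by coordinate;
  # -O sam writes SAM text so htseq-count can read it directly.
  samtools sort -n -O sam -o "$2" "$1"
}

# Usage (not run here):
#   name_sort_sam myInput.sam myInput.namesorted.sam
```

Unlike a plain text sort, this keeps the @ header lines at the top of the file and orders records by read name as samtools defines it.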


      • kmcarr
        Senior Member
        • May 2008
        • 1181

        #4
        The problem is the /1 and /2 in your read names. The SAM specification indicates that the names of paired reads be identical. SAM identifies read 1 or read 2 by the FLAG bits. Remove the /1 & /2 from the names in your SAM files and repeat your analysis.
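
One way to do the renaming suggested above, as a sketch rather than the method actually used in this thread: an awk filter that strips a trailing /1 or /2 from the first column (QNAME) and leaves @ header lines alone. The function name strip_mate_suffix is made up for illustration.

```shell
# Hypothetical helper: remove a trailing /1 or /2 from the read name
# (column 1) of every alignment line; header lines pass through unchanged.
strip_mate_suffix() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^@/ { print; next }                 # SAM header line, keep as-is
       { sub(/\/[12]$/, "", $1); print }    # alignment line, fix QNAME only
      ' "$@"
}

# Usage (file names are placeholders):
#   strip_mate_suffix myInput.sam > myInput.renamed.sam
```

Read 1 versus read 2 is still recoverable afterwards, because that information lives in the FLAG column rather than in the name.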


        • all_your_base
          Member
          • Mar 2012
          • 40

          #5
          @kmcarr

          Wonderful, I trimmed the /1 and /2 off my reads and made sure the mates were next to each other after sorting, and HTSeq runs fine without the previous error messages.

          Quick question...
          After processing a few thousand reads, HTSeq reports the following warnings:

          Warning: Malformed SAM line: MRNM != '*' although flag bit &0x0008 set
          Warning: Malformed SAM line: RNAME != '*' although flag bit &0x0004 set

          This is from raw Bowtie2 output; the only modifications were my /1 and /2 trimming and sorting.

          Anyone have an idea where these errors are coming from??

          Thanks!


          • dpryan
            Devon Ryan
            • Jul 2011
            • 3478

            #6
            RNAME and MRNM are the names of the chromosomes (or scaffolds or whatever) to which the current read and its mate (for MRNM) map. Since the flags indicate that the reads are unmapped, it's just complaining that there's stuff here instead of an *, meaning "Not available". I don't recall ever seeing that with bowtie2, only bwa. You can normally ignore such warnings.
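
To see which records trigger these warnings, a rough sketch (the function name is made up, and it assumes the standard SAM column order with FLAG in column 2, RNAME in column 3 and MRNM/RNEXT in column 7). It tests FLAG bits 0x4 and 0x8 with integer arithmetic so it does not depend on gawk's and():

```shell
# Hypothetical helper: list QNAME, FLAG, RNAME and MRNM for records where the
# FLAG marks the read (0x4) or its mate (0x8) as unmapped but the matching
# reference-name column is not "*".
find_flag_mismatches() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^@/ { next }                        # skip header lines
       {
         read_unmapped = int($2 / 4) % 2    # FLAG bit 0x4
         mate_unmapped = int($2 / 8) % 2    # FLAG bit 0x8
         if ((read_unmapped && $3 != "*") || (mate_unmapped && $7 != "*"))
           print $1, $2, $3, $7
       }' "$@"
}

# Usage (file name is a placeholder):
#   find_flag_mismatches myInput.renamed.sam | head
```

If the hits are all unmapped reads whose mate did align, that matches the SAM specification's recommendation to give an unmapped read its mate's RNAME and POS, which would be consistent with the advice that the warnings are harmless.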


            • all_your_base
              Member
              • Mar 2012
              • 40

              #7
              @dpryan,

              Thanks for all your help. My analysis is working well now.
