  • dharan
    Junior Member
    • Jan 2012
    • 7

    #16
    Originally posted by chadn737 View Post
    As an aside, the latest version of Tophat 2 no longer has the "*" qualities problems.
    The new TopHat version (v2.0.3) has now solved the problem it caused in HTSeq.

    Thanks a lot..!!

    Comment

    • phred
      Member
      • May 2012
      • 11

      #17
      I am having the same issue with HTSeq dealing with alignments from Tophat 2.

      I installed HTSeq/0.5.3p5 but the issue persists. The alignments were done using Tophat 2.0.0

      Code:
      Error occured in line 36 of file R13a_m_accepted_hits.sam.
      Error: ("'seq' and 'qualstr' do not have the same length.", 
      'line 36 of file R13a_m_accepted_hits.sam')
      [Exception type: ValueError, raised in _HTSeq.pyx:765]
      I would like to avoid having to re-align all of my samples with a newer version of Tophat. Any suggestions?
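The `ValueError` above is raised when a record's quality string cannot be paired with its read sequence, which is what happens when the aligner writes `*` in the QUAL column. A minimal Python sketch for locating such records before running htseq-count follows. `find_quality_mismatches` is a hypothetical helper, not part of HTSeq, and it assumes a plain tab-separated SAM text file.

```python
def find_quality_mismatches(sam_lines):
    """Yield (line_number, read_name, reason) for every alignment record
    whose QUAL column cannot be paired base-by-base with SEQ."""
    for line_number, line in enumerate(sam_lines, start=1):
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield line_number, fields[0], "fewer than 11 columns"
            continue
        seq, qual = fields[9], fields[10]
        if qual == "*":
            yield line_number, fields[0], "QUAL is '*'"
        elif seq != "*" and len(seq) != len(qual):
            yield line_number, fields[0], "SEQ and QUAL lengths differ"


# Made-up records for illustration only (not taken from the thread).
example = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "read1\t0\tchr1\t100\t50\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t0\tchr1\t200\t50\t4M\t*\t0\t0\tACGT\t*\n",
]
for problem in find_quality_mismatches(example):
    print(problem)
```

For a real file, passing an open file handle instead of the `example` list should work the same way, since the function only iterates over lines.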

      Incidentally, when I was checking the installation as suggested above, the following appears with the v0.5.3p5 of HTSeq:


      Code:
      >htseq-count
      ....
      
      >Released under the terms of the GNU General
      >Public License v3. Part of the 'HTSeq' framework, version 0.5.3p3.
      I presume the footnote was just not updated with the new release?
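One way to cross-check which HTSeq version Python actually imports, independent of the banner text printed by the script, is to ask the package itself. This is only a sketch: `installed_version` is a hypothetical helper, and it assumes the package exposes a `__version__` attribute, as many Python packages do.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, or None if the module
    cannot be imported or does not define __version__."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


print("HTSeq:", installed_version("HTSeq"))
```

If the printed version disagrees with the banner, the banner is the more likely of the two to be stale.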

      Comment

      • dharan
        Junior Member
        • Jan 2012
        • 7

        #18
        Hi
        As far as I know, the only option is to rerun TopHat with the new version (v2.0.3). I think the only problem is the handling of the * qualities in the SAM files, and that has been resolved in the latest version.

        Comment

        • EGrassi
          Member
          • Oct 2010
          • 66

          #19
          Originally posted by phred View Post
          Code:
          >htseq-count
          ....
          
          >Released under the terms of the GNU General
          >Public License v3. Part of the 'HTSeq' framework, version 0.5.3p3.
          I presume the footnote was just not updated with the new release?

          Same here. Re-aligning all the reads with the new TopHat would be cumbersome, so I'll try to dig into the Python code and find a workaround...

          Comment

          • Simon Anders
            Senior Member
            • Feb 2010
            • 995

            #20
            Sorry, it seems we made some mix-up between version 0.5.3p4 and 0.5.3p5. Essentially, p5 undid some fixes in p4, including the one for "*" qualities. Now, there is version 0.5.3p6, which should clean up this mess. Please let me know if you still have problems.

            Comment

            • EGrassi
              Member
              • Oct 2010
              • 66

              #21
              Yep, thanks! Now that problem does not occur anymore.
              It now complains about the sort order of the BAM file (which cuffdiff used without problems), but since I haven't read all the docs yet, it's my turn to do some work now.

              Comment

              • chadn737
                Senior Member
                • Jan 2009
                • 392

                #22
                Since I'm really not familiar with Python, a very quick and dirty workaround that I used until they fixed the problem in Tophat was to replace the missing quality scores with the read sequence itself. That way it's guaranteed to be the same length as the read and HTSeq-count is happy.

                Code:
                # 1. Reads whose quality field (column 11) is "*"
                awk -F'\t' '$11=="*"' accepted_hits.sam > noquals.sam
                # 2. Everything else, header lines included
                awk -F'\t' '$11!="*"' accepted_hits.sam > tmp.sam
                # 3. Copy the read sequence (column 10) into the quality column
                awk 'BEGIN{FS=OFS="\t"} {$11=$10} 1' noquals.sam > tmp3.sam
                # 4. Recombine; the fixed reads end up after all the others
                cat tmp.sam tmp3.sam > accepted_hits2.sam
                rm tmp.sam tmp3.sam noquals.sam
                Then just run HTSeq-count on the accepted_hits2.sam

                Not pretty (I'm still a novice at this stuff, so I settle for what works for me), but it gets the job done if you don't want to realign and don't want to fiddle around with the HTSeq-count code. Of course, if Simon Anders has fixed the issue, then that's obviously a better solution than this.
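For anyone who would rather do this in Python, the same placeholder trick can be sketched in a single pass that keeps the reads in their original order. The function names here are hypothetical, and the substituted characters are placeholders, not real base qualities.

```python
def fill_missing_qualities(sam_lines):
    """Yield SAM lines in their original order, replacing a '*' QUAL
    column with a copy of SEQ so both columns have the same length."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through untouched
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 11 and fields[10] == "*" and fields[9] != "*":
            fields[10] = fields[9]  # placeholder, not a real quality string
            yield "\t".join(fields) + "\n"
        else:
            yield line


def rewrite_sam(src_path, dst_path):
    """Stream src_path to dst_path, filling in missing qualities."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(fill_missing_qualities(src))
```

Calling `rewrite_sam("accepted_hits.sam", "accepted_hits2.sam")` should then give a file equivalent to the one from the awk steps, except that read order is preserved, which matters if the file was already sorted.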

                Comment

                • phred
                  Member
                  • May 2012
                  • 11

                  #23
                  Thanks chad for the workaround, I will give it a try.

                  Having installed HTSeq 0.5.3p6, I now get the following message:

                  Code:
                  $ htseq-count
                  
                  Traceback (most recent call last):
                    File "/home/support/apps/cports/rhel-6.x86_64/gnu/HTSeq/0.5.3p6_Python_2.7.2/bin/htseq-count", line 3, in <module>
                      import HTSeq.scripts.count
                    File "/home/support/apps/cports/rhel-6.x86_64/gnu/HTSeq/0.5.3p6_Python_2.7.2/lib/python/HTSeq/__init__.py", line 8, in <module>
                      from _HTSeq import *
                    File "_HTSeq.pyx", line 11, in init HTSeq._HTSeq (src/_HTSeq.c:30228)
                  ImportError: No module named pysam
                  I am wondering has anyone else encountered this?

                  Comment

                  • EGrassi
                    Member
                    • Oct 2010
                    • 66

                    #24
                    Originally posted by phred View Post
                    Thanks chad for the workaround, I will give it a try.

                    Having installed HTSeq 0.5.3p6, I now get the following message:

                    Code:
                    $ htseq-count
                    
                    Traceback (most recent call last):
                      File "/home/support/apps/cports/rhel-6.x86_64/gnu/HTSeq/0.5.3p6_Python_2.7.2/bin/htseq-count", line 3, in <module>
                        import HTSeq.scripts.count
                      File "/home/support/apps/cports/rhel-6.x86_64/gnu/HTSeq/0.5.3p6_Python_2.7.2/lib/python/HTSeq/__init__.py", line 8, in <module>
                        from _HTSeq import *
                      File "_HTSeq.pyx", line 11, in init HTSeq._HTSeq (src/_HTSeq.c:30228)
                    ImportError: No module named pysam
                    I am wondering has anyone else encountered this?
                    Yep, but I solved it by installing pysam from http://pysam.googlecode.com/files/pysam-0.6.tar.gz!

                    Comment

                    • Dethecor
                      Member
                      • May 2010
                      • 24

                      #25
                      pysam dependency

                      Hi all,

                      indeed certain functions in HTSeq depend on an installed pysam. We have changed this so that now this error is only raised if one actually tries to call one of these functions while not having pysam installed.
                      HTSeq version 0.5.3p7 includes this change (if you do not want to upgrade you can always just install pysam)
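The general pattern described here, importing an optional dependency only when a function that needs it is called, can be sketched as follows. This is not HTSeq's actual code: `require_optional` and `open_bam` are hypothetical names, the sketch is written for Python 3, and `pysam.AlignmentFile` is the class name in current pysam releases (older releases such as 0.6 used a different name).

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency at call time, raising a descriptive
    ImportError only when the feature that needs it is actually used."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}', "
            "which is not installed."
        ) from exc


def open_bam(path):
    """Hypothetical helper: pysam is imported here rather than at module
    import time, so code that never touches BAM files works without it."""
    pysam = require_optional("pysam", "BAM support")
    return pysam.AlignmentFile(path, "rb")
```

With this layout, importing the module never fails because pysam is missing; only a call to `open_bam` does.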

                      Cheers,
                      Paul

                      "You are only young once, but you can stay immature indefinitely."

                      Comment

                      • vyellapa
                        Member
                        • Oct 2011
                        • 59

                        #26
                        I'm curious what the best way is to sort the SAM file that is given as input to htseq-count.
                        I have sorted using samtools and UNIX sort as shown below:
                        Code:
                        # sort 1: UNIX sort
                        sort -T /scratch/vyellapantula -s -k 1,1 input.sam > sorted.sam

                        # sort 2: samtools
                        samtools sort input.bam sorted
                        samtools view -h -o sorted.sam sorted.bam
                        I've used the same htseq-count command on both SAM files and ran a diff on the outputs. There are some differences, and I would like to know the best way to sort the BAM/SAM file for htseq-count.

                        Code:
                        < ENSG00000259158	   14
                        ---
                        > ENSG00000259158	   23
                        51905c51905
                        < ENSG00000259165	   15
                        ---
                        > ENSG00000259165	   30
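A raw diff of two count tables is hard to read once they have tens of thousands of rows. A small Python sketch for listing only the features whose counts differ is given below. It assumes each line holds a feature ID followed by an integer count, separated by whitespace, and the function names are hypothetical. The two ENSG rows reuse the numbers from the diff above; `GENE_SAME` is a made-up row added for illustration.

```python
def read_counts(lines):
    """Parse 'feature<whitespace>count' lines into a {feature: int} dict."""
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        counts[parts[0]] = int(parts[-1])
    return counts


def differing_features(counts_a, counts_b):
    """Return {feature: (count_a, count_b)} for every feature whose count
    differs; a feature missing from one table is reported as None."""
    diffs = {}
    for feature in sorted(set(counts_a) | set(counts_b)):
        a, b = counts_a.get(feature), counts_b.get(feature)
        if a != b:
            diffs[feature] = (a, b)
    return diffs


unix_sorted = read_counts(
    ["ENSG00000259158\t14\n", "ENSG00000259165\t15\n", "GENE_SAME\t5\n"]
)
samtools_sorted = read_counts(
    ["ENSG00000259158\t23\n", "ENSG00000259165\t30\n", "GENE_SAME\t5\n"]
)
for feature, pair in differing_features(unix_sorted, samtools_sorted).items():
    print(feature, pair)
```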

                        Comment

                        • Dethecor
                          Member
                          • May 2010
                          • 24

                          #27
                          Sort by position or name

                          Hi vyellapa,

                          if you run htseq-count on paired-end reads then you might want a .sam file sorted by name. Your Unix sort command does this (I hope it deals with the header properly), whereas samtools sort sorts by position unless you specify the -n option.

                          That would be one likely source of the disparity between the counts you observed.
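On the header concern: a plain `sort` treats `@` header lines like any other line. A minimal Python sketch that sorts only the alignment records by read name and leaves the header block in place is shown below. `sort_sam_by_name` is a hypothetical helper, it loads everything into memory (so it is only suitable for files that fit in RAM), and it uses plain string comparison, which may order names differently from `samtools sort -n` while still keeping records with the same name adjacent.

```python
def sort_sam_by_name(sam_lines):
    """Return SAM lines with the header lines first (in their original
    order) followed by alignment records sorted by read name (column 1)."""
    header, records = [], []
    for line in sam_lines:
        (header if line.startswith("@") else records).append(line)
    # list.sort() is stable, so records sharing a read name keep their
    # relative order from the input.
    records.sort(key=lambda record: record.split("\t", 1)[0])
    return header + records
```

Writing the result back out is then a matter of `out_handle.writelines(sort_sam_by_name(in_handle))` with two open file handles.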

                          Cheers,
                          Paul

                          "You are only young once, but you can stay immature indefinitely."

                          Comment

                          • vyellapa
                            Member
                            • Oct 2011
                            • 59

                            #28
                            Thanks Paul, using -n produced no differences.

                            Comment

                            • EGrassi
                              Member
                              • Oct 2010
                              • 66

                              #29
                              Whoa, I used samtools sort -n and htseq-count gave me all counts of zero... that's strange. I will check with IGV or similar tools, but cuffdiff gave me FPKM values with the same GTF and BAM file...

                              Comment

                              • TEFA
                                Junior Member
                                • Mar 2012
                                • 5

                                #30
                                Originally posted by EGrassi View Post
                                Whoa, I used samtools sort -n and htseq-count gave me all counts of zero... that's strange. I will check with IGV or similar tools, but cuffdiff gave me FPKM values with the same GTF and BAM file...
                                Hi EGrassi,

                                Did you solve this issue?

                                Cheers,
                                TEFA

                                Comment
