  • #16
    Originally posted by chadn737 View Post
    As an aside, the latest version of Tophat 2 no longer has the "*" qualities problems.
    The new TopHat version (v2.0.3) has now solved the problem that affected HTSeq.

    Thanks a lot..!!

    Comment


    • #17
      I am having the same issue with HTSeq dealing with alignments from Tophat 2.

      I installed HTSeq/0.5.3p5 but the issue persists. The alignments were done using Tophat 2.0.0

      Code:
      Error occured in line 36 of file R13a_m_accepted_hits.sam.
      Error: ("'seq' and 'qualstr' do not have the same length.", 
      'line 36 of file R13a_m_accepted_hits.sam')
      [Exception type: ValueError, raised in _HTSeq.pyx:765]
      I would like to avoid having to re-align all of my samples with a newer version of Tophat. Any suggestions?
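For anyone who wants to see how many records are affected before deciding whether to re-align, here is a minimal Python sketch. It assumes a plain tab-separated SAM file with SEQ in column 10 and QUAL in column 11; the function name `find_qual_mismatches` is made up for illustration and is not part of HTSeq.

```python
def find_qual_mismatches(sam_lines):
    """Yield (line_number, read_name) for records HTSeq would likely reject.

    A record is flagged when it has a real sequence but the quality string
    is "*" or has a different length than the sequence.
    """
    for line_number, line in enumerate(sam_lines, start=1):
        if line.startswith("@"):
            continue  # header line, not an alignment record
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        seq, qual = fields[9], fields[10]
        if seq != "*" and (qual == "*" or len(seq) != len(qual)):
            yield line_number, fields[0]


def report(path):
    """Print every flagged record in the SAM file at `path`."""
    with open(path) as handle:
        for line_number, read_name in find_qual_mismatches(handle):
            print(f"line {line_number}: {read_name}")
```

Calling `report("R13a_m_accepted_hits.sam")` would list the offending lines, which should include the line number from the error message if the "*" qualities are indeed the cause.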

      Incidentally, when I was checking the installation as suggested above, the following appears with the v0.5.3p5 of HTSeq:


      Code:
      >htseq-count
      ....
      
      >Released under the terms of the GNU General
      >Public License v3. Part of the 'HTSeq' framework, version 0.5.3p3.
      I presume the footnote was just not updated with the new release?

      Comment


      • #18
        Hi
        As far as I know, the only option is to rerun TopHat with the new version (v2.0.3). The only problem is the handling of the "*" qualities in the SAM files, and that has been resolved in the latest version.

        Comment


        • #19
          Originally posted by phred View Post
          Code:
          >htseq-count
          ....
          
          >Released under the terms of the GNU General
          >Public License v3. Part of the 'HTSeq' framework, version 0.5.3p3.
          I presume the footnote was just not updated with the new release?

          Same here. Re-aligning all the reads with the new Tophat would be cumbersome, so I'll try to dig into the Python code and find a workaround...

          Comment


          • #20
            Sorry, it seems we made some mix-up between version 0.5.3p4 and 0.5.3p5. Essentially, p5 undid some fixes in p4, including the one for "*" qualities. Now, there is version 0.5.3p6, which should clean up this mess. Please let me know if you still have problems.

            Comment


            • #21
              Yep, thanks! Now that problem does not occur anymore.
              It now complains about the sort order of the BAM file (which cuffdiff used without problems), but since I have not read all the docs yet, it's my turn to do some work now.

              Comment


              • #22
                A very quick and dirty workaround that I used until the problem was fixed in Tophat (I'm really not familiar with Python) was to replace the missing quality scores with the read sequence itself. That way it's guaranteed to be the same length as the read, and HTSeq-count is happy.

                Code:
                awk '$11=="*"' accepted_hits.sam > noquals.sam;
                awk '$11 !~ /^\*$/' accepted_hits.sam > tmp.sam;
                awk 'FNR==NR{a[NR]=$10;next}{$11=a[FNR]}1' noquals.sam noquals.sam > tmp2.sam;
                awk -v OFS="\t" '$1=$1' tmp2.sam > tmp3.sam
                cat tmp.sam tmp3.sam > accepted_hits2.sam;
                rm tmp.sam tmp2.sam tmp3.sam noquals.sam;
                Then just run HTSeq-count on the accepted_hits2.sam

                Not pretty (I'm still a novice at this stuff, so I settle for what works for me), but it gets the job done if you don't want to realign and don't want to fiddle around with the HTSeq-count code. Of course, if Simon Anders has fixed the issue, then that's obviously a better solution than this.
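The same trick can also be done in a single pass that keeps the records in their original order (the awk version above appends the patched reads to the end of the file). This is only a sketch under the assumption of a tab-separated SAM file; `fill_missing_quals` and `rewrite_file` are illustrative names, and the copied "qualities" are placeholders with no real meaning.

```python
def fill_missing_quals(sam_lines):
    """Yield SAM lines, copying SEQ into QUAL wherever QUAL is "*"."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # pass header lines through untouched
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 10 is SEQ, column 11 is QUAL (0-based indices 9 and 10).
        if len(fields) >= 11 and fields[10] == "*" and fields[9] != "*":
            fields[10] = fields[9]
        yield "\t".join(fields) + "\n"


def rewrite_file(src, dst):
    """Write a patched copy of the SAM file `src` to `dst`."""
    with open(src) as infile, open(dst, "w") as outfile:
        outfile.writelines(fill_missing_quals(infile))
```

For example, `rewrite_file("accepted_hits.sam", "accepted_hits2.sam")` would produce the file to hand to HTSeq-count.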

                Comment


                • #23
                  Thanks chad for the workaround, I will give it a try.

                  Having installed HTSeq 0.5.3p6, I now get the following message:

                  Code:
                  $ htseq-count
                  
                  Traceback (most recent call last):
                    File "/home/support/apps/cports/rhel-6.x86_64/gnu/HTSeq/0.5.3p6_Python_2.7.2/bin/htseq-count", line 3, in <module>
                      import HTSeq.scripts.count
                    File "/home/support/apps/cports/rhel-6.x86_64/gnu/HTSeq/0.5.3p6_Python_2.7.2/lib/python/HTSeq/__init__.py", line 8, in <module>
                      from _HTSeq import *
                    File "_HTSeq.pyx", line 11, in init HTSeq._HTSeq (src/_HTSeq.c:30228)
                  ImportError: No module named pysam
                  Has anyone else encountered this?

                  Comment


                  • #24
                    Originally posted by phred View Post
                    Thanks chad for the workaround, I will give it a try.

                    Having installed HTSeq 0.5.3p6, I now get the following message:

                    Code:
                    $ htseq-count
                    
                    Traceback (most recent call last):
                      File "/home/support/apps/cports/rhel-6.x86_64/gnu/HTSeq/0.5.3p6_Python_2.7.2/bin/htseq-count", line 3, in <module>
                        import HTSeq.scripts.count
                      File "/home/support/apps/cports/rhel-6.x86_64/gnu/HTSeq/0.5.3p6_Python_2.7.2/lib/python/HTSeq/__init__.py", line 8, in <module>
                        from _HTSeq import *
                      File "_HTSeq.pyx", line 11, in init HTSeq._HTSeq (src/_HTSeq.c:30228)
                    ImportError: No module named pysam
                    Has anyone else encountered this?
                    Yes, I did, and I solved it by installing pysam from http://pysam.googlecode.com/files/pysam-0.6.tar.gz

                    Comment


                    • #25
                      pysam dependency

                      Hi all,

                      Indeed, certain functions in HTSeq depend on pysam being installed. We have changed this so that the error is now raised only if one actually tries to call one of these functions without having pysam installed.
                      HTSeq version 0.5.3p7 includes this change. (If you do not want to upgrade, you can always just install pysam.)
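The general pattern described here, deferring an optional import until the moment it is needed, can be sketched as follows. This is not HTSeq's actual code; `require_module` and `open_with_pysam` are hypothetical names used only to illustrate the idea.

```python
import importlib


def require_module(name):
    """Import `name` on demand and fail with a clear message if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"This feature needs the '{name}' package, which is not installed."
        ) from exc


def open_with_pysam():
    """Hypothetical feature that needs pysam; the import happens only here."""
    pysam = require_module("pysam")
    return pysam
```

With this layout, importing the package itself never fails because of a missing optional dependency; only a call to a function such as `open_with_pysam()` would raise the error.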

                      Cheers,
                      Paul

                      "You are only young once, but you can stay immature indefinitely."

                      Comment


                      • #26
                        I'm curious what the best way is to sort the SAM file given as input to htseq-count.
                        I have sorted using both samtools and Unix sort, as shown below:
                        Code:
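                        # sort 1: stable Unix sort on the read name (column 1)
                        sort -T /scratch/vyellapantula -s -k 1,1 input.sam > sorted.sam
                        
                        # sort 2: samtools sort, then convert the result back to SAM
                        samtools sort input.bam sorted
                        samtools view -h -o sorted.sam sorted.bam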
                        I've used the same htseq-count command on both SAM files and ran a diff on the outputs. There are some differences, and I would like to know the best way to sort the BAM/SAM file for htseq-count.

                        Code:
                        < ENSG00000259158	   14
                        ---
                        > ENSG00000259158	   23
                        51905c51905
                        < ENSG00000259165	   15
                        ---
                        > ENSG00000259165	   30

                        Comment


                        • #27
                          Sort by position or name

                          Hi vyellapa,

                          If you run htseq-count on paired-end reads, then you might want a .sam file sorted by name. Your Unix sort command does this (I hope it deals with the header properly), whereas samtools sort sorts by position unless you specify the -n option.

                          That would be one likely source of the disparity between the counts you observed.
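On the header concern: a plain sort on column 1 treats the "@" header lines like any other line, so they can end up mixed in with the records. A minimal Python sketch of a name sort that keeps the header on top is shown below. It assumes the file fits in memory, and `sort_sam_by_name` is an illustrative name; for large files `samtools sort -n` is the more practical route.

```python
def sort_sam_by_name(sam_lines):
    """Return SAM lines with the header first and records sorted by read name.

    Python's sort is stable, so records sharing a read name keep their
    original relative order (comparable to `sort -s -k 1,1`).
    """
    header, records = [], []
    for line in sam_lines:
        if line.startswith("@"):
            header.append(line)
        else:
            records.append(line)
    # Column 1 of a SAM record is the read name (QNAME).
    records.sort(key=lambda record: record.split("\t", 1)[0])
    return header + records
```

The result can be written back out with `writelines` and passed to htseq-count in place of the hand-sorted file.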

                          Cheers,
                          Paul

                          "You are only young once, but you can stay immature indefinitely."

                          Comment


                          • #28
                            Thanks Paul, using -n produced no differences.

                            Comment


                            • #29
                              Whoa, I used samtools sort -n and htseq-count gave me counts of zero for everything. That's strange. I will check with IGV or a similar tool, but cuffdiff gave me FPKM values with the same GTF and BAM files...

                              Comment


                              • #30
                                Originally posted by EGrassi View Post
                                Whoa, I used samtools sort -n and htseq-count gave me counts of zero for everything. That's strange. I will check with IGV or a similar tool, but cuffdiff gave me FPKM values with the same GTF and BAM files...
                                Hi EGrassi,

                                Did you solve this issue?

                                Cheers,
                                TEFA

                                Comment
