  • Problem with HTSeq dexseq_count.py script

    Hi all!

    I used samtools to create SAM files from BAM files. Then I sorted the SAM file with the command: sort -s -k 1,1 07pos.sam > 07pos.sorted.sam. Now I am trying to run the script dexseq_count.py (which came with the DEXSeq package), but I am getting the following error message:

    Code:
    $ python dexseq_count.py --paired=yes out.gff 07pos.sorted.sam 07poscounts.txt
    Traceback (most recent call last):
      File "dexseq_count.py", line 132, in <module>
        for af, ar in HTSeq.pair_SAM_alignments( HTSeq.SAM_Reader( sam_file ) ):
      File "/usr/lib/python2.6/site-packages/HTSeq/__init__.py", line 604, in pair_SAM_alignments
        for almnt in alignments:
      File "/usr/lib/python2.6/site-packages/HTSeq/__init__.py", line 543, in __iter__
        algnt = SAM_Alignment.from_SAM_line( line )
      File "_HTSeq.pyx", line 1249, in HTSeq._HTSeq.SAM_Alignment.from_SAM_line (src/_HTSeq.c:21848)
    UnboundLocalError: local variable 'cigarlist' referenced before assignment
    I could not find the solution by searching forums and Internet. So any ideas?

    Thanks in advance!
    Sander

  • #2
    This looks like a bug that we accidentally introduced in version 0.5.3p5. We fixed this last week, so please try version 0.5.3p7.



    • #3
      Hi!

      Thank you, Simon. Version 0.5.3p7 worked.

      Sander



      • #4
        dexseq_count.py error with HTSeq-0.5.3p9

        Hello,

        I am using the latest version of HTSeq (0.5.3p9) and am getting a similar error to SaunderEST's. I have searched the forums for a similar error with the latest version, but with no luck.

        I have used this version of HTSeq successfully for htseq-count as well as dexseq_prepare_annotation.py, so I doubt there is a problem with the installation.

        Can anyone help me out here?

        Thanks!

        > ~/software/HTSeq-0.5.3p9/scripts/python_scripts$ python dexseq_count.py -p yes -s no G1_0h.sam /home/PhD_project/RNASeq_data/TopHat_Cufflinks/merged_asm/merged.gff exons_G1_0h.txt

        Traceback (most recent call last):
          File "dexseq_count.py", line 70, in <module>
            for f in HTSeq.GFF_Reader( gff_file ):
          File "/usr/local/lib/python2.7/dist-packages/HTSeq/__init__.py", line 214, in __iter__
            strand, frame, attributeStr ) = line.split( "\t", 8 )
        ValueError: need more than 3 values to unpack



        • #5
          Dear @nat,

          Are you using the flattened gtf file produced by dexseq_prepare_annotation.py?
          The script dexseq_count.py expects to receive this file as its annotation input.

          Alejandro
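
The traceback in #4 fails at the point where HTSeq splits an annotation line into nine tab-separated fields, so one quick check is whether the file passed as the annotation really looks like a flattened DEXSeq GFF. The following is a minimal sketch, not part of HTSeq or DEXSeq: the function names `summarize_gff` and `looks_flattened` are made up for illustration, and it assumes the flattened file uses the feature types `aggregate_gene` and `exonic_part`. The attribute strings in the demo lines are illustrative only.

```python
def summarize_gff(lines):
    """Count feature types and record line numbers lacking 9 tab-separated fields."""
    feature_counts = {}
    malformed = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and GFF comment lines
        fields = line.split("\t", 8)
        if len(fields) != 9:
            malformed.append(number)
            continue
        feature = fields[2]  # third GFF column is the feature type
        feature_counts[feature] = feature_counts.get(feature, 0) + 1
    return feature_counts, malformed


def looks_flattened(feature_counts):
    """Assumption: a flattened DEXSeq annotation contains both feature types."""
    return "aggregate_gene" in feature_counts and "exonic_part" in feature_counts


if __name__ == "__main__":
    # Illustrative lines only; replace with open("your_annotation.gff").
    demo = [
        '1\tsrc.gtf\taggregate_gene\t11869\t14412\t.\t+\t.\tgene_id "ENSG00000223972"\n',
        '1\tsrc.gtf\texonic_part\t11869\t11871\t.\t+\t.\tgene_id "ENSG00000223972"\n',
        "@SQ\tSN:1\tLN:249250621\n",  # a SAM header line, which is not valid GFF
    ]
    counts, bad_lines = summarize_gff(demo)
    print("feature counts:", counts)
    print("lines without 9 fields:", bad_lines)
    print("looks like a flattened DEXSeq GFF:", looks_flattened(counts))
```

If the file reports many malformed lines, or none of the expected feature types, it is worth checking which file was given in the annotation position on the command line.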



          • #6
            Dear DEXSeq team, thanks for the package, and many thanks in advance.

            I am trying to work out why all the counts are zero in the dexseq_count.py output files (txt files with rows like ENSG00000000003:001 0). I have 18 RNA-seq samples and used dexseq_prepare_annotation.py to generate Homo_sapiens.GRCh37.70.gff. I used script.sh to call samtools and dexseq_count.py. The SAM files seem OK, and dexseq_count.py did generate 18 .txt files, but the counts are all zero.
            Thanks.
            Wenhong
            P.S. my reads are all paired-end from ILMN HiSeq 2000-2500 for RNAseq with 4X multiplex on human
            ***************copy and paste of some codes*********************
            #!/bin/bash
            sample=(Sample_330-0 Sample_54-0 Sample_54-1 Sample_F60-0 Sample_F60-1 Sample_NLBM2 Sample_NLBM3 Sample_NLBM4 Sample_NLBM5 Sample_R012-0-CD34 Sample_R012-1-CD34 Sample_R1291-0-CD34 Sample_R1291-1-CD34 Sample_R400-0-CD34 Sample_R400-1-CD34 Sample_R400-blasts Sample_R400-MNCs Sample_Y60-0 Sample_Y60-1)

            # Count reads per exonic part for each already-sorted SAM file
            for i in "${sample[@]}"
            do
                echo "working on $i ..."
                python dexseq_count.py -p yes Homo_sapiens.GRCh37.70.gff "${i}_sorted.sam" "${i}_fb.txt"
            done

            #!/bin/bash
            sample=(Sample_330-0 Sample_330-1 Sample_54-0 Sample_54-1 Sample_F60-0 Sample_F60-1 Sample_NLBM2 Sample_NLBM3 Sample_NLBM4 Sample_NLBM5 Sample_R012-0-CD34 Sample_R012-1-CD34 Sample_R1291-0-CD34 Sample_R1291-1-CD34 Sample_R400-0-CD34 Sample_R400-1-CD34 Sample_R400-blasts Sample_R400-MNCs Sample_Y60-0 Sample_Y60-1)

            for i in "${sample[@]}"
            do
                echo "working on $i ..."
                samtools index "$i.bam"
                # Convert BAM to SAM (alignment records only, no header)
                samtools view "$i.bam" > "$i.sam"
                # Sort by read name, then numerically by the second column (FLAG)
                sort -k1,1 -k2,2n "$i.sam" > "${i}_sorted.sam"
                python dexseq_count.py -p yes Homo_sapiens.GRCh37.70.gff "${i}_sorted.sam" "${i}_fb.txt"
            done

            the first line in my .gff file is like this:
            1 Homo_sapiens.GRCh37.70.gtf aggregate_gene 11869 14412 . + . gene_id "ENSG00000223972"



            • #7
              Hi @wfan,

              Could you check whether the chromosome names are consistent between your BAM files and your annotation file? Zero counts sometimes occur when, for example, one file uses "chr1" and the other uses "1". The chromosome names must match!

              Alejandro
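
One way to run this check is to collect the reference names from the SAM file (third column, RNAME) and the sequence names from the annotation (first column) and compare the two sets. The following is a rough sketch in plain Python, assuming tab-separated SAM and GFF text files; the function names are hypothetical and the demo lines are made up for illustration.

```python
def sam_reference_names(sam_lines):
    """Collect RNAME values (third SAM column), skipping headers and unmapped reads."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[2] != "*":
            names.add(fields[2])
    return names


def gff_sequence_names(gff_lines):
    """Collect sequence names (first GFF column), skipping blanks and comments."""
    names = set()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(sam_names, gff_names):
    """Report which names are shared and which appear in only one file."""
    return {
        "shared": sorted(sam_names & gff_names),
        "only_in_sam": sorted(sam_names - gff_names),
        "only_in_gff": sorted(gff_names - sam_names),
    }


if __name__ == "__main__":
    # Illustrative lines only; in practice pass open file handles instead.
    sam_demo = [
        "@SQ\tSN:chr1\tLN:249250621\n",
        "read1\t99\tchr1\t11900\t50\t4M\t=\t12000\t104\tACGT\tIIII\n",
    ]
    gff_demo = [
        '1\tsrc.gtf\taggregate_gene\t11869\t14412\t.\t+\t.\tgene_id "ENSG00000223972"\n',
    ]
    report = compare_names(sam_reference_names(sam_demo), gff_sequence_names(gff_demo))
    for key, values in report.items():
        print(key, values)
```

An empty "shared" list means no read can overlap any annotated feature, which would be consistent with all counts being zero.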



              • #8
                I am updating my previous thread (Problem with HTSeq dexseq_count.py script).
                Indeed, the zero counts resulted from a different version of the .gtf file having been used in TopHat. I reran TopHat using the same .gtf file, and the problem was solved.
                Thanks
                Wenhong
