  • htseq-count error, AssertionError

    Hi all,

    I am having a bit of a problem with htseq-count. I have single-end stranded RNA-seq experiment, aligned with tophat-2.0.14. SAM file is names sorted and using HTSeq-0.6.1p1 on Mac 10.7.5 with python 2.7. I also tried a few other versions of HTSeq, but it comes out with the same result.

    When trying to run htseq-count:
    htseq-count --stranded=reverse -t exon --idattr=gene_id --mode=intersection-nonempty ./test.sam ./gtf.gtf > ./htseq_count.txt

    I get this error message:
    Error:
    [Exception type: AssertionError, raised in __init__.py:599]


    Here are a few lines of my sam file:
    HWI-ST1146:240:C5HD2ACXX:1:1101:1040:33602 16 X 13759664 50 100M * 0 0 GGATGCCTCCAATGACCGACTGAAGCCGACCAAGTGGCTGACCGCCACCGAGCTGGAGAACGTGCCCTNCCTCAACGACATCACCTGGGAGCGTTTGGAG C<C@BBCCCCECB>9BCCDCCC<<5BBC>CCCCCBCC?@>9CDA<HGGIFIIGHFF<HHIIIIGGB:0#IGIHGFHFEHEDHEIIIIHHH@HFFFFF@@@ AS:i:-1 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:68C31 YT:Z:UU NH:i:1 XS:A:+
    HWI-ST1146:240:C5HD2ACXX:1:1101:1041:2544 0 2L 450124 50 68M1I30M * 0 0 GATCCATTCGCCCACCGGCTTGGTTACGTACNACTTTCCGCCTTCATGGTTCAGCATAAAGTTTATGATTTTTTTTGACGATCTTGCGCTGGTCTCGCA CCCFFFFFHFHHHJJJIIIJJHIBBFEHHHI#08?FHIIJJJGIJEHIJ=@AEA?AHFBE@7;@CCCCEEEDCB@@>?CBB@BDDCDDDBBBCDCDC<@ AS:i:-9 XN:i:0 XM:i:1 XO:i:1 XG:i:1 NM:i:2 MD:Z:31A66 YT:Z:UU NH:i:1 XS:A:-
    HWI-ST1146:240:C5HD2ACXX:1:1101:1041:10782 0 2L 2191942 50 100M * 0 0 CGCTCATTCACATAGGTCACCTTGGTGGCCGNTTGACCCTGGCCCTGGCTCTTGTCCTTCTCCTTTTCGGATATCGTCCTCTCGATGGAGCAGGAGGCCG CC@FFFFFGDFHHJJJEGGJI@HIJHJJJJJ#1?DFHIGIJIIIGHIIGIJJJJIJIJJJJJIJEEHDE6BFCCE=CBDDDDDDCDDDCBDDDDD?BBB5 AS:i:-1 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:31C68 YT:Z:UU NH:i:1 XS:A:-
    HWI-ST1146:240:C5HD2ACXX:1:1101:1041:30995 16 3R 7900087 50 100M * 0 0 AAGGCAAGGATCCACCTTTCCTGCTAGAAGTCAGCGCACAAATCACTTTTGTTCTGTTCCTGCTGTTCNTGACCATCATTCTGATGAACCTGCTCGTGGG @:?CA>ACA><B@:9DDDBADDCA@CDDCEADFFHCHCIHG@@@7FHDHJIJIJIIHIGFIHEGB?11#IHGIIIIHFCHJJHGFEBHHHFFDDFFFCC@ AS:i:-1 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:68G31 YT:Z:UU NH:i:1 XS:A:+
    HWI-ST1146:240:C5HD2ACXX:1:1101:1041:48103 16 2L 14617161 50 100M * 0 0 AACCAGAACGGAGCCATCTGGAAACTGGACTTGGGCGCCCTGGAGGCCATCCAGTGGACCAAGCACTGNGACTCCGGCATCTAAGAAGTGATACTCCCAA >BBBABBBBBBA4>BAA>;A@;5>@:BB?B?695(ECECEE@FAC@CCFEFFFDEFIEF??;F?:0#IIFGFCBAIHHEBEEIIIFFFA:=DDDD@@@ AS:i:-3 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:36A31G31 YT:Z:UU NH:i:1 XS:A:+


    If I use another experiment with paired-end reads, it works without a problem. As far as I can tell, the error is related to "paired-end handling", although my reads are single-end.
    It would appear that htseq-count recognises my reads as paired-end rather than single-end. It also does not matter whether I set --stranded to yes, no or reverse; I get the same error.

    Would really appreciate any help.

  • #2
    Forgot to say that tophat was run with --library-type fr-firststrand



    • #3
      What happens if you do a "samtools view -f 1 -c test.sam"? If you have even a single alignment with an incorrect pairing flag then that'll break things.
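
The check suggested above can also be sketched in plain Python, which may help when samtools is not at hand. This is a minimal sketch, not part of samtools or HTSeq: the function name `count_paired` and the example records are made up for illustration, and it assumes an uncompressed, tab-separated SAM file.

```python
import io

PAIRED_FLAG = 0x1  # SAM FLAG bit meaning "template has multiple segments" (paired read)

def count_paired(sam_lines):
    """Count alignment records whose FLAG has the paired bit (0x1) set.

    Roughly what `samtools view -f 1 -c` reports for a SAM file.
    """
    count = 0
    for line in sam_lines:
        # Skip blank lines and header lines (header lines start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.split("\t")
        flag = int(fields[1])  # FLAG is the second column
        if flag & PAIRED_FLAG:
            count += 1
    return count

# Made-up records: two single-end alignments (flags 16 and 0) and one paired (flag 99).
example = io.StringIO(
    "@HD\tVN:1.0\tSO:queryname\n"
    "read1\t16\tX\t100\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n"
    "read2\t0\t2L\t200\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n"
    "read3\t99\t2L\t300\t50\t10M\t=\t400\t110\tACGTACGTAC\tIIIIIIIIII\n"
)
print(count_paired(example))
```

For a real file, something like `with open("test.sam") as fh: print(count_paired(fh))` should work; a result of 0 means no record claims to be paired.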



      • #4
        Originally posted by dpryan
        What happens if you do a "samtools view -f 1 -c test.sam"? If you have even a single alignment with an incorrect pairing flag then that'll break things.
        I get 0 with samtools view -f 1 -c test.sam

        All of the flags are either 0 or 16. htseq-count breaks on the first read it encounters.
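
To see why flags of 0 and 16 look like ordinary single-end alignments, the FLAG value can be decoded bit by bit. The sketch below is only illustrative: the bit values come from the SAM specification, but the short labels and the helper name `decode_flag` are chosen here and are not from HTSeq or samtools.

```python
# Bit values follow the SAM specification's FLAG field; the labels are informal.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]

def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]

for value in (0, 16):
    print(value, decode_flag(value))
```

Neither value sets the paired bit: 0 is a forward-strand single-end alignment and 16 is a reverse-strand one.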



        • #5
          It probably has something to do with my HTSeq installation, as the same input works on Fedora 20.
          What I still cannot understand is why, if it is an installation problem, the same installation works with paired-end data but not with single-end data.



          • #6
            Very strange indeed. To tell you the truth, I'm not sure where exactly that assert is being called. The most likely __init__.py in the copy of that HTSeq version that I have installed locally doesn't have an assert() on that line. I suspect that you're correct about this being a weird installation problem. Perhaps just uninstalling and reinstalling will fix things.
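
One way to narrow down which __init__.py is involved is to ask the interpreter where it would load the package from. The following is a rough sketch that assumes Python 3.4 or later (the thread uses Python 2.7, where `import HTSeq; print(HTSeq.__file__)` gives similar information); the helper name `describe_module` is hypothetical.

```python
import importlib.util
import sys

def describe_module(name):
    """Return where `name` would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return {
        "module": name,
        "origin": spec.origin,                    # for a package, the path to its __init__.py
        "python": sys.executable,                 # interpreter running this check
        "python_version": sys.version.split()[0],
    }

# Prints None when HTSeq is not importable from this interpreter.
print(describe_module("HTSeq"))
```

Running this with the same interpreter that htseq-count uses on each machine would show whether the two installs resolve to different files or Python versions.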



            • #7
              @saskak: Are those two installs using two different default versions of python?



              • #8
                Originally posted by GenoMax
                @saskak: Are those two installs using two different default versions of python?
                These are two different installs, one on Mac OS X and the other on Fedora 20. Both use the default version of Python (2.7).
                Last edited by saskak; 07-06-2015, 04:07 AM.
