  • import HTSeq & random sampling

    Hi, warning that this is a noob question.

    I am running into problems importing HTSeq. The context is that I am trying to sample randomly from a FASTQ file (not paired reads). After trying a few methods and always running out of memory, I think I need to use the script Simon Anders posted here that relies on HTSeq:

    I navigate to the directory where my script and the fastq file of interest are saved, then call the script as follows in order to randomly subsample a tenth of the reads:

    python 0.10 SRR12345.fastq out_tenth_SRR12345.fastq

    However, I always get a message that HTSeq can't be imported. I have installed HTSeq (version HTSeq-0.5.4p1.win32-py2.7.exe) along with NumPy, and I am using 32-bit Python 2.7 on Windows 7. I have read the tour through HTSeq, but I still can't figure this out. I am a newbie to computational work and any help would be greatly appreciated. The error is below:

    C:\Users\kak\Desktop\ShortReads\fastqformatted>python 0.1 fastq_SRR12345.fastq tenth_SRR12345.fastq
    Traceback (most recent call last):
    File "", line 15, in <module>
    import HTSeq
    File "C:\Python27\lib\site-packages\HTSeq\", line 9, in <module>
    from _HTSeq import *
    File "_HTSeq.pyx", line 14, in init HTSeq._HTSeq (src/_HTSeq.c:31058)
    File "C:\Python27\lib\site-packages\HTSeq\", line 26, in <module>
    _StepVector = swig_import_helper()
    File "C:\Python27\lib\site-packages\HTSeq\", line 22, in swig_imp
    _mod = imp.load_module('_StepVector', fp, pathname, description)
    ImportError: DLL load failed: The specified module could not be found.

  • #2
    Originally posted by kakseq
    However, I always get a message that HTSeq can't be imported.
    Is that referring to this test from the HTSeq installation page?

    To test your installation, start Python and then try whether typing import HTSeq causes an error message.


    • #3
      Thank you so much for taking a few minutes to help me.
      When I type import HTSeq after typing python it gives me nearly the same error message as above. But, if I type import HTSeq again it doesn't give me an error message.
      I tried reinstalling HTSeq to no avail.


      • #4
        I have sent an email to Simon Anders (not sure if he is on this forum). It appears that the current version of HTSeq for Windows is not working (for me either).


        • #5
          An alternative to HTSeq for randomly sampling a FASTQ file is Heng Li's seqtk. It can subsample a specific number of reads from a file or a fraction of the input as with Simon's HTSeq script.
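For anyone stuck on the import error, the same fraction-based subsampling can also be sketched with only the Python standard library, so neither HTSeq nor seqtk is needed. This is a minimal sketch, not Simon's script: the function name subsample_fastq and its parameters are made up for illustration, and it assumes every record occupies exactly four lines (no wrapped sequence lines). It reads one record at a time, so memory use should stay small regardless of file size.

```python
import random


def subsample_fastq(in_path, out_path, fraction, seed=None):
    """Keep each 4-line FASTQ record with probability `fraction`.

    Hypothetical helper: assumes unwrapped, 4-line FASTQ records.
    Returns the number of records written.
    """
    rng = random.Random(seed)  # pass a seed for a reproducible sample
    kept = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        while True:
            # Read one record: header, sequence, '+' line, qualities.
            record = [fin.readline() for _ in range(4)]
            if not record[0]:  # empty string means end of file
                break
            if rng.random() < fraction:
                fout.writelines(record)
                kept += 1
    return kept
```

With the file names from the first post, a call would look like subsample_fastq("SRR12345.fastq", "out_tenth_SRR12345.fastq", 0.10). Each read is kept independently, so the output holds roughly, not exactly, a tenth of the reads.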


          • #6

            Thanks for your help. For now I guess I'll run HTSeq on a Mac instead and look into seqtk for future use.


            • #7
              I'm having the same issue with HTSeq on my PC. I made sure I was using Python 2.7, reinstalled NumPy, and reinstalled HTSeq, all without success.

              Is there an alternative for getting read counts from a SAM file produced by BWA?
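If all that is needed is a tally of mapped reads per reference sequence, a rough count can be sketched in plain Python without HTSeq. This is only a sketch under assumptions: the function name count_reads_per_reference is made up, it counts per reference name (SAM column 3) rather than per annotated feature, and it skips unmapped, secondary, and supplementary records based on the SAM FLAG bits 0x4, 0x100, and 0x800. For paired-end data each mate is counted separately.

```python
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Tally primary mapped alignments per reference name.

    Hypothetical helper: `sam_lines` is any iterable of SAM text lines,
    e.g. an open file handle.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # SAM records have 11 mandatory fields
            continue
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100), supplementary (0x800).
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        counts[fields[2]] += 1  # column 3 is the reference name
    return counts
```

Passing an open file handle, as in count_reads_per_reference(open("aln.sam")) where aln.sam is a placeholder name, streams the file line by line.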


              • #8
                It's hard to say what's wrong if you don't post complete session logs (i.e., all commands typed plus all output and error messages)


                • #9
                  Originally posted by Simon Anders
                  It's hard to say what's wrong if you don't post complete session logs (i.e., all commands typed plus all output and error messages)

                  On a Windows 7 (64-bit) machine: I installed 32-bit Python, NumPy (BTW: the link you have in the instructions does not work any more, I downloaded NumPy from:

                  Trying "import HTSeq" as recommended in your instructions generates the following error. Googling around seems to indicate that a VS2010 DLL may be missing, but I would like to get your take on what is going on.

                  Python 2.7.4 (default, Apr  6 2013, 19:54:46) [MSC v.1500 32 bit (Intel)] on win
                  Type "help", "copyright", "credits" or "license" for more information.
                  >>> import HTSeq
                  Traceback (most recent call last):
                    File "<stdin>", line 1, in <module>
                    File "C:\Python27\lib\site-packages\HTSeq\", line 9, in <module>
                      from _HTSeq import *
                    File "_HTSeq.pyx", line 14, in init HTSeq._HTSeq (src/_HTSeq.c:31058)
                    File "C:\Python27\lib\site-packages\HTSeq\", line 26, in <module>
                      _StepVector = swig_import_helper()
                    File "C:\Python27\lib\site-packages\HTSeq\", line 22, in swig_imp
                      _mod = imp.load_module('_StepVector', fp, pathname, description)
                  ImportError: DLL load failed: The specified module could not be found.


                  • #10
                    Seems that "_StepVector.dll" is really missing in this binary package. I'll try to fix it.


                    • #11
                      Originally posted by Simon Anders
                      Seems that "_StepVector.dll" is really missing in this binary package. I'll try to fix it.
                      Thanks Simon.

                      If you can post an update to this thread when you have a chance to fix that it would be great.


                      • #12
                        Okay, after an hour of fighting with Windows (even the easiest things are hard on an OS that one uses less than once a year), I found the mistake and fixed it.

                        Please try HTSeq-0.5.4p2.win32-py2.7.exe and let me know if it still fails.


                        • #13
                          A lot of people want to use this package on Windows, so "fighting Windows" on your part is worthwhile.

                          We are not in the clear yet. This is the latest result.
                          Python 2.7.4 (default, Apr  6 2013, 19:54:46) [MSC v.1500 32 bit (Intel)] on wi
                          Type "help", "copyright", "credits" or "license" for more information.
                          >>> import HTSeq
                          Traceback (most recent call last):
                            File "<stdin>", line 1, in <module>
                            File "C:\Python27\lib\site-packages\HTSeq\", line 9, in <module>
                              from _HTSeq import *
                          ImportError: DLL load failed: The specified module could not be found.


                          • #14
                            Could you post a list of all files in C:\Python27\lib\site-packages\HTSeq\, please? With file extensions, if possible. Or, maybe compare with my system, where the directory looks like this:

                             Volume in drive C has no label.
                             Volume Serial Number is BC62-98E8
                             Directory of C:\Python27\Lib\site-packages\HTSeq
                            18/04/2013  16:22    <DIR>          .
                            18/04/2013  16:22    <DIR>          ..
                            18/04/2013  16:22    <DIR>          scripts
                            18/02/2013  17:06            25,079
                            18/04/2013  16:42            32,148 StepVector.pyc
                            18/04/2013  16:42            32,097 StepVector.pyo
                            18/04/2013  18:21           248,832 _HTSeq.pyd
                            20/02/2013  17:19             1,407
                            18/04/2013  16:42             2,011 _HTSeq_internal.pyc
                            18/04/2013  16:42             2,011 _HTSeq_internal.pyo
                            18/04/2013  18:21            83,968 _StepVector.pyd
                            18/04/2013  18:17                24
                            18/04/2013  16:42               168 _version.pyc
                            18/04/2013  16:42               168 _version.pyo
                            18/04/2013  18:16            32,996
                            18/04/2013  16:42            36,726 __init__.pyc
                            18/04/2013  16:42            36,495 __init__.pyo
                                          14 File(s)        534,130 bytes
                                           3 Dir(s)   5,331,779,584 bytes free
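A listing like the one above can also be produced from Python, which makes two machines' outputs easy to compare as plain name and size pairs without date-format differences. This is just a sketch: the helper name list_package_files is made up, and it only reads the directory, so it works even when import HTSeq itself fails.

```python
import os


def list_package_files(directory):
    """Return sorted (file name, size in bytes) pairs for one directory.

    Hypothetical helper: subdirectories such as 'scripts' are skipped.
    """
    entries = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries
```

Printing the result of list_package_files(r"C:\Python27\Lib\site-packages\HTSeq") on both systems should show at a glance whether a .pyd file is missing or has a different size.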


                            • #15
                              Here it is. Looks the same at first glance to me.

                              Directory of c:\Python27\Lib\site-packages\HTSeq
                              04/18/2013  11:29 AM    <DIR>          .
                              04/18/2013  11:29 AM    <DIR>          ..
                              04/18/2013  11:29 AM    <DIR>          scripts
                              02/18/2013  11:06 AM            25,079
                              04/18/2013  11:29 AM            32,148 StepVector.pyc
                              04/18/2013  11:29 AM            32,097 StepVector.pyo
                              04/18/2013  12:21 PM           248,832 _HTSeq.pyd
                              02/20/2013  11:19 AM             1,407
                              04/18/2013  11:29 AM             2,011 _HTSeq_internal.pyc
                              04/18/2013  11:29 AM             2,011 _HTSeq_internal.pyo
                              04/18/2013  12:21 PM            83,968 _StepVector.pyd
                              04/18/2013  12:17 PM                24
                              04/18/2013  11:29 AM               168 _version.pyc
                              04/18/2013  11:29 AM               168 _version.pyo
                              04/18/2013  12:16 PM            32,996
                              04/18/2013  11:29 AM            36,726 __init__.pyc
                              04/18/2013  11:29 AM            36,495 __init__.pyo
                                            14 File(s)        534,130 bytes

