
  • import HTSeq & random sampling

    Hi, warning that this is a noob question.

    I am running into problems importing HTSeq. The context is that I am trying to sample randomly from a fastq file (not paired reads). After trying a few methods and always running out of memory I think I need to use the script Simon Anders posted here that relies on HTSeq: http://seqanswers.com/forums/showthread.php?t=12070

    I navigate to the directory where my script and the fastq file of interest are saved, then call the script as follows in order to randomly subsample a tenth of the reads:

    python subsamplewithHTSeq.py 0.10 SRR12345.fastq out_tenth_SRR12345.fastq

    However, I always get a message that HTSeq can't be imported. I have installed HTSeq (version HTSeq-0.5.4p1.win32-py2.7.exe) along with NumPy, and I am using 32-bit Python 2.7 on Windows 7. I have read the tour through HTSeq, but I still can't figure this out. I am a newbie to computational work and any help would be greatly appreciated. The error is below:

    C:\Users\kak\Desktop\ShortReads\fastqformatted>python subsamplewithHTSeq.py 0.1 fastq_SRR12345.fastq tenth_SRR12345.fastq
    Traceback (most recent call last):
      File "subsamplewithHTSeq.py", line 15, in <module>
        import HTSeq
      File "C:\Python27\lib\site-packages\HTSeq\__init__.py", line 9, in <module>
        from _HTSeq import *
      File "_HTSeq.pyx", line 14, in init HTSeq._HTSeq (src/_HTSeq.c:31058)
      File "C:\Python27\lib\site-packages\HTSeq\StepVector.py", line 26, in <module>
        _StepVector = swig_import_helper()
      File "C:\Python27\lib\site-packages\HTSeq\StepVector.py", line 22, in swig_import_helper
        _mod = imp.load_module('_StepVector', fp, pathname, description)
    ImportError: DLL load failed: The specified module could not be found.
    HTSeq-0.5.4p1.win32-py2.7.exe

  • #2
    Originally posted by kakseq
    However, I always get a message that HTSeq can't be imported.
    Is that referring to this test from the HTSeq installation page?

    Code:
    To test your installation, start Python and then try whether typing import HTSeq causes an error message.


    • #3
      hi,
      Thank you so much for taking a few minutes to help me.
      When I type import HTSeq after typing python it gives me nearly the same error message as above. But, if I type import HTSeq again it doesn't give me an error message.
      I tried reinstalling HTSeq to no avail.


      • #4
        I have sent an email to Simon Anders (not sure if he is on this forum). It appears that the current version of HTSeq for Windows is not working (for me either).


        • #5
          An alternative to HTSeq for randomly sampling a FASTQ file is Heng Li's seqtk. It can subsample a specific number of reads from a file or a fraction of the input as with Simon's HTSeq script.
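For anyone who cannot get either tool installed, the fraction-based sampling described above can be sketched in plain Python with no third-party imports. This is only a rough illustration, not Simon's HTSeq script and not how seqtk works internally. The function name `subsample_fastq` is made up for this example, and it assumes every record is exactly four lines (no wrapped sequence lines). It streams the file record by record, so memory use stays small regardless of file size.

```python
from __future__ import print_function

import random
import sys


def subsample_fastq(in_handle, out_handle, fraction, seed=None):
    """Copy each 4-line FASTQ record to out_handle with probability `fraction`.

    Hypothetical helper: assumes strict 4-line records (unwrapped sequences).
    Returns (records_kept, records_seen).
    """
    rng = random.Random(seed)
    kept = total = 0
    while True:
        # Read one record: header, sequence, '+' separator, qualities.
        record = [in_handle.readline() for _ in range(4)]
        if not record[3]:
            break  # end of file (or a truncated final record)
        total += 1
        # Keep the record independently with probability `fraction`.
        if rng.random() < fraction:
            out_handle.writelines(record)
            kept += 1
    return kept, total


if __name__ == "__main__":
    # Same argument order as the command earlier in the thread:
    #   python this_script.py 0.10 in.fastq out.fastq
    frac = float(sys.argv[1])
    with open(sys.argv[2]) as fin, open(sys.argv[3], "w") as fout:
        n_kept, n_total = subsample_fastq(fin, fout, frac)
    print("kept %d of %d reads" % (n_kept, n_total))
```

Because each read is kept independently, the output contains roughly, not exactly, the requested fraction of reads. Passing a fixed `seed` makes the selection reproducible.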


          • #6
            hi,

            Thanks for your help. For now I guess I'll run HTSeq on a mac instead and look into seqtk for future use.


            • #7
              I'm having the same issue with HTSeq on my PC. Made sure I was using Python 2.7, reinstalled numpy, reinstalled HTSeq, also without success.

              Is there an alternative for getting read counts from SAM file produced by BWA?


              • #8
                It's hard to say what's wrong if you don't post complete session logs (i.e., all commands typed plus all output and error messages)


                • #9
                  Originally posted by Simon Anders
                  It's hard to say what's wrong if you don't post complete session logs (i.e., all commands typed plus all output and error messages)
                  Simon,

                  On a Windows 7 (64-bit) machine, I installed 32-bit Python and NumPy (BTW: the www.scipy.org link you have in the instructions does not work any more; I downloaded NumPy from https://pypi.python.org/pypi/numpy).

                  Trying "import HTSeq" as recommended in your instructions is generating the following error. Googling around seems to indicate that a VS2010 DLL may be missing but I would like to get your take on what is going on.

                  Code:
                  Python 2.7.4 (default, Apr  6 2013, 19:54:46) [MSC v.1500 32 bit (Intel)] on win32
                  Type "help", "copyright", "credits" or "license" for more information.
                  >>> import HTSeq
                  Traceback (most recent call last):
                    File "<stdin>", line 1, in <module>
                    File "C:\Python27\lib\site-packages\HTSeq\__init__.py", line 9, in <module>
                      from _HTSeq import *
                    File "_HTSeq.pyx", line 14, in init HTSeq._HTSeq (src/_HTSeq.c:31058)
                    File "C:\Python27\lib\site-packages\HTSeq\StepVector.py", line 26, in <module>
                      _StepVector = swig_import_helper()
                    File "C:\Python27\lib\site-packages\HTSeq\StepVector.py", line 22, in swig_import_helper
                      _mod = imp.load_module('_StepVector', fp, pathname, description)
                  ImportError: DLL load failed: The specified module could not be found.
                  >>>


                  • #10
                    Seems that "_StepVector.dll" is really missing in this binary package. I'll try to fix it.


                    • #11
                      Originally posted by Simon Anders
                      Seems that "_StepVector.dll" is really missing in this binary package. I'll try to fix it.
                      Thanks Simon.

                      If you can post an update to this thread when you have a chance to fix that it would be great.


                      • #12
                        Okay, after an hour of fighting with Windows (even the easiest things are hard on an OS that one uses less than once a year), I found the mistake and fixed it.

                        Please try HTSeq-0.5.4p2.win32-py2.7.exe and let me know if it still fails.


                        • #13
                          A lot of people want to use this package on Windows, so "fighting Windows" on your part is worthwhile.

                          We are not in the clear yet. This is the latest result.
                          Code:
                          Python 2.7.4 (default, Apr  6 2013, 19:54:46) [MSC v.1500 32 bit (Intel)] on win32
                          Type "help", "copyright", "credits" or "license" for more information.
                          >>> import HTSeq
                          Traceback (most recent call last):
                            File "<stdin>", line 1, in <module>
                            File "C:\Python27\lib\site-packages\HTSeq\__init__.py", line 9, in <module>
                              from _HTSeq import *
                          ImportError: DLL load failed: The specified module could not be found.
                          >>>


                          • #14
                            Could you post a list of all files in C:\Python27\lib\site-packages\HTSeq\, please? With file extensions, if possible. Or, maybe compare with my system, where the directory looks like this:

                            Code:
                            C:\Python27\Lib\site-packages\HTSeq>dir
                             Volume in drive C has no label.
                             Volume Serial Number is BC62-98E8
                            
                             Directory of C:\Python27\Lib\site-packages\HTSeq
                            
                            18/04/2013  16:22    <DIR>          .
                            18/04/2013  16:22    <DIR>          ..
                            18/04/2013  16:22    <DIR>          scripts
                            18/02/2013  17:06            25,079 StepVector.py
                            18/04/2013  16:42            32,148 StepVector.pyc
                            18/04/2013  16:42            32,097 StepVector.pyo
                            18/04/2013  18:21           248,832 _HTSeq.pyd
                            20/02/2013  17:19             1,407 _HTSeq_internal.py
                            18/04/2013  16:42             2,011 _HTSeq_internal.pyc
                            18/04/2013  16:42             2,011 _HTSeq_internal.pyo
                            18/04/2013  18:21            83,968 _StepVector.pyd
                            18/04/2013  18:17                24 _version.py
                            18/04/2013  16:42               168 _version.pyc
                            18/04/2013  16:42               168 _version.pyo
                            18/04/2013  18:16            32,996 __init__.py
                            18/04/2013  16:42            36,726 __init__.pyc
                            18/04/2013  16:42            36,495 __init__.pyo
                                          14 File(s)        534,130 bytes
                                           3 Dir(s)   5,331,779,584 bytes free
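If comparing `dir` output by eye is awkward, the same check can be sketched in Python. This is just an illustration: the helper names `list_package_files` and `diff_listings` are made up for this example, and the script only compares file names and sizes. It cannot tell whether a `.pyd` file depends on a system DLL that is missing, so matching listings would not rule that out.

```python
from __future__ import print_function

import os
import sys


def list_package_files(directory):
    """Return sorted (name, size_in_bytes) pairs for regular files in `directory`."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):  # skip sub-directories such as "scripts"
            entries.append((name, os.path.getsize(path)))
    return sorted(entries)


def diff_listings(mine, theirs):
    """Return the names that are missing on one side or differ in size."""
    a, b = dict(mine), dict(theirs)
    return sorted(name for name in set(a) | set(b) if a.get(name) != b.get(name))


if __name__ == "__main__":
    # Example call (path taken from the listing above):
    #   python this_script.py C:\Python27\Lib\site-packages\HTSeq
    for name, size in list_package_files(sys.argv[1]):
        print("%10d  %s" % (size, name))
```

Two listings saved from different machines can be passed to `diff_listings` as lists of `(name, size)` pairs. An empty result means the file names and sizes agree.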


                            • #15
                              Here it is. Looks the same at first glance to me.

                              Code:
                              Directory of c:\Python27\Lib\site-packages\HTSeq
                              
                              04/18/2013  11:29 AM    <DIR>          .
                              04/18/2013  11:29 AM    <DIR>          ..
                              04/18/2013  11:29 AM    <DIR>          scripts
                              02/18/2013  11:06 AM            25,079 StepVector.py
                              04/18/2013  11:29 AM            32,148 StepVector.pyc
                              04/18/2013  11:29 AM            32,097 StepVector.pyo
                              04/18/2013  12:21 PM           248,832 _HTSeq.pyd
                              02/20/2013  11:19 AM             1,407 _HTSeq_internal.py
                              04/18/2013  11:29 AM             2,011 _HTSeq_internal.pyc
                              04/18/2013  11:29 AM             2,011 _HTSeq_internal.pyo
                              04/18/2013  12:21 PM            83,968 _StepVector.pyd
                              04/18/2013  12:17 PM                24 _version.py
                              04/18/2013  11:29 AM               168 _version.pyc
                              04/18/2013  11:29 AM               168 _version.pyo
                              04/18/2013  12:16 PM            32,996 __init__.py
                              04/18/2013  11:29 AM            36,726 __init__.pyc
                              04/18/2013  11:29 AM            36,495 __init__.pyo
                                            14 File(s)        534,130 bytes
