  • Linux system requirements for NGS analysis programs

    Hi,

    We're seeking basic set up advice and pointers on Linux system requirements as a new lab getting into next-gen data analysis.

    We may be missing some basic software or generic programs; some error messages suggest difficulty locating gcc. We're unsure about the best place to download GNU/GCC (to ensure we can use C++, amongst other things).

    Example error we've encountered:
    # When configuring ghc-6.12.7
    $ ...No acceptable C compiler found in $PATH

    We've been attempting to install and use Bowtie 0.12.7 on Illumina sequences of whole-genomic plant DNA (7 million x ~100 bp reads). The aim is to separate chloroplast sequences from other DNA sequences so that we can create better de novo assemblies of the chloroplast only. Bowtie looks good; however, we're having trouble getting it up and running, most likely because of general programming issues.

    To install Bowtie, pandoc must be installed, which requires the Haskell platform or GHC and so on. I haven’t been able to successfully configure and install any of these programs, partly because I don’t always understand the error messages. E.g.
    # When attempting to install the latest Haskell platform 2011-2.0.1 for Linux (our machine is running Kernel Linux 2.6.32-122el6.x86_64)
    $ …invalid configuration ‘x86_64-unkown-linux-‘: machine ‘x86_64-unknown-linux’ not recognized

    We'd appreciate advice on:
    -the basic system requirements to run command line programs e.g. GNU
    -general advice on installing and configuring Bowtie
    -other tips which are often 'assumed knowledge' but essential to be able to run command-line NGS programs effectively!

    Many thanks!

  • #2
    Phew, that's a lot to answer...

    But before anyone can do that: which Linux distribution are you using? There are differences in file system hierarchy, configuration data, etc.

    Originally posted by RBGSYD:

    Example error we've encountered:
    # When configuring ghc-6.12.7
    $ ...No acceptable C compiler found in $PATH

    For example: to run a program, you either need to specify the directory where it lies, or you put it into a directory that is listed in an environment variable called $PATH. (Usually one of the directories in $PATH is /usr/bin or /bin or something like that, depending on your distribution.)
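
A minimal sketch of the $PATH lookup described above, assuming a Bourne-style shell such as bash. The helper name `find_tool` and the directory `$HOME/bin` are illustrative assumptions and do not come from the thread.

```shell
#!/bin/sh
# Show the directories the shell searches, one per line.
echo "$PATH" | tr ':' '\n'

# find_tool (hypothetical helper): report whether a program is on $PATH.
find_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found in \$PATH"
        return 1
    fi
}

# Check for the C compiler that the configure script complained about.
find_tool gcc || true

# Prepend a directory (assumed here to be $HOME/bin) for this session only.
PATH="$HOME/bin:$PATH"
export PATH
```

If `find_tool gcc` reports "not found", the compiler is either not installed or sits in a directory that is not listed in $PATH.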

    We'd appreciate advice on:
    -the basic system requirements to run command line programs e.g. GNU
    -general advice on installing and configuring Bowtie
    -other tips which are often 'assumed knowledge' but essential to be able to run command-line NGS programs effectively!

    Many thanks!
    To execute a program, the executable bit must be set and you must have permission to execute it. You can make a file executable with:
    Code:
    chmod 755 yourprogram
    The program also has to be found via $PATH, as described above, or be called with its full path.
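
A small sketch of the chmod step, assuming a POSIX shell. The script name `hello.sh` and the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Create a throwaway script in a temporary directory (illustrative only).
workdir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from a script"\n' > "$workdir/hello.sh"

# 755 = owner may read/write/execute; group and others may read/execute.
chmod 755 "$workdir/hello.sh"
ls -l "$workdir/hello.sh"

# The directory is not on $PATH, so call the script by its path.
"$workdir/hello.sh"
```

The same pattern would apply to a downloaded binary: set the executable bit, then either call it by its path or place it in a directory listed in $PATH.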

    For installing various packages, I would recommend you familiarize yourself with your distribution's package management system (APT for Debian/Ubuntu, YUM on top of RPM for Fedora and Red Hat).
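
A rough sketch for finding out which package front end a machine offers, assuming a POSIX shell. The helper names `detect_pkg_manager` and `suggest_install` are hypothetical, and the script only prints a suggested command rather than running it.

```shell
#!/bin/sh
# detect_pkg_manager (hypothetical helper): print the first package
# front end found on $PATH, or "unknown" if none is present.
detect_pkg_manager() {
    for pm in yum dnf apt-get zypper; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    echo "unknown"
    return 1
}

# suggest_install (hypothetical helper): print, but do not run,
# an install command for the given package.
suggest_install() {
    pm=$(detect_pkg_manager) || { echo "no known package manager found"; return 1; }
    echo "sudo $pm install $1"
}

suggest_install gcc || true
```

Unlike installing a single package file by hand, these front ends resolve and download dependencies, provided a package repository is configured on the machine.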

    Concerning the assumed knowledge: Linux and Bash knowledge is indeed assumed in most bioinformatics software manuals, but there are a lot of resources on the net.

    For general-purpose Linux questions, I recommend looking into Linux forums dedicated to your distribution, as your questions are difficult to answer here and don't really fit into this forum...



    • #3
      Hi Peter,

      Thanks so much for your reply. I appreciate it is a lot to ask; at this stage, knowing which are the right questions to ask is almost as important!

      In reply to your comments:
      - We are running Linux 6.61 (Kernel Linux 2.6.32-122el6.x86_64), using RedHat
      - I am familiar with the requirements for $PATH but probably hadn't specified it correctly, so I'll try that again
      - Investigating RPM for RedHat sounds like a good way to go, thanks
      - Actively checking various Linux forums is certainly very helpful, although it's always great if other bioinformaticians have pointers on specific resources they found valuable in the set-up stages too.

      Thanks again!



      • #4
        I just saw that there are some precompiled binaries available for Bowtie. Did you try them? You wouldn't need GCC or other stuff to make it work: just download the Linux x86_64 binary, make it executable, and that's it (I guess; I didn't try it, though). Here's the link:
        Bowtie, an ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers. Please cite: Langmead B, et al.…



        • #5
          It has turned out to be due to us missing some primary programs -

          The precompiled binaries for Bowtie worked fine on another computer. A little comparison of installed programs showed we were missing gcc-4.4.4-13.el6.x86_64.rpm on our machine, hence it failed to run. We downloaded this program and attempted to install it but came across the error that various dependencies were missing, e.g. cloog-ppl, cpp and libcomp. After downloading a version of cloog-ppl to make up for its absence, installation failed due to "no acceptable C compiler" again.
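
The comparison of installed programs between two machines could be sketched as follows, assuming both run an RPM-based system. The helper name `missing_here` and the two tiny package lists are made up for illustration; real lists would come from the `rpm -qa` command shown in the comment.

```shell
#!/bin/sh
# On each machine, save a sorted list of installed package names, e.g.:
#   rpm -qa --qf '%{NAME}\n' | sort -u > packages_$(hostname).txt

# missing_here (hypothetical helper): print the names present in the
# second list but absent from the first. Both files must be sorted.
missing_here() {
    comm -13 "$1" "$2"
}

# Tiny made-up lists standing in for the two machines.
tmp=$(mktemp -d)
printf 'bash\ncoreutils\n' > "$tmp/broken.txt"
printf 'bash\ncoreutils\ncpp\ngcc\n' > "$tmp/working.txt"

# Prints the packages the working machine has and the broken one lacks.
missing_here "$tmp/broken.txt" "$tmp/working.txt"
```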

          This makes me think that when we recently rebuilt the machine we missed something vital! We'll be going back to the IT people who helped us set up the system for advice.



          • #6
            If it's just gcc that is missing, that is easy (assuming you have RPM installed).
            Just type
            Code:
            rpm -q gcc
            That will tell you whether gcc is installed and, if it is, give you the exact name of the package. With
            Code:
            rpm -i gcc-4.5.1.4.something
            you can install gcc from a downloaded package file. (Of course, you should substitute "something" with the rest of the actual package file name.) The same approach works if you are missing other programs as well...
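
A hedged sketch of how the query step might be scripted, assuming an RPM-based system. The helper name `is_installed` is hypothetical; `rpm -q` reports on installed packages, whereas `rpm -i` expects a package file and does not fetch missing dependencies by itself.

```shell
#!/bin/sh
# is_installed (hypothetical helper): succeed if the RPM database
# lists the named package as installed.
is_installed() {
    rpm -q "$1" >/dev/null 2>&1
}

if is_installed gcc; then
    echo "gcc is installed: $(rpm -q gcc)"
else
    echo "gcc is not installed (or rpm is unavailable)"
    # With a configured repository, yum also fetches dependencies such as cpp:
    #   sudo yum install gcc
    # For a downloaded file, list what it requires before installing:
    #   rpm -qpR gcc-....rpm
fi
```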

            Hope that helps



            • #7
              I forgot: you may need root rights, either via sudo or by logging in as root (in case you've got a root password). Then it would look like:
              Code:
              sudo rpm -i gcc-....
