  • New to Linux and Bioinformatics

    Hi all, I am sure there have been many posts like this in the past. I am new to bioinformatics and am doing targeted sequencing. I would love to learn all the tools I need, but all of them run on Linux, and I have no idea where to begin. Is it possible to learn Linux and the bioinformatics tools at the same time? If so, I would appreciate your suggestions.
    Thanks.
    John.

  • #2
    Hi John,

    Did you already choose and install a distro, or are you first awaiting some answers here? In that case, the choice for a distro is kind of personal. I'm a Fedora person (have been since 2006 I think, got slightly annoyed with XP) but you'll meet others who'd rather work with Ubuntu, Mint, or whatever.

    There's a nice project going on here called BioLinux. It's based on Ubuntu and has tons of bioinformatics tools packed with it. I have no idea how good it is, maybe someone else knows?

    I would like to suggest learning a bit about Linux first. I see a lot of Win users get stuck because they do not understand how Linux works. They try the Win approach, which doesn't work. An example: in Win, you may sometimes want to log in to your desktop environment as administrator to get stuff done. You don't do this in Linux. Most distros won't even allow it and in Ubuntu the root account is even locked by default. Instead, you use sudo from the commandline.

    Then I found this nice tutorial for those who have absolutely NO experience with the command line. It also explains how the file system is organised and has a section on man pages.

    Once you understand how to use the shell, learning how to use the bioinformatics tools should be easy. You can then also start learning shell scripting with bash, sed and awk. They provide very powerful means to edit the large NGS text files.
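To give a flavour of what that looks like, here is a minimal sketch using awk and sed on a FASTQ file. The file name and the three reads are made up for illustration, and it assumes the simple layout of exactly four lines per record.

```shell
# Build a tiny FASTQ file to practise on (made-up reads, 4 lines per record).
workdir=$(mktemp -d)
fastq="$workdir/sample.fastq"
printf '%s\n' \
  '@read1' 'ACGTACGT' '+' 'IIIIIIII' \
  '@read2' 'GGGTTT' '+' 'IIIIII' \
  '@read3' 'ACGTNNACGTAA' '+' 'IIIIIIIIIIII' > "$fastq"

# awk: each record is 4 lines, so the read count is the line count / 4.
read_count=$(awk 'END { print NR / 4 }' "$fastq")

# awk: sequences sit on line 2 of each record; track the longest one.
longest=$(awk 'NR % 4 == 2 && length($0) > max { max = length($0) }
               END { print max }' "$fastq")

# sed: print only the first record (lines 1 to 4) to eyeball the format.
first_record=$(sed -n '1,4p' "$fastq")

# awk: convert FASTQ to FASTA by keeping the header and sequence lines
# and swapping the leading "@" for ">".
fasta=$(awk 'NR % 4 == 1 { print ">" substr($0, 2) }
             NR % 4 == 2 { print }' "$fastq")

echo "reads: $read_count, longest read: $longest"
printf '%s\n' "$fasta"
```

Real FASTQ files are usually gzip-compressed and far larger, but the same one-liners work when fed from `zcat file.fastq.gz` through a pipe.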

    Have fun!


    • #3
      Hi John,

      I completely agree with everything stated by Bruins in the last post. Choosing a distro (I recommend Ubuntu for the new user) and becoming comfortable with shell commands will take you a long way in both Linux and bioinformatics. A large number of bioinformatics tools are command-line programs or shell scripts, so you are killing two birds with one stone by becoming comfortable with the shell.

      Bruins pointed out some nice online tutorials. I will plug something that I have been involved with, which is geared towards the new user. I help put together Workshops to train people on how to analyse genomic data. We are hosting one in Fort Collins, CO this summer, but also hold similar Workshops in Europe. If you are interested in attending, registration for the Fort Collins Workshop is currently available.

      If you don't have the time or funding to take the course, you can access our on-line tutorials. Almost all of the software packages have some install and usage instructions that should be useful in general, but sometimes are a bit specific to the Workshop. We have a very basic tutorial on how to use the terminal and how to use linux text editors. Links to all are below:

      HomePage: http://www.molecularevolution.org/
      Unix Tutorial: http://www.molecularevolution.org/re.../unix_tutorial
      Text editors: http://www.molecularevolution.org/re...x_text_editors

      If you use them and find them useful or have any recommendations on how to improve them please let us know. And good luck!!!

      SAH


      • #4
        Hi,

        As mentioned, BioLinux is a nice project to start with. I used it when I started out in this crazy bioinformatics world. It has a lot of tools already installed and a friendly user interface, but it's also a good place to start with command-line programs (blast, muscle, bwa, etc. are installed too), with lots of example data and tutorials.


        • #5
          Dear Bruins, Shandley, and cascoamarillo,
          Thanks for the suggestions. I have installed Ubuntu on my desktop, but have no idea how to download and install anything, because the instructions are all shell commands and other Linux-based steps. I hope to pick them up quickly and be able to do bioinformatics.

          I will also look at BioLinux and read some books.

          Any suggestions on books, or is it better to stick to online resources?
          Thanks again.
          John.


          • #6
            Online resources are better and cheaper: online tutorials, program manuals, mailing lists, forums (like this one), etc.

            Books are fine for getting a broad view of what's going on in bioinformatics. I've looked at several, and a few years ago I bought 'Bioinformatics' by Baxevanis and Ouellette. Nice book, but it's from 2005 (I don't know if there's a new edition).

            Good luck!


            • #7
              Thanks Amarillo. BTW, I was in Canyon for two years, close to Amarillo. Are you from Amarillo by any chance?


              • #8
                No, I've never been in Texas.


                • #9
                  Great choice Ubuntu/BioLinux, though Fedora is very solid also. For Linux you can find just about any answer on ubuntuforums.org, relevant to most distros. But keep a file in your home dir with the key tips you find online, including shortcuts, tricky syntax, file paths, etc. Very helpful until you start remembering it all!

                  For the commandline, try the Applications menu entry "Terminal". I highly recommend using "byobu" as your first command: it lets your session live on if you get disconnected or close the window, and has helpful shortcuts printed onscreen. In bioinformatics it's all about starting a long process and letting it churn in the background! Hit F2 to open a parallel commandline shell, and F3 and F4 to switch left and right among your commandlines. Very useful if you start some analysis but then want to check the progress. And if you ever log in remotely, you can use the same byobu session if you want, again very useful.

                  Similarly, 'nano' is a good commandline text editor, also with shortcuts onscreen. The first thing I'd do is run 'nano ~/.nanorc' and type in 'set nowrap' so it won't add return characters to wrap long lines.
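The same nanorc change can also be made from the shell without opening an editor. Below is a small sketch that appends the line only if it is missing. The helper name `add_nanorc_setting` is made up for this example, and the demo writes to a scratch file rather than the real `~/.nanorc`. As far as I know, nano 4.0 and later no longer hard-wrap long lines by default, so the setting may be unnecessary on a recent install.

```shell
# add_nanorc_setting is a hypothetical helper name, not part of nano.
# It appends a setting to a nanorc file only if that exact line is absent.
add_nanorc_setting() {
  local setting="$1"
  local rcfile="${2:-$HOME/.nanorc}"   # default to the user's own nanorc
  touch "$rcfile"                      # create the file if it does not exist
  # -q: quiet, -x: match the whole line, -F: fixed string instead of a regex
  if ! grep -qxF -- "$setting" "$rcfile"; then
    printf '%s\n' "$setting" >> "$rcfile"
  fi
}

# Try it on a scratch file first so the real ~/.nanorc is left untouched.
scratch_rc=$(mktemp)
add_nanorc_setting 'set nowrap' "$scratch_rc"
add_nanorc_setting 'set nowrap' "$scratch_rc"   # second call changes nothing
cat "$scratch_rc"
```

Calling `add_nanorc_setting 'set nowrap'` with no second argument would apply it to `~/.nanorc` itself.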

                  Everything else you'll find online in others' clever tips and tutorials (e.g. Ubuntu Wiki, BioLinux), but the two essentials are protecting your shell session from an accidentally closed window and being able to edit text. HTH!


                  • #10
                    Hm, my partner uses BioLinux. The nice thing is that it comes with a lot of pre-installed software. However, most of it I (and I guess many people) never use, and I prefer to have total control and overview over the software that I install. I use Kubuntu: it looks nice and has some very useful standard software.

                    Otherwise I recommend getting familiar with some basic command-line programs like sort, awk, sed, grep, wc, diff, uniq and so on. They provide a fast and easy way to manipulate data files (even knowing just a bit about them helps). A bit of bash can also be very useful.
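As a rough illustration of how these programs chain together, here is a sketch that runs on a made-up tab-separated file of read hits. The file names, column layout and values are assumptions for the example only.

```shell
# A made-up table: read id <TAB> chromosome <TAB> position.
workdir=$(mktemp -d)
hits="$workdir/hits.tsv"
printf 'r1\tchr1\t100\nr2\tchr2\t250\nr3\tchr1\t300\nr4\tchrX\t40\nr5\tchr1\t900\n' > "$hits"

# wc: count the records (tr strips padding that some wc versions add).
total=$(wc -l < "$hits" | tr -d ' ')

# grep: count the lines that contain chr1 as a whole word.
chr1_hits=$(grep -cw 'chr1' "$hits")

# awk + sort + uniq: hits per chromosome, most frequent first.
# uniq only merges adjacent duplicates, hence the sort in front of it.
per_chrom=$(awk '{ print $2 }' "$hits" | sort | uniq -c | sort -rn)
top_chrom=$(printf '%s\n' "$per_chrom" | awk 'NR == 1 { print $2 }')

# sed + diff: rename a chromosome, then count the lines that changed.
# diff exits non-zero when files differ, so "|| true" keeps the pipeline quiet.
sed 's/chrX/chr23/' "$hits" > "$workdir/renamed.tsv"
changed=$({ diff "$hits" "$workdir/renamed.tsv" || true; } | grep -c '^>')

echo "total=$total chr1=$chr1_hits top=$top_chrom changed=$changed"
printf '%s\n' "$per_chrom"
```

Each step reads plain text and writes plain text, which is why the tools combine so freely with pipes.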


                    • #11
                      I agree with everything said above, but learning a scripting language for basic data manipulation is essential in bioinformatics. I don't want to reignite the Perl vs. Python debate, but either of these languages will solve many problems that will arise in your bioinformatics future!

                      Of course, command-line use of basic Linux tools like sed, awk, grep etc. is necessary as well. The basic concepts of programming take some time to learn, but (in my opinion) they are worth the time in bioinformatics.


                      • #12
                        Best option

                        This article might help you decide which Linux distro is best for you and then install bioinformatics tools.


                        • #13
                          Thanks a lot guys. I have not been into this for the last few months. I think it's time to get back into it. I will start learning things and will ask you guys more questions as they arise.
                          John.
