  • Recommended Windows-based freeware for sequence alignment and variant calling

    Hi SEQanswers,

    I am very new to NGS and am currently investigating the best Windows-based freeware for sequence alignment and variant calling. We aim to use the Illumina GAIIx platform. I do not have access to a UNIX server, and my research dept is unlikely to have the funds required to purchase any of the commercial products that are available.

    Many thanks for your help,

    Jonathan

  • #2
    Most of the applications you'll want are designed to run on Linux/OSX, unfortunately. Rather than limiting yourself to Windows apps, have you considered installing a Linux virtual machine on your Windows machine? Something like VMware Player is free for non-commercial use, and will let you run a fully functional Linux server in a window on your Windows machine. You can then install and play around with all the Linux apps you want on the VM.

    Computing performance shouldn't be significantly worse than a native Linux install, though disc performance may suffer a little, depending on your setup. If you have a reasonably modern Windows machine but you don't want to dual-boot Linux for whatever reason, this is probably a good approach to take.

    EDIT: Also, Mac OSX is Unix-based, and so most freeware apps that compile on Linux can also be compiled on it. If you have access to a reasonably new Mac you could just use that instead.
    Last edited by Rocketknight; 02-14-2012, 05:39 AM.



    • #3
      Try Cygwin. It will allow you to run Linux programs on Windows: http://www.cygwin.com/



      • #4
        Hello jcgrant31
        You might want to check out UGene.
        They have video tutorials on YouTube!
        Cheers.



        • #5
          Originally posted by chadn737
          Try Cygwin. It will allow you to run Linux programs on Windows:

          http://www.cygwin.com/
          It does, to a certain extent, but it can be quite difficult for beginners to compile certain programmes. I second the suggestion to investigate a Linux virtual machine or a dual-boot Windows/Linux machine.



          • #6
            I second the VM install as well... Or look at using Galaxy on the cloud on a pay-per-hour basis.

            but
            Originally posted by jcgrant31
            We aim to use the Illumina GAIIx platform. I do not have access to a UNIX server, and my research dept is unlikely to have the funds required to purchase any of the commercial products that are available.
            sounds to me like your research department has money for doing sequencing (or even buying a GAIIx), but not for spending some 5-15 k$ for a decent workstation and software? You should tell them that you can't get any results without proper analysis tools, which include computer hardware and software.



            • #7
              Originally posted by arvid
              sounds to me like your research department has money for doing sequencing (or even buying a GAIIx), but not for spending some 5-15 k$ for a decent workstation and software? You should tell them that you can't get any results without proper analysis tools, which include computer hardware and software.
              $5-15k seems a little excessive, a modern machine with 8-16GB of RAM will be fine for low data volumes. You might even get away with 4GB if you dual-boot, but a VM will require a little more memory.

              @jcgrant31: Don't panic if it feels like everyone's giving you conflicting suggestions and you haven't a clue what's going on, everyone had to start somewhere with bioinformatics. If you've no idea how to dual-boot or install more RAM in your computer or whatever, feel free to ask.

              The one piece of advice I would give is that although there are ways like Cygwin and UGene to get some alignment software working on your Windows machine, in the long run you are probably better off getting a Linux VM or a dual-boot. A lot of downstream tools you might want to run won't be available for Windows, and almost all the guides and tutorials in this area are written for a Linux environment. It'll make it a lot easier to learn bioinformatics if you have a stable Linux environment to work in.



              • #8
                Originally posted by jcgrant31
                Hi SEQanswers,

                I am very new to NGS and am currently investigating the best Windows-based freeware for sequence alignment and variant calling. We aim to use the Illumina GAIIx platform. I do not have access to a UNIX server, and my research dept is unlikely to have the funds required to purchase any of the commercial products that are available.

                Many thanks for your help,

                Jonathan
                I do my NGS pipeline development on a VirtualBox virtual machine running Debian. My desktop machine is Windows 7 - but it's 8 cores with 16GB of RAM, most of which are given over to the Linux VM.

                I don't think you will have a great deal of success finding something for Windows where most development is focused on developing Linux tools. I don't recommend Cygwin, although this is personal taste. It's far easier to deal with a VM for my money than Cygwin.
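For anyone wanting to script the VM route rather than click through the VirtualBox GUI, here is a minimal sketch of how the resource split described above might be computed and turned into `VBoxManage` command lines. The VM name `ngs-debian`, the 4 GB / 2 core reserve left for Windows, and the `Debian_64` OS type are illustrative assumptions, not values from this thread. The commands only create and size the VM; attaching a disk and an installer image is still left to you.

```python
import shlex


def guest_share(host_ram_gb, host_cores, reserve_ram_gb=4, reserve_cores=2):
    """Return (memory_mb, cpus) for the guest, keeping a reserve for the Windows host.

    The default reserve sizes are assumptions chosen for illustration.
    """
    if host_ram_gb <= reserve_ram_gb or host_cores <= reserve_cores:
        raise ValueError("host is too small to keep the requested reserve")
    memory_mb = (host_ram_gb - reserve_ram_gb) * 1024
    cpus = host_cores - reserve_cores
    return memory_mb, cpus


def vbox_commands(name, memory_mb, cpus, ostype="Debian_64"):
    """Build the VBoxManage argument lists that register and size a VM."""
    return [
        ["VBoxManage", "createvm", "--name", name, "--ostype", ostype, "--register"],
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb), "--cpus", str(cpus)],
    ]


if __name__ == "__main__":
    # An 8-core, 16 GB host like the one described in the post above.
    memory_mb, cpus = guest_share(host_ram_gb=16, host_cores=8)
    for command in vbox_commands("ngs-debian", memory_mb, cpus):
        # Print rather than execute, so the commands can be reviewed first.
        print(shlex.join(command))
```

The script prints the commands instead of running them, so you can check the numbers before handing most of the machine to the guest.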



                • #9
                  Originally posted by Rocketknight
                  $5-15k seems a little excessive, a modern machine with 8-16GB of RAM will be fine for low data volumes. You might even get away with 4GB if you dual-boot, but a VM will require a little more memory.
                  It depends what you consider necessary. I agree that an 8-16 GB workstation on Linux with free software is good enough for typical analysis pipelines, but you should also consider mid- to long-term data safety (a requirement in grant proposals here, at least). Then you need something like a NAS with an off-site backup plan, which doesn't come for free. And if you want commercial NGS software with a GUI, you would spend a few k$ on that.

                  My point is more that when you already plan to spend several k$ on single sequencing experiments, invest tens of k$ a year in lab equipment to keep a small lab running, and spend hundreds of k$ on personnel, then a yearly investment of a few k$ in proper IT infrastructure (local or outsourced) can easily be anticipated in the research budget.

                  If we keep telling our bosses that we can do the analyses properly with an old laptop from five years ago, or whatever other hardware standing around, of course they say, fine, let's use that money for more experiments or a new PCR machine.



                  • #10
                    Originally posted by arvid
                    It depends what you consider necessary. I agree that an 8-16 GB workstation on Linux with free software is good enough for typical analysis pipelines, but you should also consider mid- to long-term data safety (a requirement in grant proposals here, at least). Then you need something like a NAS with an off-site backup plan, which doesn't come for free. And if you want commercial NGS software with a GUI, you would spend a few k$ on that.
                    +1

                    Whilst I might *develop* on a desktop machine, and it would suffice for most analyses (providing I wanted to do them really slowly), I run my analyses on dedicated servers, with dedicated storage and backups to tape with an off-site backup plan.

                    You don't want to invest a huge chunk of time in NGS analysis only to lose the source of your primary data when the disk in your desktop goes *pop*.



                    • #11
                      If you're going to be doing the bioinformatics analysis yourself and you're academic or nonprofit, then you might want to have a look at Real Time Genomics software. It's command-line based and runs on Windows, Linux and Mac. They have a free individual user license which might suit your requirements.

                      I've not used it on Windows personally but have used it on Linux a lot, and liked the software enough that we did end up purchasing a license for our cluster. It is also very fast software, which could be rather important if you're limited to a desktop or two for all your processing.


                      If that doesn't seem suitable then, like the others, my recommendation is to install VirtualBox and use it to create a Linux VM to do all your analysis in. It'll give you access to the widest range of tools possible.

                      Either way, if you're working on a desktop make sure you back up often, and that you back up both your raw data and your working data. If you've got a decent amount of compute power, it's not a huge issue if a hard drive dies and you lose all your BAMs/alignments or variant calls, as you can just toss it all back onto the cluster and a few days later you've regenerated all your working files. With a workstation, though, if you've only got 8 cores then what took someone like me a couple of days with 200 cores could take you a month or more to recover from.
                      Last edited by aeonsim; 02-15-2012, 03:37 AM.
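As a rough illustration of the backup advice above, the sketch below copies a file to a second location and checks that the copy matches the original by comparing SHA-256 checksums. The function names and the idea of a single backup directory are assumptions made for the example; a real setup would more likely use a NAS or off-site storage, as discussed earlier in the thread.

```python
import hashlib
import shutil
from pathlib import Path


def sha256sum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte FASTQ/BAM files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy source into backup_dir and verify the copy; return the new path."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256sum(source) != sha256sum(target):
        raise OSError(f"checksum mismatch for {target}")
    return target
```

Calling `backup_file` once on raw reads and again on working files (with hypothetical paths of your own) covers both kinds of data mentioned above. The checksum step matters because a copy that silently failed is no backup at all.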



                      • #12
                        One problem with Cygwin (and, the last time I checked, the bwa build in UGene) is that you are limited to 32-bit applications, which can be inadequate for many common NGS tasks. 64-bit Linux will be less trouble than any other platform, and you can use standard tools and pipelines based on bwa, samtools, gatk, picard, etc. A virtual machine might be adequate for this, though you'll get more performance with a direct Linux installation - better to run Windows as the guest OS in a VM (dual booting is another option, but tends to be a pain in practice). I agree with the other posters that what you really need is a better balance between the money spent on sequencing and on the resources required for analysis.
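To give a feel for what a "standard" bwa + samtools pipeline looks like on 64-bit Linux, here is a minimal sketch that only builds the shell commands for indexing a reference, aligning paired-end reads, and sorting and indexing the result. The file names and thread count are placeholders. It assumes a recent bwa (the `mem` subcommand was released after this thread) and samtools 1.x option syntax, so treat it as an outline rather than a recipe. Variant calling with GATK or similar would follow and is not shown.

```python
import shlex


def alignment_commands(reference, reads_1, reads_2, sample, threads=4):
    """Return shell command strings for a basic paired-end alignment run."""
    ref = shlex.quote(reference)
    r1 = shlex.quote(reads_1)
    r2 = shlex.quote(reads_2)
    bam = shlex.quote(f"{sample}.sorted.bam")
    return [
        # Build the reference index once per reference genome.
        f"bwa index {ref}",
        # Align the read pairs and pipe straight into a coordinate sort.
        f"bwa mem -t {threads} {ref} {r1} {r2} | samtools sort -o {bam} -",
        # Index the sorted BAM so downstream tools can seek by region.
        f"samtools index {bam}",
    ]


if __name__ == "__main__":
    for command in alignment_commands("ref.fa", "sample_1.fq", "sample_2.fq", "sample"):
        print(command)
```

Printing the commands first makes it easy to paste them into a terminal one at a time while you are still learning what each step does.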



                        • #13
                          Dear all,

                          Please check http://bow.codeplex.com/
                          Just released BWA on x64 Windows, natively compiled on Windows without using any 3rd-party library.

                          Best,

                          dong
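Since the 32-bit versus 64-bit question comes up several times in this thread, a quick way to see which kind of process you are actually running may help. The sketch below reports the pointer width of the current Python interpreter and the theoretical address space for a given width. It says nothing about any particular aligner build, only about the interpreter it runs in.

```python
import struct


def pointer_bits():
    """Return the pointer width, in bits, of the running Python process."""
    return struct.calcsize("P") * 8


def address_space_gib(bits):
    """Return the theoretical address space for a pointer width, in GiB."""
    return 2 ** bits / 2 ** 30


if __name__ == "__main__":
    print(f"This Python is a {pointer_bits()}-bit process")
    print(f"A 32-bit process can address at most {address_space_gib(32):.0f} GiB")
```

The 4 GiB figure is an upper bound. The operating system reserves part of that range, so a 32-bit program gets less in practice.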

