  • polyhedron
    Member
    • Aug 2009
    • 12

    Looking for gapped (indel) alignment software

    I have single-end Solexa data now and will also be working with paired-end (PE) data later.

    I tried SOAP2 with the -g 5 option for gapped alignment, but nothing was mapped properly at known indels. I've seen someone on SEQanswers report successful gapped alignment, but that may have been with SOAP rather than SOAP2. The developer says SOAP2 can do gapped PET alignment, but does "gapped" there mean the interval between the two ends rather than indels?

    Bowtie says it doesn't support gapped alignment yet.

    Can anyone tell me which free software can do gapped alignment (for indels) on both SE and PE data? Many thanks!
    Last edited by polyhedron; 10-14-2010, 09:09 AM.
  • francois.sabot
    Member
    • Dec 2009
    • 41

    #2
    I would say that BWA handles indels...
    Francois Sabot, PhD

    Be realistic. Demand the Impossible.
    www.wikiposon.org


    • Michael.James.Clark
      Senior Member
      • Apr 2009
      • 207

      #3
      Novoalign, BWA, BFAST

      Oh, Novoalign is not exactly free. You can get a free trial from them, though, and it is not very expensive.
      Last edited by Michael.James.Clark; 10-14-2010, 01:18 AM.
      Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
      Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
      Projects: U87MG whole genome sequence [Website] [Paper]


      • polyhedron
        Member
        • Aug 2009
        • 12

        #4
        Thanks very much, François and Michael!

        I've been testing BWA these days and find that it has considerably better mappability than SOAP2 and handles indels well. The trade-off is that BWA is 3 or 4 times slower than SOAP2, and I haven't found output options as flexible as those in SOAP2, such as

        -r INT How to report repeat hits, 0=none; 1=random one; 2=all, [1]
        Therefore, when I have a really large data set, I would rather run SOAP2 first to get a basic idea of the results from my reads, and then let BWA run for days to get more precise results.
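One way to check whether an aligner really placed gaps at known indels is to look at the CIGAR column of its SAM output, where insertions and deletions appear as I and D operations. The following is only a minimal sketch under that assumption; the function names are made up for illustration and it reads plain SAM text, not BAM.

```python
import re

# Matches one CIGAR operation, e.g. "30M" or "2D".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indels_in_cigar(cigar):
    """Return (op, length) pairs for insertions (I) and deletions (D)."""
    return [(op, int(n)) for n, op in CIGAR_OP.findall(cigar) if op in "ID"]


def gapped_reads(sam_lines):
    """Yield (read_name, position, indels) for mapped records with an indel."""
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:               # not a complete alignment record
            continue
        name, flag, pos, cigar = fields[0], int(fields[1]), int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":    # 0x4 = read is unmapped
            continue
        indels = indels_in_cigar(cigar)
        if indels:
            yield name, pos, indels
```

Passing an open SAM file to `gapped_reads` and comparing the reported positions with the known indel sites gives a quick sanity check before any proper variant calling.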
        Last edited by polyhedron; 10-20-2010, 06:44 AM.


        • sci_guy
          Member
          • Jan 2008
          • 83

          #5
          SHRiMP is very robust to INDELs, but is very slow (as it performs Smith-Waterman alignments).
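To illustrate why Smith-Waterman copes with indels and why it is slow, here is a toy local aligner with a linear gap penalty. It is a textbook sketch, not SHRiMP's implementation, and the scoring values are arbitrary assumptions. The nested loop fills one cell per read base per reference base, which is where the cost comes from.

```python
def smith_waterman(read, ref, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_read, aligned_ref) for one best local alignment."""
    rows, cols = len(read) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix: O(len(read) * len(ref)) cells.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            diag = score[i - 1][j - 1] + sub   # align two bases
            up = score[i - 1][j] + gap         # extra base in the read (insertion)
            left = score[i][j - 1] + gap       # base missing from the read (deletion)
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the local alignment starts.
    i, j = best_pos
    a_read, a_ref = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if read[i - 1] == ref[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            a_read.append(read[i - 1]); a_ref.append(ref[j - 1])
            i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            a_read.append(read[i - 1]); a_ref.append("-")
            i -= 1
        else:
            a_read.append("-"); a_ref.append(ref[j - 1])
            j -= 1
    return best, "".join(reversed(a_read)), "".join(reversed(a_ref))
```

For a read that lacks one reference base, the gapped alignment scores higher than either ungapped half, so the deletion shows up as a "-" in the aligned read.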


          • lry198010
            Member
            • Aug 2008
            • 13

            #6
            -r INT How to report repeat hits, 0=none; 1=random one; 2=all, [1]
            You can try the -n and -N options in the sampe or samse subcommands!
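As a rough sketch of how those options fit into a BWA run, the helpers below only build the command lines; they do not execute anything. The flag meanings follow my reading of the BWA 0.5.x usage text (-o for gap opens in aln, -n for the number of hits reported in the XA tag, -N in sampe only), so please check `bwa aln`, `bwa samse` and `bwa sampe` on your installed version. The file names, function names and numeric values are placeholders.

```python
def bwa_aln_cmd(index_prefix, fastq, gap_opens=1, threads=1):
    """Command for 'bwa aln'; the .sai output goes to stdout and must be redirected."""
    return ["bwa", "aln", "-o", str(gap_opens), "-t", str(threads),
            index_prefix, fastq]


def bwa_samse_cmd(index_prefix, sai, fastq, max_hits=3):
    """Command for single-end 'bwa samse'; -n caps the hits listed in the XA tag."""
    return ["bwa", "samse", "-n", str(max_hits), index_prefix, sai, fastq]


def bwa_sampe_cmd(index_prefix, sai1, sai2, fastq1, fastq2,
                  max_hits_paired=3, max_hits_discordant=10):
    """Command for paired-end 'bwa sampe'; -n and -N cap the reported hits."""
    return ["bwa", "sampe",
            "-n", str(max_hits_paired),
            "-N", str(max_hits_discordant),
            index_prefix, sai1, sai2, fastq1, fastq2]
```

Each list can be handed to `subprocess.run(cmd, stdout=outfile, check=True)` with `outfile` opened for writing, since these subcommands print their result to standard output.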


            • drio
              Senior Member
              • Oct 2008
              • 323

              #7
              Originally posted by polyhedron
              Therefore, when I have a really large data set, I would rather run SOAP2 first to get a basic idea of the results from my reads, and then let BWA run for days to get more precise results.
              Days? BWA is very fast and accurate. How much data are you trying to align, and in what environment? A lane of HiSeq data (75 bp, single-end) should take less than a day on a typical 8-core machine. You can reduce the running time further by splitting the data and aligning the pieces on multiple machines.
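A minimal sketch of the splitting step, assuming plain 4-line FASTQ records (no wrapped sequence lines). The function names and the output naming scheme are invented for illustration; each resulting chunk can then be aligned independently and the outputs merged afterwards.

```python
from itertools import islice


def fastq_chunks(handle, reads_per_chunk):
    """Yield lists of FASTQ lines, each holding up to reads_per_chunk records."""
    handle = iter(handle)                 # accept files or any iterable of lines
    lines_per_chunk = 4 * reads_per_chunk
    while True:
        chunk = list(islice(handle, lines_per_chunk))
        if not chunk:
            return
        if len(chunk) % 4:
            raise ValueError("truncated FASTQ record at end of input")
        yield chunk


def split_fastq(path, reads_per_chunk, prefix):
    """Write chunks to prefix.0000.fq, prefix.0001.fq, ... and return the paths."""
    paths = []
    with open(path) as handle:
        for n, chunk in enumerate(fastq_chunks(handle, reads_per_chunk)):
            out_path = f"{prefix}.{n:04d}.fq"
            with open(out_path, "w") as out:
                out.writelines(chunk)
            paths.append(out_path)
    return paths
```

On a single multi-core machine it may be simpler to skip the splitting and let BWA use several threads instead.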
              -drd


              • Michael.James.Clark
                Senior Member
                • Apr 2009
                • 207

                #8
                Yeah, I'm with drio on this one. Something is wrong with your approach using BWA if it's taking days, even for a very large data set. You may not be running it in an optimal fashion for your computational resources if that's the case.

                All three gapped aligners that I mentioned--Novoalign, BFAST, BWA--are quite fast with Illumina reads, so if it seems slow, there may be an incorrect setting or it may be getting run in a suboptimal way.


                • olesk
                  Junior Member
                  • Nov 2010
                  • 6

                  #9
                  Depending on the size of your reference sequence and amount of sequence data you could test if my program R2R is your solution.
                  Find it at: http://milne.ruc.dk/R2R/


                  • lexa
                    Member
                    • Jun 2010
                    • 17

                    #10
                    There is a new mapper available called STAMPY (http://www.well.ox.ac.uk/project-stampy). The paper seems promising, but I haven't tested it myself.


                    • colindaven
                      Senior Member
                      • Oct 2008
                      • 417

                      #11
                      The best tools I've used for alignment of small indels are Stampy and Shrimp2. We've validated quite a number of these.
                      Unfortunately they are slow as well.

                      As a follow-up I've been using Dindel on Stampy aligned data for further testing. No wet-lab data on this yet though.


                      • ttnguyen
                        Member
                        • Mar 2010
                        • 41

                        #12
                        Novoalign and BWA work quite well for this problem. BWA is faster than the free version of Novoalign (which does not support MPI), but Novoalign may be more accurate.
                        It might be worth taking a look at these surveys - though they are not the most up to date:

