  • Bowtie to BWA parameters

    I use a pipeline that uses bowtie as the aligner. However, due to genome size, I have had to alter this pipeline to use BWA. I need to run BWA in a way that gives results as equivalent as possible to how I run bowtie, but I am not entirely certain how to do this.

    Here are the arguments we use in running bowtie:

    -S -k 1 -m 1 --chunkmbs 3072 --best --strata -o 4 -e 80 -l 20 -n 0

    Anyone have any suggestions on the best way to run BWA?

  • #2
    assuming you mean bwa 'aln' and not 'mem'...

    -S -k1 -m1 --chunkmbs --best and --strata all correspond to how bwa works by default.

    -l 20 in bowtie is -l 20 in bwa.
    -n 0 in bowtie is -k 0 in bwa
    -e will be impossible to mimic because bwa doesn't care about base qualities. However, you can tinker with -n (in bwa) to adjust the mismatch allowance, either by setting a strict limit (-n INT) or an automatic limit based on read length (0 < -n < 1).
    The -o option is not mirrored in bwa.

    The only other difference is that bowtie does not report gapped alignments and BWA does. I have tried to figure out how to disable this behavior in BWA, though it seems like the -i option should be able to kill them off (-i is the number of bases from the ends of a read within which indels are disallowed). In most aligners, setting this kind of option to a value equal to or greater than your read length disables gaps.
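A minimal sketch of the bwa 'aln' mapping described above, written as a Python helper that only assembles an argument list. The function name and the `read_length` and `max_mismatches` parameters are hypothetical, input and output files are left out, and the -i setting is the untested guess from this post, not a confirmed way to turn off gaps.

```python
def bwa_aln_args(read_length, max_mismatches):
    """Assemble `bwa aln` options that roughly mirror the bowtie settings above.

    Hypothetical helper: it only builds the option list, it does not run bwa.
    """
    return [
        "bwa", "aln",
        "-l", "20",                 # seed length, same meaning as bowtie -l 20
        "-k", "0",                  # no edits allowed in the seed (bowtie -n 0)
        "-n", str(max_mismatches),  # overall edit limit, a stand-in for bowtie -e
        "-i", str(read_length),     # assumption: blocks indels across the whole read
    ]


if __name__ == "__main__":
    # Example: 50bp reads, at most 2 edits per read.
    print(" ".join(bwa_aln_args(read_length=50, max_mismatches=2)))
```

The reference index and read file still have to be appended to this list before it can be passed to something like `subprocess.run`.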

    If you're talking about using bwa 'mem' then things work a little differently and you actually have much less control.

    FYI it sounds like the new bowtie2 release can handle genomes > 4GB. bowtie2 is easier to configure in a way that mimics the behavior of bowtie 1. To the best of my knowledge you're going to have to change your mismatch allowance rule no matter what, because bowtie1 is the only one I know of that uses that -e setting (where the sum of base quals of mismatches is used as a limit).
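To make the -e rule concrete, here is a rough sketch of the check it describes: add up the base qualities at the mismatched positions and compare the total with the limit. The function name and inputs are hypothetical, and this is a simplification that uses the raw quality values and assumes an ungapped alignment of equal-length strings. It is not bowtie's actual code.

```python
def passes_e_limit(read, ref, quals, e=80):
    """Return True if the summed quality at mismatched positions is <= e.

    read, ref: aligned sequences of equal length (no gaps).
    quals: one Phred quality value per read base.
    """
    mismatch_quality_sum = sum(
        q for r, g, q in zip(read, ref, quals) if r != g
    )
    return mismatch_quality_sum <= e


if __name__ == "__main__":
    # One mismatch at the last base with quality 40: total 40, within -e 80.
    print(passes_e_limit("ACGT", "ACGA", [30, 30, 30, 40]))
```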

    To match your options with bowtie2 (and produce un-gapped alignments) you can use the following:

    --gbar <read length or larger> --mp A,B --np 1 --score-min L,0,C -L 20 -N 0

    A, B and C should be replaced with values that you can tailor for mismatch allowance. bowtie2 relies on a minimum alignment score for reporting alignments, so to control mismatches you have to be specific about mismatch penalties. The --mp option sets the penalties for high-qual and low-qual bases: for a penalty of 2 for high qual and 1 for low qual you'd use --mp 2,1. The --score-min option sets the minimum score relative to your read length. The way I have it written, the minimum score is read_length*C, where C is negative. So with --score-min L,0,-0.04 and 100bp reads you're allowing a minimum score of -4, which could be spent on any mix of -2 and -1 penalties for high-qual and low-qual mismatches (assuming --mp 2,1), for example two high-qual mismatches or four low-qual ones.
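The arithmetic above can be sketched in a few lines of Python. This assumes the simplified two-class view used in this post (one penalty for high-qual mismatches, one for low-qual) and ignores N penalties and gaps. The function names are made up for illustration and are not part of bowtie2.

```python
def min_score(read_length, c, intercept=0.0):
    """Linear minimum score for --score-min L,<intercept>,<c>."""
    return intercept + c * read_length


def allowed_mismatch_combos(read_length, c, high_pen=2, low_pen=1):
    """List (high_qual, low_qual) mismatch counts whose total penalty fits.

    high_pen and low_pen correspond to A and B in --mp A,B.
    """
    # Small tolerance so floating-point noise does not drop a valid combo.
    budget = -min_score(read_length, c) + 1e-9
    combos = []
    for high in range(int(budget // high_pen) + 1):
        remaining = budget - high * high_pen
        for low in range(int(remaining // low_pen) + 1):
            combos.append((high, low))
    return combos


if __name__ == "__main__":
    print(min_score(100, -0.04))
    for high, low in allowed_mismatch_combos(100, -0.04):
        print(f"{high} high-qual + {low} low-qual mismatches")
```

With 100bp reads, C = -0.04 and --mp 2,1, the penalty budget is 4, so the list runs from no mismatches up to two high-qual mismatches, four low-qual mismatches, or one high-qual plus two low-qual.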

    By the way I've found bowtie2 to be very good at reproducing correct alignments in simulations. I'd say it's a great upgrade to the performance of bowtie1.
    Last edited by sdriscoll; 02-15-2014, 12:12 AM. Reason: updated knowledge
    /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
    Salk Institute for Biological Studies, La Jolla, CA, USA */
