Bruins 07-06-2010 07:08 AM

BWA - input files

I am trying to align paired-end reads (Illumina) to a reference genome using BWA. I have 10 read files, 5 for each direction.

In this old post 'totalnew' says to align the files separately:

bwa aln database.fasta 4_1.fq > 1_1.fq.sai
bwa aln database.fasta 4_2.fq > 1_2.fq.sai
bwa aln database.fasta 5_1.fq > 2_1.fq.sai
bwa aln database.fasta 5_2.fq > 2_2.fq.sai

I have been reading for quite a while now, so sorry if I missed something totally obvious, but how do I run sampe from there on? Can I just say

bwa sampe database.fa 1_1.fq.sai 2_1.fq.sai 3_1... 1_2.fq.sai 2_2.fq.sai 3_2... 1_1.fq 2_1.fq 3_1... 1_2.fq 2_2.fq 3_2... > alignment.sam
or do I need to run sampe for each file separately?
What if I concatenate all read files into s_1_sequence.txt and s_2_sequences.txt and then run bwa aln twice and bwa sampe once?


ps (offtopic) @lh3: I tried downloading bwa 0.5.8a but it seems as though there are files missing. Here is what I get:

bunzip2 bwa-0.5.8a.tar.bz2
tar -xf bwa-0.5.8a.tar
ls bwa-0.5.8a
bntseq.c  bwase.c    bwtaln.c  bwtgap.h    bwtio.c    bwtsw2_aux.c    bwtsw2_main.c  is.c    kstring.c  main.h        simple_dp.c    utils.c
bntseq.h  bwase.h    bwtaln.h  bwt_gen    bwt_lite.c  bwtsw2_chain.c  ChangeLog      khash.h  kstring.h  Makefile  utils.h
bwa.1    bwaseqio.c  bwt.c    bwt.h      bwt_lite.h  bwtsw2_core.c  COPYING        kseq.h  kvec.h    NEWS          stdaln.c
bwape.c  bwa.txt    bwtgap.c  bwtindex.c  bwtmisc.c  bwtsw2.h        cs2nt.c        ksort.h  main.c  stdaln.h

I downloaded 0.5.7 and it runs fine.

BENM 07-06-2010 07:39 AM

I think you'd better run sampe for each pair of PE files separately:

bwa sampe [options] <prefix> <in1.sai> <in2.sai> <in1.fq> <in2.fq>

Concatenating all read files into two PE files is okay, but I recommend splitting them (for example, 100M reads per file) and running them on different CPU cores or compute nodes (MPI mode), so that it is faster.
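
A minimal sketch of the per-pair approach, assuming the read files are named <lane>_1.fq and <lane>_2.fq and that database.fasta has already been indexed with bwa index. The function name pair_commands, the lane numbers and the output names are made up for illustration. It only prints the commands, one aln/aln/sampe chain per pair, so they can be checked first and then run or handed to a job scheduler:

```shell
#!/bin/sh
# Reference used in the thread; assumed to be indexed already (bwa index).
ref=database.fasta

# Hypothetical helper: print the three BWA commands for one pair of read files.
pair_commands() {
    # $1 = lane prefix, e.g. "4" for 4_1.fq and 4_2.fq (assumed naming)
    lane=$1
    echo "bwa aln $ref ${lane}_1.fq > ${lane}_1.sai"
    echo "bwa aln $ref ${lane}_2.fq > ${lane}_2.sai"
    echo "bwa sampe $ref ${lane}_1.sai ${lane}_2.sai ${lane}_1.fq ${lane}_2.fq > ${lane}.sam"
}

# Lanes 4 and 5 are only examples; list all of your own prefixes here.
for lane in 4 5; do
    pair_commands "$lane"
done
```

Each pair is independent of the others, so the printed chains can run on separate cores or nodes at the same time.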

Bruins 07-06-2010 11:43 PM


Thanks for your reply.

So you suggest running sampe at least five times and then merging the resulting sam files?

I will try this.


*** edit
Thanks, works fine.
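
For anyone finding this later, a rough sketch of one way to do the merge in plain shell. It assumes every sam file was produced against the same reference, so the header lines (those starting with "@") are identical and can be taken from the first file only. The function name merge_sam is made up for illustration, and this is not necessarily what was run above:

```shell
#!/bin/sh
# Hypothetical helper: concatenate SAM files, keeping a single header.
# Assumes all inputs were aligned against the same reference.
merge_sam() {
    # $1 = output file, remaining arguments = input SAM files
    out=$1
    shift
    # Header lines start with "@"; copy them from the first input only.
    grep '^@' "$1" > "$out"
    # Append the alignment (non-header) lines of every input.
    for f in "$@"; do
        grep -v '^@' "$f" >> "$out"
    done
}
```

Usage would look like merge_sam alignment.sam 4.sam 5.sam, with the file names being examples. The output is not sorted, and if the headers differ between files (for example, different read groups) a dedicated tool such as samtools merge on BAM files is the safer route.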
