07-06-2010, 08:08 AM   #1
Location: Groningen

Join Date: Feb 2010
Posts: 78
BWA - input files


I am trying to align paired-end reads (Illumina) to a reference genome using BWA. I have 10 read files, 5 for each direction.

In this old post, 'totalnew' says to align the files separately:
bwa aln database.fasta 4_1.fq > 1_1.fq.sai
bwa aln database.fasta 4_2.fq > 1_2.fq.sai
bwa aln database.fasta 5_1.fq > 2_1.fq.sai
bwa aln database.fasta 5_2.fq > 2_2.fq.sai
I have been reading for quite a while now, so sorry if I missed something totally obvious, but how do I run sampe from there on? Can I just say
bwa sampe database.fasta 1_1.fq.sai 2_1.fq.sai 3_1... 1_2.fq.sai 2_2.fq.sai 3_2... 1_1.fq 2_1.fq 3_1... 1_2.fq 2_2.fq 3_2... > alignment.sam
or do I need to run sampe for each file separately?
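For reference, the usage line of bwa sampe lists exactly one pair of .sai files followed by the one matching pair of read files per call, so the per-file variant would presumably look like the sketch below. It only prints the commands instead of running them. The function name sampe_commands, the <lane>_1.fq / <lane>_2.fq naming, and the <lane>.sam output names are assumptions for illustration, not anything taken from the BWA documentation.

```shell
#!/bin/sh
# Sketch: print (not run) one "bwa sampe" command per lane pair.
# Assumptions: reads are named <lane>_1.fq / <lane>_2.fq, their alignments
# <lane>_1.fq.sai / <lane>_2.fq.sai, and the index prefix is database.fasta.
REF=database.fasta

sampe_commands() {
    # One invocation per lane: the two .sai files first,
    # then the two read files they were produced from.
    for lane in "$@"; do
        echo "bwa sampe $REF ${lane}_1.fq.sai ${lane}_2.fq.sai ${lane}_1.fq ${lane}_2.fq > ${lane}.sam"
    done
}

sampe_commands 1 2 3 4 5
```

Piping the printed lines into sh would execute them, giving one SAM file per lane that would still have to be merged afterwards.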
What if I concatenate all read files into s_1_sequence.txt and s_2_sequence.txt and then run bwa aln twice and bwa sampe once?
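The concatenation route could be sketched as follows, again only printing the commands. The one thing that has to hold is that both combined files list the lanes in the same order, so that read N in the forward file still belongs to read N in the reverse file. The final step uses sampe, the paired-end counterpart of samse. The function name concat_commands, the input naming, and the output file names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: print (not run) the commands for the concatenation route.
# Assumptions: inputs are named <lane>_1.fq / <lane>_2.fq, the index prefix
# is database.fasta, and the output names below are placeholders.
REF=database.fasta

concat_commands() {
    fwd=""
    rev=""
    # Build both lists in the same lane order so that read N in the
    # forward file still pairs with read N in the reverse file.
    for lane in "$@"; do
        fwd="$fwd ${lane}_1.fq"
        rev="$rev ${lane}_2.fq"
    done
    echo "cat$fwd > s_1_sequence.txt"
    echo "cat$rev > s_2_sequence.txt"
    echo "bwa aln $REF s_1_sequence.txt > s_1.sai"
    echo "bwa aln $REF s_2_sequence.txt > s_2.sai"
    echo "bwa sampe $REF s_1.sai s_2.sai s_1_sequence.txt s_2_sequence.txt > alignment.sam"
}

concat_commands 1 2 3 4 5
```

This yields a single alignment.sam at the cost of two long aln runs, whereas running each lane separately keeps the jobs small and easy to parallelise.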


ps (offtopic) @lh3: I tried downloading bwa 0.5.8a, but it seems as though some files are missing. Here is what I get:
bunzip2 bwa-0.5.8a.tar.bz2
tar -xf bwa-0.5.8a.tar
ls bwa-0.5.8a
bntseq.c  bwase.c     bwtaln.c  bwtgap.h    bwtio.c     bwtsw2_aux.c    bwtsw2_main.c  is.c     kstring.c  main.h        simple_dp.c     utils.c
bntseq.h  bwase.h     bwtaln.h  bwt_gen     bwt_lite.c  bwtsw2_chain.c  ChangeLog      khash.h  kstring.h  Makefile  utils.h
bwa.1     bwaseqio.c  bwt.c     bwt.h       bwt_lite.h  bwtsw2_core.c   COPYING        kseq.h   kvec.h     NEWS          stdaln.c
bwape.c   bwa.txt     bwtgap.c  bwtindex.c  bwtmisc.c   bwtsw2.h        cs2nt.c        ksort.h  main.c  stdaln.h
I downloaded 0.5.7 and it runs fine.
Bruins