Old 07-06-2010, 08:08 AM   #1
Location: Groningen

Join Date: Feb 2010
Posts: 78
BWA - input files


I am trying to align paired-end Illumina reads to a reference genome using BWA. I have 10 read files, 5 for each direction.

In this old post 'totalnew' says to align the files separately:
bwa aln database.fasta 4_1.fq > 1_1.fq.sai
bwa aln database.fasta 4_2.fq > 1_2.fq.sai
bwa aln database.fasta 5_1.fq > 2_1.fq.sai
bwa aln database.fasta 5_2.fq > 2_2.fq.sai
I have been reading for quite a while now, so sorry if I missed something totally obvious, but how do I run sampe from there on? Can I just say
bwa sampe database.fa 1_1.fq.sai 2_1.fq.sai 3_1... 1_2.fq.sai 2_2.fq.sai 3_2... 1_1.fq 2_1.fq 3_1... 1_2.fq 2_2.fq 3_2... > alignment.sam
or do I need to run sampe for each file separately?
What if I concatenate all read files into s_1_sequence.txt and s_2_sequence.txt and then run bwa aln twice and bwa sampe once?


P.S. (off-topic) @lh3: I tried downloading bwa 0.5.8a but it seems as though there are files missing. Here is what I get:
bunzip2 bwa-0.5.8a.tar.bz2
tar -xf bwa-0.5.8a.tar
ls bwa-0.5.8a
bntseq.c  bwase.c     bwtaln.c  bwtgap.h    bwtio.c     bwtsw2_aux.c    bwtsw2_main.c  is.c     kstring.c  main.h        simple_dp.c     utils.c
bntseq.h  bwase.h     bwtaln.h  bwt_gen     bwt_lite.c  bwtsw2_chain.c  ChangeLog      khash.h  kstring.h  Makefile  utils.h
bwa.1     bwaseqio.c  bwt.c     bwt.h       bwt_lite.h  bwtsw2_core.c   COPYING        kseq.h   kvec.h     NEWS          stdaln.c
bwape.c   bwa.txt     bwtgap.c  bwtindex.c  bwtmisc.c   bwtsw2.h        cs2nt.c        ksort.h  main.c  stdaln.h
I downloaded 0.5.7 and it runs fine.
Bruins
Old 07-06-2010, 08:39 AM   #2
Location: PRC

Join Date: May 2009
Posts: 33

I think you'd better run sampe for each pair of PE files separately:

bwa sampe [options] <prefix> <in1.sai> <in2.sai> <in1.fq> <in2.fq>

Concatenating all read files into two PE files is okay, but I recommend splitting them (for example, 100M reads per file) and running them on different CPU cores or compute nodes (MPI mode), so that the alignment finishes faster.
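
A minimal sketch of the per-pair approach, assuming the reads are named <lane>_1.fq and <lane>_2.fq and the reference has already been indexed with bwa index. The function names pair_commands and all_pair_commands and the output names <lane>.sam are made up for illustration; the functions only print the commands, so they can be inspected before anything is run.

```shell
# Print the three BWA commands needed for one paired-end lane.
# $1 = reference prefix, $2 = lane name (reads in <lane>_1.fq and <lane>_2.fq)
pair_commands() {
    ref=$1
    lane=$2
    # One bwa aln call per direction, each producing its own .sai file.
    printf 'bwa aln %s %s_1.fq > %s_1.sai\n' "$ref" "$lane" "$lane"
    printf 'bwa aln %s %s_2.fq > %s_2.sai\n' "$ref" "$lane" "$lane"
    # One bwa sampe call per pair: <prefix> <in1.sai> <in2.sai> <in1.fq> <in2.fq>
    printf 'bwa sampe %s %s_1.sai %s_2.sai %s_1.fq %s_2.fq > %s.sam\n' \
        "$ref" "$lane" "$lane" "$lane" "$lane" "$lane"
}

# Print the commands for several lanes: reference first, then the lane names.
all_pair_commands() {
    ref=$1
    shift
    for lane in "$@"; do
        pair_commands "$ref" "$lane"
    done
}
```

For five lanes, all_pair_commands database.fasta 1 2 3 4 5 | sh would run everything one after another. To use several cores or nodes, the output of each pair_commands call could instead be handed to a separate job, since the lanes do not depend on each other.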
BENM
Old 07-07-2010, 12:43 AM   #3
Location: Groningen

Join Date: Feb 2010
Posts: 78


Thanks for your reply.

So you suggest running sampe at least five times and then merging the resulting SAM files?

I will try this.


*** edit
Thanks, works fine.
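
For anyone wondering about the merge step, here is a rough sketch that may help. It assumes every SAM file was produced against the same reference, so the header lines (those starting with @) are identical, and it keeps only the header of the first file. The function name merge_sam is made up, and the result is not sorted.

```shell
# merge_sam OUT IN1 [IN2 ...]
# Concatenate SAM files into OUT, keeping the header of IN1 only.
merge_sam() {
    out=$1
    shift
    # Header lines start with '@'; take them from the first input file.
    awk '/^@/' "$1" > "$out"
    # Append the alignment records (non-header lines) of every input file.
    for sam in "$@"; do
        awk '!/^@/' "$sam" >> "$out"
    done
}
```

Usage would look like merge_sam alignment.sam 1.sam 2.sam 3.sam 4.sam 5.sam. If samtools is installed, converting each file to sorted BAM and using samtools merge is probably the more robust route.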

Last edited by Bruins; 07-12-2010 at 02:47 AM.
Bruins

All times are GMT -8.
