SEQanswers

07-06-2010, 07:08 AM   #1
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78
BWA - input files

Hi,

I am trying to align paired-end reads (Illumina) to a reference genome using BWA. I have 10 read files, 5 for each direction.

In this old post 'totalnew' says to align the files separately:
Code:
bwa aln database.fasta 4_1.fq > 1_1.fq.sai
bwa aln database.fasta 4_2.fq > 1_2.fq.sai
bwa aln database.fasta 5_1.fq > 2_1.fq.sai
bwa aln database.fasta 5_2.fq > 2_2.fq.sai
[...]
I have been reading for quite a while now, so sorry if I missed something totally obvious, but how do I run sampe from there on? Can I just say
Code:
bwa sampe database.fa 1_1.fq.sai 2_1.fq.sai 3_1... 1_2.fq.sai 2_2.fq.sai 3_2... 1_1.fq 2_1.fq 3_1... 1_2.fq 2_2.fq 3_2... > alignment.sam
or do I need to run sampe for each file separately?
What if I concatenate all read files into s_1_sequence.txt and s_2_sequence.txt, and then run bwa aln twice and bwa sampe once?

cheers!

ps (offtopic) @lh3: I tried downloading bwa 0.5.8a but it seems as though there are files missing. Here is what I get:
Code:
wget http://sourceforge.net/projects/bio-bwa/files/bwa-0.5.8a.tar.bz2/download
bunzip2 bwa-0.5.8a.tar.bz2
tar -xf bwa-0.5.8a.tar
ls bwa-0.5.8a
bntseq.c  bwase.c     bwtaln.c  bwtgap.h    bwtio.c     bwtsw2_aux.c    bwtsw2_main.c  is.c     kstring.c  main.h        simple_dp.c     utils.c
bntseq.h  bwase.h     bwtaln.h  bwt_gen     bwt_lite.c  bwtsw2_chain.c  ChangeLog      khash.h  kstring.h  Makefile      solid2fastq.pl  utils.h
bwa.1     bwaseqio.c  bwt.c     bwt.h       bwt_lite.h  bwtsw2_core.c   COPYING        kseq.h   kvec.h     NEWS          stdaln.c
bwape.c   bwa.txt     bwtgap.c  bwtindex.c  bwtmisc.c   bwtsw2.h        cs2nt.c        ksort.h  main.c     qualfa2fq.pl  stdaln.h
I downloaded 0.5.7 and it runs fine.
07-06-2010, 07:39 AM   #2
BENM
Member
 
Location: PRC

Join Date: May 2009
Posts: 33

I think you had better run sampe for each pair of PE files separately:

bwa sampe [options] <prefix> <in1.sai> <in2.sai> <in1.fq> <in2.fq>

Concatenating all read files into two PE files is okay, but I recommend splitting them (for example, 100M reads per file) and running the chunks on different CPU cores or compute nodes (MPI mode), so that it finishes faster.
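
To make the per-pair approach concrete, here is a minimal sketch that prints one set of bwa aln / bwa sampe commands per lane. The reference name database.fasta, the lane numbers 1 to 5 and the <lane>_1.fq / <lane>_2.fq naming are assumptions for illustration, and the reference is assumed to have been indexed already with bwa index. The helper gen_pe_commands is hypothetical and not part of BWA.

```shell
#!/bin/sh
# Sketch: emit one bwa aln/sampe command set per paired-end lane.
# REF and the lane identifiers below are assumptions, not taken from BWA.
REF=database.fasta

gen_pe_commands() {
    # $1 = reference prefix, remaining arguments = lane identifiers
    ref=$1; shift
    for lane in "$@"; do
        # One aln run per mate file, then one sampe run per pair.
        echo "bwa aln $ref ${lane}_1.fq > ${lane}_1.sai"
        echo "bwa aln $ref ${lane}_2.fq > ${lane}_2.sai"
        echo "bwa sampe $ref ${lane}_1.sai ${lane}_2.sai ${lane}_1.fq ${lane}_2.fq > ${lane}.sam"
    done
}

# Print the commands for review; nothing is executed here.
gen_pe_commands "$REF" 1 2 3 4 5
```

Because the script only prints the commands, they can be checked first and then piped to sh, or handed out to different cores or nodes, since each lane is independent of the others.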
07-06-2010, 11:43 PM   #3
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78

Hi,

Thanks for your reply.

So you suggest running sampe at least five times and then merging the resulting SAM files?

I will try this.

Cheers!

*** edit
Thanks, works fine.
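
For anyone finding this later, one way the merge step could look is sketched below. It assumes all SAM files were aligned against the same reference, so their header lines (the ones starting with @) are identical and only need to be kept once. The function name merge_sam and the file names in the usage comment are hypothetical. The result is not sorted; for sorted BAM files, samtools merge is probably the better tool.

```shell
#!/bin/sh
# Sketch: concatenate several SAM files into one, keeping a single header.
# Assumes every input was aligned against the same reference.

merge_sam() {
    # $1 = output file, remaining arguments = input SAM files
    out=$1; shift
    # Header lines start with '@'; copy them from the first input only.
    grep '^@' "$1" > "$out"
    # Append the alignment records (non-header lines) of every input.
    for f in "$@"; do
        grep -v '^@' "$f" >> "$out"
    done
}

# Hypothetical usage:
#   merge_sam alignment.sam 1.sam 2.sam 3.sam 4.sam 5.sam
```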

Last edited by Bruins; 07-12-2010 at 01:47 AM.

All times are GMT -8. The time now is 01:41 AM.

