hello,
ugh... this is such a stupid question, i'm embarrassed to ask it, but i must. i can't find a description of the workflow for mapping paired-end reads with bwa anywhere (am i simply missing it?). i'm starting with fastq sequence files and i know how to index the genome. compiling info from the snooping i've done, this is what i believe the workflow to be:
$ bwa aln genome.index.fa s_4_1_sequence.fastq > s_4_1.sai
$ bwa aln genome.index.fa s_4_2_sequence.fastq > s_4_2.sai
$ bwa sampe genome.index.fa s_4_1.sai s_4_2.sai s_4_1_sequence.fastq s_4_2_sequence.fastq > s_4.sam
then, using samtools and s_4.sam i can create .bam and .bam.bai files.
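to be concrete, here's a rough sketch of the samtools step i have in mind. it's written as a dry run, so it only prints the commands unless DRY_RUN=0 is set. the `run` and `sam_to_bam` helpers and the output names (s_4.bam, s_4.sorted.bam) are my own placeholders, and the `samtools sort -o` form assumes samtools 1.x (older releases take an output prefix instead, as far as i can tell).

```shell
#!/bin/sh
# Sketch: convert a SAM file into a sorted, indexed BAM.
# Assumes samtools 1.x is on PATH; helper and file names are placeholders.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

sam_to_bam() {
    sam="$1"
    base="${sam%.sam}"
    # SAM -> BAM
    run samtools view -b -o "$base.bam" "$sam"
    # coordinate-sort the BAM (required before indexing)
    run samtools sort -o "$base.sorted.bam" "$base.bam"
    # writes $base.sorted.bam.bai next to the sorted BAM
    run samtools index "$base.sorted.bam"
}

sam_to_bam s_4.sam
```

with DRY_RUN left at its default this just echoes the three samtools commands, so i can eyeball them before running anything for real.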
is this correct? i think i understand the -b1 and -b2 options, but those are only used when starting w/ .bam files, which i'm not.
what is a little confusing to me is the two separate calls to 'bwa aln'. i suppose it's wildly wrong to (unix) cat the two s_4_[12]_sequence.fastq files together and make one call to 'bwa aln'?
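since sampe takes the two .sai files and the two fastq files separately, my assumption is that mates are matched up by their order within each file. one thing i can at least sanity-check before aligning is that both files hold the same number of reads. a minimal sketch (the function names are mine, and it assumes plain 4-line fastq records with no wrapped sequence lines):

```shell
#!/bin/sh
# Sketch: check that two FASTQ files hold the same number of reads.
# Assumes exactly 4 lines per record (no wrapped sequence/quality lines).
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

check_pairs() {
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    if [ "$n1" -eq "$n2" ]; then
        echo "ok: $n1 read pairs"
    else
        echo "mismatch: $n1 vs $n2 reads" >&2
        return 1
    fi
}

# usage: check_pairs s_4_1_sequence.fastq s_4_2_sequence.fastq
```

this only compares counts, not read names, so it wouldn't catch files that are the same length but out of order.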
thanks for your help,
mike