Old 06-08-2014, 03:06 AM   #5
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 838

Here's a rough idea of how to do bam2fastq:
Code:
samtools view file.bam | awk -F '\t' '{print "@"$1"\n"$10"\n+\n"$11}' > file.fastq
Unfortunately this gives you a single FASTQ file with both reads of each pair mixed together, which can be a bit of a pain to use. You can get around that with the flag filters of samtools view (-f to require flag bits, -F to exclude them), reading through the BAM file twice:

Code:
samtools view -f 0x40 file.bam | awk -F '\t' '{print "@"$1"\n"$10"\n+\n"$11}' > file_R1.fastq
samtools view -f 0x80 file.bam | awk -F '\t' '{print "@"$1"\n"$10"\n+\n"$11}' > file_R2.fastq
The SAM file format specification is your friend; see section 1.4.
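
One thing the one-liners above gloss over: reads that mapped to the reverse strand (flag bit 0x10) are stored reverse-complemented in the BAM file, so printing fields 10 and 11 as-is won't give back the original read. Here's a rough sketch of a helper that undoes that. The function name sam_to_fastq is made up for illustration, and it assumes tab-separated SAM records on stdin with no header lines (which is what samtools view prints by default). The printf line just feeds it one hand-made record so you can try it without a BAM file.

```shell
# Hypothetical helper: convert SAM records on stdin to FASTQ on stdout.
# Assumes tab-separated SAM lines without header lines.
sam_to_fastq() {
  awk -F '\t' '
    # Reverse-complement a sequence; anything other than A/C/G/T/N becomes N.
    function revcomp(s,    i, c, out) {
      out = ""
      for (i = length(s); i >= 1; i--) {
        c = substr(s, i, 1)
        out = out ((c in comp) ? comp[c] : "N")
      }
      return out
    }
    # Reverse a string (used for the quality values).
    function reverse(s,    i, out) {
      out = ""
      for (i = length(s); i >= 1; i--) out = out substr(s, i, 1)
      return out
    }
    BEGIN {
      comp["A"] = "T"; comp["C"] = "G"; comp["G"] = "C"; comp["T"] = "A"; comp["N"] = "N"
    }
    {
      seq = $10; qual = $11
      # Flag bit 0x10 (16) set: the read is stored reverse-complemented.
      # Integer arithmetic is used so this works in awks without and().
      if (int($2 / 16) % 2 == 1) { seq = revcomp(seq); qual = reverse(qual) }
      print "@" $1 "\n" seq "\n+\n" qual
    }'
}

# Made-up record with flag 83 (paired, proper pair, reverse strand, first in pair).
printf 'r1\t83\tchr1\t100\t60\t4M\t=\t50\t-54\tAACG\tIIHG\n' | sam_to_fastq
```

With a real file you would use it in place of the awk command, e.g. samtools view -f 0x40 -F 0x900 file.bam | sam_to_fastq > file_R1.fastq, where -F 0x900 also drops secondary and supplementary alignments so each read only turns up once.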

The process of BAM -> FASTQ -> Tophat is slower in terms of computer time, but from your description it sounds like it will be quicker in terms of bum-on-seat time.

Last edited by gringer; 06-08-2014 at 03:08 AM.