Hello,
Overall, I need to extract paired and mate reads (Illumina) in FASTQ format from a specific region of a BAM file (generated with Bowtie2). For this purpose I have used SAMtools, bam2fastq and bam2fastx; however, I am not obtaining all the reads that I would expect.
This is what I am doing:
I used the following SAMtools commands to generate BAM files for my mate and paired reads:
Code:
$ samtools view -u -F4 alignment.bam 'myregion' > mapped_reads.bam
$ samtools view -u -F8 mapped_reads.bam > paired_reads.bam
$ samtools view -u -f8 mapped_reads.bam > mate_reads.bam
The numbers of reads I would expect (counted in each BAM file with samtools view -c) are: 2845 mapped reads, 2695 paired reads and 150 mate reads.
The next step is to extract each set of reads into FASTQ format. I used the (TopHat) bam2fastx tool to extract the mate reads:
Code:
$ bam2fastx -q -A -o mate_reads.fastq mate_reads.bam
For the paired reads I used the bam2fastq tool, which generates two FASTQ files, but the read counts do not correspond to the numbers described above. This is the command I used and its output:
Code:
$ bam2fastq -o paired_reads#.fastq paired_reads.bam
This looks like paired data from lane 223.
Output will be in paired_reads_1.fastq and paired_reads_2.fastq
2695 sequences in the BAM file
2695 sequences exported
WARNING: 15 reads could not be matched to a mate and were not exported
The warning message reports that 15 reads do not have a mate.
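For reference, here is a minimal sketch of what the three samtools filters above select, assuming the standard SAM FLAG bits (0x4 = read unmapped, 0x8 = mate unmapped). The function names and the example FLAG values are made up for illustration; this is not samtools code and the values are not taken from my alignment.

```python
# SAM FLAG bits as defined in the SAM specification.
READ_UNMAPPED = 0x4   # the read itself is unmapped
MATE_UNMAPPED = 0x8   # the read's mate is unmapped


def keep_F(flag, mask):
    """Mimic `samtools view -F mask`: keep the read if none of the mask bits are set."""
    return (flag & mask) == 0


def keep_f(flag, mask):
    """Mimic `samtools view -f mask`: keep the read if all of the mask bits are set."""
    return (flag & mask) == mask


def classify(flag):
    """Hypothetical helper: report which of the three BAM files a read would end up in."""
    if not keep_F(flag, READ_UNMAPPED):
        return "dropped"            # removed by -F4
    if keep_f(flag, MATE_UNMAPPED):
        return "mate_reads.bam"     # kept by -f8
    return "paired_reads.bam"       # kept by -F8


# Illustrative FLAG values:
# 99  = paired, proper pair, mate on reverse strand, first in pair
# 73  = paired, mate unmapped, first in pair
# 133 = paired, read unmapped, second in pair
for flag in (99, 73, 133):
    print(flag, classify(flag))
```

As far as I understand, these filters test only the FLAG bits of each individual record; they do not check whether a read's mate is also present in the same output file.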
I cannot explain the 15 unmatched reads. Should these 15 reads be in the mate_reads.bam file? Is there a problem with the samtools flags I used to extract the reads?
Any suggestions would be appreciated. Thanks!
Regards
Héctor Spitia