Hi all,
I am working with ChIP-seq data and want to filter the reads with the following criteria: (A) mappable reads; (B) unique alignments (most likely based on MAPQ score) and (C) not more than 1 mismatches. I want the final output as BED file.
I have BAM files, and am doing the following to achieve A, B, C.
The NM tag in the BAM file could be used for filtering based on number of mismatches. Is there a way I can do all filtering in one step (is there a SAMtools command to get this?).
Thanks for any help or pointers.
I am working with ChIP-seq data and want to filter the reads with the following criteria: (A) mappable reads; (B) unique alignments (most likely based on MAPQ score) and (C) not more than 1 mismatches. I want the final output as BED file.
I have BAM files, and am doing the following to achieve A, B, C.
Code:
samtools view -b -F 4 -q 20 temp.bam > mapped_temp1.bam bamToBed -i mapped_temp1.bam > mapped_temp1.bed bamToBed -i mapped_temp1.bam -tag NM > mapped_temp2.bed paste mapped_temp1.bed mapped_temp2.bed | awk '$11 == 0' > final.bed # Field 11 is the NM score paste mapped_temp1.bed mapped_temp2.bed | awk '$11 == 1' >> final.bed more final.bed | cut -f1-6 > temp_FILTERED.bed # Retain fields 1-6
Thanks for any help or pointers.
Comment