View Single Post
Old 06-30-2010, 04:10 PM   #1
Location: canada

Join Date: May 2010
Posts: 16
Default BWA unique mapping, mutireads

Question: What is the best way to remove reads that map to multiple locations?

What I have done so far:
I have a 76x2 paired end data which I mapped to hg18 using BWA.
I want to get rid of reads that map to multiple locations, so I was
told to use the cutoff of -q 1 in samtools view;
however, to my surprise, I have reads with non-zero quality but appear to map to alternative locations with the same score.
For example, this read maps to chr1 and also to chrM

PF_Y_612GW_I581:100617:1:21:8990:3211#0 89      chr1    557906  23      76M     *       0       0       CCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGC    HHHHDHH>HBHHHHHHGHHHH>HGHHHHHHHHGFHHHHHFHGGDGGDHHHHFHFHHHHHHHHFHHHEHHHGGDGGG    XT:A:U  NM:i:0  SM:i:23 AM:i:0  X0:i:1  X1:i:1  XM:i:0  XO:i:0  XG:i:0  MD:Z:76 XA:Z:chrM,-7493,76M,1;
XA tag is the BWA optional field and if I understand correctly it means that this sequence also mapped to chrM with 100% matches;
so it appears that it is not sufficient to use the -q 1 as the flag to remove multiple hits.

Can someone tell me if I am correct; and also what is the best way to get rid of reads that align non-uniquely on the genome.

christophpale is offline   Reply With Quote