SEQanswers

Old 06-30-2010, 05:10 PM   #1
christophpale
Member
 
Location: canada

Join Date: May 2010
Posts: 16
Default BWA unique mapping, multireads

Hi,
Question: What is the best way to remove reads that map to multiple locations?

What I have done so far:
I have 2x76 bp paired-end data, which I mapped to hg18 using BWA.
I want to get rid of reads that map to multiple locations, so I was
told to use the cutoff -q 1 in samtools view.
However, to my surprise, I have reads with non-zero mapping quality that appear to map to alternative locations with the same score.
For example, this read maps to chr1 and also to chrM:

Code:
PF_Y_612GW_I581:100617:1:21:8990:3211#0 89      chr1    557906  23      76M     *       0       0       CCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGC    HHHHDHH>HBHHHHHHGHHHH>HGHHHHHHHHGFHHHHHFHGGDGGDHHHHFHFHHHHHHHHFHHHEHHHGGDGGG    XT:A:U  NM:i:0  SM:i:23 AM:i:0  X0:i:1  X1:i:1  XM:i:0  XO:i:0  XG:i:0  MD:Z:76 XA:Z:chrM,-7493,76M,1;
The XA tag is a BWA optional field and, if I understand correctly, it means that this sequence also mapped to chrM with 100% matches.
So it appears that -q 1 alone is not sufficient to remove multiple hits.

Can someone tell me if I am correct, and also what is the best way to get rid of reads that align non-uniquely to the genome?
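
For reference, the BWA manual gives the XA format as chr,pos,CIGAR,NM; so the trailing 1 in the chrM entry is an edit distance of 1, while X0:i:1 and X1:i:1 count one best hit and one suboptimal hit. A minimal Python sketch of how these tags could be inspected follows. The helper names are illustrative, not part of BWA or samtools, and SEQ and QUAL are replaced by "*" to keep the record short.

```python
# The record from the post, rebuilt with tab separators as in a real SAM file.
# SEQ and QUAL are replaced by "*" for brevity (an assumption of this sketch).
record = "\t".join([
    "PF_Y_612GW_I581:100617:1:21:8990:3211#0", "89", "chr1", "557906", "23",
    "76M", "*", "0", "0", "*", "*",
    "XT:A:U", "NM:i:0", "SM:i:23", "AM:i:0", "X0:i:1", "X1:i:1",
    "XM:i:0", "XO:i:0", "XG:i:0", "MD:Z:76", "XA:Z:chrM,-7493,76M,1;",
])


def parse_tags(sam_line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM line as a dict."""
    tags = {}
    for item in sam_line.rstrip("\n").split("\t")[11:]:
        name, typ, value = item.split(":", 2)
        tags[name] = int(value) if typ == "i" else value
    return tags


def alternative_hits(tags):
    """Split the XA tag into (chrom, signed_pos, cigar, edit_distance) tuples."""
    hits = []
    for entry in tags.get("XA", "").split(";"):
        if entry:
            chrom, pos, cigar, nm = entry.split(",")
            hits.append((chrom, int(pos), cigar, int(nm)))
    return hits


def has_equally_good_hit(tags):
    """True if X0 reports several best hits or an XA hit is as good as the primary."""
    if tags.get("X0", 1) > 1:
        return True
    return any(hit[3] <= tags.get("NM", 0) for hit in alternative_hits(tags))


tags = parse_tags(record)
print(alternative_hits(tags), has_equally_good_hit(tags))
```

Under these assumptions the chrM hit is reported as a one-mismatch alternative, not as an equally good placement.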


thanks
Christoph
Old 07-02-2010, 07:21 AM   #2
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

Mapping quality. See samtools FAQ.
Old 07-07-2010, 02:22 AM   #3
hash
Junior Member
 
Location: London

Join Date: May 2010
Posts: 3
Default Mismatch threshold for both reads in a paired-end run

Hi,

One of my colleagues has found that when he aligns 80 bp paired-end reads using bwa, the mismatch threshold applies to only one of the two reads, and the other can have any number of mismatches. For example, one read of a pair can have the default two mismatches while its mate has 45 mismatches. Is there an option in bwa to apply the mismatch threshold to both reads of a pair? Having to parse the resulting SAM file would be quite unproductive. Please give me some good news with regard to phantom parameters. I look forward to hearing from you.

Harshil
Old 07-07-2010, 06:18 AM   #4
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

It cannot be 45 mismatches. It must be soft clipping.

For PE reads, bwa runs Smith-Waterman (SW) alignment for the unpaired mate. SW is much more permissive. This procedure is described in the paper; it is not a phantom parameter. You may filter out these alignments if you do not like them.
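
A minimal Python sketch of such a filter, assuming plain-text SAM input. It counts soft-clipped bases (CIGAR operation S) and drops alignments above a chosen limit. The function names, the max_soft_clip parameter and the 35M45S example are made up for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar):
    """Total number of soft-clipped bases in a CIGAR string ("*" gives 0)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op == "S")


def keep_alignment(sam_line, max_soft_clip=0):
    """Keep an alignment only if it has at most max_soft_clip clipped bases."""
    cigar = sam_line.split("\t")[5]
    return soft_clipped_bases(cigar) <= max_soft_clip


# Hypothetical 80 bp mate rescued by SW with 45 soft-clipped bases.
clipped = "read1\t147\tchr1\t100\t20\t35M45S\t=\t50\t-85\t*\t*"
full = "read1\t99\tchr1\t50\t37\t80M\t=\t100\t85\t*\t*"
print(keep_alignment(clipped), keep_alignment(full))
```

For paired-end data it is probably safer to drop or keep both mates together, which this sketch does not do.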
Old 07-29-2010, 03:25 PM   #5
donniemarco
Member
 
Location: USA

Join Date: Aug 2009
Posts: 17
Default bwa minimum map quality

Is there such a thing as a minimum or optimal MAPQ in bwa?

I plan to take uniquely mapped query sequences with the highest MAPQ, but I still cannot figure out which cutoff to use for the analysis.

Any help is appreciated!
Old 08-06-2010, 05:58 AM   #6
aleferna
Senior Member
 
Location: sweden

Join Date: Sep 2009
Posts: 121
Default

Minimum MapQ value in BWA.

I ran some simulations with bwa in single-end mapping, and it really depends on the read length and the number of mismatches. For bwa aln (the short-read aligner) you should use MapQ > 30; anything less than that you pretty much cannot trust. If you use bwasw you can use a MapQ of 10. These two settings should give you >99% accuracy in the mapping, especially if you increase the sensitivity in bwasw (Z > 100). This, however, is for single-end mapping, not paired-end mapping.
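
A minimal Python sketch of a MAPQ cutoff on plain-text SAM lines, similar in spirit to samtools view -q, which skips alignments with MAPQ below the given value. The function name and the three example lines are made up. min_mapq=31 stands for "MapQ > 30".

```python
def filter_by_mapq(sam_lines, min_mapq):
    """Yield header lines and alignments whose MAPQ (column 5) is >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@") or int(line.split("\t")[4]) >= min_mapq:
            yield line


# Hypothetical input: one header line and two made-up alignments.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t37\t76M\t*\t0\t0\t*\t*",
    "read2\t0\tchr1\t200\t23\t76M\t*\t0\t0\t*\t*",
]

kept = list(filter_by_mapq(example, min_mapq=31))
print(len(kept))
```

With this cutoff a read with MAPQ 23, like the one in the first post, would be discarded.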

Hint: 45 mismatches is probably a paired-end adapter. Did you mask your sequences? It's a good idea to check your fastq file for the Solexa PE adapters. In my case I had a lot of contamination from these adapters, which makes it harder to map. I have a small Python script to filter a fastq file if you want it.
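
The script itself is not shown in the thread. As a rough idea of what such a filter could look like, here is a minimal sketch that discards every read containing a given adapter sequence. It assumes 4-line FASTQ records, and the ADAPTER value is a placeholder, not a real Solexa PE adapter.

```python
def read_fastq(lines):
    """Yield (header, sequence, plus, quality) tuples from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        seq, plus, qual = next(it), next(it), next(it)
        yield tuple(s.rstrip("\n") for s in (header, seq, plus, qual))


def drop_adapter_reads(records, adapter):
    """Return only the records whose sequence does not contain the adapter."""
    return [rec for rec in records if adapter not in rec[1]]


ADAPTER = "ACGTACGTAC"  # placeholder; substitute the adapter used in your library
fastq = [
    "@r1", "TTTTGGGGCCCCAAAA", "+", "IIIIIIIIIIIIIIII",
    "@r2", "TTTTACGTACGTACAA", "+", "IIIIIIIIIIIIIIII",
]

clean = drop_adapter_reads(read_fastq(fastq), ADAPTER)
print([rec[0] for rec in clean])
```

For paired-end runs both mates would need to be dropped together so that the two files stay in sync. This sketch does not handle that.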

Last edited by aleferna; 08-06-2010 at 06:06 AM.
Old 08-06-2010, 08:27 AM   #7
donniemarco
Member
 
Location: USA

Join Date: Aug 2009
Posts: 17
Default

Thanks for your reply, aleferna. It's a great help. I have cleaned my sequences already.
Old 12-06-2010, 08:14 AM   #8
frymor
Senior Member
 
Location: Germany

Join Date: May 2010
Posts: 150
Arrow

Quote:
Originally Posted by aleferna View Post
Minimum MapQ value in BWA.
I have a small python script to filter a fastq file if you want it.
Hi,

I would like to have a look at the python script.

I ran the FastQC software on my data before starting to use bwa. I found that there are still some adapter sequences at both ends, which I would like to trim.

It would be great if I could have a look at your script.
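
A minimal sketch of 3' adapter trimming in Python, assuming exact matching only (no mismatches). The function name, the min_overlap parameter and the adapter string are all hypothetical. Trimming at the 5' end would need a mirrored version.

```python
def trim_3prime_adapter(seq, qual, adapter, min_overlap=5):
    """Cut seq and qual at the adapter start; also catch a partial adapter at the read end."""
    idx = seq.find(adapter)
    if idx == -1:
        # Look for an adapter prefix of at least min_overlap bases ending the read.
        for k in range(min(len(adapter), len(seq)) - 1, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                idx = len(seq) - k
                break
    if idx == -1:
        return seq, qual  # no adapter found, read unchanged
    return seq[:idx], qual[:idx]


adapter = "ACGTACGT"  # placeholder adapter sequence
print(trim_3prime_adapter("TTTTGGGGACGTACGT", "IIIIIIIIHHHHHHHH", adapter))
print(trim_3prime_adapter("TTTTGGGGACGTAC", "IIIIIIIIHHHHHH", adapter))
```

Dedicated trimming tools also tolerate sequencing errors in the adapter, which this exact-match sketch does not.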
Old 08-10-2011, 07:13 AM   #9
mghita
Member
 
Location: Cambridge

Join Date: Aug 2011
Posts: 10
Default

Hi,

What options should I use in bwa in order to obtain perfect maps only? I have two references and I would like to get the unique perfect maps to both of them (in two different files if possible).

Thanks!
Old 08-10-2011, 02:20 PM   #10
donniemarco
Member
 
Location: USA

Join Date: Aug 2009
Posts: 17
Default

Quote:
Originally Posted by mghita View Post
Hi,

What options should I use in bwa in order to obtain perfect maps only? I have two references and I would like to get the unique perfect maps to both of them (in two different files if possible).

Thanks!
I think it's the -F 4 option in BWA!
Old 08-10-2011, 03:56 PM   #11
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912
Default

Quote:
Originally Posted by donniemarco View Post
I think it's the -F 4 option in BWA!
I don't see -F as an option in either bwa aln or sampe.

But bwa aln -n 0 might do the trick. Or, if you already have SAM files, you could filter them with grep to keep only lines containing "NM:i:0". That means a perfect match.
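
For context, -F is an option of samtools view; -F 4 excludes unmapped reads and says nothing about mismatches. A minimal Python sketch of the grep idea, extended to also require a unique hit, could look like this. It assumes bwa aln/samse/sampe output with the XT and X0 tags. The function name is made up, and SEQ and QUAL are replaced by "*" in the example record.

```python
def is_unique_perfect(sam_line):
    """True for a mapped read with NM:i:0, a single-M CIGAR and a unique best hit."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    tags = set(fields[11:])
    mapped = (flag & 4) == 0  # bit 0x4 marks an unmapped read
    ungapped = cigar.endswith("M") and cigar[:-1].isdigit()  # e.g. "76M"
    unique = "XT:A:U" in tags and "X0:i:1" in tags
    return mapped and ungapped and unique and "NM:i:0" in tags


# The record from the first post with SEQ and QUAL replaced by "*" for brevity.
record = "\t".join([
    "PF_Y_612GW_I581:100617:1:21:8990:3211#0", "89", "chr1", "557906", "23",
    "76M", "*", "0", "0", "*", "*",
    "XT:A:U", "NM:i:0", "SM:i:23", "AM:i:0", "X0:i:1", "X1:i:1",
    "XM:i:0", "XO:i:0", "XG:i:0", "MD:Z:76", "XA:Z:chrM,-7493,76M,1;",
])

print(is_unique_perfect(record))
```

To get one file per reference, the same test could be applied separately to the SAM output of each alignment run.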

Tags
bwa, multireads, unique hit

All times are GMT -8.

