SEQanswers

Old 08-13-2009, 09:57 AM   #1
wjeck
Member
 
Location: Chapel Hill, NC

Join Date: Mar 2009
Posts: 39
Strange Results After maq2sam-long

Hi all,

I've been working with bwa, maq and samtools for a few weeks now, and my PI just came across an unusual result that has me worried about my data. I started my workflow by running MAQ on a data set, matching against a restricted chromosomal region of hg18. Here is an example line from the output:

HWUSI-EAS211R_5:2:6:90:1384 131 chr9_22054888_22134171 1 99 35M * 0 172 ATCCTTGGAGTTGTGAGGATTTAATGCAATTGTCT WWWWWWWWWWWWWWWWWVWWWWWWWUWWWVUUUUT MF:i:18 AM:i:99 SM:i:99 NM:i:1 UQ:i:30 H0:i:1 H1:i:0

My question is this: what is going on with the tags NM, H0 and H1 (on the line above)? NM:i:1 should mean that the read has one mismatch to the genome, which seems to be true if I blat back to the reference. However, H0:i:1 should mean that there is an exact match to the genome, and H1:i:0 should mean that there are no matches at distance 1 from the reference. Am I misinterpreting the tags, or is this really inconsistent? If it is inconsistent, where is the bug (MAQ or maq2sam-long), and how can I fix it?
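
For anyone who wants to pull these tags out programmatically, here is a minimal Python sketch. The function name parse_sam_tags is hypothetical (it is not part of samtools or maq), and it assumes the 11 mandatory SAM columns come first, followed by optional TAG:TYPE:VALUE fields. A real SAM file is tab-delimited; the whitespace fallback is only there so that a line pasted into a forum post still parses.

```python
def parse_sam_tags(line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM line as a dict.

    Assumes the first 11 columns are the mandatory SAM fields.
    """
    line = line.rstrip("\n")
    # Real SAM is tab-delimited; fall back to any whitespace for pasted lines.
    fields = line.split("\t") if "\t" in line else line.split()
    tags = {}
    for field in fields[11:]:
        tag, typ, value = field.split(":", 2)
        # Integer-typed tags (type "i") are converted; others stay as strings.
        tags[tag] = int(value) if typ == "i" else value
    return tags


example = (
    "HWUSI-EAS211R_5:2:6:90:1384 131 chr9_22054888_22134171 1 99 35M * 0 172 "
    "ATCCTTGGAGTTGTGAGGATTTAATGCAATTGTCT WWWWWWWWWWWWWWWWWVWWWWWWWUWWWVUUUUT "
    "MF:i:18 AM:i:99 SM:i:99 NM:i:1 UQ:i:30 H0:i:1 H1:i:0"
)
tags = parse_sam_tags(example)
print(tags["NM"], tags["H0"], tags["H1"])
```

For the line above this gives NM=1, H0=1 and H1=0, which are the three values in question.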

--Will
Old 08-13-2009, 10:36 AM   #2
totalnew
Member
 
Location: Canada

Join Date: Apr 2009
Posts: 46

The tags NM, H0 and H1 are quite confusing. I discussed them in this thread; please take a look:
http://seqanswers.com/forums/showthread.php?t=2258

The read you listed above could be interpreted in two ways:
NM:i:1 H0:i:1 H1:i:0
1. The unique mapping has 1 mismatch, the number of hits with no mismatches is 1, and the number of hits with 1 mismatch is 0. The NM field contradicts the H0 field.

2. The unique mapping has 1 mismatch, the number of best hits is 1, and the number of suboptimal hits with one more mismatch is 0. This is explainable.
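
The two readings above can be written down as simple checks. This is only a sketch of the interpretations discussed in this thread, not of what maq or maq2sam-long actually do; the function names are hypothetical, and the first check additionally assumes that the reported alignment is the best hit.

```python
def consistent_absolute(nm, h0, h1):
    """Reading 1: H0/H1 count hits with exactly 0/1 mismatches.

    Assumes the reported alignment (with nm mismatches) is the best hit,
    so no hit may have fewer mismatches, and the reported hit itself
    must be counted in its own class.
    """
    for k, count in {0: h0, 1: h1}.items():
        if k < nm and count != 0:
            return False  # a better hit than the reported one exists
        if k == nm and count < 1:
            return False  # the reported hit is missing from its own count
    return True


def consistent_relative(nm, h0, h1):
    """Reading 2: H0 counts best hits, H1 counts hits with one more mismatch.

    The reported best hit always counts itself, so H0 must be at least 1;
    nm and h1 place no further constraint under this reading.
    """
    return h0 >= 1


# NM:i:1 H0:i:1 H1:i:0
print(consistent_absolute(1, 1, 0), consistent_relative(1, 1, 0))
```

For NM:i:1 H0:i:1 H1:i:0 the first check fails and the second passes, which matches the two cases listed above.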

However, I've come across this as well:
MF:i:32 AM:i:47 NM:i:2 UQ:i:60 H0:i:0 H1:i:1

According to the second explanation, H0 should be 1, and the best hit has 2 mismatches. I would not call this a bug in maq or maq2sam-long, but rather an ambiguous definition of the tags.
Old 08-13-2009, 10:59 AM   #3
wjeck
Member
 
Location: Chapel Hill, NC

Join Date: Mar 2009
Posts: 39

I agree that this is ambiguous, but looking at the documentation is even more concerning. NM, by this definition, should refer to the particular alignment being reported, not just to unique alignments. In this case the H1:i:0 tag is a misreport, because it implies that there are no hits with one difference from the reference, while the line itself reports an alignment with one difference from the reference.

See: http://samtools.sourceforge.net/SAM1.pdf - page 7
Old 08-13-2009, 11:47 AM   #4
totalnew
Member
 
Location: Canada

Join Date: Apr 2009
Posts: 46

This still confuses me. I thought NM (edit distance) was more or less the same as the "number of mismatches of the best hit" defined in the MAQ manual. I once parsed the maq output (.map) file, and the distribution I counted for the "number of mismatches of the best hit" field is exactly the same as the distribution I counted for the NM tag in the SAM file converted from the same map file.
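
The SAM side of that comparison can be sketched in a few lines of Python. The function name nm_distribution is hypothetical, and the sketch assumes that NM appears as an integer tag (NM:i:N) after the 11 mandatory columns. It does not read the .map file, so the other half of the comparison is not shown.

```python
from collections import Counter


def nm_distribution(sam_lines):
    """Count how many alignments carry each NM:i value.

    Header lines (starting with '@') and records without an NM tag are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        # Optional tags start after the 11 mandatory SAM columns.
        for field in line.split()[11:]:
            if field.startswith("NM:i:"):
                counts[int(field[len("NM:i:"):])] += 1
                break
    return counts


# Tiny made-up records, only to show the call; they are not real data.
demo = [
    "@SQ\tSN:ref\tLN:100",
    "r1\t0\tref\t1\t99\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:1\tH0:i:1",
    "r2\t0\tref\t5\t99\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
    "r3\t0\tref\t9\t99\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:1",
]
dist = nm_distribution(demo)
print(sorted(dist.items()))
```

With a real file, the same function can be fed an open file handle instead of the demo list.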
Old 08-13-2009, 11:58 AM   #5
wjeck
Member
 
Location: Chapel Hill, NC

Join Date: Mar 2009
Posts: 39

Yeah, I am getting the same result. I think that's because maq2sam-long returns only the best hit when it outputs SAM format, so NM is the number of mismatches both in the best hit and in the reported hit, since they are the same alignment. Clearly NM is consistent with the rest of the line. However, H0:i and H1:i are not, and I believe they need fixing in maq2sam-long or in maq (but probably not in maq).