Old 08-09-2011, 10:37 PM   #1
Junior Member
Location: India

Join Date: Aug 2011
Posts: 1
SAMtools flagstat output interpretation


I got the following output after running samtools flagstat on a Novoalign BAM file:

126597089 in total
0 QC failure
0 duplicates
122446987 mapped (96.72%)
126597089 paired in sequencing
63372478 read1
63224611 read2
104053862 properly paired (82.19%)
118953502 with itself and mate mapped
3493485 singletons (2.76%)
14745930 with mate mapped to a different chr
8838136 with mate mapped to a different chr (mapQ>=5)

What does line 5 signify? And since these are paired reads, shouldn't the read1 and read2 counts be the same? Does it have anything to do with using the "-r A" option when generating the SAM file?

Thanks in advance.
a2z@blr
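As a quick sanity check on the counts quoted above, the arithmetic can be reproduced in a few lines of Python. This is only a sketch: the numbers are copied from the flagstat output in this post, and the variable names and the `pct` helper are illustrative, not part of samtools.

```python
# Counts copied from the flagstat output quoted in the first post.
total = 126597089
mapped = 122446987
read1 = 63372478
read2 = 63224611
properly_paired = 104053862
both_mapped = 118953502  # "with itself and mate mapped"
singletons = 3493485


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


mapped_pct = pct(mapped, total)
proper_pct = pct(properly_paired, total)
singleton_pct = pct(singletons, total)

read_sum = read1 + read2    # compare against the total
read_diff = read1 - read2   # surplus of read1 records over read2 records
mapped_sum = both_mapped + singletons  # compare against the mapped count

print(mapped_pct, proper_pct, singleton_pct)
print(read_sum == total, mapped_sum == mapped, read_diff)
```

With these inputs the three percentages match the ones flagstat printed, read1 + read2 equals the total, and "with itself and mate mapped" plus "singletons" equals the mapped count. The file holds 147867 more read1 records than read2 records.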
Old 10-20-2011, 06:31 AM   #2
Irina Pulyakhina
Location: Oxford

Join Date: Sep 2010
Posts: 24

Which aligner did you use? You have paired-end data, so your mates can either be aligned as pairs (which is the proper way), and that is the number in line 5, or independently, and that is the number for "singletons". So for the left and right mates, you add up the number of alignments within a pair and the number of independent alignments: those are the "... read1" and "... read2" numbers.
Old 10-20-2011, 02:23 PM   #3
Senior Member
Location: San Diego

Join Date: May 2008
Posts: 912

I don't think flagstat is all that smart. It's just reading the flags. All 126597089 reads you gave it are flagged as being paired, so it's relaying that info. BAM entries are also flagged as to whether they came from read1 or read2, and flagstat is just telling you what it sees. Probably your BAM went through some kind of quality filtering where more read2 reads were filtered away than read1 reads. That makes sense experimentally, and the flags wouldn't necessarily change as a result of that. When I use bwa and samtools, I get different numbers of read1 and read2 reads after running rmdup as well, so I'm guessing that rmdup is also doing some kind of quality filtering. The file that I put into rmdup has the same number of read1 and read2 reads.
swbarnes2
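To illustrate the point that these counts come straight from the per-record flags, here is a simplified Python sketch of a flagstat-style tally. The bit values are the ones defined in the SAM specification. The function name `tally_flags`, the category labels, and the example flag list are made up for illustration, and this is not samtools' actual implementation (it ignores secondary and supplementary alignments and the different-chromosome lines, for instance).

```python
from collections import Counter

# Flag bits as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
READ1 = 0x40
READ2 = 0x80
QC_FAIL = 0x200
DUPLICATE = 0x400


def tally_flags(flags):
    """Count flagstat-like categories from an iterable of integer SAM flags."""
    counts = Counter()
    for flag in flags:
        counts["total"] += 1
        if flag & QC_FAIL:
            counts["qc_failure"] += 1
        if flag & DUPLICATE:
            counts["duplicates"] += 1
        is_mapped = not (flag & UNMAPPED)
        if is_mapped:
            counts["mapped"] += 1
        if flag & PAIRED:
            counts["paired"] += 1
            if flag & READ1:
                counts["read1"] += 1
            if flag & READ2:
                counts["read2"] += 1
            if is_mapped and flag & PROPER_PAIR:
                counts["properly_paired"] += 1
            if is_mapped and not (flag & MATE_UNMAPPED):
                counts["both_mapped"] += 1
            if is_mapped and flag & MATE_UNMAPPED:
                counts["singletons"] += 1
    return counts


# Hypothetical records: a proper pair (99, 147), a mapped read1 whose mate is
# unmapped (73), an unmapped read2 (133), and a mapped read1 whose read2
# record was dropped from the file by some later filter (65).
example_flags = [99, 147, 73, 133, 65]
counts = tally_flags(example_flags)
print(dict(counts))
```

In this toy input every record is flagged as paired, yet there are three read1 records and only two read2 records, because the record for one mate was removed while the remaining read kept its original flag. That is the kind of situation described above.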

flagstat, interpret, samtools
