I used samtools stats to measure some basic metrics of an input BAM file and got the following results:
raw total sequences: 1105415
filtered sequences: 80516
sequences: 1024899
is sorted: 1
1st fragments: 1024899
last fragments: 0
reads mapped: 940001
reads mapped and paired: 0 # paired-end technology bit set + both mates mapped
reads unmapped: 84898
reads properly paired: 0 # proper-pair bit set
reads paired: 0 # paired-end technology bit set
reads duplicated: 0 # PCR or optical duplicate bit set
reads MQ0: 14800 # mapped and MQ=0
reads QC failed: 0
non-primary alignments: 0
total length: 5395194643 # ignores clipping
bases mapped: 4998712634 # ignores clipping
bases mapped (cigar): 4531562523 # more accurate
bases trimmed: 0
bases duplicated: 0
mismatches: 688215582 # from NM fields
error rate: 1.518716e-01 # mismatches / bases mapped (cigar)
average quality: 19.8
insert size average: 0.0
I was wondering how the average quality is calculated. (It's a bit higher than I expected.) Is it related to each read's mean base quality? I.e., for each read, calculate its mean base quality, and then take the average over all reads?
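To make the question concrete, here is a minimal sketch of the two definitions I can think of. The function names and the toy reads are made up for illustration, and I am not claiming that either one is what samtools stats actually does. The two only differ when read lengths vary, which is the case in my data.

```python
from typing import List


def mean_of_read_means(reads: List[List[int]]) -> float:
    """Average the per-read mean base quality over all reads.

    Every read counts equally, regardless of its length.
    """
    read_means = [sum(quals) / len(quals) for quals in reads if quals]
    return sum(read_means) / len(read_means)


def per_base_mean(reads: List[List[int]]) -> float:
    """Sum all base qualities and divide by the total number of bases.

    Longer reads contribute more to the result.
    """
    total_quality = sum(sum(quals) for quals in reads)
    total_bases = sum(len(quals) for quals in reads)
    return total_quality / total_bases


# Hypothetical Phred qualities for two reads of different lengths.
toy_reads = [[30, 30, 30, 30], [10, 10]]

print("mean of read means:", mean_of_read_means(toy_reads))
print("per-base mean:     ", per_base_mean(toy_reads))
```

With reads of unequal length, the per-base version is pulled toward the quality of the longer reads, so the two numbers can be noticeably different.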
Thanks in advance!