**lindenb** · 01-08-2015, 11:13 PM

"I mean if the first operator after the clip (H or S) is D (deletion), so the POS should be one after the position where the deletion happens? "

YES. The position starts at the first M/=.

**blaboon** · 01-09-2015, 02:22 AM

Originally posted by lindenb:

"I mean if the first operator after the clip (H or S) is D (deletion), so the POS should be one after the position where the deletion happens? "

YES. The position starts at the first M/=.

Hi, thanks for your quick reply.

In my understanding of the SAM specification, M covers both = and X, which means an alignment match can be either a sequence match or a mismatch. Please advise.

bless~

**dpryan** · 01-09-2015, 02:31 AM

Correct, POS refers to the first of M=X, though you rarely see X or = in real life.
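To make the point above concrete, here is a minimal Python sketch of how POS and the CIGAR together determine the reference interval covered by an alignment. The function names `parse_cigar` and `reference_span` are made up for this illustration; the only assumptions taken from the SAM specification are that POS is 1-based and that M, D, N, = and X are the operators that consume reference bases.

```python
import re

# Operators that advance along the reference, per the SAM specification.
REF_CONSUMING = set("MDN=X")

def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) tuples."""
    if not re.fullmatch(r"(\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]

def reference_span(pos, cigar):
    """Return the 1-based inclusive (start, end) reference interval.

    POS is taken as the coordinate of the first M/=/X base, so leading
    clips (S/H) do not shift the start.
    """
    ref_len = sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)
    return pos, pos + ref_len - 1

print(reference_span(100, "10S50M2D40M"))  # soft clip is ignored for the start
print(reference_span(7, "5S3=1X6="))       # = and X behave like M here
```

Note that the soft-clipped bases sit before POS on the read but do not move POS on the reference.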

**maubp** · 01-09-2015, 08:12 AM

This came up in discussion on the samtools-dev mailing list, and I think James Bonfield constructed some good examples. Evidently the spec needs a bit more clarification here?

**dpryan** · 01-09-2015, 08:15 AM

Do you remember when that came up? I thought I recalled that but couldn't find it with some quick searching.

**Brian Bushnell** · 01-09-2015, 10:48 AM

Originally posted by dpryan:

Correct, POS refers to the first of M=X, though you rarely see X or = in real life.

I see it all the time, since BBMap outputs those by default.

"10S1D139M" is not a cigar string that should ever be produced. "D" should be internal to "M/X/=". If you see a read that violates this, just throw it away; it's nonsense.

"I" is a bit more tricky; it can occur and be valid at the ends. In that case, those bases should be ignored with respect to the POS, just like "S" bases.
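As a rough sketch of the rule described above, the Python snippet below finds which base of SEQ is the one aligned at POS, skipping leading "S" and "I" bases, and rejects a CIGAR whose first reference-consuming operator is "D" (or "N"). The function name `read_offset_of_pos` is hypothetical, and treating a leading deletion as an error follows the suggestion in this post rather than any requirement I can point to in the spec.

```python
import re

def read_offset_of_pos(cigar):
    """Return the 0-based index in SEQ of the base aligned at POS.

    Leading S and I bases are in SEQ but precede POS, so they are skipped.
    H and P consume neither SEQ nor reference, so they are ignored.
    """
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    offset = 0
    for length, op in ops:
        if op in "M=X":
            return offset          # first aligned base: this is where POS points
        if op in "SI":
            offset += length       # read bases that come before POS
        elif op in "DN":
            # Assumption from the thread: a deletion before any aligned base
            # is treated as nonsense rather than interpreted.
            raise ValueError(f"{op} before first M/=/X in {cigar!r}")
    raise ValueError(f"no aligned bases in {cigar!r}")

print(read_offset_of_pos("3I97M"))     # leading insertion is skipped like a clip
print(read_offset_of_pos("5H10S85M"))  # hard clips are not part of SEQ
```

A caller could catch the `ValueError` and discard the read, which matches the "just throw it away" advice.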

Incidentally, I wrote another tool, BBMerge, which can merge paired reads, and adjusts the quality of overlapping bases to reflect whether or not they match. It's mainly used for merging reads by overlap, but it can merge based on mapping locations also, if you use it like this (the example assumes interleaved reads but they can be in two files also):

bbmap.sh ref=reference.fasta in=reads.fastq outm=mapped.fastq pairedonly renamebymapping pairlen=800
bbmerge.sh in=mapped.fastq out=merged.fq usemapping parsecustom

Sorry, bbmerge does not currently work with sam/bam files; it requires custom headers on reads to merge them based on mapping data. The first step maps the reads and adds custom headers. The "pairlen" flag when mapping restricts the maximum distance between paired reads - if the reads overlap this is negative, and if they don't it is positive; insert size = (pairlen + read1 length + read2 length).
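The insert-size relation above is simple arithmetic, sketched here in Python. The function `insert_size` and the read lengths are illustrative only; `pairlen` is used in the sense given in this post (the distance between the two reads, negative when they overlap), not as a claim about BBMap internals.

```python
def insert_size(pairlen, read1_len, read2_len):
    """Insert size as described in the post: pairlen plus both read lengths."""
    return pairlen + read1_len + read2_len

# Two 150 bp reads separated by a 500 bp gap.
print(insert_size(500, 150, 150))

# Two 150 bp reads overlapping by 30 bp: pairlen is negative.
print(insert_size(-30, 150, 150))
```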

**blaboon** · 01-09-2015, 05:23 PM

Originally posted by dpryan:

Correct, POS refers to the first of M=X, though you rarely see X or = in real life.

Great, thanks, that really clears up my confusion.

In the SAM format, how is the POS field affected by insertions and deletions?