**dpryan** · 03-14-2014, 01:43 AM

Yes, you can do that. In fact, the samtools C API has a function (bam_calend) that does exactly that, given a starting position and a CIGAR string. The only CIGAR operations you have to worry about are 'M', '=', 'X', 'D', and 'N', since those are the ones that consume reference bases. In each of those cases, just increment the position by the length of the operation (so 30M would increment by 30).

Remember to decrement the value by 1 at some point, or else you'll end up 1 base off. If you were dealing with a BAM, this wouldn't be needed: the coordinate is 0-based there, so the sum is already the correct end position in 1-based coordinates.
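As a rough illustration of the calculation described above (not the samtools implementation), here is a minimal Python sketch. The function name `alignment_end` and the regular expression are assumptions made for this example. It takes the 1-based POS from a SAM line and sums the lengths of the reference-consuming operations.

```python
import re

# CIGAR operations that advance the position on the reference
REF_CONSUMING = set("MDN=X")

# One CIGAR element: a length followed by a single operation character
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=XB])")


def alignment_end(pos, cigar):
    """Return the 1-based inclusive end for a 1-based SAM POS and a CIGAR string."""
    ref_len = sum(
        int(length)
        for length, op in CIGAR_RE.findall(cigar)
        if op in REF_CONSUMING
    )
    # Subtract 1 because POS itself is the first aligned base
    return pos + ref_len - 1


print(alignment_end(100, "30M"))
print(alignment_end(100, "5S20M2I10M3D5M100N10M"))
```

Soft clips, hard clips, insertions, and padding are skipped because they do not move the reference position. With a 0-based BAM position, dropping the `- 1` would give the same 1-based end.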

**dpryan** · 03-14-2014, 01:47 AM

BTW, there's also a 'B' operation (value 9, or BAM_CBACK), which I've never actually seen and which seems to have been intended for Complete Genomics data. You can likely ignore it, since it's never made its way into actual use.

**Coryza** · 03-14-2014, 01:51 AM

OK, thanks! I'll do that. I do have another question that perhaps you can answer; otherwise I'll start a new thread.

I've got Paired-End Illumina data mapped against the Human Hg19. When viewing the SAM output, how can I check if a pair mapped against the forward Hg19 genome sequence or against the reverse Hg19 genome sequence?

**dpryan** · 03-14-2014, 01:58 AM

Is this from strand-specific (or "directional") data? If not, you can't determine the strand of the original fragment. If this is stranded data, it ends up depending on the prep that you did. Most of the ones I've seen work such that the orientation of read #1 decides the strand. When in doubt, open things in IGV and just have a look at a couple of genes; that'll always clarify things.

**Coryza** · 03-14-2014, 02:10 AM

Originally posted by dpryan:

> Is this from strand-specific (or "directional") data? If not, you can't determine the strand of the original fragment. If this is stranded data, it ends up depending on the prep that you did. Most of the ones I've seen work such that the orientation of read #1 decides the strand. When in doubt, open things in IGV and just have a look at a couple of genes; that'll always clarify things.

As far as I know, these are all cDNA sequences, both forward and reverse. I was hoping I could see whether a read pair maps to the forward strand of the hg19 genome or to the reverse strand. It matters because I want to look at a few + stranded genes and - stranded genes, and it would be handy if I could sort on that during my analysis.

**dpryan** · 03-14-2014, 02:12 AM

You're best off just opening things in IGV and having a look at a couple genes. Then you'll know how the library prep was done and if you can use the 0x10 bit in the flag or not.
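For reference, a minimal Python sketch of how the 0x10 bit could be used once the prep has been checked in IGV. The function name `fragment_strand` and the `read1_same_strand` parameter are hypothetical, and the logic assumes a stranded, properly paired library in which the orientation of read #1 decides the strand. Whether read #1 matches the strand or its opposite depends on the prep, which is why it is left as a parameter.

```python
# Bits of the SAM FLAG field used here
FLAG_REVERSE = 0x10  # read is aligned to the reverse strand
FLAG_READ2 = 0x80    # read is the second read of the pair


def fragment_strand(flag, read1_same_strand=True):
    """Infer '+' or '-' for the fragment from one read's FLAG value.

    Assumes stranded data where read #1 decides the strand. Set
    read1_same_strand=False for preps where read #1 is on the opposite strand.
    """
    is_reverse = bool(flag & FLAG_REVERSE)
    is_read2 = bool(flag & FLAG_READ2)
    # In a proper pair read 2 faces the opposite way, so flip its orientation
    minus = is_reverse != is_read2
    if not read1_same_strand:
        minus = not minus
    return "-" if minus else "+"


for flag in (99, 147, 83, 163):
    print(flag, fragment_strand(flag))
```

For unstranded data this assignment carries no information about the original fragment, as noted above.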

[Bowtie2] CIGAR string calculation.
