SEQanswers

03-19-2015, 04:49 PM   #1
tedwong
Member
 
Location: Sydney

Join Date: Mar 2015
Posts: 13
Parse CIGAR string in C/C++

What would be the best way to parse a CIGAR string in C/C++, fully according to the specification? Would a regular expression work?
03-19-2015, 08:15 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
Originally Posted by tedwong
What'd be the best way to parse a CIGAR string fully according to the specification in C/C++? Would regular expression work?
No. Unless you simply want to detect the presence of some operation, the best way is with a custom loop.

Here's an example in Java that can easily be translated to C++:

http://seqanswers.com/forums/showthread.php?t=51162
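
For illustration, here is a minimal C++ sketch of the kind of custom loop described above. It is not a translation of the Java code in the linked thread. The names `CigarOp`, `parseCigar`, and `referenceSpan` are hypothetical and chosen only for this example. The sketch assumes the text form of CIGAR from the SAM specification (`*`, or one or more `<length><op>` pairs with op in `MIDNSHP=X`) and rejects anything else.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One CIGAR operation: a length followed by an operator character.
// (Hypothetical type for this sketch, not part of htslib.)
struct CigarOp {
    std::uint32_t length;
    char op;
};

// Parse a SAM-text CIGAR string into `out`.
// Returns false if the string is malformed; "*" (CIGAR unavailable)
// is accepted and yields an empty vector.
bool parseCigar(const std::string& cigar, std::vector<CigarOp>& out) {
    out.clear();
    if (cigar == "*") return true;
    if (cigar.empty()) return false;

    const std::string validOps = "MIDNSHP=X";
    std::uint64_t length = 0;
    bool haveDigits = false;

    for (char c : cigar) {
        if (c >= '0' && c <= '9') {
            length = length * 10 + static_cast<std::uint64_t>(c - '0');
            // BAM packs the length into 28 bits, so reject anything larger.
            if (length > 0x0FFFFFFFu) return false;
            haveDigits = true;
        } else if (haveDigits && validOps.find(c) != std::string::npos) {
            out.push_back({static_cast<std::uint32_t>(length), c});
            length = 0;
            haveDigits = false;
        } else {
            return false;  // unknown operator, or operator without a length
        }
    }
    return !haveDigits;  // a trailing number without an operator is invalid
}

// Number of reference bases covered by the alignment:
// M, D, N, = and X consume the reference; I, S, H and P do not.
std::uint64_t referenceSpan(const std::vector<CigarOp>& ops) {
    std::uint64_t span = 0;
    for (const CigarOp& o : ops) {
        switch (o.op) {
            case 'M': case 'D': case 'N': case '=': case 'X':
                span += o.length;
                break;
            default:
                break;
        }
    }
    return span;
}
```

Calling `parseCigar("3S10M2I5D20M", ops)` should fill `ops` with five operations, and `referenceSpan(ops)` then sums the M and D lengths. A single pass like this both validates the string and extracts the values, which is the main advantage over a regular expression that only tests whether the string matches.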
03-20-2015, 12:31 AM   #3
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

If you're using C/C++ already, then just use htslib. The functions for this have already been written (after all, it's what samtools uses) and the API is generally convenient.
03-20-2015, 01:29 AM   #4
lindenb
Senior Member
 
Location: France

Join Date: Apr 2010
Posts: 143

cross posted: http://stackoverflow.com/questions/2...lar-expression
04-08-2015, 03:30 AM   #5
student-t
Member
 
Location: Garvan Institute

Join Date: Mar 2015
Posts: 16

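For reference (using the htslib library):

#include <htslib/sam.h>
#include <string>

// Walk the CIGAR of every mapped record in a SAM/BAM file.
// Returns false if the file or its header cannot be read.
bool walk_cigars(const std::string& file)
{
    samFile* f = sam_open(file.c_str(), "r");
    if (!f) return false;

    bam_hdr_t* h = sam_hdr_read(f);
    if (!h) { sam_close(f); return false; }

    bam1_t* t = bam_init1();

    while (sam_read1(f, h, t) >= 0)
    {
        // Skip unmapped records and records without a reference.
        const bool mapped = !(t->core.flag & BAM_FUNMAP);
        if (!mapped || t->core.tid < 0) continue;

        // Name of the reference sequence this record is aligned to.
        const std::string id(h->target_name[t->core.tid]);

        const uint32_t* cigar = bam_get_cigar(t);

        for (uint32_t k = 0; k < t->core.n_cigar; k++)
        {
            const int op = bam_cigar_op(cigar[k]);     // operation code, e.g. BAM_CMATCH
            const int ol = bam_cigar_oplen(cigar[k]);  // operation length

            if (op == BAM_CMATCH || op == BAM_CINS || op == BAM_CDEL)
            {
                // your code, you have the length in ol (e.g. 101M -> ol == 101)
            }
        }
    }

    // Release the record and header before closing the file.
    bam_destroy1(t);
    bam_hdr_destroy(h);
    sam_close(f);
    return true;
}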