SEQanswers

Old 03-24-2011, 10:03 AM   #1
query
Junior Member
 
Location: boston

Join Date: Feb 2009
Posts: 3
BAM/SAM to a gapped multiple sequence alignment

Are there any tools out there that would convert a SAM/BAM file to a gapped multiple sequence alignment of the reads and the reference? I am looking for text-formatted output (like FASTA) rather than a visualization tool.
Old 03-25-2011, 12:12 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

It is a non-trivial task, since the SAM/BAM format isn't a multiple sequence alignment but a collection of pairwise alignments to the reference. The problem comes with conflicting inserts.

e.g. Suppose the first insert is between columns 10 and 11. Most reads say there is no insert here, one read says there should be an A, another says AT, a third ATG and a fourth says TG. How do you align that? Maybe:

--- (most reads)
A--
AT-
ATG
-TG

That was a fairly easy example - but what if there was another group of reads with an insert of C at this point?

My point is the conversion would require doing this kind of realignment for every insert.
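To make the difficulty concrete, here is a minimal sketch of the naive alternative to realignment: left-justify every insert seen at one position and pad to the widest one. The function name `pad_inserts` is made up for this illustration and is not part of any tool.

```python
def pad_inserts(inserts, gap="-"):
    """Left-justify all inserts observed at one reference position
    and pad them with gap characters to a common width."""
    width = max((len(s) for s in inserts), default=0)
    return [s.ljust(width, gap) for s in inserts]


# The five cases from the example: no insert, A, AT, ATG and TG.
for row in pad_inserts(["", "A", "AT", "ATG", "TG"]):
    print(row)
```

With this naive padding the TG read comes out as TG- rather than the -TG shown above, so it disagrees with ATG in two columns. Getting the better placement is exactly the realignment step that plain padding does not do.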
Old 03-25-2011, 04:17 AM   #3
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693

This is why SAM has a "P" (padding) CIGAR operation, but for now few tools use it.
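As a rough illustration of what "P" buys you: it consumes neither read nor reference bases and simply marks a column that exists only because some other read has an insertion there. The sketch below expands a read into its row of the padded alignment, starting at its first aligned column. The helper name `expand_read` and the choice of `*` for padding are assumptions made for this example.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def expand_read(seq, cigar, gap="-", pad="*"):
    """Expand a read into its padded-alignment row using its CIGAR string."""
    out = []
    i = 0  # index into the read sequence
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "MI=X":      # read bases that appear in the padded alignment
            out.append(seq[i:i + n])
            i += n
        elif op == "S":       # soft clip: bases are in seq but not aligned
            i += n
        elif op in "DN":      # deletion or skipped region: gap in the read
            out.append(gap * n)
        elif op == "P":       # padding opposite another read's insertion
            out.append(pad * n)
        # "H" (hard clip) consumes nothing and emits nothing
    return "".join(out)


# Three reads over the same site, where the longest insert is two bases.
print(expand_read("ACTTGT", "2M2I2M"))
print(expand_read("ACTGT", "2M1I1P2M"))
print(expand_read("ACGT", "2M2P2M"))
```

Once every read carries the P operations, all rows have the same columns and the conversion to a multiple alignment is mechanical. The hard part is producing those P operations in the first place.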
Old 03-25-2011, 07:52 AM   #4
query
Junior Member
 
Location: boston

Join Date: Feb 2009
Posts: 3

I understand that this is non-trivial and choices have to be made. I was wondering whether a trivial converter is available, one that mostly assumes single-base insertions and makes the most obvious choice for multi-base insertions.

Are there any aligners that output the padding information?
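For what it's worth, a trivial converter of that kind can be sketched in a few lines. This is only an illustration under stated assumptions: the function `sam_to_msa` is hypothetical, reads are passed in as plain `(name, pos, cigar, seq)` tuples with the 1-based leftmost position as in SAM, inserts are left-justified with no realignment, and positions a read does not cover are filled with the gap character.

```python
import re
from collections import defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def sam_to_msa(ref, reads, gap="-"):
    """Return [(name, gapped_row), ...] with the reference as the first row."""
    parsed = []               # (name, {ref_idx: base}, {ref_idx: insert})
    width = defaultdict(int)  # widest insert seen before each reference index
    for name, pos, cigar, seq in reads:
        r, q = pos - 1, 0     # 0-based reference index, read index
        bases, inserts = {}, {}
        for length, op in CIGAR_RE.findall(cigar):
            n = int(length)
            if op in "M=X":
                for k in range(n):
                    bases[r + k] = seq[q + k]
                r += n
                q += n
            elif op == "I":   # insert sits just before reference index r
                inserts[r] = inserts.get(r, "") + seq[q:q + n]
                q += n
            elif op in "DN":
                for k in range(n):
                    bases[r + k] = gap
                r += n
            elif op == "S":
                q += n
            # "H" and "P" consume neither read nor reference
        for idx, ins in inserts.items():
            width[idx] = max(width[idx], len(ins))
        parsed.append((name, bases, inserts))

    def render(bases, inserts):
        row = []
        for idx in range(len(ref) + 1):
            # insert column first (left-justified), then the reference column
            row.append(inserts.get(idx, "").ljust(width.get(idx, 0), gap))
            if idx < len(ref):
                row.append(bases.get(idx, gap))
        return "".join(row)

    rows = [("ref", render(dict(enumerate(ref)), {}))]
    rows += [(name, render(b, i)) for name, b, i in parsed]
    return rows


reads = [("r1", 1, "2M2I2M", "ACTTGT"),
         ("r2", 1, "2M1I2M", "ACTGT"),
         ("r3", 2, "3M", "CGT")]
for name, row in sam_to_msa("ACGT", reads):
    print(f">{name}\n{row}")
```

The tuples could be filled from any SAM/BAM reader. Conflicting multi-base inserts are not resolved here, they are only padded to the same width.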
Old 03-28-2011, 12:36 AM   #5
jkbonfield
Senior Member
 
Location: Cambridge, UK

Join Date: Jul 2008
Posts: 146

Quote:
Originally Posted by query
Are there any tools out there that would convert a SAM/BAM file to a gapped multiple sequence alignment of the reads and the reference? I am looking for text-formatted output (like FASTA) rather than a visualization tool.
I had some code that performed a column-wise "samtools pileup", so insertions would get their own rows in the output rather than being shoehorned into the same line via +seq. It needs some tidying up as it's no longer standalone (it's part of my BAM reading code in Gap5), but it could be made so again if it's useful.

As others have pointed out this is only half the problem though. It's not going to get your data aligned better and put P CIGAR operators in. I'm not aware of tools to automatically do this, but I'm sure some must exist.
Old 03-31-2011, 06:42 AM   #6
Sylphide
Member
 
Location: France

Join Date: Feb 2011
Posts: 11

I would also like to obtain "alignments" from SAM/BAM. Since samtools tview lets you view the alignment, it should be possible to obtain these "alignments" in text format, no?
Old 04-06-2011, 05:07 AM   #7
himakr
Junior Member
 
Location: Milan

Join Date: Mar 2011
Posts: 3
SAM/BAM Learning

Hi,

Can anyone help me learn SAM/BAM and pileup? Is there any tutorial available for self-study?

Hima
All times are GMT -8. The time now is 07:00 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO