SEQanswers

Old 06-22-2011, 06:55 AM   #1
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default 454 .ace to .bam conversion issue

I have heard that the "next" version of the Roche 454 software will include a .bam output format.

Until then (and presuming this is actually the case) I am stuck with the brutal amos2bnk methodology outlined here:

http://sourceforge.net/apps/mediawik...SAM_Conversion

There are numerous non-documented gotchas ready to pounce on the misguided novice who attempts this protocol. But I have managed to traverse the procedure a couple of times and emerge bloodied but (largely) unbroken.

One final issue involves actually viewing the resulting .bam file in IGV. (BTW, you definitely want to turn off "show soft-clipped bases" in the preferences.) I think the issue derives from the crazy long cigar strings produced. Some of this is to be expected because of the 454's well-known homopolymer issues. However, it looks to me like the cigar strings are being produced from the padded reads in the .ace file. As a result, legions of "deletions" versus the consensus are shown in the viewer.

Anyone seen this? Anyone have a solution to suggest?

--
Phillip
Old 06-22-2011, 07:02 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543
Default

Do you have to have a SAM/BAM file? Why not stick with the ACE file and use a viewer that supports that?
Old 06-22-2011, 07:15 AM   #3
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default cigar string hacking

Here is an example cigar string produced:
Code:
69S10M1P4M1P2M1P12M1P11M1P2M1P10M1P18M1P2M1P3M1P30M1P6M1P3M1P1M1P3M1P9M1P9M1P5M1P7M1P1M1P1M2P7M1P1M1P2M1P2M2P3M1P6M1P3M1P3M1P6M1P3M1P5M1P3M1P1M3P1M3P1M1P1M1P1M4P1M1P1M2P4M1P1M1P2M1P3M1P1M1P5M1P2M1P7M1P2M1P7M2P5M1P1M1P2M1P1M1P2M1P3M1P2M1P1M1P1M1P1M2P1M1P3M1I1D9M1P1M1P1M1P3M2P1M1P1M1P3M1P1M1I1D6M1P2M1P3M1I2D2M1I2M1I1D2M1P7M1P12M2P6M1P2M1D1I15M1P9M1P3M1P4M1P1M1P3M1P3M1P22M1P7M1P26M1P15M1P1M1P4M1P6M2P9M1P6M1P2M2P2M1P9M1P2M1P3M4S
According to the SAM specification, "P" denotes "padding (silent deletion from padded reference)". So maybe I could parse through the string, deleting the \d+P fields and merging adjacent "M" runs that are no longer separated by pads?
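The parsing idea above could be sketched roughly as follows in Python. This is only a minimal sketch, not a tested fix for the amos2bnk output: the function name strip_padding is hypothetical, and it assumes that the POS field and the reference are already unpadded, so that simply dropping the P operations leaves a consistent alignment. The thread does not confirm that assumption.

```python
import re

# One CIGAR operation: a length followed by an operator letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
VALID_CIGAR = re.compile(r"(\d+[MIDNSHP=X])+")


def strip_padding(cigar):
    """Drop P (padding) operations and merge neighbours of the same type.

    Assumption: POS and the reference are unpadded, so the P operations
    carry no information a viewer needs.
    """
    if not VALID_CIGAR.fullmatch(cigar):
        raise ValueError("not a valid CIGAR string: %r" % cigar)

    merged = []  # list of [length, operator] pairs
    for length, op in CIGAR_OP.findall(cigar):
        if op == "P":
            continue  # skip the pad entirely
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)  # e.g. 10M + 4M becomes 14M
        else:
            merged.append([int(length), op])
    return "".join("%d%s" % (n, op) for n, op in merged)


if __name__ == "__main__":
    # The start of the example string above.
    print(strip_padding("69S10M1P4M1P2M1P12M"))
```

Real insertions and deletions (I and D) are left untouched, so pairs such as 1I1D in the example string would still appear in the output.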

--
Phillip
Old 06-22-2011, 07:22 AM   #4
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by maubp View Post
Do you have to have a SAM/BAM file? Why not stick with the ACE file and use a viewer that supports that?
Newbler does not currently produce BAM files.

I really like IGV.
The only ACE file viewer I use is consed. Great for BAC sized assemblies. Not good for full bacterial genomes. Do you have an ACE viewer you would recommend?

--
Phillip
Old 06-22-2011, 07:25 AM   #5
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543
Default

I personally use Tablet for viewing ACE files (and SAM/BAM files too). It supports some other formats as well: http://bioinf.hutton.ac.uk/tablet/

If you are interested in editing the ACE file, GAP4 or GAP5 would be worth a look too.
Old 06-22-2011, 12:52 PM   #6
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628
Default

Quote:
Originally Posted by maubp View Post
I personally use Tablet for viewing ACE files (and SAM/BAM files too). It supports some other formats as well: http://bioinf.hutton.ac.uk/tablet/

If you are interested in editing the ACE file, GAP4 or GAP5 would be worth a look too.
AFAIK these won't work with ACE ... ACE is bad ;-)
Old 06-23-2011, 03:41 AM   #7
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by maubp View Post
I personally use Tablet for viewing ACE files (and SAM/BAM files too). It supports some other formats as well: http://bioinf.hutton.ac.uk/tablet/

If you are interested in editing the ACE file, GAP4 or GAP5 would be worth a look too.
Okay, Tablet is a good viewer for large .ace files. For editing .ace files, even giant ones, consed is fine.

It did not work for the BAM file I tried. I get:
SAM validation error: ERROR: Record 44, Read name H-148_49:1:1208:17598:140372, Mate Alignment start should != 0 because reference name != *.

Errors encountered by Tablet when processing BAM files are often related to using files that have not been sorted,
or where the index file is out of date. Please resort and/or reindex this file using samtools 0.1.8 or higher.
But we are running samtools 0.1.5, so possibly that is the issue.
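The validation error quoted above complains about records whose mate reference name is set while the mate alignment start is 0. A rough way to list such records from a SAM text dump might look like the Python sketch below. The function name find_mate_start_errors is hypothetical, the column positions follow the SAM specification (RNEXT is column 7, PNEXT is column 8), and the 1-based record counter is an assumption that may not match the numbering in the error message.

```python
def find_mate_start_errors(sam_lines):
    """Return (record_number, read_name) for suspicious records.

    A record is flagged when its mate reference name (RNEXT) is not "*"
    but its mate alignment start (PNEXT) is 0.
    """
    problems = []
    record_number = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line, not an alignment record
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # blank or truncated line; SAM has 11 mandatory columns
        record_number += 1
        read_name, rnext, pnext = fields[0], fields[6], fields[7]
        if rnext != "*" and int(pnext) == 0:
            problems.append((record_number, read_name))
    return problems
```

The function takes any iterable of lines, so an open file handle on a SAM file can be passed in directly.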

Anyway, thanks for the advice.

--
Phillip
Old 06-23-2011, 04:06 AM   #8
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default Tablet wants future version of samtools?

Quote:
Originally Posted by pmiguel View Post

It did not work for the BAM file I tried. I get:
SAM validation error: [...]
Please resort and/or reindex this file using samtools 0.1.8 or higher.
But we are running samtools 0.1.5, so possibly that is the issue.
Now that I check:

http://sourceforge.net/projects/samtools/

the current version of samtools is 0.1.6

*****Note added after posting:
Arrgg! The most recent version of samtools is 0.1.16! I seem to be unable to read decimal numbers today.
Well, I don't want to let the facts get in the way of my (lame) joke. So, please continue reading...
*****

Possibly Tablet works so well because it was sent back in time by coders from the future using advanced methodologies? But they failed to account for the lack of certain key dependencies?

Ah well, it works for ACE files...

--
Phillip

Last edited by pmiguel; 06-23-2011 at 05:06 AM. Reason: Erroneous info on current samtools version
Old 06-23-2011, 04:31 AM   #9
ulz_peter
Senior Member
 
Location: Graz, Austria

Join Date: Feb 2010
Posts: 219
Default

We've got the same issue. As we only do resequencing, the ACE file format is definitely not the best solution for outputting alignments (especially against the human reference genome, as the file gets really large even for small projects).

Tablet can visualize ACE files; however, it is a pain to visualize additional data alongside them (in our case gene annotations).

I'm really looking forward to the next software version with BAM output. I was already thinking of coding something myself...

Does anyone know when the new version will arrive?
Old 06-23-2011, 04:37 AM   #10
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Well, this is little more than speculation...
But, generally major chemistry/hardware releases are accompanied by a new software version. The new longer read chemistry upgrades are either happening now or rolling out over the summer.
So that would suggest that the answer is "soon"?

--
Phillip
Old 06-23-2011, 04:49 AM   #11
imilne
Member
 
Location: JHI, Dundee, UK

Join Date: Jan 2010
Posts: 68
Default

Quote:
Originally Posted by pmiguel View Post
Okay, Tablet is a good viewer for large .ace files. For editing .ace files, even giant ones, consed is fine.

It did not work for the BAM file I tried. I get:
SAM validation error: ERROR: Record 44, Read name H-148_49:1:1208:17598:140372, Mate Alignment start should != 0 because reference name != *.
Use Tablet's Options to change the BAM validation setting from stringent to lenient.
__________________
Our software: Tablet | Flapjack | Strudel | CurlyWhirly | TOPALi
Old 06-23-2011, 04:56 AM   #12
imilne
Member
 
Location: JHI, Dundee, UK

Join Date: Jan 2010
Posts: 68
Default

Quote:
Originally Posted by pmiguel View Post
Now that I check:

http://sourceforge.net/projects/samtools/

the current version of samtools is 0.1.6
No, it's 0.1.16, which is quite a few versions on from 0.1.6.

Tablet tries to read the statistics (specifically read counts per contig) for a BAM's index file (.bai) and will warn if it can't do this, which usually happens if the index file was created using a version of samtools earlier than 0.1.8, which is the first version (we're aware of) that added these stats. It's been available since last summer.
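The 0.1.6 versus 0.1.16 mix-up is easy to make in scripts as well, because dotted version strings do not sort correctly as plain text. A minimal Python sketch of a safer check is shown below. The helper names are hypothetical, and it assumes purely numeric dotted versions; a version string with a suffix would need extra handling.

```python
def parse_version(text):
    """Turn a dotted version string such as "0.1.16" into (0, 1, 16)."""
    return tuple(int(part) for part in text.split("."))


def at_least(version, minimum):
    """True if version is the same as or newer than minimum."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    # Compared as text, "0.1.16" sorts before "0.1.8", which is wrong.
    print("0.1.16" >= "0.1.8")
    # Compared as integer tuples, 0.1.16 is correctly newer than 0.1.8.
    print(at_least("0.1.16", "0.1.8"))
```

Comparing tuples of integers makes 16 count as larger than 8, which a character-by-character string comparison does not.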
__________________
Our software: Tablet | Flapjack | Strudel | CurlyWhirly | TOPALi
Old 06-23-2011, 05:23 AM   #13
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default Tablet SAM validation error workaround.

Quote:
Originally Posted by pmiguel View Post
Okay, Tablet is a good viewer for large .ace files. For editing .ace files, even giant ones, consed is fine.

It did not work for the BAM file I tried. I get:
SAM validation error: ERROR: Record 44, Read name H-148_49:1:1208:17598:140372, Mate Alignment start should != 0 because reference name != *.

Errors encountered by Tablet when processing BAM files are often related to using files that have not been sorted,
or where the index file is out of date. Please resort and/or reindex this file using samtools 0.1.8 or higher.
But we are running samtools 0.1.5, so possibly that is the issue.

Anyway, thanks for the advice.

--
Phillip
We were running samtools 0.1.15 -- not the problem.

The Tablet guys responded to an email I sent them about this issue with:

Quote:
The SAM validation error is an error message given out by the PICARD API, which we use for loading BAM files. It’s indicating that the file you’re loading doesn’t fully conform to the BAM spec. You can tweak Tablet so that it will ignore these errors, but we have it set to flag them up by default. Go to the Tablet application menu, open Tablet Options, and select the Importing tab. Make sure the “Set BAM validation stringency to lenient rather than strict (BAM only)” option is selected and click OK. You should now be able to load the data, as the underlying PICARD API will ignore these errors rather than report them.
This works for me!

--
Phillip
Old 06-23-2011, 04:18 PM   #14
RCJK
Senior Member
 
Location: Australia

Join Date: May 2009
Posts: 155
Default

v2.6 is a part of the upgrade package for the XL+ upgrade. Last I've heard they (my local Roche reps) seem somewhat confident that the upgrade will launch at the end of June. A recent article on GenomeWeb also mentioned end of June, but we'll see. That's what they were saying last year.
Old 06-23-2011, 11:30 PM   #15
flxlex
Moderator
 
Location: Oslo, Norway

Join Date: Nov 2008
Posts: 415
Default

Quote:
Originally Posted by RCJK View Post
v2.6 is a part of the upgrade package for the XL+ upgrade. Last I've heard they (my local Roche reps) seem somewhat confident that the upgrade will launch at the end of June. A recent article on GenomeWeb also mentioned end of June, but we'll see. That's what they were saying last year.
We are getting the upgrade 'most likely' in mid July...
All times are GMT -8.