SEQanswers

03-01-2012, 12:15 AM   #1
ulz_peter
Senior Member
Location: Graz, Austria
Join Date: Feb 2010
Posts: 215

CRAM

Hi Guys,

Has anyone thought about using CRAM rather than BAM for archiving alignment data?

http://www.ebi.ac.uk/ena/about/cram_toolkit

Our Illumina representative told us they want to switch to that format...
03-01-2012, 12:29 AM   #2
maubp
Peter (Biopython etc)
Location: Dundee, Scotland, UK
Join Date: Jul 2009
Posts: 1,303

I've heard Sanger is considering it, perhaps even this year if CRAM continues to mature rapidly.
03-01-2012, 04:22 AM   #3
lh3
Senior Member
Location: Boston
Join Date: Feb 2008
Posts: 667

Yes, CRAM has great potential. It may ultimately replace BAM (if CRAM does not, some other binary format will sooner or later). Nonetheless, CRAM may not be ready to replace BAM right now. It does not (or at least did not) support all the tags. I do not know what progress has been made on compressing unmapped reads. Furthermore, I am concerned about the compression model. I also think lossy compression is the way to go, but it should be done by reducing the resolution of the quality scores, rather than by selectively dropping all the quality information.
03-01-2012, 04:41 AM   #4
maubp
Peter (Biopython etc)
Location: Dundee, Scotland, UK
Join Date: Jul 2009
Posts: 1,303

Quote:
Originally Posted by lh3
Yes, CRAM has great potential. It may ultimately replace BAM (if CRAM does not, some other binary format will sooner or later). Nonetheless, CRAM may not be ready to replace BAM right now. It does not (or at least did not) support all the tags.

Support for all the tags is expected in CRAM 0.7, which is due soon; see e.g.
http://lists.open-bio.org/pipermail/...ch/036295.html
Quote:
Originally Posted by lh3
I do not know what progress has been made on compressing unmapped reads.

I heard at a recent seminar that the CRAM team are looking at doing a mini-assembly of the unmapped reads in order to generate dummy reference sequences, which can then be used for reference-based compression. If I understood correctly, this might be transparent to the user.
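As a rough illustration of what reference-based compression means, here is a toy Python sketch. It is not CRAM's actual encoding: a mapped read is stored as its position, its length, and only the bases that differ from the reference. The function names are hypothetical, and the sketch assumes ungapped alignments with substitutions only (no indels or clipping).

```python
def encode_read(reference, pos, read):
    """Encode an ungapped read as (pos, length, mismatches).

    Hypothetical helper for illustration only. `mismatches` is a list of
    (offset_in_read, base) pairs where the read differs from the reference.
    """
    mismatches = [
        (i, base)
        for i, base in enumerate(read)
        if reference[pos + i] != base
    ]
    return pos, len(read), mismatches


def decode_read(reference, record):
    """Rebuild the read from the reference plus the stored differences."""
    pos, length, mismatches = record
    bases = list(reference[pos:pos + length])
    for i, base in mismatches:
        bases[i] = base
    return "".join(bases)


reference = "ACGTACGTACGT"
read = "GTACCT"  # aligned at position 2, with one substitution
record = encode_read(reference, 2, read)
restored = decode_read(reference, record)
```

A read that matches the reference exactly costs only a position and a length, which is why the same idea could apply to unmapped reads once a dummy reference has been assembled from them.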
Quote:
Originally Posted by lh3
Furthermore, I am concerned about the compression model. I also think lossy compression is the way to go, but it should be done by reducing the resolution of the quality scores, rather than by selectively dropping all the quality information.

At the same seminar we were also told that CRAM has several quality compression modes, one of which simply reduces the resolution.
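To make "reducing the resolution" concrete, here is a minimal Python sketch of quality binning. The bin boundaries and representative values are made up for illustration and are not CRAM's actual scheme, and the function names are hypothetical. The quality string is assumed to use the common Phred+33 encoding.

```python
# Hypothetical bins, chosen only for illustration:
# (lowest score, highest score, representative score stored instead)
BINS = [
    (0, 1, 0),
    (2, 9, 6),
    (10, 19, 15),
    (20, 29, 25),
    (30, 39, 35),
    (40, 93, 40),
]


def bin_quality(q):
    """Map one Phred score to the representative value of its bin."""
    for low, high, rep in BINS:
        if low <= q <= high:
            return rep
    raise ValueError(f"Phred score out of range: {q}")


def reduce_resolution(qual, offset=33):
    """Return a quality string whose scores are collapsed into the bins.

    `offset=33` assumes Phred+33 encoded qualities.
    """
    return "".join(chr(bin_quality(ord(c) - offset) + offset) for c in qual)


original = "!#5I"  # Phred scores 0, 2, 20 and 40
binned = reduce_resolution(original)
```

Every base keeps a quality value, but the string now draws on at most six distinct symbols, so it compresses much better than the full-resolution scores while still separating low-confidence from high-confidence calls.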
03-01-2012, 05:19 AM   #5
lh3
Senior Member
Location: Boston
Join Date: Feb 2008
Posts: 667

That is great! Hope these can be done soon!
03-01-2012, 07:56 AM   #6
cjfields
Junior Member
Location: Champaign, IL, USA
Join Date: Sep 2009
Posts: 4

That does seem very promising.
03-09-2012, 01:19 AM   #7
vadim
Member
Location: Cambridge, UK
Join Date: Sep 2009
Posts: 37

I am the developer of CRAM and can answer any questions about it.

The code is here:
https://github.com/vadimzalunin/crammer/

Documentation can be found here:
http://www.ebi.ac.uk/ena/about/cram_toolkit

We just released v0.7, which is not yet a long-term support release but is stable enough to try out.

All times are GMT -8.