SEQanswers

09-16-2010, 09:47 AM   #1
rcorbett
picard markduplicates on huge files

Hey folks,
I'm trying to run MarkDuplicates on some massive files (merged SOLiD BAMs), and often 100 GB of RAM doesn't get the job done.

Does anyone have a good suggestion for a workaround? I don't want to split by chromosome because I would lose the ability to mark dups that span multiple chromosomes.

Thanks!
09-16-2010, 07:17 PM   #2
malachig

Is the BAM file already sorted? If not, could it be the sorting step where Picard uses excessive memory? If that's the case, you could presort the BAM file using SAMtools (which lets you specify a maximum memory usage) and then use Picard to mark duplicates with the 'ASSUME_SORTED' option...
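
For anyone who wants to try this, here is a rough sketch of the two steps as a small Python helper that only builds the command lines. The file names, memory values, and the `picard.jar` path are placeholders I made up, not anything from the original poster's setup. The syntax shown is the newer one (`samtools sort -o`, a single `picard.jar`); older releases took an output prefix for `samtools sort` and shipped a separate `MarkDuplicates.jar`, so check the usage message of whatever versions you have installed.

```python
import shlex


def build_presort_and_markdup_commands(
    input_bam,
    sorted_bam="sorted.bam",          # placeholder output name
    marked_bam="marked.bam",          # placeholder output name
    metrics_file="dup_metrics.txt",   # placeholder metrics file
    sort_mem_per_thread="2G",         # cap on samtools sort memory per thread
    java_heap="8g",                   # JVM heap limit for Picard
    picard_jar="picard.jar",          # assumed location of the Picard jar
    tmp_dir="/tmp",                   # where Picard may spill temporary files
):
    """Return (sort_cmd, markdup_cmd) as argument lists; nothing is executed."""
    # Step 1: coordinate-sort with samtools, with an explicit memory cap.
    sort_cmd = [
        "samtools", "sort",
        "-m", sort_mem_per_thread,
        "-o", sorted_bam,
        input_bam,
    ]
    # Step 2: mark duplicates, telling Picard the input is already sorted.
    markdup_cmd = [
        "java", f"-Xmx{java_heap}", "-jar", picard_jar, "MarkDuplicates",
        f"INPUT={sorted_bam}",
        f"OUTPUT={marked_bam}",
        f"METRICS_FILE={metrics_file}",
        "ASSUME_SORTED=true",
        f"TMP_DIR={tmp_dir}",
    ]
    return sort_cmd, markdup_cmd


def as_shell(cmd):
    """Render an argument list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    for cmd in build_presort_and_markdup_commands("merged.bam"):
        print(as_shell(cmd))
```

Printing the commands rather than running them lets you check them against your own installation first. Whether presorting actually helps depends on where the memory is going, which this thread has not established.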
09-17-2010, 04:39 AM   #3
drio

Can you show us what your data looks like and the commands you are using?
__________________
-drd
Tags
bam, duplicates, picard
