SEQanswers

10-29-2010, 07:28 AM   #1
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106
Read trimming and Picard

Hi,


Does anyone have a recommended read-trimming software that works with color-space data?


Also, I'm not trying to repost, but I'm getting some odd errors and the help mailing list for SAMtools seems dead. What is the source of this error?


INFO 2010-10-29 09:39:34 MarkDuplicates Read 46000000 records. Tracking 687328 as yet unmatched pairs. 46728 records in RAM. Last sequence index: 9
INFO 2010-10-29 09:39:45 MarkDuplicates Read 47000000 records. Tracking 686480 as yet unmatched pairs. 32624 records in RAM. Last sequence index: 9
INFO 2010-10-29 09:40:08 MarkDuplicates Read 48000000 records. Tracking 684660 as yet unmatched pairs. 17477 records in RAM. Last sequence index: 9
INFO 2010-10-29 09:40:18 MarkDuplicates Read 49000000 records. Tracking 682311 as yet unmatched pairs. 479 records in RAM. Last sequence index: 9
[Fri Oct 29 09:40:37 CDT 2010] net.sf.picard.sam.MarkDuplicates done.
Runtime.totalMemory()=772931584
Exception in thread "main" net.sf.picard.PicardException: Exception writing ReadEnds to file.
at net.sf.picard.sam.ReadEndsCodec.encode(ReadEndsCodec.java:74)
at net.sf.picard.sam.ReadEndsCodec.encode(ReadEndsCodec.java:32)
at net.sf.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:185)
at net.sf.samtools.util.SortingCollection.add(SortingCollection.java:140)
at net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:269)
at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:109)
at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:150)
at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:93)
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at net.sf.picard.sam.ReadEndsCodec.encode(ReadEndsCodec.java:71)
... 7 more


I can't seem to find any documentation on it and nobody answered my last post.

Finally, I've been reading some previous SEQanswers posts, and I wanted to see if anyone can confirm that samtools removes duplicates based on start/stop positions alone and doesn't consider whether the sequences are identical. Is that right?
10-29-2010, 08:15 AM   #2
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

It seems you ran out of space?

Code:
...
at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:93)
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
...
Regarding samtools, look at this thread with comments from the author.

It does not consider the sequence. Also, take a look at the mathematical models implemented in samtools. Entries 1.1 and 1.2 detail the chances of getting duplicates at the library and mapping levels.
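
To make "does not consider the sequence" concrete, here is a minimal Python sketch of position-only duplicate marking. It is an illustration, not samtools' actual code: the `Read` record, the function name, the grouping key (chromosome, start, strand), and the "keep the highest mapping quality" tie-break are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical, stripped-down alignment record for this example."""
    name: str
    chrom: str
    start: int   # leftmost mapped coordinate
    strand: str  # "+" or "-"
    seq: str
    mapq: int


def mark_duplicates_by_position(reads):
    """Return the set of read names flagged as duplicates.

    Reads sharing (chrom, start, strand) form one group. The read with the
    highest mapping quality is kept and the rest are flagged. The sequence
    itself is never compared.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.start, read.strand)].append(read)

    duplicates = set()
    for group in groups.values():
        best = max(group, key=lambda r: r.mapq)  # first read wins ties
        duplicates.update(r.name for r in group if r is not best)
    return duplicates


if __name__ == "__main__":
    reads = [
        Read("r1", "chr1", 100, "+", "ACGT", 30),
        Read("r2", "chr1", 100, "+", "ACGA", 20),  # same position, different bases
        Read("r3", "chr1", 100, "-", "ACGT", 30),  # other strand, so kept
        Read("r4", "chr1", 250, "+", "ACGT", 30),
    ]
    print(mark_duplicates_by_position(reads))
```

Note that `r2` is flagged even though its bases differ from `r1`, because only the mapped position enters the grouping key.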
__________________
-drd
10-29-2010, 08:27 AM   #3
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106

Quote:
Originally Posted by drio
It seems you ran out of space?

Code:
...
at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:93)
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
...
Regarding samtools, look at this thread with comments from the author.

It does not consider the sequence. Also, take a look at the mathematical models implemented in samtools. Entries 1.1 and 1.2 detail the chances of getting duplicates at the library and mapping levels.

Thanks for the reply, Drio. I did run out of space, but I also set the MAX* param to a much higher value with the same end result. I still get the error...
10-29-2010, 12:03 PM   #4
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Quote:
Originally Posted by JohnK
Thanks for the reply, Drio. I did run out of space, but I also set the MAX* param to a much higher value with the same end result. I still get the error...

Why do you expect that setting the MAX* param would eliminate the "running out of space" error? Now if you said "I ran out and put a new 10 TB RAID-5 disk on my system and slapped on an extra 256 GB of memory with the same end result" then I would be concerned.

More seriously, assuming you are running on a *nix-based system, it is possible that the program is set to save temporary files in '/tmp'. On many systems '/tmp' is actually backed by memory instead of disk. Thus it is possible to run out of "disk space" even though you have lots of disk space.

Or you may simply be out of disk space. How much do you have free?
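
One quick way to answer that question is to ask the operating system how much room is left on the filesystem that holds the temporary directory. The sketch below is a minimal Python illustration and not part of Picard. The helper name `free_space_report` is hypothetical, and it assumes the job spills to the directory Python reports as the default temp directory; a Java program uses its own `java.io.tmpdir`, which may point somewhere else.

```python
import shutil
import tempfile


def free_space_report(path):
    """Return (total, used, free) in GiB for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.used / gib, usage.free / gib


if __name__ == "__main__":
    # Assumption: the job writes its spill files to this same directory.
    tmp_dir = tempfile.gettempdir()
    total, used, free = free_space_report(tmp_dir)
    print(f"{tmp_dir}: {free:.1f} GiB free of {total:.1f} GiB")
```

If the free figure is small, pointing the job at a larger scratch area is worth trying. Picard tools accept a `TMP_DIR=` option for this, and the JVM honours `-Djava.io.tmpdir=`; check the documentation for your version before relying on either.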
11-02-2010, 09:20 PM   #5
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106

Quote:
Originally Posted by westerman
Why do you expect that setting the MAX* param would eliminate the "running out of space" error? Now if you said "I ran out and put a new 10 TB RAID-5 disk on my system and slapped on an extra 256 GB of memory with the same end result" then I would be concerned.

More seriously, assuming you are running on a *nix-based system, it is possible that the program is set to save temporary files in '/tmp'. On many systems '/tmp' is actually backed by memory instead of disk. Thus it is possible to run out of "disk space" even though you have lots of disk space.

Or you may simply be out of disk space. How much do you have free?

It was a similar issue, but my sys-admin found it. One program was eating /tmp and putting it over the top.
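
For anyone chasing the same kind of problem, a rough way to see what is filling a temp directory is to total the size of each top-level entry and sort. This is a hedged Python sketch, not the tool my sys-admin used; the function names `dir_size` and `largest_entries` are made up for the example, and entries the current user cannot read are silently skipped.

```python
import os
import tempfile


def dir_size(path):
    """Total size in bytes of the files under `path`, skipping unreadable ones."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or permission denied
    return total


def largest_entries(directory, top=10):
    """Return [(size_bytes, path), ...] for the biggest top-level entries."""
    sizes = []
    for entry in os.scandir(directory):
        try:
            if entry.is_dir(follow_symlinks=False):
                size = dir_size(entry.path)
            else:
                size = entry.stat(follow_symlinks=False).st_size
        except OSError:
            continue  # entry disappeared or is not accessible
        sizes.append((size, entry.path))
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    for size, path in largest_entries(tempfile.gettempdir()):
        print(f"{size / 1024 ** 2:10.1f} MiB  {path}")
```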