SEQanswers

Old 10-23-2017, 07:15 PM   #181
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,550

Can you check the R2 file by the same method?
Old 10-23-2017, 09:29 PM   #182
bio_d
Junior Member
 
Location: USA

Join Date: Oct 2017
Posts: 5

Hi,

There seems to be nothing wrong with the second read either:

zcat mate_pair_2.fq.gz | sed -n '2051361,2051364p'

@HWI-D00294:282:CAB4VANXX:7:1101:13812:601352:N:0:GCCAAT
TTGAAGCAGCAGTTCAAAAACATTGTCTCAGTCTGTCTTAATTTGGTATAATCCCCTGAATCTATTAAACCAAGACCAGCTGTCTGACATTTTTCACTATTTTCTTTTCTCCGCTTGTTCTTTTC
+
@[email protected]@[email protected]>[email protected]@[email protected]@>FG>FG>DEG1E>[email protected]>[email protected]>GGFC=D>[email protected]>F>[email protected]>[email protected]
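
The same spot check can be scripted so that R1 and R2 are inspected identically. What follows is a minimal Python sketch, not part of BBTools: the function names `fastq_record_at` and `check_record` are hypothetical, and it assumes a plain four-line-per-record FASTQ file compressed with gzip.

```python
import gzip
from itertools import islice


def fastq_record_at(path, first_line):
    """Return the 4-line FASTQ record that starts at 1-based line `first_line`."""
    # In a four-line FASTQ file, records start on lines 1, 5, 9, ...
    if (first_line - 1) % 4 != 0:
        raise ValueError("a FASTQ record starts on line 4n+1")
    with gzip.open(path, "rt") as handle:
        lines = [ln.rstrip("\n")
                 for ln in islice(handle, first_line - 1, first_line + 3)]
    if len(lines) != 4:
        raise ValueError("file ends before a complete record")
    return lines


def check_record(lines):
    """Return a list of problems found in one 4-line record (empty if none)."""
    header, seq, plus, qual = lines
    problems = []
    if not header.startswith("@"):
        problems.append("header does not start with '@'")
    if not plus.startswith("+"):
        problems.append("separator line does not start with '+'")
    if len(seq) != len(qual):
        problems.append(
            f"sequence length {len(seq)} != quality length {len(qual)}")
    return problems
```

Calling `check_record(fastq_record_at("mate_pair_2.fq.gz", 2051361))` would inspect the same four lines as the `sed` command above; an empty list means none of these three basic checks failed.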

Last edited by bio_d; 10-23-2017 at 09:34 PM.
Old 11-02-2017, 09:32 AM   #183
santiagorevale
Member
 
Location: UK

Join Date: Dec 2016
Posts: 16

Quote:
Originally Posted by santiagorevale
Hi Brian,

I'm using "filterbyname.sh" script from bbmap v37.60 (using Java 1.8.0_102) to extract some reads from a FastQ file given a list of IDs.

The current FastQ file has 196 Mi reads and I want to keep 85 Mi. Uncompressed FastQ file size is 14G while compressed is only 1.4G. IDs file is 3.1G.

When running the script using 24G of RAM it dies with OutOfMemoryError. Isn't that an excessive use of memory for just filtering a FastQ file? Also, among the script arguments there is no "threads" option, yet the script is using all available cores. Is there any way of limiting both memory and thread usage?

Here is the error:

java -ea -Xmx24G -cp /software/bbmap-37.60/current/ driver.FilterReadsByName -Xmx24G include=t in=Sample1.I1.fastq.gz out=filtered.Sample1.I1.fastq.gz names=reads.ids
Executing driver.FilterReadsByName [-Xmx24G, include=t, in=Sample1.I1.fastq.gz, out=filtered.Sample1.I1.fastq.gz, names=reads.ids]

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at java.lang.String.toLowerCase(String.java:2647)
at java.lang.String.toLowerCase(String.java:2670)
at driver.FilterReadsByName.<init>(FilterReadsByName.java:145)
at driver.FilterReadsByName.main(FilterReadsByName.java:40)

Thank you very much in advance.

Best regards,
Santiago
Hi there,

Any hint on what I've previously asked?

Thanks!
Old 11-06-2017, 08:08 PM   #184
TomHarrop
Member
 
Location: New Zealand

Join Date: Jul 2014
Posts: 20

Hi Brian & others,

I tried to run bbnorm with a kmer size of 99, but it crashed with the following error:

Code:
Exception in thread "Thread-371" Exception in thread "Thread-357" Exception in thread "Thread-368" Exception in thread "Thread-377" Exception in thread "Thread-367" Exception in thread "Thread-362" Exception in thread "Thread-380" Exception in thread "Thread-363" Exception in thread "Thread-365" Exception in thread "Thread-364" Exception in thread "Thread-366" Exception in thread "Thread-358" Exception in thread "Thread-361" Exception in thread "Thread-360" Exception in thread "Thread-381" Exception in thread "Thread-387" Exception in thread "Thread-372" Exception in thread "Thread-399" java.lang.AssertionError: this function not tested with k>31
    at jgi.KmerNormalize.correctErrors(KmerNormalize.java:2124)
    at jgi.KmerNormalize.access$19(KmerNormalize.java:2121)
    at jgi.KmerNormalize$ProcessThread.normalizeInThread(KmerNormalize.java:3043)
    at jgi.KmerNormalize$ProcessThread.run(KmerNormalize.java:2806)
I'm wondering if I used a bad combination of parameters. Here's the call:

Code:
java -ea -Xmx132160m -Xms132160m -cp PATH/TO/bbmap/current/ jgi.KmerNormalize bits=32 in=output/trim_decon/reads.fastq.gz threads=50 out=output/k_99/norm/normalised.fastq.gz zl=9 hist=output/k_99/norm/hist_before.txt histout=output/k_99/norm/hist_after.txt target=50 min=5 prefilter ecc k=99 peaks=output/k_99/norm/peaks.txt
Otherwise, is it supported to use bbnorm with larger kmer sizes, or would you recommend estimating the target coverage for k = 99 based on the coverage at k = 31?

I've posted the log on pastebin: https://pastebin.com/jPkKagFs

Thanks again for the bbmap suite!

Tom
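
A note on the trace above: the assertion is raised inside `correctErrors`, which suggests that the `ecc` option in the call is what reaches the untested path. That reading is an inference from the stack trace only. Separately, one common reason k-mer tools treat k ≤ 31 as a special case is that a k-mer packed at 2 bits per base then fits in a single signed 64-bit integer. Whether that is the reason behind this particular assertion is an assumption. The Python sketch below only illustrates the packing; `pack_kmer` and `fits_in_signed_64` are hypothetical helper names.

```python
# 2-bit code for each base; any other character raises KeyError.
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack_kmer(kmer):
    """Pack a k-mer into an integer using 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | ENCODE[base]
    return value


def fits_in_signed_64(k):
    """True if a 2-bit-packed k-mer fits in the 63 value bits of a signed 64-bit integer."""
    return 2 * k <= 63


for k in (31, 32, 99):
    print(k, 2 * k, fits_in_signed_64(k))
```

With this encoding k = 31 needs 62 bits, while k = 99 needs 198 bits and so has to be stored some other way.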
Old 11-07-2017, 02:54 AM   #185
boulund
Member
 
Location: Sweden

Join Date: Jan 2017
Posts: 16

Hi, I just want to make sure I'm not missing anything here: randomreads.sh cannot produce metagenomes according to a specific profile, right? The only information I can find says that it draws a random number from an exponential distribution for each reference sequence and thus produces a simulated metagenome from a set of reference sequences. That is awesome, of course, but right now I would like to produce a simulated metagenome with a very specific composition.
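
One generic way to get a fixed composition, independent of any particular simulator, is to decide the number of reads per reference up front, simulate each reference on its own with that count, and concatenate the results. The Python sketch below only does the bookkeeping step. It is an assumption-laden illustration: `reads_per_reference` is a hypothetical name, and the profile is taken to mean the fraction of reads per reference, not the fraction of cells or genome copies.

```python
def reads_per_reference(profile, total_reads):
    """Split `total_reads` across references in proportion to `profile`.

    `profile` maps reference name -> relative weight (need not sum to 1).
    Uses the largest-remainder method so the counts sum exactly to
    `total_reads`.
    """
    total_weight = sum(profile.values())
    if total_weight <= 0:
        raise ValueError("profile weights must sum to a positive number")
    exact = {name: total_reads * w / total_weight
             for name, w in profile.items()}
    counts = {name: int(v) for name, v in exact.items()}
    # Hand out the reads lost to rounding down, largest fractional part first.
    leftover = total_reads - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n],
                          reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts
```

If the profile is instead meant as genome-copy abundance, each weight would first need to be multiplied by the length of its reference, since longer genomes yield more reads per copy.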
Old 11-07-2017, 05:01 AM   #186
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,550

Quote:
Originally Posted by santiagorevale
Hi there,

Any hint on what I've previously asked?

Thanks!
Perhaps. But if you have more memory, why not allocate more and see if that helps? Unless you are being charged for every megabyte you use.
Old 11-07-2017, 06:00 AM   #187
santiagorevale
Member
 
Location: UK

Join Date: Dec 2016
Posts: 16

Quote:
Originally Posted by GenoMax
Perhaps. But if you have more memory, why not allocate more and see if that helps? Unless you are being charged for every megabyte you use.
Hi GenoMax,

Because I'm running this on a cluster, getting more memory means requesting more cores (slots), and jobs requesting more cores take longer to be executed. Also, I was running this command alongside other commands requiring the same number of cores.

However, isn't it weird for the script to require much more memory than the combined size of the uncompressed FastQ file and the IDs file?

Thanks!
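
For a rough sense of scale, here is a back-of-envelope estimate in Python of what the ID list alone might cost on the Java heap. Everything in it is an assumption and not taken from the BBTools source: that the names are held as Java `String` objects in a hash set, that the JVM is a 64-bit Java 8 HotSpot with compressed pointers, and that the per-object overheads are the usual approximate values for that setup. The only inputs from this thread are the 85 Mi IDs and the 3.1G IDs file.

```python
import math


def align8(n_bytes):
    """Round up to a multiple of 8 bytes (assumed object alignment)."""
    return math.ceil(n_bytes / 8) * 8


def estimate_name_set_bytes(n_ids, ids_file_bytes):
    """Estimate heap bytes for n_ids names stored as Strings in a hash set.

    All overhead figures below are assumptions for Java 8 HotSpot with
    compressed pointers, not measurements.
    """
    avg_chars = ids_file_bytes / n_ids - 1    # drop the newline per line
    string_obj = 24                           # String header + fields
    char_array = align8(16 + 2 * avg_chars)   # array header + 2 bytes per char
    hash_node = 32                            # one hash-table entry object
    table_slot = 8                            # share of the bucket array
    return n_ids * (string_obj + char_array + hash_node + table_slot)


n_ids = 85 * 2**20            # "85 Mi" reads to keep
ids_file_bytes = 3.1 * 2**30  # "3.1G" IDs file
print(f"{estimate_name_set_bytes(n_ids, ids_file_bytes) / 2**30:.1f} GiB")
```

Under these assumptions the names alone come to something over 10 GiB, several times the size of the IDs file on disk, before counting any temporary copies such as those made by the `toLowerCase` calls visible in the stack trace. This is an estimate of order of magnitude only.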
Old 11-07-2017, 07:27 AM   #188
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,550

Quote:
Originally Posted by santiagorevale
Hi GenoMax,

Because I'm running this on a cluster, getting more memory means requesting more cores (slots), and jobs requesting more cores take longer to be executed. Also, I was running this command alongside other commands requiring the same number of cores.

However, isn't it weird for the script to require much more memory than the combined size of the uncompressed FastQ file and the IDs file?

Thanks!
While that is an odd restriction, it is what it is when one is using shared compute resources.

Just for kicks, have you tried running this on a local desktop that has a decent amount of RAM (16G)? As you speculate, just keeping the fastq headers in memory should not take a large amount of RAM.