#1 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
BBMap will be publicly released soon, pending confirmation with LBL's legal department.
In the meantime, feel free to look at these graphs of its performance: https://drive.google.com/file/d/0B3l...it?usp=sharing
Note that this is a 50 MB PowerPoint file. It contains graphs of the relative performance of BBMap and other short read aligners (bwa, bowtie2, gsnap, smalt) mapping synthetic data.
EDIT: This thread is now closed; please use this one to post questions.
Last edited by Brian Bushnell; 11-10-2014 at 12:09 PM.
#2 |
Senior Member
Location: Stockholm, Sweden Join Date: Feb 2008
Posts: 319
|
Looks very impressive! Can it beat STAR (speed and accuracy wise) for RNA-seq though? (RNA-seq is listed as one of the use cases towards the end)
|
#3 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
I have compared it to tophat, which it greatly outperforms in speed and has higher sensitivity on real RNA-seq data. I have not yet compared it to STAR - I tried to but was unable to get STAR to run without core-dumping so I gave up. I may have compiled it wrong; I'll try again eventually.
However, I don't have a really good tool for generating and evaluating synthetic RNA-seq data, so it's harder to quantify. The closest I can get is to generate synthetic DNA reads with very large deletions, which is not quite the same thing since RNA-seq data has other strange artifacts and the introns are not distributed randomly. |
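To make the idea of a synthetic read with a very large deletion concrete, here is a minimal sketch in POSIX shell. It is not BBMap's read generator: the function name `read_with_deletion`, its arguments, and the toy reference string are all hypothetical. It simply joins bases from one side of a gap to bases from the other side, which is roughly how a spliced read looks relative to the genome.

```shell
#!/bin/sh
# Sketch only: build one read that skips a stretch of the reference,
# mimicking a read that spans an intron. Hypothetical helper, not BBTools code.
#   $1 reference sequence (single line)
#   $2 1-based start position of the read
#   $3 number of bases taken before the gap
#   $4 length of the gap (the "deletion")
#   $5 number of bases taken after the gap
read_with_deletion() {
    printf '%s\n' "$1" | awk -v s="$2" -v a="$3" -v g="$4" -v b="$5" '{
        # Left part, then the part that starts right after the gap.
        print substr($0, s, a) substr($0, s + a + g, b)
    }'
}
```

For example, `read_with_deletion AAAACCCCGGGGTTTT 3 4 6 4` prints `AACCTTTT`: four bases starting at position 3, a 6-base gap, then four more bases. As noted above, real RNA-seq data has artifacts and non-random intron positions that a generator like this does not capture.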
#4 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
|
It'd be great if you could get in touch with the authors of this paper and just use their test datasets. That would allow comparisons against most of the popular aligners out there.
|
#5 | |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
|
Quote:
http://www.ebi.ac.uk/arrayexpress/ex...s/E-MTAB-1728/ http://www.ebi.ac.uk/arrayexpress/ex...-1728/samples/ |
|
#6 | |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
Quote:
|
|
#7 |
Senior Member
Location: Vienna Join Date: Mar 2010
Posts: 107
|
Why is RUM always neglected when comparing RNA-seq mappers?
In my hands RUM outperforms other pipelines, e.g. tophat, in sensitivity, especially for spliced reads.
RUM: RNA Seq Unified Mapper
https://github.com/itmat/rum/wiki
RUM is rather slow, but multithreaded servers allow mapping in a tolerable time (compared to sample and library generation and data interpretation).
dietmar
#8 |
Member
Location: Norwich Join Date: Jan 2014
Posts: 20
|
Hi Brian,
I got a file with cleaned sequence data and I want to assemble this de-novo using velvet. Due to the nature of the sequencing and the library protocol, my kmer coverage is quite variable and I wanted to use BBnorm to normalize the coverage a bit to aid the assembly. Am I correct that BBnorm is the right thing to use for this?
Anyway, currently trying to give it a go and I got this error message:
bbmap$ sh bbnorm.sh in=Fowleri_combined.fastq out=normFowleri.fastq target=15
bbnorm.sh: 104: bbnorm.sh: Bad substitution
bbnorm.sh: 112: bbnorm.sh: [[: not found
bbnorm.sh: 112: bbnorm.sh: [[: not found
bbnorm.sh: 118: bbnorm.sh: source: not found
bbnorm.sh: 119: bbnorm.sh: parseXmx: not found
bbnorm.sh: 120: bbnorm.sh: [[: not found
bbnorm.sh: 123: bbnorm.sh: freeRam: not found
java -ea -Xmxm -cp /home/martin/Downloads/bbmap/current/ jgi.KmerNormalize bits=32 in=Fowleri_combined.fastq
Invalid maximum heap size: -Xmxm
Could not create the Java virtual machine.
Any ideas?
Many thanks,
Sarah
#9 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
Sarah,
Yes, BBNorm is the correct tool. I'm not sure, but I suspect that your shell is not bash. You could retry the command with "bash" instead of "sh", which may work. But the easier thing is just to skip the shellscript and invoke java manually:
java -ea -Xmx14g -cp /home/martin/Downloads/bbmap/current/ jgi.KmerNormalize bits=32 in=Fowleri_combined.fastq out=normFowleri.fastq target=15
That command would work if you had 16 GB of RAM. Just set the -Xmx parameter (14g in the command above) to about 85% of however much RAM is on the machine. If you don't know, you should be able to find out like this on a Linux system:
cat /proc/meminfo
...then look at the first line, "MemTotal".
However, 15x is a fairly low target depth. For velvet I would suggest at least 30x for an optimal assembly, unless you just don't have enough data.
-Brian
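The "about 85% of MemTotal" rule can be sketched as a small POSIX shell helper. This is only an illustration of the arithmetic, not something shipped with BBTools; the name `xmx_from_kb` is hypothetical, and it assumes MemTotal is reported in kB, as it is in /proc/meminfo on Linux.

```shell
#!/bin/sh
# Sketch: turn a MemTotal value in kB into a java -Xmx flag set to 85% of it.
# xmx_from_kb is a hypothetical helper, not part of BBTools.
xmx_from_kb() {
    # Work in megabytes so that small machines do not round down to 0g.
    echo "-Xmx$(( $1 * 85 / 100 / 1024 ))m"
}

# Only read /proc/meminfo where it exists (Linux).
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    if [ -n "$total_kb" ]; then
        echo "Suggested heap flag: $(xmx_from_kb "$total_kb")"
    fi
fi
```

For a machine reporting a MemTotal of 16777216 kB, `xmx_from_kb 16777216` prints `-Xmx13926m`, in the same range as the -Xmx14g used above. The flag can then be pasted into the java command line in place of -Xmx14g.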
#10 |
Member
Location: Norwich Join Date: Jan 2014
Posts: 20
|
Hi Brian,
That worked like a charm, thank you! The normalization also greatly improved the assemblies and the kmer-coverage distribution looks much nicer.
I was just wondering: by default, bbnorm will use a kmer of 31. But for my assembly I am using 41. The assembly works fine, but is it advisable to normalize the coverage using a kmer of 41?
Thanks,
Sarah
#11 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
Sarah,
It might be better to normalize using a kmer length of 41, but BBNorm only supports a maximum of 31. It's a lot more computationally efficient to use a max kmer length of 31, so that's how I designed it. I've tried shorter kmers down to about k=25 and not noticed an appreciable difference in normalization or error correction.
As for your prior (deleted) post, sorry for not responding - I think the problem was that you were running Java 6 instead of Java 7. Most of the programs in BBTools work fine in Java 6 but it looks like BBNorm requires Java 7 (or higher).
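The thread does not say why 31 is the cutoff. One common reason in k-mer tools generally, offered here as an assumption rather than as a description of BBNorm's internals, is that at 2 bits per base a 31-mer needs 62 bits and so fits in a single 64-bit integer, while a 41-mer does not. A minimal sketch of that packing in POSIX shell follows; `encode_kmer` is a hypothetical helper, and the A/C/G/T to 0/1/2/3 mapping is just one possible choice.

```shell
#!/bin/sh
# Sketch: pack a k-mer into one integer at 2 bits per base.
# Assumes the shell does 64-bit arithmetic (true for bash and dash on
# common 64-bit Linux systems). Hypothetical helper, not BBTools code.
encode_kmer() {
    kmer=$1
    value=0
    while [ -n "$kmer" ]; do
        rest=${kmer#?}            # everything after the first character
        base=${kmer%"$rest"}      # the first character itself
        case $base in
            A) code=0 ;;
            C) code=1 ;;
            G) code=2 ;;
            T) code=3 ;;
            *) return 1 ;;        # reject N and anything else
        esac
        # Shift the previous bases left by 2 bits and append this one.
        value=$(( (value << 2) | code ))
        kmer=$rest
    done
    echo "$value"
}
```

`encode_kmer ACGT` prints `27`. A run of 31 T bases, the largest 31-mer under this mapping, encodes to 4611686018427387903 (2^62 - 1), which still fits in a signed 64-bit integer; anything much longer would need more than one machine word per k-mer.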
#12 |
Member
Location: Norwich Join Date: Jan 2014
Posts: 20
|
Hi Brian,
Thanks so much for that explanation. Sorry as well for just deleting my post (and bombarding you with simple questions, new to the world of NGS!). I played around with updating the Java on our Linux machine and that did the trick.
Thanks again for your help! And the fantastic and easy to use script!!
Sarah
#13 |
Member
Location: USA Join Date: Jun 2012
Posts: 10
|
Hi Brian,
Is there an option to set read quality encoding in bbnorm? I had to set qin=33 in bbduk for some Illumina 1.9 paired end libraries, but this option doesn't seem to exist in bbnorm (used BBMap v. 32.32 for Java 7).
Thanks
Olaf
#14 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
Olaf,
It's there, I just forgot to document it; sorry! I'll add that to the shellscript in the next release. I think that all of the programs in the package that read fastq input allow the "qin" flag. -Brian |
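For readers unsure which value to pass to "qin", the offset can often be guessed from the quality strings themselves: a character with an ASCII code below 59 cannot occur in the older +64 encodings, so seeing one implies an offset of 33. The sketch below is a rough heuristic in POSIX shell, not how BBTools detects the encoding; `guess_qin` is a hypothetical helper, and it assumes an uncompressed FASTQ file with exactly four lines per record.

```shell
#!/bin/sh
# Sketch: guess the FASTQ quality offset from the lowest quality character.
# Hypothetical helper; assumes plain-text FASTQ with 4 lines per record.
guess_qin() {
    # Every 4th line is a quality string; sample the first 4000 of them,
    # split into single characters and keep the lowest byte value.
    lowest=$(awk 'NR % 4 == 0' "$1" | head -n 4000 \
        | LC_ALL=C fold -w 1 | grep . | LC_ALL=C sort -u | head -n 1)
    [ -n "$lowest" ] || return 1
    # printf with a leading quote prints the character's numeric code.
    code=$(printf '%d' "'$lowest")
    if [ "$code" -lt 59 ]; then
        echo 33
    else
        # No low characters seen: probably +64, but high-quality +33
        # data can look the same, so treat this as a guess.
        echo 64
    fi
}
```

If `guess_qin reads.fastq` prints 33, that agrees with setting qin=33 as described above. A result of 64 is less certain and worth checking against what the sequencing provider reports.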
#15 |
Member
Location: USA Join Date: Jun 2012
Posts: 10
|
Indeed, just tried it and it works well with bbnorm.
Thanks Olaf |
#16 |
Member
Location: USA Join Date: Jun 2012
Posts: 10
|
Brian,
I ran into a smaller issue with bbnorm. When trying to input and output separate files for a PE library like this:
Code:
bbnorm.sh in1=R1.fastq.gz in2=R2.fastq.gz out1=R1.bbnorm.fastq.gz out2=R2.bbnorm.fastq.gz prefilter=t tossbadreads=t ecc=t fixspikes=t qin=33 -Xmx72g target=40
Code:
Exception in thread "main" java.lang.AssertionError: Please do not set 'interleaved=true' with dual input files.
    at stream.ConcurrentGenericReadInputStream.<init>(ConcurrentGenericReadInputStream.java:132)
    at stream.ConcurrentGenericReadInputStream.getReadInputStream(ConcurrentGenericReadInputStream.java:661)
    at stream.ConcurrentGenericReadInputStream.getReadInputStream(ConcurrentGenericReadInputStream.java:641)
    at kmer.KmerCount7MTA.countFastq(KmerCount7MTA.java:355)
    at kmer.KmerCount7MTA.makeKca(KmerCount7MTA.java:222)
    at jgi.KmerNormalize.runPass(KmerNormalize.java:1006)
    at jgi.KmerNormalize.main(KmerNormalize.java:736)
Olaf
#17 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
Olaf,
Currently, BBNorm uses single interleaved files for temporary storage when using multiple passes. And I have not implemented any way to specify dual files in intermediate stages, since everyone at JGI uses interleaved files for everything. You have two options.
1) You could set "passes=1", which is faster, but I don't recommend it because it doesn't give as good results as 2-pass normalization.
or
2) You could specify only a single output file, which will get interleaved reads:
bbnorm.sh in1=R1.fastq.gz in2=R2.fastq.gz out=R12.bbnorm.fastq.gz prefilter=t tossbadreads=t ecc=t fixspikes=t qin=33 -Xmx72g target=40
...Then, if you need to, de-interleave it afterward:
reformat.sh in=R12.bbnorm.fastq.gz out1=R1.bbnorm.fastq.gz out2=R2.bbnorm.fastq.gz
Sorry for the inconvenience! I'll try to fix that by the next release, though unlike documenting the "qin" flag, this will take more work so no guarantees. Thanks for bringing it to my attention.
FYI, the flag "interleaved" has no effect on output, only input.
-Brian
Last edited by Brian Bushnell; 06-23-2014 at 07:05 PM.
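For anyone unfamiliar with the term, an interleaved file holds read 1 and read 2 of each pair as consecutive records. The sketch below shows what de-interleaving amounts to, using awk from a POSIX shell. It is a stand-in for illustration, not the implementation of reformat.sh; `deinterleave` is a hypothetical helper, and it assumes uncompressed FASTQ with exactly four lines per record.

```shell
#!/bin/sh
# Sketch: split an interleaved FASTQ into separate read 1 and read 2 files.
# Hypothetical helper; assumes plain-text FASTQ with 4 lines per record.
#   $1 interleaved input, $2 output for read 1, $3 output for read 2
deinterleave() {
    awk -v r1="$2" -v r2="$3" '{
        # Records are 4 lines long. Counting records from 0, the
        # even-numbered ones are read 1 and the odd-numbered ones read 2.
        if (int((NR - 1) / 4) % 2 == 0)
            print > r1
        else
            print > r2
    }' "$1"
}
```

A call such as `deinterleave R12.fastq R1.fastq R2.fastq` would write alternating records to the two outputs. For the gzipped files in the commands above, reformat.sh remains the practical choice, since it handles compression and validates pairing.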
#18 |
Member
Location: USA Join Date: Jun 2012
Posts: 10
|
Thanks for the info Brian, it wasn't a big issue.
Olaf |
#19 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
Olaf,
This has been fixed in the latest release, 33.04 |
#20 |
Member
Location: USA Join Date: Jun 2012
Posts: 10
|
Excellent, just did a test run. This is very useful software!
Olaf |