SEQanswers

Showing results 1 to 25 of 32
Search: Posts Made By: Brian Bushnell
Forum: Bioinformatics 10-13-2017, 11:14 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Hi Gopo, I don't particularly recommend...

Hi Gopo,

I don't particularly recommend Tadpole for diploid (or higher) genomes, as it has absolutely no capability of dealing with heterozygous sites. However, it's really fast, so even with a...
Forum: Bioinformatics 04-06-2017, 10:12 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
That's correct. However, most datasets have a...

That's correct. However, most datasets have a strong peak for singleton kmers, due to sequencing error. It's just that typically, for isolates, there is also an obvious higher peak (at, say, 40x...
Forum: Bioinformatics 04-04-2017, 05:01 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
You might start by looking at a kmer frequency...

You might start by looking at a kmer frequency histogram to see if there is a single clear peak, or multiple peaks, or what. Additionally, BLASTing your assembly to nt to see if it is perhaps...
Forum: Bioinformatics 03-23-2017, 01:46 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Thanks for this report. This may be an aspect of...

Thanks for this report. This may be an aspect of garbage collection. I noticed that it says "Ways=541", which I would not have expected; it's supposed to set "ways" to be much lower, based on the...
Forum: Bioinformatics 02-28-2017, 10:31 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Tadpole may generate contigs that overlap by up...

Tadpole may generate contigs that overlap by up to K-1 on the ends when there is a branch in the De Bruijn graph. You can trim these with flag "trimends=15" where the number should be set to K/2,...
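A rough sketch of the K/2 rule above, assuming the default K of 31 and rounding down (the rounding is an assumption here; the post only says K/2). The file names are placeholders, and the command is only printed, not run:

```shell
k=31                      # kmer length; 31 is the default K mentioned elsewhere in this thread
trimends=$((k / 2))       # shell integer division rounds down: 31/2 -> 15
max_overlap=$((k - 1))    # contig ends may overlap by up to K-1 bases
# reads.fq and contigs.fa are placeholder file names
cmd="tadpole.sh in=reads.fq out=contigs.fa k=$k trimends=$trimends"
echo "$cmd"
```

If K is raised for high-coverage data, the same arithmetic gives the matching trimends value.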
Forum: Bioinformatics 02-09-2017, 10:12 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
De-novo assembly can return single-contig...

De-novo assembly can return single-contig assemblies, but it's data-dependent. Normally, if I pull the PhiX reads from a dataset, Tadpole will assemble them into a single, perfect contig. But it...
Forum: Bioinformatics 02-05-2017, 10:55 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
BBMerge can't force-trim just one read of the...

BBMerge can't force-trim just one read of the pair, sorry. You'd have to have the paired reads in two files, and run BBDuk on the r2 file only, with the flags "ftr2=50 ordered".

"ecct" and...
Forum: Bioinformatics 02-03-2017, 09:24 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Tadpole is a purely kmer-based assembler, so by...

Tadpole is a purely kmer-based assembler, so by the time assembly is being done, there is no knowledge of which reads were used. Therefore, Tadpole can't provide that output. So, there are two ways...
Forum: Bioinformatics 02-02-2017, 11:37 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
This appears to be getting killed by the...

This appears to be getting killed by the scheduler because it's using more memory than it's allowed to, although usually that happens much faster so I can't be sure. I suggest you reduce the -Xmx...
Forum: Bioinformatics 02-01-2017, 03:51 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
So, I would recommend a command like this: ...

So, I would recommend a command like this:

bbmerge-auto.sh in1=r1.fq in2=r2.fq out=merged.fq outu=unmerged.fq prefilter=1 extend2=50 k=62 rem adapter=default

This operates in 3 phases.
...
Forum: Bioinformatics 01-27-2017, 08:16 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Yes, the input can be comma-delimited. For...

Yes, the input can be comma-delimited. For error-correction:

tadpole.sh in=a.fq.gz,b.fq.gz out=a_ecc.fq.gz,b_ecc.fq.gz ecc

For assembly:

tadpole.sh in=a.fq.gz,b.fq.gz out=contigs.fa
...
Forum: Bioinformatics 01-17-2017, 03:34 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
I don't really know where the high mutation rate...

I don't really know where the high mutation rate comes from in supposedly isolate libraries. Unfortunately, there's no mechanism in Tadpole to fix them - it always halts at a branch and breaks into...
Forum: Bioinformatics 01-17-2017, 01:07 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Hi Jake, The problem is the presence of...

Hi Jake,

The problem is the presence of spaces in your filepath. Try adding quotes:

java -ea -Xmx15000m -Xms15000m -cp /Users/DJV/programs/bbmap/current/ assemble.Tadpole...
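To see why spaces in a path cause trouble, here is a small sketch in bash or POSIX sh. The path and the helper function count_args are hypothetical and only illustrate word splitting; they are not part of BBTools:

```shell
# Hypothetical path containing a space; no file needs to exist for this demo
path="/Users/me/my reads/r1.fq"

# Helper that reports how many arguments it received
count_args() { echo $#; }

unquoted=$(count_args in=$path)     # split at the space into 2 arguments
quoted=$(count_args in="$path")     # quotes keep it as 1 argument
echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

Without quotes, the program receives a truncated path plus a stray second argument, which is why quoting the whole path fixes the error.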
Forum: Bioinformatics 12-16-2016, 08:43 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Thanks for noting that; I've corrected it. ...

Thanks for noting that; I've corrected it. Originally, for error-correction, you had to say "oute" for the output but I changed that a while ago so the current syntax is "out1" and "out2".
Forum: Bioinformatics 12-01-2016, 10:24 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Since you are pulling the kmer information from...

Since you are pulling the kmer information from randomly-sheared metagenomic reads, and using the variable region as a seed, it should be safe. However, there is a possibility of running into a...
Forum: Bioinformatics 06-30-2016, 10:15 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Gringer is correct; the files appear to be...

Gringer is correct; the files appear to be corrupt. But just to add a small correction, it does not look like wc can handle gzipped files, so I'd suggest either:

zcat read1.corrected.fq.gz | wc...
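A minimal sketch of that kind of check, using a tiny demo file created on the fly (the file name and contents are made up for illustration). It uses gzip -dc, which behaves like zcat on Linux and is a little more portable, and assumes 4-line FASTQ records:

```shell
# Build a small gzipped FASTQ with two reads in a temporary directory
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip -c > "$tmp/demo.fq.gz"

# gzip -t exits non-zero if the archive is truncated or corrupt
if gzip -t "$tmp/demo.fq.gz"; then status=ok; else status=corrupt; fi

# wc must read the decompressed stream, not the .gz file itself
lines=$(gzip -dc "$tmp/demo.fq.gz" | wc -l | tr -d ' ')
reads=$((lines / 4))    # 4 lines per FASTQ record
echo "$status: $lines lines, $reads reads"
rm -rf "$tmp"
```

For paired files, the read counts from both files should match; a mismatch or a failed gzip -t suggests a truncated or corrupt file.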
Forum: Bioinformatics 11-06-2015, 10:52 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Sure: #First link reference as ref.fa and...

Sure:

#First link reference as ref.fa and reads as reads.fa

/global/projectb/sandbox/gaag/bbtools/jgi-bbtools/kmercountexact.sh in=reads.fq.gz khist=khist_raw.txt peaks=peaks_raw.txt
...
Forum: Bioinformatics 10-28-2015, 04:59 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Hi Rob, Tadpole does not do subsampling;...

Hi Rob,

Tadpole does not do subsampling; you'd have to first sample 10% of the reads with another tool (such as Reformat, with the "samplerate=0.1" flag). However, you CAN restrict the input to a...
Forum: Bioinformatics 10-17-2015, 12:45 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
With extremely high coverage, it is often...

With extremely high coverage, it is often beneficial to normalize first, for any assembler. For example -

bbnorm.sh in=reads.fq out=normalized.fq min=2 target=100

You don't need that for...
Forum: Bioinformatics 10-16-2015, 04:19 AM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
If you have super-high coverage like that, at a...

If you have super-high coverage like that, at a minimum, increasing K above the default (31) is usually helpful.
Forum: Bioinformatics 10-15-2015, 11:02 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
In default mode, Tadpole assembles reads and...

In default mode, Tadpole assembles reads and produces contigs. In "extend" or "correct" mode, it will extend or correct input sequences - which can be reads or contigs, but it's designed for reads. ...
Forum: Bioinformatics 10-14-2015, 09:08 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
True... but I still want to evaluate the...

True... but I still want to evaluate the difference in speed between that and a lookup-array - "if(array[char])", which would only require 128 bytes (assuming negative values were prefiltered, which...
Forum: Bioinformatics 10-14-2015, 07:11 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
Thanks for reporting that... I didn't realize...

Thanks for reporting that... I didn't realize Tadpole required Java 1.7+. I'll look into it tomorrow - I may be able to switch to something supported in 1.6. Or, of course, just write the method...
Forum: Bioinformatics 10-14-2015, 06:46 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
You can merge reads like this: ...

You can merge reads like this:

bbmerge-auto.sh in=reads.fq out=merged.fq outu=unmerged.fq ihist=ihist.txt extend2=20 iterations=10 k=31 ecct qtrim2=r trimq=12 strict

BBMerge will then attempt...
Forum: Bioinformatics 10-14-2015, 06:17 PM
Replies: 104
Views: 41,659
Posted By Brian Bushnell
That's fine, and expected - with "mode=extend...

That's fine, and expected - with "mode=extend el=50 er=50" reads will be extended at most 50bp in each direction, then stop. So for 2x250bp data, you could at best generate 350bp sequences. The...
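The 350bp figure follows directly from the flags; as a quick check of the arithmetic (variable names are illustrative only):

```shell
read_len=250   # length of each read in 2x250bp data
el=50          # maximum extension to the left
er=50          # maximum extension to the right
max_len=$((read_len + el + er))
echo "longest possible extended read: ${max_len}bp"
```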
All times are GMT -8.