01-05-2017, 01:42 PM   #407
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
Originally Posted by JVGen
That's unfortunately about as much as the error message says - that Tadpole cannot use a mixture of paired and unpaired reads.
How many files do you have after BBMap, and what are they named?

Quote:
It might be the read-name format that is throwing it off? For instance, my reads are named in the following format after BBMap (In Geneious):

MN00123:91:000H22WH3:1:22104:14105:5672_1:N:0:1/2
MN00123:91:000H22WH3:1:22104:14105:5672_1:N:0:1/1
I doubt it - BBTools should be able to handle reads named like that.

Quote:
I didn't know about that feature with BBDuk! Will entropy of 0.01 remove any string of a mononucleotide? Or, how many must be present in a string to flag it? Is this with a window size of 50 and kmer size of 5?
With the default window=50 entropyk=5, reads must be at least 50bp long to be processed by the entropy filter (you can reduce that by making the window smaller). entropy=0.01 will remove any sequence that is a single mononucleotide repeat, as long as it's at least 50bp long. Note that if there are some errors, so that it is no longer a pure mononucleotide run, you'd need a higher value for entropy. Something like "AAAAAAAAAAGGGGGGGGGGGGGGGG" would also need a higher value (50 A's and 50 G's appears to need entropy=0.21). Don't set it too high, though, or you'll lose the low-complexity parts of your genome.
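
To give a feel for why a pure homopolymer scores near zero while an A-run followed by a G-run scores higher, here is a rough Python sketch of a windowed k-mer entropy. This is only an illustration under my own assumptions (normalized Shannon entropy of k-mer frequencies, averaged over all windows); it is not BBDuk's actual code, the function names are made up, and it will not necessarily reproduce the exact thresholds quoted above.

```python
import math
from collections import Counter

def window_entropy(window: str, k: int = 5) -> float:
    """Normalized Shannon entropy (0..1) of the k-mers in one window.
    Illustrative only; the normalization is an assumption, not BBDuk's formula."""
    n = len(window) - k + 1          # number of k-mers in the window
    if n <= 1:
        return 0.0
    counts = Counter(window[i:i + k] for i in range(n))
    h = sum(-(c / n) * math.log(c / n) for c in counts.values())
    max_h = math.log(min(n, 4 ** k)) # entropy if every k-mer were distinct
    return h / max_h

def mean_entropy(seq: str, window: int = 50, k: int = 5):
    """Average window entropy over the read, or None if the read is
    shorter than the window (too short to be evaluated)."""
    if len(seq) < window:
        return None
    values = [window_entropy(seq[i:i + window], k)
              for i in range(len(seq) - window + 1)]
    return sum(values) / len(values)

homopolymer = "A" * 60
two_runs = "A" * 50 + "G" * 50
print(mean_entropy(homopolymer))  # pure mononucleotide: zero entropy
print(mean_entropy(two_runs))     # somewhat higher, but still low
print(mean_entropy("A" * 30))     # shorter than the window: not evaluated
```

In this sketch a read would be discarded when its score falls below the chosen cutoff, which is why a very low cutoff only catches pure homopolymers and a higher one starts catching simple two-run sequences too.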