dietmar13 04-29-2015 10:48 PM

BBMap for BitSeq

I want to use BBMap to estimate transcript expression levels from RNA-seq reads. Mapping will be against transcripts, not the genome.

The BitSeq manual uses Bowtie with the following settings:


bowtie -q -v 3 -3 0 -p 4 -a -m 100 --sam
-v 3: 3 mismatches allowed
-3 0: no 3' trimming (and no 5' trimming); both are the defaults
-a -m 100: report all alignments of a read, but only if it has at most 100 possible mapping positions

What should the BBMap parameters look like?

Code: ambiguous=all maxsites2=100 ( secondary=TRUE sssr=0.95 maxsites=100 )
Is there a parameter for the number of mismatches allowed per read?
Would "secondary=TRUE sssr=0.95 maxsites=100" be a good idea for mapping reads to the human transcriptome? Does sssr=0.95 correspond to roughly 2 mismatches per 75-base read?

Should the reads be trimmed and quality-filtered beforehand, and if so, with which parameters?


Brian Bushnell 04-30-2015 09:40 AM

I imagine the main reason for specifying 3 mismatches with Bowtie is that 3 is the maximum it allows.

BBMap normally adjusts sensitivity with the minid/minratio parameter, though you can optionally set "subfilter=3" to ban alignments with more than 3 substitutions. I don't see how that would be beneficial to transcriptome mapping, though.

I would suggest:

" (file parameters) ambig=all maxsites=100 maxindel=100"

...and if you really want, add "subfilter=3". For transcriptome mapping there's no reason to allow the default maxindel=16000. And there is not much need to apply any quality-trimming or filtering unless you add the "subfilter" flag and have low-quality data, though you can do that if you want with "qtrim=rl trimq=10" to trim both ends to Q10. I do, however, always recommend adapter-trimming, particularly when requiring high-identity alignments.
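
For anyone who wants it spelled out, here is a rough sketch of how the suggestion above might be assembled into a full command. The file names (transcripts.fa, reads.fq, mapped.sam) are placeholders and not from this thread; "ref=", "in=" and "out=" stand in for the "(file parameters)". The script only prints the command, so it can be checked before anything is run.

```shell
#!/bin/sh
# Placeholder file names (assumptions, not from the thread); replace with your own.
REF=transcripts.fa
READS=reads.fq
OUT=mapped.sam

# Mapping parameters as suggested above: report all ambiguous sites,
# cap reported sites at 100, and limit indel length to 100 for transcriptome mapping.
BBMAP_ARGS="ref=$REF in=$READS out=$OUT ambig=all maxsites=100 maxindel=100"

# Optional: uncomment to ban alignments with more than 3 substitutions.
# BBMAP_ARGS="$BBMAP_ARGS subfilter=3"

# Optional: uncomment to quality-trim both read ends to Q10 before mapping.
# BBMAP_ARGS="$BBMAP_ARGS qtrim=rl trimq=10"

# Dry run: print the command instead of executing it.
echo "bbmap.sh $BBMAP_ARGS"
```

Once the printed line looks right, it can be pasted into the terminal or the final echo replaced with a direct call to bbmap.sh. Adapter trimming, if needed, would be a separate step run on the reads beforehand.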

