Hi everyone, I am working with human transcriptome data sequenced on SOLiD (single-read at the moment, but I plan to try mate-pair next) and using Bowtie/TopHat/Cufflinks to align the reads. I am fairly new here and I have a simple but very important question.
How many multiple alignments should you allow for each read?
It's a difficult question when a read maps equally well to more than one location (how do you decide which is best?). As I understand it, TopHat by default reports up to 40 different alignments per read. I know this can be adjusted with the --max-multihits option, but that discards every alignment for reads that map to more sites than the limit. Bowtie has the --best --strata options, but these still report every alignment that ties for best (at least they do when I run the command). There are also the -k, -a, -m and -M options.
How am I supposed to know or choose which option(s) to apply to my data? Surely if I allow the software to report all possible mappings then I am biasing my data for later stages when RPKM is calculated?
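To make the worry concrete, here is a minimal Python sketch with made-up toy data. The function name `rpkm`, the read and gene names, and the 1/n weighting are my own illustrative assumptions, not what TopHat or Cufflinks actually do internally. It compares counting every reported alignment in full against splitting each read evenly across its hits, using RPKM = 10^9 * C / (N * L).

```python
from collections import defaultdict

def rpkm(read_hits, gene_lengths, fractional=False):
    """Toy RPKM calculation (hypothetical helper, not a Cufflinks API).

    read_hits:    dict of read id -> list of genes the read aligned to
    gene_lengths: dict of gene -> length in bp
    fractional:   if True, a read with n hits contributes 1/n to each gene;
                  if False, every reported alignment counts as a full read
    """
    counts = defaultdict(float)
    for genes in read_hits.values():
        if not genes:            # unaligned read, contributes nothing
            continue
        weight = 1.0 / len(genes) if fractional else 1.0
        for gene in genes:
            counts[gene] += weight
    total = sum(counts.values())  # N: total (weighted) mapped reads
    if total == 0:
        return {gene: 0.0 for gene in gene_lengths}
    return {
        gene: 1e9 * counts[gene] / (total * length)
        for gene, length in gene_lengths.items()
    }

# Toy data: r3 aligns equally well to both genes (e.g. paralogs).
gene_lengths = {"geneA": 1000, "geneB": 1000}
read_hits = {
    "r1": ["geneA"],
    "r2": ["geneA"],
    "r3": ["geneA", "geneB"],
    "r4": ["geneB"],
}

all_hits = rpkm(read_hits, gene_lengths, fractional=False)
split_hits = rpkm(read_hits, gene_lengths, fractional=True)
print("all alignments counted:", all_hits)
print("multireads split 1/n:  ", split_hits)
```

In this toy case counting all alignments inflates the total from 4 reads to 5 and shifts both genes' values relative to the split version, which is the kind of bias I am asking about.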
I just wondered what the wider community was doing about this, and what sort of parameters you are applying to your data?
Many thanks
Helen