Originally posted by alexdobin
On a side note, do you know the best approach for generating the unique mappability of all genomic positions? Since I discard multi-mappers and consider only uniquely mapped reads, I want to exclude the positions in the genome (transcriptome) that are inherently unmappable for a given read length because they have one or more duplicates somewhere else (e.g., paralogs).
This is not trivial, mainly because I need to synthesize the reads myself. If a read spans only one junction, that is OK, but if it spans multiple junctions, or the second exon undergoes A5SS (alternative 5' splice site usage), this becomes a serious headache. Have you heard of any tools for this, or of a readily available map?
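To make concrete what I mean by synthesizing the reads, here is a rough Python sketch of the idea, not a real tool. It assumes exact matching only (no mismatches or indels), counts both strands, and takes exon coordinates as 0-based half-open intervals. The names `spliced_sequence` and `unique_mappability` are made up for illustration. Building the spliced transcript first means a sliding window of read length k produces reads across one or several junctions without special handling. One known gap: isoforms that share an exon would flag each other as duplicates unless k-mers are deduplicated by genomic coordinate.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Strand-independent key: the smaller of a k-mer and its reverse complement."""
    return min(kmer, revcomp(kmer))


def spliced_sequence(genome, exons):
    """Concatenate exons (0-based, half-open, in transcript order) into one sequence."""
    return "".join(genome[start:end] for start, end in exons)


def unique_mappability(seqs, k):
    """For each sequence, flag every start position whose k-mer occurs exactly once
    across all sequences (either strand, exact match only)."""
    counts = Counter()
    for seq in seqs.values():
        for i in range(len(seq) - k + 1):
            counts[canonical(seq[i:i + k])] += 1
    return {
        name: [counts[canonical(seq[i:i + k])] == 1
               for i in range(len(seq) - k + 1)]
        for name, seq in seqs.items()
    }
```

Positions flagged `False` are the ones I would want to mask for that read length. For a whole genome this in-memory counting would not scale, so it is only meant to pin down the definition.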
GH
PS. This is a MUCH harder task with Bowtie2 (or any splicing-ignorant mapper). STAR can map a read to non-contiguous regions of the genome, whereas for Bowtie2 one needs to build a library of all junctions with the various exon arrangements, and even then forming the reads is not easy.