I am working with Illumina data from human cells. The reads are paired-end and were initially 50 bp long. I had to trim some of them, so I now have reads of different lengths. When I run TopHat, I get a warning about the disadvantages of using reads shorter than 20 bp. Should I remove only the reads shorter than 20 bp, or is it recommended to remove longer reads as well, so that TopHat does not run slowly and use a large amount of memory? Does the '-g' (max multihits) option reduce these problems? I have a big library and I'd like to avoid these problems.
Thank you.