Old 11-07-2012, 06:51 PM   #3
dvanic
Member
 
Location: Sydney, Australia

Join Date: Jan 2012
Posts: 61

Quote:
Ok, maybe some of them were rejected a priori for some quality issue.
How many reads is tophat telling you it is filtering in prep_reads.info?
Quote:
18645993 reads; of these:
18645993 (100.00%) were unpaired; of these:
7576936 (40.64%) aligned 0 times
5480546 (29.39%) aligned exactly 1 time
5588511 (29.97%) aligned >1 times
59.36% overall alignment rate
How is the overall quality of your reads, especially at the ends? Is your sequencing machine calling bases irrespective of the quality of each base, or does it start calling an N at low quality?

TopHat, unlike BWA, does not clip reads to remove low-quality ends. If your read ends are at or below Q20 and your sequencer force-called the bases, the nucleotides at those ends may be making your reads unalignable: they introduce too many mismatches, which prevents TopHat from positioning the read where it belongs.
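
To make the clipping idea concrete, here is a minimal sketch of trimming low-quality bases off the 3' end of a read before alignment. It is a simple trailing trim, not BWA's actual trimming algorithm, and it assumes Phred+33 quality encoding. The function name and the Q20 cutoff are my own choices for illustration, and in practice you would probably use an existing trimming tool instead.

```python
def trim_low_quality_3prime(seq, qual, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q.

    seq    : read sequence, e.g. "ACGT"
    qual   : FASTQ quality string of the same length
    min_q  : illustrative cutoff (assumption), bases below it are trimmed
    offset : ASCII offset of the quality encoding (33 assumed here)
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    # Walk back from the 3' end until a base meets the cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
trimmed_seq, trimmed_qual = trim_low_quality_3prime("ACGTACGT", "IIIII###")
print(trimmed_seq, trimmed_qual)
```

Reads that end up very short after trimming are usually worth discarding too, since they tend to map ambiguously.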

Quote:
I'm beginning to think that maybe I should try STAR.
If you think trying a new tool on the block that has not been used that much is a "safer option", especially one that will spare you from needing to do
Quote:
Or that I definitely need 3 months to perform "unit tests" also on these well known, widely used and presented in literature tools.
I think you're wrong.

Everyone (especially the biologists around me, and I used to be one) thinks that NGS data analysis is easy: a technique, a service that can be provided, rather than something that involves a boatload of time, benchmarking, and intellectual effort no less complicated than designing some "pretty" wet lab experiments. So, yes, if you're going to do an analysis, you need to know what your tools are doing. The sad state of the field is format incompatibility, weird mappings, and different software packages behaving differently and giving different outputs, so you need to look at the data and the biology of your system to figure out what makes sense and what is probably an artefact.