Dear all,
I sent this to the Tuxedo mailing list 3 days ago, and so far there haven't been any replies (though, yes, it's been the weekend). I'm repeating the message here in the hope that someone can offer some advice.
I am encountering some strange output with TopHat -- hopefully someone here can explain to me what's going on or what I'm doing wrong.
Basically, the results from TopHat version 2.0.13 seem to change drastically with the number of threads used for the data sets I'm analyzing. Generally, the more threads I use, the fewer alignments there are.
For example, when I use fewer threads, the output BAM file is clearly larger, both in size and in number of lines. Going from 12 threads to 2 threads, for example, the file size goes from 250 MB to 7 GB. At first, I thought there might be some redundancy in the file, or something else that doesn't affect the final results. But when I loaded the BAM files into IGV, I did see a difference in the number of reads that align. I had expected the results to be the same regardless of the number of threads.
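File size alone can mislead, so one way to put numbers on the difference is to count mapped records per run. What follows is only a minimal sketch, assuming each BAM file has first been converted to SAM text (for example with `samtools view`); the function name `summarize_sam` is hypothetical and not part of TopHat or samtools.

```python
UNMAPPED = 0x4     # SAM FLAG bit: the read itself is unmapped
SECONDARY = 0x100  # SAM FLAG bit: secondary (non-primary) alignment


def summarize_sam(lines):
    """Summarize alignment records from an iterable of SAM text lines.

    Header lines (starting with '@') and blank lines are skipped.
    Note that both mates of a paired-end read share one read name, so
    'mapped_read_names' counts names, not individual mates.
    """
    stats = {"records": 0, "mapped": 0, "unmapped": 0, "secondary": 0}
    mapped_names = set()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])  # QNAME and FLAG columns
        stats["records"] += 1
        if flag & UNMAPPED:
            stats["unmapped"] += 1
            continue
        stats["mapped"] += 1
        if flag & SECONDARY:
            stats["secondary"] += 1
        mapped_names.add(name)
    stats["mapped_read_names"] = len(mapped_names)
    return stats
```

Running this over the SAM text of each run (2, 8, and 12 threads) and comparing the resulting dictionaries would show whether the runs differ in the number of distinct reads that align, or mainly in the number of secondary alignments per read.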
In the attached screen capture of IGV, the three tracks from top to bottom all come from running TopHat on the same input file. The capture shows a region spanning 1.5 genes in mouse, but I see something like this throughout the data. The only difference between the tracks is that the number of threads increases from 2 to 8 to 12. Am I missing something obvious?
Any help would be appreciated!
Ray