SEQanswers: Bioinformatics

Gangcai 01-28-2010 08:02 PM

How do inner distances (--mate-inner-dist and --mate-std-dev) work in TopHat?
I am confused about the inner distance setting. According to the manual, it should be set to (fragment length - 2 * read length), e.g. 300 - 2*50 = 200. But if the distance is counted in genome coordinates, then the distance between the mates should be (fragment length - 2 * read length + length of any introns between them). Does anybody know how TopHat handles the introns?

Boel 02-15-2010 11:15 AM

I think that the mate inner distance is set in order to detect whether the mates are in different exons etc. That is, if the observed distance is much larger or smaller than the 200 (300-2*50) that the software has been told to expect, something interesting might be going on.

snp_analyser 06-15-2010 05:32 AM

I'm trying different settings of inner distance for my experiments. Anybody out there with the answer as to which value is good?

Bio.X2Y 06-15-2010 06:32 AM

Hi snp_analyser. I'm afraid there is no single good value: the inner distance depends on the sizes of your fragments and the lengths of your reads, which are experiment-specific.

We typically use a tool like Bowtie to help us find our fragment sizes empirically. We run a paired-end alignment with Bowtie, using default parameters for -I and -X. We then examine the output to see, in general, how far apart the reads in a pair are aligned. This indicates the mate inner distance.

In terms of terminology, the "gap" or "inner distance" is the distance between the aligned reads (not counting the reads themselves). The "insert size", on the other hand, includes the reads themselves, so can be thought of as the "fragment size".
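To make the terminology concrete, here is a minimal Python sketch of the two quantities. The function names are only for illustration and are not part of TopHat or Bowtie; the numbers are the 300 bp fragment and 50 bp reads used earlier in this thread.

```python
def inner_distance(fragment_length: int, read_length: int) -> int:
    """Gap between the two mates: fragment (insert) size minus both reads."""
    return fragment_length - 2 * read_length


def insert_size(inner_dist: int, read_length: int) -> int:
    """Inverse relation: fragment (insert) size from the inner distance."""
    return inner_dist + 2 * read_length


print(inner_distance(300, 50))  # 200
print(insert_size(200, 50))     # 300
```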

If you look at Bowtie output, the alignment position of a read is the position of the first base in the read (from the perspective of the forward reference strand). This means that if you subtract the alignment positions of the two reads in a pair, the result is actually "inner distance" + "read length". So you will need to subtract the read length to get the inner distance.

You should probably write (or find) a script to do this for you to ensure you examine enough pairs to get a representative feel for the value. Our data typically shows a normally distributed inner distance.
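As a rough starting point, such a script might look like the sketch below. It is not the script we use; it assumes Bowtie's default tab-separated output, with the leftmost alignment offset in the fourth column, the read sequence in the fifth, and the two mates of each pair on consecutive lines. Check those assumptions against your own output before trusting the numbers. The example lines are made up purely to show the arithmetic.

```python
import statistics


def inner_distances(lines):
    """Yield the inner distance for each pair of consecutive alignment lines.

    Assumed layout (Bowtie default output): column 4 (index 3) is the
    leftmost alignment offset, column 5 (index 4) is the read sequence.
    """
    rows = [ln.rstrip("\n").split("\t") for ln in lines if ln.strip()]
    for first, second in zip(rows[0::2], rows[1::2]):
        left, right = sorted((first, second), key=lambda r: int(r[3]))
        # Difference of leftmost positions = inner distance + left read length,
        # so subtract the read length to get the inner distance.
        yield int(right[3]) - int(left[3]) - len(left[4])


def summarize(lines):
    """Return (mean, sample standard deviation); needs at least two pairs."""
    dists = list(inner_distances(lines))
    return statistics.mean(dists), statistics.stdev(dists)


# Made-up example: three pairs of 50 bp reads.
read = "A" * 50
qual = "I" * 50
example = [
    f"pairA/1\t+\tchr1\t100\t{read}\t{qual}\t0\t",
    f"pairA/2\t-\tchr1\t350\t{read}\t{qual}\t0\t",
    f"pairB/1\t+\tchr1\t1000\t{read}\t{qual}\t0\t",
    f"pairB/2\t-\tchr1\t1240\t{read}\t{qual}\t0\t",
    f"pairC/1\t+\tchr1\t5000\t{read}\t{qual}\t0\t",
    f"pairC/2\t-\tchr1\t5260\t{read}\t{qual}\t0\t",
]

mean, sd = summarize(example)
print(mean, sd)  # 200 10.0
```

The mean and standard deviation are the kind of values you would then pass to --mate-inner-dist and --mate-std-dev. On real data, read the lines from your alignment file instead of the `example` list.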

snp_analyser 06-15-2010 06:53 AM

Thanks for the detailed reply!

All times are GMT -8.