I majored in statistics and am new to biology. I recently learned to use TopHat and Cufflinks to process RNA-seq data, but while reading the TopHat manual I ran into some questions.
(1) What do the options --read-gap-length and --read-edit-dist/--read-realign-edit-dist mean in biological terms?
About --read-gap-length: does it mean that a read is considered unmappable when the length of the gaps between two or more parts of the read exceeds a threshold value?
About --read-edit-dist/--read-realign-edit-dist: what does edit distance mean in a biological context?
(2) The term 'anchor' appears several times in the manual, in the descriptions of options such as --min-anchor-length. What does anchor mean?
(3) I found that TopHat calculates not only the mean inner distance but also the standard deviation of the inner distance between mate pairs. Are these values related to the distribution of inner distances? What can we learn from them?
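To make question (3) concrete, here is a minimal Python sketch of my current understanding, which may well be wrong. I am assuming that the inner distance is the fragment length minus the lengths of the two reads, as in the manual's example for -r/--mate-inner-dist (300 bp fragments with 50 bp reads give 200). The fragment lengths and the function name below are made up for illustration; they are not TopHat output.

```python
from statistics import mean, stdev

def inner_distance(fragment_len: int, read_len: int) -> int:
    """Assumed definition: fragment length minus the two mate reads."""
    return fragment_len - 2 * read_len

# Hypothetical fragment lengths (bp) for five read pairs; not real data.
fragment_lengths = [280, 300, 320, 290, 310]
read_len = 50  # assumed read length (bp) for both mates

inner = [inner_distance(f, read_len) for f in fragment_lengths]

# These two summaries are what I would pass as -r/--mate-inner-dist
# and --mate-std-dev, if my understanding is right.
inner_mean = mean(inner)
inner_sd = stdev(inner)  # sample standard deviation

print("inner distances:", inner)
print("mean inner distance:", inner_mean)
print("std dev of inner distance:", round(inner_sd, 2))
```

If this is right, the mean and standard deviation would simply summarize the location and spread of the inner-distance distribution, but I am not sure how they are used in the mapping itself.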
Thanks!!