Cufflinks is one of the most popular transcript assembly programs. A reference annotation can be supplied to it through either the "--GTF" or the "--GTF-guide" option. According to the Cufflinks documentation, the former "Tells Cufflinks to use the supplied reference annotation (a GFF file) to estimate isoform expression. It will not assemble novel transcripts, and the program will ignore alignments not structurally compatible with any reference transcript." The latter "Tells Cufflinks to use the supplied reference annotation (GFF) to guide RABT assembly. Reference transcripts will be tiled with faux-reads to provide additional information in assembly. Output will include all reference transcripts as well as any novel genes and isoforms that are assembled."
So, based on the descriptions above, quoted from the Cufflinks documentation, the transcripts reported with the "--GTF" option should be a subset of those reported with the "--GTF-guide" option. However, when I tested both on a real RNA-seq dataset, leaving all other options at their default values, the "--GTF" option gave 19680 transcripts marked with "=", while the "--GTF-guide" option gave only 17401 transcripts marked with "=" (the other class codes were: 5 "C", 1064 "e", 4836 "i", 8469 "j", 89 "o", 665 "p", 7 "s", 116 "x").
My question is: why does the "--GTF-guide" option give fewer "=" transcripts than the "--GTF" option?
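In case it helps anyone reproduce the tallies, here is a minimal sketch of how class-code counts like the ones above can be computed. It assumes a GTF-style file in which each record carries `transcript_id` and `class_code` attributes in the ninth column; the function name `count_class_codes` and the file path in the usage note are hypothetical, not part of Cufflinks.

```python
import re
from collections import Counter

# Attribute patterns for GTF column 9 (assumed layout: key "value";)
CLASS_CODE = re.compile(r'class_code "([^"]+)"')
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def count_class_codes(lines):
    """Count distinct transcripts per class code in GTF-style lines.

    Hypothetical helper: each transcript is counted once, no matter
    how many exon records it has.
    """
    code_by_transcript = {}
    for line in lines:
        # Skip comments and blank lines
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        tid = TRANSCRIPT_ID.search(fields[8])
        code = CLASS_CODE.search(fields[8])
        if tid and code:
            code_by_transcript[tid.group(1)] = code.group(1)
    return Counter(code_by_transcript.values())
```

Usage would be along the lines of `with open("compare.combined.gtf") as fh: print(count_class_codes(fh))`, where the path is a placeholder for whatever annotated GTF the comparison step produced.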