Old 11-02-2010, 04:45 AM   #1
Location: Dublin

Join Date: Mar 2010
Posts: 19
Binary characters in cuffcompare result & Questions on cuffdiff


I am using the TopHat/Cufflinks packages to analyze my RNA-seq data, and I think I have found a small bug in cuffcompare.

After comparing my reference GTF with transcripts.gtf, I got the combined.gtf. But for some transcripts the strand field contains a binary character instead of "+", "-" or ".". For example, if I use "less" to check the combined.gtf, the strand for these transcripts is shown as "^@". If I submit this combined.gtf to the UCSC Genome Browser, it says "cannot read xxx.gtf file". After I changed these binary characters to ".", it worked fine.
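For anyone hitting the same problem, here is a minimal sketch of that workaround in Python. "^@" is how less displays a NUL byte, so the script treats any strand value other than "+", "-" or "." as broken and replaces it with ".". The function names `fix_strand` and `fix_gtf` are made up for this example, and it assumes the standard 9-column tab-separated GTF layout with the strand in column 7. It is not part of cuffcompare.

```python
import io


def fix_strand(line: str) -> str:
    """Return the GTF line with an invalid strand field replaced by '.'."""
    # Leave comments and blank lines untouched.
    if line.startswith("#") or not line.strip():
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    # Assumption: standard GTF, 9 tab-separated columns, strand is column 7.
    if len(fields) >= 9 and fields[6] not in {"+", "-", "."}:
        fields[6] = "."
    return "\t".join(fields) + newline


def fix_gtf(src, dst) -> int:
    """Copy src to dst line by line, fixing strands; return the number of lines changed."""
    changed = 0
    for line in src:
        fixed = fix_strand(line)
        if fixed != line:
            changed += 1
        dst.write(fixed)
    return changed


if __name__ == "__main__":
    # Tiny in-memory demo: one record with a NUL byte as strand.
    demo = io.StringIO('chr1\tCufflinks\texon\t100\t200\t.\t\x00\t.\tgene_id "g1";\n')
    out = io.StringIO()
    print(fix_gtf(demo, out), "line(s) fixed")
```

On a real file, something like `with open("combined.gtf", encoding="latin-1") as src, open("combined.fixed.gtf", "w", encoding="latin-1") as dst: fix_gtf(src, dst)` should do it; latin-1 is chosen here only because it never fails on odd bytes.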

Another question: does anyone know how to set the minimum threshold that cuffdiff requires before it runs the test? For example, I have a gene that is mildly expressed in one sample (FPKM 8) but not expressed at all in the other sample (FPKM 0). It is actually one of the most interesting genes I was looking for. But in the cuffdiff output it is marked "NOTEST", so its significance is "no". Can anyone help me with this? Can I manually select such genes as differentially expressed, since they are expressed in one sample and the reported p-value is also 0?

Also, can I manually remove genes that are expressed at a low level, e.g. genes with FPKM < 1? These genes don't look very promising...
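To make the manual filtering concrete, here is a rough sketch of what I have in mind. It only splits rows of a tab-separated table by FPKM and says nothing about whether this is statistically sound. The function name `split_by_fpkm` is made up, and the column names `value_1` and `value_2` for the two samples' FPKM are an assumption about the cuffdiff output that may differ between versions. A gene is kept if it reaches the threshold in at least one sample, so a gene with FPKM 8 versus 0 is not thrown away.

```python
import csv
import io


def split_by_fpkm(rows, min_fpkm=1.0, col_a="value_1", col_b="value_2"):
    """Split rows into (kept, dropped) by the higher of the two FPKM values.

    col_a / col_b are assumed column names; adjust them to the actual header.
    """
    kept, dropped = [], []
    for row in rows:
        a, b = float(row[col_a]), float(row[col_b])
        # Keep the gene if it is expressed at or above the threshold in either sample.
        if max(a, b) >= min_fpkm:
            kept.append(row)
        else:
            dropped.append(row)
    return kept, dropped


if __name__ == "__main__":
    # Made-up two-gene table, only to show the call.
    table = (
        "test_id\tstatus\tvalue_1\tvalue_2\n"
        "geneA\tNOTEST\t8\t0\n"
        "geneB\tOK\t0.3\t0.5\n"
    )
    rows = list(csv.DictReader(io.StringIO(table), delimiter="\t"))
    kept, dropped = split_by_fpkm(rows)
    print([r["test_id"] for r in kept], [r["test_id"] for r in dropped])
```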

nkwuji