SEQanswers

11-02-2010, 04:45 AM   #1
nkwuji
Member
Location: Dublin
Join Date: Mar 2010
Posts: 19

Binary characters in cuffcompare result & Questions on cuffdiff

Hi,

I am using the TopHat/Cufflinks packages to analyze my RNA-seq data, and I found a small bug in cuffcompare.

After comparing my reference GTF with transcript.gtf, I got the combined.gtf. For some transcripts, however, the strand field contains a binary character: when I view combined.gtf with "less", the strand is shown as "^@" (a NUL byte). If I submit this combined.gtf to the UCSC Genome Browser, it reports "cannot read xxx.gtf file". After I change these binary characters to ".", it works fine.
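The manual fix described above can be scripted. Below is a minimal sketch, assuming a standard 9-column, tab-separated GTF in which column 7 is the strand; the function names and the file paths passed to `fix_file` are hypothetical, and any strand value other than "+", "-" or "." is treated as corrupt.

```python
VALID_STRANDS = ("+", "-", ".")


def fix_strand(line: str) -> str:
    """Return the GTF line with an invalid strand field replaced by '.'."""
    # Leave comment lines and blank lines untouched.
    if line.startswith("#") or not line.strip():
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    # GTF column 7 (index 6) holds the strand.
    if len(fields) >= 9 and fields[6] not in VALID_STRANDS:
        fields[6] = "."
    return "\t".join(fields) + newline


def fix_file(src: str, dst: str) -> None:
    """Copy src to dst, repairing the strand field line by line."""
    # latin-1 maps every byte to a character, so stray bytes cannot
    # raise a decoding error.
    with open(src, encoding="latin-1") as fin, \
            open(dst, "w", encoding="latin-1") as fout:
        for line in fin:
            fout.write(fix_strand(line))


# A made-up record whose strand field is a NUL byte (shown as ^@ in less).
bad = 'chr1\tCufflinks\texon\t100\t200\t.\t\x00\t.\tgene_id "XLOC_000001";\n'
print(repr(fix_strand(bad).split("\t")[6]))
```

Something like `fix_file("combined.gtf", "combined.fixed.gtf")` would then write a cleaned copy and leave the original file untouched.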

My other question: does anyone know how to set the minimum threshold that cuffdiff uses to decide whether to run the test? For example, I have a gene that is mildly expressed in one sample (FPKM 8) but not expressed in the other (FPKM 0). It is actually one of the most interesting genes I was looking for, but cuffdiff marks it as "NOTEST", so its significance is "no". Can anyone help with this? Can I manually select such genes as differentially expressed, given that they are expressed and the reported p-value is also 0?

Also, can I manually remove genes expressed at a low level, e.g. genes with FPKM < 1? These genes don't look very promising...
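For illustration, the two manual steps asked about here (dropping genes below FPKM 1 and pulling out NOTEST genes that are expressed in only one sample) could be sketched as follows. This is only a bookkeeping sketch, not a statistical test: the field names `gene`, `status`, `value_1` and `value_2` are assumptions about how the cuffdiff output has been parsed, and the gene names and values are made up.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]


def triage(rows: List[Row], min_fpkm: float = 1.0) -> Dict[str, List[str]]:
    """Group genes into low-expression, on/off candidates, and the rest."""
    groups: Dict[str, List[str]] = {"low": [], "on_off_candidate": [], "other": []}
    for row in rows:
        v1, v2 = float(row["value_1"]), float(row["value_2"])
        gene = str(row["gene"])
        if max(v1, v2) < min_fpkm:
            # Below the FPKM cutoff in both samples.
            groups["low"].append(gene)
        elif row["status"] == "NOTEST" and min(v1, v2) == 0.0:
            # Untested, but expressed in exactly one sample.
            groups["on_off_candidate"].append(gene)
        else:
            groups["other"].append(gene)
    return groups


# Hypothetical rows; field names are assumed, not taken from cuffdiff itself.
rows = [
    {"gene": "geneA", "status": "NOTEST", "value_1": 8.0, "value_2": 0.0},
    {"gene": "geneB", "status": "OK", "value_1": 0.25, "value_2": 0.0},
    {"gene": "geneC", "status": "OK", "value_1": 12.0, "value_2": 30.0},
]
result = triage(rows)
print(result)
```

Genes in the `on_off_candidate` group are only candidates for follow-up; the sketch does not assign them any significance.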

Cheers,
Jun

11-02-2010, 06:01 AM   #2
sdarko
Member
Location: Bethesda, MD
Join Date: Apr 2009
Posts: 51

I'm glad I found this post. I was having the exact same problem and changing the binary character to "." fixed my (current) issues as well.

Sam

11-02-2010, 06:57 AM   #3
RockChalkJayhawk
Senior Member
Location: Rochester, MN
Join Date: Mar 2009
Posts: 191

Quote:
Originally Posted by nkwuji
Hi,

My other question: does anyone know how to set the minimum threshold that cuffdiff uses to decide whether to run the test? For example, I have a gene that is mildly expressed in one sample (FPKM 8) but not expressed in the other (FPKM 0). It is actually one of the most interesting genes I was looking for, but cuffdiff marks it as "NOTEST", so its significance is "no". Can anyone help with this? Can I manually select such genes as differentially expressed, given that they are expressed and the reported p-value is also 0?

Also, can I manually remove genes expressed at a low level, e.g. genes with FPKM < 1? These genes don't look very promising...

Cheers,
Jun

The cuffdiff -c option might be what you are looking for:
Code:
-c/--min-alignment-count <int>
This limits differential testing based on counts rather than FPKM. However, do you think it is wise or necessary to use this feature if what you want to say is that the gene is present in one condition and not in the other?

11-03-2010, 03:01 AM   #4
nkwuji
Member
Location: Dublin
Join Date: Mar 2010
Posts: 19

Thanks, RockChalkJayhawk.

I will think about this, though the results seem a little odd for genes expressed at low levels. For example, the gene with FPKM 8 in one sample and FPKM 0 in the other is reported as NOTEST. But for another gene, with FPKM 0.25 in one sample and 0 in the other, the status is OK and the result is significant.

Possibly this is because the second gene is longer, so its alignment count exceeds the default min-alignment-count and the test was run and came out significant. But I think it may be better to threshold on FPKM (or average coverage) rather than on total fragments (or reads); otherwise the test may be biased toward longer genes.
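
To make the length argument concrete, here is a rough worked example using the simplified textbook definition FPKM = 10^9 * C / (N * L), where C is the number of fragments on the gene, N the total mapped fragments and L the gene length in bp. The gene lengths (500 bp and 20,000 bp) and the library size (10 million fragments) are assumed values for illustration only, and Cufflinks' own estimate involves more than this formula.

```python
def fragments_from_fpkm(fpkm: float, length_bp: int, total_fragments: int) -> float:
    """Invert FPKM = 1e9 * C / (N * L) to get the fragment count C."""
    return fpkm * length_bp * total_fragments / 1e9


TOTAL_FRAGMENTS = 10_000_000  # assumed library size

# Short gene at FPKM 8 versus long gene at FPKM 0.25 (lengths are assumed).
short_gene = fragments_from_fpkm(8.0, 500, TOTAL_FRAGMENTS)
long_gene = fragments_from_fpkm(0.25, 20_000, TOTAL_FRAGMENTS)

print(f"short gene: {short_gene:.0f} fragments")
print(f"long gene:  {long_gene:.0f} fragments")
```

Under these assumed numbers the long gene collects more fragments (50) than the short one (40) despite a 32-fold lower FPKM, which is how a count-based cutoff could pass the long gene and skip the short one.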