SEQanswers

Old 01-05-2012, 01:37 PM   #1
maryb
Junior Member
 
Location: CA

Join Date: Jan 2012
Posts: 6
Running cufflinks with a genome annotation as a strict reference or as a guide

Hi folks, I am running cufflinks in two ways: (1) with a genome annotation as a strict reference (option -G), and (2) with a genome annotation as a guide (option -g).
After cufflinks, I run cuffcompare and cuffdiff, using exactly the same options in both runs. For cuffdiff I use a minimum alignment count of 10 and quartile normalization.
Here is an example of my cuffdiff transcript differential expression testing output:

RUN1 = annotation as a STRICT reference (option -G)
test_id gene_id gene locus status sample_1 sample_2 ln(fold_change) test_stat p_value q_value significant
TCONS_00000808 XLOC_000674 OBP51 2L:26101763-26135649 OK 0 3.59443 -1.79769e+308 -1.79769e+308 0.00048021 0.00886047 yes
TCONS_00000244 XLOC_000192 AGAP005079 2L:9735320-9746596 OK 0 12.3904 -1.79769e+308 -1.79769e+308 1.40E-10 1.18E-08 yes
TCONS_00000245 XLOC_000192 AGAP005079 2L:9735320-9746596 OK 0 26.4766 -1.79769e+308 -1.79769e+308 1.03E-11 1.09E-09 yes
TCONS_00001658 XLOC_001422 AGAP007633 2L:48492236-48511972 OK 0 4.76816 -1.79769e+308 -1.79769e+308 4.54E-09 3.07E-07 yes




RUN2 = annotation as a GUIDE (option -g)
test_id gene_id gene locus status sample_1 sample_2 ln(fold_change) test_stat p_value q_value significant
TCONS_00001429 XLOC_000713 OBP51 2L:26101763-26135649 OK 0 1.05238 1.79769e+308 1.79769e+308 0.0100919 0.0436812 yes
TCONS_00001430 XLOC_000713 OBP51 2L:26101763-26135649 OK 0 1.87352 1.79769e+308 1.79769e+308 0.00140353 0.0102479 yes
TCONS_00006075 XLOC_003687 AGAP005079 2L:9725746-9755495 OK 461420 0 -1.79769e+308 -1.79769e+308 0.00735375 0.0350816 yes
TCONS_00009083 XLOC_001472 AGAP007633 2L:48492236-48512012 OK 0.94828 10.0326 2.35895 -7.9951 1.33E-15 2.11E-13 yes


Can somebody help me understand:
1. Why is there so much difference?
2. Which is the more stringent option? Which transcripts should I consider differentially expressed? I get 1094 significantly differentially expressed transcripts in the first run and many more in the second run, and the overlap is minimal. Which should I use?

thanks
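
In the rows shown above, the same genes carry different TCONS_ and XLOC_ identifiers in the two runs, so an overlap computed on test_id would look minimal even where the genes agree. Whether the overlap was actually computed that way is an assumption; the sketch below only illustrates the difference between matching on test_id and matching on gene name, using the eight rows from the post. The helper name `overlap` is made up for this example.

```python
# (test_id, gene) pairs copied from the two cuffdiff tables in the post.
run1 = [
    ("TCONS_00000808", "OBP51"),
    ("TCONS_00000244", "AGAP005079"),
    ("TCONS_00000245", "AGAP005079"),
    ("TCONS_00001658", "AGAP007633"),
]
run2 = [
    ("TCONS_00001429", "OBP51"),
    ("TCONS_00001430", "OBP51"),
    ("TCONS_00006075", "AGAP005079"),
    ("TCONS_00009083", "AGAP007633"),
]


def overlap(a, b, column):
    """Return the values of one column that occur in both row lists."""
    return {row[column] for row in a} & {row[column] for row in b}


ids_shared = overlap(run1, run2, 0)    # match on test_id
genes_shared = overlap(run1, run2, 1)  # match on gene name

print(len(ids_shared), sorted(genes_shared))
# prints: 0 ['AGAP005079', 'AGAP007633', 'OBP51']
```

On these rows no test_id is shared, while all three gene names are, so comparing the runs by gene name or locus may give a more meaningful picture than comparing by test_id.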
Old 01-05-2012, 02:09 PM   #2
severin
Genome Informatics Facility
 
Location: Iowa @isugif

Join Date: Sep 2009
Posts: 105
Some Thoughts

It depends on how well you feel your genome is annotated. If I feel the genes I am interested in are contained within the annotation for my organism, then I use strict.

I will sometimes run the guided mode, or remove the reads that aligned to annotated genes and then run de novo cufflinks, to identify potential genes that the annotation may have missed and that could be of interest to my particular biological question.

De novo cufflinks, however, tends to find exons rather than full genes and will test those exons for differential expression, which could explain why you are finding a significantly higher number of differentially expressed "genes" in your guided cufflinks output. Exon-level comparisons can also lead to the unfortunate case where, due to poor sampling, one exon suggests a significant increase in expression between two conditions while a different exon of the same gene shows a significant decrease.
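
The conflicting-direction case can be checked mechanically by grouping significant rows by gene and looking for genes that have both an increase and a decrease. This is only a rough sketch: the function name `conflicting_genes` is made up, the direction is taken from the two sample values rather than from the ln(fold_change) column, and the rows from both of maryb's runs are pooled purely for illustration.

```python
from collections import defaultdict

# (gene, sample_1, sample_2) copied from the cuffdiff rows in the first post,
# RUN1 rows first, then RUN2 rows.
rows = [
    ("OBP51", 0.0, 3.59443),
    ("AGAP005079", 0.0, 12.3904),
    ("AGAP005079", 0.0, 26.4766),
    ("AGAP007633", 0.0, 4.76816),
    ("OBP51", 0.0, 1.05238),
    ("OBP51", 0.0, 1.87352),
    ("AGAP005079", 461420.0, 0.0),
    ("AGAP007633", 0.94828, 10.0326),
]


def conflicting_genes(rows):
    """Return genes that have rows going up and rows going down."""
    directions = defaultdict(set)
    for gene, sample_1, sample_2 in rows:
        if sample_2 > sample_1:
            directions[gene].add("up")
        elif sample_2 < sample_1:
            directions[gene].add("down")
    return sorted(g for g, d in directions.items() if d == {"up", "down"})


conflicts = conflicting_genes(rows)
print(conflicts)
# prints: ['AGAP005079']
```

In the posted rows, AGAP005079 goes from 0 to a positive value in RUN1 but from 461420 to 0 in RUN2, so it is the one gene flagged here.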




Old 07-19-2012, 05:59 AM   #3
kopardev
Member
 
Location: VA, USA

Join Date: Oct 2011
Posts: 18
1.79769e+308

What does a fold change of 1.79769e+308 mean? Can anyone explain that?
Thanks!
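
For what it is worth, 1.79769e+308 is the largest finite double-precision number (DBL_MAX in C, sys.float_info.max in Python) printed to six significant digits. In the rows posted in this thread it shows up only where one of the two sample values is 0, which is exactly where a log ratio has no finite value. Reading it as a stand-in for plus or minus infinity is an interpretation, not something confirmed in this thread. A small sketch, with a made-up helper name `parse_fold_change`:

```python
import math
import sys

# Largest finite double; "%g" formatting keeps six significant digits.
as_printed = "%g" % sys.float_info.max
print(as_printed)  # prints: 1.79769e+308


def parse_fold_change(text):
    """Parse an ln(fold_change) field, mapping the DBL_MAX placeholder
    to +/- infinity (an interpretation, see the note above)."""
    value = float(text)
    if abs(value) >= 1.79769e308:
        return math.copysign(math.inf, value)
    return value


print(parse_fold_change("-1.79769e+308"))  # prints: -inf
print(parse_fold_change("2.35895"))        # prints: 2.35895

# The one finite value in the posted tables matches a plain natural log
# of the ratio of the two sample values in that row.
check = math.log(10.0326 / 0.94828)
print(round(check, 4))
```

The last line is consistent with the column being ln(sample_2 / sample_1) whenever both values are non-zero.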