Hi folks, I am running Cufflinks in two ways: (1) with a genome annotation as a strict reference (option -G), and (2) with a genome annotation as a guide (option -g).
After cufflinks, I run cuffcompare and cuffdiff, using the exact same options in the two runs. For cuffdiff I use a minimum alignment of 10 and quartile normalization.
Here is an example of my cuffdiff transcript-level differential expression output from each run:
RUN1=annotation as a STRICT reference (option G)
test_id gene_id gene locus status sample 1 sample 2 ln(fold_change) test_stat p_value q_value significant
TCONS_00000808 XLOC_000674 OBP51 2L:26101763-26135649 OK 0 3.59443 -1.79769e+308 -1.79769e+308 0.00048021 0.00886047 yes
TCONS_00000244 XLOC_000192 AGAP005079 2L:9735320-9746596 OK 0 12.3904 -1.79769e+308 -1.79769e+308 1.40E-10 1.18E-08 yes
TCONS_00000245 XLOC_000192 AGAP005079 2L:9735320-9746596 OK 0 26.4766 -1.79769e+308 -1.79769e+308 1.03E-11 1.09E-09 yes
TCONS_00001658 XLOC_001422 AGAP007633 2L:48492236-48511972 OK 0 4.76816 -1.79769e+308 -1.79769e+308 4.54E-09 3.07E-07 yes
RUN2=annotation as a GUIDE (option g)
test_id gene_id gene locus status sample 1 sample 2 ln(fold_change) test_stat p_value q_value significant
TCONS_00001429 XLOC_000713 OBP51 2L:26101763-26135649 OK 0 1.05238 1.79769e+308 1.79769e+308 0.0100919 0.0436812 yes
TCONS_00001430 XLOC_000713 OBP51 2L:26101763-26135649 OK 0 1.87352 1.79769e+308 1.79769e+308 0.00140353 0.0102479 yes
TCONS_00006075 XLOC_003687 AGAP005079 2L:9725746-9755495 OK 461420 0 -1.79769e+308 -1.79769e+308 0.00735375 0.0350816 yes
TCONS_00009083 XLOC_001472 AGAP007633 2L:48492236-48512012 OK 0.94828 10.0326 2.35895 -7.9951 1.33E-15 2.11E-13 yes
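A note on reading the rows above: 1.79769e+308 is the largest finite double-precision value, and it appears in ln(fold_change) and test_stat exactly in the rows where one of the two samples has a value of 0. I read it as a placeholder for an unbounded fold change rather than a measured number, but that interpretation is my assumption. Below is a minimal Python sketch for flagging such rows, assuming whitespace-separated lines in the column layout shown above; the function name and threshold are made up for illustration.

```python
# Anything at least this large is treated as a placeholder, not a real value.
SENTINEL_THRESHOLD = 1e308

def has_placeholder_fold_change(line):
    """Return True if the ln(fold_change) column of a row is +/-1.79769e+308."""
    fields = line.split()
    # Column order assumed from the post:
    # test_id gene_id gene locus status sample1 sample2 ln(fold_change) ...
    ln_fc = float(fields[7])
    return abs(ln_fc) >= SENTINEL_THRESHOLD

rows = [
    "TCONS_00000808 XLOC_000674 OBP51 2L:26101763-26135649 OK 0 3.59443 "
    "-1.79769e+308 -1.79769e+308 0.00048021 0.00886047 yes",
    "TCONS_00009083 XLOC_001472 AGAP007633 2L:48492236-48512012 OK 0.94828 10.0326 "
    "2.35895 -7.9951 1.33E-15 2.11E-13 yes",
]

flags = [has_placeholder_fold_change(r) for r in rows]
print(flags)  # [True, False]
```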
Can somebody help me understand:
1- Why is there so much difference between the two runs?
2- Which is the more stringent option? Which transcripts should I consider differentially expressed? I get 1094 significantly differentially expressed transcripts in the first run and many more in the second run, and the overlap is minimal. Which should I use?
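One caveat on the overlap: the TCONS_ and XLOC_ identifiers are not shared between the two runs (OBP51 is TCONS_00000808 in RUN1 but TCONS_00001429 and TCONS_00001430 in RUN2), so an overlap counted on test_id would look small for that reason alone. Here is a minimal Python sketch of counting the overlap on the gene column instead, assuming whitespace-separated rows in the layout shown above; the helper name is hypothetical and the rows are just the examples from this post.

```python
def significant_genes(lines):
    """Collect gene names (third column) from rows whose last column is 'yes'."""
    genes = set()
    for line in lines:
        fields = line.split()
        if fields[-1] == "yes":
            genes.add(fields[2])
    return genes

run1 = [
    "TCONS_00000808 XLOC_000674 OBP51 2L:26101763-26135649 OK 0 3.59443 -1.79769e+308 -1.79769e+308 0.00048021 0.00886047 yes",
    "TCONS_00000244 XLOC_000192 AGAP005079 2L:9735320-9746596 OK 0 12.3904 -1.79769e+308 -1.79769e+308 1.40E-10 1.18E-08 yes",
    "TCONS_00000245 XLOC_000192 AGAP005079 2L:9735320-9746596 OK 0 26.4766 -1.79769e+308 -1.79769e+308 1.03E-11 1.09E-09 yes",
    "TCONS_00001658 XLOC_001422 AGAP007633 2L:48492236-48511972 OK 0 4.76816 -1.79769e+308 -1.79769e+308 4.54E-09 3.07E-07 yes",
]
run2 = [
    "TCONS_00001429 XLOC_000713 OBP51 2L:26101763-26135649 OK 0 1.05238 1.79769e+308 1.79769e+308 0.0100919 0.0436812 yes",
    "TCONS_00001430 XLOC_000713 OBP51 2L:26101763-26135649 OK 0 1.87352 1.79769e+308 1.79769e+308 0.00140353 0.0102479 yes",
    "TCONS_00006075 XLOC_003687 AGAP005079 2L:9725746-9755495 OK 461420 0 -1.79769e+308 -1.79769e+308 0.00735375 0.0350816 yes",
    "TCONS_00009083 XLOC_001472 AGAP007633 2L:48492236-48512012 OK 0.94828 10.0326 2.35895 -7.9951 1.33E-15 2.11E-13 yes",
]

# Genes called significant in both runs, regardless of transcript ID.
shared = significant_genes(run1) & significant_genes(run2)
print(sorted(shared))  # ['AGAP005079', 'AGAP007633', 'OBP51']
```

On these eight example rows no test_id is shared, yet all three genes appear in both runs, so the gene-level overlap may be larger than the transcript-level one suggests.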
thanks