Dear All:
I have recently been doing methodological comparisons among different tools for testing differential exon usage (DEU), and this is a general question about how we evaluate the various approaches using real and simulated datasets.
In the DEXSeq paper, the authors compared DEXSeq with cuffdiff in terms of type-I error control on a real RNA-Seq dataset: they made use of the fact that there are 4 replicates in the untreated condition, so a mock "2-2" comparison is possible. Ideally such a comparison should yield few genes with DEU, and indeed DEXSeq performs quite well here (based on my own data analysis; a small sketch of how I summarise such a mock comparison follows after the questions below). However, I still have the following questions:
1. DEXSeq and cuffdiff have different model assumptions, and according to Simon's post here, "...simulation only helps to compare two different tests developed for the same model assumption. To test which model is more appropriate you need real data." Does this mean that, for evaluating the type-I error control of methods with different model assumptions, no simulation should be used at all?
2. The other side of the problem, besides the type-I error rate, is the detection power of the methods. On real datasets I can get a list of genes with DEU from each method, see how many overlap, and check whether particular genes appear to be of biological relevance. However, I have not seen in the literature how this problem (picking up true positives) is addressed with simulated RNA-Seq datasets. Is it because it is hard to specify the simulation parameters so that the comparison is a "fair play" for all methods, or are we simply not supposed to use simulated datasets for this purpose? (A toy simulation sketch of what I have in mind is included further below.)
3. If simulated datasets turn out to be an option for questions 1 and 2, I wonder which RNA-Seq simulator is appropriate for comparing methods for differential exon usage.
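For reference, this is how I summarise the mock "2-2" comparison mentioned above. It is just a minimal Python sketch, not tied to any particular package; the results table and its "pvalue" column are placeholders for whatever each method exports:

```python
import numpy as np

def observed_type_i_error(pvalues, alpha=0.05):
    """Fraction of tests rejected at nominal level alpha.

    In a mock null comparison (e.g. 2 untreated vs 2 untreated
    replicates) this fraction should stay close to alpha if the
    method controls type-I error, and the p-value histogram
    should be roughly uniform.
    """
    p = np.asarray(pvalues, dtype=float)
    p = p[~np.isnan(p)]          # drop exons/genes that were not tested
    return float(np.mean(p < alpha))

# hypothetical usage, with p-values taken from a method's results table:
# print(observed_type_i_error(results["pvalue"], alpha=0.05))
```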
BTW, the methods I compare have different model assumptions (just like DEXSeq and cuffdiff).
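And this is the kind of toy simulation I have in mind for question 2: a minimal Python sketch assuming a simple negative-binomial count model, with one shifted exon per affected gene in condition B. All parameter values here are arbitrary, which is exactly my worry about making the comparison a "fair play":

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_exon_counts(n_genes=1000, exons_per_gene=8, n_reps=3,
                         frac_deu=0.1, fold_change=3.0,
                         dispersion=0.1, base_mean=50.0):
    """Toy exon-level count simulation with known DEU genes.

    A fraction `frac_deu` of genes get one exon whose mean is
    multiplied by `fold_change` in condition B only; all other
    exons share the same negative-binomial distribution in both
    conditions (var = mu + dispersion * mu^2). Returns the two
    count matrices and the indices of the true DEU genes, so TPR
    and FDR of each method can be computed against a known truth.
    """
    n_exons = n_genes * exons_per_gene
    mu_a = np.full(n_exons, base_mean)
    mu_b = mu_a.copy()

    true_deu = np.where(rng.random(n_genes) < frac_deu)[0]
    for g in true_deu:
        exon = g * exons_per_gene + rng.integers(exons_per_gene)
        mu_b[exon] *= fold_change          # shift one exon in condition B

    def nb_draw(mu):
        nb_size = 1.0 / dispersion         # NB 'size' parameter
        p = nb_size / (nb_size + mu)
        return rng.negative_binomial(nb_size, p[:, None],
                                     size=(len(mu), n_reps))

    return nb_draw(mu_a), nb_draw(mu_b), true_deu
```

Even in this very simple setting I already have to pick a dispersion, a fold change and a fraction of affected genes, and I suspect different choices will favour different methods.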
Thank you so much!