Hi everyone,
I have thought for a while that it would be incredibly useful to get some idea of how other people's instruments are performing compared to mine. It would certainly give me something to check against when a run looks suspicious. There is also the possibility of spotting reagent issues more quickly, since failures might crop up all over the place at the same time. This idea comes from an Affymetrix database that was set up at my last institute to compare Affy rpt QC files. I was also re-inspired by the Sanger plots: http://www.sanger.ac.uk/Teams/Team117/#mpsa_error.
To do this I would like to see some run metrics brought together in one place, in a format that is easy to download and compare. Wherever they are brought together, it would be best if anyone could upload files of run data easily. The easier it is to get data into a site, the more usable the database becomes, and it becomes self-perpetuating.
Possible metrics would include some of the summary stats: yield, cpt, %PF, error, etc. Alongside these we would need to see run length, run type, instrument version and library type. It might also be handy to have some data that people might consider sensitive: genome, instrument location, operator, etc. I would be happy to publish most of this from my facility, and a quick question to the submitting scientist should release the rest.
It is getting difficult to find out how well systems are performing when I talk to people, as it is too easy to forget that we are not comparing identical runs: SE and PE, mRNA or ChIP, 35bp or 100bp, etc. What yield do you get from a 45bp SE run? We have had almost 6 Gbp, which I think is good, but I would like to know if we should be trying harder!
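To make the idea a bit more concrete, here is a rough Python sketch of what one uploaded run record might look like and how runs could be grouped so that only like is compared with like. Everything in it is an assumption for illustration: the `RunRecord` class, its field names and the `median_yield_by_run_kind` helper do not come from any existing tool or database, and the fields are just a subset of the metrics listed above.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class RunRecord:
    """One uploaded run (hypothetical layout, field names are assumptions)."""
    run_type: str            # "SE" or "PE"
    read_length_bp: int      # e.g. 35, 45, 100
    library_type: str        # e.g. "mRNA", "ChIP"
    instrument_version: str
    yield_gbp: float         # total yield in Gbp
    pct_pf: float            # %PF


def median_yield_by_run_kind(records):
    """Group runs by (run type, read length, library type) and return
    the median yield of each group, so a 45bp SE run is only ever
    compared against other 45bp SE runs of the same library type."""
    groups = defaultdict(list)
    for r in records:
        key = (r.run_type, r.read_length_bp, r.library_type)
        groups[key].append(r.yield_gbp)
    return {key: median(yields) for key, yields in groups.items()}
```

A site built along these lines could show a submitter where their own run falls against the median for the same run kind, rather than against every run in the database.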
So how do we get started? Who will take up the challenge, and how can we decide whether they can be trusted to build something reliable? Would we be happy asking Illumina to do this? Will anyone else out there upload data, with or without the more sensitive metadata? Will anyone look at it?
James.