SEQanswers

Old 04-07-2010, 01:47 PM   #1
throwaway
Junior Member
 
Location: America

Join Date: Apr 2010
Posts: 9
Default For sequence quality assessment, do the raw image files provide any added value?

I just got done with an analysis based on a Sanger sequencing assay, where having the raw trace files was absolutely invaluable. They saved me from publishing some very "interesting" results which were actually experimental artifacts because they showed up in the trace files as sequencing anomalies which weren't caught by the SNP detection software we used.

Now I'm moving on to a project based on SOLiD sequencing, and I've learned that my collaborators are throwing away the raw fluorescence image files. The fact that I will not be able to go back to the raw data to check for anomalies makes me nervous, and I am trying to decide whether to push back on this policy. I am wondering whether people assessing SOLiD sequence calls ever get any added value from examining a sample of pertinent raw image files.
Old 04-08-2010, 06:47 PM   #2
darkmatter
Junior Member
 
Location: US

Join Date: Apr 2010
Posts: 1
Default Me too

I would love to see this too. I have the same problem. Are there even any samples out there? I could imagine even going a step further back and looking at calibration files if they exist.
Old 04-08-2010, 09:21 PM   #3
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by throwaway View Post
I just got done with an analysis based on a Sanger sequencing assay, where having the raw trace files was absolutely invaluable. They saved me from publishing some very "interesting" results which were actually experimental artifacts because they showed up in the trace files as sequencing anomalies which weren't caught by the SNP detection software we used.

Now I'm moving on to a project based on SOLiD sequencing, and I've learned that my collaborators are throwing away the raw fluorescence image files. The fact that I will not be able to go back to the raw data to check for anomalies makes me nervous, and I am trying to decide whether to push back on this policy. I am wondering whether people assessing SOLiD sequence calls ever get any added value from examining a sample of pertinent raw image files.
Saving a sampling of images is great for debugging after the fact, but not for re-use in downstream analysis. Saving every 1000th image or something would seem a good data reduction. Anyone else have some thoughts?
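
The "every 1000th image" idea can be sketched in a few lines of Python. This is only an illustration: the file names below are made up and do not reflect the actual naming or layout of SOLiD image output, and `sample_images` is a hypothetical helper, not part of any vendor tool.

```python
def sample_images(paths, step=1000):
    """Return every `step`-th path from a sorted listing, starting with the first.

    `paths` is any iterable of file names; sorting makes the selection
    reproducible regardless of the order the file system returns them in.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return sorted(paths)[::step]


# Hypothetical file names, used only to show the selection.
names = [f"panel_{i:05d}.tif" for i in range(2500)]
kept = sample_images(names, step=1000)
print(kept)
```

A fixed stride is the simplest choice; a random sample per slide or per cycle would work equally well if the concern is that a regular stride always hits the same region.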
Old 04-09-2010, 02:28 PM   #4
Alex Coventry
Junior Member
 
Location: Ithaca, NY

Join Date: Oct 2009
Posts: 5
Default

..........

Last edited by Alex Coventry; 04-09-2010 at 02:35 PM.
Old 04-09-2010, 02:37 PM   #5
throwaway
Junior Member
 
Location: America

Join Date: Apr 2010
Posts: 9
Default

So, what kind of debugging do you use the images for? I have heard of people using them to diagnose primer failures, but not much beyond that.
Old 04-09-2010, 02:43 PM   #6
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by throwaway View Post
So, what kind of debugging do you use the images for? I have heard of people using them to diagnose primer failures, but not much beyond that.
Pretty much exactly that.
Old 04-11-2010, 06:47 PM   #7
snetmcom
Senior Member
 
Location: USA

Join Date: Oct 2008
Posts: 157
Default

The cost to store images often outweighs the value. I have never seen a reason to go back to the raw image files. If your primary metrics are fine, the images will not help.
Old 04-11-2010, 09:16 PM   #8
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by snetmcom View Post
The cost to store images often outweighs the value. I have never seen a reason to go back to the raw image files. If your primary metrics are fine, the images will not help.
It also depends on the company, but you might need a subset of images to provide to the company to prove that there is a problem with the sequencer or reagents. Just a thought.

Last edited by nilshomer; 04-11-2010 at 09:16 PM. Reason: speak-and-spell
Old 04-12-2010, 01:02 AM   #9
clivey
Member
 
Location: Oxford

Join Date: Jul 2008
Posts: 24
Default Image QC

Was very useful when setting up and debugging machines, chemistry and protocols at Solexa and Sanger.

Often systemic issues with the instrument showed up as easily recognisable problems in the images. It is easy to know what an ideal image should look like, and any deviation from this can be quickly and easily recognized by eye (the human brain is very good at this sort of pattern recognition). For example, contaminants in the reagents would show as bright blobs that would then get falsely called as clusters, leading to pseudo-sequences. Blobs adjacent to real clusters would spill over signal and skew base calls (and perhaps your SNP calls). Badly set up optics would lead to uneven illumination across the tile, giving poor or artefactual base calls. Flaws in the focusing software, occurring sporadically, would do the same. Manufacturing faults in the flowcells would lead to flowcell walls being imaged, altering focusing and giving rise to pseudo-sequences. Other flaws in the flowcell coatings would show up in the images but not in the metrics, as 'black holes' in the surfaces. Primer problems would give rise to distinct sub-populations of 'speck' clusters. Optical duplicates could be spotted using the X,Y coordinates of similar sequences (I think based on alignment starts and stops), and backtracking to the images demonstrated that some of this was due to edge effects and stage movements (and others), and so on.
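
As a rough illustration of the optical-duplicate check described above, the Python sketch below groups reads by tile and by a duplicate key (for example the read sequence, or alignment start and stop) and flags pairs whose cluster coordinates lie close together. The record layout, the `optical_duplicate_pairs` name and the distance threshold are all assumptions made for the example; they are not taken from any particular pipeline.

```python
from collections import defaultdict
from itertools import combinations
from math import hypot


def optical_duplicate_pairs(reads, max_dist=100.0):
    """Return pairs of read IDs that share a tile and key and sit close together.

    `reads` is an iterable of (read_id, tile, x, y, key) tuples, where `key`
    is whatever marks two reads as "the same" (assumed here: the sequence).
    `max_dist` is a pixel distance chosen arbitrarily for this sketch.
    """
    groups = defaultdict(list)
    for read_id, tile, x, y, key in reads:
        groups[(tile, key)].append((read_id, x, y))

    pairs = []
    for members in groups.values():
        # Compare every pair within a group; fine for small groups.
        for (id_a, xa, ya), (id_b, xb, yb) in combinations(members, 2):
            if hypot(xa - xb, ya - yb) <= max_dist:
                pairs.append((id_a, id_b))
    return pairs


# Made-up records for illustration only.
reads = [
    ("r1", 1, 100.0, 100.0, "ACGT"),
    ("r2", 1, 103.0, 104.0, "ACGT"),  # 5 px from r1, same tile and key
    ("r3", 1, 900.0, 900.0, "ACGT"),  # same key but far away
    ("r4", 2, 101.0, 101.0, "ACGT"),  # different tile
    ("r5", 1, 101.0, 100.0, "TTTT"),  # different key
]
pairs = optical_duplicate_pairs(reads, max_dist=10.0)
print(pairs)
```

Pairs flagged this way are only candidates; as described above, it took going back to the images to show that some of them came from edge effects and stage movements.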

In those days on GAI and IIs you could breeze into the lab and watch the images popping up on a selected number of tiles and hopefully they would be reassuringly normal.

Yes indeed, if the manufacturers changed something in reagents or in the instrumentation and it was not beneficial or introduced a sporadic bug, then the images provided a very powerful way to communicate that to them, especially when those bugs were only affecting some machines or batches and not others. We went back to images a lot in fact, contrary to what one of the posters asserts, although I do not know if this still holds true with more recent systems.

The argument was: once the instruments and reagents become completely reliable, low-variability black-box systems with well-defined behavior and outputs, then of course you don't need to backtrack to images.

For sure, these days, a lot of QC metrics are provided and it may be possible to spot things more easily from these, and of course, far more images are produced now, making it much more difficult to keep and review them. But personally, given the rate of change of these systems and the issues with variability, if I were working on a very large project spanning a long time period, I'd backtrack a bit and ensure that a base call on a recent run equates to a base call on one of my earlier data sets, where I may be combining these data or making comparisons.

If you are modifying run protocols, cluster creation or other aspects of the system that are non-standard, then it's probably wise to backtrack to images.

Personally, I like to know what's going on under the hood; then I can head off a breakdown and get more mileage and economy. Some people like to just turn the key and drive.

Last edited by clivey; 04-12-2010 at 01:09 AM.
Old 04-14-2010, 07:02 AM   #10
throwaway
Junior Member
 
Location: America

Join Date: Apr 2010
Posts: 9
Default

Thanks for the information, clivey. It's hard to know how many of those issues will arise with SOLiD runs, but it provides a place to start looking.