I'd have to double check, but I think we're storing CIF files too.
Originally posted by GenoMax: Devon: There is a subtle but significant difference. Google's nearline storage supposedly offers access with just a 3-5 second delay (so you could compute on it via Google compute, Edit: Not 100% certain about this). Glacier is truly meant for long term storage.
True, but that's more of an added convenience for us than a requirement. Realistically, we'd keep things locally for 3-6 months and then off-load them elsewhere, so the odds of needing to recompute would be quite small and it might prove more convenient to just move the random mucked up dataset back. Obviously as things grow this might change.
-
Originally posted by GenoMax: I am not sure why one would want to save the CIF files (perhaps only if the sample is irreplaceable). This may become a moot point as technology moves along.
In the past we used the CIFs for re-basecalling single lanes. We don't do that anymore (there is no need to). It comes down to the definition of the term "rawdata": some of us are obligated to store the sequencing "rawdata", and this definition varies vastly ...
@Sven: Does Illumina even allow saving CIF files for v4 chemistry runs?
HCS does not offer an option to save CIFs with v4 chemistry, though AFAIK you can tweak the config to do so. But it makes no sense in my eyes (thinking about OLB/RTA development) and is neither recommended nor supported by Illumina. One should take particular care with v4, as much more data is produced in the same time.
But we haven't upgraded all HiSeqs :-)
-
That approach seems prone to problems for analyses that consider things like technical replicates, batch effects, cross-contamination, and basically anything involving imperfections in the sequencing process. It would be fine if sequencing was perfect and unbiased, and the platforms and chemistry stable and unchanging, but that's not really the case.
-
Originally posted by Brian Bushnell: That approach seems prone to problems for analyses that consider things like technical replicates, batch effects, cross-contamination, and basically anything involving imperfections in the sequencing process. It would be fine if sequencing was perfect and unbiased, and the platforms and chemistry stable and unchanging, but that's not really the case.
Ah Brian,
I think you need to face that DNA, the natural RAWDATA storage form, is superior to your crummy digital methodologies. Step out from in front of your computer screen, head down to the lab and take a look at what the real meaning of "high tech" is. Nanotechnology! Pfah! DNA encodes information at a sub-nanometer resolution.
From the earliest automated Sanger machine days, there were less-processed storage forms for the instruments that could be used to clog up your hard drives for as many years as you might keep them. (How many of you have tried to save the initial TIFF image of an ABI377 gel?)
Better to let the instruments use their brittle embedded systems to convert that massive data glob into something approaching a durable storage format. For Sanger sequencers that ended up being the .ab1 file. For Illumina sequencers -- fastq. Heave everything else into the dumpster.
Okay, to be fair, I'm a hypocrite. I still have the autorads from all 100+ 35S sequencing gels that I ran back in the day. I have them labelled and indexed. But I don't see myself going back to re-read them, ever.
Seriously though, have you actually seen technical replicate differences sufficient to swamp biological replicate differences? I mean in cases that were not just the result of loading errors like overclustering?
--
Phillip
-
Originally posted by pmiguel: I think you need to face that DNA, the natural RAWDATA storage form, is superior to your crummy digital methodologies.
I don't touch that stuff; it's dirty. Data is only real once it's in a computer.
As for how important these considerations are... hmmm, I don't know. I'm just tossing in something to worry about.