**ECO** · 04-03-2008, 02:05 PM

Official press release here:

http://www.businesswire.com/portal/site/google/?ndmViewId=news_view&newsId=20080403005053&newsLang=en

Helicos BioSciences Announces Single Molecule DNA Sequence Data Published in Science Magazine

Data Validates the World’s First Single Molecule Sequencing of an Organism

CAMBRIDGE, Mass.--(BUSINESS WIRE)--Helicos BioSciences (NASDAQ: HLCS), a life science company focused on innovative genetic analysis technologies, today announced the publication of a report in Science Magazine demonstrating the first single molecule sequencing of an organism. The report depicts the use of Helicos’ proprietary True Single Molecule Sequencing (tSMS)™ technology to re-sequence the M13 viral genome. The report will appear in the April 4, 2008 print issue of Science Magazine.

The report demonstrates that the tSMS technology can reliably re-sequence a moderately complex genome without the associated errors, cost, and experimental complexity of amplification. The tSMS process captures images of single dye labeled nucleotides as they are incorporated to determine the sequence of the individual DNA strands. In addition, the tSMS method simplifies the DNA sample preparation process and maximizes throughput by packing individual strands of DNA at high densities onto the sequencing surface.

“The ability to sequence individual strands of genomic DNA has been a goal of the scientific community for more than 20 years,” said Timothy Harris, PhD, senior director of research at Helicos BioSciences and the report’s corresponding author. “The data in Science Magazine demonstrate the robustness of our single molecule method and demonstrate our ability to accurately detect single base mutations. Not only does this data represent the first of its kind, but a significant milestone in the genomics revolution.”

To validate its technology, Helicos scientists sequenced the M13 virus genome, examining more than 280,000 strands of captured DNA, directly visualizing the sequential incorporation of individual labeled nucleotides. Overall per-base accuracy was better than 99% and the accuracy of the consensus sequence was 100%. To assess accuracy and robustness of mutation detection, Helicos’ scientists introduced in silico single nucleotide changes into the reference M13 virus genome sequence and compared them to Helicos DNA sequences. The tSMS technology correctly found 98% of 500 simulated mutations with zero false positive errors.

“This data, remarkable as it is, was based on the first generation of our tSMS chemistry,” said Bill Efcavitch, PhD, senior vice president for product R&D at Helicos BioSciences. “We have since developed new generations of ‘one-base-at-a-time’ nucleotides which allow more accurate homopolymer sequencing, and lower overall error rates.”

The report published in Science Magazine initiates the path to many other scientific reports Helicos plans to publish in the upcoming months. These reports will highlight data recently announced at the AGBT meeting in Marco Island further demonstrating single molecule sequencing being applied to both BAC sequencing accuracy, and the ability to count microRNAs as well as identify putative novel miRNAs.

**terabase** · 04-04-2008, 03:14 AM

And the stock is rising

It does not take much to boost their stock price - up 40% within a week !!

Their paper is more a proof of concept than the presentation of a machine that can compete in the nextgen seq market.

**ECO** · 04-04-2008, 07:29 AM

Something else I noticed, in their introduction they bad mouth the library preparation protocols for all the other platforms, basically saying that adding adapters is labor intensive, etc, then they go on to prove that they absolutely MUST use adapters to get bidirectional reads because their error rates are so high.

Seems like C incorporations are a killer...

**terabase** · 04-04-2008, 08:17 AM

they "sold" a Heliscope to Expression Analysis

Actually, their stock price first went up a week ago when they announced the sale of the first machine to http://www.expressionanalysis.com/ . This is actually the second time they have announced the first sale. They do not say what Expression Analysis paid, or how much Helicos had to pay to make them try the machine. Maybe they will resequence bacteriophage lambda soon. It is about four times the size of M13.

At the current cash burn rate Helicos has enough cash for about a year or so, so they absolutely need positive news to drive the stock price up, at least temporarily.

**ECO** · 04-04-2008, 02:41 PM

Originally posted by terabase View Post

Maybe they will resequence bacteriophage lambda soon. It is about four times the size of M13.

Tiny genome resequencing service. No one ever said how big the $1000 genome had to be!

**Chipper** · 04-04-2008, 03:03 PM

Originally posted by ECO View Post

Tiny genome resequencing service. No one ever said how big the $1000 genome had to be!

Lol, but remember that this experiment was done on a pre-production machine, using only one lane (out of 2x25 per run) with about 100x coverage per strand. And the obvious advantage is the lack of amplification bias, not that you don't have to ligate linkers. And multipass readings are not the same as bidirectional reads. I guess we will see more in the coming days, but if they can come close to what they claim, SOLiD / Solexa systems will have a hard time competing at the current reagent costs...

**Mr. Gunn** · 04-09-2008, 01:28 PM

Originally posted by ECO View Post

Something else I noticed, in their introduction they bad mouth the library preparation protocols for all the other platforms, basically saying that adding adapters is labor intensive, etc, then they go on to prove that they absolutely MUST use adapters to get bidirectional reads because their error rates are so high.

I noticed that too. I still think the killer for them is going to be the expensive optics. There are other ways of detecting really small amounts that don't require a million dollars in instrumentation, ya know?

**kmay** · 08-01-2008, 06:17 AM

seen helicos data

Hi,

Helicos seems to be so popular here

Well, bells and whistles about M13 was maybe not such a wise decision...

However, I am currently analyzing a data set I received from Helicos: a DGE study from a human tissue. I have to say, it looks pretty good.

Can't tell more here... NDA!

But to summarize: the biological results are absolutely comparable to those derived from Solexa! I think they are getting their act together.

Cheers

Klaus

**Chipper** · 08-01-2008, 10:31 AM

Hi Klaus,

good to see that they are generating usable data with the Heliscope. Could you share any numbers from the sequencing or is that also under NDA?...

**kmay** · 08-01-2008, 01:09 PM

I'll see

Chipper,

I'll see what I can do. But not before next week. I can't access our secure servers from here...

Klaus

**kmay** · 08-04-2008, 03:28 AM

Okay,

what I can share are numbers from our first step analysis after mapping. Mapping was very stringent: best unique hit
(= at least one shortest unique sub-sequence contained, point mutations allowed, no indels allowed):

here the summary:
The data set contains reads from the following organism:
Homo sapiens 4020914

Read length (bp) number
11 86
12 1333
13 6904
14 20384
15 49695
16 91159
17 131835
18 153064
19 166469
20 164352
21 169349
22 171943
23 178216
24 185147
25 210388
26 230526
27 245471
28 178388
29 168223
30 143917
31 135030
32 122484
33 113991
34 106351
35 98137
36 90642
37 83322
38 77510
39 70555
40 63349
41 56651
42 50235
43 44120
44 38896
45 34378
46 30472
47 26112
48 22580
49 18993
50 15407
51 12211
52 9602
53 7529
54 5801
55 4115
56 2990
57 2174
58 1517
59 1204
60 953
61 821
62 623
63 664
64 700
65 464
66 572
67 514
68 655
69 356
70 400
71 287
72 158
73 140
74 103
75 80
76 55
77 33
78 25
79 26
80 7
81 15
82 4
83 7
84 1
85 1
86 4
87 3
88 2
89 2
90 3
92 2
93 3
94 6
95 1
96 4
98 1
101 1
102 2
107 1
110 1
112 1
113 3
116 2
123 1

Annotation:
Intergenic regions 1810570558bp 58.8%
Promoters 44676168bp 1.5%
Exons 97616725bp 3.2%
Introns 1172232197bp 38.1%

Read distribution:
Intergenic regions 1694883 42.2%
Promoters 325016 8.1%
Exon 1079293 26.8%
Intron 1167470 29.0%
Partial 79268 2.0%
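
One way to read the two tables above together is to divide each region's share of mapped reads by its share of the genome. The following is a minimal sketch in Python using only the percentages quoted in the post; the dictionary keys and the function name are made up for illustration, and the "Partial" class is left out because it has no counterpart in the annotation table.

```python
# Percentages copied from the post: share of the genome (by bp) and share of mapped reads.
genome_pct = {"intergenic": 58.8, "promoter": 1.5, "exon": 3.2, "intron": 38.1}
reads_pct = {"intergenic": 42.2, "promoter": 8.1, "exon": 26.8, "intron": 29.0}


def fold_enrichment(reads, genome):
    """Ratio of read share to genome share for every category present in both."""
    return {k: reads[k] / genome[k] for k in genome if k in reads}


enrichment = fold_enrichment(reads_pct, genome_pct)
# Print the categories from most enriched to most depleted.
for region, ratio in sorted(enrichment.items(), key=lambda kv: -kv[1]):
    print(f"{region:<11} {ratio:.2f}x")
```

With these figures exons come out roughly 8-fold over-represented and promoters roughly 5-fold, while intergenic and intronic regions fall below 1, which is the pattern one would hope for in a DGE data set.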

=======================

Next step: clustering
summary output:

Cluster detection:
window size: 100
reads/window: 7
probability.: 1.1e-10
clusters detected: 35496
reads in clusters: 2118299 52.68%
min. cluster length: 13
max. cluster length: 5876
avg. cluster length: 117
min. number of reads: 7
max. number of reads: 251937
avg. number of reads: 59
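
The post does not say how the reported probability is derived. One plausible reading, and it is only an assumption, is a Poisson tail: the chance of seeing at least 7 reads in a 100 bp window if the mapped reads were spread uniformly over the genome. A sketch in Python follows; the genome size of about 3.08 Gb is also an assumption, roughly what the annotation table above implies, and `poisson_tail` is a hypothetical helper name.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed term by term.

    Summing the tail directly avoids the precision loss of computing 1 - CDF
    when the answer is tiny.
    """
    term = math.exp(-lam) * lam ** k / math.factorial(k)
    total = 0.0
    i = k
    # Stop once further terms no longer change the sum at double precision.
    while term > 0.0 and term > total * 1e-17:
        total += term
        i += 1
        term *= lam / i
    return total


MAPPED_READS = 4_020_914   # from the post
WINDOW = 100               # window size from the post, taken to be in bp
MIN_READS = 7              # reads/window from the post
GENOME_SIZE = 3.08e9       # assumption: approximate annotated genome size in bp

lam = MAPPED_READS * WINDOW / GENOME_SIZE   # expected reads per window under uniformity
p = poisson_tail(MIN_READS, lam)
print(f"expected reads per window: {lam:.4f}")
print(f"P(>= {MIN_READS} reads in a window): {p:.2e}")
```

Under these assumptions the result lands close to 1e-10, in the same range as the probability reported above, but that agreement does not confirm that this is the model the software uses.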

Classification
intergenic regions 10945 30.8%
promoters 3369 9.5%
exon 10501 29.6%
intron 8883 25.0%
partial 5167 14.6%

==========================

expression analysis:

analyzed transcripts: 85562
expressed transcripts: 72514 84.8%
normalized expression value (NE):
minimum: 0.000
maximum: 95.675
average: 0.061
analyzed loci: 32514
expressed loci: 26160 80.5%

NE Transcripts
(0.000:0.020] 48993
(0.020:0.040] 9557
(0.040:0.060] 4390
(0.060:0.080] 2294
(0.080:0.100] 1465
(0.100:0.120] 1020
(0.120:0.140] 707
(0.140:0.160] 576
(0.160:0.180] 453
(0.180:0.200] 353
(0.200:0.220] 245
(0.220:0.240] 251
(0.240:0.260] 174
(0.260:0.280] 166
(0.280:0.300] 131
(0.300:0.320] 121
(0.320:0.340] 100
(0.340:0.360] 69
(0.360:0.380] 81
(0.380:0.400] 107
(0.400:95.675] 1261

====================================

This was a very crude first analysis, run with all parameters at their defaults.
Mapping on our mapping station took 10 minutes
(parameters for best unique are least time consuming)

Rest of analysis took 7 minutes on GGA.

Cheers

Klaus

**Chipper** · 08-04-2008, 05:51 AM

Klaus,

thanks for sharing the numbers, it sure looks promising. Was this data from one lane only?

**kmay** · 08-04-2008, 06:05 AM

The raw reads were pooled from two channels.

Again, this was a quick and dirty first pass. Mapped tag numbers can be increased significantly with more relaxed mapping parameters. However, downstream pathway mining of the expressed transcripts 100% confirms the biological context of the sample.

Klaus

**kmay** · 08-05-2008, 04:05 AM

thanks for the Helicos data. What number of mismatches was allowed in the alignment?

Basically there was no limit on the number of point mutations allowed. The "unique best match" setting in our method works like this:

There is a tree with the shortest unique words for each position in the genome. Each shortest unique word matches exactly once in the genome. E.g. one starts with a tuple of 5 bp and checks uniqueness, extends it by one bp, checks uniqueness again (6, 7, 8, ...), and so on until the "word" is unique. SNPs are taken into account. The words in this library therefore have variable length.
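
The word-growing idea can be sketched in a few lines of Python. This is a toy illustration, not the actual implementation described in the post: it uses plain word counting instead of a tree, looks at one strand only, ignores SNPs, and would not scale to a human genome. The function name and the `min_len` parameter are made up.

```python
from collections import Counter


def shortest_unique_words(genome, min_len=5):
    """Map each start position to the shortest word (>= min_len) that occurs exactly once."""
    n = len(genome)
    result = {}
    pending = list(range(n - min_len + 1))  # positions that still lack a unique word
    length = min_len
    while pending:
        # Count every word of the current length across the whole genome.
        counts = Counter(genome[i:i + length] for i in range(n - length + 1))
        still_pending = []
        for pos in pending:
            if pos + length > n:
                continue  # ran off the end of the genome without becoming unique
            word = genome[pos:pos + length]
            if counts[word] == 1:
                result[pos] = word
            else:
                still_pending.append(pos)  # try again with one more base
        pending = still_pending
        length += 1
    return result


# Tiny example with a repeated ACGT motif, starting from 2-mers instead of 5-mers.
words = shortest_unique_words("ACGTACGTT", min_len=2)
for pos in sorted(words):
    print(pos, words[pos])
```

Positions inside the repeat need longer words before they become unique, which is why the resulting library has variable word length.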

For mapping, parameters can be introduced: point mutations and indels within those shortest unique words.

For "unique best match" none of the above is allowed (= most stringent). Reads from Helicos were checked for whether they contain at least one exact shortest unique word in full. Then, around this position, the alignment grows into the read in both directions. Here point mutations were allowed, with no limit imposed. During this growth, in this case, SNPs were not taken into account, so several of the observed point mutations may originate from SNPs.
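
The anchor-then-grow step can be sketched as follows. Again this is a simplified Python illustration under stated assumptions, not the software used in the post: `unique_words` is a hypothetical index mapping each unique word to its genome start position, the first word found in the read is taken as the anchor, and the extension is ungapped (substitutions only, no indels), as described above.

```python
def map_read(read, genome, unique_words):
    """Anchor a read on an exactly matching unique word, then extend without gaps.

    unique_words maps word -> genome start position (hypothetical index format).
    Returns (genome_start, mismatches), or None if no word is contained in full.
    """
    for word, gpos in unique_words.items():
        rpos = read.find(word)
        if rpos == -1:
            continue  # this word does not occur exactly in the read
        start = gpos - rpos  # genome coordinate of the first read base
        if start < 0 or start + len(read) > len(genome):
            continue  # the read would hang off the end of the reference
        # Grow in both directions from the anchor: compare every read base with
        # the reference and count substitutions, with no upper limit.
        reference = genome[start:start + len(read)]
        mismatches = sum(1 for r, g in zip(read, reference) if r != g)
        return start, mismatches
    return None


genome = "ACGTACGTTGCA"
unique_words = {"TTGCA": 7}  # hand-made toy index
print(map_read("CCTTGCA", genome, unique_words))  # anchored read with one substitution
print(map_read("AAAAAAA", genome, unique_words))  # no exact unique word, so unmapped
```

A real mapper would look words up by hashing or a tree rather than scanning the whole index per read, and would compare several candidate anchors before calling a best hit.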

Very basic statistics:
Point mutations # of reads
0 509622
1 369486
2 318733
3 313244
4 297974
5 297301
6 298140
7 344822
8 233730
9 191911
10 153682
11 131893
12 113460
13 98302
14 82719
15 69071
16 57540
17 46855
18 37505
19 30710
20 24214

Keep in mind that we have read lengths up to 123 bp. The above numbers need to be normalized to read length and to the length and count of the shortest unique words contained.
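
One possible normalization, offered only as an assumption about what is meant here, is to divide the point mutation count by the number of read bases that were free to mismatch, i.e. the read length minus the bases covered by exactly matched unique words. The function and parameter names in this Python sketch are hypothetical.

```python
def mismatch_rate(mismatches, read_length, exact_bases):
    """Point mutations per base that was actually free to mismatch.

    exact_bases: number of read bases covered by exactly matched unique words
    (hypothetical input; these bases cannot carry a mismatch by construction).
    """
    free_bases = read_length - exact_bases
    if free_bases <= 0:
        return 0.0  # the whole read was an exact seed, so nothing could mismatch
    return mismatches / free_bases


# The same raw count means very different things for short and long reads.
print(mismatch_rate(5, 25, 15))    # short read
print(mismatch_rate(5, 100, 15))   # long read
```

Five point mutations in a 25 bp read with a 15 bp seed is a poor alignment, while five in a 100 bp read is a few percent, which is why the raw table above cannot be read as an error rate.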


First Helicos Publication! Single Molecule DNA Seq of a "Viral" Genome
