In the specification sheet for the Illumina IIx upgrade, I see the data output for a 2x50 experiment listed as 9.5-12 GB. They specify that these are "high quality" reads because they pass the v1.3 filters. Does anyone know what percentage of high-quality reads is generally mappable to the reference genome?
That is highly dependent on the quality of your library, the organism you're sequencing, the alignment algorithm you're using, and your definition of 'mapping' back to the genome.
For example, with PhiX you generally get greater than 98% of your reads mapping back to the genome using ELAND, since you don't have to worry about reads in repeat regions or contamination. With human you'll get around 80% mapping back to the genome (depending on your library prep) using ELAND. The results for human are worse than for PhiX because ELAND only reports unique alignments and the human genome sequence is highly repetitive (i.e. any unknown region isn't going to align, and if the same sequence occurs twice in the genome the read won't be counted unless ELAND can resolve it using its pair). ELAND also can't handle more than 2 errors in the seed it uses for the alignment, so you lose some reads that way.
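To put those percentages next to the spec-sheet numbers, here is a rough back-of-the-envelope sketch in Python. It simply multiplies the quoted 9.5-12 GB output range by the approximate ELAND mapping rates mentioned above. The function names and the flat rates (0.98 for PhiX, which is really a lower bound, and 0.80 for human) are assumptions for illustration only; your real rate will vary with library prep, organism and aligner.

```python
def mappable_yield(total_output, mapped_fraction):
    """Expected mapped output, in the same units as total_output."""
    if not 0.0 <= mapped_fraction <= 1.0:
        raise ValueError("mapped_fraction must be between 0 and 1")
    return total_output * mapped_fraction


# Output range quoted on the spec sheet for a 2x50 run.
SPEC_RANGE = (9.5, 12.0)

# Approximate ELAND mapping rates from this post (assumed flat values).
APPROX_RATES = {"PhiX": 0.98, "human": 0.80}


def mappable_range(rate):
    """Apply one mapping rate to both ends of the spec-sheet range."""
    return tuple(round(mappable_yield(x, rate), 2) for x in SPEC_RANGE)


for name, rate in APPROX_RATES.items():
    low, high = mappable_range(rate)
    print(f"{name}: roughly {low}-{high} mapped (same units as the spec sheet)")
```

For human at about 80%, that works out to roughly 7.6-9.6 of the quoted 9.5-12, treated purely as an estimate.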
Hope this helps,
Brad
Last edited by basickler; 04-10-2009, 08:55 AM.