As the gel-free Nextera mate-pair method generates a range of mate-pair library insert sizes, how does one handle the assembly informatically? If you assemble in an iterative fashion, say at 3 kb, 8 kb and 15 kb, should you remove the reads that are identified in each of the assemblies and then feed them back in separately? Any suggestions are appreciated.
Hi there
Yeah, I am at a complete loss as to what to do with this data. None of the scaffolders want to use it. I've tried SSPACE, SOPRA, OPERA (which doesn't exclude joins based on size), MIP and many assemblers with inbuilt scaffolders. None have worked. The only time I get scaffolding to work at all with the gel-free data is when I include paired-end reads together with the mate-pair data as input. But then the ambiguous regions expand and I get crazy total genome sizes. This is painful because, even though you're joining contigs, how does one report any metrics on a genome that is twice the size it should be? I get decent mapping as well, so it's not that my reads aren't mapping. Hope you've made some progress! Did you try the iterative approach?
BTW I have had some success with using NextClip for identifying true mate-pairs.
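On the iterative idea, here is a minimal sketch of one way to split the pairs into size classes before scaffolding, assuming the mate-pairs have already been mapped back to the contigs and you have the alignments as SAM text. It only reads the standard SAM columns (FLAG, RNEXT, TLEN). The function name and the size windows around 3 kb, 8 kb and 15 kb are made up for illustration, so adjust them to whatever your insert-size distribution actually looks like. Pairs that land on different contigs have no usable TLEN and are skipped, so this only bins pairs where both mates map within the same contig.

```python
from collections import defaultdict

# Hypothetical size windows (bp) around the nominal 3 kb, 8 kb and 15 kb
# classes; these cut-offs are assumptions, not values from any tool.
SIZE_BINS = {
    "3kb": (1500, 5500),
    "8kb": (5500, 11500),
    "15kb": (11500, 20000),
}


def bin_pairs_by_insert(sam_lines, bins=SIZE_BINS):
    """Group read-pair names by the insert size observed in SAM records."""
    binned = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        name, flag = fields[0], int(fields[1])
        rnext, tlen = fields[6], int(fields[8])
        # Require a paired read with both the read and its mate mapped.
        if not flag & 0x1 or flag & 0x4 or flag & 0x8:
            continue
        # Ignore secondary and supplementary alignments.
        if flag & 0x900:
            continue
        # Keep pairs on the same reference; TLEN > 0 counts each pair once.
        if rnext != "=" or tlen <= 0:
            continue
        for label, (low, high) in bins.items():
            if low <= tlen < high:
                binned[label].add(name)
                break
    return dict(binned)
```

You could then write out one FASTQ pair per bin using the returned read names and hand each bin to the scaffolder as its own library, smallest insert size first.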
Cheers
J