If the two duplicated regions are similar in sequence (90% identity) and located very close to each other (100 bp apart), how likely is it that they end up with identical sequences because of an assembly error? Thanks very much!
-
Originally posted by maimaiti2008: If the two duplicated regions are similar in sequence (90% identity) and located very close to each other (100 bp apart), how likely is it that they end up with identical sequences because of an assembly error? Thanks very much!
It would probably depend on the assembler you use, and on the length of your reads. Different assemblers give different results. If you have long reads or paired reads that can span the duplicated regions, then the assemblers are less likely to make an 'error'.
-
To me, 90% identity sounds divergent enough for de novo assemblers to tell the two copies apart.
If you're using a k-mer based approach, neighboring k-mers are only joined when they overlap exactly over k-1 bases. So, if 1 in 10 bases differs and you're assembling at k=35, the k-mers covering these genes will carry, on average, 3.5 nucleotide differences between the two copies. So unless you have stretches of near 100% identity, where the assembler might treat the more divergent region between them as just a bubble, you should be fine. And even then, paired-end reads or longer k-mers would likely take that problem away entirely.
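The arithmetic above can be checked with a short sketch. It assumes the differences between the two copies are independent and evenly spread along the sequence, which is a simplification of mine and not something stated in the thread; real duplicates often have conserved stretches, which is exactly the near-100% identity caveat. The function names are hypothetical.

```python
def expected_mismatches(k: int, identity: float) -> float:
    """Expected number of differing bases in a k-base window between two copies."""
    return k * (1.0 - identity)


def prob_identical_kmer(k: int, identity: float) -> float:
    """Chance that a k-base window is identical in both copies.

    Assumes each base differs independently with probability (1 - identity);
    this is an illustrative assumption, not a property of any assembler.
    """
    return identity ** k


if __name__ == "__main__":
    identity = 0.90  # 90% identity, as in the question
    for k in (21, 35, 55):
        mism = expected_mismatches(k, identity)
        shared = prob_identical_kmer(k, identity)
        print(f"k={k}: expected mismatches={mism:.1f}, P(identical k-mer)={shared:.4f}")
```

Under that assumption, only a few percent of 35-mers would be shared between the copies at 90% identity, and the share drops further as k grows, which is consistent with the point that longer k-mers help.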