Hi, I did a de novo assembly with Velvet for a species that has no previous genome information, so how can I validate my assembly?
-
That's the 64 million dollar question.
You'll have to talk to the biologists in your collaboration to see what is known that you can check (e.g. any ESTs or other previously sequenced bits like important genes, experimentally characterised genome size, GC percentage), and what related organisms you might be able to compare it to.
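As a rough illustration of the genome size and GC checks mentioned above, here is a minimal Python sketch that summarises a contig FASTA so the numbers can be compared with whatever is known experimentally. The function names are made up for this example, and it assumes a plain FASTA file such as the `contigs.fa` that Velvet writes to its output directory.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA records from an iterable of lines into {name: sequence}."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def assembly_summary(contigs: Dict[str, str]) -> Dict[str, float]:
    """Contig count, total size, N50 and GC percentage (N bases are ignored for GC)."""
    lengths = [len(s) for s in contigs.values()]
    gc = sum(s.count("G") + s.count("C") for s in contigs.values())
    acgt = sum(s.count(base) for s in contigs.values() for base in "ACGT")
    return {
        "n_contigs": len(lengths),
        "total_bp": sum(lengths),
        "n50": n50(lengths),
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
    }
```

Something like `with open("contigs.fa") as fh: print(assembly_summary(parse_fasta(fh)))` would then give figures to set against the expected genome size and GC content. A total far from the expected size, or a GC percentage that is clearly off, is a hint that something is missing, duplicated or contaminated, not proof either way.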
Specific to velvet, I'm sure there is plenty of good advice in the documentation and mailing list archive about common pitfalls etc.
-
I agree with Peter, this is the question we'd all love to be able to answer with confidence!
This is a question of having multiple pieces of evidence to give you a confidence level as to your assembly. With current technology it is still impossible to "prove" an assembly is correct, but you can get pretty damn close.
Optical mapping is a complementary technology which might be helpful for independent verification of contig order (particularly for large contigs >100kb).
Sequencing with another technology, particularly 454, might give some clues as to the extent of misassemblies. Paired-end 454 data will be even more helpful.
You could do de novo assembly with other assemblers and see if they agree, but this is probably weak, circumstantial evidence.
Another method of verifying an assembly is to design primers to amplify the entire genome in overlapping segments, say 10kb and check them on a gel. This of course relies on you having a finished genome sequence to check with.
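To make the tiling idea concrete, here is a small sketch that only works out the coordinates of overlapping ~10kb segments along one linear contig. It does not design primers. The function name and the 1kb overlap are assumptions chosen for illustration.

```python
from typing import List, Tuple


def tile_amplicons(seq_len: int, amplicon: int = 10_000,
                   overlap: int = 1_000) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) windows that cover [0, seq_len).

    Consecutive windows share `overlap` bases; the last window may be shorter.
    """
    if amplicon <= overlap:
        raise ValueError("amplicon must be longer than overlap")
    if seq_len <= 0:
        return []
    windows = []
    start = 0
    while True:
        end = min(start + amplicon, seq_len)
        windows.append((start, end))
        if end >= seq_len:
            break
        # Step back by the overlap so neighbouring amplicons share sequence.
        start = end - overlap
    return windows
```

For a 25kb contig, `tile_amplicons(25_000)` gives three windows. The expected product size for each window is `end - start`, which is the number you would compare with the band on the gel.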
You might find an easier question to answer is "what level of assembly accuracy will permit me to answer my scientific question?"
-
You [the OP] might be oversimplifying a tad.
Any assembly will be composed of:
- correct contigs
- fragmented contigs
- chimeric contigs
- spurious contigs
and may suffer from:
- missing contigs
I would consider chimeras and spurious contigs to be distinguished by length: spurious contigs are an artifact of the de Bruijn graph method and are very short. I don't think chimeras are very common in Velvet compared to other assemblers; any ambiguity normally results in fragments.
Velvet assemblies performed under high stringency (high kmer, high cvCut) conditions will minimize chimeric, fragmented and spurious contigs at the expense of more missing contigs.
To validate a de-novo short read assembly, especially a transcriptome which by its very nature will never form long contigs, you need to decide whether you are willing to accept some bad with the good or insist on just the good and get less of it. This is a classic signal-to-noise problem.
One way to judge an assembly is to run Velvet under varying parameters and see if the results converge. If you get wildly different results you can examine which contigs are spliced or fragmented under different settings and make your own judgments from there.
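One crude way to put a number on "do the results converge" is to ask how much of one assembly reappears verbatim inside another. The sketch below is only an illustration under strong assumptions: contigs are uppercase strings already loaded into dictionaries, matching is exact substring matching on either strand, and the function names are invented here. A real comparison would use an aligner so that small differences do not count as disagreement.

```python
from typing import Dict

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def contained_fraction(query: Dict[str, str], target: Dict[str, str]) -> float:
    """Fraction of query bases in contigs found exactly (either strand) inside a target contig."""
    targets = list(target.values())
    total = sum(len(seq) for seq in query.values())
    if total == 0:
        return 0.0
    contained = 0
    for seq in query.values():
        rc = reverse_complement(seq)
        if any(seq in t or rc in t for t in targets):
            contained += len(seq)
    return contained / total
```

Running this in both directions for each pair of parameter settings gives a small matrix. If the short contigs from a stringent run mostly sit inside the longer contigs from a relaxed run, the relaxed run is probably joining fragments correctly. Contigs that are contained in neither direction are the ones worth inspecting by hand.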
-
Good answer Zigster!
I'd add the final possibility of "correct" contigs containing consensus errors due to transposed nucleotides in repeats which have been resolved using paired-end information, as discussed in my blog post at http://pathogenomics.bham.ac.uk/blog...nome-assembly/
-
Hi zingster and nicklomen, thanks for your replies, they were very useful. My idea for validation is this: if we have Sanger sequences for the species we are assembling, we can BLAST them against the assembled Solexa contigs and then take the assembly that covers the most Sanger sequences in the BLAST results (e.g. more than 90 percent) as a valid assembly. What do you think?
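A minimal sketch of how that check could be scored, assuming the Sanger sequences were used as BLAST queries against the contigs and the results were saved in the default 12-column tabular format (`-outfmt 6`). It reads "more than 90 percent" as "at least 90 percent of a Sanger sequence's length is covered by hits", which is one possible interpretation. The function names and the 95 percent identity filter are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def query_coverage(blast_lines: Iterable[str], query_lengths: Dict[str, int],
                   min_identity: float = 95.0) -> Dict[str, float]:
    """Fraction of each query covered by the union of its sufficiently identical hits."""
    hits: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Default tabular columns: qseqid sseqid pident length mismatch gapopen
        # qstart qend sstart send evalue bitscore
        qseqid, pident = fields[0], float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        if pident >= min_identity:
            hits[qseqid].append((min(qstart, qend), max(qstart, qend)))

    coverage: Dict[str, float] = {}
    for qseqid, qlen in query_lengths.items():
        covered = 0
        current_end = 0  # last 1-based query position already counted
        for start, end in sorted(hits.get(qseqid, [])):
            if end > current_end:
                # Count only the part of this hit not already covered.
                covered += end - max(start, current_end + 1) + 1
                current_end = end
        coverage[qseqid] = covered / qlen if qlen else 0.0
    return coverage


def fraction_recovered(coverage: Dict[str, float], threshold: float = 0.9) -> float:
    """Share of queries whose covered fraction reaches the threshold."""
    if not coverage:
        return 0.0
    return sum(1 for c in coverage.values() if c >= threshold) / len(coverage)
```

Comparing `fraction_recovered` between assemblies gives one number per assembly. Bear in mind that a Sanger sequence split across several contigs can still score as fully covered, so this measures completeness more than correct contig structure.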