Hi, I am working on de novo assembly of Illumina reads with Velvet. I first trimmed the raw Illumina reads based on quality, but when I run Velvet on a subset of the reads the N50 is always low, around 20 to 29. I have tried various parameters and nothing improved the N50. Any suggestions?
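For reference, here is a minimal sketch of how N50 can be computed from a list of contig lengths. The example lengths are made up, and note that Velvet reports contig lengths in k-mers, so the numbers it prints may differ slightly from an N50 computed on nucleotide lengths.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list).

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


# Made-up contig lengths, for illustration only.
example_lengths = [100, 80, 50, 30, 20, 20]
print(n50(example_lengths))
```

An N50 in the 20s therefore means that half of the assembly sits in contigs shorter than a single 36 bp read.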
-
What is your read length, and what is the estimated genome length? If the sequencing coverage is very high (>=200) and uneven, the N50 from Velvet will be very low because of sequencing errors and SNPs, even if you use a subset of the reads and set the important parameters (k-mer length, exp_cov and cov_cutoff).
Last edited by gridbird; 02-09-2011, 11:25 AM.
-
How many reads do you have? How high is your estimated coverage? Which k-mers did you try? How many contigs do you get? Do you get any long contigs? What coverage is stated in the ids of the contigs?
I really can't say where the problem might be from the information you have given so far.
Anyway, the best way to address Velvet problems is over the mailing list:
Zerbino is also very active on this mailing list.
-
17 million single-end reads from one lane. Coverage is near 200 (but there may be contamination from the nuclear genome). Read length is 36. K-mers from 23 to 31. The number of contigs ranges from 300 to 2,500. N50 is 12 to 21. We also tried with 1 million and with 100 thousand reads, and the result was only slightly better (N50 54). The maximum contig length is near 100 nucleotides. What are the "ids"?
(454 data from the same material gave a chloroplast genome map.)
Last edited by vtosha; 02-25-2011, 06:07 AM.
-
I mean the tag (id) of the contigs. They look like >NODE_length_xxxxx_cov_xxxxx.xxxxxx, so you can check what coverage Velvet assigns to the contigs.
Did you also set the parameter -unused_reads yes, to check how many reads Velvet does not use? Did you do any quality trimming before using Velvet?
A coverage near 200 should give you better results, so something seems to be wrong. :-/ Have you tried another assembler?
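To check the coverage in the contig ids without reading them by eye, something like the sketch below could work. It assumes headers of the form shown above, optionally with a node number after NODE_ (as in >NODE_12_length_345_cov_67.5); the function names are my own, not part of Velvet.

```python
import re

# Matches ">NODE_length_345_cov_67.5" as well as ">NODE_12_length_345_cov_67.5".
# The optional node number is an assumption about the header layout.
HEADER_RE = re.compile(r"^>NODE_(?:(\d+)_)?length_(\d+)_cov_([\d.]+)")


def parse_header(line):
    """Return (node, length, coverage) for a contig header, or None if it does not match."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    node = int(match.group(1)) if match.group(1) else None
    return node, int(match.group(2)), float(match.group(3))


def contig_coverages(lines):
    """Yield (length, coverage) for every recognised header in an iterable of FASTA lines."""
    for line in lines:
        if line.startswith(">"):
            parsed = parse_header(line)
            if parsed is not None:
                yield parsed[1], parsed[2]
```

You would pass it an open file handle for the contigs FASTA that velvetg wrote, then sort or average the resulting (length, coverage) pairs to see whether a few contigs have coverage far above the rest.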
-
The coverage in the contig ids is near 1000.
We didn't set the -unused_reads parameter ourselves, but Velvet reports how many reads it uses: 100 thousand to 1 million out of the 17 million reads. When we assemble 1 million or 100 thousand reads, it uses 82 thousand out of 1 million and 1,000 out of 100 thousand. No, we didn't trim the reads.
We tried Edena, with no good results (a 134-nucleotide contig with no BLAST hits to anything).
No BLAST hits to adapters or primers either.
Could the problem be excessive PCR amplification?
-
I think sequencing errors are causing this problem. Velvet does not give a good N50 when high coverage is combined with sequencing errors. Did you check the Velvet paper? For error-free reads, Velvet gets a good N50 no matter how high the coverage is. For real reads, the N50 drops as coverage increases, because of sequencing errors. You can randomly subsample the reads to several coverage levels, such as 10, 50, 100, 150 and 200, assemble each with Velvet, and pick the one with a good N50.
Did you try an error-correction program such as SHREC or Quake to correct sequencing errors before assembly? You can also use SolexaQA to trim low-quality reads before assembly.
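The random subsampling could be sketched roughly as follows. This is only an illustration: it assumes plain 4-line FASTQ records (no wrapped sequence lines), keeps each read independently with a fixed probability, and the function names are hypothetical.

```python
import random


def keep_fraction(current_cov, target_cov):
    """Fraction of reads to keep to go from current_cov down to target_cov."""
    if current_cov <= 0:
        raise ValueError("current_cov must be positive")
    return min(1.0, target_cov / current_cov)


def read_fastq(lines):
    """Group an iterable of FASTQ lines into 4-line records (assumes no line wrapping)."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield tuple(record)
            record = []


def subsample(records, fraction, seed=0):
    """Yield each record with probability `fraction`; the seed makes the draw repeatable."""
    rng = random.Random(seed)
    for rec in records:
        if rng.random() < fraction:
            yield rec
```

For example, going from roughly 200x down to 50x would use keep_fraction(200, 50), i.e. about a quarter of the reads. Because each read is kept independently, the number of reads kept is only approximately that fraction of the input.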
-
velvet
Hi guys,
I have been working on a yeast strain and have raw paired-end Illumina reads. Is there any way to find out the read length and coverage from the sequence data itself? I am trying to use Velvet to assemble the genome, but I am new to Velvet and have a few questions. What is the optimization criterion for k-mer length? Second, how can the contigs obtained from velvetg be used to generate the full genome sequence of the organism? And how can genes then be predicted from the sequence?
-
velvet N50
You can get the read length by looking at your fastq files, and FastQC will also report it.
There is a script called velvetk that calculates k-mer coverage from your fastq files before you run Velvet.
See
You may also find Velvet Optimiser and Velvet Advisor useful.
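If you want to do the arithmetic yourself, here is a rough sketch. It assumes plain 4-line FASTQ records and that you supply an estimated genome size (the value is your assumption, not something the reads tell you directly). The k-mer coverage formula, Ck = C * (L - k + 1) / L, is the one given in the Velvet manual; the function names are my own.

```python
from collections import Counter


def read_lengths(fastq_lines):
    """Count read lengths; the sequence is the 2nd line of each 4-line FASTQ record."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            counts[len(line.strip())] += 1
    return counts


def base_coverage(n_reads, read_len, genome_size):
    """Nucleotide coverage C = (number of reads * read length) / genome size."""
    return n_reads * read_len / genome_size


def kmer_coverage(base_cov, read_len, k):
    """K-mer coverage Ck = C * (L - k + 1) / L."""
    return base_cov * (read_len - k + 1) / read_len
```

With made-up numbers, 1,000 reads of 100 bp on a 10,000 bp genome give base_coverage(1000, 100, 10000) = 10x, and kmer_coverage(10.0, 100, 31) = 7x at k = 31. The k-mer coverage is always lower than the nucleotide coverage, and it drops as k approaches the read length.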