velvet N50

bioenvisage

Member

Join Date: Oct 2009

Posts: 40
#1

velvet N50

04-06-2010, 02:13 AM

Hi, I am working on de novo assembly of Illumina reads with Velvet. I first trimmed the raw reads by quality and then ran Velvet on a subset of the reads, but the N50 is always low, e.g. 29 or 20. I have tried various parameters and nothing improved the N50. Any suggestions?
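
For readers unfamiliar with the metric: N50 is the contig length such that contigs of that length or longer hold at least half of the total assembled bases. A minimal Python sketch of the calculation follows. The function name and the example lengths are illustrative only and are not taken from any Velvet output in this thread.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths: total is 300, and 100 + 80 reaches the halfway mark.
print(n50([100, 80, 50, 30, 20, 20]))  # prints 80
```

A handful of long contigs dominates this statistic, so an N50 below the read length suggests that almost nothing is being extended beyond single reads.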
biomed

Junior Member

Join Date: Dec 2009

Posts: 6
#2

06-13-2010, 12:48 AM

The important parameters for Velvet are the k-mer length, exp_cov and cov_cutoff. Try varying these three to get a better N50.
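
One way to vary these parameters systematically is to generate the velveth and velvetg command lines for several k-mer lengths. The sketch below only builds and prints the commands; it does not run them. The helper name, output directory names and the file name reads.fastq are hypothetical, and it assumes single-end short reads in FASTQ format with velveth and velvetg on the PATH.

```python
def velvet_commands(out_dir, k, reads_fastq, exp_cov="auto", cov_cutoff="auto"):
    """Build velveth/velvetg argument lists for one k-mer length (nothing is executed)."""
    velveth = ["velveth", out_dir, str(k), "-fastq", "-short", reads_fastq]
    velvetg = ["velvetg", out_dir,
               "-exp_cov", str(exp_cov),
               "-cov_cutoff", str(cov_cutoff)]
    return velveth, velvetg


# Print the commands for a few odd k-mer lengths (placeholder file and directory names).
for k in (21, 25, 29):
    for cmd in velvet_commands(f"asm_k{k}", k, "reads.fastq"):
        print(" ".join(cmd))
```

Passing "auto" lets Velvet estimate exp_cov and cov_cutoff itself; explicit numbers can be substituted once the coverage is known.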
gridbird

Member

Join Date: Oct 2010

Posts: 16
#3

02-09-2011, 10:43 AM

What is your read length, and what is the estimated genome length? If the sequencing coverage is very high (>= 200) and uneven, the N50 from Velvet tends to be very low because of sequencing errors and SNPs, even if you use a subset of the reads and tune the important parameters (k-mer length, exp_cov and cov_cutoff).
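
Read length and genome length matter because together they determine the coverage. As a rough sketch: base coverage is the number of reads times the read length divided by the genome size, and the Velvet manual relates it to k-mer coverage as Ck = C * (L - k + 1) / L. The numbers below are placeholders, not values from this thread.

```python
def base_coverage(num_reads, read_length, genome_size):
    """Average number of times each genome base is covered by a read."""
    return num_reads * read_length / genome_size


def kmer_coverage(coverage, read_length, k):
    """Convert base coverage C to k-mer coverage: Ck = C * (L - k + 1) / L."""
    return coverage * (read_length - k + 1) / read_length


# Placeholder inputs: 1,000,000 reads of 36 bp on an assumed 150,000 bp genome.
c = base_coverage(1_000_000, 36, 150_000)
for k in (23, 27, 31):
    print(k, round(kmer_coverage(c, 36, k), 1))
```

With 36 bp reads, a k-mer length of 31 leaves only 6 k-mers per read, so the k-mer coverage is one sixth of the base coverage.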

Last edited by gridbird; 02-09-2011, 11:25 AM.
Thorondor

Member

Join Date: Feb 2011

Posts: 69
#4

02-11-2011, 01:30 AM

As gridbird already said, we need more information to help you. ;-)

If you have high coverage, you can choose a large k-mer length and use cov_cutoff to remove low-coverage contigs, which are normally short.
vtosha

Member

Join Date: May 2010

Posts: 36
#5

02-25-2011, 04:39 AM

We have this problem too. We had an excellent run for our samples (chloroplast genome), but assembly with Velvet gave an N50 lower than the read length (36). We tried all the Velvet parameters, but the maximum N50 was 29. Where might the problem be?
Thorondor

Member

Join Date: Feb 2011

Posts: 69
#6

02-25-2011, 05:31 AM

How many reads do you have? What is your estimated coverage? Which k-mer lengths did you try? How many contigs do you get? Do you get any long contigs? What coverage is stated in the IDs of the contigs?

I really can't say where the problem might be from the amount of information you have given.

Anyway, the best way to address Velvet problems is through the mailing list:

EBI-EMBL Mailman list

http://listserver.ebi.ac.uk/mailman/listinfo/velvet-users

Zerbino is also very active on this mailing list.
vtosha

Member

Join Date: May 2010

Posts: 36
#7

02-25-2011, 06:00 AM

17 million single-end reads from one lane. Coverage is near 200 (but there may be contamination from the nuclear genome). Read length is 36. K-mer lengths from 23 to 31. Number of contigs from 300 to 2,500. N50 from 12 to 21. We also tried with 1 million and 100 thousand reads, but the result was only a little better (N50 54). Maximum contig length is near 100 nucleotides. What are the "ids"?
(454 data from this material gave a chloroplast genome map.)

Last edited by vtosha; 02-25-2011, 06:07 AM.
Thorondor

Member

Join Date: Feb 2011

Posts: 69
#8

02-25-2011, 06:16 AM

I mean the tag (ID) of the contigs. They look like >NODE_x_length_xxxxx_cov_xxxxx.xxxxxx, so you can check what coverage Velvet assigns to the contigs.
Did you also set the parameter -unused_reads yes, to check how many reads Velvet does not use? Do you do any quality trimming before using Velvet?

A coverage near 200 should give you better results, so something seems to be wrong. :-/ Have you tried another assembler?
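
To pull the length and coverage out of the contig IDs in bulk, something like the following Python sketch could be used. It assumes headers of the form >NODE_<n>_length_<len>_cov_<cov> (the node number is treated as optional), and the example header lines are made up for illustration. The Velvet manual describes these lengths and coverages as being in k-mers, so adding k - 1 to a length should give nucleotides; please verify that for your version.

```python
import re

# Node number is optional so that both NODE_7_length_... and NODE_length_... match.
HEADER_RE = re.compile(r"^>NODE_(?:(\d+)_)?length_(\d+)_cov_([0-9.]+)")


def contig_stats(fasta_lines):
    """Return (length, coverage) tuples parsed from Velvet-style FASTA headers."""
    stats = []
    for line in fasta_lines:
        match = HEADER_RE.match(line)
        if match:
            stats.append((int(match.group(2)), float(match.group(3))))
    return stats


# Made-up example records, not real assembly output.
example = [
    ">NODE_1_length_54_cov_1012.500000",
    "ACGTACGT",
    ">NODE_7_length_20_cov_3.250000",
    "TTGCA",
]
for length, cov in contig_stats(example):
    print(length, cov)
```

In practice the lines would come from iterating over the contigs.fa file in the Velvet output directory.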
vtosha

Member

Join Date: May 2010

Posts: 36
#9

02-25-2011, 06:30 AM

The coverage in the contig IDs is near 1000.
We didn't set the -unused_reads parameter ourselves, but Velvet reports how many reads it uses: 100 thousand to 1 million of the 17 million reads. When we assemble 1 million or 100 thousand reads, it uses 82 thousand of the 1 million and 1,000 of the 100 thousand. No, we didn't trim the reads.
We tried Edena, with no good results (a contig of 134 nucleotides and no BLAST hit to anything).
No BLAST hits to adapters or primers.
Could the problem be excessive PCR amplification?
Thorondor

Member

Join Date: Feb 2011

Posts: 69
#10

02-25-2011, 06:36 AM

Well, when Velvet uses only 1 million out of 17 million reads, there seems to be a quality issue with the reads. And when your contigs have a coverage around 1000, it looks like they come from repetitive sequences.
gridbird

Member

Join Date: Oct 2010

Posts: 16
#11

02-25-2011, 09:46 AM

I think sequencing errors are causing this problem. Velvet does not give a good N50 when high coverage is combined with sequencing errors. Did you check the Velvet paper? For error-free reads, Velvet always gets a good N50 no matter how high the coverage is. But for real reads, the N50 drops as coverage increases, because of sequencing errors. You can randomly subsample the reads to several coverages, such as 10, 50, 100, 150 and 200, assemble each subset with Velvet, and pick the one with a good N50.
Did you try an error correction program, such as SHREC or Quake, to correct sequencing errors before assembly? You can also use SolexaQA to trim low-quality reads before assembly.
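
The random subsampling step could be sketched in Python as below. This is only one possible approach: it keeps each read independently with a fixed probability, so the resulting coverage is approximate. It assumes plain 4-line FASTQ records, and the function names and tiny example records are hypothetical.

```python
import random
from itertools import islice


def fraction_for_coverage(target_coverage, current_coverage):
    """Fraction of reads to keep in order to go from current to target coverage."""
    return min(1.0, target_coverage / current_coverage)


def subsample_fastq(lines, fraction, seed=0):
    """Yield the lines of randomly kept 4-line FASTQ records."""
    rng = random.Random(seed)  # fixed seed so a subset can be reproduced
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:  # end of input (or a truncated final record)
            break
        if rng.random() < fraction:
            yield from record


# Made-up two-read example; real input would be the lines of a FASTQ file.
reads = ["@r1", "ACGT", "+", "IIII", "@r2", "TTGA", "+", "IIII"]
for target in (10, 50, 100, 150, 200):
    frac = fraction_for_coverage(target, 200)
    print(target, frac, len(list(subsample_fastq(reads, frac))) // 4)
```

Each subset would then be written to its own file and assembled separately, and the N50 values compared.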
diptarka

Member

Join Date: Mar 2013

Posts: 10
#12

07-19-2013, 03:38 AM

velvet

Hi guys,
I have been working on a yeast strain. I have raw paired-end Illumina reads. Is there any way to find out the read length and coverage from the sequence data itself? I am trying to use Velvet to assemble the genome. However, I am new to Velvet and have some questions. What is the optimization criterion for the k-mer length? Second, how can the contigs obtained from velvetg be used to generate the full genome sequence of the organism? And how can genes then be predicted from the sequence?
mastal

Senior Member

Join Date: Mar 2009

Posts: 666
#13

07-19-2013, 05:50 AM

velvet N50

You can get the read length by having a look at your fastq files, and FastQC will also give you the read length.
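
If you would rather check the read length programmatically, a small Python sketch like the one below tallies sequence lengths. It assumes plain 4-line FASTQ records, where the sequence is the second line of each record; the function name and example records are made up for illustration.

```python
from collections import Counter


def read_length_counts(fastq_lines):
    """Count how many reads have each sequence length in 4-line FASTQ input."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of every record is the sequence
            counts[len(line.strip())] += 1
    return counts


# Made-up records; real input would be an open FASTQ file handle.
example_reads = ["@r1", "ACGTAC", "+", "IIIIII", "@r2", "ACGT", "+", "IIII"]
print(read_length_counts(example_reads))
```

Multiplying the total read count by the read length and dividing by the expected genome size then gives a rough coverage estimate.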

There is a script called velvetk that will calculate kmer coverage from your fastq files before you run velvet.

See

http://www.vicbioinformatics.com/software.velvetk.shtml

You may also find VelvetOptimiser and Velvet Advisor useful.

http://bioinformatics.net.au/software.velvetoptimiser.shtml

Velvet Advisor

http://dna.med.monash.edu.au/~torsten/velvet_advisor/