SEQanswers

08-06-2013, 07:31 AM   #1
nareshvasani
Member
Location: NC
Join Date: Apr 2013
Posts: 57

Velveth and velvetg use

Hello Everyone,

This is my first time using velveth and velvetg.
I have around 5 million reads, ranging from 50 to 300 bp.
I used the command below, and it worked just fine:
# velveth auto 31,45,2 -fastq -short -inputfile
Output: it gave me 7 files, for k-mer lengths 31, 33, 35, 37, 39, 41 and 43.

Can anybody please suggest which k-mer length to select? Do I need to specify long or short reads in the command?

How do I execute the velvetg command?
What cutoff and min_contig_length should I use?

Thanks in advance!

08-06-2013, 08:44 AM   #2
winsettz
Member
Location: US
Join Date: Sep 2012
Posts: 91

This question rightly belongs in http://seqanswers.com/forums/forumdisplay.php?f=27
which is the de novo assembly forum.

When you say "50-300 bp", are you referencing the length of what velvet calls inserts?

As for which k-mer length to use, I refer you back to the manual:

Quote:
5.2 Choice of hash length k
The hash length is the length of the k-mers being entered in the hash table. Firstly, you must observe three technical constraints:
• it must be an odd number, to avoid palindromes. If you put in an even number, Velvet will just decrement it and proceed.
• it must be below or equal to MAXKMERHASH length (cf. 2.3.3, by default 31bp), because it is stored on 64 bits
• it must be strictly inferior to read length, otherwise you simply will not observe any overlaps between reads, for obvious reasons.
Now you still have quite a lot of possibilities. As is often the case, it’s a tradeoff between specificity and sensitivity. Longer kmers bring you more specificity (i.e. less spurious overlaps) but lowers coverage (cf. below)... so there’s a sweet spot to be found with time and experience.
Experience shows that kmer coverage should be above 10 to start getting decent results. If Ck is above 20, you might be “wasting” coverage. Experience also shows that empirical tests with different values for k are not that costly to run!
5.3 Choice of a coverage cutoff
Velvet was designed to be explicitly cautious when correcting the assembly, to lose as little information as possible. This consequently will leave some obvious errors lying behind after the Tour Bus algorithm (cf. 7) was run. To detect them, you can plot out the distribution of k-mer coverages (5.2), using plotting software (I use R).
http://www.ebi.ac.uk/~zerbino/velvet/Manual.pdf
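
The k-mer coverage (Ck) in the extract above can be estimated from nucleotide coverage. The Velvet manual relates the two as Ck = C * (L - k + 1) / L, where C is the nucleotide coverage, L the read length and k the hash length. A rough sketch of that arithmetic follows. The function name kmer_coverage and the example values (30x coverage, 100 bp reads) are made up for illustration, and with reads spanning 50-300 bp a single L is only an approximation:

```shell
# Sketch: k-mer coverage from nucleotide coverage, Ck = C * (L - k + 1) / L.
# $1 = nucleotide coverage C, $2 = read length L, $3 = hash length k
kmer_coverage() {
    awk -v c="$1" -v l="$2" -v k="$3" \
        'BEGIN { printf "%.2f\n", c * (l - k + 1) / l }'
}

# Illustrative values only: 30x coverage, 100 bp reads, three candidate k values.
for k in 31 37 43; do
    printf 'k=%s  Ck=%s\n' "$k" "$(kmer_coverage 30 100 "$k")"
done
```

With these example numbers Ck falls as k grows, which is the specificity versus coverage tradeoff the manual describes; the manual's rule of thumb is to keep Ck above 10.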

velvetg is simply

Code:
velvetg auto
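
For the cutoff and minimum contig length asked about in the first post, velvetg accepts -cov_cutoff (a number, or auto) and -min_contig_lgth. The sketch below only prints one command per k rather than running anything. It assumes that a velveth run with 31,45,2 wrote one directory per k named auto_31 to auto_43 (check what is actually on disk), and the 200 bp minimum is an arbitrary example value, not a recommendation. The helper name velvetg_cmd is hypothetical:

```shell
# Dry-run sketch: print a velvetg command line instead of executing it.
velvetg_cmd() {
    # $1 = velveth output directory, $2 = minimum contig length in bp
    echo "velvetg $1 -cov_cutoff auto -min_contig_lgth $2"
}

# Assumed directory names auto_31 ... auto_43 (k = 31 to 43 in steps of 2).
k=31
while [ "$k" -lt 45 ]; do
    velvetg_cmd "auto_$k" 200   # 200 bp is an arbitrary example value
    k=$((k + 2))
done
```

Running the printed commands and comparing the assemblies across k is the kind of empirical test the manual suggests.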
This would also be a good time to ask what you are assembling, and whether you have gotten your feet wet with a de novo assembly for which there is a known "answer", such as E. coli MG1655.

08-06-2013, 09:11 AM   #3
nareshvasani
Member
Location: NC
Join Date: Apr 2013
Posts: 57

Hi winsettz

My FASTQ file has reads 50-300 bp long, and all are single-end reads.

So I was wondering which command to execute.
For example:
velveth auto 31 -fastq -short -inputfile

or

velveth auto 31 -fastq -long -inputfile

08-06-2013, 10:29 AM   #4
winsettz
Member
Location: US
Join Date: Sep 2012
Posts: 91

Quote:
Originally Posted by nareshvasani
My FASTQ file has reads 50-300 bp long, and all are single-end reads.

So I was wondering which command to execute.
For example:
velveth auto 31 -fastq -short -inputfile

or

velveth auto 31 -fastq -long -inputfile
Again, from the Velvet manual:

Quote:
5.6 What’s long and what’s short?
Velvet was pretty much designed with micro-reads (e.g. Illumina) as short and short to long reads (e.g. 454 and capillary) as long. Reference sequences can also be thrown in as long.
That being said, there is no necessary distinction between the types of reads. The only constraint is that a short read be shorter than 32kb. The real difference is the amount of data Velvet keeps on each read. Short reads are presumably too short to resolve many repeats, so only a minimal amount of information is kept. On the contrary, long reads are tracked in detail through the graph.
This means that whatever you call your reads, you should be able to obtain the same initial assembly. The differences will appear as you are trying to resolve repeats, as long reads can be followed through the graph. On the other hand, long reads cost more memory. It is therefore perfectly fine to store Sanger reads as “short” if necessary
Illumina data is definitely short-read, whereas for something like PacBio you will need to decide this beforehand. 454 and Sanger reads will also likely meet Velvet's definition of a short read.
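
Both constraints quoted in this thread (k strictly below the read length, and short reads under 32 kb) depend on the actual read lengths in the file. The sketch below is one way to check them. It assumes a plain, uncompressed FASTQ with exactly four lines per record, and the function name read_length_summary and the file name reads.fastq are hypothetical:

```shell
# Sketch: summarise read lengths in a 4-line-per-record FASTQ file and count
# reads whose length is not strictly greater than k.
read_length_summary() {
    # $1 = FASTQ file, $2 = hash length k
    awk -v k="$2" '
        NR % 4 == 2 {                     # sequence line of each record
            n = length($0)
            if (count == 0 || n < min) min = n
            if (n > max) max = n
            if (n <= k) not_longer++      # manual: k must be < read length
            count++
        }
        END {
            printf "reads=%d min=%d max=%d not_longer_than_k=%d\n",
                   count, min, max, not_longer
        }' "$1"
}

# Example call (reads.fastq is a hypothetical file name):
#   read_length_summary reads.fastq 31
```

If max is far below 32 kb, the file can be passed as either -short or -long; a large not_longer_than_k count suggests trying a smaller k.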

08-06-2013, 10:38 AM   #5
nareshvasani
Member
Location: NC
Join Date: Apr 2013
Posts: 57

winsettz

Thanks a lot.

This FASTQ file was generated on an Ion Torrent Proton instrument.
So I don't know whether to treat this file as short or long.

08-06-2013, 01:31 PM   #6
mastal
Senior Member
Location: uk
Join Date: Mar 2009
Posts: 667

If you read the extract from the manual posted above, it tells you that for reads of your size it really doesn't matter whether you call them short or long; you will get the same initial assembly.

08-07-2013, 07:19 AM   #7
nareshvasani
Member
Location: NC
Join Date: Apr 2013
Posts: 57

Mastal

Thanks a lot!

Tags
bioinfomatics, denovo assembly, rna sequencing

All times are GMT -8.