SEQanswers


sfh838t 04-30-2014 06:19 AM

velvet columbus reference errors
 
I have seen some posts about using velvet with a reference, but they have not helped me.
I am trying to assemble a plant genome using a related species as a reference. I am trying to use velvet columbus, but I can't seem to get the command line right.
Following the manual, I am using:
velveth dir_name 31 -reference ref.fasta -sam illumina_align.sam
This way I get: -sam file cannot contain reference sequence.
If I add -short or -long in front of the -sam switch, it seems to ignore the -reference switch, and I get an error that my "read1" (which is supposed to be the reference) is too long.
I have removed from the SAM file anything that might have something to do with the reference, but I still get the same error.
I am not good with the Linux command line, and I am getting the impression that I am forgetting some little space, comma, slash or something.
I used bowtie2 to align the single-end Illumina reads. Does someone know if bowtie puts the reference sequence into its SAM output anywhere? Or, since all the posts about velvet columbus I have been able to find deal with PE input, can velvet with a reference only be used with PE reads?

ctseto 05-09-2014 11:35 AM

By its nature, SAM has the alignment in it (SAM = Sequence Alignment/Map).

First: did you sort your SAM file? (You will need samtools for this.)

Checking the manual again, it looks like it should be:

Code:

velveth $FOLDERNAME $KMER -reference ref.fasta -shortPaired -sam illumina_align.sam

Can you post the first few lines of your SAM file by using:

Code:

head illumina_align.sam

Since you already had to align your reads to the reference, have you tried looking at the alignments? Are there dips in coverage, rearrangements, things like that?
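Alongside head, a quick summary of the file can help answer whether the reference ended up in the SAM output. A minimal sketch, assuming a plain-text SAM file: it counts the @SQ header lines (these carry each reference sequence's name and length, not its bases) and the alignment records. The function name sam_summary is made up for illustration.

```shell
# sam_summary FILE.sam
# Prints how many @SQ header entries and how many alignment records the file has.
sam_summary() {
    sam=$1
    # @SQ lines describe reference sequences (name and length only).
    n_sq=$(grep -c '^@SQ' "$sam" || true)
    # Every line that does not start with '@' is an alignment record.
    n_aln=$(grep -vc '^@' "$sam" || true)
    echo "reference_entries=$n_sq alignment_records=$n_aln"
}

# Example call (file name taken from the thread):
# sam_summary illumina_align.sam
```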

mastal 05-09-2014 12:08 PM

Quote:

Originally Posted by sfh838t (Post 139166)
velveth dir_name 31 -reference ref.fasta -sam illumina_align.sam

If you have single-end reads, you should also have '-short' before the '-sam' switch.
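To double-check whether the aligner treated the reads as single-end or paired, one can look at the FLAG column (column 2) of the SAM records: bit 0x1 is set for reads that were sequenced as part of a pair. A rough sketch, assuming a plain-text SAM file; count_paired_flags is an illustrative name, not part of any tool.

```shell
# count_paired_flags FILE.sam
# Counts alignment records with and without the "paired" FLAG bit (0x1).
count_paired_flags() {
    awk -F '\t' '
        # Skip header lines; an odd FLAG value means bit 0x1 is set.
        !/^@/ { if ($2 % 2 == 1) paired++; else single++ }
        END   { printf "paired=%d single=%d\n", paired, single }
    ' "$1"
}

# Example call (file name taken from the thread):
# count_paired_flags illumina_align.sam
```

If every record lands in the single count, the file holds single-end alignments.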

Have a look at the Columbus manual.
http://www.ebi.ac.uk/~zerbino/velvet...bus_manual.pdf

I would leave the header lines in the SAM file.
The SAM file should be sorted by read name. By default, samtools sorts files by chromosome and alignment position.
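With samtools, name sorting is requested with the -n option of samtools sort; the exact arguments differ between samtools versions, so check the help for the installed one. As a version-independent alternative, here is a minimal sketch using only standard Unix tools, assuming a plain-text SAM file. It keeps the header lines on top and sorts the records on the read name column. The function name sort_sam_by_name is hypothetical, and the plain byte-wise ordering used here may differ from the ordering samtools produces.

```shell
# sort_sam_by_name IN.sam OUT.sam
# Writes a copy of IN.sam with header lines first and records sorted by read name.
sort_sam_by_name() {
    in_sam=$1
    out_sam=$2
    # Header lines start with '@'; keep them at the top in their original order.
    grep '^@' "$in_sam" > "$out_sam" || true
    # Sort alignment records on column 1 (the read name), tab-delimited, stable.
    grep -v '^@' "$in_sam" | LC_ALL=C sort -t "$(printf '\t')" -s -k1,1 >> "$out_sam"
}

# Example call (output file name is made up):
# sort_sam_by_name illumina_align.sam illumina_align.namesorted.sam
```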

Also check the requirements for the reference.fasta file.

sfh838t 05-12-2014 04:54 AM

Thanks for any and all replies :)
mastal: if I put -short before the -sam switch, velvet seems to ignore the -reference switch. It then reads both files, puts them together into one file and, like I said, I get the error that "read 1" is too long.

ctseto: I used both sorted and unsorted files. I used unsorted at first, then noticed that was wrong and then used the very same file(s) that someone else tried for me (and that ran perfectly fine for them) and I ALWAYS get the very same error. Since I want to do an assembly I used the same entire reference seq for velvet that I used for bwa.

I have looked at the alignments (IGV) and yes, there are areas with lots of coverage, and then some without. I am working with a plant, and I do know that there will be lots and lots of repeat elements. But honestly, I do not see how that would stop velvet from even reading my SAM file, because the error shows up as soon as the SAM file is opened. It takes about 3 seconds flat :).

Any further suggestions anyone?


All times are GMT -8.
