  • velvet columbus reference errors

    I have seen some posts about using velvet with a reference, but they have not helped me.
    I am trying to assemble a plant genome using a related species as reference. Trying to use velvet columbus, but I don't seem to be able to get the command line right.
    following the manual, using:
    velveth dir_name 31 -reference ref.fasta -sam illumina_align.sam
    This way I get: "-sam file cannot contain reference sequence".
    If I add -short or -long in front of the -sam switch, it seems to ignore the -reference switch and I get an error that my "read1" (which is supposed to be the reference) is too long.
    I have removed from the sam file anything that might have something to do with the reference, but I still get the same error.
    I am not good with the Linux command line and I get the impression I am forgetting some little space, comma, slash or something.
    I used bowtie2 to align the single-end Illumina reads. Does someone know if bowtie puts the reference sequence into its sam output anywhere? Or, since all the posts about velvet columbus I have been able to find deal with PE input, can velvet with a reference only be used with PE reads?

  • #2
    By its nature, SAM has the alignment in it (SAM = Sequence Alignment/Map).

    First: Did you sort your SAM file? (You will need samtools for this.)

    Checking the manual again, it looks like it should be:

    Code:
    velveth $FOLDERNAME $KMER -reference ref.fasta -shortPaired -sam illumina_align.sam
    Can you post the first few lines of your sam file by using

    Code:
    head illumina_align.sam
    Since you already have to do the alignment of your reads to the reference, have you tried looking at the alignments? Are there dips in coverage or re-arrangements, things like that?
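
    Beyond `head`, a quick summary of the file can show whether the header lines are still present and which reference names the reads were aligned to. The following is only a rough sketch using standard Unix tools: the function name `sam_summary` is made up for illustration, and it assumes a plain-text SAM file (not BAM).

```shell
# Hypothetical helper: summarise a plain-text SAM file.
# Prints the number of header lines, the number of alignment records,
# and each distinct reference name (RNAME, column 3) found in the records.
sam_summary() {
    sam="$1"
    # Header lines start with '@' (for example @HD, @SQ, @PG).
    headers=$(grep -c '^@' "$sam" || true)
    # Every other line is an alignment record.
    records=$(grep -vc '^@' "$sam" || true)
    echo "header_lines=$headers"
    echo "alignment_records=$records"
    # Column 3 holds the reference name; '*' marks an unaligned read.
    grep -v '^@' "$sam" | cut -f3 | sort -u | sed 's/^/rname=/'
}
```

    For example, `sam_summary illumina_align.sam` should list the same sequence names that appear in ref.fasta; if it does not, that mismatch is worth checking first.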

    • #3
      Originally posted by sfh838t
      velveth dir_name 31 -reference ref.fasta -sam illumina_align.sam
      If you have single end reads, you should also have '-short' before the '-sam' switch.

      Have a look at the Columbus manual.


      I would leave the header lines in the sam file.
      The sam file should be sorted by read name. The default with samtools is to sort the files by chromosome and alignment position.
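
      As a rough sketch of the suggestion above, the header lines can be kept at the top while the alignment records are sorted by read name (column 1) with plain Unix tools. The function name `sam_sort_by_name` is made up for illustration, and it assumes a plain-text SAM file and that a simple byte-order sort on the read name is what velvet expects; `samtools sort -n` is the usual alternative.

```shell
# Hypothetical helper: sort a plain-text SAM file by read name (QNAME),
# keeping the '@' header lines first and in their original order.
sam_sort_by_name() {
    in_sam="$1"
    out_sam="$2"
    # Copy header lines through unchanged (none present is not an error).
    grep '^@' "$in_sam" > "$out_sam" || true
    # Sort the alignment records on column 1 only; LC_ALL=C gives a
    # locale-independent byte-order sort, -s keeps ties in input order.
    grep -v '^@' "$in_sam" \
        | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -s >> "$out_sam"
}
```

      Usage would be something like `sam_sort_by_name illumina_align.sam illumina_align.namesorted.sam`, where the output file name is just an example.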

      Also check the requirements for the reference.fasta file.
      Last edited by mastal; 05-09-2014, 12:15 PM.

      • #4
        Thanks for any and all replies.
        mastal: if I put -short before the -sam switch, velvet seems to ignore the -reference switch. It then reads both files, puts them together into one file and, like I said, I get the '"read 1" is too long' error.

        ctseto: I used both sorted and unsorted files. I used unsorted at first, then noticed that was wrong and then used the very same file(s) that someone else tried for me (and that ran perfectly fine for them) and I ALWAYS get the very same error. Since I want to do an assembly I used the same entire reference seq for velvet that I used for bwa.

        I have looked at the alignments (IGV) and yes, there are areas with lots of coverage, and then some without. I am working with a plant, and I do know that there will be lots and lots of repeat elements. But honestly, I do not see how that would stop velvet from even reading my sam file, because the error shows up as soon as the sam file is opened. It takes about 3 seconds flat.

        Any further suggestions anyone?
