  • Velvet and reads from different generations of Illumina

    Greetings colleagues!

    I am a second-year Ph.D. student currently taking a genome assembly class. Our class project is to de novo assemble a set of reads. The data set contains mate-pair reads from two generations of Illumina sequencing: some of the reads are 36 bp and some are 76 bp. The professor suggested that Velvet may produce a different assembly if I separate the reads by length and supply the two lengths in different files. However, since none of these are actually long (capillary) reads, I can't figure out how to run Velvet with all of that information.

    The velveth command I think I would run (I'm doing this in Linux) is...

    velveth [output directory] [kmer] -short -fasta short_forward.fa short_reverse.fa long_forward.fa long_reverse.fa -shortPaired short_shuffled.fa long_shuffled.fa -singletons short_singletons.fa long_singletons.fa

    Is this correct? I haven't tried it yet, mind you, but I cannot conceptually understand how, or whether, Velvet would be able to use this information properly.

    I have done the assembly with the data contained in single files successfully, but was hoping to generate something better if it's possible.

    Thank you for considering my question!

  • #2
    So, I stumbled onto the Velvet page on SEQwiki, and I think it answered my question. I'll try what is suggested there and let you all know how it works.



    • #3
      So, I was only partly successful: I cannot include any singleton data when I run my assemblies.

      Does anyone know how I might do this?



      • #4
        From reading the manual, I think this should work:
        velveth [output directory] [kmer] -shortPaired -fasta short_forward.fa short_reverse.fa -shortPaired2 long_forward.fa long_reverse.fa -short short_shuffled.fa long_shuffled.fa short_singletons.fa long_singletons.fa

        It depends on your insert sizes, of course. Use -shortPaired for the paired files of one insert size, and -shortPaired2 if you have a second set of paired-end files with a different insert size. For the singletons, just use -short and list the files after the option, like "-short short_singletons.fa long_singletons.fa".

        Btw, the manual is at http://www.ebi.ac.uk/~zerbino/velvet/, and I think it will answer most of your questions. Hope this helps.
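
As a rough illustration of the layout described above (one paired channel per insert size, with singletons going to the matching unpaired channel), here is a dry-run sketch. It only prints a command and does not run Velvet. It assumes the paired reads are already interleaved in one file per library (the *_shuffled.fa files named in this thread); the function name build_velveth_cmd, the output directory, and the k-mer value are placeholders chosen for the example.

```shell
# Dry-run sketch: prints a velveth command, does not run it.
# Assumptions: file names as in this thread; build_velveth_cmd, the output
# directory and the k-mer value are placeholders, not taken from the manual.
build_velveth_cmd() {
    out_dir=$1   # hypothetical output directory
    kmer=$2      # hypothetical k-mer (hash) length; Velvet expects an odd value

    printf 'velveth %s %s' "$out_dir" "$kmer"
    printf ' %s' -fasta -shortPaired  short_shuffled.fa    # 36 bp pairs, interleaved
    printf ' %s' -fasta -short        short_singletons.fa  # 36 bp reads with no mate
    printf ' %s' -fasta -shortPaired2 long_shuffled.fa     # 76 bp pairs, interleaved
    printf ' %s' -fasta -short2       long_singletons.fa   # 76 bp reads with no mate
    printf '\n'
}

build_velveth_cmd assembly_out 31
```

If the two libraries really do have different insert sizes, those values would then go to velvetg via -ins_length (for -shortPaired) and -ins_length2 (for -shortPaired2).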



        • #5
          Thank you for your help! I read the manual, and also the wiki, and what I ended up doing was...

          -short -fasta short_fwd.fa short_rev.fa \
          -shortPaired short_shuffled.fa \
          -short short_singletons.fa \
          -short2 long_fwd.fa long_rev.fa \
          -shortPaired2 long_shuffled.fa \
          -short2 long_singletons.fa ;

          And it worked just fine.
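
For readers wondering where the *_shuffled.fa files come from: they hold interleaved pairs, with each forward read immediately followed by its mate. The Velvet distribution includes a Perl script for this (shuffleSequences_fasta.pl). The sketch below shows the same idea in plain shell and awk. It assumes both input files list the reads in the same order and that every record is exactly two lines (header, then sequence); interleave_fasta is a made-up name for this example.

```shell
# Sketch: interleave two FASTA files so each forward read is followed by its mate.
# Assumptions: both files list reads in the same order, and every record is
# exactly two lines (header, then sequence). interleave_fasta is a made-up name.
interleave_fasta() {
    fwd=$1
    rev=$2
    awk -v rev="$rev" '
        NR % 2 == 1 { header = $0; next }   # odd lines: remember forward header
        {
            print header                    # forward record: header, sequence
            print $0
            # pull the next two lines (header, sequence) from the reverse file
            if ((getline rh < rev) > 0 && (getline rs < rev) > 0) {
                print rh
                print rs
            }
        }' "$fwd"
}

# Usage (writes the interleaved pairs to a new file):
#   interleave_fasta short_fwd.fa short_rev.fa > short_shuffled.fa
```

Multi-line FASTA records would need to be unwrapped first, since the sketch relies on the strict two-lines-per-read layout.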
