  • Velvet 1.2.10: why the big difference in results with -long vs -short w/ 250bp reads?

    Greetings.

    I am doing de novo assembly of a 23 Mb genome using paired-end Illumina MiSeq reads (250 bp reads, 400 bp insert, SD 130). The reads have been quality-trimmed, so their final lengths vary widely, with most around 190 bp. Assembling with -long/-longPaired versus -short/-shortPaired gives surprisingly different results. Any ideas why this happens, or which result is more reliable?

    Thanks!

    Commands:
    Code:
    velveth Genome1_71 71 -short -fastq reads_R1.trimmed.fastq.se reads_R2.trimmed.fastq.se  -shortPaired -separate -fastq reads_R1.trimmed.fastq.pe reads_R2.trimmed.fastq.pe
    velvetg Genome1_71 -exp_cov 43 -ins_length 407 -ins_length_sd 130
    
    velveth Genome1_71 71 -long -fastq reads_R1.trimmed.fastq.se reads_R2.trimmed.fastq.se  -longPaired -separate -fastq reads_R1.trimmed.fastq.pe reads_R2.trimmed.fastq.pe
    velvetg Genome1_71 -exp_cov 43 -ins_length_long 407 -ins_length_long_sd 130
    Results, short:
    Code:
    Final graph has 128642 nodes and n50 of 17324, max 332339, total 26267561, using 6042595/7501247 reads
    Results, long:
    Code:
    Final graph has 148426 nodes and n50 of 1610, max 28675, total 26984545, using 6083488/7501247 reads
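
To put the two summary lines side by side, here is a minimal sketch that pulls out the n50, the longest contig, and the percentage of reads used. The function name `summarize_velvet` is hypothetical (it is not part of Velvet), and the parsing assumes the "Final graph has ..." line format shown above.

```shell
#!/usr/bin/env bash
# summarize_velvet: hypothetical helper, not a Velvet command.
# Reads velvetg output on stdin and prints one summary per "Final graph" line.
summarize_velvet() {
  awk '/^Final graph has/ {
    for (i = 1; i <= NF; i++) {
      if ($i == "n50")   n50 = $(i + 2) + 0      # "+ 0" drops the trailing comma
      if ($i == "max")   max = $(i + 1) + 0
      if ($i == "using") split($(i + 1), r, "/")  # used/total reads
    }
    printf "n50=%d max=%d reads_used=%.2f%%\n", n50, max, 100 * r[1] / r[2]
  }'
}

# Usage with the two result lines from this post:
summarize_velvet <<'EOF'
Final graph has 128642 nodes and n50 of 17324, max 332339, total 26267561, using 6042595/7501247 reads
Final graph has 148426 nodes and n50 of 1610, max 28675, total 26984545, using 6083488/7501247 reads
EOF
```

On these numbers, read usage differs by less than one percentage point between the two runs, while the n50 differs by roughly a factor of ten.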

  • #2
    Zerbino tells us there shouldn't be any difference, but what you've found is interesting.

    Have you tried this without the singletons and just the paired reads?

    It's interesting that the -long flag slightly increases read utilization yet lowers your n50 so much. It looks like it is breaking up your contigs, since the longest one drops from 332339 to 28675...

    Does that 28675 fragment exist in your -short assembly?
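
One way to start checking this is to find the longest contig in each assembly so it can then be aligned against the other assembly with a tool of your choice. The sketch below is only an assumption about how one might do that: `longest_contig` is a hypothetical helper, and it measures lengths from the sequence lines of a FASTA file (such as the contigs.fa that velvetg writes) instead of trusting the header.

```shell
#!/usr/bin/env bash
# longest_contig: hypothetical helper, not a Velvet command.
# Prints the header and sequence length of the longest record in a FASTA file.
longest_contig() {
  awk '
    /^>/ { if (len > best) { best = len; best_name = name }  # close previous record
           name = substr($0, 2); len = 0; next }
         { len += length($0) }                               # sequence may span lines
    END  { if (len > best) { best = len; best_name = name }  # close last record
           print best_name, best }
  ' "$1"
}

# Example call (the directory name is a placeholder for wherever the -long run was written):
#   longest_contig Genome1_71_long/contigs.fa
```

Both commands in the original post write to Genome1_71, so the two runs would need separate output directories for both contigs.fa files to be available at once.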
