  • SOAPdenovo issue

    Hi everyone,

    I am getting some strange results after using SOAPdenovo. First I run prepare, then I run map and scaff as usual. My problem is that I load 6886 contigs into scaff, but later in the scaff results it says there is a total of 13772 contigs. Does anyone have an idea what might have happened? Below I paste the output I got from scaff; the discrepancy I am talking about is between "6886 contig(s) loaded" and "Total contigs 13772". Thanks!

    Scaff
    ********************

    Parameters: scaff -g ICH55

    ICH55.Arc: no such file or empty file!

    There are 1 grad(s), 277742070 read(s), max read len 101.
    Kmer size: 55
    There are 13772 edge(s) in edge file.
    Mask contigs with coverage lower than 0.3 or higher than 6.0, and strict length 0.
    Average contig coverage is 3, 0 contig(s) masked.
    Mask contigs shorter than 57, 134 contig(s) masked.
    0 arc(s) loaded, average weight is 0.
    6886 contig(s) loaded.
    Done loading updated edges.
    Time spent on loading updated edges: 2s.

    *****************************************************
    Start to load paired-end reads information.

    For insert size: 270
    Total PE links 102415205
    Normal PE links on same contig 101347414
    Incorrect oriented PE links 712334
    PE links of too small insert size 336470
    PE links of too large insert size 3060
    Correct PE links 9412
    Accumulated connections 1966
    Use contigs longer than 270 to estimate insert size:
    PE links 101326878
    Average insert size 95
    SD 175
    983 new connections.

    All paired-end reads information loaded.
    Time spent on loading paired-end reads information: 293s.

    *****************************************************
    Start to construct scaffolds.

    ***************************
    For insert size: 270
    Total PE links 982
    PE links to masked contigs 6
    On same scaffold PE links 0
    Cutoff of PE links to make a reliable connection: 3
    Active connections 1952
    Weak connections 850
    Weak ratio 43.5%
    7 circles removed.
    Start to remove transitive connection.
    Total contigs 13772
    Masked contigs 162
    Remained contigs 13610
    None-outgoing-connection contigs 12655 (92.983101%)
    Single-outgoing-connection contigs 894
    Multi-outgoing-connection contigs 17
    Cycle 1
    Two-outgoing-connection contigs 44
    Potential transitive connections 1
    Transitive connections 0
    Transitive ratio 0.0%
    Start to linearize sub-graph.
    Picked sub-graphs 58
    Connection-conflict 0
    Significant overlapping 55
    Eligible 0
    Bubble structures 0
    Mask repeats:
    Puzzles 43
    Masked contigs 36
    Start to remove transitive connection.
    Total contigs 13772
    Masked contigs 234
    Remained contigs 13538
    None-outgoing-connection contigs 12686 (93.706604%)
    Single-outgoing-connection contigs 845
    Multi-outgoing-connection contigs 1
    Cycle 1
    Two-outgoing-connection contigs 6
    Potential transitive connections 1
    Transitive connections 0
    Transitive ratio 0.0%
    Start to linearize sub-graph.
    Picked sub-graphs 6
    Connection-conflict 0
    Significant overlapping 6
    Eligible 0
    Bubble structures 0
    Non-strict linearization.
    Start to linearize sub-graph.
    Picked sub-graphs 5
    Connection-conflict 0
    Significant overlapping 4
    Eligible 0
    Bubble structures 0
    Start to mask puzzles.
    Masked contigs 3
    Remained puzzles 0
    Freezing done.

    Recover contigs.
    Total recovered contigs 0
    Single-route cases 0
    Multi-route cases 0

    All links loaded.
    Time spent on constructing scaffolds: 0s.

    The final rank

    *******************************
    Scaffold number 345
    In-scaffold contig number 6832
    Total scaffold length 17430856
    Average scaffold length 50524
    Filled gap number 0
    Longest scaffold 647306
    Scaffold and singleton number 6408
    Scaffold and singleton length 61984163
    Average length 9672
    N50 53501
    N90 4087
    Weak points 0

    *******************************

    Done with 345 scaffolds, 0 gaps finished, 424 gaps overall.

    Overall time spent on constructing scaffolds: 5m.
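For anyone comparing the numbers, here is a quick arithmetic check on the figures quoted in the log above. It is only a sketch: the variable names are made up for illustration, and the values are copied by hand from the output. The total is exactly twice the loaded count. One possible reading is that each contig is counted once per strand, but that is an assumption and is not stated anywhere in the log.

```python
# Figures copied by hand from the scaff log above (variable names are illustrative).
contigs_loaded = 6886          # "6886 contig(s) loaded."
edges_in_edge_file = 13772     # "There are 13772 edge(s) in edge file."
total_contigs = 13772          # "Total contigs 13772"
masked_first_pass = 162        # "Masked contigs 162"
remained_first_pass = 13610    # "Remained contigs 13610"
masked_second_pass = 234       # "Masked contigs 234"
remained_second_pass = 13538   # "Remained contigs 13538"

# Ratio between the reported total and the number of contigs loaded.
ratio = total_contigs / contigs_loaded
print(ratio)  # 2.0

# The total matches the edge count, and "remained" is total minus masked in both passes.
print(total_contigs == edges_in_edge_file)                          # True
print(total_contigs - masked_first_pass == remained_first_pass)     # True
print(total_contigs - masked_second_pass == remained_second_pass)   # True
```

So the log is internally consistent: 13772 is the edge count, and the masked/remained lines are bookkeeping against that total, not against the 6886 figure.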

  • #2
    SOAPdenovo reports the number of contigs longer than 100 bp as a quality assessment for the contig step. It also builds every contig down to a length of k+1, where k is the kmer size. All contigs from k+1 to 100 bp are kept for the later scaffolding stage; options allow them to be masked, but they are still going to be there.

    The confusion is that scaffolding reports on contigs of k+1 or larger, while other parts of the program talk about 100 bp+ contigs.

    Generally, many of the small contigs come from sequencing errors or sit at the boundary between two repeat segments: (repeat) (short contig) (repeat).
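To see how the two thresholds described above give different counts, here is a minimal sketch in Python. It is not part of SOAPdenovo: the helper names `contig_lengths` and `count_at_least` and the toy FASTA records are hypothetical. K = 55 is taken from the log in the first post, and treating "100 bp+" as "length >= 100" is an assumption.

```python
def contig_lengths(fasta_text):
    """Return the sequence length of each FASTA record in fasta_text."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def count_at_least(lengths, min_len):
    """Count how many lengths are >= min_len."""
    return sum(1 for n in lengths if n >= min_len)


K = 55  # kmer size from the log in the first post

# Toy contigs (hypothetical data): 56 bp, 80 bp and 150 bp.
toy_fasta = ">c1\n" + "A" * 56 + "\n>c2\n" + "C" * 80 + "\n>c3\n" + "G" * 150 + "\n"

lengths = contig_lengths(toy_fasta)
all_contigs = count_at_least(lengths, K + 1)   # what the scaffolding stage would see
long_contigs = count_at_least(lengths, 100)    # what a 100 bp+ summary would count

print(lengths)       # [56, 80, 150]
print(all_contigs)   # 3
print(long_contigs)  # 1
```

To try it on a real assembly, read your contig FASTA file into a string and pass it to `contig_lengths`; the two counts should differ by the number of contigs between k+1 and 99 bp.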
