
  • Velvet - hybrid assembly failure

    Dear Velvet users,

    I am trying to do a hybrid assembly with ABI SOLiD and Roche 454 reads, but I ran into an out-of-memory/swap error.
    We have a 2*8 Xeon workstation with 48 GB RAM and 24 GB swap, running the latest 64-bit Ubuntu Linux.

    Read statistics:

    SOLiD: mate pair, ~37 000 000 reads, 2*50 nt length
    454: fragment, ~160 000 reads, ~400 nt length

    Predicted genome size: 7Mb

    Steps:

    1.a ./saet_mp solidreads_F3.csfasta solidreads_F3_QV.qual 7000000 -numcores 16 -globalrounds 2 -qvupdate -qvhigh -nosampling OK!!!

    1.b ./saet_mp solidreads_R3.csfasta solidreads_R3_QV.qual 7000000 -numcores 16 -globalrounds 2 -qvupdate -qvhigh -nosampling OK!!!

    1.c ./encodeFasta.py -l -n -a 454reads.fna > samplebacteria.de (colorizer from Corona Lite package) OK!!!


    2. ./solid_denovo_preprocessor_v1.2.pl --run_type mates --output preproced_dir_name --f3 solidreads_F3.csfasta --r3 solidreads_R3.csfasta OK!!!


    3. ./velveth_de hashed/ 21 -shortPaired doubleEncoded_input.de -long samplebacteria.de OK!!!


    4. ./velvetg_de hashed/ -exp_cov 20 -ins_length 2600 -min_contig_lgth 200 -cov_cutoff 2

    At this step I ran out of memory/swap and the process was automatically killed by the OS.

    .
    .
    [3502.183999] 25532000 nodes visited
    [3502.252314] Concatenation...
    [3504.923849] Renumbering nodes
    [3504.923874] Initial node count 13837915
    [3505.706345] Removed 384358 null nodes
    [3505.706369] Concatenation over!
    [3505.706372] Clipping short tips off graph, drastic
    [3515.809932] Concatenation...
    [3560.510149] Renumbering nodes
    [3560.510170] Initial node count 13453557
    [3560.980761] Removed 3542018 null nodes
    [3560.980778] Concatenation over!
    [3560.980781] 9911539 nodes left
    [3561.090558] Writing into graph file hashed//Graph2...
    [6433.379904] Removing contigs with coverage < -2.000000...
    [6521.066093] Concatenation...
    [6573.221114] Renumbering nodes
    [6573.221139] Initial node count 9911539
    [6573.252071] Removed 0 null nodes
    [6573.252101] Concatenation over!
    [6573.691763] Concatenation...
    [6575.770354] Renumbering nodes
    [6575.770376] Initial node count 9911539
    [6575.789992] Removed 0 null nodes
    [6575.790011] Concatenation over!
    [6575.790476] Clipping short tips off graph, drastic
    [6576.074753] Concatenation...
    [6578.174284] Renumbering nodes
    [6578.174312] Initial node count 9911539
    [6578.193785] Removed 0 null nodes
    [6578.193809] Concatenation over!
    [6578.193811] 9911539 nodes left
    [6578.193971] Read coherency...
    [6578.891643] Identifying unique nodes
    [6579.250084] Done, 8196 unique nodes counted
    [6579.250108] Trimming read tips
    [6598.101153] Renumbering nodes
    [6598.101177] Initial node count 9911539
    [6605.287075] Removed 1511 null nodes
    [6605.500090] Renumbering nodes
    [6605.500105] Initial node count 9910028
    [6605.520381] Removed 0 null nodes
    [6605.520400] Confronted to 5 multiple hits and 15329 null over 16845
    [6605.520403] Read coherency over!
    [6610.547042] Starting pebble resolution...
    [6610.910790] Preparing to correct graph with cutoff 0.200000
    [6635.017686] Computing read to node mapping array sizes
    Killed
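
To see at a glance how the node total changes between stages, the counts can be pulled out of a log like the one above. This is only a minimal sketch: the sample lines are copied from the output above, and `node_counts` is a hypothetical helper written for this illustration, not part of Velvet.

```python
import re

# A few lines copied from the velvetg_de output quoted above.
LOG = """\
[3504.923874] Initial node count 13837915
[3505.706345] Removed 384358 null nodes
[3560.510170] Initial node count 13453557
[3560.980761] Removed 3542018 null nodes
[3560.980781] 9911539 nodes left
[3561.090558] Writing into graph file hashed//Graph2...
[6433.379904] Removing contigs with coverage < -2.000000...
[6573.221139] Initial node count 9911539
[6573.252071] Removed 0 null nodes
Killed
"""

# "[seconds] message" as printed by velvetg
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# The two message shapes that report a node total
COUNT = re.compile(r"^(?:Initial node count (\d+)|(\d+) nodes left)$")


def node_counts(log_text):
    """Return (seconds, node_count) pairs for lines that report a node total."""
    pairs = []
    for raw in log_text.splitlines():
        line = LINE.match(raw.strip())
        if line is None:
            continue  # e.g. the final "Killed" line has no timestamp
        seconds, message = float(line.group(1)), line.group(2)
        count = COUNT.match(message)
        if count:
            pairs.append((seconds, int(count.group(1) or count.group(2))))
    return pairs


counts = node_counts(LOG)
for seconds, n in counts:
    print(f"{seconds:10.1f} s  {n:>12,d} nodes")
```

In the quoted run the total is still 9911539 nodes after the "Removing contigs with coverage" step, so that step removed nothing from the graph.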

    Intermediate file sizes:

    55Gb Graph2
    2,7Gb sequences
    2,8Gb roadmaps
    533Mb pregraph

    What could be wrong? I have done a successful hybrid assembly of the same dataset with the CLC Genomics Workbench, which generated ~200 long contigs with a total length of ~7 Mb.

    Thank you for any ideas,

    Blaize

  • #2
    Hello Blaize,

    You should probably ask this question on the Velvet mailing list where Daniel Zerbino will most likely answer it:


    Greetings,
    Leonardo
    L. Collado Torres, Ph.D. student in Biostatistics.



    • #3
      Hi Leonardo,

      I've already done that (without any reply).

      Originally posted by lcollado:
      Hello Blaize,

      You should probably ask this question on the Velvet mailing list where Daniel Zerbino will most likely answer it:


      Greetings,
      Leonardo



      • #4
        Hello Blaize,

        I just checked the recent mails from the list (a few times) and I cannot find yours. It most likely didn't get sent to the mailing list, so I recommend that you re-send it.

        Greetings,
        Leonardo
        L. Collado Torres, Ph.D. student in Biostatistics.



        • #5
          Yeah, I've resent it.



          • #6
            Hi, Leonardo,

            Daniel recommends raising the cutoff value. I hope it'll help us!

            B
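
For readers wondering why the cutoff might matter here, a rough coverage estimate from the numbers in the first post can help. The Velvet manual relates k-mer coverage to nucleotide coverage as Ck = C * (L - k + 1) / L. The sketch below applies that relation under several assumptions that are not stated in the thread: that "~37 000 000 reads" counts individual 50 nt reads rather than pairs, that no reads were dropped during preprocessing, and that colour-space reads can be treated as 50 positions long. The function names are made up for this illustration.

```python
def nucleotide_coverage(n_reads, read_len, genome_size):
    """C = total sequenced bases / genome size."""
    return n_reads * read_len / genome_size


def kmer_coverage(cov, read_len, k):
    """Ck = C * (L - k + 1) / L, the relation given in the Velvet manual."""
    return cov * (read_len - k + 1) / read_len


GENOME_SIZE = 7_000_000  # predicted genome size from the first post
K = 21                   # hash length passed to velveth_de in step 3

# ASSUMPTION: 37 000 000 is the number of single 50 nt reads, not pairs.
solid_c = nucleotide_coverage(37_000_000, 50, GENOME_SIZE)
solid_ck = kmer_coverage(solid_c, 50, K)

roche_c = nucleotide_coverage(160_000, 400, GENOME_SIZE)
roche_ck = kmer_coverage(roche_c, 400, K)

print(f"SOLiD: C ~ {solid_c:.1f}x, Ck ~ {solid_ck:.1f}")
print(f"454:   C ~ {roche_c:.1f}x, Ck ~ {roche_ck:.1f}")
```

Under those assumptions the SOLiD k-mer coverage comes out near 159, far above the `-exp_cov 20` and `-cov_cutoff 2` used in step 4. That would be consistent with the advice to raise the cutoff, although it is only an estimate and not a statement of Daniel's reasoning.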



            • #7
              Did this work?

              Would love to hear if you actually pulled off this hybrid assembly.



              • #8
                Hi Temima,

                We got it done; we are evaluating the results now.

                B
