
  • #16
    No.

    @luc is referring to DISCOVAR DeNovo, which is an assembler meant for large genomes. It *requires* 2 x 250 bp reads, which currently can only be produced by a suitably equipped HiSeq 2500 (in the amount needed, so MiSeq practically does not count).

    On a different note, DISCOVAR DeNovo also needs ~1 TB(+) of RAM to function well (e.g. if you are going to use 500 million reads for assembly). You read that right!
    Last edited by GenoMax; 11-04-2015, 04:33 AM.



    • #17
      The one time I tried out DISCOVAR DeNovo on a mammalian genome, I had to borrow a 512 GB machine. As is all too typical for bioinformatics programs, DISCOVAR at times took up all CPUs and at other times poked along using a single CPU. Relevant lines from the log file follow:

      Code:
      physical memory: 504.74 GB
      
      using 708,777,836 reads
      data extraction complete, peak mem = 260.85 GB
      3.27 hours used extracting reads
      
      back from buildReadQGraph
      memory in use = 191.83 GB, peak = 405.28 GB
      
      1 peak mem usage = 405.28 GB
      2.42 minutes used loading stuff
      2 peak mem usage = 405.28 GB
      launching gap assemblies, mem usage = 179,701,415,936
      
      now processing 411707 blobs
      memory in use = 191.38 GB, peak = 405.28 GB
      
      contig line N50: 46,487
      scaffold line N50: 108,870
      total bases in 1 kb+ scaffolds: 2,223,980,361
      total bases in 10 kb+ scaffolds: 2,102,334,133
      There are 708,777,836 reads of mean length 229.9 and mean base quality 34.3.
      MPL1 = mean length of first read in pair up to first error = 199
      (normal range is 175-225 for 250 base reads)
      Estimated chimera rate in read pairs (including mismapping) = 0.46%.
      genomic read coverage, using 1 kb+ scaffolds for genome size estimate: 73.3
      
      peak mem usage = 405.28 GB, total time = 40.9 hours
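
The "genomic read coverage" figure in the log can be sanity-checked from the other numbers it reports. A minimal sketch, assuming coverage is simply total sequenced bases divided by the bases in 1 kb+ scaffolds (the function name `read_coverage` is made up for illustration, and this is not necessarily how DISCOVAR computes the value internally):

```python
def read_coverage(num_reads: int, mean_read_length: float, genome_size: int) -> float:
    """Average depth of coverage = total sequenced bases / genome size estimate."""
    return num_reads * mean_read_length / genome_size


# Values copied from the log above.
num_reads = 708_777_836            # "using 708,777,836 reads"
mean_read_length = 229.9           # "reads of mean length 229.9"
genome_size = 2_223_980_361        # "total bases in 1 kb+ scaffolds"

coverage = read_coverage(num_reads, mean_read_length, genome_size)
print(f"{coverage:.1f}x")
```

Under that assumption the result agrees with the 73.3 reported in the log.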

      Since I had mate-pairs, I followed up DISCOVAR with BESST and got a very nice 2.4 Gb genome with a max of 9.7 Mb, an N50 of 1.8 Mb, and 375 scaffolds at N50 or greater.

      My "go-to" default assembler (ABySS) only came up with a 2.4 Gb genome with a max of 2.6 Mb, an N50 of 230 kb, and 2,689 scaffolds at N50 or greater. So DISCOVAR/BESST is a nice option if you have the reads.
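
For anyone comparing assemblies the same way, here is a rough sketch of how N50 and the "scaffolds at N50 or greater" count are usually derived from a list of scaffold lengths. The function name `n50_stats` and the toy lengths are made up for illustration; real tools (the ABySS stats script, for instance) may apply a minimum-length cutoff first, which this sketch does not.

```python
def n50_stats(lengths):
    """Return (N50, L50) for a collection of scaffold lengths.

    N50 is the length of the scaffold at which the running total, taken from
    longest to shortest, first reaches half of the assembly size. L50 is the
    number of scaffolds needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no scaffold lengths given")


# Toy assembly of 300 bases total, not real data.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_stats(toy_lengths)
print(n50, l50)
```

In the toy case the two longest scaffolds (80 + 70) already cover half of the 300 bases, so N50 is 70 and two scaffolds are at N50 or greater. A higher N50 reached with fewer scaffolds, as in the DISCOVAR/BESST result above, indicates a more contiguous assembly.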



      • #18
        Ok, sorry, I thought that reads needed to be 250 bp long in total, not 2 x 250 bp. We cannot afford a HiSeq (and it would make no sense for us), but we could purchase a NextSeq and try de novo assembly (besides RNA-Seq and resequencing, but for those there is no problem).

        I work in a supercomputing center (but logically the sequencer will be used by a partner with a background in genetics), so those computational requirements wouldn't be a problem. I'll have a look at the DISCOVAR assembler.

        Thank you, best regards
