  • De novo assembler for 300 million Solexa reads

    Hi~ all ^^

    I'm currently trying to assemble a genome sequence, and I have about 300 million Solexa reads (paired-end; 220 bp insert size).

    When I use Velvet for the assembly, I get the error below:
    ---------------------------------------------------------------------------------
    velvetg: Can't calloc 18446744072010747658 InsertionMarkers totalling
    18446744046528688288 bytes: Cannot allocate memory
    Reading roadmap file ./Roadmaps
    301083362 roadmaps reads
    Creating insertion markers
    ---------------------------------------------------------------------------------

    So I would like to try other assemblers. If anybody has successfully assembled paired-end Solexa reads, please tell me which assembler you used and the run command.

    Server specification: 12 CPU, 72GB RAM

    Thanks for any comments.
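A note on the numbers in the velvetg message above: both sit just below 2^64, which is what a negative signed value looks like when it is printed as an unsigned 64-bit integer. The sketch below only reinterprets the two numbers from the log. That the cause is an overflowed counter inside Velvet is an inference from this arithmetic, not something confirmed in this thread.

```python
# Reinterpret the two values printed by velvetg as signed 64-bit integers.
TWO_64 = 1 << 64


def as_signed_64(value: int) -> int:
    """Return the two's-complement signed reading of an unsigned 64-bit value."""
    return value - TWO_64 if value >= (1 << 63) else value


marker_count = as_signed_64(18446744072010747658)  # "InsertionMarkers" count from the log
total_bytes = as_signed_64(18446744046528688288)   # requested allocation size from the log

print(marker_count)                 # -1698803958
print(total_bytes)                  # -27180863328
print(total_bytes // marker_count)  # 16, the bytes per marker implied by the message
```

Both values come out negative and their ratio is exactly 16, so the allocation request looks nonsensical rather than merely large. If that reading is right, this particular call would fail regardless of how much RAM the server has, although the assembly may well need more than 72 GB anyway.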

  • #2
    You might need almost 10x your 72 GB of memory to handle 300M reads.
    Use a subset of the reads, trim aggressively, and use stringent k-mer and coverage-cutoff (cvCut) settings.
    --
    Jeremy Leipzig
    Bioinformatics Programmer
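The "subset" and "trim" suggestions above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a dedicated trimming tool. The record layout, the function names, the Q20 threshold, the 30-base minimum length, and the Phred+64 quality offset are all assumptions made for the example; check which quality encoding your Solexa pipeline version actually writes.

```python
import random
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (read name, bases, quality string); hypothetical layout


def trim_3prime(record: Record, min_q: int = 20, offset: int = 64) -> Record:
    """Cut the read at the first base whose quality score falls below min_q."""
    name, seq, qual = record
    for i, ch in enumerate(qual):
        if ord(ch) - offset < min_q:
            return name, seq[:i], qual[:i]
    return record


def subsample_pairs(
    mates1: Iterable[Record],
    mates2: Iterable[Record],
    fraction: float,
    min_q: int = 20,
    offset: int = 64,
    min_len: int = 30,
    seed: int = 0,
) -> Iterator[Tuple[Record, Record]]:
    """Keep a random fraction of pairs, trim both mates, drop pairs left too short.

    Both mates are kept or dropped together so the two files stay in sync,
    which paired-end assemblers rely on.
    """
    rng = random.Random(seed)
    for r1, r2 in zip(mates1, mates2):
        if rng.random() >= fraction:
            continue
        r1 = trim_3prime(r1, min_q, offset)
        r2 = trim_3prime(r2, min_q, offset)
        if len(r1[1]) >= min_len and len(r2[1]) >= min_len:
            yield r1, r2
```

Feeding it the two mate files as parallel iterators of records and a fraction such as 0.25 gives a reproducible subset (fixed seed) that is still properly paired.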


    • #3
      I think a solution to your problem is to use SOAPdenovo or ABySS, which have lower memory requirements.

      Francesco



      • #4
        I second giving ABySS a try. I find its memory requirements are much more reasonable.

        However, if you do want to get Velvet working you could look at Curtain, which might help you out.



        • #5
          Hi natstreet,
          there is one thing I don't understand about Curtain. Is it a reference-assisted assembler (it uses MAQ to align reads against a reference and then improves the reference assembly), or, after assembling with Velvet or another assembler, does it map the reads onto the contigs with MAQ and then use the paired-read information to improve the assembly?

          Francesco



          • #6
            Sorry - I forgot to say that Curtain only works if you have a reference as a starting point. I don't have hands-on experience with it - I just came across it and thought it might be worth a pointer.

            Personally, I would give ABySS a try as a starting point.



            • #7
              Thanks for all the comments everyone. I'm going to use ABySS.
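For anyone following the same route, here is a small sketch of how a paired-end ABySS run is typically launched through its `abyss-pe` driver, which takes `name=`, `k=`, and `in=` as make-style variables, plus `np=` for MPI builds. The file names, the output name, and k=31 are placeholders chosen for the example and are not values reported in this thread.

```python
import shlex
from typing import List


def abyss_pe_command(name: str, k: int, mates1: str, mates2: str, np: int = 0) -> List[str]:
    """Build an abyss-pe argument list for one paired-end library.

    np > 1 adds the np= variable, which only makes sense with an MPI-enabled build.
    """
    args = ["abyss-pe"]
    if np > 1:
        args.append(f"np={np}")
    args += [f"name={name}", f"k={k}", f"in={mates1} {mates2}"]
    return args


# Placeholder inputs; substitute your own files and k-mer size.
cmd = abyss_pe_command("asm", 31, "reads_1.fq", "reads_2.fq")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can be passed straight to `subprocess.run`, and the printed form is quoted so it can be pasted into a shell.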



              • #8
                How was your experience with ABySS?

                Thanks,
                Jason



                • #9
                  In my experience, SOAPdenovo is better than ABySS for assembling a large Solexa read set.
                  I successfully completed assembly using SOAPdenovo. :-)
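The exact settings used for that run are not given in the thread. As a rough illustration only, the sketch below writes a minimal SOAPdenovo library configuration for one paired-end library with the 220 bp insert size mentioned in the first post. The file names and the 75-base maximum read length are assumptions, since the read length is never stated here.

```python
def soapdenovo_config(mates1: str, mates2: str, avg_ins: int = 220, max_rd_len: int = 75) -> str:
    """Return the text of a one-library SOAPdenovo config file.

    max_rd_len is a placeholder; set it to the real read length of your data.
    """
    lines = [
        f"max_rd_len={max_rd_len}",
        "[LIB]",
        f"avg_ins={avg_ins}",  # average insert size of the library
        "reverse_seq=0",       # 0 for standard short-insert paired-end reads
        "asm_flags=3",         # use the reads for both contig and scaffold building
        "rank=1",              # order in which libraries are used for scaffolding
        f"q1={mates1}",        # FASTQ file with the first mates
        f"q2={mates2}",        # FASTQ file with the second mates
    ]
    return "\n".join(lines) + "\n"


config_text = soapdenovo_config("reads_1.fq", "reads_2.fq")
print(config_text)
```

The resulting file is what gets passed with `-s` to the assembler, together with a k-mer size (`-K`) and an output prefix (`-o`). The name of the binary varies between releases, so check the one installed on your system.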



                  • #10
                    Originally posted by odysseus:
                    "In my experience, SOAPdenovo is better than ABySS for assembling a large Solexa read set.
                    I successfully completed assembly using SOAPdenovo. :-)"

                    What genome?
                    -drd



                    • #11
                      Can anyone tell me how much RAM is needed for 300M PE reads with SOAPdenovo? I tried SOAPdenovo and found that 264 GB of RAM was not enough to meet its requirements.
