  • De novo assembler for 300 million Solexa reads

    Hi~ all ^^

    I'm trying to assemble a genome from about 300 million Solexa reads (paired-end, 220 bp insert size).

    When I ran Velvet for the assembly, I got the error below:
    ---------------------------------------------------------------------------------
    velvetg: Can't calloc 18446744072010747658 InsertionMarkers totalling
    18446744046528688288 bytes: Cannot allocate memory
    Reading roadmap file ./Roadmaps
    301083362 roadmaps reads
    Creating insertion markers
    ---------------------------------------------------------------------------------

    So I'd like to try other assemblers. If anyone has successfully assembled paired-end Solexa reads, please tell me which assembler you used and the command you ran.

    Server specification: 12 CPU, 72GB RAM

    Thanks for any comments.
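
A note on the numbers in the error above: both values sit just below 2^64, which suggests a negative signed value was printed as an unsigned 64-bit integer. The Python sketch below decodes them. The variable names are made up for illustration, and the 32-bit wraparound step is an assumption about how the count overflowed, not something the Velvet log confirms.

```python
# Values copied verbatim from the velvetg error message.
reported_count = 18446744072010747658
reported_bytes = 18446744046528688288


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit value."""
    return value - 2**64 if value >= 2**63 else value


signed_count = as_signed64(reported_count)   # negative marker count
signed_bytes = as_signed64(reported_bytes)   # negative byte total

# The byte total divides evenly by the count, giving the per-marker size.
bytes_per_marker = signed_bytes // signed_count

# ASSUMPTION: the count overflowed a signed 32-bit integer exactly once,
# so the intended value is the signed count modulo 2**32.
assumed_true_count = signed_count % 2**32
assumed_true_bytes = assumed_true_count * bytes_per_marker

print("signed count:      ", signed_count)
print("bytes per marker:  ", bytes_per_marker)
print("assumed true count:", assumed_true_count)
print("assumed true GB:   ", round(assumed_true_bytes / 1e9, 1))
```

Under that assumption velvetg was trying to allocate about 2.6 billion markers at 16 bytes each, roughly 41.5 GB for this one structure alone, before counting the rest of the graph.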

  • #2
    you might need almost 10x that amount of memory to handle 300M reads
    use a subset of the reads, trim aggressively, and use stringent kmer and cvCut settings
    --
    Jeremy Leipzig
    Bioinformatics Programmer
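
The subsetting and trimming advice above can be sketched in Python as follows. This is a minimal illustration, not a replacement for a dedicated trimming tool: the function names, the quality cutoff of 20, and the Phred+33 offset are all assumptions (older Solexa/Illumina pipelines wrote other encodings, so check your files), and mates are assumed to appear in the same order in both files.

```python
import random
from itertools import islice


def read_fastq(lines):
    """Yield FASTQ records as 4-tuples (header, sequence, plus, quality)."""
    it = iter(lines)
    while True:
        rec = tuple(line.rstrip("\n") for line in islice(it, 4))
        if not rec:
            return
        if len(rec) != 4:
            raise ValueError("truncated FASTQ record")
        yield rec


def subsample_pairs(r1_lines, r2_lines, fraction, seed=0):
    """Keep each read pair with probability `fraction`.

    One random draw decides the fate of both mates, so the two outputs
    stay in sync. If one input is shorter, the extra records in the
    other are silently ignored by zip().
    """
    rng = random.Random(seed)
    for rec1, rec2 in zip(read_fastq(r1_lines), read_fastq(r2_lines)):
        if rng.random() < fraction:
            yield rec1, rec2


def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose quality is below `min_q`.

    ASSUMPTION: qualities are Phred scores encoded as ASCII with `offset`.
    A read of uniformly low quality is trimmed to an empty string.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]
```

With real data you would pass two open file handles (for example from `open()` or `gzip.open(..., "rt")`) in place of the line lists, trim both mates of each kept pair, and drop pairs where either mate ends up too short for the chosen k-mer length.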


    • #3
      I think one solution to your problem is to use SOAPdenovo or ABySS, which have lower memory requirements.

      Francesco



      • #4
        I second giving ABySS a try. I find its memory requirements are much more reasonable.

        However, if you do want to get Velvet working you could look at Curtain, which might help you out.



        • #5
          Hi natstreet
          There is something I don't understand about Curtain. Is it a reference-assisted assembler (it uses maq to align reads against a reference and then improves the reference assembly)? Or, after assembling with Velvet or another assembler, does it map the reads onto the contigs with maq and then use the paired-read information to improve the assembly?

          Francesco



          • #6
            Sorry - I forgot to say that Curtain only works if you have a reference as a starting point. I don't have hands-on experience with it; I just came across it and thought it might be worth a pointer.

            Personally, I would give ABySS a try as a starting point.



            • #7
              Thanks for all the comments everyone. I'm going to use ABySS.



              • #8
                How was your experience with ABySS?

                Thanks,
                Jason



                • #9
                  In my experience, SOAPdenovo is better than ABySS for assembling large Solexa read sets.
                  I successfully completed assembly using SOAPdenovo. :-)



                  • #10
                    Originally posted by odysseus
                    In my experience, SOAPdenovo is better than ABySS for assembling large Solexa read sets.
                    I successfully completed assembly using SOAPdenovo. :-)
                    what genome?
                    -drd



                    • #11
                      Can anyone tell me how much RAM is needed for 300M PE reads with SOAPdenovo? I tried SOAPdenovo and found that 264 GB of RAM was not enough.
