
  • toAmos Out of Memory!

    Hello everyone,

    I'm analysing Solexa reads from a bacterial genome currently unavailable. As a first step I'm trying to assemble the reads, and my pipeline includes AMOScmp as an important step in the assembly. However, my Solexa library has more than 11 million reads of 36 bp each, and I can't build the AFG file because of insufficient memory. I tried on several computers, and the best I reached was ~9 million reads in a tmp file (on a machine with 16 GB of RAM), but I can't get the job to complete.

    I have the same library in a variety of other formats, such as PRB (the original file), FASTQ (Solexa and Sanger/Standard), Fasta/Qual, and even the WGS assembler's FRG format. Although the last one is quite similar to AFG, I can't make toAmos convert FRG to AFG.

    Does anyone know how to convert FRG (wgs-assembler) to AFG (AMOS) in a memory-efficient way? Otherwise, does anyone know how to merge two AFG files into a single one?

    I tried to write a script for merging AFG files, but I'm still having some issues with the format definition, such as "What does FRG's eid mean?" or "How must I order the RED's frg?".

    Thanks in advance!

  • #2
    I'm not sure AMOScmp is what you want if you are doing 'de-novo' assembly. You said "analysing Solexa reads from a bacterial genome currently unavailable" - do you mean you don't have the sequence, i.e. you are doing 'de-novo' assembly?

    I think you may want to try an alternative assembly method.

    Do you have the complete sequence of a close relative? Could anyone suggest a good hybrid strategy in this case?

    I have been reading about the SAM format for short reads; I think this may be what you need in this case, rather than formats for WGS assembly of long reads.

    HTH,
    Dan.
    Homepage: Dan Bolser
    MetaBase the database of biological databases.


    • #3
      Hi Dan,

      Yes, I have the complete sequence of a close relative. And yes, it's a hybrid strategy that uses an AMOScmp run as one step among various re-assembly and filtering steps.

      Thanks,
      LRR


      • #4
        You can try software such as maq or Bowtie, which can map short reads to a reference genome. The result you get with such software may be thousands of contigs.
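
        On the memory problem from the first post: a minimal sketch, assuming standard four-line FASTQ records, of streaming the reads in fixed-size batches so that no single step has to hold all 11 million reads in RAM. The names read_fastq and batches are hypothetical helpers written for this illustration, not part of AMOS, maq or Bowtie, and the sketch does not write AFG itself.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# (read name, sequence, quality string)
FastqRecord = Tuple[str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record at a time from four-line FASTQ input.

    Assumes each record is exactly four lines (no wrapped sequences).
    """
    it = iter(lines)
    while True:
        chunk = list(islice(it, 4))
        if not chunk:
            return
        if len(chunk) < 4:
            raise ValueError("truncated FASTQ record")
        header, seq, plus, qual = (s.rstrip("\n") for s in chunk)
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        yield header[1:], seq, qual


def batches(records: Iterable[FastqRecord], size: int) -> Iterator[List[FastqRecord]]:
    """Group a record stream into lists of at most `size` records."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch
```

        Passing an open file handle to read_fastq keeps only one batch in memory at a time; each batch could then be written to its own file and converted or mapped separately.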
