
  • is my reference too big for maq?

    I am trying to convert my reference FASTA file to BFA using maq's fasta2bfa command.

    I am getting a segmentation fault.

    My script has worked before with older references, but it fails on the new one I just downloaded from Ensembl.

    I have noticed that it is somewhat bigger than usual: my reference is 11 GB.

    Could it be that maq cannot handle the file size, or should I run the program on a machine with more RAM?

  • #2
    Run fasta2bfa and, in another terminal, check the memory usage every 15-20 seconds with free -m (in megabytes) or free -g (in gigabytes). If the memory usage approaches your machine's total RAM just before the segmentation fault, that is your problem.



    • #3
      According to my network administrator, the cluster node I am using has 32 GB of RAM.

      So I don't think the cluster is the problem; it looks more like a C++ limitation in memory addressing.

      What other tests do you think I can run?

      I will try your suggestion.

      I did run a small experiment, though: I split the file, and 7 GB is also too big. :S



      • #4
        Hello,

        Apparently it is not a memory problem: I checked the memory resources just before the program crashes, and plenty is still free.

        The program crashes just after line 166 in the fasta2bfa.c source file.

        Can somebody help me out?



        • #5
          Originally posted by luisczul
          I am trying to convert my reference fasta file to bfa using maq with the command fasta2bfa.

          My script has worked before with other old references but I just downloaded the new one from Ensembl and this one doesn't work.

          In another forum there was a thread about someone having problems indexing the latest human assembly with blat. The problem was that the length of the new haplotype chromosomes pushed the overall genome length above what could be handled by a 32-bit pointer. There may be a similar issue with maq.

          Since the extra haplotype sequences are mostly poly-N (to keep the positions the same as the originals) you could either delete them altogether, or remove the Ns and see if things start working again.
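
To check the 32-bit theory on your own file, here is a minimal C sketch. It is not maq code: `count_fasta_bases`, `base_counts_t` and the `fits_in_*` helpers are names made up for illustration. It counts sequence characters with a 64-bit counter, tallies the Ns separately, and lets you compare the totals against the 32-bit limits. Whether maq actually keeps positions in a 32-bit type is an assumption that this thread does not confirm.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: total sequence characters and how many are N/n. */
typedef struct {
    uint64_t total;
    uint64_t n_count;
} base_counts_t;

/* Count sequence characters in a FASTA stream.
 * Header lines (starting with '>') and whitespace are skipped.
 * A 64-bit counter is used so the count itself cannot wrap at 4 GB. */
static base_counts_t count_fasta_bases(FILE *fp)
{
    base_counts_t counts = {0, 0};
    int c, in_header = 0, at_line_start = 1;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') {
            in_header = 0;
            at_line_start = 1;
            continue;
        }
        if (at_line_start && c == '>')
            in_header = 1;
        at_line_start = 0;
        if (in_header || c == '\r' || c == ' ' || c == '\t')
            continue;
        ++counts.total;
        if (c == 'N' || c == 'n')
            ++counts.n_count;
    }
    return counts;
}

/* Would this length fit in a signed / unsigned 32-bit integer? */
static int fits_in_int32(uint64_t n)  { return n <= (uint64_t)INT32_MAX; }
static int fits_in_uint32(uint64_t n) { return n <= UINT32_MAX; }
```

If `total` is above the 32-bit limit but `total - n_count` is below it, removing the Ns as suggested above is worth a try.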



          • #6
            Solved

            static void ma_fasta2csfa_core(FILE *fpout, FILE *fpin)
            {
                seq_t seq;
                int i, c1, c2, c;
                char name[256], comment[4096];
                INIT_SEQ(seq);

            So the problem finally got solved.

            The reference file I had downloaded had very long header lines.

            As you can see in the code I pasted from maq, the name buffer is limited to char name[256]. When a header went over that limit, it caused a segmentation fault.

            Cheers, hope this helps somebody.
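
For anyone who wants to check a reference before running maq, here is a minimal C sketch that finds the longest header line in a FASTA file. It is not part of maq: `longest_fasta_header`, `header_fits` and `NAME_BUF_SIZE` are names made up for illustration. The thread does not say how maq splits a header between `name` and `comment`, so comparing the whole header line against 256 is a deliberately conservative assumption.

```c
#include <stddef.h>
#include <stdio.h>

/* Size of the name buffer in the pasted maq snippet: char name[256]. */
#define NAME_BUF_SIZE 256

/* Return the length of the longest FASTA header line, not counting
 * the leading '>' or the line terminator. */
static size_t longest_fasta_header(FILE *fp)
{
    size_t longest = 0, cur = 0;
    int c, at_line_start = 1, in_header = 0;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') {
            if (in_header && cur > longest)
                longest = cur;
            in_header = 0;
            at_line_start = 1;
            continue;
        }
        if (at_line_start && c == '>') {
            in_header = 1;   /* start of a new header line */
            cur = 0;
        } else if (in_header && c != '\r') {
            ++cur;
        }
        at_line_start = 0;
    }
    /* Handle a final header line with no trailing newline. */
    if (in_header && cur > longest)
        longest = cur;
    return longest;
}

/* A string of this length fits only if one byte is left for the
 * terminating '\0'. */
static int header_fits(size_t header_len)
{
    return header_len < NAME_BUF_SIZE;
}
```

If `header_fits(longest_fasta_header(fp))` is false, shortening the header lines before conversion should avoid the overflow described above.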
