  • samtools view segmentation fault

    Hi all,

    I am trying to convert a sorted .bam file back into a .sam file for downstream programs
    but when I run the command:

    samtools view -h in_sorted.bam > out_sorted.sam

    I get the error:

    zsh: segmentation fault samtools view -h in_sorted.bam > out_sorted.sam

    I am on an Ubuntu 12.04 desktop machine (BioLinux) with 8 GB RAM and 1 TB of hard-disk space.

    I could convert from a .sam file to a .bam file through samtools sort to get the sorted .bam file on the same machine so I don't think I'm running out of memory.

    I wonder if anyone could help me figure out what is going wrong.

  • #2
    Hi again,
    I thought I'd write and say I've sorted it out.
    Instead of writing a new .sam file, I pipe the contents of the .bam file straight to the downstream programs in SAM format,

    e.g.
    Code:
    samtools view -h in_sorted.bam | htseq-count - features.gtf


    • #3
      Hi, I figured I'd just comment on this thread since I was going to name my thread the same thing. I'm trying to go from sam -> bam with
      Code:
      samtools view -bS ../results/Y920/Y920-trimmed-idba_ud/alignment.sam > ../results/Y920/Y920-trimmed-idba_ud/alignment.bam
      but I'm getting a segmentation fault. I have 8 GB of RAM, which I guess could be a problem. My SAM file is ~300 MB. Do I need more RAM, or is this a problem with samtools? I'm running 0.1.19 from the Ubuntu Software Repository.

      Also I'm using a virtual machine.
      Last edited by arundurvasula; 07-08-2014, 12:41 PM.


      • #4
        Can you confirm that your .bam really is a .bam file? That's the first thing I'd check.
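
One quick way to check what a file really contains, regardless of its extension: BAM is BGZF-compressed and therefore starts with the gzip magic bytes 1f 8b, whereas an uncompressed SAM file is plain text. A minimal sketch, where `sniff_format` is a hypothetical helper name and not part of samtools:

```shell
# sniff_format FILE
# Prints "gzip" if FILE starts with the gzip magic bytes (as BAM/BGZF does),
# "empty" if FILE has no bytes, and "plain" otherwise (e.g. SAM text).
sniff_format() (
    if [ ! -r "$1" ]; then
        echo "sniff_format: cannot read $1" >&2
        exit 1
    fi
    # Dump the first two bytes as hex and strip the whitespace od adds.
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    case $magic in
        1f8b) echo gzip ;;
        '')   echo empty ;;
        *)    echo plain ;;
    esac
)
```

Note that "gzip" only says the file is compressed; a gzipped SAM file would report the same, so this rules a mislabeled file out rather than proving it is valid BAM.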


        • #5
          swbarnes2 meant ".sam" rather than ".bam". It's likely that a line is corrupt in the SAM file. If the file looks correct, then use "head -n some_number ... | samtools view -Sb - > ..." to determine if you can find the line triggering the problem (or just run the thing in a debugger).
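
The `head -n` idea can be automated as a bisection over prefix lengths. A rough sketch, assuming that once the offending line is included every longer prefix also fails; `first_bad_line` is a hypothetical helper name, and the checker is whatever command reproduces the crash:

```shell
# first_bad_line FILE CHECKER [ARGS...]
# CHECKER reads SAM text on stdin and must exit non-zero on failure
# (a segfault counts, since the shell reports it as a non-zero status).
# Prints the 1-based number of the first line whose inclusion makes the
# checker fail, or 0 if the whole file passes.
first_bad_line() (
    file=$1
    shift
    # Arithmetic expansion strips the padding some wc implementations add.
    total=$(( $(wc -l < "$file") ))
    if head -n "$total" "$file" | "$@" > /dev/null 2>&1; then
        echo 0
        exit 0
    fi
    lo=0        # longest prefix assumed to pass (never tested itself)
    hi=$total   # shortest prefix known to fail
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if head -n "$mid" "$file" | "$@" > /dev/null 2>&1; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    echo "$hi"
)
# Example (not run here):
#   first_bad_line alignment.sam samtools view -Sb -
```

The suspect line can then be printed with `sed -n "${n}p" alignment.sam`, where `n` is the number reported. A final line without a trailing newline is not counted by `wc -l`, so it would be missed.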


          • #6
            @arundurvasula: the view command does indeed require more RAM (though 300 MB is actually a really small file). Try the following instead:

            Build an index of your genome
            Code:
            samtools faidx genomeFile.fasta
            Use the import command to convert from sam to bam
            Code:
            samtools import $samIndex $samFile $bamFile


            • #7
              The "import" command was removed years ago. It's an alias for "samtools view" now.
              Code:
              int main_import(int argc, char *argv[])
              {
                      int argc2, ret;
                      char **argv2;
                      if (argc != 4) {
                              fprintf(stderr, "Usage: bamtk import <in.ref_list> <in.sam> <out.bam>\n");
                              return 1;
                      }
                      argc2 = 6;
                      argv2 = calloc(6, sizeof(char*));
                      argv2[0] = "import", argv2[1] = "-o", argv2[2] = argv[3], argv2[3] = "-bt", argv2[4] = argv[1], argv2[5] = argv[2];
                      ret = main_samview(argc2, argv2);
                      free(argv2);
                      return ret;
              }
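
Read as a command line, the argv2 array above corresponds to `samtools view -o <out.bam> -bt <in.ref_list> <in.sam>`. A small wrapper sketch that mirrors it; the function name `import_via_view` and the `SAMTOOLS` override variable are assumptions for illustration only:

```shell
# import_via_view REF_LIST IN_SAM OUT_BAM
# Takes arguments in the same order as the old "import" command and passes
# them to "view" in the order main_import builds argv2.
# SAMTOOLS may name a different binary (hypothetical override, not a
# samtools feature).
import_via_view() (
    if [ $# -ne 3 ]; then
        echo "Usage: import_via_view <in.ref_list> <in.sam> <out.bam>" >&2
        exit 1
    fi
    "${SAMTOOLS:-samtools}" view -o "$3" -bt "$1" "$2"
)
# Example (not run here):
#   import_via_view genomeFile.fasta.fai alignment.sam alignment.bam
```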


              • #8
                Oh, good to know.

                So it's just the provided index that results in lower memory consumption?! Because my command runs faster and requires less RAM on my dataset than the previously posted command.


                • #9
                  I'd be a bit surprised if it used less memory (at least if there's a significant difference). The only difference is the reheadering that's done with samtools import. A header doesn't really take that much RAM, at least unless it's filled with lots of small contigs.
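
To see whether a header is unusually large, one can count its @SQ lines (one per reference sequence). A sketch that reads SAM text on stdin; `count_contigs` is a hypothetical helper name:

```shell
# count_contigs: read SAM text on stdin, print the number of @SQ header lines.
# Stops at the first non-header line, so alignment records are not scanned.
count_contigs() {
    awk '/^@SQ/ { n++ } !/^@/ { exit } END { print n + 0 }'
}
# Example (not run here):
#   samtools view -H in_sorted.bam | count_contigs
```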


                  • #10
                    Yes, sorry, my fault: my header is indeed extremely big (17M contigs).
                    Nevertheless, memory shouldn't actually be an issue here. The SAM file is really small, and although arundurvasula is running a virtual machine, 8 GB should be enough.


                    • #11
                      Yeah, 8 gigs should be way more than could possibly be needed.

                      @arundurvasula: If you're unable to find a way around this, then try posting the file somewhere (Dropbox, Google Drive, etc.) and one of us can run samtools in a debugger (unless you're familiar with doing that yourself). It's possible that this is an obscure bug, but I suspect the original file is just malformed.
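
For anyone who wants to try the debugger route themselves, a rough sketch of getting a backtrace non-interactively with gdb's `-batch`, `-ex`, and `--args` options. The wrapper name `segv_backtrace` and the `GDB` override variable are assumptions; a samtools build with debug symbols should give a far more readable trace:

```shell
# segv_backtrace COMMAND [ARGS...]
# Runs COMMAND under gdb in batch mode and prints a backtrace ("bt") once
# the program stops, e.g. on a segmentation fault.
# GDB may name a different debugger binary (hypothetical override).
segv_backtrace() (
    if [ $# -eq 0 ]; then
        echo "Usage: segv_backtrace <command> [args...]" >&2
        exit 1
    fi
    "${GDB:-gdb}" -batch -ex run -ex bt --args "$@"
)
# Example (not run here); -o keeps the BAM output out of the gdb transcript:
#   segv_backtrace samtools view -bS -o alignment.bam alignment.sam
```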
