  • Samtools sorting incorrectly

    Don't know what's going on. Samtools has forgotten how to sort. I used to have no problem but it is now ordering mouse chromosomal hits in the following order:

    chr10
    chr11
    chr12
    chr13
    chr14
    chr15
    chr16
    chr17
    chr18
    chr19
    chr1
    chr1_GL456210_random
    chr1_GL456211_random
    chr1_GL456212_random
    chr1_GL456213_random
    chr1_GL456221_random
    chr2
    chr3
    chr4
    chr4_GL456216_random
    chr4_GL456350_random
    chr4_JH584292_random
    chr4_JH584293_random
    chr4_JH584294_random
    chr4_JH584295_random
    chr5
    chr5_GL456354_random
    chr5_JH584296_random
    chr5_JH584297_random
    chr5_JH584298_random
    chr5_JH584299_random
    chr6
    chr7
    chr7_GL456219_random
    chr8
    chr9
    chrM
    chrUn_GL456239
    chrUn_GL456359
    chrUn_GL456360
    chrUn_GL456366
    chrUn_GL456367
    chrUn_GL456368
    chrUn_GL456370
    chrUn_GL456372
    chrUn_GL456378
    chrUn_GL456379
    chrUn_GL456381
    chrUn_GL456382
    chrUn_GL456383
    chrUn_GL456385
    chrUn_GL456387
    chrUn_GL456389
    chrUn_GL456390
    chrUn_GL456392
    chrUn_GL456393
    chrUn_GL456394
    chrUn_GL456396
    chrUn_JH584304
    chrX
    chrX_GL456233_random
    chrY
    chrY_JH584300_random
    chrY_JH584301_random
    chrY_JH584302_random
    chrY_JH584303_random


    Like I said, this is a new issue. All previous processing was chr1, chr2, chr3, etc. This is causing problems when I try to merge the new sorted bam with old (correctly) sorted bam files.

    I'm suspecting this might be due to a newer samtools installation. How would I find out what version is installed?
    Last edited by drdna; 12-13-2014, 10:35 AM.

  • #2
    Code:
    samtools --version
    The chromosome sort order in the file produced by samtools sort should be based "on the order in which the @SQ lines appear in the header of the unsorted BAM file."

    I would check the header of the unsorted BAM file before putting the blame on "samtools sort".

    Code:
    samtools view -H unsorted.bam
    You can reorder the chromosomes with Picard tools' ReorderSam. You'll need a reference FASTA file with the chromosomes in the desired order, e.g. karyotypic.
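The header check suggested above can be reduced to just the sequence names, which makes the reference order easier to compare against the sorted output. A minimal sketch, assuming a POSIX shell with awk; `sq_order` is a hypothetical helper name, not part of samtools, and it reads SAM header text on standard input.

```shell
# sq_order: hypothetical helper (not part of samtools).
# Reads SAM header text on stdin and prints the SN: value of each
# @SQ line, in the order the lines appear.
sq_order() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Typical use (needs samtools and a real BAM file):
#   samtools view -H unsorted.bam | sq_order
```

If the names come out as chr10, chr11, ... before chr1, the order was already present in the header before sorting.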



    • #3
      Originally posted by blancha
      Code:
      samtools --version
      The chromosome sort order in the file produced by samtools sort should be based "on the order in which the @SQ lines appear in the header of the unsorted BAM file."

      I would check the header of the unsorted BAM file before putting the blame on "samtools sort".

      Code:
      samtools view -H unsorted.bam
      You can reorder the chromosomes with Picard tools' ReorderSam. You'll need a reference FASTA file with the chromosomes in the desired order, e.g. karyotypic.

      It's definitely a samtools error: File.sam was created using bowtie2. The .sam header is present with all entries in correct order.

      File.sam was processed using: samtools view -bS File.sam > File.bam.
      The header was stripped out of the resulting File.bam. This was never a problem in the past. None of my earlier .bam files had headers prior to sorting. I'm going to give it a try using samtools view -bSh to see if I can retain the header in the .bam file.



      • #4
        Originally posted by blancha
        Code:
        samtools --version
        Are you sure that samtools --version is the correct command? I had already tried this and it gave me a "[main] unrecognized command '--version' " error.
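Older samtools builds (the 0.1.x series) do not recognize `--version`; running `samtools` with no arguments prints a usage message that includes a `Version:` line. A hedged sketch that tries both, assuming a POSIX shell; `samtools_version` is a hypothetical helper name, and its optional argument exists only so that a different binary can be queried.

```shell
# samtools_version: hypothetical helper (not part of samtools).
# $1 is the command to query; defaults to "samtools".
samtools_version() {
    sv_tool="${1:-samtools}"
    if sv_out=$("$sv_tool" --version 2>/dev/null); then
        # Newer builds: print the first line of the --version output.
        printf '%s\n' "$sv_out" | head -n 1
    else
        # Older builds: the usage text (on stderr) carries a "Version:" line.
        "$sv_tool" 2>&1 | grep '^Version:'
    fi
}

# Typical use:
#   samtools_version
```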



        • #5
          It works if you force inclusion of the header when converting sam to bam:

          Code:
          samtools view -bSH File.sam > File.bam
          Thanks for the heads-up about the header, blancha.



          • #6
            Option -H has no effect with -b, and a mapped BAM file always has @SQ lines. There is no way to strip them off. Please attach a SAM example if you believe samtools is wrong.
            Last edited by lh3; 12-13-2014, 02:40 PM.



            • #7
              Originally posted by lh3
              Option -H has no effect with -b, and a mapped BAM file always has @SQ lines. There is no way to strip them off. Please attach a SAM example if you believe samtools is wrong.
              Oops, I meant:

              Code:
              samtools view -bSh File.sam > File.bam



              • #8
                Samtools sort doesn't reorder the header. If the header is in a weird order, then that's because the reference fasta file is in that order (bowtie2 will output header lines in the same order as it encounters them). If you want to reorder the header too, then there's a Picard tools command for that.
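Since the header order follows the reference FASTA, listing the FASTA record names in file order shows what order to expect in the BAM header. A minimal sketch, assuming a POSIX shell with sed; `fasta_order` is a hypothetical helper name and `reference.fa` in the usage comment is a placeholder path.

```shell
# fasta_order: hypothetical helper.
# Reads FASTA text on stdin and prints each record name (the text after
# ">" up to the first whitespace), in file order.
fasta_order() {
    sed -n 's/^>\([^[:space:]]*\).*/\1/p'
}

# Typical use:
#   fasta_order < reference.fa
```

If this list already starts with chr10, the sorted BAM will follow the same order, and reordering the reference (or the BAM, with the Picard command mentioned above) is the place to fix it.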
