
  • indexing tophat bam files

    Hi,

    I am having trouble using samtools to index my TopHat output for viewing in IGV. The TopHat output BAM should already be sorted (although I am also having trouble using samtools to sort it).

    This is how I call the tophat:
    tophat2 -M --b2-very-sensitive --GTF ~/Documents/transcriptome_gtf/genes.gtf -p 7 --read-realign-edit-dist 0 --output-dir ./example ~/Documents/genome_UCSC/genome ~/Documents/Data/example.fastq

    I then call samtools indexing using:
    samtools index accepted_hits.bam

    But I get this error:
    [bam_index_build2] fail to create the index file.

    Sorting with the command below gives me this error:
    samtools sort ./accepted_hits.bam sort.prefix

    [bam_sort_core] merging from 12 files...
    open: No such file or directory
    [bam_merge_core] fail to open file sort.prefix.0000.bam

    At this point, I'm not sure what is going on. Please help!

    Zach

  • #2
    Can you sort using this command

    Code:
    $ samtools sort ./accepted_hits.bam accepted_hits_sorted
    and then try indexing the sorted file.



    • #3
      Originally posted by GenoMax
      Can you sort using this command

      Code:
      $ samtools sort ./accepted_hits.bam accepted_hits_sorted
      and then try indexing the sorted file.
      Nope, I get the same error message.



      • #4
        Which version of samtools are you using?

        Is the sorting process creating temporary files (with names containing 0001.bam, etc.) before you get that error?
        Last edited by GenoMax; 12-05-2014, 05:41 PM.
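For anyone following along, here is a rough sketch of how both points could be checked. It assumes samtools is on the PATH and a POSIX-style shell; the function names are just illustrative, and the temporary file pattern is taken from the error message in the first post (`sort.prefix.0000.bam`).

```shell
# Print the samtools "Version:" line, if samtools is installed.
# Running samtools with no arguments prints its usage text, which
# includes a "Version:" line, so stderr is merged into stdout.
samtools_version() {
    if ! command -v samtools >/dev/null 2>&1; then
        echo "samtools not found on PATH" >&2
        return 1
    fi
    samtools 2>&1 | grep -i '^Version'
}

# List temporary chunk files (prefix.NNNN.bam) left behind by a sort
# that used the given output prefix, or say that there are none.
list_sort_temp_files() {
    prefix="$1"
    ls -l "${prefix}".[0-9][0-9][0-9][0-9].bam 2>/dev/null \
        || echo "no temporary files for prefix ${prefix}"
}
```

Usage would be something like `samtools_version` followed by `list_sort_temp_files accepted_hits_sorted`, run from the directory where the sort was started.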



        • #5
          Originally posted by GenoMax
          Which version of samtools are you using?
          Version number is 0.1.19-4428cd

          Originally posted by GenoMax
          Is the sorting process creating temporary files (with names containing 0001.bam, etc.) before you get that error?
          It looks like no temporary files are created. The command throws the error message after less than a minute of running (actually I'm not sure how long it typically takes). It looks like it stops after loading the file, since calling the same command with the unmapped bam file as argument is much faster in reaching the error message.



          • #6
            Is this the version bundled with the TopHat code (which is the one tested to work)?



            • #7
              Originally posted by GenoMax
              Is this the version bundled with the TopHat code (which is the one tested to work)?
              I think I installed samtools before TopHat. TopHat itself actually works fine, and I am able to use the BAM files with HTSeq and then DESeq2.



              • #8
                Check how much free disk space you have.
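A quick way to check, as a sketch only: the temporary files in the error message are named after the output prefix, so the directory the prefix points at is the one to look at. Write permission is an extra thing worth ruling out, not something the error message confirms. The function name is made up for illustration.

```shell
# Report free space for a directory and whether the current user
# can write to it. Defaults to the current directory.
check_output_dir() {
    dir="${1:-.}"
    # Free space on the filesystem that holds the directory
    df -h "$dir" || return 1
    if [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "NOT writable: $dir" >&2
        return 1
    fi
}
```

Run it as `check_output_dir .` from the directory where you call `samtools sort`.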



                • #9
                  Originally posted by blancha
                  Check how much free disk space you have.
                  That shouldn't be a problem; there are more than 700 GB free on the hard drive.



                  • #10
                    Devon Ryan seems to describe the bug here.


                    I would just install samtools 1.1 which has many interesting new features anyway.
                    It should fix the issue.



                    • #11
                      No harm in trying the latest samtools, but the TopHat page has this to say:

                      Removed SAMtools as an external dependency in order to avoid incompatibility issues with recent and future changes of SAMtools and its code library (an older, stable SAMtools version is now packaged with TopHat)
                      I also see a v0.1.20 on the samtools download page, so if you want to stay with the old series, give that a try.



                      • #12
                        Right, you should also get the latest version of TopHat that comes bundled with the appropriate version of samtools required by TopHat.

                        You'll then have the best of both worlds: the latest version of TopHat running with a tried and tested version of samtools, and the latest version of samtools with all the new bells and whistles.

                        I'm basing all these assumptions on Devon Ryan's post, but his explanations are quite convincing and his description of the bug corresponds to yours.

                        My advice:
                        1- Install the very latest version of samtools with all the new bells and whistles, and without the bug.
                        2- Install the latest version of TopHat2 which comes bundled with a tried and tested version of samtools, that has been tested for compatibility with TopHat2. (This version will be used internally by TopHat.)



                        • #13
                          Incidentally, you will still need to sort the BAM file before indexing it, as GenoMax pointed out.
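As a sketch of that sort-then-index sequence with a samtools 1.x release: this assumes the newer option style with an explicit output file (`-o`) and temporary-file prefix (`-T`), whereas the 0.1.x series uses the `in.bam out.prefix` form shown earlier in the thread. The function name and the `.tmp` suffix are only illustrative.

```shell
# Coordinate-sort a BAM file and index the result.
# Usage: sort_and_index accepted_hits.bam accepted_hits_sorted
sort_and_index() {
    in_bam="$1"
    out_prefix="$2"
    if [ ! -f "$in_bam" ]; then
        echo "input not found: $in_bam" >&2
        return 1
    fi
    if ! command -v samtools >/dev/null 2>&1; then
        echo "samtools not found on PATH" >&2
        return 1
    fi
    # samtools 1.x style: -T sets the temp-file prefix, -o the output file,
    # -O the output format
    samtools sort -O bam -T "${out_prefix}.tmp" -o "${out_prefix}.bam" "$in_bam" \
        || return 1
    # Writes the index next to the sorted BAM
    samtools index "${out_prefix}.bam"
}
```

After `sort_and_index accepted_hits.bam accepted_hits_sorted`, the sorted BAM and its index should both be in the working directory and can be loaded into IGV together.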



                          • #14
                            Thanks for all the input. It looks like updating fixed this bug!
