  • Error when testing TopHat installation

    I'm just starting out with RNA-seq, and I'm trying to get my TopHat installation to work. I'm following the "Getting started" instructions on the site and am currently testing the installation. When I run

    Code:
    tophat -r 20 test_ref reads_1.fq reads_2.fq
    ... this is what I get:

    Code:
    [2014-06-20 13:01:22] Beginning TopHat run (v2.0.11)
    -----------------------------------------------
    [2014-06-20 13:01:22] Checking for Bowtie
    		  Bowtie version:	 2.2.3.0
    [2014-06-20 13:01:22] Checking for Samtools
    		Samtools version:	 0.1.19.0
    [2014-06-20 13:01:22] Checking for Bowtie index files (genome)..
    	Found both Bowtie1 and Bowtie2 indexes.
    [2014-06-20 13:01:22] Checking for reference FASTA file
    [2014-06-20 13:01:22] Generating SAM header for test_ref
    [2014-06-20 13:01:22] Preparing reads
    	 left reads: min. length=75, max. length=75, 100 kept reads (0 discarded)
    	right reads: min. length=75, max. length=75, 100 kept reads (0 discarded)
    [2014-06-20 13:01:22] Mapping left_kept_reads to genome test_ref with Bowtie2 
    	[FAILED]
    Error running:
    /Users/erikfasterius/bin/bam2fastx --all ./tophat_out/tmp/left_kept_reads.bam|/Users/erikfasterius/bin/bowtie2 -k 20 -D 15 -R 2 -N 0 -L 20 -i S,1,1.25 --gbar 4 --mp 6,2 --np 1 --rdg 5,3 --rfg 5,3 --score-min C,-14,0 -p 1 --sam-no-hd -x test_ref -|/Users/erikfasterius/bin/fix_map_ordering --bowtie2-min-score 15 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --index-outfile ./tophat_out/tmp/left_kept_reads.mapped.bam.index --sam-header ./tophat_out/tmp/test_ref_genome.bwt.samheader.sam - ./tophat_out/tmp/left_kept_reads.mapped.bam ./tophat_out/tmp/left_kept_reads_unmapped.bam
    ... and I don't really understand the error. It doesn't say what type of error it is; it just says [FAILED] and gives the long command string that caused the error, which I don't understand.

    Any help would be greatly appreciated!
    Erik

  • #2
    There should be a "logs" directory in the folder you specified for tophat output. Start looking at the tophat.log and run.log to see if there is some indication in there as to why the run is failing.
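The log check suggested above can be sketched as a small shell helper. This is only a sketch: `scan_logs` is a hypothetical name, the default path assumes TopHat's default output directory (`./tophat_out`, as in the paths shown in the error), and the search pattern is a guess at what a failure line might contain.

```shell
# Hypothetical helper: print lines in TopHat's log directory that look like errors.
# Assumption: logs live in ./tophat_out/logs unless another directory is given.
scan_logs() {
    log_dir="${1:-./tophat_out/logs}"
    if [ ! -d "$log_dir" ]; then
        echo "no log directory at $log_dir"
        return 1
    fi
    # -i: ignore case, -n: show line numbers, -E: extended regex.
    # /dev/null is passed as an extra file so grep always prints file names.
    grep -inE 'error|abort|fail' "$log_dir"/*.log /dev/null \
        || echo "no matching lines found"
}
```

Calling `scan_logs` with no argument scans `./tophat_out/logs`; pass a different directory if TopHat was run with another output folder.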

    • #3
      OK, tophat.log contains one line that does not appear in the Terminal output:

      Code:
      [sam_read1] missing header? Abort!
      Full tophat.log:

      Code:
      [2014-06-20 14:57:28] Beginning TopHat run (v2.0.11)
      -----------------------------------------------
      [2014-06-20 14:57:28] Checking for Bowtie
      		  Bowtie version:	 2.2.3.0
      [2014-06-20 14:57:28] Checking for Samtools
      		Samtools version:	 0.1.19.0
      [2014-06-20 14:57:28] Checking for Bowtie index files (genome)..
      [2014-06-20 14:57:28] Checking for reference FASTA file
      [2014-06-20 14:57:28] Generating SAM header for test_ref
      [2014-06-20 14:57:28] Preparing reads
      	 left reads: min. length=75, max. length=75, 100 kept reads (0 discarded)
      	right reads: min. length=75, max. length=75, 100 kept reads (0 discarded)
      [2014-06-20 14:57:28] Mapping left_kept_reads to genome test_ref with Bowtie2 
      [sam_read1] missing header? Abort!
      	[FAILED]
      Error running:
      /Users/erikfasterius/bin/bam2fastx --all ./tophat_out/tmp/left_kept_reads.bam|/Users/erikfasterius/bin/bowtie2 -k 20 -D 15 -R 2 -N 0 -L 20 -i S,1,1.25 --gbar 4 --mp 6,2 --np 1 --rdg 5,3 --rfg 5,3 --score-min C,-14,0 -p 1 --sam-no-hd -x test_ref -|/Users/erikfasterius/bin/fix_map_ordering --bowtie2-min-score 15 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --index-outfile ./tophat_out/tmp/left_kept_reads.mapped.bam.index --sam-header ./tophat_out/tmp/test_ref_genome.bwt.samheader.sam - ./tophat_out/tmp/left_kept_reads.mapped.bam ./tophat_out/tmp/left_kept_reads_unmapped.bam
      The run.log looks like this:

      Code:
      /Users/erikfasterius/bin/tophat -r 20 test_ref reads_1.fq reads_2.fq
      #>prep_reads:
      /Users/erikfasterius/bin/prep_reads --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir ./tophat_out/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 -z gzip --inner-dist-mean 20 --inner-dist-std-dev 20 --no-closure-search --no-coverage-search --no-microexon-search --aux-outfile=./tophat_out/prep_reads.info --index-outfile=./tophat_out/tmp/%side%_kept_reads.bam.index --sam-header=./tophat_out/tmp/test_ref_genome.bwt.samheader.sam --outfile=./tophat_out/tmp/%side%_kept_reads.bam reads_1.fq reads_2.fq
      #>map_start:
      /Users/erikfasterius/bin/bam2fastx --all ./tophat_out/tmp/left_kept_reads.bam|/Users/erikfasterius/bin/bowtie2 -k 20 -D 15 -R 2 -N 0 -L 20 -i S,1,1.25 --gbar 4 --mp 6,2 --np 1 --rdg 5,3 --rfg 5,3 --score-min C,-14,0 -p 1 --sam-no-hd -x test_ref -|/Users/erikfasterius/bin/fix_map_ordering --bowtie2-min-score 15 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --index-outfile ./tophat_out/tmp/left_kept_reads.mapped.bam.index --sam-header ./tophat_out/tmp/test_ref_genome.bwt.samheader.sam - ./tophat_out/tmp/left_kept_reads.mapped.bam ./tophat_out/tmp/left_kept_reads_unmapped.bam

      • #4
        Does your account have write permissions to the directory where your test data is?

        Can you run bam2fastx and see if that is properly installed (i.e. do you see help printed to screen)?
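Both questions can be checked from the shell. The sketch below is illustrative: `check_writable` and `check_tool` are hypothetical helper names, and the tool list is simply taken from the commands in the failing pipeline.

```shell
# Hypothetical helper: report whether a directory is writable by the current user.
check_writable() {
    dir="${1:-.}"
    if [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "NOT writable: $dir"
        return 1
    fi
}

# Hypothetical helper: report where a tool resolves from on PATH, if anywhere.
check_tool() {
    if path=$(command -v "$1"); then
        echo "$1 -> $path"
    else
        echo "$1 not found on PATH"
        return 1
    fi
}

check_writable . || true
for tool in bam2fastx bowtie2 samtools; do
    check_tool "$tool" || true
done
```

A tool that resolves to an unexpected location (for example an older copy earlier on PATH) is worth noting, since TopHat calls these programs by whatever path it finds.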

        • #5
          Well, being on an admin account I most certainly hope so! =P

          This is what I get when I run bam2fastx, which I assume means it's working correctly:

          Code:
          Usage: bam2fastx [--fasta|-a|--fastq|-q] [--color] [-Q] [--sam|-s|-t]
             [-M|--mapped-only|-A|--all] [-o <outfile>] [-P|--paired] [-N] <in.bam>
             
          Note: By default, reads flagged as not passing quality controls are
             discarded; the -Q option can be used to ignore the QC flag.
             
          Use the -N option if the /1 and /2 suffixes should be appended to
             read names according to the SAM flags
             
          Use the -O option to ignore the OQ tag, if present, when writing quality values

          • #6
            That looks good.

            Can you download the test data again and try (it is small) with that new copy? I just tried it and did not have any problems with my copy of TopHat 2.0.11.

            • #7
              New copy of test_data, still the same results, and the logs look the same.

              • #8
                Did you compile the software from source, or are you using the pre-compiled binaries? What flavor of Linux is this?

                • #9
                  Both TopHat and Bowtie were precompiled, but Samtools and Boost were not; I followed the instructions here for them (as well as for the rest of the testing, of course).

                  • #10
                    Let us look at some more log files. Can you check prep_reads.log and bowtie.*.log to see if there are any errors there?

                    • #11
                      bowtie.left_kept_reads.log:

                      Code:
                      100 reads; of these:
                        100 (100.00%) were unpaired; of these:
                          59 (59.00%) aligned 0 times
                          41 (41.00%) aligned exactly 1 time
                          0 (0.00%) aligned >1 times
                      41.00% overall alignment rate
                      prep_reads.log:

                      Code:
                      prep_reads v2.0.11 (4203)
                      ---------------------------
                      0 out of 100 reads have been filtered out
                      0 out of 100 read mates have been filtered out
                      Looks error-free to me.

                      • #12
                        True. Not sure what to say next.

                        Do you also have all these other bowtie related log files?

                        bowtie_build.log
                        bowtie.left_kept_reads.log
                        bowtie.left_kept_reads_seg1.log
                        bowtie.left_kept_reads_seg2.log
                        bowtie.left_kept_reads_seg3.log
                        bowtie.right_kept_reads.log
                        bowtie.right_kept_reads_seg1.log
                        bowtie.right_kept_reads_seg2.log
                        bowtie.right_kept_reads_seg3.log
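A quick way to see which of the files listed above exist is sketched below. The helper names `expected_logs` and `report_logs` are hypothetical, and the default directory assumes TopHat's default output folder (`./tophat_out`).

```shell
# Hypothetical helper: print the nine log file names listed above.
expected_logs() {
    echo "bowtie_build.log"
    for side in left right; do
        echo "bowtie.${side}_kept_reads.log"
        for seg in 1 2 3; do
            echo "bowtie.${side}_kept_reads_seg${seg}.log"
        done
    done
}

# Hypothetical helper: mark each expected log as present or missing.
# Assumption: logs live in ./tophat_out/logs unless another directory is given.
report_logs() {
    log_dir="${1:-./tophat_out/logs}"
    expected_logs | while read -r name; do
        if [ -e "$log_dir/$name" ]; then
            echo "present  $name"
        else
            echo "MISSING  $name"
        fi
    done
}
```

Running `report_logs` after a failed run shows at a glance how far the pipeline got before it stopped.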

                        • #13
                          Nope, only got the four we've already been through... So, something related to that, mayhap?

                          • #14
                            Definitely.

                            If you do not have bowtie_build.log, that means the indexes are missing. Do you have bowtie2-build in your path?
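Whether the index files are actually there can be checked directly. The sketch below assumes the standard Bowtie2 index layout of six files per prefix (`.1.bt2` through `.4.bt2`, plus `.rev.1.bt2` and `.rev.2.bt2`, or the `.bt2l` variants for large indexes); `check_bt2_index` is a hypothetical helper name and `test_ref` is the prefix used in the test command from the first post.

```shell
# Hypothetical helper: check that all six Bowtie2 index files exist for a prefix.
# Accepts either small (.bt2) or large (.bt2l) index files.
check_bt2_index() {
    prefix="$1"
    missing=0
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -e "$prefix.$suffix.bt2" ] && [ ! -e "$prefix.$suffix.bt2l" ]; then
            echo "missing: $prefix.$suffix.bt2"
            missing=1
        fi
    done
    if [ "$missing" -eq 0 ]; then
        echo "all six Bowtie2 index files found for $prefix"
    fi
    return "$missing"
}

# Run from the directory that holds the test data.
check_bt2_index test_ref || true
```

If any file is reported missing, rebuilding the index with bowtie2-build from the reference FASTA would be the usual next step.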

                            • #15
                              Yeah, a bunch of them (bowtie2-build-l/s/l-debug/s-debug).
