  • Tophat Reporting output tracks - [FAILED]

    Hi everyone,

    I am having a problem with Tophat that has been posted here before here.

    Basically, Tophat works fine, but in the last step when it writes all of the output, the operation fails, and we instead get the message:

    [2012-10-16 07:44:11] Reporting output tracks
    [FAILED]
    Error running tophat-2.0.5/tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir ./tophat_out/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 -z gzip -p11 --inner-dist-mean 50 --inner-dist-std-dev 20 --no-closure-search --no-coverage-search --no-microexon-search --sam-header ./tophat_out/tmp/genome_ref_genome.bwt.samheader.sam --report-secondary-alignments --report-discordant-pair-alignments --report-mixed-alignments --samtools=/apps/group/bioinformatics/apps/samtools-0.1.18/bin/samtools --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 ./tophat_out/tmp/genome_ref.fa ./tophat_out/junctions.bed ./tophat_out/insertions.bed ./tophat_out/deletions.bed ./tophat_out/fusions.out ./tophat_out/tmp/accepted_hits ./tophat_out/tmp/left_kept_reads.mapped.bam,./tophat_out/tmp/left_kept_reads.candidates ./tophat_out/tmp/left_kept_reads.bam ./tophat_out/tmp/right_kept_reads.mapped.bam,./tophat_out/tmp/right_kept_reads.candidates ./tophat_out/tmp/right_kept_reads.bam
    Loaded 246756 junctions


    I originally received this message when running Tophat 2.0.5. The previous post I referenced had many people with the exact same problem, and they reported that switching to Tophat 2.0.4 often solved the error for some reason.

    However, I've also tried using Tophat 2.0.4, with the exact same results.


    Does anyone have any insight as to why this is happening? I appreciate your advice.

  • #2
    I just got the same thing, on Tophat 2.0.5:

    [2012-10-17 07:55:21] Reporting output tracks
    [FAILED]
    Error running /usr/local/bin/tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir analysis-multi/CaS1D/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 6 --read-realign-edit-dist 7 --max-insertion-length 3 --max-deletion-length 3 -z gzip -p8 --inner-dist-mean 50 --inner-dist-std-dev 20 --gtf-annotations Reference/AAA/melper.gtf --gtf-juncs analysis-multi/CaS1D/tmp/melper.juncs --no-closure-search --no-microexon-search --rg-id HS3 --sam-header analysis-multi/CaS1D/tmp/melper_genome.bwt.samheader.sam --report-secondary-alignments --report-discordant-pair-alignments --report-mixed-alignments --samtools=/usr/local/bin/samtools --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 Reference/AAA/melper.fa analysis-multi/CaS1D/junctions.bed analysis-multi/CaS1D/insertions.bed analysis-multi/CaS1D/deletions.bed analysis-multi/CaS1D/fusions.out analysis-multi/CaS1D/tmp/accepted_hits analysis-multi/CaS1D/tmp/left_kept_reads.m2g.bam,analysis-multi/CaS1D/tmp/left_kept_reads.m2g_um.mapped.bam,analysis-multi/CaS1D/tmp/left_kept_reads.m2g_um.candidates analysis-multi/CaS1D/tmp/left_kept_reads.bam analysis-multi/CaS1D/tmp/right_kept_reads.m2g.bam,analysis-multi/CaS1D/tmp/right_kept_reads.m2g_um.mapped.bam,analysis-multi/CaS1D/tmp/right_kept_reads.m2g_um.candidates analysis-multi/CaS1D/tmp/right_kept_reads.bam

    One thing that seemed to come up with greater-than-chance frequency in the last round of issues was that the reads were paired-end. Also, the bugfix note for 2.0.4 says "for large datasets". Mine is about 30M paired-end reads, so I'm not sure whether that hits their threshold or not.

    I'm going to try building from source, as one person suggested, and see if that fixes anything. Will report back if there's success.



    • #3
      It could be something to do with paired-end reads, but I doubt that the size of the data set is relevant. To test this, I took only the first 50,000 reads of a single sample (from both the forward and reverse files, since the data are paired-end) and ran the same tophat commands.
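
A minimal sketch of how such a small paired-end test set could be cut, assuming uncompressed four-line FASTQ records with mates in the same order in both files. The function names and file names are hypothetical and are not part of Tophat:

```python
from itertools import islice


def head_fastq(src_path, dst_path, n_reads):
    """Copy the first n_reads records (4 lines each) from src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(islice(src, 4 * n_reads))


def head_fastq_pair(left, right, out_left, out_right, n_reads=50000):
    """Take the same number of records from both mate files.

    Assumes the two files list mates in the same order, so taking the
    same count from each keeps the pairs in sync.
    """
    head_fastq(left, out_left, n_reads)
    head_fastq(right, out_right, n_reads)


# Hypothetical usage:
# head_fastq_pair("sample_R1.fastq", "sample_R2.fastq",
#                 "test_R1.fastq", "test_R2.fastq", n_reads=50000)
```

The two output files can then be passed to tophat in place of the full read files.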

      Tophat v2.0.4 and v2.0.5 both resulted in the same error message.

      As a side note, I am running tophat with the following parameters:
      --max-multihits 20 --report-secondary-alignments


      Perhaps everyone receiving this error is also passing these parameters??
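
One way to check that hunch is to compare the option names across the failing command lines posted in the thread. The sketch below is only an illustration: the helper names are made up, and the command strings in the usage are heavily shortened stand-ins, not the full commands from the logs.

```python
import shlex


def flags_of(command_line):
    """Return the set of option names (tokens starting with '-') in a command line.

    Values attached with '=' are stripped; short options with attached
    values (e.g. -p11) are kept as written.
    """
    return {tok.split("=", 1)[0]
            for tok in shlex.split(command_line)
            if tok.startswith("-")}


def common_flags(failing, passing=()):
    """Flags present in every failing command and absent from every passing one.

    Needs at least one failing command.
    """
    shared = set.intersection(*(flags_of(c) for c in failing))
    for c in passing:
        shared -= flags_of(c)
    return shared


# Shortened, illustrative command lines:
failing = [
    "tophat_reports --max-multihits 20 --report-secondary-alignments -p11",
    "tophat_reports --read-edit-dist 6 --report-secondary-alignments -p8",
]
passing = ["tophat_reports --max-multihits 20 -p8"]
print(common_flags(failing, passing))  # {'--report-secondary-alignments'}
```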



      • #4
        So, I found out something interesting.

        No matter the version, if I remove the optional parameters telling tophat to report all secondary alignments, the program no longer crashes.

        So it is likely that these optional parameters are causing the problem. Has anyone been able to pass these parameters and have tophat complete error-free?



        • #5
          Originally posted by all_your_base:
          As a side note, I am running tophat with the following parameters:
          --max-multihits 20 --report-secondary-alignments


          Perhaps everyone receiving this error is also passing these parameters??

          Good catch. I'm running with (for 2.0.4):
          -p8 --no-novel-juncs --read-mismatches 6 --report-secondary-alignments

          That doesn't seem to be a problem on single-end libraries, so it's possible there's some interaction between reporting secondary alignments and the paired ends.



          • #6
            Not sure if this is helpful for debugging, but when I try to use the --resume feature in 2.0.5, it errors out:

            [2012-10-17 12:36:36] Resuming TopHat run in directory 'analysis-multi/CaS1D/' stage 'tophat_reports'
            -----------------------------------------------
            [2012-10-17 12:36:37] Checking for Bowtie
            Bowtie version: 2.0.0.7
            [2012-10-17 12:36:37] Checking for Samtools
            Samtools version: 0.1.18.0
            [2012-10-17 12:36:37] Checking for reference FASTA file
            format: fastq
            quality scale: phred33 (default)
            [2012-10-17 12:36:40] Reading known junctions from GTF file
            [2012-10-17 12:36:42] Prepared reads:
            left reads: min. length=50, max. length=50, 36717882 kept reads (972 discarded)
            right reads: min. length=50, max. length=50, 36687792 kept reads (31062 discarded)
            [2012-10-17 12:36:42] Using pre-built transcriptome index..
            Traceback (most recent call last):
              File "/usr/local/bin/tophat", line 4035, in <module>
                sys.exit(main())
              File "/usr/local/bin/tophat", line 4002, in main
                user_supplied_deletions)
              File "/usr/local/bin/tophat", line 3406, in spliced_alignment
                map2gtf(params, sam_header_filename, ref_fasta, left_reads, right_reads)
              File "/usr/local/bin/tophat", line 3245, in map2gtf
                transcriptome_header_filename = get_index_sam_header(params, m2g_bwt_idx)
              File "/usr/local/bin/tophat", line 1391, in get_index_sam_header
                bowtie_sam_header_filename = tmp_dir + idx_prefix.split('/')[-1]
            AttributeError: 'NoneType' object has no attribute 'split'
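
The last frame of the traceback says that `idx_prefix` was `None` when `.split('/')` was called on it. The snippet below is not Tophat's code; it is a small hypothetical stand-in (`index_basename` is an invented name) that reproduces the same kind of failure and shows how a guard would turn it into a clearer message:

```python
def index_basename(idx_prefix):
    """Return the last path component of an index prefix.

    Hypothetical stand-in for the expression in the traceback,
    idx_prefix.split('/')[-1], with an explicit check for None.
    """
    if idx_prefix is None:
        raise ValueError("no index prefix was set for this stage")
    return idx_prefix.split("/")[-1]


# Without the guard, None has no .split, which is the error in the log:
try:
    None.split("/")
except AttributeError as err:
    print(err)

print(index_basename("Reference/AAA/melper"))  # melper
```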



            • #7
              @rflrob,

              I think you may be right about the PE reads + secondary alignment reporting... I just reran a series of tests using all defaults except the following parameter:

              --report-secondary-alignments

              Instead of the two parameters I used last time:

              --max-multihits 20 --report-secondary-alignments



              So, with no special parameters, tophat runs great with my PE data, but when I ask it to report the secondary alignments, the job fails. I wonder if anyone else has experienced this same problem.
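
The one-flag-at-a-time comparison described above can be laid out as a small set of command lines. This is a sketch only: it builds the commands and prints them without running anything, the output directory names and input file names are placeholders, and the helper names are invented for the example.

```python
import shlex


def tophat_command(index, left, right, out_dir, extra=()):
    """Build a tophat argument list: options first, then index and read files."""
    return ["tophat", "-o", out_dir, *extra, index, left, right]


def ablation_runs(index, left, right, option_sets):
    """One run with defaults, then one run per option set on its own.

    Each option set is a list of tokens, e.g. ["--max-multihits", "20"].
    """
    runs = {"defaults": tophat_command(index, left, right, "out_defaults")}
    for tokens in option_sets:
        name = tokens[0].lstrip("-").replace("-", "_")
        runs[name] = tophat_command(index, left, right, "out_" + name, tokens)
    return runs


# Placeholder index and read file names:
runs = ablation_runs(
    "genome_index", "test_R1.fastq", "test_R2.fastq",
    [["--report-secondary-alignments"], ["--max-multihits", "20"]],
)
for name, cmd in runs.items():
    print(name, ":", " ".join(shlex.quote(tok) for tok in cmd))
```

If only the run carrying a given option fails, that option is the likely trigger.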



              • #8
                I've gotten in touch with the Tophat maintainers and sent them a minimal set of data that reproduces the error, so hopefully we'll get a bugfix soon. I'm also going to try running it on a Mac to see if it's a Linux-specific error...



                • #9
                  Lo and behold, it does seem to crash on the Mac as well. Fortunately, thanks to a failure to clean out old versions, I discovered it does work on tophat 2.0.0. Time to start walking backwards through the versions until I find something not broken...



                  • #10
                    That was quick! 2.0.3 works as well... until I hear back about a bug fix, I'll use that.
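
Walking backwards through releases like this amounts to a simple loop. In the sketch below, `newest_working` and the `works` callback are hypothetical; the lookup table stands in for actually running each version on the test data and only records the outcomes reported in this thread for this data set.

```python
def newest_working(versions, works):
    """Walk from the newest version to the oldest and return the first that works.

    versions: list ordered oldest to newest.
    works: callable taking a version string and returning True or False.
    Returns None if no version works.
    """
    for version in reversed(versions):
        if works(version):
            return version
    return None


# Outcomes as reported in this thread for one data set (stand-in for real test runs):
reported = {"2.0.0": True, "2.0.3": True, "2.0.4": False, "2.0.5": False}
print(newest_working(["2.0.0", "2.0.3", "2.0.4", "2.0.5"], reported.get))  # 2.0.3
```

A linear walk is used rather than a bisection because nothing guarantees that results change only once across versions.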



                    • #11
                      Interesting. On my data set I tried 2.0.0, 2.0.4, and 2.0.5, all of which failed! (But I missed the magical 2.0.3.) I wonder why your 2.0.0 worked and mine did not...

                      Anyway, let's see what the devs say.



                      • #12
                        The devs got back to me with a link to the unofficial version of 2.0.6 that also seems to work. It's up at http://tophat.cbcb.umd.edu/downloads/ , although I'm sure the usual caveats apply about this being software in development, may not function as expected, may ransom your firstborn to Somali pirates, etc.



                        • #13
                          That's great, thanks for the link. I reran my analysis in a different manner to circumvent this tophat bug, but if I need to run the same pipeline in the future I'll be sure to use 2.0.6. Hopefully some of the other people who posted in the previous thread about this problem will see your post too.



                          • #14
                            Hello, this is my first post.
                            I am using Tophat 2.0.6 with Bowtie 0.12, and I am trying to run this command on my cluster:
                            tophat -o /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2 -i 10 -I 11000 --min-coverage-intron 10 --max-coverage-intron 11000 --min-segment-intron 10 --max-segment-intron 11000 -p 22 -G /storage16/projects/asfaw_degu/02.genome/Vitis_vinifera.IGGP_12x.15.gtf -M /storage16/projects/asfaw_degu/02.genome/Vitis_vinifera /storage16/projects/asfaw_degu/01.fastq/00.Project_Aaron_Fait/Sample_CS_21/R1/CS_21_TAGCTT_L006_R1.fastq /storage16/projects/asfaw_degu/01.fastq/00.Project_Aaron_Fait/Sample_CS_21/R2/CS_21_TAGCTT_L006_R2.fastq
                            and I get an error at "Reporting output tracks". This is the tail of the log:

                            [2012-11-28 23:25:28] Reporting output tracks
                            [FAILED]
                            Error running /storage16/app/bioinfo/tophat-2.0.6.Linux_x86_64/tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 10 --max-report-intron 11000 --min-isoform-fraction 0.15 --output-dir /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 10 --max-coverage-intron 11000 --min-segment-intron 10 --max-segment-intron 11000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 --bowtie1 -z gzip -p22 --inner-dist-mean 50 --inner-dist-std-dev 20 --gtf-annotations /storage16/projects/asfaw_degu/02.genome/Vitis_vinifera.IGGP_12x.15.gtf --gtf-juncs /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/Vitis_vinifera.juncs --no-closure-search --no-coverage-search --no-microexon-search --sam-header /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/Vitis_vinifera_genome.bwt.samheader.sam --report-discordant-pair-alignments --report-mixed-alignments --samtools=/usr/bin/samtools --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 /storage16/projects/asfaw_degu/02.genome/Vitis_vinifera.fa /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/junctions.bed /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/insertions.bed /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/deletions.bed /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/fusions.out /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/accepted_hits /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/left_kept_reads.m2g.bam,/fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/left_kept_reads.m2g_um.mapped,/fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/left_kept_reads.m2g_um.candidates /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/left_kept_reads.bam /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/right_kept_reads.m2g.bam,/fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/right_kept_reads.m2g_um.mapped,/fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/right_kept_reads.m2g_um.candidates /fastspace/bioinfo_projects/asfaw_degu/th2_CS21_2/tmp/right_kept_reads.bam
                            open: Too many open files
                            I assume this is related to one of the flags, as some said in the posts above, but my error is different from what they got.

                            Any advice would be appreciated!



                            • #15
                              Judging from a Google search for the error at the very bottom ("open: Too many open files"), yours might be a different issue. Do you get the same error if you try to run it on a much smaller test set (say, 10,000 reads)? Is there anyone else on the same Linux box doing file-intensive stuff?
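
"Too many open files" is the message the operating system gives when a process reaches its open-file limit, so one thing worth checking is what that limit is for the account running the job. A minimal sketch using Python's standard `resource` module (Unix only); the helper names are made up, and whether the limit is really what stops tophat here is an assumption, not something established in this thread:

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Raise the soft open-file limit towards target, capped at the hard limit.

    Returns the soft limit in effect afterwards. Only this process and the
    child processes it starts later are affected.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


print(open_file_limits())
```

Because the change is inherited only by child processes, it would have to be made in a wrapper script that then launches tophat, or the limit would have to be raised in the shell or by the cluster administrator.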
