  • holywoool
    Member
    • Sep 2010
    • 27

    TopHat (1.3.2) reports an error. Help!

    Mapping RNA-Seq reads to the reference with TopHat (v-0.1.3.2) fails with the following error:
    Code:
    Mapping right_kept_reads against genome_o_6 with bowtie
    gzip:stdout:Broken pipe
    ...
    10Error:[Error 2]No such file or directory:'./tophat_out/temp/right_kept_reads_missing.fq'
    What has happened to gzip with right_kept_reads? The left_kept_reads file was processed properly.
    I need your help, thanks!
  • Artur Jaroszewicz
    Member
    • Sep 2011
    • 45

    #2
    holywoool, have you found the answer to your question yet? I am getting the same error when I try to run TopHat. Does anybody know what is going on? Here is the full output:

    [Wed Nov 16 16:17:54 2011] Beginning TopHat run (v1.3.3)
    -----------------------------------------------
    [Wed Nov 16 16:17:54 2011] Preparing output location /u/home/mcdb/arturj/B1/
    [Wed Nov 16 16:17:54 2011] Checking for Bowtie index files
    [Wed Nov 16 16:17:54 2011] Checking for reference FASTA file
    [Wed Nov 16 16:17:54 2011] Checking for Bowtie
    Bowtie version: 0.12.5.0
    [Wed Nov 16 16:17:54 2011] Checking for Samtools
    Samtools Version: 0.1.18
    [Wed Nov 16 16:17:54 2011] Generating SAM header for hg19_c
    [Wed Nov 16 16:17:54 2011] Preparing reads
    format: fastq
    quality scale: phred64 (reads generated with GA pipeline version >= 1.3)
    [Wed Nov 16 16:17:54 2011] Reading known junctions from GTF file
    Left reads: min. length=49, count=164672067
    [Wed Nov 16 17:01:35 2011] Mapping left_kept_reads against hg19_c with Bowtie
    [Wed Nov 16 17:01:35 2011] Processing bowtie hits

    gzip: stdout: Broken pipe
    Traceback (most recent call last):
    File "/u/home/mcdb/arturj/tophat-1.3.3.Linux_x86_64/tophat", line 2604, in ?
    sys.exit(main())
    File "/u/home/mcdb/arturj/tophat-1.3.3.Linux_x86_64/tophat", line 2563, in main
    user_supplied_deletions)
    File "/u/home/mcdb/arturj/tophat-1.3.3.Linux_x86_64/tophat", line 2218, in spliced_alignment
    segment_len)
    File "/u/home/mcdb/arturj/tophat-1.3.3.Linux_x86_64/tophat", line 1820, in split_reads
    zreads = ZReader(reads_filename, params.system_params, False)
    File "/u/home/mcdb/arturj/tophat-1.3.3.Linux_x86_64/tophat", line 1190, in __init__
    self.file=open(filename)
    IOError: [Errno 2] No such file or directory: '/u/home/mcdb/arturj/B1/tmp/left_kept_reads_missing.fq'

    And the bash script calling it:

    PATH=$PATH:/u/home/mcdb/arturj/bowtie-0.12.5/:/u/home/mcdb/arturj/tophat-1.3.3.Linux_x86_64/:/u/home/mcdb/arturj/samtools-0.1.18/
    export PATH
    export BOWTIE_INDEXES=/u/home/mcdb/arturj/bowtie-0.12.5/indexes/

    tophat --solexa1.3-quals -g 1 -G /u/home/mcdb/arturj/hg19_refflat.gtf -p 8 -o /u/home/mcdb/arturj/B1 hg19_c /u/home/mcdb/arturj/B1/B1.fastq


    Please help!

    • holywoool
      Member
      • Sep 2010
      • 27

      #3
      Indeed, the TopHat command needs a "-Q" option. My run completed after I added "-Q". You could give it a try.

      • Artur Jaroszewicz
        Member
        • Sep 2011
        • 45

        #4
        That definitely does not help me. I tried it, despite the manual saying it's for separate quality files. Has anyone else had this problem?
        Artur

        • Artur Jaroszewicz
          Member
          • Sep 2011
          • 45

          #5
          And thank you for your response, holywoool!

          • Ramprasad
            Junior Member
            • Jun 2011
            • 7

            #6
            I guess that means there are no unaligned reads.
            Why don't you simulate some reads, cat them onto your read files, and then try again?
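
A minimal sketch of the suggestion above, in bash with awk. The function name simulate_reads, the read IDs, and the output file name in the usage comment are made up for illustration. The quality character "h" assumes the phred64 scale reported in the log earlier in the thread, and random 49-bp sequences are assumed unlikely to align to the reference, which is the point of the exercise.

```shell
# Hypothetical helper: print N random FASTQ records of a given read length.
# Usage: simulate_reads <count> <length> [seed]
simulate_reads() {
    local n="$1" len="$2" seed="${3:-1}"
    awk -v n="$n" -v len="$len" -v seed="$seed" 'BEGIN {
        srand(seed)
        split("A C G T", base, " ")
        for (i = 1; i <= n; i++) {
            seq = ""; qual = ""
            for (j = 1; j <= len; j++) {
                seq = seq base[int(rand() * 4) + 1]
                qual = qual "h"   # "h" is Q40 on the phred64 scale
            }
            # One four-line FASTQ record per read.
            printf "@simulated_%d\n%s\n+\n%s\n", i, seq, qual
        }
    }'
}

# Example (file names are placeholders): write a copy of the reads with
# 1000 simulated records appended, leaving the original file untouched.
# simulate_reads 1000 49 | cat reads.fastq - > reads_plus_sim.fastq
```

Writing to a new file rather than appending in place keeps the original reads intact if the experiment does not help.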

            • Thomas Doktor
              Senior Member
              • Apr 2009
              • 105

              #7
              Where do you see the release of version 1.3.1 and 1.3.3? They are downloadable from the site, but nothing about them is mentioned on the TopHat site.

              • cjp
                Member
                • Jun 2011
                • 58

                #8
                Setting -g 1 means you won't see any reads that align more than once, as it sets both the -m and -k flags in Bowtie to 1:

                -k <int> report up to <int> good alignments per read (default: 1)
                -m <int> suppress all alignments if > <int> exist (def: no limit)

                Chris

                • Artur Jaroszewicz
                  Member
                  • Sep 2011
                  • 45

                  #9
                  Originally posted by Thomas Doktor:
                  Where do you see the release of version 1.3.1 and 1.3.3? They are downloadable from the site, but nothing about them is mentioned on the TopHat site.
                  I see it only on the downloads page: http://tophat.cbcb.umd.edu/downloads/
                  It might just be a bug in this version, but I did try running 4 different datasets at a different time, and it worked fine. I've kept all the same options. The only thing that has changed is the data. The ONLY thing I can think of is that maybe it's an OOM error, just cleverly disguised. However, I'm running my data on 8 processors with 8 GB each. That should be enough for a 30 GB FASTQ file, no? After this try, I'll try going back to 1.3.1 or earlier.

                  cjp, I've been doing the -g option on all my data, and it's never not worked. I'm trying it without the -g argument, however. I'll let you know in a little bit if it works.

                  • Artur Jaroszewicz
                    Member
                    • Sep 2011
                    • 45

                    #10
                    Still nothing. I tried changing the annotation file from refflat to ensembl, removing the '-g' option, and adding -Q (even though I don't see how that would work at all). It's also impossible for there to be no unaligned reads among 164672067 49-bp reads. In my experience, only about half of the reads align anyway, so the probability of every read aligning would be on the order of 2^(-164672067). Maybe I'll try using an earlier version or something, unless someone has any other suggestions?
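
As a back-of-the-envelope check of that kind of estimate: assuming, very roughly, that each read aligns independently with probability 1/2, the chance that all N reads align and none are left unaligned is 2^(-N). The helper name below is hypothetical; it only prints the base-10 exponent of that probability for a given N.

```shell
# Hypothetical helper: base-10 exponent of 2^(-N), i.e. -N * log10(2).
# Assumes each of the N reads aligns independently with probability 1/2.
log10_prob_all_align() {
    awk -v n="$1" 'BEGIN { printf "%.0f\n", -n * log(2) / log(10) }'
}

# Example with the read count from the log above:
# log10_prob_all_align 164672067
```

For N = 164672067 the exponent is on the order of -5 x 10^7, so "no unaligned reads" is not something that could plausibly happen by chance.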

                    • cjp
                      Member
                      • Jun 2011
                      • 58

                      #11
                      -Q is usually for SOLiD colour-space.

                      When debugging failed TopHat runs, you can also try to run the individual commands from the command line yourself.

                      In the logs/ sub-directory of your output directory there should be a file called run.log which shows the commands that TopHat runs. There are also other log files in there - look to see if you can find errors (especially in the newest one or two files). Otherwise try running each command by itself from the directory you started TopHat in and see if and where empty files are made. The tmp stuff needed to do this is usually kept if a run fails.

                      Chris
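
The checks described above could be scripted roughly as follows. This is only a sketch: the function name tophat_triage is made up, and it assumes the layout mentioned in this thread (a logs/ sub-directory containing run.log, plus leftover temporary files somewhere under the output directory). It also assumes log file names contain no spaces.

```shell
# Hypothetical helper: triage a failed TopHat output directory.
# Usage: tophat_triage <output_dir>
tophat_triage() {
    local outdir="$1" matches

    # 1. Show the commands TopHat recorded, if run.log is present.
    if [ -f "$outdir/logs/run.log" ]; then
        echo "== commands in run.log =="
        cat "$outdir/logs/run.log"
    fi

    # 2. List log files that mention "error", newest first.
    echo "== logs mentioning 'error' (newest first) =="
    matches=$(grep -il 'error' "$outdir"/logs/* 2>/dev/null || true)
    if [ -n "$matches" ]; then
        # Word splitting is intentional here: one path per word.
        ls -t $matches
    fi

    # 3. List zero-byte files, which may mark the step that failed.
    echo "== empty files =="
    find "$outdir" -type f -empty
}

# Example, using the output directory from the script earlier in the thread:
# tophat_triage /u/home/mcdb/arturj/B1
```

Once the failing step is narrowed down, the matching command from run.log can be rerun by hand from the directory TopHat was started in.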

                      • Artur Jaroszewicz
                        Member
                        • Sep 2011
                        • 45

                        #12
                        Excellent suggestion. I will try it once I get out of my Tryptophan coma

                        • Artur Jaroszewicz
                          Member
                          • Sep 2011
                          • 45

                          #13
                          And once our computing cluster is up. Boo, maintenance!

                          • Artur Jaroszewicz
                            Member
                            • Sep 2011
                            • 45

                            #14
                            Wowee. So I evidently was using the wrong index. I thought hg19_c was complete (it's prebuilt from the Bowtie website). I was supposed to use hg19.

                            However, Chris, I learned a lot about debugging trying to find this error. Thank you greatly for the suggestion!
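
For anyone landing here with a suspected index mix-up, one cheap pre-flight check is to compare the sequence names in the annotation with those in the Bowtie index. The sketch below is an assumption-laden example, not something from this thread: the function name missing_seqnames is made up, and it expects a text file with one reference name per line, which can be produced with bowtie-inspect -n. It would not catch every kind of mismatch, and the thread does not say exactly how hg19_c differs from hg19.

```shell
# Hypothetical check: print GTF sequence names that are absent from the index.
# Usage: missing_seqnames <annotation.gtf> <index_names.txt>
# index_names.txt is assumed to hold one reference name per line, e.g. from:
#   bowtie-inspect -n hg19 > index_names.txt
missing_seqnames() {
    local gtf="$1" names_file="$2"
    awk -F'\t' '
        # First file: remember each index name (text up to the first blank).
        FILENAME == ARGV[1] { sub(/[ \t].*/, "", $0); known[$0] = 1; next }
        # Second file: skip GTF comment lines.
        /^#/ { next }
        # Column 1 of a GTF record is the sequence name; report each miss once.
        !($1 in known) && !seen[$1]++ { print $1 }
    ' "$names_file" "$gtf"
}
```

Empty output means every sequence name in the GTF was found in the index; any names printed are worth a closer look before starting a long run.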
