  • Tophat error: Broken pipe

    Hi,

    I am a student trying my hand at bioinformatics. While performing an analysis I got the following output:

    [2014-11-04 11:54:56] Beginning TopHat run (v2.0.9)
    -----------------------------------------------
    [2014-11-04 11:54:56] Checking for Bowtie
    Bowtie version: 2.1.0.0
    [2014-11-04 11:54:56] Checking for Samtools
    Samtools version: 0.1.19.0
    [2014-11-04 11:54:56] Checking for Bowtie index files (genome)..
    Found both Bowtie1 and Bowtie2 indexes.
    [2014-11-04 11:54:56] Checking for reference FASTA file
    [2014-11-04 11:54:56] Generating SAM header for genome
    format: fastq
    quality scale: phred33 (default)
    [2014-11-04 11:54:57] Reading known junctions from GTF file
    [2014-11-04 11:55:01] Preparing reads
    left reads: min. length=76, max. length=101, 946251048 kept reads (4766237 discarded)
    right reads: min. length=76, max. length=101, 945724834 kept reads (5292451 discarded)
    [2014-11-04 19:36:04] Building transcriptome data files..
    [2014-11-04 19:36:37] Building Bowtie index from genes.fa
    [2014-11-04 19:46:42] Mapping left_kept_reads to transcriptome genes with Bowtie2
    [2014-11-06 20:19:53] Mapping right_kept_reads to transcriptome genes with Bowtie2
    [2014-11-08 21:00:22] Resuming TopHat pipeline with unmapped reads
    samtools: writing to standard output failed: Broken pipe
    samtools: error closing standard output: -1
    samtools: writing to standard output failed: Broken pipe
    samtools: error closing standard output: -1
    [2014-11-08 21:00:22] Mapping left_kept_reads.m2g_um to genome genome with Bowtie2
    samtools: writing to standard output failed: Broken pipe
    samtools: error closing standard output: -1
    [2014-11-10 09:23:43] Mapping left_kept_reads.m2g_um_seg1 to genome genome with Bowtie2 (1/4)
    [2014-11-10 16:29:25] Mapping left_kept_reads.m2g_um_seg2 to genome genome with Bowtie2 (2/4)
    [2014-11-10 22:21:40] Mapping left_kept_reads.m2g_um_seg3 to genome genome with Bowtie2 (3/4)
    [2014-11-11 13:25:12] Mapping left_kept_reads.m2g_um_seg4 to genome genome with Bowtie2 (4/4)
    samtools: writing to standard output failed: Broken pipe
    samtools: error closing standard output: -1

    and so on. The test file went off without a hitch. I am thinking it may have something to do with the size of the reads (100 files, 545.1 GB). Can anyone help me with this?

  • #2
    Are you specifying those 100 file names as input on the command line for tophat?

    The error occurs while writing to "standard output". Are you running this job in a terminal or through a scheduler on a cluster?
    Last edited by GenoMax; 11-19-2014, 08:49 AM.
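
For readers unfamiliar with the message itself: "Broken pipe" is what a program sees when it writes to a pipe after the process on the reading end has gone away. The sketch below reproduces that condition in isolation on a POSIX system. It is only an illustration of the error class, not a reconstruction of what TopHat or samtools do internally, and the function name is made up for the example.

```python
import os


def write_after_reader_closed() -> str:
    """Write to a pipe whose read end is already closed and report the outcome."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the "reader" disappears before the writer is done
    try:
        os.write(write_fd, b"alignment record\n")
        return "write succeeded"
    except BrokenPipeError:
        # EPIPE: the same condition a command-line tool reports as "Broken pipe".
        return "broken pipe"
    finally:
        os.close(write_fd)


if __name__ == "__main__":
    print(write_after_reader_closed())
```

Python ignores SIGPIPE by default, so the failed write surfaces as an exception here instead of silently killing the process.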

    • #3
      A recent thread on a similar issue on Biostars: https://www.biostars.org/p/112749/

      If you are using new samtools then it would be best to drop back to 0.1.19 (which is bundled with tophat).

      You are running an old version of TopHat. It would be best to upgrade to the latest.

      • #4
        I am specifying all 100 files and running in a terminal.
        Last edited by sublimetech; 11-19-2014, 09:07 AM.

        • #5
          Originally posted by GenoMax
          A recent thread on a similar issue on Biostars: https://www.biostars.org/p/112749/

          If you are using new samtools then it would be best to drop back to 0.1.19 (which is bundled with tophat).

          You are running an old version of TopHat. It would be best to upgrade to the latest.
          [2014-11-04 11:54:56] Checking for Samtools
          Samtools version: 0.1.19.0

          • #6
            The error is being thrown by samtools. I wonder if that is happening because your shell is unable to handle such a long list of arguments (files). (Just for reference: http://superuser.com/questions/72889...nd-broken-pipe)
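
One way to test the long-argument-list idea is to compare the size of the command line with the system limit. The sketch below is a rough check under stated assumptions: it counts each argument plus a terminating NUL byte and compares the total with `SC_ARG_MAX` on a POSIX system. The file names are hypothetical placeholders, and the real limit is also shared with the environment variables, so treat the result as an estimate.

```python
import os


def command_line_bytes(argv):
    """Approximate bytes needed for argv: each argument plus a NUL terminator."""
    return sum(len(arg.encode()) + 1 for arg in argv)


def fits_in_arg_max(argv, limit=None):
    """Return True if argv is smaller than the limit (SC_ARG_MAX by default)."""
    if limit is None:
        limit = os.sysconf("SC_ARG_MAX")  # POSIX only
    return command_line_bytes(argv) < limit


if __name__ == "__main__":
    # Hypothetical names standing in for the 100 input files.
    left = ",".join(f"sample{i:03d}_1.fastq" for i in range(50))
    right = ",".join(f"sample{i:03d}_2.fastq" for i in range(50))
    argv = ["tophat", "-G", "genes.gtf", "genome", left, right]
    print(command_line_bytes(argv), "bytes; fits:", fits_in_arg_max(argv))
```

If the command line is far below the limit, the argument list is unlikely to be the cause.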

            • #7
              But doesn't the command require a list?

              • #8
                It does. Are these files biological/technical replicates or something else?

                See: https://www.biostars.org/p/105389/

                • #9
                  Biological replicates. So what I gather from the mentioned link is that I should run TopHat 50 times (once for each pair of read files)?

                  • #10
                    You have 50 replicates?

                    • #11
                      I do. Also, could you explain why biological replicates need to be aligned in separate runs, as opposed to one run for technical replicates? (I need to put everything in a report for school.)

                      • #12
                        If you aligned all of the biological replicates at once then they'd all end up in one file. Then you couldn't measure variability...which would likely defeat the purpose of the experiment.
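
Running one alignment per replicate can be scripted rather than typed 50 times. The sketch below builds one TopHat command per sample, each with its own output directory so the replicates stay in separate files. It assumes paired files named `<sample>_1.fastq` and `<sample>_2.fastq`, and passes the two mates as separate space-separated arguments, which is how TopHat expects paired-end input (commas join multiple files of the same mate). The second sample name and the `tophat_<sample>` directory naming are hypothetical.

```python
import shlex


def tophat_commands(samples, index="genome", gtf="genes.gtf"):
    """Build one TopHat command (as an argument list) per biological replicate."""
    commands = []
    for sample in samples:
        commands.append([
            "tophat",
            "-G", gtf,                    # known transcript annotation
            "-o", f"tophat_{sample}",     # separate output directory per replicate
            index,
            f"{sample}_1.fastq",          # left mates
            f"{sample}_2.fastq",          # right mates
        ])
    return commands


if __name__ == "__main__":
    # "SAMPLE_B" is a placeholder for a second replicate.
    for cmd in tophat_commands(["SRR1283038", "SAMPLE_B"]):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed lines can be reviewed first and then run one at a time or handed to a job scheduler.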

                        • #13
                          So I decided to start a run yesterday, just to see what would happen if I provided fewer inputs. Here is the output:

                          [2014-11-19 13:33:33] Beginning TopHat run (v2.0.9)
                          -----------------------------------------------
                          [2014-11-19 13:33:33] Checking for Bowtie
                          Bowtie version: 2.1.0.0
                          [2014-11-19 13:33:33] Checking for Samtools
                          Samtools version: 0.1.19.0
                          [2014-11-19 13:33:33] Checking for Bowtie index files (genome)..
                          Found both Bowtie1 and Bowtie2 indexes.
                          [2014-11-19 13:33:33] Checking for reference FASTA file
                          [2014-11-19 13:33:33] Generating SAM header for genome
                          format: fastq
                          quality scale: phred33 (default)
                          [2014-11-19 13:33:59] Reading known junctions from GTF file
                          [2014-11-19 13:34:02] Preparing reads
                          left reads: min. length=76, max. length=78, 89173496 kept reads (287240 discarded)
                          [2014-11-19 13:55:23] Building transcriptome data files..
                          [2014-11-19 13:55:35] Building Bowtie index from genes.fa
                          [2014-11-19 14:06:09] Mapping left_kept_reads to transcriptome genes with Bowtie2
                          [2014-11-19 18:27:54] Resuming TopHat pipeline with unmapped reads
                          samtools: writing to standard output failed: Broken pipe
                          samtools: error closing standard output: -1
                          samtools: writing to standard output failed: Broken pipe
                          samtools: error closing standard output: -1
                          [2014-11-19 18:27:54] Mapping left_kept_reads.m2g_um to genome genome with Bowtie2
                          samtools: writing to standard output failed: Broken pipe
                          samtools: error closing standard output: -1
                          [2014-11-19 21:33:06] Mapping left_kept_reads.m2g_um_seg1 to genome genome with Bowtie2 (1/3)
                          [2014-11-19 21:50:51] Mapping left_kept_reads.m2g_um_seg2 to genome genome with Bowtie2 (2/3)
                          [2014-11-19 22:09:23] Mapping left_kept_reads.m2g_um_seg3 to genome genome with Bowtie2 (3/3)
                          [2014-11-19 22:58:35] Searching for junctions via segment mapping
                          [2014-11-19 23:14:49] Retrieving sequences for splices
                          [2014-11-19 23:16:07] Indexing splices
                          samtools: writing to standard output failed: Broken pipe
                          samtools: error closing standard output: -1
                          [2014-11-19 23:17:02] Mapping left_kept_reads.m2g_um_seg1 to genome segment_juncs with Bowtie2 (1/3)
                          [2014-11-19 23:21:35] Mapping left_kept_reads.m2g_um_seg2 to genome segment_juncs with Bowtie2 (2/3)
                          [2014-11-19 23:26:24] Mapping left_kept_reads.m2g_um_seg3 to genome segment_juncs with Bowtie2 (3/3)
                          [2014-11-19 23:33:42] Joining segment hits
                          samtools: writing to standard output failed: Broken pipe
                          samtools: error closing standard output: -1
                          samtools: writing to standard output failed: Broken pipe
                          samtools: error closing standard output: -1
                          samtools: writing to standard output failed: Broken pipe
                          samtools: error closing standard output: -1
                          [2014-11-19 23:41:58] Reporting output tracks
                          samtools: writing to standard output failed: Broken pipe
                          samtools: error closing standard output: -1

                          with the command: tophat -G genes.gtf genome SRR1283038_1.fastq,SRR1283038_2.fastq

                          I still don't get this...

                          • #14
                            Maybe you're running out of disk space? Anyway, if you have enough RAM, then give STAR a try. It's MUCH faster.
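
The disk space question is quick to rule out before starting a multi-day run. A minimal sketch, assuming the output directory is the current directory and that the run needs roughly twice the input size in free space; that factor is a guess for illustration, not a documented TopHat requirement.

```python
import shutil


def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    input_bytes = int(545.1e9)        # total input size reported in the first post
    headroom = 2 * input_bytes        # assumed safety factor, adjust as needed
    free_gb = shutil.disk_usage(".").free / 1e9
    print(f"free: {free_gb:.1f} GB; enough: {has_free_space('.', headroom)}")
```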

                            • #15
                              If it is possible, upgrade to the latest version of TopHat before you spend any time on debugging this.
