  • Tophat2.0.8b segment-junc err=1

    Hi,
    Last week I upgraded last version of Tophat; so now I'm using Tophat 2.0.8b with bowtie 1.0.0 (I need to align color space reads).
    Before I used Tophat 1.4.1 and bowtie 0.12.8, and I never have any problems.

    So I launch this command (the same of previous version except for --bowtie1) :
    tophat -C --bowtie1 --no-coverage-search --library-type fr-secondstrand -G $annotation_file -g 10 -p 12 -r 90 -o $output_path $reference_file $csfastaF3 $csfastaF5
    with reference_file="/.../ENSEMBL_HG19/Homo_sapiens.GRCh37.71.dna.23chr_c"

    Now this message appears:
    [2013-05-06 16:04:37] Beginning TopHat run (v2.0.8b)
    -----------------------------------------------
    [2013-05-06 16:04:37] Checking for Bowtie
    Bowtie version: 1.0.0.0
    [2013-05-06 16:04:37] Checking for Samtools
    Samtools version: 0.1.18.0
    [2013-05-06 16:04:37] Checking for Bowtie index files
    [2013-05-06 16:04:37] Checking for reference FASTA file
    Warning: Could not find FASTA file /home/references/new_Human_annotations/ENSEMBL_HG19/Homo_sapiens.GRCh37.71.dna.23chr_c.fa
    [2013-05-06 16:04:37] Reconstituting reference FASTA file from Bowtie index
    Executing: ....

    ....
    ....


    [2013-05-06 16:06:57] Reading known junctions from GTF file
    [2013-05-06 16:07:11] Preparing reads

    WARNING: read pairing issues detected (check prep_reads.log) !

    left reads: min. length=75, max. length=75, 49812297 kept reads (187703 discarded)
    right reads: min. length=35, max. length=35, 49679413 kept reads (320587 discarded)
    [2013-05-06 16:20:43] Creating transcriptome data files..
    [2013-05-06 16:21:59] Building Bowtie index from Homo_sapiens.GRCh37.71.23chr.fa
    [2013-05-06 16:52:43] Mapping left_kept_reads to transcriptome Homo_sapiens.GRCh37.71.23chr with Bowtie
    [2013-05-06 17:07:08] Mapping right_kept_reads to transcriptome Homo_sapiens.GRCh37.71.23chr with Bowtie
    [2013-05-06 17:22:15] Resuming TopHat pipeline with unmapped reads
    [2013-05-06 17:22:16] Mapping left_kept_reads.m2g_um to genome Homo_sapiens.GRCh37.71.dna.23chr_c with Bowtie
    [2013-05-06 18:19:33] Mapping left_kept_reads.m2g_um_seg1 to genome Homo_sapiens.GRCh37.71.dna.23chr_c with Bowtie (1/3)
    [2013-05-06 18:53:33] Mapping left_kept_reads.m2g_um_seg2 to genome Homo_sapiens.GRCh37.71.dna.23chr_c with Bowtie (2/3)
    [2013-05-06 19:28:16] Mapping left_kept_reads.m2g_um_seg3 to genome Homo_sapiens.GRCh37.71.dna.23chr_c with Bowtie (3/3)
    [2013-05-06 20:01:48] Mapping right_kept_reads.m2g_um to genome Homo_sapiens.GRCh37.71.dna.23chr_c with Bowtie
    [2013-05-06 20:21:52] Mapping right_kept_reads.m2g_um_seg1 to genome Homo_sapiens.GRCh37.71.dna.23chr_c with Bowtie (1/1)
    [2013-05-06 20:26:56] Searching for junctions via segment mapping
    [FAILED]
    Error: segment-based junction search failed with err =1
    Error: could not get read# 11562180 from stream!
    Any suggestions (maybe also from the authors)?

    Thanks a lot.
    Last edited by mattia; 05-15-2013, 01:19 AM. Reason: Problem has been solved.

  • #2
    Try and check read# 11562180, maybe something is wrong with the input files.
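
One way to inspect a read by position is sketched below. It is a minimal sketch, assuming a plain csfasta layout ('#' comment lines, '>' header lines, one or more sequence lines per read). The helper names `iter_csfasta` and `nth_record` are hypothetical. The thread does not say whether TopHat's read# counts from the original file or from its filtered kept-reads stream, so treat the index as approximate and look at the neighbouring reads too.

```python
import io
from itertools import islice

def iter_csfasta(handle):
    """Yield (header, sequence) pairs, skipping blank and '#' comment lines."""
    header = None
    seq_parts = []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(seq_parts)
            header = line[1:]
            seq_parts = []
        else:
            seq_parts.append(line)
    if header is not None:
        yield header, "".join(seq_parts)

def nth_record(handle, n):
    """Return the n-th record (1-based), or None if the file has fewer records."""
    return next(islice(iter_csfasta(handle), n - 1, None), None)

# Tiny in-memory stand-in for a real .csfasta file (made-up read names).
example = io.StringIO(
    "# Title: demo\n"
    ">1_1_1_F3\nT0123\n"
    ">1_1_2_F3\nT3210\n"
)
record = nth_record(example, 2)
print(record)
```

For a real file, something like `with open(path) as fh: print(nth_record(fh, 11562180))` would stream through the file without loading it into memory.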


    • #3
      OK, I can check, but I'm still doubtful: I tested TopHat 2 using the same input files I had already aligned with TopHat 1 (without any problem).


      • #4
        I checked: all my reads are OK, but the problem remains.


        • #5
          SOLVED!!! Tophat2.0.8b segment-junc err=1

          I solved my problem. I suppose it may be a bug (in all TopHat 2.0.x versions): you must not set the -p parameter. That way TopHat works correctly but, obviously, it takes much longer!
          For a paired-end alignment of 50 million reads (human) in color space:
          TopHat 1.4.1 (-p 12) takes ~5 hours;
          TopHat 2.0.8b (-p not set; default -p 1) takes ~20 hours.


          • #6
            I'm still having problems even without the -p option... I think it may be because of the --library-type option?
            I have paired-end reads (75 bp forward and 35 bp reverse) from a SOLiD 5500xl.

            I get this message when I try to use TopHat (v1.4.1 and later):

            WARNING: read pairing issues detected (check prep_reads.log) !

            Could you help me?


            • #7
              Have you looked at the "prep_reads.log" file to see if that has any additional information as the warning message indicates?


              • #8
                When I open the prep_reads.log file I have this:

                prep_reads v1.4.1 (exported)
                ---------------------------
                3906264 out of 3906264 reads have been filtered out

                The reads are in color-space format (from a SOLiD 5500xl)...

                The command I use to run TopHat 1.4.1 is as follows:

                tophat -C --library-type fr-secondstrand -G file.gff3 -g 10 -r 12 -o output reference_index_dir f3.csfasta f5.csfasta f3.QV.qual f5.QV.qual
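
The summary line in prep_reads.log can be turned into a quick sanity check. The sketch below assumes the log contains lines of exactly the form quoted above ("N out of M reads have been filtered out"). The function name `filtered_fraction` is hypothetical, and other TopHat versions may word the line differently.

```python
import re

# Matches the summary line shown in the prep_reads.log excerpt above.
FILTER_LINE = re.compile(r"(\d+) out of (\d+) reads have been filtered out")

def filtered_fraction(log_text):
    """Return a list of (filtered, total, fraction) tuples, one per summary line."""
    results = []
    for match in FILTER_LINE.finditer(log_text):
        filtered, total = int(match.group(1)), int(match.group(2))
        fraction = filtered / total if total else 0.0
        results.append((filtered, total, fraction))
    return results

log_text = """prep_reads v1.4.1 (exported)
---------------------------
3906264 out of 3906264 reads have been filtered out
"""

for filtered, total, fraction in filtered_fraction(log_text):
    print(f"{filtered}/{total} reads filtered ({fraction:.1%})")
```

A fraction of 1.0, as in this log, means every read was discarded before alignment, which points at how the input files are being parsed rather than at the quality of individual reads.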


                • #9
                  Originally posted by faltimiras:
                  When I open the prep_reads.log file I have this:

                  prep_reads v1.4.1 (exported)
                  ---------------------------
                  3906264 out of 3906264 reads have been filtered out

                  The reads are in color-space format (from a SOLiD 5500xl)...

                  The command I use to run TopHat 1.4.1 is as follows:

                  tophat -C --library-type fr-secondstrand -G file.gff3 -g 10 -r 12 -o output reference_index_dir f3.csfasta f5.csfasta f3.QV.qual f5.QV.qual

                  Can you post the first few lines from each of the csfastas and qual files?

                  I think you have to specify -Q/--quals as well, so it knows to expect qual values in a separate file. Just guessing here (I stopped working in colorspace years ago), but I think that without that flag it treats the csfastas as fastqs, so it's possibly looking for readIDs to start with "@" instead of ">".

                  Something like:

                  tophat -C --quals --library-type fr-secondstrand -G file.gff3 -g 10 -r 12 -o output reference_index_dir f3.csfasta f5.csfasta f3.QV.qual f5.QV.qual
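
To see which header character the input files actually use, a small check like the following may help. It is only a sketch: the function names are hypothetical, it inspects just the first record, and it says nothing about how TopHat itself decides on the format.

```python
import io

def first_record_marker(handle):
    """Return the first character of the first non-blank, non-'#' line, or None."""
    for line in handle:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped[0]
    return None

def guess_format(handle):
    """Classify a read file by its first record marker ('>' or '@')."""
    marker = first_record_marker(handle)
    if marker == ">":
        return "fasta-like"   # csfasta and separate .qual files use '>' headers
    if marker == "@":
        return "fastq-like"
    return "unknown"

# In-memory stand-ins for real files (made-up read names).
csfasta_like = io.StringIO("# comment\n>1_1_1_F3\nT0123\n")
fastq_like = io.StringIO("@read1\nACGT\n+\nIIII\n")
print(guess_format(csfasta_like), guess_format(fastq_like))
```

Running it on each of the four input files (two csfasta, two qual) would show whether they all start with ">" records as expected for SOLiD output.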

                  Also, do you really expect only 12bp inner mate distance with this library (-r 12)? Just curious...
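
As a rough check on the -r question: -r/--mate-inner-dist is the expected mean inner distance between the mates, i.e. the fragment length minus both read lengths. The sketch below uses the 75 bp and 35 bp read lengths mentioned earlier in the thread. The function names are hypothetical, and the fragment length is assumed to exclude adapters.

```python
def mate_inner_distance(fragment_length, left_read_length, right_read_length):
    """Inner distance = fragment length minus the two read lengths."""
    return fragment_length - left_read_length - right_read_length

def implied_fragment_length(inner_distance, left_read_length, right_read_length):
    """Inverse of the above: the fragment length a given -r value implies."""
    return inner_distance + left_read_length + right_read_length

# -r 12 with 75 bp + 35 bp reads implies a very short fragment.
print(implied_fragment_length(12, 75, 35))   # 122
# -r 90, as used in the first post, implies a 200 bp fragment.
print(implied_fragment_length(90, 75, 35))   # 200
```

So -r 12 is only appropriate if the library's fragments really average about 122 bp; otherwise the value should be recomputed from the actual size selection.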
