  • TopHat - empty junction database

    Hi all,

    I ran TopHat and it aligned the reads but couldn't find any junctions! I got files with left-hand and right-hand reads (around 30 million reads), but the junction database is empty. What could be the problem?

    Command line:
    /home/programs/tophat/bin/tophat -r 200 -p 8 -o tophatout-11-3 --segment-length 24 --keep-tmp /home/genomes/bowtie/hg19 SRR027870_1.fastq SRR027870_2.fastq

    The read length is 45 bp. I tried --segment-length 24 as suggested in previous threads, but it had no effect.

    Here is some information:
    File segment_junc:
    segment_juncs v1.1.2 (exported)
    ---------------------------
    Loading reference sequences...
    Loading ...done
    Loading left segment hits... Processed 500000 root segment groups
    Microaligned 0 segments
    done.
    Loading right segment hits... Processed 500000 root segment groups
    Processed 1000000 root segment groups
    Processed 1500000 root segment groups
    Microaligned 0 segments
    done.
    Found 0 potential split-segment junctions
    Indexing extensions in /tophatout-11-3/tmp/left_kept_reads_seg1_missing.fq
    Indexing extensions in /tophatout-11-3/tmp/left_kept_reads_seg2_missing.fq
    Indexing extensions in /tophatout-11-3/tmp/right_kept_reads_seg1_missing.fq
    Indexing extensions in /tophatout-11-3/tmp/right_kept_reads_seg2_missing.fq
    Total extensions: 72218724
    Total extensions: 72218724
    Total extensions: 72218724
    Total extensions: 72218724
    Looking for junctions by island end pairings
    Adding hits from segment file 0 to coverage map
    Adding hits from segment file 1 to coverage map
    Adding hits from segment file 2 to coverage map
    Adding hits from segment file 3 to coverage map
    Map covers 31687325 bases
    Map covers 30905855 bases in sufficiently long segments
    Map contains 777008 good islands
    38604205 are left looking bases
    38603996 are right looking bases
    Collecting potential splice sites in islands
    reporting synthetic splice junctions...
    Found 0 potential island-end pairing junctions
    done
    -- seg --
    -- done --
    -- cov --
    -- done --
    -- buf --
    -- done --
    Reporting potential splice junctions...done
    Reported 0 total possible splices


    Report from TopHat:

    [Wed Jan 12 12:59:35 2011] Beginning TopHat run (v1.1.2)
    -----------------------------------------------
    [Wed Jan 12 12:59:35 2011] Preparing output location /tophatout-11-3/
    [Wed Jan 12 12:59:35 2011] Checking for Bowtie index files
    [Wed Jan 12 12:59:35 2011] Checking for reference FASTA file
    [Wed Jan 12 12:59:35 2011] Checking for Bowtie
    Bowtie version: 0.12.7.0
    [Wed Jan 12 12:59:36 2011] Checking for Samtools
    Samtools version: 0.1.10.0
    [Wed Jan 12 13:00:41 2011] Checking reads
    min read length: 45bp, max read length: 45bp
    format: fastq
    quality scale: phred33 (default)
    [Wed Jan 12 13:06:58 2011] Mapping reads against hg19 with Bowtie
    [Wed Jan 12 13:18:52 2011] Joining segment hits
    [Wed Jan 12 13:21:33 2011] Mapping reads against hg19 with Bowtie(1/2)
    [Wed Jan 12 13:25:00 2011] Mapping reads against hg19 with Bowtie(2/2)
    [Wed Jan 12 13:29:56 2011] Mapping reads against hg19 with Bowtie
    [Wed Jan 12 13:41:14 2011] Joining segment hits
    [Wed Jan 12 13:43:53 2011] Mapping reads against hg19 with Bowtie(1/2)
    [Wed Jan 12 13:48:26 2011] Mapping reads against hg19 with Bowtie(2/2)
    [Wed Jan 12 13:55:16 2011] Searching for junctions via segment mapping
    Warning: junction database is empty!
    [Wed Jan 12 13:58:43 2011] Joining segment hits
    [Wed Jan 12 14:01:41 2011] Joining segment hits
    [Wed Jan 12 14:04:35 2011] Reporting output tracks
    -----------------------------------------------
    Run complete [01:19:07 elapsed]


    Any help is much appreciated!

  • #2
    Empty junction file

    Hi,

    I'm having the same problem. I've tried different parameters in several combinations and I'm still getting nothing. As you can see in the command below, I relaxed some of the junction-finding parameters, changed --segment-length as people have recommended, and used --butterfly-search, which is supposed to run a more sensitive search for junctions.

    I've also tried giving tophat a splice junction file as a .junc or .gtf file for a few of the introns that I'm interested in, and I got an error saying the junction file is empty. This is a separate issue though.

    FYI: Dataset is Illumina single reads (~30M) of 100bp trimmed to 85bp.

    Command line: tophat -p 4 -g 1 -a 4 -m 1 -F 0 -i 10 -I 200 --segment-length 24 --butterfly-search -o /media/Data_1/Cam/genome/trimmedreads/goe_1.9/ /media/Data_1/Cam/genome/trimmedreads/ecun /media/Data_1/Cam/genome/GOE1_trimmed85.fastq

    Tophat report:

    [Tue Jan 18 15:22:33 2011] Beginning TopHat run (v1.1.4)
    -----------------------------------------------
    [Tue Jan 18 15:22:33 2011] Preparing output location /media/Data_1/Cam/genome/trimmedreads/goe_1.9///
    [Tue Jan 18 15:22:33 2011] Checking for Bowtie index files
    [Tue Jan 18 15:22:33 2011] Checking for reference FASTA file
    Warning: Could not find FASTA file /media/Data_1/Cam/genome/trimmedreads/ecun.fa
    [Tue Jan 18 15:22:33 2011] Reconstituting reference FASTA file from Bowtie index
    [Tue Jan 18 15:22:33 2011] Checking for Bowtie
    Bowtie version: 0.12.7.0
    [Tue Jan 18 15:22:33 2011] Checking for Samtools
    Samtools version: 0.1.8.0
    [Tue Jan 18 15:22:33 2011] Checking reads
    min read length: 85bp, max read length: 85bp
    format: fastq
    quality scale: phred33 (default)
    [Tue Jan 18 15:34:57 2011] Mapping reads against ecun with Bowtie
    [Tue Jan 18 15:40:30 2011] Joining segment hits
    [Tue Jan 18 15:49:20 2011] Mapping reads against ecun with Bowtie(1/3)
    [Tue Jan 18 15:53:02 2011] Mapping reads against ecun with Bowtie(2/3)
    [Tue Jan 18 15:56:53 2011] Mapping reads against ecun with Bowtie(3/3)
    [Tue Jan 18 16:00:31 2011] Searching for junctions via segment mapping
    Warning: junction database is empty!
    [Tue Jan 18 16:10:39 2011] Joining segment hits
    [Tue Jan 18 16:12:14 2011] Reporting output tracks
    -----------------------------------------------
    Run complete [00:51:53 elapsed]

    It seems like this is a fairly common problem. Does anyone have any thoughts?

    Thanks


    • #3
      Check your reference .fa file. Mine was empty; maybe that was the problem.
      I rebuilt it manually using bowtie-inspect (it has an option for this; see the bowtie-inspect help).
      Now everything works.
      Did it help you?
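
The check suggested above can be sketched in a few lines of Python. This is only an illustration, not part of TopHat: the function name `check_reference_fasta` is made up here, and it assumes the FASTA file is expected at `<index prefix>.fa`, which is the path the TopHat warning in post #2 reports.

```python
from pathlib import Path


def check_reference_fasta(index_prefix):
    """Return (ok, message) for the FASTA file expected next to a Bowtie index prefix.

    Assumption: the reference is looked up as '<index_prefix>.fa', as the
    warning in the TopHat log above suggests.
    """
    fasta = Path(str(index_prefix) + ".fa")
    if not fasta.is_file():
        return False, f"{fasta} not found"
    if fasta.stat().st_size == 0:
        return False, f"{fasta} is empty"
    # Count FASTA records by their '>' header lines.
    with fasta.open() as handle:
        n_records = sum(1 for line in handle if line.startswith(">"))
    if n_records == 0:
        return False, f"{fasta} has no FASTA header lines"
    return True, f"{fasta} contains {n_records} sequence(s)"
```

For example, `check_reference_fasta("/home/genomes/bowtie/hg19")` would report whether `hg19.fa` sits next to the index and actually contains sequences. If it does not, bowtie-inspect prints the sequences stored in an index as FASTA by default, so redirecting its output to `<index prefix>.fa` is one way to rebuild the file.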


      • #4
        Re: empty junction file

        Thanks for the reply altodor.

        When I renamed my ref.fa file to match the index name, TopHat actually found a few junctions, so thanks for that suggestion. However, it only found 2 junctions when there are over 30, and the boundaries of the ones it found are not perfect. So I still have to figure out why it won't recognize most of the splice boundaries.
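
To keep track of how many junctions each run reports while trying different parameters, a small helper like the following may be handy. It is a rough sketch, assuming the junctions are read from the `junctions.bed` file that TopHat writes to its output directory; the function name `read_junctions` is hypothetical. It returns the raw BED intervals only and does not try to derive exact intron boundaries.

```python
def read_junctions(bed_path):
    """Return a list of (chrom, start, end) BED intervals from a junction file.

    Header lines ('track', 'browser') and comments are skipped. The
    coordinates are the plain BED start/end columns, not intron boundaries.
    """
    junctions = []
    with open(bed_path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith(("track", "browser", "#")):
                continue
            fields = line.split("\t")
            junctions.append((fields[0], int(fields[1]), int(fields[2])))
    return junctions
```

Comparing `len(read_junctions(...))` across runs gives a quick view of whether a parameter change found more of the expected junctions.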
