  • Warning message on junction database

    Hello everyone, I just signed up and wanted to say hi and also ask a question. I just installed TopHat and Bowtie and created a test file from my large FASTQ file. This test file has only ten sequences in it (40 lines total). When I ran the tophat command, it finished very quickly and produced the following verbose output and an output directory, as expected. My question: the output below contains the statement "Warning: junction database is empty!". Can anyone kindly explain what this means, or refer me to a source where I could read more about it? The verbose output is pasted below. Thank you for your time.

    [Wed Jun 2 12:10:20 2010] Beginning TopHat run (v1.0.13)
    -----------------------------------------------
    [Wed Jun 2 12:10:20 2010] Preparing output location ./tophat_out/
    [Wed Jun 2 12:10:20 2010] Checking for Bowtie index files
    [Wed Jun 2 12:10:20 2010] Checking for reference FASTA file
    [Wed Jun 2 12:10:20 2010] Checking for Bowtie
    Bowtie version: 0.12.5.0
    [Wed Jun 2 12:10:20 2010] Checking reads
    seed length: 76bp
    format: fastq
    quality scale: phred33 (default)
    [Wed Jun 2 12:10:20 2010] Mapping reads against mm9 with Bowtie
    [Wed Jun 2 12:10:53 2010] Joining segment hits
    Splitting reads into 3 segments
    [Wed Jun 2 12:10:53 2010] Mapping reads against mm9 with Bowtie
    [Wed Jun 2 12:11:28 2010] Mapping reads against mm9 with Bowtie
    [Wed Jun 2 12:12:02 2010] Mapping reads against mm9 with Bowtie
    [Wed Jun 2 12:12:36 2010] Searching for junctions via segment mapping
    Warning: junction database is empty!
    [Wed Jun 2 12:12:36 2010] Joining segment hits
    [Wed Jun 2 12:12:37 2010] Reporting output tracks
    -----------------------------------------------
    Run complete [00:02:17 elapsed]
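
A test file like the one described above (ten reads, 40 lines) can be cut from a larger FASTQ file with a few lines of Python. This is only a minimal sketch: it assumes the common four-lines-per-record FASTQ layout, and the function name `head_fastq` and the file names in the usage note are hypothetical.

```python
from itertools import islice


def head_fastq(src_path, dst_path, n_reads=10):
    """Copy the first n_reads records from src_path to dst_path.

    Assumes every record occupies exactly 4 lines (header, sequence,
    '+' separator, qualities). Returns the number of complete records
    that were written.
    """
    n_lines = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        # islice stops after 4 * n_reads lines, or earlier at end of file.
        for line in islice(src, 4 * n_reads):
            dst.write(line)
            n_lines += 1
    return n_lines // 4
```

For example, `head_fastq("reads.1.fastq", "test.fastq", n_reads=10)` would write the first ten records to `test.fastq`.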

  • #2
    Originally posted by aghazalp
    My question is, in the verbose below there is a statement saying "Warning: junction database is empty!". Can anyone kindly explain what this means or perhaps refer me to a source where I could read about it more. The verbose is pasted below. Thank you for your time.

    I guess there is nothing strange here. With only 10 reads, it is not unlikely that no junctions are detected (unless you selected the 10 reads on purpose to span some junctions).

    Dario
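
Since ten reads are unlikely to cover a junction, a larger random subset can make a quicker and more informative test than the first few records of the file. The sketch below uses reservoir sampling (Algorithm R) so the file is read only once. It assumes four-line FASTQ records, and the name `sample_fastq` and its parameters are hypothetical.

```python
import random


def sample_fastq(path, k, seed=0):
    """Return k records chosen uniformly at random from a FASTQ file.

    Each record is a list of 4 lines. If the file holds fewer than k
    records, all of them are returned. Assumes 4 lines per record.
    """
    rng = random.Random(seed)
    reservoir = []
    with open(path) as handle:
        i = 0  # index of the current record
        while True:
            record = [handle.readline() for _ in range(4)]
            if not record[0]:
                break  # end of file
            if i < k:
                reservoir.append(record)
            else:
                # Keep record i with probability k / (i + 1).
                j = rng.randint(0, i)
                if j < k:
                    reservoir[j] = record
            i += 1
    return reservoir
```

The sampled records can then be written out with `writelines` on each record and passed to tophat in the usual way.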

    • #3
      Hi Dario,

      I was thinking along the same lines, so I ran the entire sequence file, which took about 8 hours for about 20 million sequences, but I got the same warning in the verbose output, which I am pasting below. So there is definitely something not right. I am using the UCSC mm9 genome downloaded from the TopHat website, if that matters. Here is my command and the verbose output:

      tophat -o AG_tophat_result_reads1 /Users/anatoleghazalpour/ExeFiles/bowtie-0.12.5/indexes/mm9 reads.1.fastq

      [Wed Jun 2 14:52:16 2010] Preparing output location AG_tophat_result_reads1/
      [Wed Jun 2 14:52:16 2010] Checking for Bowtie index files
      [Wed Jun 2 14:52:16 2010] Checking for reference FASTA file
      [Wed Jun 2 14:52:16 2010] Checking for Bowtie
      Bowtie version: 0.12.5.0
      [Wed Jun 2 14:52:16 2010] Checking reads
      seed length: 76bp
      format: fastq
      quality scale: phred33 (default)
      [Wed Jun 2 15:02:05 2010] Mapping reads against mm9 with Bowtie
      [Wed Jun 2 16:03:26 2010] Joining segment hits
      Splitting reads into 3 segments
      [Wed Jun 2 16:10:15 2010] Mapping reads against mm9 with Bowtie
      [Wed Jun 2 18:21:45 2010] Mapping reads against mm9 with Bowtie
      [Wed Jun 2 21:43:26 2010] Mapping reads against mm9 with Bowtie
      [Wed Jun 2 23:22:15 2010] Searching for junctions via segment mapping
      Warning: junction database is empty!
      [Wed Jun 2 23:25:35 2010] Joining segment hits
      [Wed Jun 2 23:32:40 2010] Reporting output tracks

      • #4
        Hello,
        I can't tell much... but in case you haven't already done so, have a look at the log files in the log directory. Is there anything suspicious? How many reads have been aligned?

        Good luck!
        Dario
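
One way to answer "how many reads have been aligned?" is to count them directly in the alignment output. The sketch below assumes the alignments are available as a SAM text file; the file name `accepted_hits.sam` in the usage note is an assumption, so check what the output directory actually contains. It relies only on the SAM format itself: header lines start with `@`, flag bit 0x4 marks an unmapped read, and an `N` operation in the CIGAR string marks a skipped reference region, which is how spliced alignments are represented.

```python
def summarize_sam(path):
    """Count distinct aligned reads and distinct spliced reads in a SAM file.

    Returns (n_aligned, n_spliced). A read counts as spliced if at least
    one of its alignments has an 'N' operation in the CIGAR string.
    """
    aligned = set()
    spliced = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # header line
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 11:
                continue  # not a complete alignment line
            qname, flag, cigar = fields[0], int(fields[1]), fields[5]
            if flag & 0x4:
                continue  # read is unmapped
            aligned.add(qname)
            if "N" in cigar:
                spliced.add(qname)
    return len(aligned), len(spliced)
```

For example, `summarize_sam("AG_tophat_result_reads1/accepted_hits.sam")` returns both counts. If many reads align but none are spliced, the problem is more likely in junction discovery than in the basic mapping step.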

        • #5
          Thanks, Dario. I looked at them and nothing stands out. Am I the only one who has this problem?

          What would help me is a set of reads from mouse RNA that is known to work, so that I could test whether my FASTQ file is incorrectly formatted or TopHat is not doing what it is supposed to. Any idea where I can get a FASTQ file that has been tested before?

          Thanks,

          Anatole
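
Independently of finding a known-good data set, the formatting of a FASTQ file can be sanity-checked directly. The following is a minimal sketch rather than a full validator. It assumes four-line records and phred33 qualities (as reported in the run log above), and the function name `check_fastq` and its parameters are hypothetical.

```python
def check_fastq(path, max_errors=10):
    """Run basic structural checks on a 4-line-per-record FASTQ file.

    Returns (n_records, errors), where errors is a list of messages.
    Checks: header starts with '@', separator starts with '+', sequence
    and quality strings have equal length, and quality characters fall
    in the printable range '!'..'~' used by phred33.
    """
    errors = []
    n_records = 0
    with open(path) as handle:
        while len(errors) < max_errors:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between records
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline().rstrip("\r\n")
            qual = handle.readline().rstrip("\r\n")
            n_records += 1
            if not header.startswith("@"):
                errors.append("record %d: header does not start with '@'" % n_records)
            if not plus.startswith("+"):
                errors.append("record %d: third line does not start with '+'" % n_records)
            if len(seq) != len(qual):
                errors.append("record %d: sequence and quality lengths differ" % n_records)
            if any(not (33 <= ord(c) <= 126) for c in qual):
                errors.append("record %d: quality character outside '!'..'~'" % n_records)
    return n_records, errors
```

An empty error list does not prove that the reads are good, only that the file has the expected layout. If qualities are actually encoded on a different scale, the alignment settings would need to match.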
