  • Could not find Bowtie2 index files (genome.*.bt2)

    Hi, this is my first time here. I'm currently teaching myself how to use some sequencing software and Ubuntu in general. I've been following the steps of this paper:



    I'm working on a server, and I currently have no idea what is already installed on it.

    I ran this command:

    tophat -p 4 -G genes.gtf -o C1_R1_thout genome C1_R1_1.fq C1_R1_2.fq

    and this is what showed afterwards:


    [2015-03-03 09:44:36] Beginning TopHat run (v2.0.9)
    -----------------------------------------------
    [2015-03-03 09:44:36] Checking for Bowtie
    Bowtie version: 2.1.0.0
    [2015-03-03 09:44:36] Checking for Samtools
    Samtools version: 0.1.19.0
    [2015-03-03 09:44:36] Checking for Bowtie index files (genome)..
    Error: Could not find Bowtie 2 index files (genome.*.bt2)

    How do I resolve this error?

    Thank you.

  • #2
    Did you download the genome index files from iGenomes (http://support.illumina.com/sequenci...e/igenome.html)? Look for the Ensembl Drosophila links. This is a big download. The files you need are in a directory called Bowtie2Index inside the unpacked directory hierarchy. The "genome" part of the command refers to the "basename" of the genome index: there will be several files whose names start with genome and then differ after the first dot.

    It also appears that you are using older versions of the Tuxedo suite programs. Before you go too far, perhaps you should ask your system administrators if they can update the software for you.
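
To illustrate the basename idea, here is a minimal sketch that checks whether index files exist for a given basename. It assumes a standard small Bowtie 2 index made of six files (genome.1.bt2 to genome.4.bt2 plus genome.rev.1.bt2 and genome.rev.2.bt2); large indexes use a .bt2l suffix and are not handled. The function name check_bt2_index is hypothetical and is not part of TopHat or Bowtie 2.

```shell
#!/bin/sh
# check_bt2_index BASENAME
# Hypothetical helper (not part of TopHat or Bowtie 2): reports whether the
# six files of a small Bowtie 2 index exist for the given basename.
check_bt2_index() {
    base=$1
    missing=0
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        if [ ! -f "$base.$suffix" ]; then
            # Name each missing file on stderr so the gap is easy to spot.
            echo "missing: $base.$suffix" >&2
            missing=1
        fi
    done
    # 0 if every file was found, 1 otherwise.
    return "$missing"
}
```

For example, `check_bt2_index /path_to/Bowtie2Index/genome && echo "index found"` would print the message only when all six files are present (with /path_to replaced by the real location).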



    • #3
      Hi GenoMax,

      Yes, I believe I already downloaded the iGenome and unpacked it. It was almost 2 GB, if I remember correctly. It was BDGP5.25.

      And yes, I'm basically using a test server to get familiar with things before I get to use real hardware. Gotta start washing dishes before getting to work as a chef, haha.



      • #4
        In that case, provide the full path to the index basename (/path_to/Bowtie2Index/genome) in your command above.



        • #5
          Do I add this path after the genes.gtf in the command? I'm sorry if this is a dumb question, I'm still very new to all of this.



          • #6
            Originally posted by Juntheboon:
            Do I add this path after the genes.gtf in the command? I'm sorry if this is a dumb question, I'm still very new to all of this.
            Like this:

            Code:
            $ tophat -p 4 -G /path_to/Annotations/Genes/genes.gtf -o C1_R1_thout /path_to/Bowtie2Index/genome C1_R1_1.fq C1_R1_2.fq
            Add the path to the genes.gtf file while you are at it.



            • #7
              I got this:

              $ tophat -p 4 -G /path_to/Annotations/Genes/genes.gtf -o C1_R1_thout /path_to/Bowtie2Index/genome C1_R1_1.fq C1_R1_2.fq

              [2015-03-03 10:21:58] Beginning TopHat run (v2.0.9)
              -----------------------------------------------
              [2015-03-03 10:21:58] Checking for Bowtie
              Bowtie version: 2.1.0.0
              [2015-03-03 10:21:58] Checking for Samtools
              Samtools version: 0.1.19.0
              Error: cannot find transcript file /path_to/Annotations/Genes/genes.gtf

              I checked in Nautilus whether I have the Bowtie 2 index, and I definitely do. Not sure what the issue is.



              • #8
                You need to change the "path_to" part to reflect the actual path you have on your system.
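
One way to find the real paths, as a rough sketch: it assumes the iGenomes archive was unpacked somewhere below a directory you are allowed to search, and that the index is a small Bowtie 2 index (.bt2 files). The function name find_bt2_basenames is hypothetical; find and sed are standard Unix tools.

```shell
#!/bin/sh
# find_bt2_basenames ROOT
# Hypothetical helper: searches ROOT for files ending in .1.bt2 (skipping the
# .rev.1.bt2 files) and prints each path with that suffix removed, which is
# the index basename to pass on the tophat command line.
find_bt2_basenames() {
    find "$1" -type f -name '*.1.bt2' ! -name '*.rev.1.bt2' 2>/dev/null |
        sed 's/\.1\.bt2$//'
}
```

Running `find_bt2_basenames "$HOME"` prints one candidate basename per line. The annotation file can be located the same way with `find "$HOME" -name genes.gtf`.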



                • #9
                  Got it. I think it worked this way.

                  tophat -p 4 -G /home/jun/Drosophila_melanogaster/Ensembl/BDGP5.25/Annotation/Genes/genes.gtf -o C1_R1_thout /home/jun/Drosophila_melanogaster/Ensembl/BDGP5.25/Sequence/Bowtie2Index/genome C1_R1_1.fq C1_R1_2.fq

                  Now I get another error:


                  [2015-03-03 10:36:24] Checking for Bowtie index files (genome)..
                  [2015-03-03 10:36:24] Checking for reference FASTA file
                  [2015-03-03 10:36:24] Generating SAM header for /home/jun/Drosophila_melanogaster/Ensembl/BDGP5.25/Sequence/Bowtie2Index/genome
                  Traceback (most recent call last):
                    File "/usr/bin/tophat", line 4072, in <module>
                      sys.exit(main())
                    File "/usr/bin/tophat", line 3926, in main
                      params.read_params = check_reads_format(params, reads_list)
                    File "/usr/bin/tophat", line 1829, in check_reads_format
                      zf = ZReader(f_name, params)
                    File "/usr/bin/tophat", line 1782, in __init__
                      self.file=open(filename)
                  IOError: [Errno 2] No such file or directory: 'C1_R1_1.fq'



                  • #10
                    If your sequence files are not in the directory you are running tophat from, then provide the full path to both of those files.

                    Start looking at this on the side to understand what we are doing here: http://korflab.ucdavis.edu/Unix_and_...ent.html#part1
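
For the missing read file error, a minimal sketch of a pre-flight check that could run before tophat. Both function names (require_files, abs_path) are hypothetical helpers, not TopHat features, and abs_path assumes the file's directory exists.

```shell
#!/bin/sh
# require_files FILE...
# Hypothetical pre-flight check: returns non-zero and names every input that
# does not exist, so the problem shows up before tophat starts.
require_files() {
    status=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "not found: $f (current directory: $(pwd))" >&2
            status=1
        fi
    done
    return "$status"
}

# abs_path FILE
# Prints FILE as an absolute path, assuming its directory exists.
abs_path() {
    dir=$(CDPATH= cd -- "$(dirname -- "$1")" && pwd) || return 1
    echo "$dir/$(basename -- "$1")"
}
```

With the file names from this thread, `require_files C1_R1_1.fq C1_R1_2.fq` fails and reports the working directory if the reads are somewhere else. Once you are in the directory that holds them, `abs_path C1_R1_1.fq` prints the full path to put on the tophat command line.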



                    • #11
                      Thank you for all of the help. This guide is awesome.
