  • Unable to run Tophat2

    Hi,

    I am unable to run Tophat2 as I get an error.

    Here is the command I run: tophat2 -p 5 -r 62 –library-type fr-firststrand -G /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/gene.gtf –o /home/jmotwani/RNASeq/Alignment_Tophat2 --BOWTIE2_INDEXES /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/ C95VLANXX-2046D-01-01-01_L003_R1_Trimmed.fastq C95VLANXX-2046D-01-01-01_L003_R2_Trimmed.fastq

    I get the following error:


    [2016-05-22 22:20:05] Beginning TopHat run (v2.1.1)
    -----------------------------------------------
    [2016-05-22 22:20:05] Checking for Bowtie
    Bowtie version: 2.2.9.0
    [2016-05-22 22:20:05] Checking for Bowtie index files (genome)..
    Error: Could not find Bowtie 2 index files (–library-type.*.bt2l)


    The indexed genome was downloaded from the Illumina iGenomes page. Do I have to build it after downloading it? I downloaded the genome, GTF file, and index files and gave the paths to those files in the command above.

    Could anyone please comment or advise on this?

    Thanks for your time.

    Regards, J

  • #2
    You need to provide a basename for the index files (which in this case should be genome). So the genome index file path becomes

    Code:
    /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome
    You don't need to include --BOWTIE2_INDEXES. That is a shell variable you could set beforehand.
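
    As a rough sketch of the basename idea (not TopHat's actual lookup code): the helper below, with the hypothetical name check_bt2_index, assumes the usual six-file Bowtie 2 layout and reports whether a complete set of .bt2 or .bt2l files exists for a given prefix. Passing the directory alone fails, while passing directory plus "genome" succeeds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a complete Bowtie 2 index exists for a basename.
# Assumption: the index uses the usual six-file layout, either small (.bt2) or large (.bt2l).
check_bt2_index() {
    local base="$1" ext suffix missing
    for ext in bt2 bt2l; do
        missing=0
        for suffix in 1 2 3 4 rev.1 rev.2; do
            # Every one of the six files must be present for this extension.
            [ -f "${base}.${suffix}.${ext}" ] || missing=1
        done
        if [ "$missing" -eq 0 ]; then
            echo "found ${ext} index for ${base}"
            return 0
        fi
    done
    echo "no complete index for ${base}" >&2
    return 1
}
```

    For example, check_bt2_index /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome should succeed if the six genome.*.bt2 files are in that folder.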


    • #3
      Thanks Genomax.

      I removed the BOWTIE2_INDEXES option and gave the path as suggested above but I still get the same error.

      I am wondering if it's due to a Bowtie2 index version incompatibility. Any comments?


      • #4
        Can you show us a listing of

        Code:
        $ ls -lh /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/


        • #5
          The Bowtie2Index folder has the following files:

          genome.1.bt2
          genome.2.bt2
          genome.3.bt2
          genome.4.bt2
          genome.fa
          genome.rev.1.bt2
          genome.rev.2.bt2
          tophat_out


          • #6
            Can you try the following? Looks like you had a single - in your library-type option before.

            Code:
            tophat2 -p 5 -r 62 --library-type fr-firststrand -G /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/gene.gtf –o /home/jmotwani/RNASeq/Alignment_Tophat2 /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome C95VLANXX-2046D-01-01-01_L003_R1_Trimmed.fastq C95VLANXX-2046D-01-01-01_L003_R2_Trimmed.fastq


            • #7
              I tried the following:

              tophat2 -p 5 -r 62 -–library-type fr-firststrand --GTF /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/gene.gtf –o /home/jmotwani/RNASeq/Alignment_Tophat2 /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome C95VLANXX-2046D-01-01-01_L003_R1_Trimmed.fastq C95VLANXX-2046D-01-01-01_L003_R2_Trimmed.fastq

              but now I get a different error:
              tophat: option -? not recognized
              for detailed help see http://ccb.jhu.edu/software/tophat/manual.shtml
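
              One thing that may be worth ruling out: some of the dashes in the commands as posted (for example in front of library-type and o) look like typographic en dashes rather than plain ASCII hyphens, which an option parser would not treat as an option prefix. The sketch below, with the hypothetical name find_non_ascii_args, flags any argument containing bytes outside printable ASCII. It is only a quick check under that assumption, not a diagnosis of this particular error.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list arguments that contain bytes outside printable ASCII
# (for example a pasted en dash instead of "-"). Returns 0 if any were found.
find_non_ascii_args() {
    local arg status=1
    for arg in "$@"; do
        # In the C locale, [^ -~] matches any byte outside the range space..tilde.
        if printf '%s\n' "$arg" | LC_ALL=C grep -q '[^ -~]'; then
            echo "suspicious argument: $arg"
            status=0
        fi
    done
    return "$status"
}
```

              Usage would be to paste the options after the function name, e.g. find_non_ascii_args -p 5 -r 62 –library-type fr-firststrand, and retype by hand any argument it reports.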


              • #8
                Try

                Code:
                tophat2 --num-threads 5 --mate-inner-dist 62 --library-type fr-firststrand --GTF /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/gene.gtf --output-dir /home/jmotwani/RNASeq/Alignment_Tophat2 /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome C95VLANXX-2046D-01-01-01_L003_R1_Trimmed.fastq C95VLANXX-2046D-01-01-01_L003_R2_Trimmed.fastq
                There is one last possibility: TopHat may not like the "-" characters in your FASTQ file names. If the above does not work, try changing those to "_".

                Code:
                tophat2 --num-threads 5 --mate-inner-dist 62 --library-type fr-firststrand --GTF /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/gene.gtf --output-dir /home/jmotwani/RNASeq/Alignment_Tophat2 /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome C95VLANXX_2046D_01_01_01_L003_R1_Trimmed.fastq C95VLANXX_2046D_01_01_01_L003_R2_Trimmed.fastq


                • #9
                  Thank you. The command worked this time, but only partly.
                  Command:

                  tophat2 --num-threads 5 --mate-inner-dist 62 --library-type fr-firststrand --GTF /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.gtf --output-dir /home/jmotwani/RNASeq/Alignment_Tophat2 /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome read1.fastq read2.fastq

                  Output:


                  [2016-05-23 22:07:53] Beginning TopHat run (v2.1.1)
                  -----------------------------------------------
                  [2016-05-23 22:07:53] Checking for Bowtie
                  Bowtie version: 2.2.9.0
                  [2016-05-23 22:07:53] Checking for Bowtie index files (genome)..
                  [2016-05-23 22:07:53] Checking for reference FASTA file
                  [2016-05-23 22:07:53] Generating SAM header for /home/jmotwani/mydata/Genomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome
                  [2016-05-23 22:07:55] Reading known junctions from GTF file
                  [2016-05-23 22:08:28] Preparing reads
                  [FAILED]
                  Error running 'prep_reads'
                  Error: qual length (111) differs from seq length (106) for fastq record !


                  • #10
                    That indicates that there is an error in your reads file. What trimming program did you use (and was it paired-end aware)?
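
                    To locate the offending records, something along these lines might help. It is a minimal sketch with hypothetical helper names, and it assumes strict 4-line FASTQ records (no wrapped sequence or quality lines). The first function prints each record whose quality string length differs from its sequence length; the second compares record counts between R1 and R2, which a paired-end aware trimmer should keep equal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print FASTQ records whose quality length differs from
# the sequence length. Assumes strict 4-line records. Exits 1 if any are found.
check_fastq_lengths() {
    awk '
        NR % 4 == 1 { name = $0 }             # header line
        NR % 4 == 2 { seqlen = length($0) }   # sequence line
        NR % 4 == 0 && length($0) != seqlen { # quality line
            printf "record %d (%s): seq=%d qual=%d\n", NR / 4, name, seqlen, length($0)
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Hypothetical helper: check that two FASTQ files hold the same number of records.
check_pair_counts() {
    local n1 n2
    n1=$(( $(wc -l < "$1") / 4 ))
    n2=$(( $(wc -l < "$2") / 4 ))
    echo "R1=${n1} R2=${n2}"
    [ "$n1" -eq "$n2" ]
}
```

                    Running check_fastq_lengths on each trimmed file, and check_pair_counts on the R1/R2 pair, should show whether the trimming step left the files inconsistent.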


                    • #11
                      I used an in-house script (cleanadaptors) to trim the raw fastq files. I ran the command to trim the data in the following way:

                      cleanadaptors -I /home/jmotwani/RNASeq/contam.fa -q 20 -x 25 -F C95VLANXX-2046D-01-01-01_L003_R1.fastq -o C95VLANXX-2046D-01-01-01_L003_R1_Trimmed.fastq -G C95VLANXX-2046D-01-01-01_L003_R2.fastq -O C95VLANXX-2046D-01-01-01_L003_R2_trimmed.fastq

                      -q is for quality and -x is for min length of the read


                      • #12
                        For that you are going to need to consult the person who wrote the script.

                        If you can't, then I suggest that you use a standard trimming program such as bbduk, trimmomatic, or cutadapt.


                        • #13
                          I will be using trimmomatic now. Thanks for all the help.
