  • Script error for TopHat & Bowtie 2 with paired-end reads for splice variant detection

    I have designed this script to run Bowtie2 on paired-end FASTQ files for a splice isoform investigation. However, it fails because it can't find the -1 input file. This is very strange, as the name is correct...
    Please help, as I'm not sure what to try.

    tophat -o 9-0GS_bowtie2_for_splicegrapher -j TAIR10_GFF3_genes.gff --no-convert-bam -a 10 -g 1 -F 0 -p 6 --segment-length=28 --segment-mismatches=0 -i 40 -I 5000 --min-coverage-intron=40 --max-coverage-intron=5000 --min-segment-intron=40 --max-segment-intron=5000 TAIR10_chr_all.bt2 -1 9-0GS_AGTCAA_run298_trimmed_1.fastq -2 9-0GS_AGTCAA_run298_trimmed_2.fastq


    [2012-08-09 15:17:04] Beginning TopHat run (v2.0.4)
    -----------------------------------------------
    [2012-08-09 15:17:04] Checking for Bowtie
    Bowtie version: 2.0.0.7
    [2012-08-09 15:17:04] Checking for Samtools
    Samtools version: 0.1.18.0
    [2012-08-09 15:17:04] Checking for Bowtie index files
    [2012-08-09 15:17:04] Checking for reference FASTA file
    Warning: Could not find FASTA file TAIR10_chr_all.bt2.fa
    [2012-08-09 15:17:04] Reconstituting reference FASTA file from Bowtie index
    Executing: /usr/local/bin/bowtie2-inspect TAIR10_chr_all.bt2 > 9-0GS_bowtie2_for_splicegrapher/tmp/TAIR10_chr_all.bt2.fa
    [2012-08-09 15:17:09] Generating SAM header for TAIR10_chr_all.bt2
    Traceback (most recent call last):
    File "/usr/local/bin/tophat", line 3948, in <module>
    sys.exit(main())
    File "/usr/local/bin/tophat", line 3808, in main
    params.read_params = check_reads_format(params, reads_list)
    File "/usr/local/bin/tophat", line 1725, in check_reads_format
    zf = ZReader(f_name, params)
    File "/usr/local/bin/tophat", line 1678, in __init__
    self.file=open(filename)
    IOError: [Errno 2] No such file or directory: '-1'

    But the directory is there, with the correct name. I've also tried giving the full path from the root directory and got the same message.

    Any ideas?

  • #2
    Originally posted by Richard Barker View Post
    tophat -o 9-0GS_bowtie2_for_splicegrapher -j TAIR10_GFF3_genes.gff --no-convert-bam -a 10 -g 1 -F 0 -p 6 --segment-length=28 --segment-mismatches=0 -i 40 -I 5000 --min-coverage-intron=40 --max-coverage-intron=5000 --min-segment-intron=40 --max-segment-intron=5000 TAIR10_chr_all.bt2 -1 9-0GS_AGTCAA_run298_trimmed_1.fastq -2 9-0GS_AGTCAA_run298_trimmed_2.fastq

    [snip]...
    Warning: Could not find FASTA file TAIR10_chr_all.bt2.fa
    ...[snip]
    Any ideas?
    Richard,

    It's not complaining about not being able to find the read file; it cannot find the FASTA file associated with your Bowtie index. This is because you passed the wrong name for the index file to Bowtie. Do not include the .bt2 suffix in the name of the index. Also, you need to use '-x' before the index name in your command line. You should specify the index as

    Code:
    -x TAIR10_chr_all
    And make sure that your BOWTIE_INDEXES environment variable is properly set so that it can find the index.

    ETA: Do not use -x! The index name and the read file names do not take option letters.
    Last edited by kmcarr; 08-12-2012, 03:59 PM. Reason: Incorrect parameter advice.
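
A quick way to check the index base name before launching a run is sketched below. This is only a minimal sketch and not part of TopHat or Bowtie 2: `index_base` and `check_bt2_index` are hypothetical helper names, and the check assumes a standard small Bowtie 2 index (six files ending in `.1.bt2` through `.rev.2.bt2`) that was built with the base name you pass in.

```shell
# Hypothetical helper: drop a trailing ".bt2" that was appended by mistake.
index_base() {
    printf '%s\n' "${1%.bt2}"
}

# Hypothetical helper: succeed only if all six files of a small Bowtie 2
# index with this base name exist (large indexes use ".bt2l" instead, which
# this sketch does not handle).
check_bt2_index() {
    base=$(index_base "$1")
    status=0
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -e "${base}.${suffix}.bt2" ]; then
            echo "missing: ${base}.${suffix}.bt2" >&2
            status=1
        fi
    done
    return "$status"
}

index_base "TAIR10_chr_all.bt2"   # prints TAIR10_chr_all
```

Running `check_bt2_index TAIR10_chr_all` in the directory that holds the index returns non-zero and lists the missing files if the base name does not match an index there.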


    • #3
      Thanks for the swift response.

      I replaced TAIR10_chr_all.bt2 with -x TAIR10_chr_all
      and that solved the warning about not finding the FASTA file. Thanks!

      But now it just says
      tophat: option -1 not recognized

      even when I give it the full path to the FASTQ files:

      richard@ubuntu:~/RNA_seq_analysis/paired_end$ tophat -o 9-0GS_bowtie2_for_splicegrapher --no-convert-bam -a 10 -g 1 -F 0 -p 6 --segment-length=28 --segment-mismatches=0 -i 40 -I 5000 --min-coverage-intron=40 --max-coverage-intron=5000 --min-segment-intron=40 --max-segment-intron=5000 -x TAIR10_chr_all -1 /home/richard/RNA_seq_analysis/paired_end/9-0GS_AGTCAA_run298_trimmed_1.fastq -2 /home/richard/RNA_seq_analysis/paired_end/9-0GS_AGTCAA_run298_trimmed_2.fastq
      tophat: option -1 not recognized


      • #4
        Originally posted by Richard Barker View Post
        Thanks for the swift response.

        I replaced TAIR10_chr_all.bt2 with -x TAIR10_chr_all
        and that solved the warning about not finding the FASTA file. Thanks!

        But now it just says
        tophat: option -1 not recognized

        even when I give it the full path to the FASTQ files:

        richard@ubuntu:~/RNA_seq_analysis/paired_end$ tophat -o 9-0GS_bowtie2_for_splicegrapher --no-convert-bam -a 10 -g 1 -F 0 -p 6 --segment-length=28 --segment-mismatches=0 -i 40 -I 5000 --min-coverage-intron=40 --max-coverage-intron=5000 --min-segment-intron=40 --max-segment-intron=5000 -x TAIR10_chr_all -1 /home/richard/RNA_seq_analysis/paired_end/9-0GS_AGTCAA_run298_trimmed_1.fastq -2 /home/richard/RNA_seq_analysis/paired_end/9-0GS_AGTCAA_run298_trimmed_2.fastq
        tophat: option -1 not recognized
        Richard,

        What version of TopHat are you running? When I saw in your original post mention of Bowtie2 I assumed you were running TopHat2. To determine which version just run the command:

        Code:
        tophat -v
        and report the output.


        • #5
          I ran tophat -v and it says I'm running TopHat v2.0.4.


          • #6
            Originally posted by Richard Barker View Post
            used tophat -v and says I'm running TopHat v2.0.4
            Ah...now I see the problem. You are running tophat but trying to pass command line options as if it were bowtie. You do not use -1 or -2 for tophat (you also should not use -x for the index; that was my mistake, sorry).

            Have a good look at the documentation for TopHat2. No option identifiers are used for the index or read files. They are the last three parameters passed to tophat, in the order shown. Your command should be in the format:

            Code:
            # tophat [options for tophat] <bowtie_index_name> <read1_file> <read2_file>
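
The argument order above can be sketched as a small shell function. This is only an illustration: `build_tophat_cmd` is a hypothetical name, it prints the command rather than running tophat, and it passes just the `-o` and `-p` options from the commands in this thread. The guard rejects anything that looks like an option letter in the last three positions, which is the mistake made with `-x`, `-1` and `-2`.

```shell
# Hypothetical wrapper: print a tophat command line with the index base name
# and the two read files as the last three, unflagged arguments.
build_tophat_cmd() {
    out_dir=$1
    index=$2
    reads1=$3
    reads2=$4
    # The index and read files are positional; refuse option-style values.
    for arg in "$index" "$reads1" "$reads2"; do
        case $arg in
            -*) echo "unexpected option-style argument: $arg" >&2; return 1 ;;
        esac
    done
    printf 'tophat -o %s -p 6 %s %s %s\n' "$out_dir" "$index" "$reads1" "$reads2"
}

build_tophat_cmd 9-0GS_bowtie2_for_splicegrapher TAIR10_chr_all \
    9-0GS_AGTCAA_run298_trimmed_1.fastq 9-0GS_AGTCAA_run298_trimmed_2.fastq
```

Printing the command first makes it easy to confirm that nothing follows the second read file before starting a long run.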


            • #7
              Thanks kmcarr, it's running now (if you're interested, the final script is below). It was designed to produce outputs that can be used by SpliceGrapher (Rogers et al. 2012, http://splicegrapher.sourceforge.net/).

              The script below has a lot of advanced options that were recommended by the creators of SpliceGrapher in order to optimise splice variant detection. Will the outputs also be suitable for cuffmerge and cuffdiff analysis?

              tophat -o 9-0GS_bowtie2_for_splicegrapher --no-convert-bam -a 10 -g 1 -F 0 -p 6 --segment-length=28 --segment-mismatches=0 -i 40 -I 5000 --min-coverage-intron=40 --max-coverage-intron=5000 --min-segment-intron=40 --max-segment-intron=5000 TAIR10_chr_all 9-0GS_AGTCAA_run298_trimmed_1.fastq 9-0GS_AGTCAA_run298_trimmed_2.fastq


              • #8
                It worked for the first 2 files, but when I went to run the second sample I got the following error message. I'm really not sure what to do.

                richard@ubuntu:~/RNA_seq_analysis/run299_paired_end_for_BowTie_2$ tophat -o 10-0MS_tophat_owtie2_for_splicegrapher --no-convert-bam -a 10 -g 1 -F 0 -p 6 --segment-length=28 --segment-mismatches=0 -i 40 -I 5000 --min-coverage-intron=40 --max-coverage-intron=5000 --min-segment-intron=40 --max-segment-intron=5000 TAIR10_chr_all.bt2 10-0MS_AGTTCC_L002_run298_R1_trimmed.fastq 10-0MS_AGTTCC_L002_run298_R2_trimmed.fastq

                [2012-08-12 14:40:38] Beginning TopHat run (v2.0.4)
                -----------------------------------------------
                [2012-08-12 14:40:38] Checking for Bowtie
                Bowtie version: 2.0.0.7
                [2012-08-12 14:40:38] Checking for Samtools
                Samtools version: 0.1.18.0
                [2012-08-12 14:40:38] Checking for Bowtie index files
                [2012-08-12 14:40:38] Checking for reference FASTA file
                Warning: Could not find FASTA file TAIR10_chr_all.bt2.fa
                [2012-08-12 14:40:38] Reconstituting reference FASTA file from Bowtie index
                Executing: /usr/local/bin/bowtie2-inspect TAIR10_chr_all.bt2 > 10-0MS_tophat_owtie2_for_splicegrapher/tmp/TAIR10_chr_all.bt2.fa
                [2012-08-12 14:40:43] Generating SAM header for TAIR10_chr_all.bt2
                format: fastq
                quality scale: phred33 (default)
                [2012-08-12 14:40:43] Preparing reads
                [FAILED]
                Error running 'prep_reads'
                Error: qual length (121) differs from seq length (87) for fastq record !


                • #9
                  Richard,

                  This is exactly the same error as your original post. Check the solution up-thread (but ignore the bit about adding -x).
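
Separately from the index name, the log in #8 stops at prep_reads with "qual length (121) differs from seq length (87)", which may point to a malformed record in one of the trimmed FASTQ files. A minimal sketch for locating such records follows. `check_fastq_lengths` is a hypothetical helper, and it assumes strict four-line FASTQ records (no wrapped sequence or quality lines).

```shell
# Hypothetical helper: report FASTQ records whose sequence and quality lines
# differ in length. Assumes strict four-line records.
check_fastq_lengths() {
    awk '
        BEGIN { bad = 0 }
        NR % 4 == 1 { name = $0 }              # header line, e.g. "@read_id"
        NR % 4 == 2 { seq_len = length($0) }   # sequence line
        NR % 4 == 0 && length($0) != seq_len { # quality line
            printf "%s: seq length %d, qual length %d\n", name, seq_len, length($0)
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Usage (file name taken from the command in #8):
#   check_fastq_lengths 10-0MS_AGTTCC_L002_run298_R1_trimmed.fastq
```

The function exits non-zero if any record is inconsistent, so it can be used as a check on both read files before starting tophat.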
