  • Tophat prep_reads error

    Has anyone seen this error before? I had Tophat working fine, but wanted to use the newly provided Illumina bowtie_index and I'm now getting this error:

    [Thu Aug 4 13:56:11 2011] Beginning TopHat run (v1.3.0)
    -----------------------------------------------
    [Thu Aug 4 13:56:11 2011] Preparing output location /m/illumina/tophat/
    [Thu Aug 4 13:56:11 2011] Checking for Bowtie index files
    [Thu Aug 4 13:56:11 2011] Checking for reference FASTA file
    [Thu Aug 4 13:56:11 2011] Checking for Bowtie
    Bowtie version: 0.12.7.0
    [Thu Aug 4 13:56:11 2011] Checking for Samtools
    Samtools Version: 0.1.8
    [Thu Aug 4 13:56:11 2011] Generating SAM header for /m/ref/genome
    [Thu Aug 4 13:56:40 2011] Preparing reads
    format: fastq
    quality scale: phred33 (default)
    [FAILED]
    Error retrieving prep_reads info.

    Or, in two of my samples, I get this error message instead:

    Error: qual length (103) differs from seq length (63) for fastq record !

    In case it matters, I have attached a slice of my pre-maq ill2sanger .txt file and of my post-maq .fastq file.

    My apologies if this has been answered before.
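
The "qual length differs from seq length" message suggests that at least one record has a quality string that does not match its sequence. A minimal sketch for locating such records follows. It assumes plain 4-line FASTQ records (no wrapped sequence lines), and the function name `find_length_mismatches` is made up for illustration; it is not part of TopHat or maq.

```python
import io
from itertools import islice


def find_length_mismatches(handle, limit=10):
    """Return up to `limit` (record_number, header, seq_len, qual_len) tuples
    for 4-line FASTQ records whose sequence and quality lengths differ."""
    problems = []
    record_number = 0
    while True:
        # Read the next four lines (header, sequence, '+', quality).
        record = [line.rstrip("\r\n") for line in islice(handle, 4)]
        if not record:
            break
        record_number += 1
        if len(record) < 4:
            # Truncated final record: flag it with lengths of -1.
            problems.append((record_number, record[0], -1, -1))
            break
        header, seq, _plus, qual = record
        if len(seq) != len(qual):
            problems.append((record_number, header, len(seq), len(qual)))
            if len(problems) >= limit:
                break
    return problems


# Tiny in-memory example: the second record has a quality string that is too long.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACG\n+\nIIIII\n"
)
mismatches = find_length_mismatches(example)
print(mismatches)  # [(2, '@read2', 3, 5)]
```

For a real file, pass an open text handle, e.g. `with open("reads.fastq") as fh: find_length_mismatches(fh)` (the file name is a placeholder). A mismatch in the converted file but not in the original would point at the conversion step rather than at TopHat.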

  • #2
    Same thing here. I have chmod 777 and 1 TB free on the tophat_out destination, I have rebuilt my reference genome several times, bowtie runs perfectly, and the same input files have been aligned successfully by TopHat on another machine. Here's my output:

    [Wed Jan 4 16:20:51 2012] Beginning TopHat run (v1.3.3)
    -----------------------------------------------
    [Wed Jan 4 16:20:51 2012] Preparing output location /media/data/grimmer/tophat_out//
    [Wed Jan 4 16:20:51 2012] Checking for Bowtie index files
    [Wed Jan 4 16:20:51 2012] Checking for reference FASTA file
    [Wed Jan 4 16:20:51 2012] Checking for Bowtie
    Bowtie version: 0.12.7.0
    [Wed Jan 4 16:20:51 2012] Checking for Samtools
    Samtools Version: 0.1.18
    [Wed Jan 4 16:20:51 2012] Generating SAM header for /media/data/genomes/hg19.ebwt/hg19
    [Wed Jan 4 16:20:52 2012] Preparing reads
    format: fastq
    quality scale: phred33 (default)
    [FAILED]
    Error retrieving prep_reads info.



    • #3
      Strangely enough, I have just run into this error for the first time:

      [2012-04-30 16:01:30] Preparing reads
      [FAILED]
      Error running 'prep_reads'
      terminate called after throwing an instance of 'int'


      -Abhi

      Full output below

      [2012-04-30 16:01:23] Beginning TopHat run (v2.0.0)
      -----------------------------------------------
      [2012-04-30 16:01:23] Checking for Bowtie
      Bowtie version: 2.0.0.5
      [2012-04-30 16:01:23] Checking for Samtools
      Samtools version: 0.1.18.0
      [2012-04-30 16:01:23] Checking for Bowtie index files
      [2012-04-30 16:01:23] Checking for reference FASTA file
      Warning: Could not find FASTA file ../../../reference/Chlamy_V5/reference_bowtie2/Chalmy_reinhardtii_110311_v5.fasta.fa
      [2012-04-30 16:01:23] Reconstituting reference FASTA file from Bowtie index
      Executing: /jgi/tools/bin/bowtie2-inspect ../../../reference/Chlamy_V5/reference_bowtie2/Chalmy_reinhardtii_110311_v5.fasta > tophat_v2_out/tmp/Chalmy_reinhardtii_110311_v5.fasta.fa
      [2012-04-30 16:01:29] Generating SAM header for ../../../reference/Chlamy_V5/reference_bowtie2/Chalmy_reinhardtii_110311_v5.fasta
      format: fastq
      quality scale: phred64 (reads generated with GA pipeline version >= 1.3)
      [2012-04-30 16:01:30] Preparing reads
      [FAILED]
      Error running 'prep_reads'
      terminate called after throwing an instance of 'int'



      • #4
        Prep-read errors: Tophat

        I have attached the first 20 lines of my Illumina sequence reads (sample.txt); these are 100-base single-end reads from Illumina's new pipeline (version 1.8). The sequences are in FASTQ format, and all sequences (raw and pass-filter reads) are in a single file. Quality scores are in Sanger FASTQ format: the offset is ASCII 33 instead of the previous Illumina quality-score offset (ASCII 64).
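
One way to confirm which offset a file actually uses is to look at the range of characters in its quality lines. The sketch below is a heuristic only, not something TopHat itself provides: it assumes 4-line FASTQ records, treats any character below ASCII 59 as proof of offset 33, and treats any character above ASCII 74 ('J', the usual top of Illumina 1.8 output) as a sign of offset 64. The name `guess_quality_offset` is hypothetical.

```python
import io


def guess_quality_offset(handle, max_records=10000):
    """Guess the ASCII offset (33 or 64) from the quality lines of a 4-line FASTQ.
    Returns None when the observed range is ambiguous or no qualities were seen."""
    lowest, highest = 255, 0
    for index, line in enumerate(handle):
        if index // 4 >= max_records:
            break
        if index % 4 == 3:  # every fourth line holds the quality string
            codes = [ord(ch) for ch in line.rstrip("\r\n")]
            if codes:
                lowest = min(lowest, min(codes))
                highest = max(highest, max(codes))
    if lowest < 59:
        # Characters below ';' (ASCII 59) cannot occur in offset-64 encodings.
        return 33
    if highest > 74:
        # Characters above 'J' (ASCII 74) are not expected from an offset-33 Illumina 1.8 run.
        return 64
    return None


# In-memory examples: '#' (ASCII 35) implies offset 33, 'h' (ASCII 104) implies offset 64.
print(guess_quality_offset(io.StringIO("@r1\nACGT\n+\n#5?I\n")))  # 33
print(guess_quality_offset(io.StringIO("@r1\nACGT\n+\nhhgf\n")))  # 64
```

If this reports 33 for the reads, then a flag that declares a phred64 scale would not match the data.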

        I ran a trial TopHat run on this culled-out subset of reads.

        Command used:
        Code:
        tophat -o ./tophat_out_test_2 -p4 --segment-length 50 --solexa1.3-quals  ../Genome/genome-index P1-test
        I get the following error:
        ...................
        format: fastq
        quality scale: phred64 (reads generated with GA pipeline version >= 1.3)
        [2012-12-05 16:22:18] Preparing reads
        [FAILED]
        Error running 'prep_reads'
        terminate called after throwing an instance of 'int'
        ...................


        I am not sure whether this is caused by something in the header lines or by the different quality scores. Can anybody help me figure this out?

        Thanks


        • #5
          Illumina V 1.8 and Solexa quality

          In Illumina v1.8, the read quality scores are in Sanger FASTQ format (an offset of ASCII 33, not 64), so using
          Code:
           --solexa-quals
          instead of
          Code:
          --solexa1.3-quals
          seems to work! I am still not sure if this is OK though.

          Full command that now works:
          Code:
          tophat -o ./tophat_out_test_2 -p4 --segment-length 50 --solexa-quals  ../Genome/genome-index P1-test
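
On the question of whether this is OK: the logs earlier in this thread show TopHat reporting "quality scale: phred33 (default)" when no quality flag is given, and as far as I know `--solexa-quals` selects the older Solexa scale rather than phred33. So for offset-33 reads it may be safer to pass no quality flag at all. The sketch below builds the command on that assumption. The helper name `tophat_command` and its parameters are made up for illustration, and it only assembles the argument list; it does not run TopHat.

```python
def tophat_command(index_base, reads, offset, output_dir="./tophat_out",
                   threads=4, extra_args=()):
    """Build a TopHat argument list, adding a quality flag only for offset-64 reads."""
    command = ["tophat", "-o", output_dir, "-p", str(threads), *extra_args]
    if offset == 64:
        # Phred+64 reads (GA pipeline >= 1.3), as in the logs quoted in this thread.
        command.append("--solexa1.3-quals")
    elif offset != 33:
        raise ValueError(f"unsupported quality offset: {offset}")
    # offset 33: assumed to need no flag, since the logs report "phred33 (default)".
    command += [index_base, reads]
    return command


# Same paths and options as the command above, but with no quality flag for offset-33 reads.
args = tophat_command("../Genome/genome-index", "P1-test", 33,
                      output_dir="./tophat_out_test_2",
                      extra_args=("--segment-length", "50"))
print(" ".join(args))
```

The resulting list can be handed to `subprocess.run(args, check=True)` once the paths are confirmed to exist.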
