  • tophat issue-string index out of range

    I used tophat to map my RNA seq reads against a reference geneset and it didn't finish.

    The error message came after "Generating SAM header":
    ---------------------------------------------------------
    [2013-03-07 10:31:22] Generating SAM header for /gscmnt/gc6108/research/seqana/species_independant/xgao/a_suum/database/asgenes2
    Traceback (most recent call last):
    File "/usr/lib/tophat2.0.0/bin/tophat", line 3778, in <module>
    sys.exit(main())
    File "/usr/lib/tophat2.0.0/bin/tophat", line 3645, in main
    params.read_params = check_reads_format(params, reads_list)
    File "/usr/lib/tophat2.0.0/bin/tophat", line 1679, in check_reads_format
    freader=FastxReader(zf.file, params.read_params.color, zf.fname)
    File "/usr/lib/tophat2.0.0/bin/tophat", line 1427, in __init__
    while hlines>0 and self.lastline[0] not in "@>" :

    IndexError: string index out of range
    -------------------------------------------
    I am using tophat v2.0.0

    I ran 20 sets of data and only one came back with this error message.

    Could anyone help me figure out why and how to get over it?

    Thanks a lot!

  • #2
    Capricy, I am getting this problem too. Did you ever solve it? If so, any advice on how you did it would be great.

    • #3
      Did anyone figure this issue out? The only thing I'm doing differently is that I'm using ligated index adaptors and trimming the first 4 bases off the left reads.

      Any input would be greatly appreciated! (Using Tophat 2.0.4)

      • #4
        Artur,
        In my case, I got this error from Tophat on reads trimmed with the scythe adapter trimmer (http://biowulf.nih.gov/apps/scythe.html). However, Tophat had no problem when I used another trimmer: http://www.usadellab.org/cms/index.php?page=trimmomatic. So I think the problem is that Tophat cannot take your trimmed reads.

        • #5
          ttkuaile,
          Unfortunately, using the trimmomatic trimmer didn't resolve this issue for me. I'm using tophat version 2.0.8b with bowtie version 2.1.0 on a Mac, with the pre-made Arabidopsis indexes from the Bowtie2 site. However, I'm ending up with the same error (arab is the "index_base" label):

          [2013-05-28 22:08:35] Beginning TopHat run (v2.0.8b)
          -----------------------------------------------
          [2013-05-28 22:08:35] Checking for Bowtie
          Bowtie version: 2.1.0.0
          [2013-05-28 22:08:35] Checking for Samtools
          Samtools version: 0.1.18.0
          [2013-05-28 22:08:35] Checking for Bowtie index files
          [2013-05-28 22:08:35] Checking for reference FASTA file
          [2013-05-28 22:08:35] Generating SAM header for arab
          Traceback (most recent call last):
          File "/usr/local/bin/tophat-2.0.8b.OSX_x86_64/tophat", line 4030, in <module>
          sys.exit(main())
          File "/usr/local/bin/tophat-2.0.8b.OSX_x86_64/tophat", line 3885, in main
          params.read_params = check_reads_format(params, reads_list)
          File "/usr/local/bin/tophat-2.0.8b.OSX_x86_64/tophat", line 1825, in check_reads_format
          freader=FastxReader(zf.file, params.read_params.color, zf.fname)
          File "/usr/local/bin/tophat-2.0.8b.OSX_x86_64/tophat", line 1570, in __init__
          while hlines>0 and self.lastline[0] not in "@>" :
          IndexError: string index out of range

          I used the following options with trimmomatic:
          phred33
          HEADCROP:5 SLIDINGWINDOW:3:30 MINLEN:36


          Thanks in advance for the help!

          Cheers.

          • #6
            I had the same error here. In my case the unpaired output of my trimming program was the problem, because it was not named properly as a gzip file (I wrote "sample.gzz"). After I fixed the name, everything worked.
            Check the tophat.log that TopHat creates; it normally says what the problem is.

            Cheers
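
Both tracebacks in this thread stop at `self.lastline[0]`, and indexing position 0 of an empty string is what raises `IndexError: string index out of range` in Python. So it may be worth checking the reads files before starting a run. Below is a minimal sketch of such a check. It assumes plain or gzip-compressed FASTA/FASTQ input. `check_reads_file` is a hypothetical helper written for this thread, not part of TopHat, and it only looks at the first record.

```python
import gzip
import os

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def check_reads_file(path):
    """Return a list of problems found in a FASTA/FASTQ reads file.

    Hypothetical pre-flight helper, not part of TopHat. An empty list
    means none of the checks below failed.
    """
    if not os.path.isfile(path):
        return ["file not found: " + path]
    if os.path.getsize(path) == 0:
        return ["file is empty (0 bytes)"]

    problems = []

    # Detect compression from the content, not from the file name.
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == GZIP_MAGIC

    if is_gzip and not path.endswith(".gz"):
        problems.append("content is gzip-compressed but the name does not end in .gz")
    if not is_gzip and path.endswith(".gz"):
        problems.append("name ends in .gz but the content is not gzip-compressed")

    # Find the first non-blank line of the (decompressed) content.
    opener = gzip.open if is_gzip else open
    first = ""
    try:
        with opener(path, "rt", errors="replace") as handle:
            for line in handle:
                if line.strip():
                    first = line.strip()
                    break
    except (OSError, EOFError) as exc:
        problems.append("could not read file: %s" % exc)
        return problems

    if not first:
        problems.append("file contains no reads (no non-blank lines)")
    elif first[0] not in "@>":
        problems.append("first record does not start with '@' or '>'")
    return problems
```

For example, `check_reads_file("sample.gzz")` on a gzip-compressed file with that misspelled extension would report the name mismatch (the file name here is just the one mentioned in post #6). Running it on every reads file passed to TopHat, paired and unpaired, should flag an empty or misnamed file before the run starts.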
