
  • #16
    I was able to get tophat running on colorspace reads by doing what many others have mentioned: removing comment lines in the .csfasta and .qual files, and converting all -1 qualities to 0.
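The two clean-up steps described above can be sketched in Python. This is a minimal illustration, not the script the poster used. It assumes comment lines start with `#`, record headers start with `>`, and quality values are whitespace-separated integers; the function names `strip_comments` and `clamp_negative_quals` are made up for this example.

```python
def strip_comments(lines):
    """Return lines with '#' comment lines removed and trailing newlines stripped."""
    return [line.rstrip("\n") for line in lines if not line.startswith("#")]


def clamp_negative_quals(lines):
    """Clean a .qual file: drop comments and replace each -1 quality with 0.

    Header lines (starting with '>') are passed through unchanged.
    """
    cleaned = []
    for line in strip_comments(lines):
        if line.startswith(">"):
            cleaned.append(line)
        else:
            # Assumption: qualities are whitespace-separated integer tokens.
            cleaned.append(" ".join("0" if v == "-1" else v for v in line.split()))
    return cleaned
```

Both functions accept any iterable of lines, so an open file handle can be passed directly, with `strip_comments` applied to the .csfasta file and `clamp_negative_quals` to the .qual file.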

    On 5 of my 8 datasets, tophat works great. On the remaining 3 I get the following error.

    Code:
    Traceback (most recent call last):
      File "/share/apps/tophat-1.1.0.Linux_x86_64/tophat", line 2174, in ?
        sys.exit(main())
      File "/share/apps/tophat-1.1.0.Linux_x86_64/tophat", line 2133, in main
        user_supplied_juncs)
      File "/share/apps/tophat-1.1.0.Linux_x86_64/tophat", line 1848, in spliced_alignment
        segment_len)
      File "/share/apps/tophat-1.1.0.Linux_x86_64/tophat", line 1570, in split_reads
        split_record(read_name, read_seq, read_quals, output_files, offsets, color)
      File "/share/apps/tophat-1.1.0.Linux_x86_64/tophat", line 1503, in split_record
        read_seq_temp = convert_color_to_bp(read_seq)
      File "/share/apps/tophat-1.1.0.Linux_x86_64/tophat", line 1477, in convert_color_to_bp
        base = decode_dic[base+ch]
    KeyError: 'GN'
    Has anyone else seen an error like this? Is there a fix available?

    Comment


    • #17
      Originally posted by hyjkim View Post
      Has anyone else seen an error like this? Is there a fix available?
      Yes, I ran into the same thing. I just posted my fix (to the source code) on this thread: http://seqanswers.com/forums/showthread.php?p=26692

      Comment


      • #18
        Originally posted by Daehwan View Post
        Hi guys,

        I'm Daehwan. I introduced this bug and have fixed it; you can grab a fixed version at http://tophat.cbcb.umd.edu/index.html

        Thanks
        Hi,
        I ran tophat-1.1.1 on my single-end SOLiD data, but I encountered the following error.


        Traceback (most recent call last):
          File "/mlab/software/tophat-1.1.1.Linux_x86_64/tophat", line 2166, in <module>
            sys.exit(main())
          File "/mlab/software/tophat-1.1.1.Linux_x86_64/tophat", line 2125, in main
            user_supplied_juncs)
          File "/mlab/software/tophat-1.1.1.Linux_x86_64/tophat", line 1840, in spliced_alignment
            segment_len)
          File "/mlab/software/tophat-1.1.1.Linux_x86_64/tophat", line 1562, in split_reads
            split_record(read_name, read_seq, read_quals, output_files, offsets, color)
          File "/mlab/software/tophat-1.1.1.Linux_x86_64/tophat", line 1495, in split_record
            read_seq_temp = convert_color_to_bp(read_seq)
          File "/mlab/software/tophat-1.1.1.Linux_x86_64/tophat", line 1469, in convert_color_to_bp
            base = decode_dic[base+ch]
        KeyError: 'CN'

        Comment


        • #19
          Hey Nameeta,

          I also get the same types of errors with tophat v1.1.1. I am able to run tophat using v1.1.0 and dcjones' patch. I'd recommend you go that route until tophat fixes the solid-style "." wildcard errors.

          Comment


          • #20
            Thanks, I was able to fix it by changing tophat.py and then recompiling.

            def convert_color_to_bp(color_seq):
                decode_dic = { 'A0':'A', 'A1':'C', 'A2':'G', 'A3':'T', 'A4':'N', 'A.':'N', 'AN':'N',
                               'C0':'C', 'C1':'A', 'C2':'T', 'C3':'G', 'C4':'N', 'C.':'N', 'CN':'N',
                               'G0':'G', 'G1':'T', 'G2':'A', 'G3':'C', 'G4':'N', 'G.':'N', 'GN':'N',
                               'T0':'T', 'T1':'G', 'T2':'C', 'T3':'A', 'T4':'N', 'T.':'N', 'TN':'N',
                               'N0':'N', 'N1':'N', 'N2':'N', 'N3':'N', 'N4':'N', 'N.':'N', 'NN':'N' }
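For anyone wanting to test the extended table outside of tophat, here is a self-contained sketch. The dictionary is the one posted above. The decoding loop is an assumption reconstructed from the `base = decode_dic[base+ch]` line in the tracebacks, and the real function body in tophat may differ.

```python
def convert_color_to_bp(color_seq):
    """Decode a colorspace read (primer base followed by colors) to base space.

    The lookup table is the patched one from this thread; the loop below is a
    guess at how the table is used, not a copy of tophat's source.
    """
    decode_dic = {
        'A0': 'A', 'A1': 'C', 'A2': 'G', 'A3': 'T', 'A4': 'N', 'A.': 'N', 'AN': 'N',
        'C0': 'C', 'C1': 'A', 'C2': 'T', 'C3': 'G', 'C4': 'N', 'C.': 'N', 'CN': 'N',
        'G0': 'G', 'G1': 'T', 'G2': 'A', 'G3': 'C', 'G4': 'N', 'G.': 'N', 'GN': 'N',
        'T0': 'T', 'T1': 'G', 'T2': 'C', 'T3': 'A', 'T4': 'N', 'T.': 'N', 'TN': 'N',
        'N0': 'N', 'N1': 'N', 'N2': 'N', 'N3': 'N', 'N4': 'N', 'N.': 'N', 'NN': 'N',
    }
    base = color_seq[0]      # the leading primer base
    bp_seq = base
    for ch in color_seq[1:]:
        # Each color is decoded relative to the previously decoded base.
        base = decode_dic[base + ch]
        bp_seq += base
    return bp_seq
```

With the `'AN'`, `'CN'`, `'GN'`, `'TN'` and `'NN'` entries present, a read containing an `N` color decodes to `N` from that position onward instead of raising the `KeyError: 'GN'` or `KeyError: 'CN'` reported earlier in the thread.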

            Comment


            • #21
              Tophat error

              I have a problem running tophat on SOLiD colorspace single-end 50 bp data. Has anyone encountered this error? I have run 5 samples, but only the last one fails. I don't know why: the data are all in the same format, and the only difference is that the last sample is somewhat larger than the others, about 6G of data. Does anyone know the reason and how to solve this problem? Thank you very much. The error is as follows:
              Traceback (most recent call last):
                File "/share/disk7-3/wuzhygroup/liwh/tophat/bin/tophat", line 2223, in ?
                  sys.exit(main())
                File "/share/disk7-3/wuzhygroup/liwh/tophat/bin/tophat", line 2181, in main
                  user_supplied_juncs)
                File "/share/disk7-3/wuzhygroup/liwh/tophat/bin/tophat", line 1891, in spliced_alignment
                  segment_len)
                File "/share/disk7-3/wuzhygroup/liwh/tophat/bin/tophat", line 1613, in split_reads
                  split_record(read_name, read_seq, read_quals, output_files, offsets, color)
                File "/share/disk7-3/wuzhygroup/liwh/tophat/bin/tophat", line 1546, in split_record
                  read_seq_temp = convert_color_to_bp(read_seq)
                File "/share/disk7-3/wuzhygroup/liwh/tophat/bin/tophat", line 1520, in convert_color_to_bp
                  base = decode_dic[base+ch]
              KeyError: 'GN'

              Comment


              • #22
                Originally posted by DerSeb View Post
                Hello. I can now successfully start and get past the first error encountered here. However, I still run into the next error mentioned above:

                Code:
                [Tue Oct  5 16:10:32 2010] Beginning TopHat run (v1.1.0)
                -----------------------------------------------
                [Tue Oct  5 16:10:32 2010] Preparing output location /home/schaefer/tophat/RBM20/Sample14//
                [Tue Oct  5 16:10:32 2010] Checking for Bowtie index files
                [Tue Oct  5 16:10:32 2010] Checking for reference FASTA file
                [Tue Oct  5 16:10:32 2010] Checking for Bowtie
                	Bowtie version:			 0.12.3.0
                [Tue Oct  5 16:10:32 2010] Checking for Samtools
                	Samtools version:		 0.1.8.0
                [Tue Oct  5 16:10:39 2010] Checking reads
                
                Error encountered parsing file ...fastq:
                 Length mismatch between sequence and quality strings for 853_8_25/1 (49 vs 49).
                When I check the fastq file, everything seems fine:

                Code:
                @853_8_25/1
                GNNGTGNTNCANNCGTNNGAGNNCACNNACANCCGANNACGNAAAGNAN
                +
                *""%%%"%"%%""%%%""%)&""%%%""%%+"&'%&""'(%"'))'"&"
                @853_8_35/1
                CNNACGNANACNNACCNNCCGNNTAANNNNGNGAACNNCNANCNCNNTN
                +
                :""=54"@"=+""A98""745"";98""""2"=@>8""<"4"<"6"";"
                @853_8_75/1
                GNNACCNCNTCNNAACNNTACNNCGANNGTGNGGACNNGTCNCGAGNCN
                +
                ="";<7"0";4""=;:"">;=""94<"".,5".;26""9%)"(%(("("
                @853_8_96/1
                ...
                Is this error still occurring for anyone else?
                I got the exact same error. What is worse, I was processing 8 files and got this error for 5 of them, but TopHat worked fine for the other 3. This has me completely stumped.
                Sameet Mehta (Ph.D.),
                Visiting Fellow,
                National Cancer Institute,
                Bethesda,
                US.

                Comment


                • #23
                  Originally posted by sameet View Post
                  I got the exact same error. What is worse, I was processing 8 files and got this error for 5 of them, but TopHat worked fine for the other 3. This has me completely stumped.
                  Folks, I am getting exactly the same error for 8 out of 8 FASTQ files that were generated using CASAVA 1.8. BWA aligns everything OK without any complaints.
                  But TopHat gives me the following:


                  [Tue Sep 20 16:29:29 2011] Beginning TopHat run (v1.2.0)
                  -----------------------------------------------
                  [Tue Sep 20 16:29:29 2011] Preparing output location tophat83//
                  [Tue Sep 20 16:29:29 2011] Checking for Bowtie index files
                  [Tue Sep 20 16:29:29 2011] Checking for reference FASTA file
                  [Tue Sep 20 16:29:29 2011] Checking for Bowtie
                  Bowtie version: 0.12.7.0
                  [Tue Sep 20 16:29:29 2011] Checking for Samtools
                  Samtools Version: 0.1.12a
                  [Tue Sep 20 16:29:46 2011] Checking reads

                  Error encountered parsing file ../sample-R32011-083pf.fastq:
                  Length mismatch between sequence and quality strings for HostName:40:FC:1:1:4494:1059 1:Y:0: (118 vs 192).

                  Comment
