  • krobison
    Senior Member
    • Nov 2007
    • 734

    TopHat 1.1 failing on colorspace SE reads

    I'm trying to analyze a single-end dataset from SRA with the brand-new version of TopHat (1.1). TopHat crashes with the error message below. Judging from the message and the code, it appears that even with single-end reads TopHat tries to run a validation check on the second set of reads.


    Code:
      File "/usr/local/bin/tophat", line 2093, in main
        params.read_params = check_reads(params.read_params, left_reads_list + "," + right_reads_list)
    TypeError: cannot concatenate 'str' and 'NoneType' objects

    If I replace line 2093 with

    Code:
            if not params.skip_check_reads:
                # Single-end runs have no right-hand reads, so only join when they exist
                if right_reads_list is not None:
                    params.read_params = check_reads(params.read_params, left_reads_list + "," + right_reads_list)
                else:
                    params.read_params = check_reads(params.read_params, left_reads_list)
    Then I get farther but hit a new error:

    Code:
    Length mismatch between sequence and quality strings for SRR040290.1 VAB_ugc_85__100_137__138_121__123_bc_Frag50_solid0032_20090715_ugc_121__1231_49_36 length=50 (51 vs 51).
    I'm too worn out to puzzle out how to get past that one. My best guess is that it is related to the "extra" colorspace value that the bowtie option "--col-keepends" deals with.
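For anyone who wants to see the guard from the patch above in isolation, here is a minimal sketch. The helper name `join_read_lists` is hypothetical and is not part of TopHat; it only illustrates joining the left and right read lists without tripping the `TypeError` when the right list is `None`.

```python
def join_read_lists(left_reads_list, right_reads_list=None):
    """Join comma-separated read-file lists, skipping a missing right-hand list.

    Hypothetical helper for illustration only; TopHat itself does not define it.
    """
    # Keep only the lists that were actually supplied (drops None and "")
    parts = [p for p in (left_reads_list, right_reads_list) if p]
    return ",".join(parts)


if __name__ == "__main__":
    # Single-end run: no right-hand reads were given
    print(join_read_lists("reads.csfasta", None))
    # Paired-end run: both sides are present
    print(join_read_lists("left.fq", "right.fq"))
```

Filtering before joining keeps the single-end and paired-end cases on one code path, which is one way to avoid the `if`/`else` duplication in the patch.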
  • kopi-o
    Senior Member
    • Feb 2008
    • 319

    #2
    Yep, I just got the same error message (the first one; I haven't tried modifying the code). I'm also using single-end colorspace reads (.csfasta + .qual).


    • Telor
      Junior Member
      • Jul 2010
      • 6

      #3
      Yes, just to confirm that you are not the only one: I'm getting this error too, but on standard single-end reads (not colorspace).


      • sunnyjoy
        Junior Member
        • Sep 2009
        • 1

        #4
        A Google search for the same error message led me here. Same problem with single-end colorspace reads.


        • DerSeb
          Member
          • Oct 2009
          • 44

          #5
          same here.


          • Daehwan
            Member
            • Oct 2010
            • 27

            #6
            Hi guys,

            I'm Daehwan, the one who introduced this bug and then fixed it. You can grab a fixed version at http://tophat.cbcb.umd.edu/index.html

            Thanks


            • krobison
              Senior Member
              • Nov 2007
              • 734

              #7
              Thanks for the rapid fix!!


              • DerSeb
                Member
                • Oct 2009
                • 44

                #8
                Great, I will try it right now! Thanks for the great support!


                • Telor
                  Junior Member
                  • Jul 2010
                  • 6

                  #9
                  Great. Thanks for the speedy fix. It looks like it's working fine so far (fingers crossed).


                  • DerSeb
                    Member
                    • Oct 2009
                    • 44

                    #10
                    Hello. I can now start TopHat successfully and get past the first error encountered here. However, I still run into the second error mentioned above:

                    Code:
                    [Tue Oct  5 16:10:32 2010] Beginning TopHat run (v1.1.0)
                    -----------------------------------------------
                    [Tue Oct  5 16:10:32 2010] Preparing output location /home/schaefer/tophat/RBM20/Sample14//
                    [Tue Oct  5 16:10:32 2010] Checking for Bowtie index files
                    [Tue Oct  5 16:10:32 2010] Checking for reference FASTA file
                    [Tue Oct  5 16:10:32 2010] Checking for Bowtie
                    	Bowtie version:			 0.12.3.0
                    [Tue Oct  5 16:10:32 2010] Checking for Samtools
                    	Samtools version:		 0.1.8.0
                    [Tue Oct  5 16:10:39 2010] Checking reads
                    
                    Error encountered parsing file ...fastq:
                     Length mismatch between sequence and quality strings for 853_8_25/1 (49 vs 49).
                    When I check the fastq file, everything seems fine:

                    Code:
                    @853_8_25/1
                    GNNGTGNTNCANNCGTNNGAGNNCACNNACANCCGANNACGNAAAGNAN
                    +
                    *""%%%"%"%%""%%%""%)&""%%%""%%+"&'%&""'(%"'))'"&"
                    @853_8_35/1
                    CNNACGNANACNNACCNNCCGNNTAANNNNGNGAACNNCNANCNCNNTN
                    +
                    :""=54"@"=+""A98""745"";98""""2"=@>8""<"4"<"6"";"
                    @853_8_75/1
                    GNNACCNCNTCNNAACNNTACNNCGANNGTGNGGACNNGTCNCGAGNCN
                    +
                    ="";<7"0";4""=;:"">;=""94<"".,5".;26""9%)"(%(("("
                    @853_8_96/1
                    ...
                    Is anyone else still seeing this error?
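A quick way to check a FASTQ file by hand is to compare the sequence and quality line lengths record by record. The sketch below is an independent sanity check, not TopHat's own validation code, and it assumes strict four-line records (no wrapped sequence lines). The function name `find_length_mismatches` is made up for this example.

```python
import io


def find_length_mismatches(handle):
    """Yield (read_name, seq_len, qual_len) for records whose lengths differ.

    Assumes strict 4-line FASTQ records: header, sequence, '+', quality.
    """
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines between records
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line, ignored here
        qual = handle.readline().rstrip("\r\n")
        if len(seq) != len(qual):
            yield header.strip()[1:], len(seq), len(qual)


if __name__ == "__main__":
    # Tiny made-up example; the second record has a short quality string
    demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIII\n")
    for name, seq_len, qual_len in find_length_mismatches(demo):
        print(name, seq_len, qual_len)
```

If this reports nothing for a file that TopHat still rejects, the raw line lengths are probably not the real problem and the read format or the command-line options are worth a second look.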


                    • Daehwan
                      Member
                      • Oct 2010
                      • 27

                      #11
                      DerSeb, what's your command?


                      • DerSeb
                        Member
                        • Oct 2009
                        • 44

                        #12
                        This is my command:

                        Code:
                        tophat -G /data/genetics/datasets/genome-annotation/ensembl-56/Rattus_norvegicus.RGSC3.4.56.gtf -o /home/schaefer/tophat/Sample14 -C rn4_c /home/schaefer/tophat/fastq/Sample_14_Qual.fastq


                        • Daehwan
                          Member
                          • Oct 2010
                          • 27

                          #13
                          Since you are using -C, which is for colorspace reads, you need to supply colorspace reads instead of nucleotide reads.
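To tell the two kinds of input apart before launching a run, a rough check like the following can help. It assumes the usual csfasta layout of one primer base followed by colors 0 to 3, with `.` for a missing call; colorspace files that omit the primer base would not match. The name `looks_like_colorspace` is hypothetical.

```python
import re

# One primer base, then one or more colors (0-3) or '.' for a missing call
_COLORSPACE_READ = re.compile(r"^[ACGTacgt][0-3.]+$")


def looks_like_colorspace(read):
    """Return True if the read string resembles a primer base plus color calls."""
    return bool(_COLORSPACE_READ.match(read.strip()))


if __name__ == "__main__":
    print(looks_like_colorspace("T0120301.2"))   # primer base + colors
    print(looks_like_colorspace("GNNGTGNTNCA"))  # letters only
```

Running this over the first few reads of a file is usually enough to see whether -C is appropriate for it.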


                          • DerSeb
                            Member
                            • Oct 2009
                            • 44

                            #14
                            I see... I converted my colorspace reads to FASTQ using scripts supplied with MAQ (solid2fastq.pl or fq_all2std.pl csfa2std). They convert the color values to letters, mimicking a "pseudo" genetic sequence.
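As an illustration of what such a conversion does, here is a small sketch of the letter substitution. It assumes the commonly used mapping 0 to A, 1 to C, 2 to G, 3 to T and `.` to N; that mapping is an assumption here, so check the script you actually ran. The function names are made up for this example.

```python
# Assumed mapping between color calls and pseudo-nucleotide letters
_TO_LETTERS = str.maketrans("0123.", "ACGTN")
_TO_COLORS = str.maketrans("ACGTN", "0123.")


def double_encode(colors):
    """Replace color calls with letters (the output is not a real DNA sequence)."""
    return colors.translate(_TO_LETTERS)


def double_decode(pseudo_seq):
    """Map the pseudo-nucleotide letters back to color calls."""
    return pseudo_seq.translate(_TO_COLORS)


if __name__ == "__main__":
    encoded = double_encode("0120301.2")
    print(encoded)
    print(double_decode(encoded))
```

Note that mapping the letters back only recovers the color calls. The primer base is not part of the letter string, and a converter may also have trimmed the start of the read, so this alone does not rebuild the original csfasta record.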

                            I will look into this and see how I can change this.

                            Thx!


                            • DerSeb
                              Member
                              • Oct 2009
                              • 44

                              #15
                              I have now started a thread dedicated to reformatting SOLiD reads for TopHat:

                              Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
