
  • #16
    I am getting the same error:
    This is with version tophat-1.0.13. Has anyone coded a fix for this?

    @2065
    AGGCCGCTCCGGGCGCTGGACGTTGGGCTCCTGGCGAACCTCTCGGCGCTGGAGACTGGATATAACACAAC
    +SOLEXA3_1:1:1:22:173/2
    EDHEGHHCDHHGHHHHHCG/HHHHGEDHH=HHHGGHF<?HHHEGHDCHHADB#6@=#8AAG:@@E=1#@>#
    Saw ASCII character 10 but expected 33-based Phred qual.
    terminate called after throwing an instance of 'int'
    Aborted



    • #17
      I tried a workaround of changing all the "." characters in the sequence to "N" with awk:

      awk '{if ( (NR-2)%4==0) {gsub(/\./,"N",$0); print $0} else { print $0}}'

      This gets me through prep_reads without any errors. Can anyone comment on whether this is a safe workaround?
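
For anyone without awk to hand, the same substitution can be sketched in Python. This is only a minimal sketch, assuming a strict four-line-per-record FASTQ file (no wrapped sequence lines); the function name `mask_dots` is hypothetical and not part of TopHat.

```python
def mask_dots(lines):
    """Replace '.' with 'N' on the sequence line of each 4-line FASTQ record.

    Assumes strict 4-line records: @name, sequence, +name, qualities.
    """
    out = []
    for i, line in enumerate(lines):
        # The sequence is the 2nd line of every record (index 1, 5, 9, ...),
        # which is the same set of lines the awk test (NR-2)%4==0 selects.
        if i % 4 == 1:
            line = line.replace(".", "N")
        out.append(line)
    return out


record = [
    "@read1",
    "ACG.TT.A",
    "+read1",
    "HHHH.HHH",  # '.' is a valid quality character and must be left alone
]
print("\n".join(mask_dots(record)))
```

Like the awk one-liner, this touches only the sequence lines, so a "." that happens to appear in a quality string is preserved.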



      • #18
        Originally posted by rcorbett
        I tried a workaround of changing all the "." characters in the sequence to "N" with awk:

        awk '{if ( (NR-2)%4==0) {gsub(/\./,"N",$0); print $0} else { print $0}}'

        This gets me through prep_reads without any errors. Can anyone comment on whether this is a safe workaround?
        I did the same conversion and was able to run the downstream analysis with cufflinks and the result seems to be fine. I think this is a safe workaround as 'N' is a legitimate character in the fasta sequence and I assume the alignment software (bowtie) treats it intelligently.



        • #19
          I've added [ code ] tags round your FASTQ example for clarity - otherwise the forum messes things up.
          Originally posted by rcorbett
          I am getting the same error:
          This is with version tophat-1.0.13. Has anyone coded a fix for this?

          Code:
          @2065
          AGGCCGCTCCGGGCGCTGGACGTTGGGCTCCTGGCGAACCTCTCGGCGCTGGAGACTGGATATAACACAAC
          +SOLEXA3_1:1:1:22:173/2
          EDHEGHHCDHHGHHHHHCG/HHHHGEDHH=HHHGGHF<?HHHEGHDCHHADB#6@=#8AAG:@@E=1#@>#
          Saw ASCII character 10 but expected 33-based Phred qual.
          terminate called after throwing an instance of 'int'
          Aborted
          The (optional repeated) identifier on the + line doesn't match the (mandatory) identifier on the @ line. Assuming nothing went wrong in the cut and paste into the forum, it looks like something is very wrong with your FASTQ file. This may be what is upsetting tophat.



          • #20
            Hi, Cole_Trapnell

            Does the current version of TopHat support SOLiD data? Thanks

            Clariet

            Originally posted by Cole Trapnell
            Thanks for the heads up. We'll add the bug to our tracker and address it in the next release. Others are likely to have this problem.



            • #21
              Originally posted by maubp
              I've added [ code ] tags round your FASTQ example for clarity - otherwise the forum messes things up.

              The (optional repeated) identifier on the + line doesn't match the (mandatory) identifier on the @ line. Assuming nothing went wrong in the cut and paste into the forum, it looks like something is very wrong with your FASTQ file. This may be what is upsetting tophat.

              Actually, the mismatch between the "@" and "+" names should be fine, at least within TopHat. In fact, the program exploits this feature of FASTQ to make analyzing paired and long reads much easier. One of the first things TopHat does is "rename" the user's reads with increasing integer IDs, moving their true names from the "@" field down to the "+" field and rewriting the FASTQ files to a temporary file. Mate pairs from the same fragment get the same ID, making intermediate results for them much easier to match back up later on.
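
As a rough illustration of the renaming step described above (not TopHat's actual code), the sketch below gives each read an increasing integer ID and moves its original name down to the "+" line. The function name `renumber_reads` and the starting ID of 1 are assumptions made for the example.

```python
def renumber_reads(records, start=1):
    """Rewrite (name, seq, qual) tuples as 4-line FASTQ records.

    The '@' line carries an increasing integer ID; the original read
    name is kept on the '+' line so it can be recovered later.
    """
    lines = []
    for read_id, (name, seq, qual) in enumerate(records, start=start):
        lines.append(f"@{read_id}")
        lines.append(seq)
        lines.append(f"+{name}")
        lines.append(qual)
    return lines


reads = [
    ("SOLEXA3_1:1:1:22:173/2", "ACGT", "HHHH"),
    ("SOLEXA3_1:1:1:22:174/2", "TTGA", "GGGG"),
]
print("\n".join(renumber_reads(reads)))
```

Records of this shape, with an integer after "@" and the instrument name after "+", are what the FASTQ excerpt quoted earlier in the thread looks like.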



              • #22
                Originally posted by clariet
                Hi, Cole_Trapnell

                Does the current version of TopHat support SOLiD data? Thanks

                Clariet
                Currently, no.



                • #23
                  Thank you. Are you planning to add this sometime soon? My colleague speaks very highly of TopHat, and I am very much looking forward to using it for SOLiD data, which is the only data I have access to. Bowtie, as far as I know, supports SOLiD.

                  Originally posted by Cole Trapnell
                  Currently, no.



                  • #24
                    Empty (almost) junctions.bed file...

                    I have a related question.
                    I mapped Illumina 100nt reads using TopHat in default mode.
                    The resulting junctions.bed has only a single line, "track name=junctions description="TopHat junctions"."
                    When I used the resulting accepted_hits.sam file to identify transcripts using Cufflinks, the resulting transcripts.gtf has about 190,000 transcripts. But, all the transcripts have only one exon!
                    All these things together make me wonder if there was a problem with mapping across intronic regions in the TopHat/Bowtie stage with these data.

                    Please let me know if anyone has any idea on how to deal with this issue. Also please let me know if my description needs any clarifications.

                    Thanks!
                    -RSK



                    • #25
                      Same here

                      Originally posted by bzhang
                      I did the same conversion and was able to run the downstream analysis with cufflinks and the result seems to be fine. I think this is a safe workaround as 'N' is a legitimate character in the fasta sequence and I assume the alignment software (bowtie) treats it intelligently.
                      I have used "maq sol2sanger" to convert my fastq file.

                      I had the same problem and applied that little gawk script (thanks guys, really saved me wasting hours).

                      And Cole: I am using tophat 1.0.14.
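
For reference, the kind of quality conversion that "maq sol2sanger" performs can be sketched as follows. This is a minimal sketch using the standard Solexa-to-Phred formula, assuming Solexa-scaled qualities with an ASCII offset of 64 as input and Phred qualities with an offset of 33 as output. It is not maq's source code, and the function name `solexa_to_sanger` is hypothetical.

```python
import math


def solexa_to_sanger(qual):
    """Convert a Solexa-scaled, 64-offset quality string to Phred+33.

    Uses Q_phred = 10 * log10(10 ** (Q_solexa / 10) + 1), rounded to
    the nearest integer.
    """
    converted = []
    for ch in qual:
        q_solexa = ord(ch) - 64
        q_phred = round(10 * math.log10(10 ** (q_solexa / 10) + 1))
        converted.append(chr(q_phred + 33))
    return "".join(converted)


# 'h' is Solexa quality 40 and '@' is Solexa quality 0.
print(solexa_to_sanger("h@"))  # prints I$
```

Running a conversion like this on a file that is already in the 33-based encoding would scramble the qualities, so it is worth checking which encoding a file uses before converting.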



                      • #26
                        Originally posted by maximilianh
                        I have used "maq sol2sanger" to convert my fastq file.

                        I had the same problem and applied that little gawk script (thanks guys, really saved me wasting hours).

                        And Cole: I am using tophat 1.0.14.
                        Where did you get version 1.0.14? I thought the latest that was out was TopHat 1.0.13 (BETA) release 2/5/2010



                        • #27
                          TopHat 1.0.14

                          Originally posted by thinkRNA
                          Where did you get version 1.0.14? I thought the latest that was out was TopHat 1.0.13 (BETA) release 2/5/2010
                          From here: http://tophat.cbcb.umd.edu/index.html



                          • #28
                            Originally posted by shurjo
                            If this is Illumina data, were your reads processed with pipeline v1.3 or later? If so, you have to include the --solexa-quals option in your TopHat run.
                            I had the same problem when I pre-processed the Illumina file with maq sol2sanger first. I am using tophat 1.4 now and AM NOT PREPROCESSING anymore, and it works!! Just use the .txt file.

                            Max



                            • #29
                              I do not understand what this paragraph in the manual means; I am not a native English speaker.
                              "Arguments:
                              <ebwt_base> The basename of the index to be searched. The basename is the name of any of the five index files up to but not including the first period. bowtie first looks in the current directory for the index files, then looks in the indexes subdirectory under the directory where the currently-running bowtie executable is located, then looks in the directory specified in the BOWTIE_INDEXES environment variable. "
                              What does this paragraph mean? For example, I have a bowtie index (dog.fa, dog.fa.1.ebwt, dog.fa.2.ebwt and so on) in the directory /home/index. When I try to run the software, I type "tophat -r 200 /home/index/dog.fa test.fq"
                              but it always shows:
                              checking for Bowtie index files
                              checking for reference FASTA file
                              Warning:Could not find FASTA file /home/indexdog.fa.fa
                              Reconstituting reference FASTA file from Bowtie index

                              What is the problem?



                              • #30
                                You would use dog, not dog.fa. It adds the .fa for you when it searches.
