  • Quake!!!!!!!!!!!!!!

    After struggling with the first two steps of Quake for a day, I finally started the third step, but got the following message:

    cmd:

    correct -f R1_renamed.fastq R2_renamed.fastq -k 15 -c 3.76 -m R.qcts -p 1 -q 33

    Result:

    8907980 trusted kmers
    AT% = 0.344978
    @DJDP4KN1:1:1101:10000:102173#TGACCA/1
    CCAGCAGGAAGAGCTTGCCGCGTGCGGTGGGGTCGAGGTGGGTGGCGTATTCGAGGTGGTGTTTTTCCATCACCAGGATGATGTCGGCCCAGCGGCTGAT
    +
    terminate called after throwing an instance of 'std::out_of_range'
    what(): basic_string::substr
    Aborted



    Any tips?

    Binbin

  • #2
    Check the offending record in your fastq file and see if there's something weird about the quality line?


    • #3
      That is the first sequence in the fastq file, and I could not find anything wrong with it:

      head R1_renamed.fastq
      @DJDP4KN1:1:1101:10000:102173#TGACCA/1
      CCAGCAGGAAGAGCTTGCCGCGTGCGGTGGGGTCGAGGTGGGTGGCGTATTCGAGGTGGTGTTTTTCCATCACCAGGATGATGTCGGCCCAGCGGCTGAT
      +
      CCCFFDFFHHHHHJJJJJJJJJHJIJJGHIJJHIJ<EH9BED=@BDD=BDDEDDDD7AB5?8@DDDDDDEDDDDDBDDDDDEDEDDDDDDDDCDDDDDAC

      Any tips?


      • #4
        I'd say it's because your quality score length is not as long as your sequence length. That means that quake will be trying to use the newline character as a quality score, which is out of the range of the quality score. Either find out why your sequence file is corrupted, or pad the end of the quality sequence with low quality scores.

        Good luck!
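A minimal sketch of that length check, assuming plain 4-line FASTQ records with no wrapped sequence lines. The function name `find_length_mismatches` and the two toy records are made up for illustration and are not part of Quake:

```python
from io import StringIO


def find_length_mismatches(handle):
    """Yield (record_number, header, seq_len, qual_len) for every record
    whose sequence and quality lines differ in length.

    Assumption: each record is exactly four lines (header, sequence, '+',
    quality), which is how Illumina FASTQ files are normally written.
    """
    record_number = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\r\n")
        record_number += 1
        if len(seq) != len(qual):
            yield record_number, header.strip(), len(seq), len(qual)


# Toy input: the second record has a quality line one character too short.
example = StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGTA\n+\nIIII\n"
)
mismatches = list(find_length_mismatches(example))
print(mismatches)
```

To run it on a real file, pass an open file handle instead, e.g. `open("R1_renamed.fastq")`. An empty result would suggest the lengths are fine and the cause lies elsewhere.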


        • #5
          Originally posted by bryand:
          I'd say it's because your quality score length is not as long as your sequence length. That means that quake will be trying to use the newline character as a quality score, which is out of the range of the quality score. Either find out why your sequence file is corrupted, or pad the end of the quality sequence with low quality scores.

          Good luck!
          Actually they are of equal length, it's just that the forum font is not monospaced.


          • #6
            Ok, how about this: Your quality scores don't correspond to the phred base that you specify? The ascii value of J is 74, and 74 - 33 = 41. I don't know how quake is evaluating the quality scores, but try lowering every quality score by 10 or so in that record and see if that fixes the problem (or at least reduce the J score).
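The arithmetic above can be checked directly on the quality line posted earlier in the thread. This is only a sketch of the decoding step (ASCII code minus offset); the helper name `phred_scores` is hypothetical, and it says nothing about how Quake itself treats these values:

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string to integer Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


# Quality line of the first record, copied from the post above.
qual = (
    "CCCFFDFFHHHHHJJJJJJJJJHJIJJGHIJJHIJ<EH9BED=@BDD=BDDEDDDD7AB5?8@"
    "DDDDDDEDDDDDBDDDDDEDEDDDDDDDDCDDDDDAC"
)

scores = phred_scores(qual)            # what -q 33 implies
print(min(scores), max(scores))        # lowest and highest score at offset 33
print(min(phred_scores(qual, 64)))     # same line decoded at offset 64
```

At offset 33 the scores run from 20 to 41, so 'J' is the only character above 40. Decoded at offset 64 the lowest value is negative, which suggests 33 is the right offset for this file.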


            • #7
              Originally posted by bryand:
              Ok, how about this: Your quality scores don't correspond to the phred base that you specify? The ascii value of J is 74, and 74 - 33 = 41. I don't know how quake is evaluating the quality scores, but try lowering every quality score by 10 or so in that record and see if that fixes the problem (or at least reduce the J score).
              But why subtract 10? And how do I implement that?


              • #8
                I said 10 just as a test on your data, to see if that's the case (and in case any of your other characters are above a score of 40). You can pretty easily change just the H, I and J from the command line, restricting the substitution to the quality line (every fourth line) so that read names such as DJDP4KN1 are left untouched:

                perl -n -e 'tr/HIJ/EFG/ if $. % 4 == 0; print;' fastq_to_check.fq > new.fq

                Again, try it on just this one FASTQ entry first; otherwise you'll end up rewriting your whole Illumina file.
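For anyone who prefers Python, the same test could be sketched as follows. It touches only the quality line of each 4-line record, so an H, I or J in a read name (e.g. DJDP4KN1) is left alone. The function name `remap_quality_lines` is hypothetical, and the record below keeps the real header but uses a shortened, made-up sequence and quality string:

```python
def remap_quality_lines(lines, table=str.maketrans("HIJ", "EFG")):
    """Return a copy of `lines` with H/I/J replaced by E/F/G on quality
    lines only (every fourth line of a plain 4-line FASTQ)."""
    out = []
    for index, line in enumerate(lines):
        if index % 4 == 3:  # 0-based index 3, 7, 11, ... are quality lines
            line = line.translate(table)
        out.append(line)
    return out


# Shortened toy record for illustration; not the full read from the thread.
record = [
    "@DJDP4KN1:1:1101:10000:102173#TGACCA/1",
    "CCAG",
    "+",
    "CHIJ",
]
result = remap_quality_lines(record)
print("\n".join(result))
```

The header keeps its J while the quality line becomes CEFG, i.e. every score is now at or below 38 with an offset of 33.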


                • #9
                  Do you mean Quake does not accept quality scores greater than 40? I could not find anything useful in the manual.


                  • #10
                    I'm not an author of the program, so I don't know. I've simply used it a couple of times and am trying to guess what might be causing your problem. I'd suggest you contact the authors directly and get their opinion if you can't solve it (assuming you don't want to go into the source code and figure it out for yourself)...
