  • Hi,

    I just updated bowtie from version 0.11.3 to 0.12.2. With version 0.11.3 I was able to run the command "bowtie -m 25 -a -n 15 --un <file> -p 4 <ebwt> <infile> <outfile>". When I run this command in version 0.12.2, I get error "-n/--seedmms arg must be at least 0 and at most 3". Am I missing something in the change log about this parameter? Is the behavior of -n in version 0.11.3 accurate?

    Thank you.

    EDIT: I just realized that while version 0.11.3 will let me give -n greater than 3, it is still capped at -n 3. Is it possible to align with more than 3 mismatches? I am using bowtie to align 75bp reads to a genomic model (coding regions only) with the ultimate goal of calculating RPKM for each of the models. Is bowtie simply the wrong tool for this purpose?
    Last edited by bloomfi1; 02-15-2010, 05:34 PM.



    • -v <int> report end-to-end hits w/ <=v mismatches; ignore qualities
      or
      -n/--seedmms <int> max mismatches in seed (can be 0-3, default: -n 2)
      -e/--maqerr <int> max sum of mismatch quals across alignment for -n (def: 70)
      -l/--seedlen <int> seed length for -n (default: 28)
      Use -v to limit mismatches across the whole read (end to end).
      -n limits mismatches only within the seed region; you can set the seed length with '-l'.
      Xi Wang
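
The difference between the two modes in the usage text above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not bowtie's implementation: the function names are made up for this example, the seed is assumed to be the first `seed_len` bases of the read, and any rounding or capping bowtie may apply to quality values is ignored.

```python
def mismatch_positions(read, ref):
    """Return 0-based offsets where the read and reference bases differ."""
    return [i for i, (a, b) in enumerate(zip(read, ref)) if a != b]


def passes_v_mode(read, ref, v=2):
    """-v style check: at most v mismatches end to end; qualities ignored."""
    return len(mismatch_positions(read, ref)) <= v


def passes_n_mode(read, ref, quals, n=2, seed_len=28, max_qual_sum=70):
    """-n style check (hypothetical helper, simplified).

    At most n mismatches inside the seed (-l), and the summed quality
    values at all mismatched positions must not exceed the -e limit.
    """
    mms = mismatch_positions(read, ref)
    seed_mms = sum(1 for i in mms if i < seed_len)
    qual_sum = sum(quals[i] for i in mms)
    return seed_mms <= n and qual_sum <= max_qual_sum


# Toy 40 bp read with four mismatches, all outside a 28 bp seed.
ref = "A" * 40
bases = list(ref)
for pos in (30, 32, 34, 36):
    bases[pos] = "C"
read = "".join(bases)
low_quals = [10] * 40   # low-quality mismatches are cheap under -e
high_quals = [30] * 40  # high-quality mismatches quickly exceed -e
```

With these toy inputs, `passes_v_mode(read, ref, v=3)` rejects the read because it has four mismatches, while `passes_n_mode(read, ref, low_quals)` accepts it: none of the mismatches fall in the seed and the quality sum is 40, under the default limit of 70.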



      • Both -v and -n have a maximum size of 3. What is the reason for this restriction?



        • If you are using reads of length 75, would you change the seed length, or does bowtie figure that out?

          I can only align around 50% of my single-read Illumina data from this paper using bowtie's default settings: http://www.nature.com/nmeth/journal/...meth.1226.html

          Does anyone know which parameters to tweak to get more sequences aligned?



          • I guess that you should trim your data and try to align your sequences again. Also, I don't think that "bowtie figures it out", though I'm no expert.
            L. Collado Torres, Ph.D. student in Biostatistics.



            • Hi,

              Yes, the problem was that versions < 0.12.2 were failing to check for a too-high input for -n and -v. The manual and the usage message both said max=3, but bowtie erroneously didn't enforce it.

              Note that the -n option only constrains the number of mismatches in the seed, not in the entire alignment. The key is to set -n, -l and -e to reasonable numbers given your data. Since your reads are 75bp, I would suggest trying a few different settings, perhaps starting with -l 28 (the default), -n 2 and -e 180, and then adjusting all three until you're getting your desired mix of speed and sensitivity.

              Thanks,
              Ben
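
One way to try a few settings, as suggested above, is to generate the command lines from a small script. The sketch below only builds argument lists; it does not run bowtie. The helper name, the index and file names, and the particular -e values in the sweep are placeholders chosen for illustration. The positional order follows the `<ebwt> <infile> <outfile>` form used earlier in this thread.

```python
def bowtie_args(ebwt, reads, out, n=2, seed_len=28, maqerr=180):
    """Build a bowtie command line for -n mode (hypothetical helper).

    Rejects -n values outside 0-3, mirroring the check that bowtie
    0.12.2 enforces according to this thread.
    """
    if not 0 <= n <= 3:
        raise ValueError("-n/--seedmms arg must be at least 0 and at most 3")
    return [
        "bowtie",
        "-n", str(n),
        "-l", str(seed_len),
        "-e", str(maqerr),
        ebwt, reads, out,
    ]


# A small sweep over -e, keeping -n 2 and -l 28 fixed (placeholder names).
sweep = [
    bowtie_args("my_index", "reads.fq", f"out_e{e}.map", maqerr=e)
    for e in (70, 120, 180)
]
for args in sweep:
    print(" ".join(args))
```

Each list can be passed to `subprocess.run` once the placeholder names are replaced with real paths; comparing the fraction of aligned reads across runs gives a rough picture of the sensitivity trade-off.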



              • I am fairly new to the field of next-gen sequencing and find Bowtie to be fairly user-friendly, but I do have a question regarding its use. What is the difference in reporting between default bowtie and bowtie run with the -a, --strata, and --best flags? I understand that with the flags all of the alignments are reported in best-to-worst order, but what does the default bowtie report? For human sequencing data, is there a best set of parameters to use in order to gain enough sensitivity in coverage while keeping the file sizes manageable?
                Thanks in advance for any help.



                • Hello and thank you for the advice. I am wondering about the maximum setting of 3, though. I have looked at the bowtie source a little bit and get the impression that this restriction is possibly an inherent restriction in the overall design of bowtie. Is this accurate? Otherwise, do you have any plans to increase this number in the future?

                  Thank you,
                  Sean



                  • Bowtie quality values error

                    Hello everyone,

                    We have been using MAQ for our Solexa assembly needs, but we're moving to another program for downstream analysis, and Bowtie seems much easier for upstream assembly. Unfortunately, this means learning another assembly program. I was trying to use Bowtie to assemble some data that we had previously assembled and analyzed with MAQ, and I'm running into an error I don't really understand. It states "Reads file contained a pattern with more than 1024 quality values." I'm using the -n alignment mode to assemble the paired alignments (and including the input option --solexa-quals), but have also tried the -v alignment mode (which I thought ignored quality values). We didn't have any issues assembling this data with MAQ, so I think I'm just missing something, being new to Bowtie. Any help anyone can provide would be greatly appreciated.

                    Thanks



                    • Can you please post the Bowtie version you're using, and the command you used to run it?

                      Thanks,
                      Ben



                      • I have seen this error when the number of bases does not equal the number of quality values in the fastq file. Assuming that isn't the problem it most likely has something to do with bowtie expecting a range of quality values that are not present in your fastq file. Which version of the Illumina pipeline did this data come from?
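
A quick way to see which range of quality characters a FASTQ file actually contains is to scan the quality lines and report the lowest and highest ASCII codes. The sketch below assumes plain 4-line FASTQ records; the function names and the thresholds used for guessing are a common rule of thumb, not anything bowtie itself does, and a file of uniformly high-quality reads can stay ambiguous.

```python
def quality_char_range(fastq_lines):
    """Return (min, max) ASCII codes on quality lines, or None if empty.

    Assumes 4-line records, so every fourth line is a quality string.
    """
    lo, hi = None, None
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:
            continue
        codes = [ord(c) for c in line.rstrip("\r\n")]
        if not codes:
            continue
        lo = min(codes) if lo is None else min(lo, min(codes))
        hi = max(codes) if hi is None else max(hi, max(codes))
    return None if lo is None else (lo, hi)


def guess_encoding(lo, hi):
    """Heuristic guess at the quality encoding from the observed range."""
    if lo < 59:
        return "Phred+33"      # characters this low only occur with offset 33
    if hi > 74:
        # Characters this high point to an offset of 64.
        return "Solexa+64" if lo < 64 else "Phred+64"
    return "ambiguous"


example = ["@r1\n", "ACGT\n", "+\n", "hhhh\n"]
lo, hi = quality_char_range(example)
print(lo, hi, guess_encoding(lo, hi))
```

If the guess disagrees with the quality option given on the command line (for example `--solexa-quals` or `--solexa1.3-quals`, both mentioned in this thread), that mismatch is worth checking before looking further.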



                        • We're using Bowtie version 0.12.3, with the command line (running on a command prompt in Windows) "Bowtie -n 2 -q --solexa1.3-quals -S Pbindex -1QN_read1 -2QN_read2 QNalign.sam". The FASTQ files come from an Illumina GA II, pipeline 1.4. Thanks.

                          rich



                          • Could you please paste the first few records of the data you are giving bowtie as input?
                            Xi Wang



                            • Hi Rich,

                              Another user just contacted me via email and described something similar. When I ran their reads through bowtie, I realized that part of the problem is that Bowtie is printing the wrong error message. In their case, the error message should have been something more like "Too many quality values for read..." because they had a fastq entry where the quality string was 2 characters longer than the sequence string. Do you notice any inconsistencies like that in your input?

                              I'll fix the error-message bug.

                              Thanks,
                              Ben



                              • Ben,

                                That seems to be a likely problem. We took the first 20 or so paired reads and verified the sequence and quality value lengths, and that ran well, with the same command line. We'll go through the FASTQ files and try and find the quality string causing us problems. Thanks to everyone for the helpful suggestions.

                                rich
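
Finding the offending records by hand is tedious in a large file, so a short script can do the comparison. The sketch below assumes plain 4-line FASTQ records (no wrapped sequence lines); the function name is made up for this example. It strips both `\r` and `\n`, since stray carriage returns on Windows would otherwise be counted as characters.

```python
def find_length_mismatches(fastq_lines):
    """Yield (record_number, name, seq_len, qual_len) for bad records.

    A record is reported when its quality string and its sequence
    string have different lengths. Record numbers start at 1.
    """
    record = []
    record_number = 0
    for line in fastq_lines:
        record.append(line.rstrip("\r\n"))
        if len(record) == 4:
            record_number += 1
            name, seq, _plus, qual = record
            if len(seq) != len(qual):
                yield (record_number, name, len(seq), len(qual))
            record = []


# Usage sketch; "QN_read1" is the file name used earlier in this thread:
#
#     with open("QN_read1") as handle:
#         for bad in find_length_mismatches(handle):
#             print(bad)
```

Running it over each of the paired files lists the records whose quality string is longer or shorter than the sequence, which is the kind of inconsistency described above.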
