  • Hi Ben,

    I am confused about how Bowtie deals with quality scores when counting mismatches.

    I noticed that there are two parameters related to this issue. First, -n/--seedmms <int> sets the maximum number of mismatches allowed in the seed, meaning that a hit with more mismatches than this cutoff will not be reported by Bowtie. Second, -e/--maqerr <int> sets the maximum sum of quality scores allowed at the mismatched bases (is that right?). However, I don't know whether the two criteria are the same or complementary.

    Further, both measures of mismatches seem to be counted in the seed region. Although users can specify the seed length, I am wondering where the seed is located: at the leftmost end of a query (read), or in a random region of the query?

    Besides, there is another parameter, -v <int>, which handles end-to-end mismatches but does not consider the quality scores. Is it possible to make it consider the quality scores?

    Best regards!
    Xi
    Xi Wang



    • Question regarding bwt paired end alignment

      I am currently trying to align paired-end Illumina reads using Bowtie, and I want to compare the results to those from Maq.

      I cannot see an option for reporting an alignment for a read when its mate does not map. Is this possible?

      Maq still reports alignments for a read even if its mate does not map, and I wanted to do the same thing with Bowtie. If this is not possible, a lot of pairs end up unaligned (significantly more than with Maq).

      If anyone knows how to do this I would really appreciate it, thanks.



      • Hi Xi,

        Originally posted by Xi Wang
        I noticed that there are two parameters related to this issue. First, -n/--seedmms <int> sets the maximum number of mismatches allowed in the seed, meaning that a hit with more mismatches than this cutoff will not be reported by Bowtie. Second, -e/--maqerr <int> sets the maximum sum of quality scores allowed at the mismatched bases (is that right?). However, I don't know whether the two criteria are the same or complementary.
        They're complementary. If either limit is exceeded, the alignment is invalid.

        Originally posted by Xi Wang
        Further, both measures of mismatches seem to be counted in the seed region. Although users can specify the seed length, I am wondering where the seed is located: at the leftmost end of a query (read), or in a random region of the query?
        From the leftmost end of the read. -e applies to the entire alignment, not just the seed, exactly as in Maq.

        Originally posted by Xi Wang
        Besides, there is another parameter, -v <int>, which handles end-to-end mismatches but does not consider the quality scores. Is it possible to make it consider the quality scores?
        No; to consider qualities, use -n/-l/-e.

        Thanks,
        Ben
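
To make the interplay of -n, -l and -e concrete, here is a minimal Python sketch of the validity rule described above. It is not Bowtie's code: the function name and argument layout are made up for illustration, offsets are counted from the left end of the read, and the sketch ignores the rounding of quality values that Bowtie applies by default in this mode. The default values (2, 28, 70) are the ones quoted elsewhere in this thread.

```python
def alignment_is_valid(mismatches, seed_len=28, max_seed_mms=2, max_qual_sum=70):
    """Return True if an alignment passes both the -n and the -e limit.

    mismatches: list of (offset, quality) pairs, one per mismatched base,
    where offset is the 0-based position from the left end of the read.
    This is a hypothetical helper, not part of Bowtie.
    """
    # -n: count only mismatches that fall inside the first seed_len bases (-l).
    seed_mms = sum(1 for offset, _ in mismatches if offset < seed_len)
    # -e: sum the qualities of every mismatched base in the whole read.
    qual_sum = sum(quality for _, quality in mismatches)
    # The limits are complementary: exceeding either one rejects the alignment.
    return seed_mms <= max_seed_mms and qual_sum <= max_qual_sum


# Four low-quality mismatches, only two of them in the seed: accepted.
print(alignment_is_valid([(3, 10), (20, 10), (40, 10), (60, 10)]))
# Three mismatches in the seed: rejected by the -n limit.
print(alignment_is_valid([(1, 10), (5, 10), (9, 10)]))
# Two mismatches in the seed, but quality sum 80 > 70: rejected by the -e limit.
print(alignment_is_valid([(2, 40), (10, 40)]))
```

The first case also shows why this mode can accept more than 3 mismatches in total: outside the seed, only the quality sum limits them.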



        • Originally posted by lindseyjane
          I cannot see an option for reporting an alignment for a read when its mate does not map. Is this possible?
          Your best bet is to run Bowtie in paired-end mode while using --un to dump unaligned reads to files. Then run again in unpaired mode using the unaligned reads as input.

          Let me know if that doesn't solve your problem.

          Thanks,
          Ben
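
A rough Python sketch of that two-pass workflow follows. It only builds the two command lines; nothing is executed. The function name, the index name and all file names are placeholders. The `_1`/`_2` naming of the per-mate `--un` files is an assumption based on my reading of the Bowtie manual, so check the files your run actually writes before starting the second pass.

```python
import shlex


def two_pass_commands(index, mates1, mates2, out_prefix):
    """Build argument lists for a paired pass followed by an unpaired pass.

    Hypothetical helper: all output file names are placeholders.
    """
    unaligned = f"{out_prefix}.unaligned.fq"
    # Pass 1: paired-end mode, dumping pairs that fail to align via --un.
    paired = ["bowtie", "--un", unaligned, index,
              "-1", mates1, "-2", mates2, f"{out_prefix}.paired.map"]
    # Assumed naming of the per-mate files written by --un in paired-end mode.
    un1 = f"{out_prefix}.unaligned_1.fq"
    un2 = f"{out_prefix}.unaligned_2.fq"
    # Pass 2: unpaired mode, with both leftover files as a comma-separated list.
    unpaired = ["bowtie", index, f"{un1},{un2}", f"{out_prefix}.unpaired.map"]
    return paired, unpaired


paired_cmd, unpaired_cmd = two_pass_commands(
    "my_index", "reads_1.fq", "reads_2.fq", "run1")
print(shlex.join(paired_cmd))
print(shlex.join(unpaired_cmd))
```

Each list could be passed to `subprocess.run(..., check=True)` once the file names have been confirmed.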



          • comparable parameters with maq

            Hi Ben,

            Excellent work with Bowtie - looking forward to cutting down data processing time. I am working on a project in which I have used Maq, but for subsequent paired-end MeDIP-seq with 45-base reads I want to use Bowtie with parameters as close to Maq's as possible.

            With Maq, I eliminate reads with a Maq quality < 10 (the same read mapped to >1 location and is hence ambiguous) and output them to another file.
            I also keep only reads with flags 18 and 130 (correctly paired reads).
            Using an ad hoc script, I keep only one hit if the same read is mapped to the same start and stop location multiple times (PCR bias).

            I'd like to apply the same criteria using Bowtie. Could you advise me? To begin with, the Bowtie default looks good: 2 mismatches in a 28-base seed region with -e 70.

            thank you

            Layla



            • Originally posted by Ben Langmead
              Hi Xi,

              to consider qualities, use -n/-l/-e.
              Thanks, Ben.
              I am still wondering whether the seed region is defined only for counting the mismatches or not. If I want to just use the quality score criterion, and set -l equal to 0, does it work?

              Best wishes,
              Xi
              Xi Wang



              • Originally posted by Xi Wang
                I am still wondering whether the seed region is defined only for counting the mismatches or not.
                Yes. The setting for -l matters for the -n limit but not for the -e limit.

                Originally posted by Xi Wang
                If I want to just use the quality score criterion, and set -l equal to 0, does it work?
                No, -l must be set to 5 or greater.

                Ben



                • Hi,
                  I'm new to the field of NGS (I was working mainly on microarray data analysis) and I'm starting to investigate common tools for sequence analysis.
                  I have human data (paired reads, 75 bases) and used Bowtie for the alignment.
                  I used the standard parameters for alignment:
                  bowtie -t -p 8 h_sapiens_37_asm ./s_8_1_sequence.fq ./s_8_1_sequence.fq.bowtie.align
                  bowtie -t -p 8 h_sapiens_37_asm ./s_8_2_sequence.fq ./s_8_2_sequence.fq.bowtie.align
                  bowtie -t -p 8 h_sapiens_37_asm -1 ./s_8_1_sequence.fq -2 ./s_8_2_sequence.fq ./s_8_sequence.fq.bowtie.align

                  and I get the following results, respectively:
                  # reads processed: 6660511
                  # reads with at least one reported alignment: 4615451 (69.30%)
                  # reads that failed to align: 2045060 (30.70%)
                  # reads with at least one reported alignment: 5050548 (75.83%)
                  # reads that failed to align: 1609963 (24.17%)
                  # reads with at least one reported alignment: 13371 (0.20%)
                  # reads that failed to align: 6647140 (99.80%)

                  The data quality is not optimal, but I guess that getting almost no alignments in paired-end mode is not due to that, and that some parameters probably need to be tuned.
                  Could anyone give me some insight into the optimal settings for paired-end alignment?
                  Thanks in advance,
                  Best,
                  ramzi
                  Research Scientist - Bioinformatics
                  Sidra Medical and Research Center



                  • A question about the number of mismatches: I cannot set -v 4 (error: -v arg must be at most 3). Does that mean Bowtie allows at most 3 mismatches regardless of read length? Thanks.



                    • Another question:

                      I'm reading the manual for -k, -a and --best.

                      I'm confused about what happens when -k or -a is combined with --best. I thought that if a read has several "best" alignments, these "best" alignments should all have roughly "equal" alignment scores. But the manual says that if -k > 1 or -a is specified together with --best, only the best alignments will be reported and they appear in best-to-worst order, which means that the best alignments are not "equally best".

                      I hope to get your help soon, thanks.



                      • Originally posted by ramouz87
                        The data quality is not optimal, but I guess that getting almost no alignments in paired-end mode is not due to that, and that some parameters probably need to be tuned.
                        Could anyone give me some insight into the optimal settings for paired-end alignment?
                        Thanks in advance,
                        Best,
                        ramzi

                        Hi Ramzi,

                        The options you're looking for are almost certainly -I/-X and --ff/--fr/--rf. You need to have a reasonably good idea of the expected insert size and specify an appropriate range with -I/-X. You should also confirm that your paired-end protocol produces pairs in the fw/rev orientation. This is the typical configuration for Illumina. If your paired-end data has a different orientation, change it with --ff or --rf.

                        Hope that helps,
                        Ben
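
As an illustration of why -I/-X and the orientation flags matter so much, here is a simplified Python sketch of the check a candidate pair has to pass. It is not Bowtie's implementation: the function and its arguments are invented for this example, strands are reduced to "+" and "-" for the upstream and downstream mate, and the default range of 0 to 250 is what I recall from the Bowtie manual, so please verify it for your version.

```python
def pair_is_concordant(upstream_strand, downstream_strand, fragment_len,
                       minins=0, maxins=250, orientation="fr"):
    """Return True if a pair fits the -I/-X range and the expected orientation.

    fragment_len is the full fragment length, mates included.
    Hypothetical helper; it ignores which mate (#1 or #2) is upstream.
    """
    expected = {"fr": ("+", "-"), "rf": ("-", "+")}
    if orientation == "ff":
        # --ff: both mates are on the same strand.
        strands_ok = upstream_strand == downstream_strand
    else:
        strands_ok = (upstream_strand, downstream_strand) == expected[orientation]
    # -I/--minins and -X/--maxins bound the fragment length.
    return strands_ok and minins <= fragment_len <= maxins


# A 400-base fragment fails with the assumed default -X 250 ...
print(pair_is_concordant("+", "-", 400))
# ... and passes once the upper bound is raised, e.g. -X 500.
print(pair_is_concordant("+", "-", 400, maxins=500))
# Correct size but both mates forward: rejected under --fr, accepted under --ff.
print(pair_is_concordant("+", "+", 200))
print(pair_is_concordant("+", "+", 200, orientation="ff"))
```

If most fragments in a library are longer than the upper bound, nearly every pair is rejected even though the individual mates align well on their own.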



                        • Originally posted by liu3zhen
                          A question about the number of mismatches: I cannot set -v 4 (error: -v arg must be at most 3). Does that mean Bowtie allows at most 3 mismatches regardless of read length? Thanks.
                          Hi liu3zhen,

                          To allow more than 3 mismatches in the alignment, use the Maq-like options: -n/-l/-e instead of -v.

                          Thanks,
                          Ben



                          • Are pairs considered separately with respect to mismatches and uniqueness under the SOAP-like policy?

                            I have a couple of questions about how Bowtie deals with mismatches in a paired-end run (using -v 1 and -m 1). I have my guesses as to how things work, but I am hoping that someone knowledgeable (e.g. Ben) will chime in with the correct information.

                            1) Is it possible to obtain an alignment for a read pair where one read maps uniquely but the other doesn't? (my guess: no)

                            2) Does the mismatch setting apply to each read separately, or to both taken together? In other words, if 1 mismatch is specified, can each member of a pair have 1 mismatch? (my guess: yes)



                            • Originally posted by liu3zhen
                              But the manual says that if -k > 1 or -a is specified together with --best, only the best alignments will be reported and they appear in best-to-worst order, which means that the best alignments are not "equally best".
                              That's right; --best does not limit the number of alignments Bowtie reports. If you ask for 1 alignment (default), --best guarantees it's the best. If you ask for -k 4, --best guarantees they're the 4 best, reported in best-to-worst order. If you ask for -a, --best guarantees that you'll get all of them in best-to-worst order.

                              Thanks
                              Ben
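
The reporting behaviour described above can be pictured with a short Python sketch. This is only a model, not Bowtie's search code: the function and the dictionary keys are invented, and ranking by mismatch count first and quality sum second is my simplified reading of what "best" means here.

```python
def report_alignments(candidates, k=1, report_all=False):
    """Return the alignments that would be reported with --best, best first.

    candidates: list of dicts with 'mismatches' and 'qual_sum' keys.
    Hypothetical helper; the ranking key is an assumption.
    """
    # Fewer mismatches is better; ties are broken by the lower quality sum.
    ranked = sorted(candidates, key=lambda a: (a["mismatches"], a["qual_sum"]))
    # -a reports everything; otherwise keep the k best (-k, default 1).
    return ranked if report_all else ranked[:k]


hits = [
    {"name": "a", "mismatches": 2, "qual_sum": 40},
    {"name": "b", "mismatches": 0, "qual_sum": 0},
    {"name": "c", "mismatches": 1, "qual_sum": 30},
    {"name": "d", "mismatches": 1, "qual_sum": 10},
]
# Default (one alignment): only the single best hit.
print([h["name"] for h in report_alignments(hits)])
# -k 3 --best: the three best hits, in best-to-worst order.
print([h["name"] for h in report_alignments(hits, k=3)])
# -a --best: every hit, in best-to-worst order.
print([h["name"] for h in report_alignments(hits, report_all=True)])
```

So the reported alignments need not be equally good; --best only fixes which ones are kept and the order in which they appear.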



                              • Originally posted by ecabot
                                1) Is it possible to obtain an alignment for a read pair where one read maps uniquely but the other doesn't? (my guess: no)
                                Definitely yes! That's exactly where paired-end sequencing pays off. If either read aligns uniquely, that alignment will be used as an anchor to look for the mate's alignment and, if it's found, that paired-end alignment will be reported.

                                Originally posted by ecabot
                                2) Does the mismatch setting apply to each read separately, or to both taken together? In other words, if 1 mismatch is specified, can each member of a pair have 1 mismatch? (my guess: yes)
                                The mismatch setting applies to each read. So, yes, if -v 1 is specified, *both* mates are allowed to have a mismatch.

                                Hope that helps,
                                Ben
