
  • #46
    hi there

    as of version 0.9.8 bowtie comes with the --unfa <filename> / --unfq <filename> options, which should be doing exactly what you are looking for



    cheers,

    florian


    • #47
      Florian, thanks a lot for your reply. I was using v0.9.7, and did not know there is a new version available. It is a very helpful function. Thanks.


      • #48
        Yeah, I agree. However, there seems to be a problem with multi-threading, so be advised NOT to use the -p option in conjunction with it. Ben (the main developer) has told me he is already working on a solution, though, so this will only be a temporary drawback.


        • #49
          Thanks Florian! - Indeed, version 0.9.8.1 of Bowtie was just released minutes ago and it fixes all known issues with the --unfa and --unfq options. It's highly recommended that you use that version.

          Thanks,
          Ben


          • #50
            Hello Joachim,

            Sorry for the delay in replying! Please see responses below.

            Originally posted by joa_ds:
            > I don't quite understand what the 'nostrattum' flag does
            Without the --nostratum flag, Bowtie will only report alignments for the best "stratum" where alignments were found. By "best stratum", I mean the best category of alignment, categorized by number of mismatches in the seed region. Say you use -k 3 and a given read aligns once with 1 mismatch in the seed and twice with 2 mismatches in the seed. If you do *not* specify --nostratum (the default), then Bowtie will only report the single 1-mismatch hit. If you do specify --nostratum, Bowtie will report all 3 hits.

            I'll make a note to add a clear example in the documentation for future releases.
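
The stratum rule described above can be sketched in a few lines of Python. This is only an illustration of the reporting logic, not Bowtie's actual code: the function name `reported_hits` and its parameters are hypothetical, each hit is reduced to its number of seed mismatches, and the sort stands in for a best-first ordering of hits (an assumption).

```python
def reported_hits(seed_mismatches, k, nostratum=False):
    """Return the seed-mismatch counts of the hits that would be reported.

    seed_mismatches: one entry per alignment found for a read.
    k: maximum number of alignments to report (the -k value).
    nostratum: if False, only hits from the best stratum are kept.
    """
    if not seed_mismatches:
        return []
    # Assumption: hits are considered best-first (fewest seed mismatches first).
    ranked = sorted(seed_mismatches)
    if not nostratum:
        best = ranked[0]
        # Keep only the hits that share the best stratum.
        ranked = [mm for mm in ranked if mm == best]
    return ranked[:k]


# The example from the post: -k 3, one 1-mismatch hit and two 2-mismatch hits.
hits = [1, 2, 2]
print(reported_hits(hits, k=3))                  # [1]
print(reported_hits(hits, k=3, nostratum=True))  # [1, 2, 2]
```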

            > I am only interested in rather unique maps, the rest can go to another file and i can have a look at it later. the --unfa flag moves unmapped seqs to a file, but when i use -m 3 i will discard seqs that map more than 3 times, right? Those go that same file? or they are lost forever? The idea is that i want to do a preliminary analyses fast and i can remap those multimaps overnight or during a weekend when the server is not used.
            (Let me answer your question with respect to the just released 0.9.8.1 version of Bowtie, since version 0.9.8 had issues with --unfa and --unfq.)

            As of 0.9.8.1 Bowtie supports --maxfa/--maxfq options that dump reads that exceed the -m limit to a separate file. If --maxfa/--maxfq is not specified but --unfa/--unfq is, then these reads are dumped to the same file as the reads that don't align at all.
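
As a rough sketch of the routing just described (in Python, purely for illustration): the function `destination` and its return labels are hypothetical names, and the "dropped" outcome when no matching output option is given is an assumption, since the post only describes what happens when at least one of the options is specified.

```python
def destination(n_alignments, m, max_file_given=False, un_file_given=False):
    """Decide where a read ends up, given how many places it aligns to.

    m: the -m limit.
    max_file_given: True if --maxfa/--maxfq was specified.
    un_file_given: True if --unfa/--unfq was specified.
    """
    if n_alignments == 0:
        # Reads that do not align at all go to the --unfa/--unfq file.
        return "un_file" if un_file_given else "dropped"  # "dropped" is an assumption
    if n_alignments > m:
        # Reads that exceed the -m limit prefer the --maxfa/--maxfq file,
        # and fall back to the --unfa/--unfq file if only that one is given.
        if max_file_given:
            return "max_file"
        if un_file_given:
            return "un_file"
        return "dropped"  # assumption: not stated in the post
    return "reported"


# A read aligning to 5 places with -m 3:
print(destination(5, m=3, max_file_given=True, un_file_given=True))   # max_file
print(destination(5, m=3, max_file_given=False, un_file_given=True))  # un_file
```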

            > If i use the -k 3 flag, i want to report 3 maps, will it take the first 3 it encounters? And if i use the --best flag, will it go find all the possible maps and only report the best 3?
            >
            > ./bowtie -k 3 -m 10 --best --unfa MSC_bowtie_unal_fasta human_genome ../files/file.111.fastq MSC_bowtie
            >
            > is the command i want to use. I hope it will find max 10 maps per sequence and report the best 3 (combining -k 3 and --best) Will this work? Just experimenting... In a later stage i will map everything(even x100 repeats) and output it to a db, so it doesnt really matter if it doesnt work, just trying to understand the program completely.
            Based on what you say, yes, that command will do what you intend. -k 3 --best should guarantee that you get up to three alignments of the "best" kind (best in terms of # of mismatches in the seed) and -m 10 will ensure that no alignments are reported for a read that aligns to more than 10 places. If you don't care whether the alignments come from the same strata, then you should also use "--nostrata".

            Hope that helps.

            Thanks,
            Ben


            • #51
              I should have read Ben's post earlier. I spent a lot of time to find out that the sequences are all reversed, which caused weird results. I will try out the 0.9.8.1 now.


              • #52
                Does Bowtie work with colorspace data?


                • #53
                  Originally posted by doxologist:
                  > Does Bowtie work with colorspace data?
                  Not yet - that's on the TODO list but we're going to tackle paired-end alignment and gapped alignments first.

                  Thanks,
                  Ben


                  • #54
                    thanks. looking forward to it.


                    • #55
                      Hi Ben/Florian

                      I was using bowtie and I wanted to compare the mapping qualities when converted to .map format. Would these mapping quality scores be comparable to that for MAQ results?


                      • #56
                        hi zee

                        sorry if i gave you the wrong impression, but i'm actually not a developer of bowtie. i cannot claim any of this fame -- unfortunately :-D.
                        as to the question, i'd better leave the answer to ben, as i'm not sure about it either.

                        cheers,

                        f


                        • #57
                          Originally posted by zee:
                          > I was using bowtie and I wanted to compare the mapping qualities when converted to .map format. Would these mapping quality scores be comparable to that for MAQ results?
                          Bowtie's mapping qualities are only a rough approximation of Maq's. Maq computes mapping quality like this:

                          Q = min {q_2 - q_1 - 4.343log(n_2), 4 + (3-k')(q_bar - 14) - 4.343log(p_1(3-k,28)) }

                          Where:

                          q_1 is the quality-weighted Hamming distance of the best hit, q_2 is the quality-weighted Hamming distance of the second best hit, q_bar is the average quality value on the 5' end of the read, and "p_1(k,28) is the probability that a perfect hit and a k-mismatch hit coexists given a 28bp sequence which can be estimated during alignment" (from the Maq paper).
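
For readers who want to plug numbers into the Maq expression above, here is a minimal Python sketch. It is not Maq's code. The function name `maq_style_mapq` and all parameter names are hypothetical. It assumes the logs are natural logs (4.343 is approximately 10/ln 10), that `n2` is the number of hits in the second-best class (the post does not define n_2), that k and k' refer to the same seed mismatch count, and that the probability term `p1` has already been evaluated and is passed in.

```python
import math


def maq_style_mapq(q1, q2, n2, k_seed, q_bar, p1):
    """Evaluate the two-term minimum from the formula quoted above.

    q1, q2: quality-weighted Hamming distances of the best and second-best hit.
    n2: number of second-best hits (assumed meaning; must be >= 1).
    k_seed: mismatches in the seed (used for both k and k', an assumption).
    q_bar: average quality on the 5' end of the read.
    p1: already-evaluated value of p_1(3 - k, 28), in (0, 1].
    """
    # 4.343 * ln(x) is about 10 * log10(x), i.e. a Phred-style scaling.
    first = q2 - q1 - 4.343 * math.log(n2)
    second = 4 + (3 - k_seed) * (q_bar - 14) - 4.343 * math.log(p1)
    return min(first, second)


# Made-up inputs, only to show the call.
print(maq_style_mapq(q1=0, q2=30, n2=1, k_seed=0, q_bar=30, p1=1.0))
```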

                          Bowtie, as discussed previously in this thread, doesn't guarantee that it will find the best hit, and by default, won't even continue searching for the second best one. So Bowtie can't really compute mapping quality this way. Instead, our Maq converter (which was derived from Heng Li's ELAND converter) calculates mapping qualities as follows:

                          Q = (3 - k) * 25 - log(# of other equally good occurrences found by Bowtie).

                          Where k is the number of mismatches in the seed region of the alignment. This is not as nice as Maq's method. However, it works without forcing Bowtie to be used in one of its slower modes (--all, for example). In our tests so far, Maq's assembler handles qualities computed this way pretty well and produces good SNP calls.
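
The converter's simpler formula is easy to try out in Python. This is a sketch under stated assumptions, not the converter's actual source: the function name `converter_mapq` is hypothetical, the log is taken to be a natural log (the post does not give the base), and a read with no other equally good occurrences is assumed to get no penalty, since log(0) is undefined.

```python
import math


def converter_mapq(k, other_equal_hits):
    """Q = (3 - k) * 25 - log(# of other equally good occurrences).

    k: number of mismatches in the seed region of the alignment.
    other_equal_hits: other equally good occurrences found for the read.
    """
    # Assumption: no other equally good hits means no penalty term.
    penalty = math.log(other_equal_hits) if other_equal_hits > 0 else 0.0
    return (3 - k) * 25 - penalty


# A unique hit with a perfect seed versus one with two seed mismatches.
print(converter_mapq(k=0, other_equal_hits=0))  # 75.0
print(converter_mapq(k=2, other_equal_hits=0))  # 25.0
```

Note how the score is dominated by the seed mismatch count: the log term only shaves a few points off even for highly repetitive reads, which may be part of why these values are described as a rough approximation of Maq's.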


                          • #58
                            Cole,

                            Thanks for that clarification. I found that some work I was doing with bowtie seemed to produce a lot more results than I expected, even when I used the '--best' option for reporting hits. When I looked at how it stacked up against some other aligners, I felt that the mapping quality score overestimated how many SNPs were good quality and how many alignments were correct.
                            I just need to be cautious about how I interpret bowtie's results with respect to metrics derived from other aligners.


                            • #59
                              Does anyone have any recommendations for scoring params when mapping long (76bp Illumina) reads?

                              Also, my reads are PE -- any chance this will be supported soon?


                              • #60
                                Hi Foram,

                                I would try upping both the seed length (-l) and the error tolerance (-e). Others may have better suggestions, though. If you find parameters you're happy with, please do post them back here since that will help others.

                                I'm working on paired-end support currently. Expect it in a few weeks or so.

                                Thanks,
                                Ben

                                Originally posted by foram:
                                > Does anyone have any recommendations for scoring params when mapping long (76bp Illumina) reads?
                                >
                                > Also, my reads are PE -- any chance this will be supported soon?
