
  • #16
    I'm also interested in how MAQ assigns quality scores. Can you confirm what you meant by "Q0;Q10;Q20" in the MAQ tests? Is this a threshold on the quality score that MAQ gives the alignment (as opposed to the quality score of the read in the Fastq file)? If a read maps to multiple locations, MAQ reports one location at random and assigns a quality score of 0. Therefore the Q0 accuracy should be much less than if you had excluded these alignments. I think that this behavior is a bit strange; it would be less confusing if MAQ didn't report any matches for the non-uniquely mapping reads and instead reported the number of places that the read maps (the whole read, not just the first 25 bases).



    • #17
      I'm definitely out of my league in this discussion, but if anyone needs hosting for some of these sample datasets, let me know!



      • #18
        Q0;Q10;Q20 are thresholds on the alignment (mapping) quality score assigned by MAQ.

        MAQ was initially designed for resequencing, and keeping these repetitive reads is quite useful for the subsequent SNP calling with MAQ. It also helps CNV calling. I understand that a lot of people do not want to see all these repetitive reads, but putting a threshold on mapping quality is very easy anyway. In addition, different people may want to set different thresholds.
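A minimal Python sketch of such a threshold filter, for readers who want to apply Q0/Q10/Q20 cut-offs themselves. It assumes tab-separated alignment records with the mapping quality in the 7th column (0-based index 6), which is my reading of the `maq mapview` layout and should be checked against your own output. The function name and the example records are made up for illustration.

```python
def filter_by_mapq(lines, min_mapq, mapq_col=6):
    """Yield tab-separated alignment lines whose mapping quality is >= min_mapq.

    mapq_col is the 0-based column holding the mapping quality. The default
    of 6 is an assumption about the `maq mapview` layout; verify it first.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= mapq_col:
            continue  # skip malformed or truncated lines
        if int(fields[mapq_col]) >= min_mapq:
            yield line


# Made-up records for illustration only:
# name, chromosome, position, strand, insert size, paired flag, mapping quality
example = [
    "read1\tchr1\t100\t+\t0\t0\t0",
    "read2\tchr1\t250\t-\t0\t0\t15",
    "read3\tchr2\t999\t+\t0\t0\t37",
]

# Keep only alignments with mapping quality >= 10 and collect the read names.
kept_q10 = [rec.split("\t")[0] for rec in filter_by_mapq(example, 10)]
print(kept_q10)
```

With a threshold of 0 every record is kept, including the randomly placed repetitive reads. Raising the threshold drops them.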

        As for the calculation of mapping quality, it just follows a very simple Bayesian procedure. You can calculate p(z|x,u), the probability of observing read z given that it maps to position u on the reference x. With Bayes' formula you get the posterior p(u|x,z). The mapping quality is -10*log10(1 - p(u|x,z)).
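A rough Python sketch of this calculation, not MAQ's actual code. It assumes a uniform prior over a known list of candidate positions, so the posterior is each likelihood divided by the sum of all likelihoods. The function name and the cap of 99 for a unique hit are hypothetical choices for the example.

```python
import math


def mapping_quality(likelihoods, best_index, cap=99.0):
    """Phred-scaled mapping quality for the candidate at best_index.

    likelihoods[i] plays the role of p(z|x,u_i). Under a uniform prior
    (an assumption of this sketch), the posterior p(u|x,z) is
    likelihoods[i] / sum(likelihoods).
    """
    total = sum(likelihoods)
    if total <= 0:
        raise ValueError("at least one likelihood must be positive")
    # 1 - p(u|x,z), computed as the share of all the other candidates.
    err = sum(p for i, p in enumerate(likelihoods) if i != best_index) / total
    if err <= 0:
        return cap  # a single candidate: quality would be infinite, so cap it
    return min(cap, -10.0 * math.log10(err))


# One strong hit and one weak alternative: posterior 0.99, quality about 20.
print(round(mapping_quality([0.99, 0.01], 0), 2))
# Two equally good hits: posterior 0.5, quality about 3.
print(round(mapping_quality([0.5, 0.5], 0), 2))
```

Two equally good hits come out near 3 here rather than exactly 0, so the sketch only illustrates the formula and does not reproduce MAQ's reported values.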



        • #19
          Thanks for the info, lh3. I agree that it is very useful to report the locations of repetitive / non-uniquely mappable reads. However, can MAQ be set to report ALL of the repetitive locations rather than just a single random one? I know that this is off-topic; apologies.



          • #20
            The latest version, 0.6.6, can output ALL hits with 0- or 1-mismatch in the seed.



            • #21
              Mira

              Hello everybody,

              I am new to this group and couldn't resist joining this exciting discussion.

              Has anybody heard of MIRA?

              There is this guy quietly working on another software tool for next-gen assembly.

              The USP of this tool is that it can perform a true hybrid assembly (Sanger+454 or 454+Solexa), which I believe will solve the next-gen assembly issues.

              Version 2.9.95 doesn't support SNP analysis yet, but it is compact.

              This tool might be on the slower side because it performs the assembly iteratively, correcting errors along the way.

              I hope somebody evaluates this new version, because I don't have the much-needed hardware to run this program.

              Regards,
              Amit



              • #22
                Originally posted by lh3
                The latest version, 0.6.6, can output ALL hits with 0- or 1-mismatch in the seed.
                Hi Heng.
                I was excited to see this feature added to MAQ in the latest version as much of my work is applied to RNA (hence, it is quantitative). This should (hopefully) allow me to reduce some biases introduced by losing reads which map to multiple locations. Now, I am wondering, how do I go about using this feature? Is there a new option when running maq map? Or mapview? I have been unable to find it in the manpage.

                Thanks,

                Ryan

                FOLLOWUP:

                I found out the answer to this in the latest doc provided with version 0.6.6 (not the version on the sourceforge page).

                Usage:   maq map [options] <out.map> <chr.bfa> <reads_1.bfq> [reads_2.bfq]

                Options: -1 INT      length of the first read (<64) [0]
                         -2 INT      length of the second read (<64) [0]
                         -m FLOAT    rate of difference between reads and references [0.001]
                         -e INT      maximum allowed sum of qualities of mismatches [70]
                         -d FILE     adapter sequence file [null]
                         -a INT      max distance between two paired reads [250]
                         -n INT      number of mismatches in the first 24bp [2]
                         -M c|g      methylation alignment mode [null]
                         -u FILE     dump unmapped and poorly aligned reads to FILE [null]
                         -H FILE     dump multiple/all 01-mismatch hits to FILE [null]
                         -C INT      max number of hits to output. >512 for all 01 hits. [250]
                         -s INT      seed for random number generator [random]
                         -N          record mismatch positions (max read length<=55)
                         -t          trim all reads (usually not recommended)
                         -c          match in the colorspace
                Last edited by myrna; 05-05-2008, 02:00 PM.



                • #23
                  Originally posted by myrna
                  -H FILE dump multiple/all 01-mismatch hits to FILE [null]
                  If you specify this option, maq will dump all hits to a gzip file. -C specifies how many hits to output.
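As a sketch of how the two options fit together, the Python snippet below only assembles a `maq map` command line from the usage text quoted above; it does not run MAQ. The helper name and all file names are hypothetical. The value 513 is chosen because the help text says ">512 for all 01 hits".

```python
def build_maq_map_command(out_map, ref_bfa, reads_bfq, hits_file=None, max_hits=None):
    """Return an argv list for `maq map`, following the quoted usage line.

    hits_file maps to -H (dump multiple/all 01-mismatch hits to FILE) and
    max_hits maps to -C (max number of hits to output). File names are
    placeholders supplied by the caller.
    """
    cmd = ["maq", "map"]
    if hits_file is not None:
        cmd += ["-H", hits_file]
    if max_hits is not None:
        cmd += ["-C", str(max_hits)]
    # Positional arguments: <out.map> <chr.bfa> <reads_1.bfq> [reads_2.bfq]
    cmd += [out_map, ref_bfa] + list(reads_bfq)
    return cmd


cmd = build_maq_map_command(
    "out.map", "ref.bfa", ["reads_1.bfq"],
    hits_file="all_hits.gz", max_hits=513,
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` on a machine where MAQ 0.6.6 or later is installed.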



                  • #24
                    Eland is going to be hard to beat. It has had a few years of hard work, optimisation and thought put into it by Anthony (SSAHA) Cox at Solexa. It was designed from day 1 for aligning far more reads than you currently get from a GAII, to a full human reference on a desktop computer.



                    • #25
                      Heng,
                      If you wanted to test RMAPQ, you could always convert the FASTQ files into an approximation of the PRB files:
                      Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

                      Shaun
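One way such an approximation might look, as a hedged Python sketch. The assumptions are mine, not the linked thread's: the FASTQ qualities are Phred+33, a PRB record holds four Solexa-scaled scores per cycle in A, C, G, T order, each clamped to [-40, 40], and the error probability is split evenly across the three uncalled bases. The on-disk PRB layout is not covered and all function names are hypothetical.

```python
import math

BASES = "ACGT"  # assumed column order of the four per-cycle scores


def solexa_score(p):
    """Solexa-style score 10*log10(p/(1-p)), rounded and clamped to [-40, 40]."""
    p = min(max(p, 1e-10), 1 - 1e-10)  # keep the log argument finite
    return max(-40, min(40, round(10 * math.log10(p / (1 - p)))))


def prb_row(base, phred_q):
    """Approximate the four scores (A, C, G, T) for one called base."""
    # Cap the error at 0.75 so a Phred score of 0 means "all bases equally likely".
    err = min(10 ** (-phred_q / 10), 0.75)
    if base not in BASES:
        # Unknown base such as N: treat every base as equally likely.
        return tuple(solexa_score(0.25) for _ in BASES)
    return tuple(
        solexa_score(1 - err) if b == base else solexa_score(err / 3)
        for b in BASES
    )


def fastq_to_prb_rows(seq, qual, offset=33):
    """Convert one FASTQ record (sequence, quality string) to per-cycle rows."""
    return [prb_row(b, ord(c) - offset) for b, c in zip(seq.upper(), qual)]


# '?' encodes Phred 30 and 'I' encodes Phred 40 under the Phred+33 assumption.
for row in fastq_to_prb_rows("AC", "?I"):
    print(row)
```

The information about which wrong base was the runner-up is lost in FASTQ, so the three uncalled bases always receive identical scores here.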



                      • #26
                        I am keen to find out what the optimum set of parameters is for MAQ in situations where we expect to find more indels as well as the 0-2 mismatch hits.
                        The latest version sounds like it would be better for minimizing false positives.



                        • #27
                          Originally posted by Amit
                          ...
                          MIRA
                          ...
                          Version 2.9.95 doesn't support SNP analysis yet, but it is compact.
                          ...
                          It does since 2.6 or something. Alas, the docs need to catch up.



                          • #28
                            SeqMap (http://biogibbs.stanford.edu/~jiangh/SeqMap/) works like Eland, and can handle 3 or more mismatched bases as well as indels.



                            • #29
                              Hi Heng!

                              Have you heard about SOCS (http://bioinformatics.oxfordjournals...tract/btn512v1, http://socs.biology.gatech.edu/)? In their article they say that "The overall algorithm is similar to that used by software tools developed for analysis of Illumina-Solexa data (Li et al, 2008; Smith et al, 2008)."

                              I'm interested in aligning SOLiD data and I'd like to know your opinion on what to use: Maq, Mosaik, SHRiMP, ZOOM, or this new tool SOCS.

                              Best regards,
                              Valentina



                              • #30
                                Where to find a recent benchmark?

                                Hi lh3,

                                Thanks for the original benchmark! I'm actually looking for a similar benchmark that includes the latest tools, such as ZOOM!, bowtie, R Biostrings pairwiseAlignment(), etc. Does anybody know of one?

                                Cheers,

                                N.
