**ShaunMahony** · 05-04-2008, 08:40 AM

I'm also interested in how MAQ assigns quality scores. Can you confirm what you meant by "Q0;Q10;Q20" in the MAQ tests? Is this a threshold on the quality score that MAQ gives the alignment (as opposed to the quality score of the read in the Fastq file)? If a read maps to multiple locations, MAQ reports one location at random and assigns a quality score of 0. Therefore the Q0 accuracy should be much less than if you had excluded these alignments. I think that this behavior is a bit strange; it would be less confusing if MAQ didn't report any matches for the non-uniquely mapping reads and instead reported the number of places that the read maps (the whole read, not just the first 25 bases).

**ECO** · 05-04-2008, 09:11 AM

I'm definitely out of my league in this discussion, but if anyone needs hosting for some of these sample datasets, let me know!

**lh3** · 05-04-2008, 11:16 AM

Q0;Q10;Q20 are thresholds on the alignment quality score assigned by MAQ.

MAQ was initially designed for resequencing, and keeping these repetitive reads is quite useful for the subsequent SNP calling with MAQ. It also helps CNV calling. I can understand that a lot of people do not want to see all these repetitive reads, but putting a threshold on mapping quality is very easy anyway. In addition, different people may want to set different thresholds.
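As a rough illustration of how easy such a threshold is to apply, here is a minimal Python sketch. It assumes the alignments have been exported as tab-delimited text with the mapping quality in the seventh column, which is how I understand `maq mapview` output to be laid out; check this against your MAQ version. The function name `filter_by_mapq` and the `column` parameter are hypothetical.

```python
def filter_by_mapq(lines, min_quality, column=6):
    """Keep alignment lines whose mapping quality is >= min_quality.

    Assumption: each line is tab-delimited and the mapping quality sits at
    zero-based index `column` (the 7th field). Adjust if your export differs.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= column:
            continue  # skip short or malformed lines
        if int(fields[column]) >= min_quality:
            kept.append(line)
    return kept


# Made-up example records: name, chr, pos, strand, insert size, flag, mapQ
example = [
    "read1\tchr1\t100\t+\t0\t0\t0",   # repetitive read, quality 0
    "read2\tchr1\t250\t-\t0\t0\t15",
    "read3\tchr2\t900\t+\t0\t0\t42",
]
q10 = filter_by_mapq(example, 10)
q20 = filter_by_mapq(example, 20)
```

With a threshold of 0 every read is kept, including the randomly placed repetitive ones, which is why Q0 accuracy is expected to be lower than Q10 or Q20.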

As for the calculation of mapping quality, it just follows a very simple Bayesian procedure. You can calculate p(z|x,u), the probability of read z given that it is mapped to position u on the reference x. With Bayes' formula you get p(u|x,z). The mapping quality is -10 log10(1 - p(u|x,z)).
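The formula can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not MAQ's actual implementation: it assumes a uniform prior over the candidate positions and that the likelihoods p(z|x,u) for all candidates are already known. The function name and the `cap` value (used when the error probability is zero) are hypothetical.

```python
import math


def mapping_quality(likelihoods, best, cap=99.0):
    """Phred-scaled mapping quality for the candidate position `best`.

    likelihoods[i] stands in for p(z|x,u_i). Assumption: uniform prior over
    candidate positions, so p(u|x,z) = p(z|x,u) / sum_v p(z|x,v).
    `cap` is an illustrative ceiling, not a value taken from MAQ.
    """
    total = sum(likelihoods)
    if total <= 0:
        raise ValueError("at least one likelihood must be positive")
    # 1 - p(u|x,z): posterior mass on every position other than `best`
    others = sum(p for i, p in enumerate(likelihoods) if i != best)
    error = others / total
    if error == 0.0:
        return cap
    return min(cap, -10.0 * math.log10(error))


unique_hit = mapping_quality([0.999, 0.001], best=0)  # error 0.001
two_equal = mapping_quality([0.5, 0.5], best=0)       # error 0.5
```

A read with one dominant hit gets a quality near 30 here, while two equally good hits give a value near 3 under this bare formula.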

**ShaunMahony** · 05-04-2008, 11:32 AM

Thanks for the info, lh3. I agree that it is very useful to report the locations of repetitive / non-uniquely mappable reads. However, can MAQ be set to report ALL of the repetitive locations rather than just a single random one? I know that this is off-topic; apologies.

**lh3** · 05-04-2008, 11:42 AM

The latest version, 0.6.6, can output ALL hits with 0- or 1-mismatch in the seed.

**Amit** · 05-05-2008, 04:55 AM

Mira

Hello everybody,

I am new to this group and couldn't resist joining this exciting discussion.

Has anybody heard of MIRA?

http://www.chevreux.org/projects_mira.html

There is this guy quietly working on another software tool for next-gen assembly.

The USP of this tool is that it can perform a true hybrid assembly (Sanger+454 or 454+Solexa), which I believe will solve the next-gen assembly issues.

Although version 2.9.95 doesn't support SNP analysis yet, it is compact.

This tool might be on the slower side because it performs the assembly iteratively, correcting errors on the way.

I hope somebody evaluates this new version, because I don't have the much-needed hardware to run this program.

regds,
Amit

**myrna** · 05-05-2008, 01:54 PM

Originally posted by lh3

The latest version, 0.6.6, can output ALL hits with 0- or 1-mismatch in the seed.

Hi Heng.
I was excited to see this feature added to MAQ in the latest version as much of my work is applied to RNA (hence, it is quantitative). This should (hopefully) allow me to reduce some biases introduced by losing reads which map to multiple locations. Now, I am wondering, how do I go about using this feature? Is there a new option when running maq map? Or mapview? I have been unable to find it in the manpage.

Thanks,

Ryan

FOLLOWUP:

I found the answer in the latest documentation provided with version 0.6.6 (not the version on the SourceForge page).

    Usage:   maq map [options] <out.map> <chr.bfa> <reads_1.bfq> [reads_2.bfq]

    Options: -1 INT      length of the first read (<64) [0]
             -2 INT      length of the second read (<64) [0]
             -m FLOAT    rate of difference between reads and references [0.001]
             -e INT      maximum allowed sum of qualities of mismatches [70]
             -d FILE     adapter sequence file [null]
             -a INT      max distance between two paired reads [250]
             -n INT      number of mismatches in the first 24bp [2]
             -M c|g      methylation alignment mode [null]
             -u FILE     dump unmapped and poorly aligned reads to FILE [null]
             -H FILE     dump multiple/all 01-mismatch hits to FILE [null]
             -C INT      max number of hits to output. >512 for all 01 hits. [250]
             -s INT      seed for random number generator [random]
             -N          record mismatch positions (max read length<=55)
             -t          trim all reads (usually not recommended)
             -c          match in the colorspace

**lh3** · 05-05-2008, 02:46 PM

Originally posted by myrna

-H FILE dump multiple/all 01-mismatch hits to FILE [null]

If you specify this option, maq will dump all hits to a gzip file. -C specifies how many hits to output.
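For anyone scripting this, a minimal sketch of assembling such a command line in Python follows. It uses only the `-H` and `-C` options and the argument order from the usage text quoted above; all file names are hypothetical placeholders, and the value 513 is chosen only because the usage text says values above 512 output all 01 hits. The helper builds the argument list and does not run MAQ itself.

```python
def build_maq_map_command(out_map, ref_bfa, reads_bfq, hits_file, max_hits=513):
    """Build the argument list for `maq map` with multi-hit dumping enabled.

    Flags -H and -C come from the 0.6.6 usage text; the file names passed in
    are placeholders supplied by the caller.
    """
    return [
        "maq", "map",
        "-H", hits_file,       # dump multiple/all 01-mismatch hits to this file
        "-C", str(max_hits),   # max number of hits; >512 means all 01 hits
        out_map, ref_bfa, reads_bfq,
    ]


# Hypothetical file names, for illustration only
cmd = build_maq_map_command("out.map", "chr.bfa", "reads_1.bfq", "hits.txt")
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where MAQ is installed.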

**cgb** · 05-06-2008, 04:55 AM

Eland is going to be hard to beat. It has had a few years of hard work, optimisation and thought put into it by Anthony (SSAHA) Cox at Solexa. It was designed from day 1 for aligning far more reads than you currently get from a GAII, to a full human reference on a desktop computer.

**ShaunMahony** · 05-09-2008, 12:19 PM

Heng,
If you wanted to test RMAPQ, you could always convert the FASTQ files into an approximation of the PRB files:

http://seqanswers.com/forums/showthread.php?t=236

Shaun

**zee** · 05-09-2008, 08:56 PM

I am keen to find out what the optimum set of parameters is for MAQ in situations where we expect to find more indels as well as the 0-2 mismatch hits.
The latest version sounds like it would be better for minimizing false positives.

**BaCh** · 05-13-2008, 03:23 AM

Originally posted by Amit

...
MIRA
...
Although version 2.9.95 doesn't support SNP analysis yet, it is compact.
...

It does since 2.6 or something. Alas, the docs need to catch up.

**jhui** · 06-19-2008, 12:20 PM

SeqMap (http://biogibbs.stanford.edu/~jiangh/SeqMap/) works like Eland, but can handle 3 or more bp mismatches and also indels.

**valeu** · 11-04-2008, 07:59 AM

Hi Heng!

Have you heard about SOCS (http://bioinformatics.oxfordjournals...tract/btn512v1, http://socs.biology.gatech.edu/)? In their article they say that "The overall algorithm is similar to that used by software tools developed for analysis of Illumina-Solexa data (Li et al, 2008; Smith et al, 2008)".

I'm interested in aligning SOLiD data and I'd like your opinion on what to use: Maq, Mosaik, SHRiMP, ZOOM, or this new tool SOCS.

Best regards,
Valentina

**regyre** · 01-22-2009, 12:56 AM

Where to find a recent benchmarking?

Hi lh3,

Thanks for the original benchmarking! I'm actually looking for a similar benchmark that includes the latest tools, like ZOOM!, bowtie, R Biostrings pairwiseAlignment(), etc. Has anybody heard of one?

Cheers,

N.
