
**tonybolger** · 07-27-2011, 03:52 AM

As a quick hack, if you can live with the error of 'double counting' multi-nt errors, you can just factor out the 'weakest' (lowest) quality. So, assuming errors at a and b can't both happen:

Perror(a) = 10**-(a/10)
Perror(b) = 10**-(b/10)

Perror (a or b) = Perror(a)+Perror(b)
(this double-counts if both happen at once)

Then just use:
m=min(a,b)
Perror (a or b) = Perror(m)*(Perror(a-m)+Perror(b-m))

If a and b are very different, the additional term will be effectively zero anyway (the other term is Perror(0) = 1), meaning that the combined error is almost exactly equal to the most likely error, which makes sense.

You don't actually need to calculate Perror(m): just keep m in Q space and add it back in later (since multiplying probabilities corresponds to adding Q values).

You can also apply this approach selectively, for example only when the minimum quality score is at least 30, where double errors will be very rare.
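A minimal Python sketch of the shortcut described above, assuming two Phred scores a and b. The function names `perror` and `combine_q_approx` are made up for illustration and do not come from any library.

```python
import math


def perror(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def combine_q_approx(a, b):
    """Approximate Phred score for 'error at a or b'.

    Ignores the case where both errors happen at once, so the
    combined error probability is slightly overestimated.
    """
    m = min(a, b)
    # One of these two terms is always Perror(0) == 1.
    residual = perror(a - m) + perror(b - m)
    # Perror(m) is never computed: m stays in Q space and the
    # residual is converted back to the Phred scale and added to it.
    return m - 10 * math.log10(residual)


if __name__ == "__main__":
    print(combine_q_approx(20, 20))  # about 16.99
    print(combine_q_approx(40, 10))  # just under 10
```

With a = 40 and b = 10 the result is almost exactly 10, which matches the remark that the combined error is dominated by the most likely error.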

**sulicon** · 07-27-2011, 11:22 AM

Thanks a lot

The key is skipping double errors when the quality score is high enough, as you mentioned in your last sentence.

**srasdk** · 07-27-2011, 08:10 PM

If you painstakingly follow the math, the resulting formula would be:

c = 0.5*(a+b) - 10*log10( 10**(0.05*(a-b)) + 10**(0.05*(b-a)) - 10**(-0.05*(a+b)) )

As you can see, the first term is an average.
The second term reduces the result, since the log will always be positive (a and b are positive, which makes the sum under the log greater than 1).
In your example (a = b = 20) the value under the log reduces to:
2 - 10**(-2) ~= 2. The higher 'a' is, the closer you get to equality.
So yes, your observations are correct.
No, it will not be the same for a != b or when you start using low quality values.

This formula is safe for all phred values.
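As a rough check, the formula can be written out in Python and compared against the direct calculation P = Pa + Pb - Pa*Pb. This is only a sketch: the function names `combine_q_exact` and `combine_q_direct` are hypothetical.

```python
import math


def combine_q_exact(a, b):
    """Combined Phred score using the closed-form expression above."""
    mean = 0.5 * (a + b)
    under_log = (
        10 ** (0.05 * (a - b))
        + 10 ** (0.05 * (b - a))
        - 10 ** (-0.05 * (a + b))
    )
    return mean - 10 * math.log10(under_log)


def combine_q_direct(a, b):
    """Same quantity computed in probability space, for comparison."""
    pa = 10 ** (-a / 10)
    pb = 10 ** (-b / 10)
    # Probability that at least one of the two errors happens.
    p = pa + pb - pa * pb
    return -10 * math.log10(p)


if __name__ == "__main__":
    for a, b in [(20, 20), (30, 10), (2, 3)]:
        print(a, b, combine_q_exact(a, b), combine_q_direct(a, b))
```

For a = b = 20 both functions give about 17.01, slightly above the 16.99 that results from ignoring double errors.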


Any suggestion for calculating overall Phred scale quality score for a sequence?
