Introducing CalcTrueQuality, a quality-score recalibrator

Brian Bushnell


11-10-2015, 05:15 PM

I suggest remapping them with BBMap. You don't need to remap all of them. If speed is a big concern and you have a lot of reads, you could randomly subsample 10% of the pairs (in BBMap, the flag would be "samplerate=0.1") or even fewer, though obviously the more data you use, the more accurate the recalibration. If the reads have MD tags, it is theoretically possible to convert the CIGAR strings from M to X and = without remapping, but I have not yet written something to do that. Such a tool probably exists, though.
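For illustration, here is a minimal Python sketch of the MD-tag conversion mentioned above. It is not part of BBTools; the function names (`md_to_match_flags`, `cigar_to_extended`) are made up for this example, and it assumes the MD tag follows the SAM specification (match counts, mismatched reference bases, and `^`-prefixed deletions) and is consistent with the CIGAR string.

```python
import re
from itertools import groupby

# Hypothetical helpers for illustration only; not a BBTools API.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
MD_RE = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def md_to_match_flags(md):
    """Expand an MD tag into one boolean per aligned (M/=/X) base.

    True means the read base matches the reference, False means mismatch.
    Deleted reference bases (the ^ACGT parts) are skipped, because the
    CIGAR string already represents them as D operations.
    """
    flags = []
    for run, _deletion, mismatch in MD_RE.findall(md):
        if run:
            flags.extend([True] * int(run))
        elif mismatch:
            flags.append(False)
    return flags


def cigar_to_extended(cigar, md):
    """Rewrite M operations as runs of = (match) and X (mismatch)."""
    flags = md_to_match_flags(md)
    pos = 0
    out = []
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":
            chunk = flags[pos:pos + n]
            if len(chunk) != n:
                raise ValueError("MD tag covers fewer bases than the CIGAR")
            pos += n
            # Run-length encode consecutive matches and mismatches.
            for is_match, group in groupby(chunk):
                out.append(f"{len(list(group))}{'=' if is_match else 'X'}")
        else:
            # I, D, N, S, H and P pass through unchanged.
            out.append(f"{n}{op}")
    if pos != len(flags):
        raise ValueError("MD tag covers more bases than the CIGAR")
    return "".join(out)


print(cigar_to_extended("3S5M2D4M1I3M", "2C2^AT7"))
```

A real tool would also need to read and write SAM/BAM records and handle reads that lack an MD tag; this sketch covers only the string conversion.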

-Brian

P.S. As long as the pairs are randomly sampled (rather than all taken from the beginning of the file), 5-10 million pairs is adequate for good recalibration. The recalibration is "soft": where there is not enough data, it simply keeps the original quality score, and as the amount of data grows, the output asymptotically approaches the measured quality score.

Last edited by Brian Bushnell; 11-10-2015, 05:19 PM.