**SNPsaurus** · 01-26-2019, 03:38 PM

We polish with arrow and specify a fastq file as one of the outputs ("-o sample_consensus.fastq"). It generates a fastq file with a consensus sequence for each contig along with its quality scores. You might check whether quiver has the same option, or switch to arrow (here's a blog post about doing so: https://dazzlerblog.wordpress.com/tag/arrow/ ).

**phleroy** · 01-29-2019, 05:50 AM

Thank you very much for this suggestion.
We can indeed obtain a fastq file with quiver using the option -o out.fastq, as you mentioned for arrow.

The question is then: how do you recover the mean QV for the consensus?

In the fastq file we can see:
@000000F|quiver
ATCATTGTTACTACTAGAGGAAGAATCTTTCTTG ...
+
"RQQPQQQRQQQQQSRRQSTSSQRQRSSSRRRQQRSRRRQRSRQ ...

I guess the quality value for each consensus nucleotide is given by the last line of the record (the one after the "+")? But how do I calculate it?

Thank you again for any help
Philippe
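As a rough illustration of how the quality line maps to per-base QVs, here is a minimal sketch. It assumes the file uses the common Phred+33 (Sanger) encoding, where each character's ASCII code minus 33 is the QV; the function name `decode_phred33` is made up for this example and is not part of quiver or arrow.

```python
def decode_phred33(quality_line):
    """Convert a FASTQ quality string to integer QVs.

    Assumption: Phred+33 encoding (QV = ASCII code - 33).
    """
    return [ord(ch) - 33 for ch in quality_line]


# First few characters of the quality line quoted above.
excerpt = '"RQQPQQQR'
print(decode_phred33(excerpt))
# [1, 49, 48, 48, 47, 48, 48, 48, 49]
```

Under that assumption, the leading `"` decodes to QV 1 and the `P` to `T` characters decode to QVs of roughly 47 to 51.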

**Magdoll** · 02-04-2019, 02:38 PM

You can convert the Phred QV scores to error probabilities, then sum the probabilities over the entire sequence to get the expected number of errors.
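A minimal sketch of that calculation, assuming Phred+33 encoding and the usual relation p = 10^(-QV/10). The function names are hypothetical, and defining expected accuracy as one minus the expected errors per base is an assumption made for illustration; it is not necessarily what any particular script does.

```python
def error_probabilities(quality_line):
    """Per-base error probabilities from a quality string.

    Assumptions: Phred+33 encoding, p = 10 ** (-QV / 10).
    """
    return [10 ** (-(ord(ch) - 33) / 10) for ch in quality_line]


def expected_errors(quality_line):
    """Expected number of errors: the sum of per-base error probabilities."""
    return sum(error_probabilities(quality_line))


def expected_accuracy(quality_line):
    """Illustrative definition: 1 - expected errors / sequence length."""
    if not quality_line:
        raise ValueError("empty quality string")
    return 1.0 - expected_errors(quality_line) / len(quality_line)


# Toy record: '5' is QV 20 (p = 0.01), 'I' is QV 40 (p = 0.0001).
print(expected_errors("5I"))
print(expected_accuracy("5I"))
```

Because the probabilities are summed, a handful of low-QV bases can dominate the result even when most of the sequence has very high QVs.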

You can use this Python script to calculate the expected accuracy from a FASTQ file (the repo is meant for PacBio transcriptome data, but this script is generic):

https://github.com/Magdoll/cDNA_Cupcake/wiki/Sequence-Manipulation-Wiki#expacc

**phleroy** · 02-05-2019, 12:20 AM

Thank you so much, I will try this option as soon as possible and tell you :-)

**phleroy** · 02-11-2019, 07:26 AM

I tried the Python script (calc_expected_accuracy_from_fastq.py) on our fastq consensus sequence obtained with quiver and, as expected, got the expected accuracy: expected_accuracy=0.997

In a previous analysis I used two smrtlink Python scripts to estimate the mean QV:
- summarize_coverage.py to obtain an alignment summary gff file
- polished_assembly.py to obtain the csv file, which gives a mean_qv of 48.65

I have the feeling that the two values measure different things? I am not a specialist in this area and would welcome any remarks or suggestions.

Nevertheless, these two values, mean_qv and expected_accuracy, should both give an estimate of the quality of the consensus assembly. I just need to understand precisely how to interpret each one.

Thank you in advance
Philippe
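To put the two numbers on a common scale, a QV can be converted to an accuracy and back with QV = -10·log10(1 - accuracy). The sketch below does that conversion, then uses a made-up toy sequence to show one possible reason the metrics can disagree: an arithmetic mean of per-base QVs behaves very differently from an accuracy derived from summed error probabilities. Whether polished_assembly.py computes mean_qv as such an arithmetic mean is an assumption here; the thread does not say. The function names are hypothetical.

```python
import math


def qv_to_accuracy(qv):
    """Accuracy implied by a Phred-scaled QV."""
    return 1.0 - 10 ** (-qv / 10)


def accuracy_to_qv(accuracy):
    """Phred-scaled QV implied by an accuracy below 1."""
    return -10 * math.log10(1.0 - accuracy)


# The two values reported in the post, expressed on both scales.
print(accuracy_to_qv(0.997))   # about QV 25
print(qv_to_accuracy(48.65))   # about 0.99999

# Toy data (not from the thread): 99 bases at QV 50 and one base at QV 1.
qvs = [50] * 99 + [1]
mean_of_qvs = sum(qvs) / len(qvs)
mean_error = sum(10 ** (-q / 10) for q in qvs) / len(qvs)
print(mean_of_qvs)                       # about 49.5
print(accuracy_to_qv(1.0 - mean_error))  # about 21
```

In this toy case a single low-QV base barely moves the mean of the QVs but pulls the probability-based figure down by almost 30 QV units, which is the same direction as the gap between 48.65 and 0.997.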

**rhall** · 02-12-2019, 09:22 AM

If you assemble a set of reads, then use them to polish the assembly, there is no way to measure any truly meaningful consensus quality without an orthogonal datatype, or knowledge of ground truth. The expected accuracy from the fastq that results from polishing is highly dependent on the consensus algorithm and may not be a true indication of the quality of the consensus.

PacBio consensus quality