SEQanswers

Old 01-25-2019, 12:02 AM   #1
phleroy
Junior Member
 
Location: Toulouse, France

Join Date: Jan 2019
Posts: 4
PacBio consensus quality

Hello

I have sequenced a BAC clone with a PacBio RSII.
For the assembly I used Falcon through pbbioconda, and for polishing I used quiver.

To get an estimate of the consensus quality, I remapped the original BAM reads file against the consensus.

How can I estimate a mean quality value, in other words a consensus Phred score, for the base calls of the consensus? :-)

Thank you in advance
Philippe

Last edited by phleroy; 01-25-2019 at 12:07 AM.
Old 01-26-2019, 02:38 PM   #2
SNPsaurus
Registered Vendor
 
Location: Eugene, OR

Join Date: May 2013
Posts: 465

We polish with arrow and simply name one of the outputs as a FASTQ file ("-o sample_consensus.fastq"); this generates a FASTQ file with the consensus and the quality scores for each contig. You might check whether quiver has the same option, or switch to arrow (here's a blog post about doing so: https://dazzlerblog.wordpress.com/tag/arrow/ ).
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
Old 01-29-2019, 04:50 AM   #3
phleroy
Junior Member
 
Location: Toulouse, France

Join Date: Jan 2019
Posts: 4

Thank you very much for this suggestion.
We can indeed obtain a FASTQ file with quiver using the option -o out.fastq, as you mentioned for arrow.

The question is then: how do you recover the mean QV for the consensus?

In the FASTQ file we see:
@000000F|quiver
ATCATTGTTACTACTAGAGGAAGAATCTTTCTTG ...
+
"RQQPQQQRQQQQQSRRQSTSSQRQRSSSRRRQQRSRRRQRSRQ ...

I guess the quality value for each consensus nucleotide is encoded in the last line (the one after the "+")? But how do I calculate it?

Thank you again for any help
Philippe
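
A minimal sketch of how such a quality line can be decoded, assuming the file uses the standard Sanger encoding (Phred+33), where each character's ASCII code minus 33 is the Phred score Q, and the error probability is 10^(-Q/10). The function names are illustrative only and are not part of quiver or arrow.

```python
def decode_phred33(quality_line):
    """Convert a FASTQ quality string into integer Phred scores (assumes Phred+33)."""
    return [ord(ch) - 33 for ch in quality_line]


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# First characters of the quality line quoted above (the rest is elided in the post).
quality_prefix = '"RQQPQQQRQQ'

for ch, q in zip(quality_prefix, decode_phred33(quality_prefix)):
    print(f"{ch!r}: Q{q}, P(error) = {phred_to_error_prob(q):.2e}")
```

Under that assumption, 'Q', 'R' and 'S' decode to Q48, Q49 and Q50, while the leading '"' decodes to Q1, i.e. a very uncertain first base.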
Old 02-04-2019, 01:38 PM   #4
Magdoll
Member
 
Location: Bay Area

Join Date: Aug 2011
Posts: 30

You can convert the Phred QV scores to error probabilities, then sum the probabilities over the entire sequence to get the expected number of errors.

You can use this Python script to calculate the expected accuracy from a FASTQ file (the repo is meant for PacBio transcriptome data, but the script itself is generic):
https://github.com/Magdoll/cDNA_Cupc...on-Wiki#expacc
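
For illustration, the calculation described above can be sketched in a few lines of plain Python. This is not the linked script; it is a simplified stand-in that assumes Phred+33 encoding and a strict four-lines-per-record FASTQ, and the function names and the toy record are made up for the example.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.strip()[1:], seq, qual


def expected_errors(quality, offset=33):
    """Sum of per-base error probabilities = expected number of wrong bases."""
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def expected_accuracy(quality, offset=33):
    """1 - (expected errors / sequence length)."""
    if not quality:
        raise ValueError("empty quality string")
    return 1 - expected_errors(quality, offset) / len(quality)


# Toy record (hypothetical data): three Q40 bases ('I') and one Q20 base ('5').
toy = io.StringIO("@toy_contig\nACGT\n+\nIII5\n")
for name, seq, qual in read_fastq(toy):
    print(name, len(seq), expected_errors(qual), expected_accuracy(qual))
```

To run it on a real consensus, replace the toy `io.StringIO` with `open("out.fastq")`. Note that a single low-quality base contributes far more to the expected error count than many high-quality ones.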
Old 02-04-2019, 11:20 PM   #5
phleroy
Junior Member
 
Location: Toulouse, France

Join Date: Jan 2019
Posts: 4

Thank you so much, I will try this option as soon as possible and let you know :-)
Old 02-11-2019, 06:26 AM   #6
phleroy
Junior Member
 
Location: Toulouse, France

Join Date: Jan 2019
Posts: 4

I tried the Python script (calc_expected_accuracy_from_fastq.py) on our FASTQ consensus sequence, which was obtained with quiver, and got the expected accuracy as output: expected_accuracy=0.997

In a previous analysis I used two smrtlink Python scripts to estimate the mean QV:
- summarize_coverage.py to obtain an alignment summary GFF file
- polished_assembly.py to obtain the CSV file, which gives a mean_qv of 48.65

I have the feeling that the two values measure different things? I am not a specialist in this area and would welcome any remarks or suggestions.

Nevertheless, these two values, mean_qv and expected_accuracy, should both give an estimate of the quality of the consensus assembly. I just need to understand precisely how to interpret each value.

Thank you in advance
Philippe
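
One way to put the two numbers on the same scale is the Phred relation QV = -10·log10(1 - accuracy). The sketch below does that conversion and also shows, on made-up scores, why an arithmetic mean of per-base QVs can be much higher than the QV implied by the mean error probability. How polished_assembly.py actually derives mean_qv is not stated in this thread, so the "arithmetic mean" reading is only an assumption.

```python
import math


def qv_to_accuracy(qv):
    """Accuracy implied by a Phred-scaled quality value."""
    return 1 - 10 ** (-qv / 10)


def accuracy_to_qv(accuracy):
    """Phred-scaled quality value implied by an accuracy below 1."""
    return -10 * math.log10(1 - accuracy)


# The two figures reported above, each converted to the other scale.
print(f"expected_accuracy=0.997 -> QV {accuracy_to_qv(0.997):.1f}")
print(f"mean_qv=48.65 -> accuracy {qv_to_accuracy(48.65):.6f}")

# Hypothetical contig: one Q1 base followed by 999 Q49 bases.
scores = [1] + [49] * 999
mean_qv = sum(scores) / len(scores)
mean_error = sum(10 ** (-q / 10) for q in scores) / len(scores)
qv_of_mean_error = -10 * math.log10(mean_error)

print(f"arithmetic mean of QVs: {mean_qv:.2f}")
print(f"QV of mean error prob.: {qv_of_mean_error:.2f}")
```

In the toy contig a single Q1 base barely moves the arithmetic mean QV but dominates the expected error count, so the two summaries can differ by a wide margin even though they are computed from the same quality string.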
Old 02-12-2019, 08:22 AM   #7
rhall
Senior Member
 
Location: San Francisco

Join Date: Aug 2012
Posts: 319

If you assemble a set of reads, then use them to polish the assembly, there is no way to measure any truly meaningful consensus quality without an orthogonal datatype, or knowledge of ground truth. The expected accuracy from the fastq that results from polishing is highly dependent on the consensus algorithm and may not be a true indication of the quality of the consensus.
Tags
consensus, pacbio, quality value

All times are GMT -8.