  • phleroy
    Junior Member
    • Jan 2019
    • 4

    PacBio consensus quality

    Hello

    I have sequenced a BAC clone with PacBio RSII.
    For the assembly I used Falcon through pbbioconda, and for polishing I used Quiver.

    To estimate the consensus quality, I remapped the original BAM reads file against the consensus.

    How can I estimate a mean quality value, in other words a consensus Phred score, for the base calls of the consensus? :-)

    Thank you in advance
    Philippe
    Last edited by phleroy; 01-25-2019, 01:07 AM.
  • SNPsaurus
    Registered Vendor
    • May 2013
    • 525

    #2
    We polish with Arrow and simply name one of the outputs as a FASTQ, "-o sample_consensus.fastq"; this generates a FASTQ file with the consensus sequence and quality scores for each contig. You might check whether Quiver has the same option, or switch to Arrow (here's a blog about doing so: https://dazzlerblog.wordpress.com/tag/arrow/ ).
    Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com

    • phleroy
      Junior Member
      • Jan 2019
      • 4

      #3
      Thank you very much for this suggestion.
      We can indeed obtain a FASTQ file from Quiver with the option -o out.fastq, as you mentioned for Arrow.

      The question then is: how do you recover the mean QV for the consensus?

      In the FASTQ file we can see:
      @000000F|quiver
      ATCATTGTTACTACTAGAGGAAGAATCTTTCTTG ...
      +
      "RQQPQQQRQQQQQSRRQSTSSQRQRSSSRRRQQRSRRRQRSRQ ...

      I guess the quality value for each consensus nucleotide is encoded in the line after the "+" (the fourth line of the record)? But how do I calculate it?
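
For reference, the line after "+" in a FASTQ record carries one character per consensus base. A minimal sketch of decoding it, assuming the common Phred+33 (Sanger) encoding; that encoding is an assumption about this particular file, and the function names are illustrative only:

```python
def phred33_to_qv(quality_string):
    """Decode a FASTQ quality string into integer QVs (assumes Phred+33)."""
    return [ord(ch) - 33 for ch in quality_string]


def qv_to_error_probability(qv):
    """Convert a Phred QV into the probability that the base call is wrong."""
    return 10 ** (-qv / 10)


# A few characters taken from the quality line shown above.
qvs = phred33_to_qv("RQQPQQQRQQ")
print(qvs)  # [49, 48, 48, 47, 48, 48, 48, 49, 48, 48]
print([qv_to_error_probability(q) for q in qvs[:2]])
```

Under that assumption, each of these bases has an estimated error probability on the order of 1e-5.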

      Thank you again for any help
      Philippe

      • Magdoll
        Member
        • Aug 2011
        • 30

        #4
        You can convert the Phred QV scores to error probabilities, then sum those probabilities over the entire sequence to get the expected number of errors.

        You can use this Python script to calculate the expected accuracy from a FASTQ file (the repo is meant for PacBio transcriptome data, but this script is generic):
        Miscellaneous collection of Python and R scripts for processing Iso-Seq data - Magdoll/cDNA_Cupcake
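
The conversion described above can be sketched roughly as follows. This is not the cDNA_Cupcake script, only a minimal illustration that assumes Phred+33 encoding and a simple four-lines-per-record FASTQ; the function names and the toy record are made up for the example:

```python
import io


def expected_errors(quality_string, offset=33):
    """Sum the per-base error probabilities decoded from a quality string."""
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality_string)


def iter_fastq(handle):
    """Yield (name, sequence, quality) from a four-lines-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality


def summarize(handle):
    """Map each record name to (expected errors, expected accuracy)."""
    results = {}
    for name, _sequence, quality in iter_fastq(handle):
        errors = expected_errors(quality)
        results[name] = (errors, 1 - errors / len(quality))
    return results


# Toy record (made-up data): '5' is QV 20 and '?' is QV 30 under Phred+33.
toy = io.StringIO("@toy_contig\nACGT\n+\n55??\n")
for name, (errors, accuracy) in summarize(toy).items():
    print(name, round(errors, 4), round(accuracy, 4))  # toy_contig 0.022 0.9945
```

To run it on a real file, pass an open file handle instead of the toy `io.StringIO` object.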

        • phleroy
          Junior Member
          • Jan 2019
          • 4

          #5
          Thank you so much. I will try this option as soon as possible and let you know :-)

          • phleroy
            Junior Member
            • Jan 2019
            • 4

            #6
            I tried the Python script (calc_expected_accuracy_from_fastq.py) on our FASTQ consensus sequence, which was obtained with Quiver, and it reported: expected_accuracy=0.997

            In a previous analysis I used two SMRT Link Python scripts to estimate the mean_qv:
            - summarize_coverage.py to obtain an alignment summary GFF file
            - polished_assembly.py to obtain the CSV file, which gives a mean_qv of 48.65

            I have the feeling that the two values estimate different metrics. I am not a specialist in this area, and I would welcome any remarks or suggestions.

            Nevertheless, both values, mean_qv and expected_accuracy, should give an estimate of the quality of the consensus assembly. I just need to understand precisely how to interpret each one.
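
For comparison, the two figures can be put on the same scale with the standard Phred relation QV = -10 * log10(1 - accuracy). The sketch below also shows that an arithmetic mean of per-base QVs and the QV of the mean error probability can differ widely when a few bases have low quality. Whether that explains the gap here is an assumption: how polished_assembly.py derives mean_qv has not been checked, and the per-base QVs in the example are made up.

```python
import math


def accuracy_to_qv(accuracy):
    """Phred-scale an accuracy: QV = -10 * log10(error probability)."""
    return -10 * math.log10(1 - accuracy)


def qv_to_accuracy(qv):
    """Invert the Phred relation: accuracy = 1 - 10^(-QV/10)."""
    return 1 - 10 ** (-qv / 10)


def qv_of_mean_error(qvs):
    """QV corresponding to the average per-base error probability."""
    mean_error = sum(10 ** (-q / 10) for q in qvs) / len(qvs)
    return -10 * math.log10(mean_error)


print(round(accuracy_to_qv(0.997), 1))  # 25.2
print(round(qv_to_accuracy(48.65), 6))  # 0.999986

# Made-up per-base QVs: three high-quality bases and one poor one.
toy_qvs = [50, 50, 50, 10]
print(sum(toy_qvs) / len(toy_qvs))          # 40.0
print(round(qv_of_mean_error(toy_qvs), 1))  # 16.0
```

On this scale, expected_accuracy=0.997 corresponds to roughly QV 25, while a mean_qv of 48.65 corresponds to an accuracy of about 0.99999, so the two numbers are not directly interchangeable.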

            Thank you in advance
            Philippe

            • rhall
              Senior Member
              • Aug 2012
              • 324

              #7
              If you assemble a set of reads, then use them to polish the assembly, there is no way to measure any truly meaningful consensus quality without an orthogonal datatype, or knowledge of ground truth. The expected accuracy from the fastq that results from polishing is highly dependent on the consensus algorithm and may not be a true indication of the quality of the consensus.
