  • base drop outs in PacBio assemblies

    Hello,
    I am new to this board and I apologize if this is old news, but I haven't been able to do a productive search for it.

    My agency recently got a PacBio, which is located in a different state than I am, so getting all the information takes me a while.

    Anyway, I submitted a Salmonella genomic DNA sample, which was the first bacterial sample that the lab ran on the machine. They ran the genome twice and assembled one run with HGAP2 and the other with HGAP3. Both assemblies gave a single large contig in the size range expected for a Salmonella genome (about 4.7 Mb) and three plasmids that match ones seen in a similar strain in NCBI. Average coverage on our runs was reported to me to be over 100X. Synteny with the NCBI genome was spot on.

    The problem was that when the genomes were aligned, there were many places where a single base was missing, most often at the beginning of a short homopolymeric run of maybe 6 to 10 bases. This happened throughout the genome at a rate of about once every 4,000 to 6,000 bases. In a three-genome alignment (our two runs and the NCBI sequence), the missing bases in our runs were usually at different places.

    So, does anybody know what is going on? What do we need to fix?

    Thanks,
    Rick

  • #2
    Hi. While it is a known issue that you are more likely to get deletion errors in long homopolymers, a good bacterial assembly should be able to reach QV50 (1 error in 100,000 bases) at 100x coverage. 1 error in ~6,000 bases (about QV38, below QV40) is a little high. Do you have the stats for the analysis? It would be interesting to check cell loading and preassembly yield.
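For reference, the QV figures above follow from the Phred scale, QV = -10 * log10(error rate). A minimal sketch of that arithmetic, using the error rates quoted in this thread (the function names are illustrative, not part of any PacBio tool):

```python
import math


def error_rate_to_qv(errors: float, bases: float) -> float:
    """Phred-scaled quality value: QV = -10 * log10(errors / bases)."""
    return -10.0 * math.log10(errors / bases)


def qv_to_error_rate(qv: float) -> float:
    """Inverse conversion: per-base error rate implied by a QV."""
    return 10.0 ** (-qv / 10.0)


# Rates reported in the thread (1 per 4,000 to 6,000 bases) and the QV50 target.
for bases_per_error in (4_000, 6_000, 100_000):
    qv = error_rate_to_qv(1, bases_per_error)
    print(f"1 error in {bases_per_error:,} bases -> QV{qv:.1f}")
```

This prints roughly QV36.0, QV37.8 and QV50.0, so the observed rate sits between about QV36 and QV38.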
    The first thing to try, with the current data, is to circularize and trim the assemblies (genome and plasmids), upload them as a reference, and run resequencing / Quiver correction again. From this analysis you will be able to see whether Quiver makes any further corrections to the assembly. Generally Quiver converges after one round, but it is always useful to check by running further rounds until the number of corrections made is zero or the corrections simply oscillate. An oscillating correction would indicate that the data do not support one variant over the other, due to either data quality or sample heterogeneity.
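The stopping rule described above (repeat until no corrections are made, or until the corrections oscillate) can be sketched as a small loop. This is only an illustration: `polish_round` is a hypothetical callable standing in for one resequencing / Quiver round, assumed to return the polished sequence and the number of corrections made. It is not a real Quiver API, and the toy functions exist only to exercise the loop.

```python
from typing import Callable, Tuple


def polish_until_stable(
    reference: str,
    polish_round: Callable[[str], Tuple[str, int]],  # hypothetical wrapper around one polishing run
    max_rounds: int = 10,
) -> Tuple[str, str, int]:
    """Return (sequence, status, rounds_run).

    status is 'converged' (a round made zero corrections), 'oscillating'
    (a round reproduced an earlier sequence), or 'max_rounds'.
    """
    seen = {reference}
    current = reference
    for round_number in range(1, max_rounds + 1):
        polished, n_corrections = polish_round(current)
        if n_corrections == 0:
            return polished, "converged", round_number
        if polished in seen:
            # Corrections are flipping back and forth between variants.
            return polished, "oscillating", round_number
        seen.add(polished)
        current = polished
    return current, "max_rounds", max_rounds


# Toy stand-ins for a polishing round (illustration only).
def toy_converging(seq: str) -> Tuple[str, int]:
    fixed = seq.replace("N", "A")
    return fixed, sum(a != b for a, b in zip(seq, fixed))


def toy_flipping(seq: str) -> Tuple[str, int]:
    return seq[:-1] + ("A" if seq[-1] == "G" else "G"), 1


print(polish_until_stable("ACGNNT", toy_converging))
print(polish_until_stable("ACG", toy_flipping))
```

In practice each round would be a full resequencing job against the previous round's consensus, and the positions that oscillate are the ones worth inspecting by hand.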

    • #3
      This thread has some further discussion of this issue, which we've seen in fungal assemblies. If you have Illumina data from the strain, you have options for correcting the assembly.