  • Should I try hybrid assembly with my PacBio data?

    Hi all,

    I recently had the genome of a bacterial strain I am working with sequenced using both PacBio and Illumina paired-end sequencing.

    I have managed to assemble the Illumina data into ~200 contigs using Soap2. The PacBio data I got back came assembled into 22 contigs, which I was a little disappointed with, especially because other people in my lab have sequenced different strains of the same species and got their data back as one contig! The original idea was to map the Illumina reads to the PacBio assembly to look for errors.

    But anyway, now I am not sure what to do with the data I have. The longest four contigs of the PacBio data cover ~97% of my estimated 4.5 Mb genome size, but all the other contigs also map to the same species when looking at the BLASR output, although some have low coverage. Now I'm not sure what is "real", and I don't want to underestimate the genome size.

    I have read that you can use PacBio sequences to scaffold Illumina contigs, so I am wondering if I should try that. But I can't really find any helpful tutorials or resources on how to do this. I'm not sure which PacBio data I should use (I have the CCS.fastq, filtered subread fastq and longest subread fastq files), whether I need to do anything to the data before using it, which program to use, etc.

    Any help would be appreciated, even if it's just a link to a good resource.

    Thanks in advance!
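
The "~97% of 4.5 Mb" figure in the question can be reproduced with a few lines of Python. This is only a rough sketch: it assumes the PacBio contigs are in a plain FASTA file, the function names are made up for illustration, and it compares summed contig length to the estimated genome size, which is not the same as true coverage if contigs overlap.

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {contig_name: length} from the lines of a FASTA file."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def top_contig_fraction(contig_lengths: Iterable[int],
                        genome_size: int, k: int) -> float:
    """Summed length of the k longest contigs divided by the
    estimated genome size (an estimate, not a measured value)."""
    longest = sorted(contig_lengths, reverse=True)[:k]
    return sum(longest) / genome_size
```

With a hypothetical file name, something like `top_contig_fraction(fasta_lengths(open("pacbio_contigs.fasta")).values(), 4_500_000, 4)` would give the fraction for the four longest contigs. Listing the remaining contigs by length next to their coverage may help decide which ones look "real".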

  • #2
    Rather than try a complex hybrid approach, which is unlikely to be any more successful than the 22-contig PacBio assembly, I would try to diagnose and optimize the PacBio assembly. How do the preassembly statistics (yield, N50, number of bases) compare to the other assemblies in your lab? Was the subread N50 or the number of bases in the filtered data lower than for the assemblies that generated single contigs?
    With 22 contigs it is possible to run bridgemapper to order the contigs with the remaining PacBio reads, overlap the contigs using minimus2, and validate the result using resequencing. Think of it as manual finishing. I would then use the Illumina reads to check the final base accuracy.
    You mentioned contigs with lower coverage. Is it possible that the sample is not perfectly clonal and you are seeing a minor population that is breaking the assembly?
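
If the preassembly statistics are not at hand, the read count, number of bases and N50 of a subread file can be approximated with a short script. This is a minimal sketch, not what the PacBio software itself reports: it assumes a standard 4-line-per-record FASTQ (no wrapped sequence lines), and the function names are invented for illustration.

```python
from typing import Dict, Iterable, List


def read_lengths(fastq_lines: Iterable[str]) -> List[int]:
    """Collect sequence lengths from 4-line FASTQ records."""
    lengths = []
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the sequence is the second line of each record
            lengths.append(len(line.strip()))
    return lengths


def n50(lengths: Iterable[int]) -> int:
    """Length L such that reads of length >= L contain at least
    half of all bases. Returns 0 for an empty input."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarize(lengths: List[int]) -> Dict[str, int]:
    """Read count, total bases and N50 for a list of read lengths."""
    return {"reads": len(lengths), "bases": sum(lengths), "n50": n50(lengths)}
```

Running `summarize(read_lengths(open("filtered_subreads.fastq")))` (the file name is a placeholder) on your data and on a run that assembled into a single contig gives numbers that can be compared directly. Dividing the total bases by the estimated 4.5 Mb genome size also gives a rough raw coverage figure.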
