  • huan
    Member
    • Oct 2010
    • 56

    Is it possible to estimate genome size with Sequel data?

    We are doing a de novo assembly of a marine organism from whole-genome sequencing on the Sequel system. As we all know, DNA extraction from marine organisms is very difficult because of contamination and degradation. Is there any way to estimate the genome size, heterozygosity rate, or repeat content from Sequel data?
    happy
  • Markiyan
    Senior Member
    • Sep 2010
    • 126

    #2
    Use multipass PacBio reads for self error correction and k-mer counting.

    First, try selecting the multipass reads and using those for k-mer counting and self error correction.

    Make sure to remove any mitochondrial or symbiont reads before doing the k-mer counting (identify and complete the respective genome(s) first).

    Get some good-quality PCR-free Illumina 2x250 reads (or BGISEQ data, if it works in your hands) and use them to confirm the k-mer counting, self error correction, etc.

    Short reads are very helpful for getting the contaminant and symbiont genomes to a good draft stage and for filtering them out of the main dataset.
    Usually this approach has to be done iteratively, with an increasing amount of input data after each iteration.
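
As a rough illustration of the k-mer counting step, here is a minimal Python sketch, assuming the reads are already error-corrected and free of organelle and symbiont sequence. The function names `count_kmers` and `estimate_genome_size` and the `min_depth` cutoff are hypothetical, chosen for this example only. It uses the common approximation genome size ≈ (total k-mers) / (depth of the main histogram peak). For real data a dedicated k-mer counter such as Jellyfish would be far more practical than pure Python.

```python
from collections import Counter

# Translation table for reverse-complementing DNA.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_kmers(reads, k):
    """Count canonical k-mers (the lexicographic minimum of a k-mer and its
    reverse complement) across an iterable of read sequences."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip k-mers containing N or other ambiguity codes
            revcomp = kmer.translate(_COMPLEMENT)[::-1]
            counts[min(kmer, revcomp)] += 1
    return counts


def estimate_genome_size(counts, min_depth=2):
    """Estimate genome size as total k-mers / peak depth.

    min_depth is an assumed cutoff: k-mers seen fewer times are treated as
    sequencing errors and ignored. It must be tuned per dataset.
    """
    # Histogram: depth -> number of distinct k-mers observed at that depth.
    histogram = Counter(counts.values())
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth
```

The sketch assumes a single clean peak. In a heterozygous genome the histogram usually shows a second peak at about half the main depth, so the peak choice has to be checked by eye or with a model-fitting tool.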


    • luc
      Senior Member
      • Dec 2010
      • 469

      #3
      Markiyan has alluded to it already: raw PacBio data are not suitable for genome size estimates based on k-mer analyses. The error rates of the uncorrected raw data are too high.
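
To see why the error rate matters so much, a back-of-the-envelope calculation helps. The sketch below assumes errors are independent with a uniform per-base rate, which is a simplification, and the rates in the usage note are illustrative values rather than measurements from this thread. The function name `error_free_kmer_fraction` is hypothetical.

```python
def error_free_kmer_fraction(error_rate, k):
    """Probability that a k-mer contains no sequencing error, assuming
    independent errors at a uniform per-base rate (a simplification)."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (1.0 - error_rate) ** k
```

Under these assumptions, `error_free_kmer_fraction(0.13, 21)` comes out at roughly 5%, while `error_free_kmer_fraction(0.01, 21)` is about 81%. At the higher rate almost every 21-mer in the histogram is an error k-mer, so the coverage peak needed for a size estimate is swamped.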


      • rhall
        Senior Member
        • Aug 2012
        • 324

        #4
        While a k-mer analysis is going to be difficult with the raw PacBio data, it is possible to estimate the (effective) genome size from overlap statistics, either for the raw reads or the error-corrected preassembled reads, or by mapping the raw reads to the assembled contigs.
        Run an initial assembly using a small seed read length, then plot the preassembled read overlap histogram.
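
One simple way to read "overlap statistics" is sketched below in Python. This is an assumed interpretation, not necessarily the exact procedure meant above: each read's depth is approximated as one plus the overlapping bases from other reads divided by its own length, and the genome size is taken as total bases divided by the median depth. The input format (a dict of read lengths and a list of `(read_a, read_b, overlap_length)` tuples) and the function name are hypothetical, and would have to be produced from whatever overlapper the assembler uses.

```python
from statistics import median


def genome_size_from_overlaps(read_lengths, overlaps):
    """Estimate effective genome size as total bases / median read depth.

    read_lengths: dict mapping read name -> read length in bases.
    overlaps: iterable of (read_a, read_b, overlap_length) tuples, one per
              pairwise overlap (hypothetical format for this sketch).
    """
    if not read_lengths:
        raise ValueError("no reads given")
    # Bases of each read that are covered by overlaps with other reads.
    overlapped = {name: 0 for name in read_lengths}
    for read_a, read_b, overlap_length in overlaps:
        overlapped[read_a] += overlap_length
        overlapped[read_b] += overlap_length
    # Depth seen by each read: the read itself plus its overlapping partners.
    depths = [
        1 + overlapped[name] / length
        for name, length in read_lengths.items()
        if length > 0
    ]
    total_bases = sum(read_lengths.values())
    return total_bases / median(depths)
```

Using the median keeps repeat-derived reads, which collect far more overlaps than unique sequence, from inflating the depth. Collapsed repeats still make this an "effective" size rather than the true one.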



        • huan
          Member
          • Oct 2010
          • 56

          #5
          I really appreciate your help! I will give it a try!
          happy

