**Markiyan** · 11-22-2017, 05:04 AM

Use multipass pacbio reads for self error correction and Kmer counting.

First, try extracting the multipass reads and using only those for k-mer counting and self error correction.
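As a rough illustration of picking out multipass reads, here is a minimal sketch that groups subreads by ZMW using the standard PacBio subread name layout (`movie/holeNumber/qStart_qEnd`) and keeps the ZMWs with several subreads. The function name and the `min_passes` threshold are assumptions made for this example, and the subread count is only a proxy for the number of full passes, since the first and last subreads of a ZMW are usually partial.

```python
from collections import defaultdict


def multipass_zmws(subread_names, min_passes=3):
    """Group subread names by ZMW and keep ZMWs with >= min_passes subreads.

    subread_names: iterable of names like "movie/holeNumber/qStart_qEnd".
    min_passes: hypothetical cutoff; tune it to your insert size and read lengths.
    """
    by_zmw = defaultdict(list)
    for name in subread_names:
        # The first two fields identify the ZMW; the third is the subread span.
        movie, hole = name.split("/")[:2]
        by_zmw[(movie, hole)].append(name)
    return {zmw: names for zmw, names in by_zmw.items() if len(names) >= min_passes}


if __name__ == "__main__":
    names = ["m1/10/0_500", "m1/10/550_1050", "m1/10/1100_1600", "m1/11/0_900"]
    for zmw, subreads in multipass_zmws(names).items():
        print(zmw, len(subreads))
```

The names would normally come from the headers of your subread FASTA/FASTQ or BAM file; reading those is left out to keep the sketch dependency-free.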

Make sure to remove any mitochondrial/symbiont reads before doing the k-mer counting (identify and complete the respective genome(s) first).

Get some good-quality PCR-free Illumina 2x250 reads (or BGISEQ data, if it works in your hands) and use them to confirm the k-mer counting, self error correction, etc.
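For the k-mer side of this, a minimal sketch of the usual back-of-the-envelope estimate: genome size is roughly the total number of k-mer occurrences divided by the depth of the main histogram peak. The function name and the `min_depth` cutoff (used to skip the low-multiplicity error k-mers) are assumptions for illustration, and for a heterozygous genome the tallest peak may be the heterozygous one, so check the histogram by eye before trusting the number.

```python
def estimate_genome_size(hist, min_depth=5):
    """Estimate genome size from a k-mer multiplicity histogram.

    hist: dict mapping multiplicity -> number of distinct k-mers seen that often.
    min_depth: hypothetical cutoff below which k-mers are treated as errors.
    Returns (estimated_size, peak_depth).
    """
    usable = {depth: n for depth, n in hist.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no histogram bins at or above min_depth")
    # Depth of the tallest remaining bin, taken as the homozygous coverage peak.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences contributed by the non-error bins.
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth, peak_depth


if __name__ == "__main__":
    toy = {1: 1_000_000, 2: 50_000, 18: 100, 19: 300, 20: 500, 21: 300, 22: 100}
    size, peak = estimate_genome_size(toy)
    print(f"peak depth {peak}, estimated size {size:.0f}")
```

The histogram is the two-column multiplicity/count table that most k-mer counters can write out.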

Short reads are very helpful for getting the contaminant/symbiont genomes to a good draft stage and for filtering them out of the main dataset.
Usually this approach has to be applied iteratively, with an increasing amount of input data after each iteration.

**luc** · 11-22-2017, 04:29 PM

Markiyan has alluded to it already: PacBio data are not suitable for genome size estimates based on k-mer analyses. The error rates of the uncorrected raw data are too high.

**rhall** · 11-27-2017, 09:55 AM

While a k-mer analysis is going to be difficult with the raw PacBio data, it is possible to estimate the (effective) genome size from overlap statistics, either from the raw reads, from the error-corrected preassembled reads, or by mapping the raw reads to the assembled contigs.
Run an initial assembly using a small seed read length, then plot the preassembled read overlap histogram.

Tutorial — FALCON 0.5 documentation

http://pb-falcon.readthedocs.io/en/latest/tutorial.html#assembly-graph-and-pread-overlaps

http://pb-falcon.readthedocs.io/en/latest/_downloads/Kingan_DiploidGenome_ECUGM2017_BFX.pdf
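For the mapping variant mentioned above, a minimal sketch, assuming you already have per-base depth values from mapping the raw reads back to the contigs: the effective genome size is roughly the total number of read bases divided by the modal depth. The function and argument names are hypothetical, and this is not the FALCON overlap histogram itself, just the same idea expressed on mapping depth. The mode is used rather than the mean so that collapsed repeats, which sit at higher depth, do not drag the estimate down.

```python
from collections import Counter


def genome_size_from_depth(total_read_bases, per_base_depth):
    """Estimate effective genome size as total read bases / modal mapping depth.

    total_read_bases: sum of the lengths of all raw reads.
    per_base_depth: iterable of integer depths, one per contig position.
    Returns (estimated_size, modal_depth).
    """
    # Ignore uncovered positions so they cannot become the mode.
    depth_counts = Counter(d for d in per_base_depth if d > 0)
    if not depth_counts:
        raise ValueError("no covered positions in per_base_depth")
    modal_depth = depth_counts.most_common(1)[0][0]
    return total_read_bases / modal_depth, modal_depth


if __name__ == "__main__":
    depths = [30] * 6 + [31] * 2 + [60] * 2 + [0]
    size, mode = genome_size_from_depth(3000, depths)
    print(f"modal depth {mode}, effective genome size {size:.0f}")
```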

**huan** · 11-28-2017, 06:37 PM

I really appreciate your help! I will give it a try!


Is it possible to evaluate genome size with sequel data?
