SEQanswers

01-20-2017, 08:02 PM   #1
GAFA
Junior Member
 
Location: Riyadh, KSA

Join Date: Jan 2017
Posts: 3
K-mer genome size estimate much lower than the actual size

Hi everyone,

I'm assembling a eukaryotic genome (2n=22) for the first time, working with a non-model plant species, and I could use some insight. My data consist of reads from a full lane of Illumina HiSeq 2x151 with an insert size of ~350 bp. According to several published reports, the haploid genome size of this plant is estimated at about 400 Mb. However, k-mer counting programs such as Jellyfish predict a genome size of less than half that, at about 210 Mb.

Does anyone have any idea why the reported nuclear genome size is so much larger than the k-mer estimate?

Any related thoughts or comments would be appreciated!
01-21-2017, 09:38 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

It would help if you could post the kmer frequency histogram (both as text and as an image). Kmer-counters can make mistakes on genome size estimation when the data is noisy, if the peaks are too broad, if they guess the ploidy wrong, etc. I'd also be interested in seeing what BBMap's KmerCountExact reports as the estimated genome size:

Code:
kmercountexact.sh in=reads.fq khist=khist.txt peaks=peaks.txt
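
For context, here is a minimal Python sketch of the usual back-of-the-envelope estimate from a k-mer histogram: drop the low-multiplicity error k-mers, take the depth of the main peak, and divide the total number of remaining k-mers by that depth. It assumes a two-column text histogram (multiplicity, then number of distinct k-mers), which is the layout `jellyfish histo` writes. The function names and the toy histogram are made up for illustration, and this is not necessarily how Jellyfish-based scripts or KmerCountExact compute their own estimates.

```python
def parse_histo(text):
    """Parse 'multiplicity count' lines into a list of (multiplicity, count)."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            rows.append((int(parts[0]), int(parts[1])))
    return rows


def estimate_genome_size(rows):
    """Return (peak_depth, estimated_size) from a k-mer histogram.

    Assumption: the histogram starts with a decreasing error tail, so the
    first point where counts rise again marks the end of the error k-mers.
    """
    trough = 0
    for i in range(1, len(rows)):
        if rows[i][1] > rows[i - 1][1]:
            trough = i - 1
            break
    solid = rows[trough:]  # everything from the trough onward
    peak_depth = max(solid, key=lambda r: r[1])[0]
    total_kmers = sum(mult * count for mult, count in solid)
    return peak_depth, total_kmers / peak_depth


# Toy histogram (invented numbers, not real data).
TOY_HISTO = """\
1 5000
2 800
3 100
4 200
5 600
6 1000
7 600
8 200
9 50
"""

peak, size = estimate_genome_size(parse_histo(TOY_HISTO))
print(peak, size)
```

Because the estimate divides by the chosen peak depth, picking a peak at double or half the true single-copy depth halves or doubles the reported size, which is one way a wrong ploidy guess shows up.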

Last edited by Brian Bushnell; 01-21-2017 at 09:42 AM.
01-21-2017, 08:00 PM   #3
GAFA
Junior Member
 
Location: Riyadh, KSA

Join Date: Jan 2017
Posts: 3

Dear Brian,
I estimated the k-mer counts using Jellyfish.
Below are links to the histogram image and text.

https://www.dropbox.com/s/ar60gcuiqs...histo.pdf?dl=0
https://www.dropbox.com/s/wse5iu02bi...histo.txt?dl=0

I will try BBMap and let you know.

Thanks
01-21-2017, 11:29 PM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

That looks like a haploid to me. There is no evidence from the kmer frequency histogram that the organism is a diploid. There is exactly one prominent peak, and it is very clear. Is this sample highly inbred, or wild-type?
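
To make "one prominent peak" concrete, here is a rough Python sketch that lists the local maxima in a k-mer histogram. A heterozygous diploid typically shows two peaks, with one at about half the depth of the other, while a haploid or highly homozygous sample shows one. The function name, the `min_fraction` threshold, and both toy histograms are assumptions for illustration only; a real histogram is noisy and would usually need smoothing before this kind of check.

```python
def find_peaks(rows, min_fraction=0.1):
    """Return multiplicities of local maxima in a (multiplicity, count) list.

    The first and last rows are never reported, which also skips the
    error k-mer spike at multiplicity 1. Maxima lower than min_fraction
    of the tallest maximum are ignored as noise.
    """
    maxima = [
        rows[i]
        for i in range(1, len(rows) - 1)
        if rows[i][1] > rows[i - 1][1] and rows[i][1] >= rows[i + 1][1]
    ]
    if not maxima:
        return []
    tallest = max(count for _, count in maxima)
    return [mult for mult, count in maxima if count >= min_fraction * tallest]


# Invented histograms: one with a single peak, one with a half-depth peak.
single_peak = [(1, 5000), (2, 800), (3, 100), (4, 200), (5, 600),
               (6, 1000), (7, 600), (8, 200), (9, 50)]
two_peaks = [(1, 5000), (2, 500), (3, 400), (4, 700), (5, 400),
             (6, 300), (7, 600), (8, 1000), (9, 600), (10, 100)]

print(find_peaks(single_peak))
print(find_peaks(two_peaks))
```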

Last edited by Brian Bushnell; 01-21-2017 at 11:33 PM.
All times are GMT -8.