SEQanswers

Old 01-18-2015, 10:53 PM   #1
qihualiang
Junior Member
 
Location: United States

Join Date: Jan 2015
Posts: 8
Genome Survey using KmerFreq

Hi, everyone. I am doing a genome survey with KmerFreq for two species of wild goat, and I have run into a problem.

In the data statistics, Goat A has more bases than Goat B. But in the KmerFreq log file, the kmer count for Goat A is about half that of Goat B. The log file shows that the discrepancy comes from the number of reads processed: Goat A has about half as many as Goat B.

I am confused. How can Goat A have more bases and reads, yet have far fewer reads processed by KmerFreq?

Thank you.
Old 01-19-2015, 10:25 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

It might help if you gave some actual numbers - read lengths, platform, read count, kmer counts, and ideally kmer frequency histograms.
Old 01-20-2015, 12:34 AM   #3
qihualiang
Junior Member
 
Location: United States

Join Date: Jan 2015
Posts: 8

Quote:
Originally Posted by Brian Bushnell
It might help if you gave some actual numbers - read lengths, platform, read count, kmer counts, and ideally kmer frequency histograms.

The sequencing platform is HiSeq 2000 or 2500; I am not sure which at the moment. I chose the library with the largest number of bases to report these numbers.

For goat A: in the data statistics, read length = 101 bp, read count = 393Mb. In the KmerFreq log file, processed reads = 180Mb (90Mb left and 90Mb right).
For goat B: in the data statistics, read length = 101 bp, read count = 366.7Mb. In the KmerFreq log file, processed reads = 360Mb (180Mb left and 180Mb right).


I also ran FastQC on the library files for goat A.
Both the left and right fq files: warnings for Per base sequence content and Per sequence GC content; failure for Per base N content (only the last 2 positions of the reads have 40% N).
Right fq file only: warning for Per tile sequence quality; failure for Kmer Content, as follows:
Sequence   Count   PValue   Obs/Exp Max   Max Obs/Exp Position
TCGGAAT    55620   0.0      5.3323565     94


I suspect that something goes wrong when KmerFreq loads reads from the fq files, which would explain why fewer reads are processed for Goat A. But I don't know what is wrong with the Goat A fq reads.
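
One way to test that suspicion independently of KmerFreq is to count the records in each fq file directly. The following is only a minimal sketch, assuming uncompressed FASTQ with exactly four lines per record; the function name `count_fastq_records` is hypothetical and not part of KmerFreq or any other tool mentioned here.

```python
def count_fastq_records(path):
    """Count records in an uncompressed 4-line FASTQ file, checking each one.

    Assumption: every record is exactly four lines (header, sequence,
    '+' separator, quality). Multi-line FASTQ is not handled.
    """
    records = 0
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # clean end of file
            if not header.strip():
                continue  # tolerate stray blank lines between records
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\r\n")
            # A truncated or corrupted file usually fails one of these checks.
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"{path}: malformed record {records + 1}")
            if len(seq) != len(qual):
                raise ValueError(
                    f"{path}: sequence/quality length mismatch "
                    f"in record {records + 1}"
                )
            records += 1
    return records
```

Running it on the left and right files of a library (for example `goatA_1.fq` and `goatA_2.fq`, names taken from later in this thread) should give two equal counts. A `ValueError`, or counts that differ between the pair, would point to a truncated or corrupted file rather than to KmerFreq itself.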
Old 01-20-2015, 10:26 AM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

1) When you say, for example, "393Mb", do you mean 393 million reads or 393 megabases?
2) In both cases, the amount of data processed appears to be less than the total amount of data, which is strange.
3) Before doing any sort of kmer analysis, you should adapter-trim your reads. I'm guessing that you have not, based on FastQC reporting an overrepresented kmer, but I'm not sure.
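
To make points 1 and 2 concrete, here is a small arithmetic sketch using only the figures quoted earlier in the thread (101 bp reads; 393 and 366.7 total; 180 and 360 processed). The helper names are hypothetical. It shows how far apart the two readings of "Mb" are, and that the processed fraction is the same under either reading as long as both figures use the same unit.

```python
READ_LENGTH = 101  # bp, as reported in the thread


def gigabases_if_reads(millions_of_reads, read_length=READ_LENGTH):
    """Total gigabases if the figure means millions of reads."""
    return millions_of_reads * 1e6 * read_length / 1e9


def reads_if_megabases(megabases, read_length=READ_LENGTH):
    """Implied number of reads if the figure means megabases."""
    return megabases * 1e6 / read_length


def processed_fraction(processed, total):
    """Fraction of the data processed; valid when both share one unit."""
    return processed / total


# (label, total figure, processed figure), all as quoted in the thread
for name, total, processed in [("goat A", 393.0, 180.0),
                               ("goat B", 366.7, 360.0)]:
    print(
        f"{name}: as reads -> {gigabases_if_reads(total):.1f} Gb; "
        f"as megabases -> {reads_if_megabases(total) / 1e6:.2f} M reads; "
        f"processed {processed_fraction(processed, total):.1%}"
    )
```

Under either reading, roughly 46% of the goat A data and 98% of the goat B data were processed, so both libraries lose something but goat A loses far more.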

Perhaps you could install this and run khist, then post both the output to the console and the histogram files, since KmerFreq seems to be dropping some of the reads:

khist.sh in1=goatA_1.fq in2=goatA_2.fq hist=goatA_hist.txt

khist.sh in1=goatB_1.fq in2=goatB_2.fq hist=goatB_hist.txt
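
Once the histogram files exist, a rough first look could be done as in the sketch below. It rests on assumptions not confirmed in this thread: that each data line starts with two whitespace-separated integers (kmer depth, then count) and that header lines start with `#`. The function names are hypothetical. The genome size estimate is the common "total kmers divided by peak depth" heuristic, not the method of any particular tool.

```python
def read_histogram(lines):
    """Parse (depth, count) pairs, skipping blank lines and '#' headers.

    Assumption: the first two columns are integer depth and count.
    """
    hist = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        hist.append((int(fields[0]), int(fields[1])))
    return hist


def main_peak(hist):
    """Return (valley_depth, peak_depth).

    The valley is the last depth before counts first start rising again
    (the end of the low-depth error kmers); the peak is the depth with
    the highest count from the valley onward.
    """
    for i in range(1, len(hist)):
        if hist[i][1] > hist[i - 1][1]:
            valley_idx = i - 1
            break
    else:
        raise ValueError("no peak found: counts never increase")
    peak_depth = max(hist[valley_idx:], key=lambda dc: dc[1])[0]
    return hist[valley_idx][0], peak_depth


def estimate_genome_size(hist):
    """Heuristic: kmers above the error valley divided by the peak depth."""
    valley, peak = main_peak(hist)
    total_kmers = sum(depth * count for depth, count in hist if depth > valley)
    return total_kmers / peak
```

For example, `estimate_genome_size(read_histogram(open("goatA_hist.txt")))` would give a first-pass figure to compare between the two goats. If the two peak depths differ by about a factor of two, that would be consistent with half of the goat A reads being dropped.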
Tags
genome survey, kmerfreq
