SEQanswers

Old 07-04-2017, 02:12 AM   #1
Elakkiya
Junior Member
 
Location: India

Join Date: Jul 2017
Posts: 4
What do Unique and Distinct K-mers mean?

Hello!

I am new to bioinformatics. I have generated the k-mers and unique k-mers from the reads. What does "distinct k-mers" mean, and how does it differ from "unique k-mers"? Can anyone help clarify with an example, please?
Old 07-05-2017, 11:46 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,695

Consider "AAAAAA". When counting 3-mers, there are 4 of them. But there is only one unique 3-mer: "AAA".
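
A minimal Python sketch of this example, assuming a simple sliding window over the sequence. The helper name `kmers` is illustrative only and not taken from any particular tool. It compares the number of windows with the number of different strings among them ("unique" in the sense used in this post).

```python
def kmers(seq, k):
    """Return every length-k substring of seq, one per window position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

all_3mers = kmers("AAAAAA", 3)

# A sequence of length 6 has 6 - 3 + 1 = 4 windows of length 3.
total_count = len(all_3mers)

# Collapsing identical strings leaves only "AAA".
different_3mers = set(all_3mers)

print(total_count)           # 4
print(len(different_3mers))  # 1
```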
Old 07-05-2017, 09:01 PM   #3
Elakkiya
Junior Member
 
Location: India

Join Date: Jul 2017
Posts: 4
distinct k-mer

Thanks, Brian!

But what does "distinct k-mer" mean? How does it differ from unique k-mers?



Thanks,
Elakkiya
Old 07-05-2017, 09:52 PM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,695

Quote:
Originally Posted by Elakkiya
Thanks, Brian!

But what does "distinct k-mer" mean? How does it differ from unique k-mers?

Thanks,
Elakkiya
I don't use that term because I find it confusing. But I assume the authors mean, by "distinct kmers", the total number of counted kmers, whether unique or not. In my example, that would mean distinct kmers are 4 and unique kmers are 1. I discourage using the term "distinct kmers" since "distinct" is essentially synonymous with "unique", just less precise in this case. I suggest you call unique kmers "unique kmers". And I suggest you call the total number of kmers counted (whether unique or not) "total kmers" or "counted kmers" or "total kmers counted". But never call non-unique kmers "distinct kmers", since that's misleading. If two kmers are identical, nothing distinguishes them. Therefore, neither is unique from the other. And, by definition, they cannot be distinct while being identical. I'm not sure what software you are using that defines "unique kmers" and "distinct kmers" differently, but that definition is misleading and not useful.

I think that probably the authors think of what they call "distinct kmers" as "total kmers counted" and "unique kmers" as "unique kmers". But I suggest you contact them and inquire.

Last edited by Brian Bushnell; 07-05-2017 at 10:03 PM.
Old 07-05-2017, 10:47 PM   #5
Elakkiya
Junior Member
 
Location: India

Join Date: Jul 2017
Posts: 4

Thanks for the clarification, Brian!

I have contacted the author about it.

Their table lists total k-mers, unique k-mers, and distinct k-mers, and seeing all three confused me. Let us wait for the authors' reply.


Thanks,
Elakkiya
Old 07-05-2017, 11:32 PM   #6
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,695

Quote:
Originally Posted by Elakkiya
Thanks for the clarification, Brian!

I have contacted the author about it.

Their table lists total k-mers, unique k-mers, and distinct k-mers, and seeing all three confused me. Let us wait for the authors' reply.


Thanks,
Elakkiya
I can only think of two kmer counts... total, and unique. So, it seems like they may have a new category that I have not heard of, or there might be a misunderstanding. Please post the results of your investigation!
Old 07-07-2017, 03:50 AM   #8
Elakkiya
Junior Member
 
Location: India

Join Date: Jul 2017
Posts: 4

Hi Brian

The author replied:
"Distinct k-mers should be count of k-mers that occur at least once in reads/data".

k-mers: AAA, AAA, CCA, CCC, CCC, GGG, GGG, GGG, TTT
total k-mers: 9x
unique k-mers: 2x (CCA, TTT)
distinct k-mers: 5x (AAA, CCA, CCC, GGG, TTT)


Thanks,
Elakkiya
Old 07-07-2017, 09:49 AM   #9
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,695

Oh, I see. I normally use the term "unique kmers" where he uses "distinct kmers", and "singleton kmers" or "depth-1 kmers" where he uses "unique kmers".
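
To make the two vocabularies concrete, here is a minimal Python sketch, assuming the nine k-mers from post #8 as input. The variable names are illustrative only; the comments map each count to both naming conventions discussed in this thread.

```python
from collections import Counter

# The nine k-mers listed in post #8.
observed = ["AAA", "AAA", "CCA", "CCC", "CCC", "GGG", "GGG", "GGG", "TTT"]

# Map each different k-mer to the number of times it occurs.
counts = Counter(observed)

# "total k-mers": every occurrence is counted.
total = sum(counts.values())

# Author's "distinct k-mers" (called "unique kmers" in post #9):
# k-mers that occur at least once.
distinct = sorted(counts)

# Author's "unique k-mers" (called "singleton" or "depth-1 kmers" in post #9):
# k-mers that occur exactly once.
singletons = sorted(kmer for kmer, n in counts.items() if n == 1)

print(total)       # 9
print(distinct)    # ['AAA', 'CCA', 'CCC', 'GGG', 'TTT']
print(singletons)  # ['CCA', 'TTT']
```

Whichever names a tool uses, checking which of these three quantities it reports removes the ambiguity.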
All times are GMT -8.