 07-04-2017, 02:12 AM #1 Elakkiya Junior Member   Location: India Join Date: Jul 2017 Posts: 4 What do Unique and Distinct K-mers mean? Hello! I am new to bioinformatics. I have generated the k-mers and unique k-mers from the reads. What does "distinct k-mers" mean, and how does it differ from unique k-mers? Can anyone help clarify with an example, please?
 07-05-2017, 11:46 AM #2 Brian Bushnell Super Moderator   Location: Walnut Creek, CA Join Date: Jan 2014 Posts: 2,695 Consider "AAAAAA". When counting 3-mers, there are 4 of them. But there is only one unique 3-mer: "AAA".
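A minimal Python sketch of that count, in case it helps. The helper name `kmers` is made up for illustration and is not taken from any tool mentioned in this thread.

```python
def kmers(seq, k):
    """Return every k-mer in seq, in order, including repeats."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

all_3mers = kmers("AAAAAA", 3)

print(len(all_3mers))       # total 3-mers counted: 4
print(len(set(all_3mers)))  # different 3-mers among them: 1 ("AAA")
```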
 07-05-2017, 09:01 PM #3 Elakkiya Junior Member   Location: India Join Date: Jul 2017 Posts: 4 distinct k-mer Thanks Brian! But what does "distinct k-mer" mean? How does it differ from unique k-mers? Thanks, Elakkiya
 Originally Posted by Elakkiya Thanks Brian! But what does "distinct k-mer" mean? How does it differ from unique k-mers? Thanks, Elakkiya
I don't use that term because I find it confusing. But I assume the authors mean, by "distinct kmers", the total number of counted kmers, whether unique or not. In my example, that would mean distinct kmers are 4 and unique kmers are 1. I discourage using the term "distinct kmers" since "distinct" is essentially synonymous with "unique", just less precise in this case. I suggest you call unique kmers "unique kmers". And I suggest you call the total number of kmers counted (whether unique or not) "total kmers" or "counted kmers" or "total kmers counted". But never call non-unique kmers "distinct kmers", since that's misleading. If two kmers are identical, nothing distinguishes them. Therefore, neither is unique from the other. And, by definition, they cannot be distinct while being identical. I'm not sure what software you are using that defines "unique kmers" and "distinct kmers" differently, but that definition is misleading and not useful.

I think that probably the authors think of what they call "distinct kmers" as "total kmers counted" and "unique kmers" as "unique kmers". But I suggest you contact them and inquire.

 07-05-2017, 10:47 PM #5 Elakkiya Junior Member   Location: India Join Date: Jul 2017 Posts: 4 Thanks for the clarification, Brian! I have contacted the author for clarity. Their table lists total k-mers, unique k-mers, and distinct k-mers, which is what confused me. Let us wait for the reply from the authors. Thanks, Elakkiya
 Originally Posted by Elakkiya Thanks for the clarification, Brian! I have contacted the author for clarity. Their table lists total k-mers, unique k-mers, and distinct k-mers, which is what confused me. Let us wait for the reply from the authors. Thanks, Elakkiya
I can only think of two kmer counts... total, and unique. So, it seems like they may have a new category that I have not heard of, or there might be a misunderstanding. Please post the results of your investigation!

 07-07-2017, 03:50 AM #8 Elakkiya Junior Member   Location: India Join Date: Jul 2017 Posts: 4 Hi Brian, the author replied: "Distinct k-mers should be count of k-mers that occur at least once in reads/data".
k-mers: AAA, AAA, CCA, CCC, CCC, GGG, GGG, GGG, TTT
total k-mers: 9x
unique k-mers: 2x (CCA, TTT)
distinct k-mers: 5x (AAA, CCA, CCC, GGG, TTT)
Thanks, Elakkiya
 07-07-2017, 09:49 AM #9 Brian Bushnell Super Moderator   Location: Walnut Creek, CA Join Date: Jan 2014 Posts: 2,695 Oh, I see. I normally use the term "unique kmers" where he uses "distinct kmers", and "singleton kmers" or "depth-1 kmers" where he uses "unique kmers".
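For anyone landing here later, the three counts from the author's example can be sketched in Python as below. This is only an illustration using the standard library; the variable names are made up, and the comments give both the author's terms and the alternative terms from this thread.

```python
from collections import Counter

# The k-mers from the author's example, with repeats.
kmer_list = ["AAA", "AAA", "CCA", "CCC", "CCC", "GGG", "GGG", "GGG", "TTT"]
counts = Counter(kmer_list)

# Author: "total k-mers" (every occurrence counted).
total = sum(counts.values())

# Author: "distinct k-mers"; called "unique kmers" elsewhere in this thread.
distinct = sorted(counts)

# Author: "unique k-mers"; called "singleton" or "depth-1" kmers elsewhere in this thread.
singletons = sorted(kmer for kmer, depth in counts.items() if depth == 1)

print(total)       # 9
print(distinct)    # ['AAA', 'CCA', 'CCC', 'GGG', 'TTT']
print(singletons)  # ['CCA', 'TTT']
```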