Hi,
While choosing the hash length k, I found a tip in the Velvet manual.
It gives the relation between k-mer coverage C(k) and standard (nucleotide-wise) coverage C as C(k) = C*(L-k+1)/L, where k is the hash length and L is the read length. It recommends that C(k) be above 10 to start getting decent results.
As a test, I wanted to use one lane of SOLEXA data with a read length of 101 and a total read count of 69,084,522, which corresponds to a standard (nucleotide-wise) coverage C of 2.46.
With C = 2.46 and L = 101, I tried to find the hash length k that gives C(k) = 10. However, the calculated k is about -308, a negative number, which makes no sense as a hash length.
I also noticed that if I increase C, I get a reasonable k; e.g., when C is 14.23, k comes out at about 31.
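For reference, the calculation behind these numbers can be sketched as follows. This is only a minimal Python sketch of the manual's formula rearranged for k, i.e. k = L + 1 - C(k)*L/C; the function names are my own and are not part of Velvet.

```python
def kmer_coverage(c, read_len, k):
    """k-mer coverage C(k) = C * (L - k + 1) / L, as given in the Velvet manual."""
    return c * (read_len - k + 1) / read_len


def hash_length_for_target(c, read_len, target_ck):
    """Solve C(k) = C * (L - k + 1) / L for k (hypothetical helper name)."""
    return read_len + 1 - target_ck * read_len / c


if __name__ == "__main__":
    # One lane: C = 2.46, L = 101, target C(k) = 10
    print(round(hash_length_for_target(2.46, 101, 10), 1))   # -308.6
    # Higher nucleotide coverage: C = 14.23
    print(round(hash_length_for_target(14.23, 101, 10), 1))  # 31.0
```

As far as I can tell, the factor (L-k+1)/L is at most 1 for any k >= 1, so C(k) can never exceed C. That would explain why a target of 10 gives a negative k when C is only 2.46.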
Does this mean that I should increase C to get a reasonable hash length k?
To do so, I think I would have to pool multiple lanes.
However, that may cause memory problems.
Is my approach correct, or does anybody have a different idea?
Please let me know. Thanks in advance.
Won-Chul.