08-14-2013, 12:20 PM
Thanks again for your help, GenoMax.

My sequence length is only 50 bases and the quality is very good.

From what I read on the linked site, it seems that I have 6 k-mers that are 50-fold enriched throughout the length of my sequence. But what does this mean in terms of sample quality?

If I have 25-30 million reads and there is a 50-fold enrichment of these k-mers (most likely from the adaptor, I would guess), then how many sequences does that affect? For example, if 100,000 sequences had adaptor sequence at various positions other than the end of the read, what would the fold-enrichment be? 100,000 affected sequences may be enough to make the fold-enrichment high, but they are only a small percentage of the total. On the other hand, if I had a much smaller number of total sequences, then the fold-enrichment might be a problem. So I guess I want to know how to relate fold-enrichment to the total number of sequences, in order to tell whether the fold-enrichment is a real problem or just comes from an insignificant part of the total.
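
To make the question concrete, here is a rough back-of-the-envelope sketch of what I mean. It is a simplified model, not necessarily how the QC tool computes its number: it assumes uniform base composition (each base has probability 1/4), a k-mer length of 5 (an assumption, substitute whatever the tool actually uses), and that each affected read contributes exactly one extra copy of the k-mer. The function names are made up for illustration.

```python
def expected_kmer_count(total_reads, read_length, k):
    """Expected occurrences of one specific k-mer across all reads,
    assuming uniform base composition (p = 1/4 per base)."""
    windows_per_read = read_length - k + 1  # k-mer start positions per read
    return total_reads * windows_per_read * 0.25 ** k


def fold_enrichment(total_reads, affected_reads, read_length, k):
    """Observed/expected ratio if `affected_reads` reads each carry
    one extra copy of the k-mer on top of the background expectation."""
    expected = expected_kmer_count(total_reads, read_length, k)
    observed = expected + affected_reads
    return observed / expected


if __name__ == "__main__":
    # Numbers from my case: 25 million reads, 50 bases, 100,000 affected reads.
    # k = 5 is an assumption.
    print(fold_enrichment(25_000_000, 100_000, 50, 5))
    # Same affected fraction (0.4%) in a much smaller library.
    print(fold_enrichment(250_000, 1_000, 50, 5))
```

If this model is anywhere near right, the ratio depends only on the fraction of reads affected, not on the absolute number of reads, so the same fold-enrichment would mean the same proportion of the library whether I had 250 thousand or 25 million reads.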
