Dear All,
I have deep-sequencing data for bioinformatic processing that show some remarkable errors I cannot explain.
What I am observing is small in absolute numbers:
about 1,000 out of a million short reads contain 'N's, so this is a tiny fraction of the sequences.
As I understand it, the error rate increases towards the end of the reads, and while I can see this in my data too, it does not bother me.
However, I am puzzled by the fact that 80% of those misreads carry an 'N' specifically at position 13.
Does anybody have a hint that could help to explain this?
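In case anyone wants to check the same thing in their own data, here is a minimal sketch of how such a per-position tally of 'N's could be made. It assumes the reads are in a plain FASTQ file with unwrapped 4-line records; the function names and the toy reads are made up for illustration and are not the pipeline actually used on this data.

```python
from collections import Counter


def n_position_counts(reads):
    """Count, per 1-based position, how many reads carry an 'N' there."""
    counts = Counter()
    for seq in reads:
        for pos, base in enumerate(seq.upper(), start=1):
            if base == "N":
                counts[pos] += 1
    return counts


def read_fastq_sequences(path):
    """Yield the sequence line of each record.

    Assumption: every record occupies exactly 4 lines (no wrapped sequences).
    """
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each 4-line record
                yield line.strip()


# Toy reads, invented for illustration only
example_reads = [
    "ACGTACGTACGTNCGT",  # N at position 13
    "ACGTACGTACGTNCGA",  # N at position 13
    "ACGTACGTACGTACGN",  # N at the last position (16)
    "ACGTACGTACGTACGT",  # no N
]
example_counts = n_position_counts(example_reads)
```

On a real file, something like `n_position_counts(read_fastq_sequences("reads.fastq"))` (the file name is a placeholder) followed by `most_common()` would list the positions where 'N's pile up.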
Kind regards,
Jochen