When I checked my Solexa sequencing reads, I found that some of them look like this:
NNNNNNNNAGGNNNNNGGAGNGNNGNNNCAGNGNTGNNNNNNNNNNNNNANNNNNNGNNNNNNNTGGNGGNNNNNNNN
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
First, there are runs of "N" in the middle of the sequence as well as at the end.
Second, all of the base calls are low quality (I assume that "%" is the lowest quality score in this format; is that right?).
Third, in some other reads, I see a poly-A stretch at the end of the sequence.
How should I deal with reads that have these features? Should I simply discard them, or trim them? If trimming is recommended in some cases, what software is suitable for Solexa reads?
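To make the question concrete, here is a minimal sketch of how such reads could be flagged. It assumes the quality line uses the Phred+33 (Sanger) encoding, under which "%" (ASCII 37) decodes to Q4; in the older Solexa offset-64 encoding, "%" would fall below the valid range, so the assumption may not match every pipeline. The function names and the thresholds (`max_n_fraction`, `min_mean_quality`) are hypothetical, chosen only for illustration.

```python
def phred33_scores(quality: str) -> list[int]:
    """Decode a FASTQ quality string, ASSUMING Phred+33 (Sanger) encoding."""
    return [ord(ch) - 33 for ch in quality]


def n_fraction(sequence: str) -> float:
    """Fraction of positions called as N (no base call)."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)


def keep_read(sequence: str, quality: str,
              max_n_fraction: float = 0.1,
              min_mean_quality: float = 20.0) -> bool:
    """Return True if the read passes both illustrative filters.

    The default thresholds are arbitrary examples, not recommended values.
    """
    scores = phred33_scores(quality)
    mean_quality = sum(scores) / len(scores) if scores else 0.0
    return (n_fraction(sequence) <= max_n_fraction
            and mean_quality >= min_mean_quality)


if __name__ == "__main__":
    # A shortened piece of the read shown above.
    seq = "NNNNNNNNAGGNNNNNGGAGNGNN"
    qual = "%" * len(seq)
    print(phred33_scores("%"))   # quality of a single "%" under Phred+33
    print(n_fraction(seq))       # share of N calls in this fragment
    print(keep_read(seq, qual))  # whether the read survives the filters
```

Under these assumptions, a read like the one above fails both checks, so it would be discarded rather than trimmed; reads with only a low-quality or poly-A tail are the cases where trimming might be worth considering instead.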