For my Illumina data, FastQC shows N's at positions 13, 14, and 15 in 101 bp reads. Cropping the first 15 bases with Trimmomatic solves the problem, but I lose a lot of data. If I retain the N's, what problems would they cause during alignment (BWA + Stampy) and variant calling (UnifiedGenotyper), and how can I handle them?
If anybody has faced a similar problem, how did you handle it? Similar questions have been asked on other forums, but none were answered, and I could not find a resource on how variant-calling programs handle N's. Do they ignore them, or treat them as variants with low confidence scores?
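To decide between cropping and keeping the N's, it may help to first measure what fraction of reads actually carry an N at each position. Below is a minimal Python sketch of that tally. It assumes plain, uncompressed FASTQ with exactly four lines per record (no wrapped sequences). The function name `n_fraction_by_position` and the two example reads are made up for illustration and are not taken from my data.

```python
from collections import Counter
from io import StringIO


def n_fraction_by_position(fastq_handle):
    """Return {1-based position: fraction of reads with an 'N' there}.

    Assumes four-line FASTQ records (header, sequence, '+', qualities).
    Positions with no N's are omitted from the result.
    """
    n_counts = Counter()
    total_reads = 0
    for line_index, line in enumerate(fastq_handle):
        # The sequence is the second line of every four-line record.
        if line_index % 4 != 1:
            continue
        seq = line.strip().upper()
        if not seq:
            continue
        total_reads += 1
        for pos, base in enumerate(seq, start=1):
            if base == "N":
                n_counts[pos] += 1
    if total_reads == 0:
        return {}
    return {pos: count / total_reads for pos, count in sorted(n_counts.items())}


# Tiny made-up example: one of the two reads has an N at position 5.
example = StringIO(
    "@read1\nACGTNACGT\n+\nIIII#IIII\n"
    "@read2\nACGTAACGT\n+\nIIIIIIIII\n"
)
print(n_fraction_by_position(example))  # {5: 0.5}
```

For a real file, the same function could be called on an open handle, e.g. `with open("reads.fastq") as fh: n_fraction_by_position(fh)`, where `reads.fastq` is a placeholder path. If only a small fraction of reads have N's at positions 13 to 15, cropping every read may cost more than the N's themselves.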