Dear all,
I am analysing sequencing data from pooled samples for a candidate gene, looking for rare variants. Using the output of the Illumina pipeline, I first took the filtered data (s_N_sequence.txt) and mapped it to my candidate gene. If I understand correctly, these reads are filtered by how well they align to the human genome, using certain parameters.
If I repeat my analysis using the unfiltered data (s_N_export.txt), I get better depth of coverage.
Is it OK to use this data, or am I introducing errors?
Because I already have some PCR-introduced errors, I am filtering out very-low-frequency SNPs from my data, so any very-low-frequency errors from the sequencing data should be filtered out there too.
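The frequency-based filtering step described above could be sketched as follows. This is only an illustration, assuming a simple list of per-position alternate-allele counts and depths; the field names and the 1% cutoff are hypothetical, not taken from the original post.

```python
# Minimal sketch of filtering out very-low-frequency variant calls,
# e.g. to remove PCR- or sequencing-introduced errors from pooled data.
# The dict keys and the 1% threshold are assumptions for illustration.

def filter_low_frequency(calls, min_freq=0.01):
    """Keep only calls whose alternate-allele frequency
    (alt_count / depth) meets the minimum threshold."""
    kept = []
    for call in calls:
        depth = call["depth"]
        if depth == 0:
            continue  # no coverage at this position, nothing to call
        freq = call["alt_count"] / depth
        if freq >= min_freq:
            kept.append(call)
    return kept

# Example: three candidate calls from a pooled sample
calls = [
    {"pos": 101, "alt_count": 2,  "depth": 1000},  # 0.2% -> dropped as likely error
    {"pos": 205, "alt_count": 30, "depth": 1000},  # 3.0% -> kept
    {"pos": 310, "alt_count": 15, "depth": 600},   # 2.5% -> kept
]
print([c["pos"] for c in filter_low_frequency(calls)])  # prints [205, 310]
```

The threshold would of course need to be set relative to the number of pooled individuals (the lowest real allele frequency expected in the pool), so that genuine rare variants are not discarded along with the errors.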
Any thoughts would be greatly appreciated.
Best Wishes
Michelle