This question arises from another thread in the forums; I'm posting it separately (and cross-referencing) because I want to ask the more bioinformatics-oriented and less sample-prep-oriented subset of readers.
An apparent property of at least some hybridization capture methods is a tendency to reduce the fragment (insert) size of the library. As a result, with long Illumina reads, the two reads of a pair may overlap in the middle of the fragment.
How do the variant callers out there handle this? If a variant is found in the overlap and the two reads agree, then that is clearly stronger evidence that the variant is present in that DNA fragment. BUT if you are worried about PCR artifacts (or sample damage, such as from FFPE), then you may want some separate accounting for having actually seen the same variant in two different fragments, rather than twice in one fragment.
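To make the accounting question concrete, here is a minimal sketch of one possible approach: collapse the two mates of a pair into a single fragment-level observation before counting support. It is not how any particular variant caller works. The `BaseObservation` type and the `fragment_level_counts` function are hypothetical names made up for illustration, and it assumes you have already extracted, for one genomic position, the base each read shows along with its read-pair name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class BaseObservation(NamedTuple):
    """One read's call at the site of interest (hypothetical type)."""
    fragment: str  # read-pair name; both mates of a pair share it
    base: str      # base called at the site
    qual: int      # Phred base quality (carried along, not used for counting here)


def fragment_level_counts(
    observations: Iterable[BaseObservation],
) -> Tuple[Dict[str, int], int]:
    """Count each DNA fragment once, even if both mates cover the site.

    Returns (support per base counted in fragments, number of fragments
    whose two mates disagree at the site).
    """
    by_fragment: Dict[str, List[BaseObservation]] = defaultdict(list)
    for obs in observations:
        by_fragment[obs.fragment].append(obs)

    counts: Dict[str, int] = defaultdict(int)
    discordant = 0
    for mates in by_fragment.values():
        bases = {m.base for m in mates}
        if len(bases) == 1:
            # Mates agree (or only one mate covers the site): one fragment of support.
            counts[mates[0].base] += 1
        else:
            # Mates disagree in the overlap: set aside rather than guess.
            discordant += 1
    return dict(counts), discordant


# Toy data: f1 and f4 are pairs whose mates both cover the site.
observations = [
    BaseObservation("f1", "T", 30),
    BaseObservation("f1", "T", 32),
    BaseObservation("f2", "T", 35),
    BaseObservation("f3", "C", 30),
    BaseObservation("f4", "T", 30),
    BaseObservation("f4", "C", 12),
]
counts, discordant = fragment_level_counts(observations)
print(counts, discordant)  # {'T': 2, 'C': 1} 1
```

In this toy data a naive per-read tally would report four reads supporting T, while the fragment-level tally reports two fragments, which is the distinction that matters if PCR or FFPE artifacts are the concern. Whether agreeing mates should additionally boost confidence in that single fragment, and how to treat disagreeing mates, is a design choice this sketch leaves open.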