Hello,
I have exome sequencing data from 24 samples, sequenced on a MiSeq. The data are not whole-exome data; they come from a gene panel of about 200 genes. I processed all of it with the usual whole-exome pipeline (BWA + GATK), using the corresponding BED file. The PCR duplication rate in the majority of the samples turned out to be very high, 70% to 90%. This also left the mean coverage very low, about 4x. Is this normal for gene panel data? And is it OK in gene panel sequencing not to remove PCR duplicates? Any other advice on processing gene panel data would be much appreciated.
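To put rough numbers on this, here is a minimal sketch of the arithmetic. It assumes the ~4x mean coverage was measured after duplicates were marked or removed, and that duplicates are spread evenly across the targets; both are assumptions on my part. `raw_coverage` is a hypothetical helper, not part of BWA or GATK.

```python
def raw_coverage(dedup_coverage: float, dup_rate: float) -> float:
    """Estimate mean coverage before duplicate removal.

    Assumes duplicates are evenly distributed, so that
    dedup_coverage = raw_coverage * (1 - dup_rate).
    """
    if not 0.0 <= dup_rate < 1.0:
        raise ValueError("dup_rate must be in [0, 1)")
    return dedup_coverage / (1.0 - dup_rate)


# Duplication rates from the post (70% and 90%), with ~4x coverage
# assumed to be the post-deduplication value.
for rate in (0.70, 0.90):
    print(f"dup rate {rate:.0%}: ~{raw_coverage(4.0, rate):.1f}x before dedup")
# dup rate 70%: ~13.3x before dedup
# dup rate 90%: ~40.0x before dedup
```

Under those assumptions, even the raw depth would only be roughly 13x to 40x per sample.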