I have trio datasets that I have phased using GATK's PhaseByTransmission and ReadBackedPhasing walkers.
My goal is to identify de novo mutations in these data.
I'm building a set of candidate de novo mutations in two ways: by selecting variants that are present in the offspring but absent from both parents, and by selecting variant sites that show Mendelian violations.
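To make the two candidate criteria concrete, here is a minimal sketch of how they could be expressed. It is only an illustration, not what GATK does internally: the function names are hypothetical, genotypes are assumed to be fully called diploid autosomal calls given as tuples of allele indices (0 = reference), and phasing is ignored.

```python
from itertools import product


def is_mendelian_violation(child, father, mother):
    """Return True if no combination of one paternal and one maternal
    allele reproduces the child's (unordered) genotype.

    Assumption: each genotype is a tuple of two allele indices, 0 = reference.
    """
    child_sorted = sorted(child)
    return not any(
        sorted(pair) == child_sorted for pair in product(father, mother)
    )


def is_candidate_de_novo(child, father, mother):
    """Return True if the child carries a non-reference allele that is
    absent from both parents' called genotypes."""
    parental_alleles = set(father) | set(mother)
    return any(a != 0 and a not in parental_alleles for a in child)


# Child is heterozygous, both parents are homozygous reference.
print(is_candidate_de_novo((0, 1), (0, 0), (0, 0)))

# Child is homozygous reference, father is homozygous alternate:
# a Mendelian violation, but no new allele in the child.
print(is_mendelian_violation((0, 0), (1, 1), (0, 1)))
print(is_candidate_de_novo((0, 0), (1, 1), (0, 1)))
```

The second example shows why the two criteria are not the same set: a Mendelian violation can also come from a missed allele in a parent or child, so violations without a new allele in the offspring probably need to be treated separately from sites where the offspring carries an allele neither parent has.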
I'd like to know how to filter this candidate set so that I can confidently distinguish true de novo variants from the rest.
I'd appreciate any input or ideas on a methodology for this analysis.