Does anyone know of a way to generate a simulated bacterial genome with known SNPs and indels relative to a given reference? I'd like to use these simulated genomes to benchmark various SNP-calling pipelines. It would, after all, be much easier to trust a particular SNP call if I know for an absolute fact that the SNP is there. I'm using PacBio data, so the various tools for simulating short-read datasets seem designed to solve a different problem than the one I have.
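To make the question concrete, here is a minimal sketch of the kind of thing I have in mind, assuming the reference is already loaded as a plain string. The function name `simulate_variants`, its parameters, and the truth-table layout are all hypothetical, not taken from any existing tool. It places SNPs and small indels at non-overlapping sites and records each one in 0-based reference coordinates.

```python
import random

BASES = "ACGT"


def simulate_variants(reference, n_snps, n_indels, max_indel_len=3, seed=0):
    """Return (mutated_sequence, truth), where truth is a list of
    (ref_pos, kind, ref_allele, alt_allele) tuples in 0-based reference
    coordinates, sorted by position. Hypothetical helper for illustration."""
    rng = random.Random(seed)

    # Candidate sites are spaced so that no two events can overlap:
    # each event touches at most max_indel_len reference bases.
    stride = max_indel_len + 1
    candidate_sites = range(0, len(reference) - max_indel_len, stride)
    sites = rng.sample(candidate_sites, n_snps + n_indels)
    kinds = ["snp"] * n_snps + ["indel"] * n_indels

    genome = list(reference)
    truth = []

    # Apply events from right to left so that earlier (leftmost) positions
    # in the list still line up with reference coordinates.
    for pos, kind in sorted(zip(sites, kinds), reverse=True):
        ref_base = reference[pos]
        if kind == "snp":
            alt = rng.choice([b for b in BASES if b != ref_base.upper()])
            genome[pos] = alt
            truth.append((pos, "SNP", ref_base, alt))
        elif rng.random() < 0.5:
            # Insertion placed immediately after reference position pos.
            length = rng.randint(1, max_indel_len)
            inserted = "".join(rng.choice(BASES) for _ in range(length))
            genome[pos + 1:pos + 1] = inserted
            truth.append((pos, "INS", "", inserted))
        else:
            # Deletion of reference[pos:pos + length].
            length = rng.randint(1, max_indel_len)
            deleted = reference[pos:pos + length]
            del genome[pos:pos + length]
            truth.append((pos, "DEL", deleted, ""))

    truth.reverse()  # report variants in ascending reference order
    return "".join(genome), truth
```

Something like `mutated, truth = simulate_variants(ref_seq, n_snps=100, n_indels=20)` would then give a mutated sequence plus a truth table to compare calls against. The mutated sequence would still need to go through a long-read simulator to get PacBio-like reads, which this sketch does not attempt.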
If you think I'm going about this in entirely the wrong way, I'm willing to listen.