View Single Post
Old 04-07-2017, 02:37 PM   #453
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Normally, the variant quality is the best indicator of whether a variant is real. I think, in this case, it's likely that most of them are real - viruses mutate rapidly, so viral "isolates" often contain multiple different viral genomes (which is why they are hard to assemble). The possibilities are, essentially...

1) Sequencing error randomly matched somewhere distant in the genome, causing the false appearance of a deletion. This is extremely unlikely with small, non-redundant genomes, and would usually only be represented by a single read.
2) Chimeric reads formed from different genomic fragments fusing together during sequencing may look like long deletion events. These should universally be represented by a single read (or pair) with a PCR-free or deduplicated library.
3) Real deletion events. Hopefully these will be represented by multiple reads.

So, you may want to deduplicate your library (if it was PCR'd, otherwise don't) and then call variants using a threshold of, say, 3 reads (minreads=3 flag, placed after "clearfilters").
Brian Bushnell is offline   Reply With Quote