Originally Posted by Zaag
GQ does not tell you anything about the variant quality. GQ tells you about how certain the HC is about the zygosity.

About the deletions: do they disappear if you use this option in the variant calling step?

Hi Zaag, and apologies for the late reply (I happened to be on holiday these last couple of weeks!),

Part of the deletions disappeared after I used your options, although new ones appeared. Cool thing though: many of the FP InDels were among the ones that were removed.

I am now also trying to use Picard CleanSam to filter them BEFORE the GATK pipeline and compare the differences (also to see why these new InDels appear). Thanks a lot for your reply, it was very helpful!

Originally Posted by Linnea
No sorry, I can't explain that part, maybe someone else has an idea?

But actually, why can't they just be real indels with a very high quality? (99 seems to be the highest quality you can get: "Because the most likely PL is always 0, GQ = second highest PL - 0. If the second most likely PL is greater than 99, we still assign a GQ of 99, so the highest value of GQ is 99." -from the GATK webpage). Maybe it will be clear after the realignment? (And sorry if I misunderstood something, I am really no indel expert..)
Hi Linnea! Again, apologies for the late reply.

I am quite confident these InDels are not real because the same regions were validated with Sanger before the experiment. Also, seeing a lot of InDels very close to each other (3-4 bp apart at most), and considering the nature of the disease as well as the conservation of these regions, makes me think these are FP. Again, I'm not an InDel expert either, so we're in the same boat! Thanks a lot for your contribution though, it was really helpful!
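
In case it helps anyone reading this later, here is a rough sketch of how the "lots of InDels very close to each other" check could be automated. It is plain Python and only an illustration: the function name clustered_indels, the max_gap default of 4 bp and the example positions are all made up, and it assumes the InDel start positions for a single chromosome have already been extracted from the VCF.

```python
def clustered_indels(positions, max_gap=4):
    """Return the positions that have at least one neighbour within max_gap bp.

    positions: indel start coordinates on ONE chromosome (assumed input).
    max_gap:   hypothetical threshold; 4 bp mirrors the "3-4 bp" mentioned above.
    """
    ordered = sorted(positions)
    flagged = set()
    # Compare each position with the next one in sorted order.
    for left, right in zip(ordered, ordered[1:]):
        if right - left <= max_gap:
            flagged.update((left, right))
    return sorted(flagged)


# Made-up coordinates, purely for illustration.
example = [1000, 1003, 1006, 5000, 9000, 9004]
print(clustered_indels(example))  # [1000, 1003, 1006, 9000, 9004]
```

Anything this flags is only a candidate FP, of course; whether a tight cluster is an artifact still has to be judged against the Sanger data or the alignments themselves.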


