10-26-2015, 04:35 AM   #1
Strange VAF distribution across SNP positions in WGS sample

Hi everyone,

I have a germline sample that underwent whole-genome sequencing at 30x on an Illumina HiSeq. After extracting the VAFs at all 1000 Genomes SNP positions, the data look strangely noisy.

- If you plot a histogram of VAFs from a normal sample, you would expect a peak at 0.5 for heterozygous variants and another at 1.0 for homozygous variants, with some spread around and between the peaks from sequencing noise. However, in this sample there is an almost uniform distribution from 0 to 0.8 and a peak at 1.0 (see attached figure).

- This is also reflected when I plot the VAFs at these SNPs from the germline sample (in blue) and a matched tumor sample (in red). In a normal-looking sample, the blue dots should cluster around 0.5 on the y-axis, and the red dots should separate where there is a CNV. In this weird sample, the blue dots do not cluster around any particular VAF, whereas the matched tumor sample looks fine.
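
For anyone wanting to reproduce this kind of check, here is a minimal sketch of how the VAF histogram described above could be built. It assumes per-site reference and alternate read counts are already available; the function names and the six example sites are hypothetical and are not taken from the sample in this thread.

```python
from collections import Counter


def vaf(ref_reads, alt_reads):
    """Variant allele fraction: alt reads divided by all reads at a site."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def vaf_histogram(vafs, n_bins=10):
    """Count VAFs in equal-width bins on [0, 1]; a VAF of 1.0 falls in the last bin."""
    counts = Counter(min(int(v * n_bins), n_bins - 1) for v in vafs)
    return [counts.get(i, 0) for i in range(n_bins)]


# Hypothetical (ref, alt) read counts at six SNP positions with roughly 30x coverage:
# three heterozygous-looking sites and three homozygous-alt-looking sites.
sites = [(15, 15), (16, 14), (14, 16), (0, 30), (0, 29), (1, 31)]
vafs = [vaf(ref, alt) for ref, alt in sites]
hist = vaf_histogram(vafs)

for i, count in enumerate(hist):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f}: {count}")
```

In a clean germline sample most of the counts should land in the bins around 0.5 and in the last bin, which is the pattern the first bullet describes.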

I've compared this sample with other samples, and there is no significant difference in coverage, insert size, GC content, ACGT content, indels, base quality or mismatch distributions. Does anyone have an idea what might give rise to such noisy data, or has anyone seen a similar case before? Thanks!
Attached image: Slide1.jpg (81.9 KB)
graubner277
10-26-2015, 05:41 AM   #2

Do you have more SNP calls in your 'weird' sample?

Contamination perhaps?
Bukowski
10-26-2015, 06:08 AM   #3

The number of SNP calls was similar to the other samples, as were the Ti/Tv ratio, Het/Hom ratio, % in dbSNP, % in genes, etc. I did not notice the noise in any of the sequencing QC statistics or SNP calls until I plotted the VAFs.

I was wondering if it was perhaps contaminated with another sample, but I would expect bands or multiple VAF peaks if there were, say, one other sample. I think it would take many samples mixed together to produce such a VAF distribution? And this is the only sample out of 30 that looks like this.
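
To make the "bands" reasoning concrete, here is a rough sketch of the VAFs one would expect if a single other individual contributed a fixed fraction of the reads. It assumes both individuals are diploid with normal copy number and ignores sampling noise, which at 30x would widen each band. The function name and the 20% contamination level are illustrative assumptions, not measurements from this sample.

```python
from itertools import product

# Alt-allele fraction contributed by each diploid genotype.
GENOTYPE_ALT_FRACTION = {"hom_ref": 0.0, "het": 0.5, "hom_alt": 1.0}


def expected_vaf_bands(contamination):
    """Expected VAFs when a fraction `contamination` of reads come from one other individual."""
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be between 0 and 1")
    bands = set()
    for main, other in product(GENOTYPE_ALT_FRACTION.values(), repeat=2):
        # Weighted mix of the two individuals' alt-allele fractions at one site.
        mixed = (1 - contamination) * main + contamination * other
        bands.add(round(mixed, 4))
    return sorted(bands)


# Assumed example: 20% of reads come from a second individual.
bands = expected_vaf_bands(0.2)
print(bands)
```

Under these assumptions a single contaminant gives at most nine discrete VAF values, so a near-uniform spread from 0 to 0.8 would need either many contributors or some other cause.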
graubner277

sequencing, snp, vaf, wgs
