#1 |
Junior Member
Location: Michigan Join Date: Jul 2015
Posts: 7
I may be completely wrong here, so please correct me if I am; I'm mostly used to doing RNA splicing analysis.
I have sequenced sheared DNA from about 10 NA12878 cells using Illumina and a library prep that uses a non-proofreading polymerase, so we expect it to introduce lots of errors, even early on. What I would like to do is figure out how many false positives it gives me (at an allele frequency of >=15%). To do that I need the NA12878 reference genome and its dbSNP database for heterozygous loci, right? If I just align to NA12878, then the het loci that are endogenous "SNPs" would look like false positives. Where can I find the NA12878-specific dbSNP? If there is another, more logical way of doing this analysis, please feel free to call me out. Thank you!
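A minimal sketch of the filtering described above, assuming you already have a truth-set VCF for NA12878 and a list of your own calls with read counts. The names `Call`, `load_truth_sites`, and `candidate_false_positives` are hypothetical, the toy records are invented for illustration, and the 15% cutoff is the one from the post. It only compares exact (chrom, pos, ref, alt) matches, so it is a starting point and not a full benchmarking workflow.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple

Site = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


class Call(NamedTuple):
    """One observed variant call (hypothetical structure for this sketch)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_reads: int
    depth: int


def load_truth_sites(vcf_lines: Iterable[str]) -> Set[Site]:
    """Collect (chrom, pos, ref, alt) keys from tab-separated VCF lines."""
    sites: Set[Site] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF headers
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alts = fields[:5]
        for alt in alts.split(","):  # multi-allelic sites list several ALTs
            sites.add((chrom, int(pos), ref, alt))
    return sites


def candidate_false_positives(
    calls: Iterable[Call], truth: Set[Site], min_af: float = 0.15
) -> List[Call]:
    """Return calls at or above min_af that are absent from the truth set."""
    out: List[Call] = []
    for c in calls:
        if c.depth == 0:
            continue  # no coverage, allele frequency undefined
        af = c.alt_reads / c.depth
        if af >= min_af and (c.chrom, c.pos, c.ref, c.alt) not in truth:
            out.append(c)
    return out


# Toy data, invented for illustration only.
truth_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t200\t.\tC\tT,G\t.\tPASS\t.",
]
calls = [
    Call("chr1", 100, "A", "G", 12, 25),  # known het site, not an error
    Call("chr1", 150, "T", "C", 5, 20),   # 25%, not in truth set
    Call("chr1", 200, "C", "G", 8, 20),   # known (second ALT at a multi-allelic site)
    Call("chr1", 300, "G", "A", 1, 30),   # ~3%, below the 15% cutoff
]

truth = load_truth_sites(truth_vcf)
fps = candidate_false_positives(calls, truth)
print(fps)
```

A call outside the regions where the truth set is considered reliable can't really be judged either way, so in practice you would probably also restrict both sets to those regions before counting.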
#2 |
Junior Member
Location: Michigan Join Date: Jul 2015
Posts: 7
I've actually found this thread that might help:
http://seqanswers.com/forums/showthread.php?t=23093

Two files that I found useful:
ftp://ftp.1000genomes.ebi.ac.uk/vol1...populations.md shows the population that the VCF files came from. NA12878 should be in CEU, i.e. Utah residents (CEPH) with Northern and Western European ancestry.
ftp://ftp.1000genomes.ebi.ac.uk/vol1.../2010_07/trio/ has all the different datasets.

If someone else has better options or suggestions, please let me know.
#3 |
Senior Member
Location: Berlin, DE Join Date: May 2008
Posts: 628
This might be interesting for you: http://www.illumina.com/platinumgenomes/
#4 |
Junior Member
Location: Michigan Join Date: Jul 2015
Posts: 7
Thank you, sklages! It looks like people update these files frequently, so the databases should be way better than the 2010 version that the 1000 Genomes pilot study offers.
#6 |
Junior Member
Location: Michigan Join Date: Jul 2015
Posts: 7
HESmith, that was a fantastic paper, thank you for sharing!
Tags |
allele frequency, dbsnp, false positives, na12878 |