05-24-2012, 01:58 AM   #1
Junior Member
Location: Leeds, UK

Join Date: May 2012
Posts: 2
Optimal bandwidth for kernel CNV estimation

I just started working on bandwidth selection for KC-smart and/or similar kernel regression methods for the purpose of identifying recurrent copy number aberrations in NGS libraries from genomic tumor DNA. If there are other people here working on related research and/or potential users of such a bandwidth selection algorithm, maybe we could share some thoughts.

We use KC-smart for the detection of recurrent copy number aberrations across multiple tumor libraries. In short, it works like this:

- Each library (i.e. sample, i.e. tumor) gives us approximately 5 million reads. Since the fragments to be read are made by sonication rather than with restriction enzymes, they tend to be at unique locations. Subject to correction for GC bias and such, the density of the reads is proportional to the copy number.

- The reads are binned into windows of, say, 50 kb, so that each window has a copy number estimate.

- That copy number estimate is then used by KC-smart, a kernel regression algorithm originally made for aCGH. It produces locally weighted regression coefficients that address research questions such as "does this region have a higher copy number in library class A than in library class B?".
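
To make the steps above concrete, here is a minimal sketch of binning followed by kernel smoothing. It is not KC-smart's own code: the function names, the median-based scaling to a ploidy of 2, and the plain Gaussian Nadaraya-Watson smoother are all assumptions chosen for illustration, and GC correction is left out.

```python
import math
from collections import Counter
from statistics import median


def bin_reads(positions, window=50_000, chrom_length=None):
    """Count read start positions per fixed-width window (0-based coordinates)."""
    if chrom_length is None:
        # Assumption: without a known chromosome length, stop at the last read.
        chrom_length = max(positions) + 1
    n_bins = math.ceil(chrom_length / window)
    counts = Counter(p // window for p in positions)
    return [counts.get(i, 0) for i in range(n_bins)]


def copy_number_estimates(counts, ploidy=2):
    """Scale counts so that the median window corresponds to `ploidy` copies.

    Assumes the median count is non-zero and that most of the genome
    is at the baseline copy number.
    """
    ref = median(counts)
    return [ploidy * c / ref for c in counts]


def kernel_smooth(values, bandwidth_bins):
    """Nadaraya-Watson smoother with a Gaussian kernel over bin indices."""
    n = len(values)
    smoothed = []
    for i in range(n):
        weights = [math.exp(-0.5 * ((i - j) / bandwidth_bins) ** 2) for j in range(n)]
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return smoothed
```

The bandwidth here is expressed in bins, so `bandwidth_bins=4` with 50 kb windows corresponds to a kernel standard deviation of 200 kb.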

What I want to do is to make a bandwidth selection algorithm. I want to use the read locations directly without binning them. Bandwidth selection for kernel regression is a little different from bandwidth selection for kernel density estimation. I might also consider:
- building the GC correction into the bandwidth selection
- dynamic bandwidth selection, i.e. a larger bandwidth in low-copy-number regions
- shrinking estimates towards the nearest integer copy number (the sample may be homogeneous with respect to the CN of some regions)
- handling ambiguously mapped reads
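
As a starting point for discussion, one standard way to pick a bandwidth for kernel regression is leave-one-out cross-validation. The sketch below is a generic textbook approach, not the algorithm proposed in this post and not part of KC-smart. It assumes the data are (position, copy number estimate) pairs, uses a Gaussian kernel, and searches a user-supplied grid of candidate bandwidths; all function names are hypothetical.

```python
import math


def gaussian(u):
    """Unnormalised Gaussian kernel."""
    return math.exp(-0.5 * u * u)


def loo_cv_score(x, y, h):
    """Mean squared leave-one-out prediction error of a Nadaraya-Watson fit.

    Returns infinity if some point has no neighbour with non-zero weight,
    which happens when the bandwidth h is far too small for the spacing of x.
    """
    total = 0.0
    for i in range(len(x)):
        num = den = 0.0
        for j in range(len(x)):
            if j == i:
                continue  # leave observation i out of its own prediction
            w = gaussian((x[i] - x[j]) / h)
            num += w * y[j]
            den += w
        if den == 0.0:
            return math.inf
        total += (y[i] - num / den) ** 2
    return total / len(x)


def select_bandwidth(x, y, candidates):
    """Return the candidate bandwidth with the lowest leave-one-out error."""
    return min(candidates, key=lambda h: loo_cv_score(x, y, h))
```

This costs O(n^2) per candidate, which is fine for binned data but would need a windowed or approximate kernel sum before it could be applied to millions of unbinned read positions. A single global bandwidth is also assumed, so the dynamic-bandwidth idea above would need a different criterion.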
helene_thygesen

bandwidth selection, cnv, kc-smart, kernel regression, recurrent aberrations

All times are GMT -8.