SEQanswers

07-07-2011, 03:19 AM   #1
m_elena_bioinfo
Member
 
Location: Ospedali Riuniti di Bergamo, ITALY

Join Date: Oct 2009
Posts: 99
CNV from only one sample

Dear NGS users,
does anyone know whether, and how, I can analyse CNVs in ONLY one sample from next-gen DNA sequencing, without controls or other samples for comparison?

Thanks a lot to everybody,
ME
07-07-2011, 03:29 AM   #2
Dethecor
Member
 
Location: Germany

Join Date: May 2010
Posts: 24
Depth of Coverage

Assuming you have a reference genome for your organism, you can still spot such events by looking at the depth of coverage. This lets you see regions where your sample has more or fewer copies than the reference. Basically: assume most regions are present at the normal copy number (once, or twice if your organism is diploid by default), determine the expected coverage at that copy number, and then find regions/windows/bins whose depth of coverage differs significantly from it, which indicates a change in copy number for that region.
I think the alignment method can play an important role here (align each read only once, etc.), and you might also want to try some normalization for GC content and mappability to make things more comparable.

Admittedly, it's probably something you'll have to implement at least partly yourself (using R/Bioconductor, Python/HTSeq or Bio<YourFavouriteScriptingLanguageHere> ...), but maybe I'm just not aware of tools that already offer all of this functionality out of the box?!
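
A minimal Python sketch of the windowed depth-of-coverage idea described above, assuming you already have a per-base depth list for one chromosome (for example parsed from samtools depth output). The function names, the window size, and the fixed tolerance of 0.5 copies are illustrative assumptions, not an established tool; a real analysis would use a proper statistical test and the GC/mappability corrections as well.

```python
from statistics import median


def window_means(depth, window):
    """Mean depth in consecutive non-overlapping windows (a trailing partial window is dropped)."""
    return [
        sum(depth[start:start + window]) / window
        for start in range(0, len(depth) - window + 1, window)
    ]


def estimate_copy_number(depth, window=1000, ploidy=2):
    """Estimate copy number per window, assuming the median window sits at the normal ploidy."""
    means = window_means(depth, window)
    baseline = median(means)
    if baseline == 0:
        raise ValueError("median window depth is zero; cannot set a baseline")
    return [ploidy * m / baseline for m in means]


def flag_windows(copy_numbers, ploidy=2, tolerance=0.5):
    """Return (window_index, 'gain' or 'loss') for windows that deviate from ploidy by >= tolerance."""
    calls = []
    for idx, cn in enumerate(copy_numbers):
        if cn >= ploidy + tolerance:
            calls.append((idx, "gain"))
        elif cn <= ploidy - tolerance:
            calls.append((idx, "loss"))
    return calls


# Toy data: 30x baseline, one window at 45x and one window at 15x.
depth = [30] * 4000 + [45] * 1000 + [30] * 3000 + [15] * 1000 + [30] * 2000
calls = flag_windows(estimate_copy_number(depth, window=1000))
print(calls)
```

Using the median window as the baseline is what encodes the "most regions are normal" assumption; it breaks down if a large part of the genome is gained or lost.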

Cheers,
Paul
__________________

"You are only young once, but you can stay immature indefinitely."
07-07-2011, 04:40 AM   #3
m_elena_bioinfo
Member
 
Location: Ospedali Riuniti di Bergamo, ITALY

Join Date: Oct 2009
Posts: 99

Dear Paul,
thanks for the quick reply!
I work with human samples, so a reference genome and the corresponding annotation are available for my analysis.
I have only one question about your answer; I probably have not understood it well enough.
How can I normalize the coverage of one sample if I do not have the depth of other samples?
07-07-2011, 05:03 AM   #4
Dethecor
Member
 
Location: Germany

Join Date: May 2010
Posts: 24
Normalization

The kind of normalization I was thinking about is based on a property of the sequence itself and is not relative to other samples.
Normalization relative to other samples would be, for example, a library-size correction: for two samples A and B with read counts rc_a and rc_b, you might multiply the coverage in sample A by the ratio rc_b / rc_a to correct for the difference in library size.
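
For completeness, the library-size correction is just a single scaling factor. A tiny sketch, where the function name is made up for illustration:

```python
def scale_to_library(coverage_a, rc_a, rc_b):
    """Scale sample A's coverage to sample B's library size using the ratio rc_b / rc_a."""
    if rc_a <= 0:
        raise ValueError("rc_a must be a positive read count")
    factor = rc_b / rc_a
    return [c * factor for c in coverage_a]


# Sample A has 1M reads, sample B has 2M reads, so A's coverage is doubled.
scaled = scale_to_library([10, 20], rc_a=1_000_000, rc_b=2_000_000)
print(scaled)
```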

Independently of this library-size correction for multiple samples, you might want to normalize for mappability within a single sample, for example like so:

corrected_cvg[i] = coverage[i] / mappability[i]

where coverage is the read count per position and mappability gives, for each position i, the percentage of mappable positions in a window around i (expressed as a fraction between 0 and 1, so that the division scales coverage up in poorly mappable regions). The size of the window should be related to the length of your reads: for a library with read length 100, for example, the number of reads overlapping position i can only be influenced by the mappability in the interval [i-100..i+100].
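
A rough Python sketch of this correction, assuming you have a per-position 0/1 flag saying whether a read starting there maps uniquely (how you obtain that track is up to you). The function names and the min_mappability cut-off are my own assumptions; the cut-off is only there to avoid dividing by values close to zero.

```python
def windowed_mappability(mappable, read_length):
    """Fraction of mappable positions in [i - read_length, i + read_length], clipped at the ends."""
    n = len(mappable)
    # Prefix sums make each window sum an O(1) lookup.
    prefix = [0]
    for flag in mappable:
        prefix.append(prefix[-1] + (1 if flag else 0))
    fractions = []
    for i in range(n):
        lo = max(0, i - read_length)
        hi = min(n, i + read_length + 1)
        fractions.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return fractions


def correct_for_mappability(coverage, mappability, min_mappability=0.5):
    """corrected_cvg[i] = coverage[i] / mappability[i]; None where mappability is below the cut-off."""
    if len(coverage) != len(mappability):
        raise ValueError("coverage and mappability must have the same length")
    return [
        c / m if m >= min_mappability else None
        for c, m in zip(coverage, mappability)
    ]


fractions = windowed_mappability([1, 1, 0, 0, 1, 1], read_length=1)
corrected = correct_for_mappability([10, 10, 10], [1.0, 0.5, 0.25])
print(fractions, corrected)
```

Positions that come back as None are better excluded from downstream windows than treated as zero coverage.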

Maybe have a look here for some ideas of what people do to normalize for GC content and mappability, e.g. the first PubMed hit on the subject.
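
One common way to handle GC content (not necessarily what the linked paper does) is to group windows by GC fraction and rescale each window by the median depth of its group. A sketch under that assumption, with made-up function names and an arbitrary number of GC bins:

```python
from collections import defaultdict
from statistics import median


def gc_fraction(seq):
    """GC fraction over called bases (A, C, G, T); None if the window has no called bases."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return None
    return (seq.count("G") + seq.count("C")) / called


def gc_normalize(depths, gc_fractions, n_bins=20):
    """Scale each window by overall_median / median depth of windows with similar GC.

    Assumes every entry of gc_fractions is a float in [0, 1]; drop windows
    where gc_fraction() returned None before calling this.
    """
    if len(depths) != len(gc_fractions):
        raise ValueError("depths and gc_fractions must have the same length")

    def gc_bin(gc):
        return min(int(gc * n_bins), n_bins - 1)

    overall = median(depths)
    groups = defaultdict(list)
    for d, gc in zip(depths, gc_fractions):
        groups[gc_bin(gc)].append(d)
    group_median = {b: median(values) for b, values in groups.items()}

    corrected = []
    for d, gc in zip(depths, gc_fractions):
        m = group_median[gc_bin(gc)]
        corrected.append(d * overall / m if m > 0 else None)
    return corrected


# Toy data: windows around 50% GC get about twice the depth of windows around 30% GC.
normalized = gc_normalize([20, 22, 40, 44], [0.30, 0.31, 0.50, 0.52], n_bins=10)
print(normalized)
```

After the correction the two GC groups sit on the same scale, so what remains is variation within each group.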
07-07-2011, 05:21 AM   #5
m_elena_bioinfo
Member
 
Location: Ospedali Riuniti di Bergamo, ITALY

Join Date: Oct 2009
Posts: 99

Great! Your explanation is very clear!
Thanks a lot again,
Good work!
Maria Elena
07-07-2011, 05:43 AM   #6
m_elena_bioinfo
Member
 
Location: Ospedali Riuniti di Bergamo, ITALY

Join Date: Oct 2009
Posts: 99

Paul,
one last question...
does the same reasoning (and the answer to my questions) also apply to targeted sequencing (target enrichment of selected gene regions, or whole-exome)?
07-07-2011, 06:00 AM   #7
Dethecor
Member
 
Location: Germany

Join Date: May 2010
Posts: 24
CNVs with exome sequencing

I think that depends on what technology you use and what you want to do with your data. From what I understand, one would use whole-exome sequencing primarily for SNP detection and maybe for finding some small indels?! In that case the normalisation is not required, since you're only looking for ratios or reads that map with insertions/deletions, not for how many of those you have.

In theory you can still correct for mappability and GC content with this kind of data, but depending on your wet-lab protocol some additional effects might occur that would make it hard to call ploidy / copy number variation.
For example, if you use an exon array to extract all the exonic DNA before sequencing, the binding affinity of the probes might play a role (if you're lucky it's mainly determined by the probe GC content and you can also normalize it away later on). You might also get saturation of certain probes, which would create a theoretical maximum copy number that you could still detect.

I guess the inconclusive answer here is: It depends!

Cheers,
Paul

p.s.: Is there precedent for CNV calling with targeted sequencing? Also remember that you need a basis of "normal" regions to check against if you want to determine the copy number of an interesting gene.
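
To illustrate the "basis of normal regions" point: a hedged sketch that compares the mean depth of one target region against the median depth of regions you assume to be at normal copy number. The function name and the choice of reference regions are hypothetical, and capture efficiency differences between probes are ignored here.

```python
from math import log2
from statistics import median


def copy_number_vs_reference(target_depth, reference_depths, ploidy=2):
    """Return (log2 ratio, estimated copy number) of a target region against 'normal' regions."""
    baseline = median(reference_depths)
    if baseline <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive to form a log2 ratio")
    ratio = target_depth / baseline
    return log2(ratio), ploidy * ratio


# Toy data: reference regions around 100x, gene of interest at 150x.
log_ratio, copies = copy_number_vs_reference(150, [95, 100, 105, 98, 102])
print(round(log_ratio, 3), copies)
```

The estimate is only as good as the assumption that the chosen reference regions really are copy-number neutral in your sample.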
Last edited by Dethecor; 07-07-2011 at 06:01 AM. Reason: Typo-Typo-Typo
07-11-2011, 05:57 AM   #8
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747

http://www.ncbi.nlm.nih.gov/pubmed/21701589 shows that both homozygous deletions and high level amplifications can be identified from exome data
01-18-2016, 07:20 AM   #9
simobioinfo
Member
 
Location: italy

Join Date: Aug 2014
Posts: 38

Hi,
I'm working with targeted Ion Torrent PGM data.
I would like to know if there are methods to identify CNVs from this kind of data.
Thank you in advance