SEQanswers

Old 02-10-2015, 09:30 AM   #1
valeu
Member
 
Location: Paris

Join Date: Sep 2008
Posts: 69
Default ONCOCNV: a method to extract CNAs from amplicon (or targeted) sequencing data

We are happy to present ONCOCNV, a method to detect copy number alterations in amplicon or targeted sequencing data. The method can be applied to exome-seq data as well, but it will not adjust the profiles for contamination by normal cells or evaluate genotypes (LOH).

ONCOCNV was developed by OncoDNA in collaboration with the Bioinformatics Laboratory of Institut Curie (Paris). It automatically computes, normalizes, and segments copy number profiles, then calls copy number alterations. The user can provide any number of control samples to construct the baseline. However, we recommend using at least three control samples. The more, the better.

Webpage: http://oncocnv.curie.fr/
Publication: Boeva, V. et al. (2014) Multi-factor data normalization enables the detection of copy number aberrations in amplicon sequencing data. Bioinformatics, 30(24):3443-3450.

Input for CNA detection: aligned single-end or paired-end data in the BAM format.
Output: Annotation of genes with copy number changes + visualization of the profile (.png).

Paper abstract:
MOTIVATION:
Because of its low cost, amplicon sequencing, also known as ultra-deep targeted sequencing, is now becoming widely used in oncology for detection of actionable mutations, i.e. mutations influencing cell sensitivity to targeted therapies. Amplicon sequencing is based on the polymerase chain reaction amplification of the regions of interest, a process that considerably distorts the information on copy numbers initially present in the tumor DNA. Therefore, additional experiments such as single nucleotide polymorphism (SNP) or comparative genomic hybridization (CGH) arrays often complement amplicon sequencing in clinics to identify copy number status of genes whose amplification or deletion has direct consequences on the efficacy of a particular cancer treatment. So far, there has been no proven method to extract the information on gene copy number aberrations based solely on amplicon sequencing.
RESULTS:
Here we present ONCOCNV, a method that includes a multifactor normalization and annotation technique enabling the detection of large copy number changes from amplicon sequencing data. We validated our approach on high and low amplicon density datasets and demonstrated that ONCOCNV can achieve a precision comparable with that of array CGH techniques in detecting copy number aberrations. Thus, ONCOCNV applied on amplicon sequencing data would make the use of additional array CGH or SNP array experiments unnecessary.

Last edited by valeu; 02-10-2015 at 09:34 AM.
Old 08-24-2015, 10:16 AM   #2
xxqtony
Junior Member
 
Location: CA

Join Date: Jun 2008
Posts: 9
Default

Hi Valeu,
I wonder if you can help me with this. I tried to run your ONCOCNV v6.1 on the test data and got this error: Error in file(file, "rt") : cannot open the connection. Then the program quits.
More details are shown below.
Thanks!
-Tony

=====================================================
$ ./RUNME.sh
Package 'mclust' version 5.0.2
Type 'citation("mclust")' for citing this R package in publications.
Warning: you have both male and female samples in the control. We will try to assign sex using read coverage on chrX
0.5 0.5 0.5 1 1 0.5 0.5 1 1 0.5 1 1 0.5 1 1
Centering
Whitening
Symmetric FastICA using logcosh approx. to neg-entropy function
Iteration 1 tol=0.354678
Iteration 2 tol=0.401774
Iteration 3 tol=0.485327
Iteration 4 tol=0.644948
Iteration 5 tol=0.960465
Iteration 6 tol=0.518261
Iteration 7 tol=0.071013
Iteration 8 tol=0.006314
Iteration 9 tol=0.004754
Iteration 10 tol=0.004012
Iteration 11 tol=0.003472
Iteration 12 tol=0.004472
Iteration 13 tol=0.005501
Iteration 14 tol=0.005915
Iteration 15 tol=0.005593
Iteration 16 tol=0.004960
Iteration 17 tol=0.004012
Iteration 18 tol=0.002834
Iteration 19 tol=0.001732
Iteration 20 tol=0.001043
Iteration 21 tol=0.000649
Iteration 22 tol=0.000395
Iteration 23 tol=0.000245
Iteration 24 tol=0.000161
Iteration 25 tol=0.000133
Iteration 26 tol=0.000131
Iteration 27 tol=0.000133
Iteration 28 tol=0.000138
Iteration 29 tol=0.000146
Iteration 30 tol=0.000157
Iteration 31 tol=0.000171
Iteration 32 tol=0.000186
Iteration 33 tol=0.000203
Iteration 34 tol=0.000221
Iteration 35 tol=0.000239
Iteration 36 tol=0.000256
Iteration 37 tol=0.000272
Iteration 38 tol=0.000285
Iteration 39 tol=0.000294
Iteration 40 tol=0.000299
Iteration 41 tol=0.000298
Iteration 42 tol=0.000290
Iteration 43 tol=0.000276
Iteration 44 tol=0.000257
Iteration 45 tol=0.000233
Iteration 46 tol=0.000206
Iteration 47 tol=0.000179
Iteration 48 tol=0.000151
Iteration 49 tol=0.000128
Iteration 50 tol=0.000107
Iteration 51 tol=0.000088
Iteration 52 tol=0.000072
Iteration 53 tol=0.000058
Iteration 54 tol=0.000047
Iteration 55 tol=0.000037
Iteration 56 tol=0.000030
Iteration 57 tol=0.000024
Iteration 58 tol=0.000019
Iteration 59 tol=0.000015
Iteration 60 tol=0.000012
Iteration 61 tol=0.000010
Iteration 62 tol=0.000008
Iteration 63 tol=0.000006
Iteration 64 tol=0.000005
Iteration 65 tol=0.000004
Iteration 66 tol=0.000003
Iteration 67 tol=0.000003
Iteration 68 tol=0.000002
Iteration 69 tol=0.000002
Iteration 70 tol=0.000001
Iteration 71 tol=0.000001
Iteration 72 tol=0.000001
Explained variance by the first pronicpal components of PCA:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10 Comp.11 Comp.12 Comp.13 Comp.14 Comp.15
0.7874551 0.8498553 0.9053619 0.9284231 0.9443718 0.9596778 0.9682324 0.9741502 0.9794208 0.983909 0.9881928 0.991948 0.9950968 0.9978144 1
null device
1
Package 'mclust' version 5.0.2
Type 'citation("mclust")' for citing this R package in publications.
PSCBS v0.44.0 (2015-02-22) successfully loaded. See ?PSCBS for help.

Attaching package: ‘PSCBS’

The following objects are masked from ‘package:base’:

append, load

R.cache v0.10.0 (2014-06-10) successfully loaded. See ?R.cache for help.
Loading required package: lattice
Loading required package: grid
Loading required package: parallel
Error in file(file, "rt") : cannot open the connection
Calls: read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file './Test.stats.txt': No such file or directory
Execution halted

Last edited by xxqtony; 08-24-2015 at 10:20 AM.
Old 08-24-2015, 10:49 AM   #3
xxqtony
Junior Member
 
Location: CA

Join Date: Jun 2008
Posts: 9
Default

I also tried with my own data and finally managed to get the program to run; however, the results are not what I expected. I have regions/amplicons with a 4x CNV, but they get a 2x prediction. I wonder if there's anything I missed.
Thanks.
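
For readers comparing expected and called values, here is a minimal sketch of the read-depth ratio a given copy number would be expected to produce. It is not part of ONCOCNV; the function name, the diploid baseline, and the simple tumor-purity model are assumptions made for illustration. Whether this explains the result described above is not known.

```python
import math


def expected_ratio(copy_number, purity=1.0, normal_copies=2):
    """Expected tumor/control read-depth ratio for one region.

    Assumes a simple mixture model: a fraction `purity` of cells carry
    `copy_number` copies and the rest carry `normal_copies` copies.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    tumor_part = purity * copy_number
    normal_part = (1.0 - purity) * normal_copies
    return (tumor_part + normal_part) / normal_copies


# Print the expected ratio for 4 copies at a few illustrative purities.
for purity in (1.0, 0.5, 0.2):
    ratio = expected_ratio(4, purity)
    print(f"purity={purity:.1f} ratio={ratio:.2f} log2={math.log2(ratio):.2f}")
```

Under these assumptions, four copies in a pure sample correspond to a ratio of 2 relative to a diploid control, so it is worth checking whether a reported value is a copy number or a ratio.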
Old 08-26-2015, 05:48 AM   #4
valeu
Member
 
Location: Paris

Join Date: Sep 2008
Posts: 69
Default

For the test dataset, do you see that './Test.stats.txt' has been created?

For the second dataset, I don't understand what is wrong.

Please contact me by email.
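
A quick way to answer the first question is to check which intermediate files exist in the output directory and whether any of them are empty. The sketch below is not part of ONCOCNV; the file names are the ones that appear in logs posted in this thread, and the exact set written by a given version is an assumption.

```python
import os

# File names seen in logs in this thread; the full list that a given
# ONCOCNV version writes is an assumption.
EXPECTED_FILES = [
    "Test.stats.txt",
    "target.bed",
    "target.fasta",
    "target.GC.txt",
    "Control.stats.Processed.txt",
]


def check_outputs(out_dir, names=EXPECTED_FILES):
    """Return a dict mapping each file name to 'missing', 'empty', or 'ok'."""
    status = {}
    for name in names:
        path = os.path.join(out_dir, name)
        if not os.path.isfile(path):
            status[name] = "missing"
        elif os.path.getsize(path) == 0:
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    # Check the current directory; pass another path to check_outputs() as needed.
    for name, state in check_outputs(".").items():
        print(f"{state:8s}{name}")
```

A file reported as missing or empty points to the step that failed before the R scripts tried to read it.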
Old 03-30-2016, 06:23 AM   #5
bioWizz
Junior Member
 
Location: Manipal

Join Date: Jul 2013
Posts: 6
Default

Hi Valeu,

I have 10 samples (8 test and 2 control) on which I am trying to run ONCOCNV v6.4. I have configured the ONCOCNV.sh file as per the instructions given. It throws the following error.

Quote:
Detected 2 control sample(s)
reading 11.bam
sample name: 11
read 100000 reads
read 200000 reads
read 300000 reads
read 400000 reads
read 500000 reads
reading 12.bam
sample name: 12
read 100000 reads
read 200000 reads
read 300000 reads
read 400000 reads
Total target length: 272944
processed 2 controls, 11 12
Illegal division by zero at
/san2/mallya/exome_cnv/anantha/unmapped/oncocnv/ONCOCNV//ONCOCNV_getCounts.v6.4.pl line 466 (#1)
(F) You tried to divide a number by 0. Either something was wrong in
your logic, or you need to put a conditional in to guard against
meaningless input.

Uncaught exception from user code:
Illegal division by zero at /san2/mallya/exome_cnv/anantha/unmapped/oncocnv/ONCOCNV//ONCOCNV_getCounts.v6.4.pl line 466.
at /san2/mallya/exome_cnv/anantha/unmapped/oncocnv/ONCOCNV//ONCOCNV_getCounts.v6.4.pl line 466.

------------------------


--Coordinates are read--


------------------------

Total target length: 0
Detected 8 tumor sample(s)
reading 1.bam
reading 2.bam
reading 3.bam
reading 4.bam
reading 5.bam
reading 6.bam
reading 7.bam
reading 8.bam
Error: The requested bed file (/san2/mallya/exome_cnv/anantha/unmapped/oncocnv/result//target.bed) could not be opened. Exiting!
Any suggestions to proceed further?

Thanks
Old 04-25-2016, 02:39 AM   #6
valeu
Member
 
Location: Paris

Join Date: Sep 2008
Posts: 69
Default

I believe something is wrong with your .bed file with regions. Please check the readme.
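
The log above reports "Total target length: 0" before the division by zero, so one thing to check is whether the .bed file parses into any intervals at all. The sketch below sums interval lengths from a tab-delimited BED file. How ONCOCNV itself computes this number is an assumption, and the function name is hypothetical.

```python
def total_target_length(bed_path):
    """Sum (end - start) over all tab-delimited lines with numeric coordinates."""
    total = 0
    with open(bed_path) as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            # Skip blank lines, header lines, and lines that are not
            # tab-delimited (those split into fewer than 3 fields).
            if len(fields) < 3:
                continue
            if not (fields[1].isdigit() and fields[2].isdigit()):
                continue
            total += int(fields[2]) - int(fields[1])
    return total
```

With this parser, a file delimited by spaces instead of tabs yields a total of 0, which is one possible (unconfirmed) way to end up with an empty target.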
Old 08-16-2016, 02:51 PM   #7
arnoldliao
Junior Member
 
Location: San Diego CA

Join Date: Oct 2010
Posts: 5
Default span is too small

Has anyone experienced this issue?
Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
span is too small

I assume this is because my bed file contains sections where there are no reads? Outputs at https://drive.google.com/file/d/0B4x...ew?usp=sharing

*** rest of stdout ***
Calls: loess -> simpleLoess
Execution halted
Package 'mclust' version 5.2
Type 'citation("mclust")' for citing this R package in publications.
PSCBS v0.61.0 (2016-02-03) successfully loaded. See ?PSCBS for help.

Attaching package: 'PSCBS'

The following objects are masked from 'package:base':

append, load

R.cache v0.12.0 (2015-11-12) successfully loaded. See ?R.cache for help.
Loading required package: lattice
Loading required package: grid
Loading required package: parallel
Error in file(file, "rt") : cannot open the connection
Calls: read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file '/apps/outputDEEPCNA//Control.stats.Processed.txt': No such file or directory
Execution halted
Old 08-16-2016, 03:08 PM   #8
arnoldliao
Junior Member
 
Location: San Diego CA

Join Date: Oct 2010
Posts: 5
Default

perhaps switching out https://stat.ethz.ch/R-manual/R-deve...tml/loess.html
for https://stat.ethz.ch/R-manual/R-deve.../html/rlm.html ?
Old 01-20-2017, 02:25 AM   #9
sachin
Junior Member
 
Location: India

Join Date: May 2010
Posts: 9
Default

Quote:
Originally Posted by arnoldliao View Post
Anyone experience this issue
Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
span is too small

I assume this is because my bed file contain sections where there are no reads? Outputs at https://drive.google.com/file/d/0B4x...ew?usp=sharing

Hi,
I got the same error. It is resolved now; the problem was with the BAM file.

Sachin A
Old 06-14-2017, 09:14 AM   #10
arnoldliao
Junior Member
 
Location: San Diego CA

Join Date: Oct 2010
Posts: 5
Default mclust error

I figured it out. My bed file contained only chr, start, end, while ONCOCNV needs chr, start, end, name, score, geneName.

I'm getting an error:
Error in if (minFrac < minFractionOfShortOrLongAmplicons & maxFrac < minFractionOfShortOrLongAm
missing value where TRUE/FALSE needed

Any idea where I can start to debug? Use a smaller bed file?

stderr


/outputDEEPCNA//Test.stats.txt was created
-rw-rw-rw- 1 root root 29 Jun 14 06:26 /outputDEEPCNA//Test.stats.txt
creating target.bed
-rw-rw-rw- 1 root root 0 Jun 14 06:26 /outputDEEPCNA//target.bed
creating target.GC.txt
..Oops.. File /outputDEEPCNA//target.fasta is empty!
..It seems that there is not 'chr' prefixes in your reference genome fasta file..
..But no worries! OncoCNV will adjust for it
-rw-rw-rw- 1 root root 10 Jun 14 06:27 /outputDEEPCNA//target.GC.txt
running processControl.R
running processSamples.R

Package 'mclust' version 5.3
Type 'citation("mclust")' for citing this R package in publications.
Error in if (minFrac < minFractionOfShortOrLongAmplicons & maxFrac < minFractionOfShortOrLongAm
missing value where TRUE/FALSE needed
Execution halted
ls: cannot access '/outputDEEPCNA//Control.stats.Processed.txt': No such file or directory
Package 'mclust' version 5.3
Type 'citation("mclust")' for citing this R package in publications.
PSCBS v0.62.0 (2016-11-10) successfully loaded. See ?PSCBS for help.

Last edited by arnoldliao; 06-14-2017 at 02:10 PM. Reason: figured it out.
Old 07-05-2017, 06:55 AM   #11
valeu
Member
 
Location: Paris

Join Date: Sep 2008
Posts: 69
Default

Does your new (tab-delimited) .bed file satisfy the requirements listed in the OncoCNV manual?

Check formats:
o reads should be given in .BAM format
o amplicon coordinates should be given in .bed format (with or without the headline) and have amplicon ID in column 4 and gene symbol in column 6, e.g.: chr1 2488068 2488201 AMPL223847 0 TNFRSF14

It is mandatory to provide gene names in the 6th column.

VERY IMPORTANT

Please make sure that:
- There are no duplicates in the coordinates
- Coordinates are sorted
- Gene names are real gene names, in the sense that the corresponding amplicons fall in the same genomic locus and not on different chromosomes
- Gene names are not the same as amplicon names or IDs, because ONCOCNV assumes there are several amplicons per gene
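
The checklist above can be sketched as a small standalone validator. This is not part of ONCOCNV; the function name and messages are hypothetical, and "sorted" is assumed here to mean ascending start/end within each chromosome, with no assumption about the order of chromosomes.

```python
import sys
from collections import defaultdict


def validate_bed(lines):
    """Return a list of problems found in tab-delimited, 6-column BED lines."""
    problems = []
    seen = set()                     # (chrom, start, end) already encountered
    previous = {}                    # chrom -> last (start, end) seen
    gene_chroms = defaultdict(set)   # gene -> chromosomes it appears on
    amplicon_ids = set()
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        # An optional header on the first line is allowed by the checklist.
        if number == 1 and (len(fields) < 3 or not fields[1].isdigit()):
            continue
        if len(fields) < 6:
            problems.append(
                f"line {number}: expected 6 tab-separated columns, found {len(fields)}")
            continue
        chrom, start, end, amplicon, _score, gene = fields[:6]
        if not (start.isdigit() and end.isdigit()):
            problems.append(f"line {number}: non-numeric coordinates")
            continue
        interval = (int(start), int(end))
        if (chrom,) + interval in seen:
            problems.append(f"line {number}: duplicate coordinates {chrom}:{start}-{end}")
        seen.add((chrom,) + interval)
        if chrom in previous and interval < previous[chrom]:
            problems.append(f"line {number}: coordinates not sorted within {chrom}")
        previous[chrom] = interval
        gene_chroms[gene].add(chrom)
        amplicon_ids.add(amplicon)
    for gene, chroms in sorted(gene_chroms.items()):
        if len(chroms) > 1:
            problems.append(
                f"gene {gene} appears on several chromosomes: {', '.join(sorted(chroms))}")
    for name in sorted(set(gene_chroms) & amplicon_ids):
        problems.append(f"name {name} is used both as gene name and as amplicon ID")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for problem in validate_bed(handle):
            print(problem)
```

Run it with the path to the .bed file as the only argument; no output means none of the checks above failed. It does not check the BAM files or the reference genome.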
Old 07-05-2017, 09:20 AM   #12
arnoldliao
Junior Member
 
Location: San Diego CA

Join Date: Oct 2010
Posts: 5
Default Thank you

Merci for the reply. I got it to work with a correct bed file. I did get many NA values. I will email you separately about the issues.

Tags
amplicon sequencing, copy number analysis, gc-correction, normalization
