SEQanswers

Old 01-16-2012, 08:08 AM   #1
Robby
Member
 
Location: Germany

Join Date: Mar 2011
Posts: 68
CNV-Seq output

Dear all,
I used CNV-Seq to call my CNVs, but I don't understand the output file with the following columns:
chromosome, start, end, test, ref, position, log2, p.value, cnv, cnv.size, cnv.log2, cnv.p.value

What are the real CNV coordinates (start and end)? Is the start at "position" and the end at "position + cnv.size"?
What is the difference between cnv.p.value and p.value?
What is the meaning of cnv.log2 and log2?
What is the meaning of the values for "ref" and "test"?

Can somebody help me or recommend documentation that is easy to understand? I would be very grateful for any help.

Best regards
Robby
Old 01-17-2012, 01:22 AM   #2
Robby
Member
 
Location: Germany

Join Date: Mar 2011
Posts: 68

The values for "test" and "ref" are the numbers of reads starting in the region given in columns 2 and 3. Is that correct? So column 2 is just the start of the window and column 3 the end of the window, right?

For one CNV I have to look for the same ID in column 9, so CNV-Seq writes multiple lines for one CNV, correct?

But what are the exact start and end positions of the CNV in the example below? And I still don't understand the difference between log2 and cnv.log2, or between p.value and cnv.p.value. Can somebody help me or comment on that?


Quote:
"chr1" 2250641 2250864 159 100 2250752 0.722575091988798 8.10063503680068e-07 77 448 0.964667045392917 3.93033619539229e-35
"chr1" 2250753 2250976 171 87 2250864 1.02845734551634 3.42115799964058e-11 77 448 0.964667045392917 3.93033619539229e-35
"chr1" 2250865 2251088 161 71 2250976 1.2347180850891 2.18142864565488e-14 77 448 0.964667045392917 3.93033619539229e-35
"chr1" 2250977 2251200 107 60 2251088 0.888124717271795 4.18272787914529e-09 77 448 0.964667045392917 3.93033619539229e-35
Old 01-18-2012, 10:40 PM   #3
Robby
Member
 
Location: Germany

Join Date: Mar 2011
Posts: 68

I still have trouble understanding the output. Does really nobody understand it, or know of good documentation?
Old 04-19-2012, 02:15 AM   #4
ritzriya
Member
 
Location: Canada

Join Date: Jun 2010
Posts: 49

Robby,

Were you able to find an answer to your question? I am also confused about the output format. If you have figured anything out, please post it here.
Old 04-19-2012, 02:59 AM   #5
VanAxel
Junior Member
 
Location: germany

Join Date: Apr 2009
Posts: 3

Hello,
CNV-seq is based on an overlapping sliding-window method. Because the windows overlap, the output gives not only the "start" and "end" of each window but also its midpoint ("position"). It is up to you whether to take the start and end points of a CNV from the "start" and "end" columns or from the "position" column.
cnv.log2 is the mean of all log2 values for the called CNV.
How the p-values for the individual window and for the called CNV are computed is described in the paper (equations 4 and 6, respectively; PMID: 19267900).
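To make the relationship between the per-window columns and the per-CNV columns concrete, here is a minimal Python sketch that rebuilds one region from the four rows quoted in post #2 (cnv ID 77). The helper name `summarize_cnv` and the tuple layout are made up for illustration, and nothing here is taken from the CNV-seq code itself. It reports both coordinate choices, derives a size from the midpoints, and compares two ways of getting a per-region log2 with the reported cnv.log2.

```python
import math

# Rows for cnv ID 77 from post #2: (start, end, test, ref, position, log2)
ROWS = [
    (2250641, 2250864, 159, 100, 2250752, 0.722575091988798),
    (2250753, 2250976, 171, 87, 2250864, 1.02845734551634),
    (2250865, 2251088, 161, 71, 2250976, 1.2347180850891),
    (2250977, 2251200, 107, 60, 2251088, 0.888124717271795),
]
REPORTED_CNV_SIZE = 448
REPORTED_CNV_LOG2 = 0.964667045392917


def summarize_cnv(rows):
    """Summarize the windows that share one cnv ID (hypothetical helper)."""
    starts = [r[0] for r in rows]
    ends = [r[1] for r in rows]
    positions = [r[4] for r in rows]
    log2s = [r[5] for r in rows]
    step = positions[1] - positions[0]  # spacing between window midpoints
    # Difference between the reported log2 and the raw log2(test/ref) of the
    # first window. Treating it as a constant normalization term is an
    # assumption made for this sketch.
    offset = log2s[0] - math.log2(rows[0][2] / rows[0][3])
    pooled = math.log2(sum(r[2] for r in rows) / sum(r[3] for r in rows)) + offset
    return {
        "by_window": (min(starts), max(ends)),
        "by_midpoint": (min(positions), max(positions)),
        "size_from_midpoints": max(positions) - min(positions) + step,
        "mean_log2": sum(log2s) / len(log2s),
        "pooled_log2": pooled,
    }


summary = summarize_cnv(ROWS)
for key, value in summary.items():
    print(key, value)
```

For these rows the midpoint span plus one step gives 448, which matches the reported cnv.size. The plain mean of the four log2 values is about 0.9685, close to but not identical to the reported cnv.log2 of about 0.9647, while the pooled read-count ratio with the same offset lands on about 0.9647. So in this example at least, "mean" looks like a good approximation rather than the exact formula.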

If you have any further questions don't hesitate to ask.
VanAxel
Old 03-15-2013, 09:58 AM   #6
Ayush_Saxena
Junior Member
 
Location: Gainesville, Florida

Join Date: Jan 2013
Posts: 3

Is there any way to report a bug in CNV-Seq? I tried searching for one but couldn't find it.

For some strange reason, the software is not calling CNVs even when all conditions are met. I used a log2 threshold of 0.8 and a window size of only 2 so that I can look at the result by eye and judge which ones to pick.

"CHROMOSOME_II" 1450045 1451171 404 416 1450608 0.881401332001257 2.29121717076986e-13 0 NA NA NA
"CHROMOSOME_II" 1450609 1451735 570 464 1451172 1.22046668131509 1.31845290945157e-22 0 NA NA NA
"CHROMOSOME_II" 1451173 1452299 629 550 1451736 1.11725796585782 1.18360892978282e-19 0 NA NA NA
"CHROMOSOME_II" 1451737 1452863 671 657 1452300 0.954048963268409 3.24101505961822e-15 0 NA NA NA
"CHROMOSOME_II" 1452301 1453427 602 577 1452864 0.984821735504775 5.02877099034994e-16 0 NA NA NA
"CHROMOSOME_II" 1452865 1453991 513 516 1453428 0.915217327574355 3.2389370038797e-14 0 NA NA NA
"CHROMOSOME_II" 1453429 1454555 590 516 1453992 1.1169734562165 1.20564598087743e-19 0 NA NA NA
"CHROMOSOME_II" 1453993 1455119 551 465 1454556 1.16845116996632 4.16186035989176e-21 0 NA NA NA
"CHROMOSOME_II" 1454557 1455683 374 369 1455120 0.943047021217795 6.25787394726544e-15 0 NA NA NA
"CHROMOSOME_II" 1455121 1456247 415 422 1455684 0.899497904917657 8.08876111589643e-14 0 NA NA NA
"CHROMOSOME_II" 1455685 1456811 615 573 1456248 1.02568083886025 4.03369375071261e-17 0 NA NA NA
"CHROMOSOME_II" 1456249 1457375 666 575 1456812 1.13558978863008 3.59196486144447e-20 0 NA NA NA
"CHROMOSOME_II" 1456813 1457939 663 565 1457376 1.15438757020059 1.04963378485956e-20 0 NA NA NA
"CHROMOSOME_II" 1457377 1458503 609 524 1457940 1.14050498375944 2.60576418573692e-20 0 NA NA NA
"CHROMOSOME_II" 1457941 1459067 410 408 1458504 0.930684324924505 1.30371091397085e-14 0 NA NA NA

This entire >8 kb region satisfies all the conditions set for a CNV to be called, and it's still not called, which makes me a little skeptical about the software itself. Can anyone list some other reliable alternatives?
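The by-eye check described above can be repeated mechanically. The following Python sketch takes the start, end and log2 columns of the 15 rows listed in this post and confirms that every window exceeds the 0.8 threshold, that the windows are evenly spaced, and how long the region is. The helper `check_region` is hypothetical, and CNV-seq's own calling logic is not modelled here; this only verifies the numbers in the listing.

```python
LOG2_THRESHOLD = 0.8  # value quoted in the post

# (start, end, log2) for the 15 windows listed above
WINDOWS = [
    (1450045, 1451171, 0.881401332001257),
    (1450609, 1451735, 1.22046668131509),
    (1451173, 1452299, 1.11725796585782),
    (1451737, 1452863, 0.954048963268409),
    (1452301, 1453427, 0.984821735504775),
    (1452865, 1453991, 0.915217327574355),
    (1453429, 1454555, 1.1169734562165),
    (1453993, 1455119, 1.16845116996632),
    (1454557, 1455683, 0.943047021217795),
    (1455121, 1456247, 0.899497904917657),
    (1455685, 1456811, 1.02568083886025),
    (1456249, 1457375, 1.13558978863008),
    (1456813, 1457939, 1.15438757020059),
    (1457377, 1458503, 1.14050498375944),
    (1457941, 1459067, 0.930684324924505),
]


def check_region(windows, threshold):
    """Count windows over the threshold, collect start spacings, measure the span."""
    passing = [w for w in windows if abs(w[2]) >= threshold]
    steps = {b[0] - a[0] for a, b in zip(windows, windows[1:])}
    span = windows[-1][1] - windows[0][0] + 1
    return len(passing), steps, span


n_passing, steps, span = check_region(WINDOWS, LOG2_THRESHOLD)
print(n_passing, "of", len(WINDOWS), "windows pass the log2 threshold")
print("spacing between window starts:", steps)
print("span in bp:", span)
```

All 15 windows pass, the starts are 564 bp apart throughout, and the region spans 9023 bp, so the listing does look like one uninterrupted run of windows above the threshold.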
Old 03-25-2014, 08:03 AM   #7
Marevilla
Junior Member
 
Location: Barcelona

Join Date: Mar 2014
Posts: 1
Error in cnv output (cnv-seq)

Hi, when I take the .cnv output from cnv-seq.pl and try to plot it with library(cnv) from CNV-seq in R, I get an error like the one below. I am frustrated because I generated 7 files in the same way, and some of them work fine while others do not. Here is the error from R:

> library(cnv)
> data <- read.delim("my_data.cnv")
> cnv.print(data)
cnv chromosome start end size log2 p.value
CNVR_1 chr Inf -Inf -Inf
CNVR_0 chrY chrX chrMT chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8
chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 398
315321616 315321219 NA NA
Warning messages:
1: In min(sub$start) : no non-missing arguments to min; returning Inf
2: In min(sub$position) : no non-missing arguments to min; returning Inf
3: In max(sub$end) : no non-missing arguments to max; returning -Inf
4: In max(sub$position) : no non-missing arguments to max; returning -Inf


Thank you very much in advance!
Old 03-25-2014, 08:49 AM   #8
VanAxel
Junior Member
 
Location: germany

Join Date: Apr 2009
Posts: 3

Hi Marevilla,
I think your data set is buggy, because the normal output should look like this:
Quote:
CNV chromosome start end size log2 p.value type
CNVR_1 chr7 1006251 1018750 12500 0.675152608894017 0.000319481853259238 Gain
CNVR_2 chr7 2181251 2193750 12500 0.670496246462795 0.000349065546251931 Gain
CNVR_3 chr7 2356251 2368750 12500 0.607797341092252 0.00110856198941488 Gain
CNVR_4 chr7 3318751 3331250 12500 0.61291741275106 0.00101142355485097 Gain
CNVR_5 chr7 5231251 5243750 12500 0.600241366643895 0.00126807991100568 Gain
CNVR_6 chr7 6056251 6068750 12500 0.619870173392827 0.000892305035685526 Gain
CNVR_7 chr7 6643751 6656250 12500 0.752320469416477 6.99749167890688e-05 Gain
CNVR_8 chr7 7493751 7518750 25000 0.648214799086789 9.58225619127747e-07 Gain
CNVR_9 chr7 7893751 7918750 25000 0.622685188999881 2.37316926738251e-06 Gain
Maybe you need to check the output of the earlier steps of your pipeline, as they seem to be buggy, too.

The output one step before the above one should look like this:
Quote:
chromosome start end test ref position log2 p.value cnv cnv.size cnv.log2 cnv.p.value
chr1 1 25000 91 139 12500 -0.406313758410827 0.0149832381084390 0 NA NA NA
chr1 12501 37500 142 228 25000 -0.478310220546076 0.00561464244688826 0 NA NA NA
chr1 25001 50000 149 202 37500 -0.234210288175650 0.102083476954782 0 NA NA NA
chr1 37501 62500 135 161 50000 -0.0492686069498024 0.393692513368816 0 NA NA NA
chr1 50001 75000 111 131 62500 -0.0341744610733607 0.425762996389164 0 NA NA NA
chr1 62501 87500 83 94 75000 0.0252832537832712 0.444849658758006 0 NA NA NA
chr1 75001 100000 87 88 87500 0.188344551325415 0.150648136271992 0 NA NA NA
chr1 87501 112500 117 122 1e+05 0.144460056134502 0.213854859246385 0 NA NA NA
chr1 100001 125000 121 118 112500 0.241052862026737 0.0931423468736603 0 NA NA NA
regards
VanAxel
Old 05-13-2014, 09:07 PM   #9
younko
Member
 
Location: USA

Join Date: May 2014
Posts: 24

Hello, VanAxel

I have one additional question. I got exactly the same kind of output file as the one you showed:

chromosome start end test ref position log2 p.value cnv cnv.size cnv.log2 cnv.p.value
chr1 1 25000 91 139 12500 -0.406313758410827 0.0149832381084390 0 NA NA NA
chr1 12501 37500 142 228 25000 -0.478310220546076 0.00561464244688826 0 NA NA NA
chr1 25001 50000 149 202 37500 -0.234210288175650 0.102083476954782 0 NA NA NA
chr1 37501 62500 135 161 50000 -0.0492686069498024 0.393692513368816 0 NA NA NA
chr1 50001 75000 111 131 62500 -0.0341744610733607 0.425762996389164 0 NA NA NA
chr1 62501 87500 83 94 75000 0.0252832537832712 0.444849658758006 0 NA NA NA
chr1 75001 100000 87 88 87500 0.188344551325415 0.150648136271992 0 NA NA NA
chr1 87501 112500 117 122 1e+05 0.144460056134502 0.213854859246385 0 NA NA NA
chr1 100001 125000 121 118 112500 0.241052862026737 0.0931423468736603 0 NA NA NA


But how can I get from this file to the final file you showed?

CNV chromosome start end size log2 p.value type
CNVR_1 chr7 1006251 1018750 12500 0.675152608894017 0.000319481853259238 Gain
CNVR_2 chr7 2181251 2193750 12500 0.670496246462795 0.000349065546251931 Gain
CNVR_3 chr7 2356251 2368750 12500 0.607797341092252 0.00110856198941488 Gain
CNVR_4 chr7 3318751 3331250 12500 0.61291741275106 0.00101142355485097 Gain
CNVR_5 chr7 5231251 5243750 12500 0.600241366643895 0.00126807991100568 Gain
CNVR_6 chr7 6056251 6068750 12500 0.619870173392827 0.000892305035685526 Gain
CNVR_7 chr7 6643751 6656250 12500 0.752320469416477 6.99749167890688e-05 Gain
CNVR_8 chr7 7493751 7518750 25000 0.648214799086789 9.58225619127747e-07 Gain
CNVR_9 chr7 7893751 7918750 25000 0.622685188999881 2.37316926738251e-06 Gain

Could you please let me know how to produce the final file?
Old 05-14-2014, 01:15 AM   #10
younko
Member
 
Location: USA

Join Date: May 2014
Posts: 24

Never mind, I found the way.

cnv.print(data, file="xxx") saves the result in the format you showed, though it does not include the "type" column (e.g. "Gain").
Old 05-14-2014, 01:46 AM   #11
younko
Member
 
Location: USA

Join Date: May 2014
Posts: 24
CNV-seq output error

Hello, VanAxel

Sorry for posting repeatedly. I am still playing with CNV-seq.
In the cnv file, most lines have NA for cnv.size, cnv.log2 and cnv.p.value, as in your post, but some lines have non-NA values in these columns.

In this case, I have no problem running cnv.print(data), even though the output looks a little awkward.

BUT for some analyses I got exactly the same error as Marevilla. When I looked at the cnv file, I found that the cnv.size, cnv.log2 and cnv.p.value columns are NA on every line; those columns contain nothing but "NA". In this case, cnv.summary(data) returns:

CNV percentage in genome: 0%
CNV nucleotide content: 0
CNV count: 0
Mean size: NaN
Median size: NA
Max Size: -Inf
Min Size: Inf
Warning messages:
1: In max(true$cnv.size) : no non-missing arguments to max; returning -Inf
2: In min(true$cnv.size) : no non-missing arguments to min; returning Inf


It is really weird, because I did exactly the same thing for both datasets!

Could somebody please help with this?
Old 07-09-2014, 08:08 PM   #12
xiechao
Junior Member
 
Location: singapore

Join Date: Mar 2009
Posts: 9

The last three columns refer to the CNV region, so if a sliding window is not part of a CNV region, the values will be NA.
If all values in these three columns are NA, it means that no CNV region passed the criteria.
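A quick way to tell "no region passed the criteria" apart from a malformed file is to look at the cnv column before calling cnv.print or cnv.summary. The Python sketch below is only an illustration: `called_cnv_ids` is a made-up helper, the sample text is copied from rows earlier in this thread, and splitting on whitespace is an assumption that works for the rows as pasted here.

```python
HEADER = ("chromosome start end test ref position log2 p.value "
          "cnv cnv.size cnv.log2 cnv.p.value")

# Two windows from post #8: no CNV region assigned (cnv = 0, rest NA)
NO_CNV = HEADER + """
chr1 1 25000 91 139 12500 -0.406313758410827 0.0149832381084390 0 NA NA NA
chr1 12501 37500 142 228 25000 -0.478310220546076 0.00561464244688826 0 NA NA NA
"""

# One window from post #2: part of CNV region 77
WITH_CNV = HEADER + """
"chr1" 2250641 2250864 159 100 2250752 0.722575091988798 8.10063503680068e-07 77 448 0.964667045392917 3.93033619539229e-35
"""


def called_cnv_ids(text):
    """Return the set of non-zero cnv IDs found in a CNV-seq-style table."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cnv_col = lines[0].split().index("cnv")
    ids = set()
    for ln in lines[1:]:
        value = ln.split()[cnv_col]
        if value not in ("0", "NA"):
            ids.add(int(value))
    return ids


print(called_cnv_ids(NO_CNV))    # empty set: nothing for cnv.print to report
print(called_cnv_ids(WITH_CNV))  # contains the ID 77
```

If the set is empty, the Inf/-Inf warnings from cnv.print and cnv.summary shown earlier in the thread are what you would expect to see, since there are no regions to summarize.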
Old 08-09-2014, 11:14 AM   #13
Pepe
Member
 
Location: Germany

Join Date: Mar 2009
Posts: 28

Hi all,

I am not sure whether Marevilla and Ayush_Saxena are still interested, or whether the developers are around.
I was having the same problem as some of you:
I ran CNV-seq on multiple BAM files, and some returned no significant regions while others did.
After some hours with the R package, I found a bug in the window-size estimate used when calling the internal cnv.ANNO function. To fix it:
- Open the file "02.1.cnv.R"
- Find the following line (line 59 in my version): step <- window.size/2;
- Add this line below it: step <- ceiling(step)
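Why rounding matters can be seen from the rows already posted in this thread. The Python sketch below assumes that the window size is end - start + 1 (how the R package actually derives window.size is not shown here) and compares half of that size, with and without rounding up, against the spacing between consecutive window starts in the files. The helper name `half_window_steps` is hypothetical.

```python
import math


def half_window_steps(start, end):
    """Return (window_size, raw_half, rounded_half) for one window.

    Assumes inclusive coordinates, i.e. window_size = end - start + 1.
    """
    window_size = end - start + 1
    raw_half = window_size / 2
    return window_size, raw_half, math.ceil(raw_half)


# (start, end, start of the next window) taken from the first two rows
# of the listings in post #6 and post #2
EXAMPLES = {
    "post #6": (1450045, 1451171, 1450609),
    "post #2": (2250641, 2250864, 2250753),
}

for label, (start, end, next_start) in EXAMPLES.items():
    window_size, raw_half, rounded_half = half_window_steps(start, end)
    observed = next_start - start
    print(label, "window:", window_size, "raw step:", raw_half,
          "rounded step:", rounded_half, "observed spacing:", observed)
```

The windows in post #6 are 1127 bp long, so the unrounded step would be 563.5 while the windows in the file are 564 bp apart; the windows in post #2 are 224 bp long and both versions give 112. Under the stated assumption, that would fit the pattern of some datasets working and others returning no regions.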
Old 08-11-2014, 11:33 PM   #14
xiechao
Junior Member
 
Location: singapore

Join Date: Mar 2009
Posts: 9

Quote:
Originally Posted by Pepe
Hi all,

I am not sure whether Marevilla and Ayush_Saxena are still interested, or whether the developers are around.
I was having the same problem as some of you:
I ran CNV-seq on multiple BAM files, and some returned no significant regions while others did.
After some hours with the R package, I found a bug in the window-size estimate used when calling the internal cnv.ANNO function. To fix it:
- Open the file "02.1.cnv.R"
- Find the following line (line 59 in my version): step <- window.size/2;
- Add this line below it: step <- ceiling(step)
Thank you very much for the bugfix. The files are updated.
All times are GMT -8.