SEQanswers

Go Back   SEQanswers > Literature Watch



Similar Threads
Thread Thread Starter Forum Replies Last Post
PubMed: Detection of somatic copy number alterations in cancer using targeted exome c Newsbot! Literature Watch 0 04-03-2012 04:30 PM

Reply
 
Thread Tools
Old 05-11-2015, 04:15 PM   #21
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

Hi wisekh,

The LOH functionality in CNVkit is described here:
http://cnvkit.readthedocs.org/en/lat...s.html#scatter

However, the "calls" are simply displayed visually -- the variant allele frequencies are plotted alongside the copy ratios, and a shift in VAF from 0.5 indicates LOH. I'm currently working on expanding this functionality to make it more useful.

To run the complete pipeline with a tumor-normal pair and make a plot of the copy number and LOH shifts together, follow the quick start guide here:
http://cnvkit.readthedocs.org/en/latest/quickstart.html

Separately from running the CNVkit "batch" pipeline, you'll need to call SNPs in the tumor sample in VCF format. Then use that VCF file as input to the CNVkit "scatter" command along with the .cnr and .cns files from the CNVkit pipeline to make the plot.

Hope that helps,
Eric
etal is offline   Reply With Quote
Old 05-11-2015, 09:08 PM   #22
wisekh
Junior Member
 
Location: United States

Join Date: Oct 2011
Posts: 5
Default

I got it. Thank you very much for your quick response.

hoon
wisekh is offline   Reply With Quote
Old 05-12-2015, 08:24 PM   #23
wisekh
Junior Member
 
Location: United States

Join Date: Oct 2011
Posts: 5
Default using a mapping quality threshold

Dear Eric,
I have another question. Is it possible to use a mapping quality threshold to, for example, a tumor bam or multiple tumor bams that I would infer copy number from?

thank you,

Hoon
wisekh is offline   Reply With Quote
Old 05-14-2015, 10:08 AM   #24
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

The mapping quality threshold is hard-coded to only exclude unmapped or ambiguously mapped reads, see here:
https://github.com/etal/cnvkit/blob/...verage.py#L115

Just change the -Q value to another integer if you'd like to try it yourself. However, I recall reading (can't find the reference at the moment) that keeping low-MAPQ reads did not harm copy number estimation and may have improved it.

In any case, in CNVkit the script genome2access.py can be used to directly exclude poorly mappable genomic regions. This is done already for hg19 in the bundled file access-5k-mappable.hg19.bed.
etal is offline   Reply With Quote
Old 03-08-2016, 04:25 AM   #25
oxydeepu
Member
 
Location: bangalore,india

Join Date: Jul 2011
Posts: 41
Default Sequence IDs don't match with bed Error CNVkit.

Hi all,

I trying to run CNVkit with tumor and normal samples on exome sequencing. But I tried all possible things mentioned in the docs. I tried different bed file input as the target region and also the annotations but I always end up getting the error.
ValueError: BED file 'results/S04380110_Covered.target.bed' sequence IDs don't match any in BAM file.

Anybody came across this issue. Please help.

Thank you in advance.

Deepak
oxydeepu is offline   Reply With Quote
Old 03-08-2016, 07:33 AM   #26
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

Hi Deepak,

The sequence IDs are the chromosome names in your reference genome and the first column of your BED file. For the human genome the first chromosome might be "1" or "chr1" depending on where you got your reference genome.

Check that the name schemes match between your BED and BAM files, either "1" or "chr1". You can use "samtools view -H" to see the BAM header. If the names don't match, then you should edit your BED file, either adding or removing the "chr" prefix, so that they do match.

Hope that helps,
Eric
etal is offline   Reply With Quote
Old 03-09-2016, 01:30 PM   #27
DrWorm
Member
 
Location: St. Louis, MO

Join Date: Apr 2013
Posts: 17
Default errors in production of cns file

Hello all,

I'm having issues working through the install of CNVkit. I've pinpointed my issues to problems with the segment sub-command. When I try to run the test procedure, it dies with this error:

Traceback (most recent call last):
File "../cnvkit.py", line 11, in <module>
args.func(args)
File "/home/snmcnulty/bin/cnvkit-master/cnvlib/commands.py", line 666, in _cmd_segment
rlibpath=args.rlibpath)
File "/home/snmcnulty/bin/cnvkit-master/cnvlib/segmentation/__init__.py", line 58, in do_segmentation
segarr = cnarr.as_dataframe(seg2cns(seg_out))
File "/home/snmcnulty/bin/cnvkit-master/cnvlib/segmentation/__init__.py", line 143, in seg2cns
+ seg_text)
ValueError: Segmentation output is not valid SEG format:

[then a big table]

Makefile:57: recipe for target 'build/p2-5_5.cns' failed
make: *** [build/p2-5_5.cns] Error 1

If anyone could help me pinpoint the problem, I would really appreciate it!
DrWorm is offline   Reply With Quote
Old 03-09-2016, 03:25 PM   #28
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

Sorry for the trouble. Could you post the first few lines of the output table and the version of CNVkit you're using (try "cnvkit.py version")?

If you're able to load the PSCBS package in R, could you post that package's version too?

If the default CBS method is failing, as an alternative you can try the "haar" method, which does not rely on R at all:

cnvkit.py segment MySample.cnr -m haar

(Use the -t parameter to specify the significance threshold, which might need to be very small for exome sequencing data to segment well.)
etal is offline   Reply With Quote
Old 03-10-2016, 07:39 AM   #29
oxydeepu
Member
 
Location: bangalore,india

Join Date: Jul 2011
Posts: 41
Default

Hi Eric,

I have solved the problems with the reference. But now the problem seems to be different at the step of calculating coverage. This is the error I get.

#######################################

cnvkit.py coverage /media/ae5197fd-42a8-4d8f-96d6-1c8762b4d8ae/MM02-N_MM02-020200/colossus_analysis/BAM/MM02-020200_final.bam my_targets.bed -o .targetcoverage.cnn
Processing reads in MM02-020200_final.bam
Traceback (most recent call last):
File "/usr/local/bin/cnvkit.py", line 4, in <module>
__import__('pkg_resources').run_script('CNVkit==0.7.9.dev0', 'cnvkit.py')
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 726, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1491, in run_script
exec(script_code, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/CNVkit-0.7.9.dev0-py2.7.egg/EGG-INFO/scripts/cnvkit.py", line 11, in <module>

File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 434, in _cmd_coverage

File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 457, in do_coverage

File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 37, in interval_coverages
File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 110, in interval_coverages_pileup
File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 143, in bedcov
RuntimeError: Bad line from bedcov:
1

############################

Can you shed some light into the issue?

Thank you.

Regards,
Deepak
oxydeepu is offline   Reply With Quote
Old 03-10-2016, 12:20 PM   #30
DrWorm
Member
 
Location: St. Louis, MO

Join Date: Apr 2013
Posts: 17
Default segmentation, ctd.

M CNVKit version is 0.7.8. R is installed, and I'm using PSCBS v0.61.0. Here's the output I get from the segmentation command.

to connect to socket /tmp/dbus-Hufo1aFYF5: Connection refused
Dropped 1 outlier bins:
chromosome start end gene log2 weight
0 chr4 46638882 46738865 Background -4.1499 0.167756
Traceback (most recent call last):
File "/opt/apps/cnvkit/0.7.8/bin/cnvkit.py", line 11, in <module>
args.func(args)
File "/opt/apps/cnvkit/0.7.8/lib/python2.7/site-packages/cnvlib/commands.py", line 666, in _cmd_segment
rlibpath=args.rlibpath)
File "/opt/apps/cnvkit/0.7.8/lib/python2.7/site-packages/cnvlib/segmentation/__init__.py", line 58, in do_segmentation
segarr = cnarr.as_dataframe(seg2cns(seg_out))
File "/opt/apps/cnvkit/0.7.8/lib/python2.7/site-packages/cnvlib/segmentation/__init__.py", line 143, in seg2cns
+ seg_text)
ValueError: Segmentation output is not valid SEG format:
WARNING: ignoring environment value of R_HOME
"sampleName" "chromosome" "start" "end" "nbrOfLoci" "mean"
"19987_main" "chr1" 93708 2582834 32 0.0317406246246301
"19987_main" "chr1" 2684720 6239917 34 -0.392436669374812
"19987_main" "chr1" 6239917 26177167 253 0.0239173699112572
"19987_main" "chr1" 26177167 29477537 33 0.379337119869004

Also, you were right -- using a non-R workflow completed perfectly.
DrWorm is offline   Reply With Quote
Old 03-10-2016, 12:35 PM   #31
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

Deepak: I haven't been able to replicate the error. Are you sure you used a valid BED file, i.e. my_targets.bed is a 4-column BED file that you've preprocessed with "cnvkit.py target"? The error message suggests you might have given a 1-column list of chromosome IDs instead.
etal is offline   Reply With Quote
Old 03-10-2016, 12:41 PM   #32
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

DrWorm: The SEG parser handles 5- or 6-column tables like the one you've posted. I wonder if the ValueError is due to the "WARNING" message from R getting fed to the SEG parser in CNVkit, just before the start of the usual table. If so, maybe you could disable that message from R and therefore prevent the CNVkit error by prefixing your command line to reset or unset R_HOME:

R_HOME="" cnvkit.py 19987_main.cnr
etal is offline   Reply With Quote
Old 03-11-2016, 08:18 AM   #33
oxydeepu
Member
 
Location: bangalore,india

Join Date: Jul 2011
Posts: 41
Default

Dear Eric,

Thank you for sparing your time. I tried the target and it gives a 4 field bed. But I gues the problem is coming when I use the BAM files.

###################

cnvkit.py batch /home/deepak/MM02-N_MM02-020200/colossus_analysis/BAM/MM02-020200_MarkDup.bam --normal /home/deepak/MM02-N_MM02-020200/colossus_analysis/BAM/MM02-N_MarkDup.bam --targets Agilent_b37_exome_covered.bed --split --annotate refFlatMod.txt --fasta /home/deepak/Ref/human_g1k_v37_decoy.fasta --access /home/deepak/src/cnvkit/data/access-5k-mappable.hg19.bed --output-reference my_reference.cnn --output-dir results/ --diagram --scatter

Processing reads in MM02-N_MarkDup.bam
Traceback (most recent call last):
File "/usr/local/bin/cnvkit.py", line 4, in <module>
__import__('pkg_resources').run_script('CNVkit==0.7.9.dev0', 'cnvkit.py')
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 726, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1491, in run_script
exec(script_code, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/CNVkit-0.7.9.dev0-py2.7.egg/EGG-INFO/scripts/cnvkit.py", line 11, in <module>

File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 97, in _cmd_batch

File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 172, in batch_make_reference

File "build/bdist.linux-x86_64/egg/cnvlib/parallel.py", line 14, in apply_async
File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 193, in batch_write_coverage

File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 457, in do_coverage

File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 37, in interval_coverages
File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 110, in interval_coverages_pileup
File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 143, in bedcov
RuntimeError: Bad line from bedcov:
1

#######################

But it does create two bed files.

$ head results/Agilent_b37_exome_covered.*

==> results/Agilent_b37_exome_covered.antitarget.bed <==
chr1 10500 93708 Background
chr1 93708 176917 Background
chr1 227917 267219 Background
chr1 318219 394543 Background
chr1 394543 470868 Background
chr1 521868 563949 Background
chr1 570871 671469 Background
chr1 671469 772067 Background
chr1 772067 872665 Background
chr1 872665 973263 Background

==> results/Agilent_b37_exome_covered.target.bed <==
1 65509 65625 -
1 65831 65973 -
1 69481 69600 OR4F5
1 721381 721519 -
1 721530 721806 -
1 721851 721942 -
1 752916 753035 FAM87B
1 762095 762275 LINC00115
1 762280 762414 LINC00115
1 762420 762565 LINC00115

I have no idea where it is going wrong.

Sorry for a messy and long mail.

Thank you.

Best,
Deepak
oxydeepu is offline   Reply With Quote
Old 03-11-2016, 11:09 AM   #34
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

Deepak, I notice your antitarget BED chromosomes names start with "chr" but your target BED chromosome names do not. Which names are in your BAM file? (I'm guessing "chr", like hg19, based on the generated antitarget BED file.) Either way, there will be a mismatch between chromosome names in the BAM and BED files at some point. I recommend checking (try "samtools view -H"), and if the "chr" prefix is used in the BAM, then add "chr" to the input target BED file.
etal is offline   Reply With Quote
Old 03-11-2016, 11:28 AM   #35
oxydeepu
Member
 
Location: bangalore,india

Join Date: Jul 2011
Posts: 41
Default

Hi Eric,

Thanks for the reply. I cross checked chr notations in bam and reference bed file. It is all without chr. But when it creates antitarget files it adds chr to the file and vice versa if I use a reference bed file with chr. I don't know why.

Best,
Deepak
oxydeepu is offline   Reply With Quote
Old 03-11-2016, 05:13 PM   #36
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

The antitarget BED file has "chr" because "access-5k-mappable.hg19.bed" has chr. You could build an access file for GRCh37 using "cnvkit.py access", but it would probably be quicker for you to just copy the hg19 one and delete the "chr" prefixes from each line.
etal is offline   Reply With Quote
Old 03-12-2016, 10:31 AM   #37
oxydeepu
Member
 
Location: bangalore,india

Join Date: Jul 2011
Posts: 41
Default

Dear Eric,


I tried all possible removal of chr as my BAM file is with just chromosome numbers.But still ended up getting same error as I mentioned above. Now the target and the anti target files have chromosome numbers without chr. I get this error now.

Wrote results/Agilent_b37_exome_covered.antitarget.bed with 54084 regions
Building a copy number reference from normal samples...
Processing reads in MM02-N_final.bam
Traceback (most recent call last):
File "/usr/local/bin/cnvkit.py", line 4, in <module>
__import__('pkg_resources').run_script('CNVkit==0.7.9.dev0', 'cnvkit.py')
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 726, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1491, in run_script
exec(script_code, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/CNVkit-0.7.9.dev0-py2.7.egg/EGG-INFO/scripts/cnvkit.py", line 11, in <module>

File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 97, in _cmd_batch

File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 172, in batch_make_reference

File "build/bdist.linux-x86_64/egg/cnvlib/parallel.py", line 14, in apply_async
File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 193, in batch_write_coverage

File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 457, in do_coverage

File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 37, in interval_coverages
File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 110, in interval_coverages_pileup
File "build/bdist.linux-x86_64/egg/cnvlib/coverage.py", line 143, in bedcov
RuntimeError: Bad line from bedcov:
1


Sorry for the trouble. I really need to make this work.. .

Best,
Deepak
oxydeepu is offline   Reply With Quote
Old 03-12-2016, 06:10 PM   #38
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

You can skip bedcov with the -c/--count-reads option; it uses a different method to calculate read depths but should give similar results.

Also, if you've successfully built the target and antitarget BED files, you can use those as inputs to the "batch" command and drop some of the command line options:

cnvkit.py batch ~/MM02-N_MM02-020200/colossus_analysis/BAM/MM02-020200_MarkDup.bam --normal ~/MM02-N_MM02-020200/colossus_analysis/BAM/MM02-N_MarkDup.bam --targets results/Agilent_b37_exome_covered.target.bed --antitargets results/Agilent_b37_exome_covered.antitarget.bed --fasta ~/Ref/human_g1k_v37_decoy.fasta --output-reference my_reference.cnn --output-dir results/ --diagram --scatter --count-reads
etal is offline   Reply With Quote
Old 03-13-2016, 08:46 PM   #39
etal
Member
 
Location: California

Join Date: Oct 2013
Posts: 23
Default

Deepak, another user reported this issue and found that it was due to a bug in Pysam 0.9:
https://github.com/etal/cnvkit/issues/86

Your original command should work for you if you are able to remove Pysam 0.9 and install Pysam 0.8.4 instead.
etal is offline   Reply With Quote
Old 03-14-2016, 02:01 AM   #40
oxydeepu
Member
 
Location: bangalore,india

Join Date: Jul 2011
Posts: 41
Default

Thank you Eric. I will try and let you know.

Best,
Deepak
oxydeepu is offline   Reply With Quote
Reply

Tags
cnv calling, copy number analysis, software, targeted resequencing

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 04:08 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO