**id0** · 09-11-2013, 10:49 AM

Originally posted by michaelb28

Quick question: has anybody tried running CONTRA on a VM? I am running it in an Ubuntu Linux environment, and I'm trying to find out whether that may be the source of my problem. I always receive this error, even though all of my scripts are in the correct directory.

Traceback (most recent call last):
  File "contra.py", line 569, in <module>
    main()
  File "contra.py", line 514, in main
    get_genome(params.TEST, genomeFile)
  File "/home/michaelb/Documents/Contra-II/CONTRA.v2.0.2/scripts/get_chr_length.py", line 31, in get_genome
    raw_header = subprocess.Popen(args, stdout = subprocess.PIPE).communicate()[0]
  File "/usr/local/lib/python2.6/subprocess.py", line 595, in __init__
    errread, errwrite)
  File "/usr/local/lib/python2.6/subprocess.py", line 1106, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

I realize this thread is over a year old, so this answer is probably not relevant to anyone who posted here. However, given that this is the only discussion of get_chr_length.py I could find anywhere online, I thought I should post my solution. It turns out that error is produced when samtools is missing.
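
A minimal sketch of how a missing executable surfaces as this error, written for Python 3 rather than the Python 2.6 in the traceback. The helper name `try_run` and the program name `no-such-program-xyz` are made up for illustration; this is not CONTRA's code.

```python
import errno
import subprocess


def try_run(args):
    """Run a command and return its stdout, or None if the executable is missing."""
    try:
        proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    except OSError as exc:
        # Errno 2 (ENOENT) here refers to the executable itself,
        # not to any input file named in the arguments.
        if exc.errno == errno.ENOENT:
            return None
        raise
    return proc.communicate()[0]


# A deliberately nonexistent program name (hypothetical) triggers the same OSError.
result = try_run(["no-such-program-xyz", "view", "-H", "sample.bam"])
print(result)
```

The point is that "No such file or directory" is raised before the child process ever starts, which is why the scripts being in the right directory makes no difference.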

If any CONTRA developers are reading this, I'd like to suggest adding a quick check at the beginning for all the prerequisites. It seems like most of the errors are due to either missing prerequisites or incorrect versions.
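
Something along these lines would probably do, as a rough sketch. The tool list is an assumption: samtools is the one that bit me, genomeCoverageBed and fastaFromBed are commands that show up in CONTRA's debug output, and Rscript is only a guess at how R gets invoked. The function name `missing_tools` is hypothetical.

```python
import shutil

# Assumed prerequisite list (not taken from CONTRA's documentation).
REQUIRED_TOOLS = ["samtools", "genomeCoverageBed", "fastaFromBed", "Rscript"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
print("Missing prerequisites:", ", ".join(missing) if missing else "none")
```

This only checks that the executables are on PATH; it does not check versions, which would need a per-tool test.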

**cecile75** · 01-10-2014, 02:06 AM

CONTRA on whole exome - problem with large bam file

Hello,

I just tried CONTRA for CNV detection on a paired tumor/control sample.

I use Ubuntu, Python 2.7.3, and R 2.15.

CONTRA worked well with the test sample included in the installation package.

However, when I try with my own bam file, it fails.
The error says that the process is stopped after "Getting the Log Ratio ...".

Here is the whole command line and stdout:

time /home/cecile/Documents/appli/CONTRA/CONTRA.v2.0.4/contra.py -f hg19.fa -c suite_analyse/GL_patient_GCCAAT_L006.reorder_contig.sort_by_coordinate.add_or_replace_group.mark_duplicates.indel_realignment.recal.bam -s suite_analyse/DIAG_patient_CTTGTA_L006.reorder_contig.sort_by_coordinate.add_or_replace_group.mark_duplicates.indel_realignment.recal.bam -t /home/cecile/Documents/data/capture/Agilent_SureSelectAllExonHumanv4/S03723314_Regions.bed -p -o output_CONTRA_param --sampleName WEA_DIAG --nomultimapped --minControlRdForCall 10 --minTestRdForCall 10 -l
target : /home/cecile/Documents/data/capture/Agilent_SureSelectAllExonHumanv4/S03723314_Regions.bed
test : suite_analyse/DIAG_patient_CTTGTA_L006.reorder_contig.sort_by_coordinate.add_or_replace_group.mark_duplicates.indel_realignment.recal.bam
control : suite_analyse/GL_patient_GCCAAT_L006.reorder_contig.sort_by_coordinate.add_or_replace_group.mark_duplicates.indel_realignment.recal.bam
fasta : hg19.fa
outfolder : output_CONTRA_param
numBin : [20]
minreaddepth : 10
minNBases : 10
sam : False
pval : 0.05
sampleName : WEA_DIAG
nomultimapped : True
plot : True
bedInput : False
minExon : 2000
largeDeletion : True
Creating Output Folder : Done.
Removing multi-mapped reads
Multi mapped reads removed.
Multi mapped reads removed.
Converting TEST Sample...
DEBUG 123 genomeCoverageBed -ibam output_CONTRA_param/buf/test_reliable.BAM -bga -g output_CONTRA_param/buf/sample.Genome
Converting CONTROL Sample...
DEBUG 123 genomeCoverageBed -ibam output_CONTRA_param/buf/control_reliable.BAM -bga -g output_CONTRA_param/buf/sample.Genome
Getting targeted regions DOC...
chr1
chr10
chr11
chr12
chr13
chr14
Getting targeted regions DOC...
chr1
chr15
chr16
chr10
chr17
chr11
chr18
chr19
chr12
chr2
chr13
chr14
chr15
chr20
chr16
chr21
chr22
chr17
chr3
chr18
chr19
chr4
chr5
chr2
chr6
chr7
chr20
chr21
chr22
chr8
chr3
chr9
chrX
chr4
chrY
Targeted regions pre-processing: Done
chr5
chr6
chr7
chr8
chr9
chrX
chrY
Targeted regions pre-processing: Done
Test file read depth = 8701473580
Control file read depth = 8850452831
Pre-processing Completed.
Getting the Log Ratio ...
Processus arrêté [French: "Process stopped"]

real 296m18.786s
user 153m41.800s
sys 11m19.754s

Do you have any idea what causes this error? Are my BAM files (~18 GB) too large?

Thank you for your help!

**wolfpack14** · 01-10-2014, 05:27 AM

Looks like CONTRA locked up when it tried to calculate the log ratios. You may want to try increasing your available RAM, reducing your file size, or adjusting your input so that CONTRA doesn't have to read as many rows. That may involve increasing the minimum size acceptable for a read.

Can you post your computer specs?

**cecile75** · 01-10-2014, 05:48 AM

Here are my computer specs:

4 processors, each like this:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
stepping : 7
microcode : 0x70b
cpu MHz : 1200.000
cache size : 10240 KB

MemTotal: 65902720 kB

My BAM files contain about 181,000,000 and 157,000,000 reads. When I remove duplicates, I can eliminate 17% of them. I can try again with those files.
Generally, how do you deal with large files? Do you perform a per-chromosome analysis?

Thank you, wolfpack14!

**wolfpack14** · 01-10-2014, 07:30 AM

Interesting. You would think 64GB of RAM would be enough for calculating log ratios on 180MM reads, but I guess not. CONTRA isn't exactly the most efficient piece of software.

We are in the process of putting samples through the algorithm to compare to other CNV packages. I will keep you posted if we run into issues.

There are quite a few CNV packages, including ExomeCNV, PennCNV, ADTEx, CNV-Seq, CNVer and CoNIFER. I would explore all of your options before settling on one algorithm to implement. That said, CONTRA is nice because it is well integrated into NGS hardware and output formatting, but that won't stop us from using a better algorithm.

**wolfpack14** · 01-15-2014, 05:40 AM

Found another bug. If CONTRA runs and forces your data into 1 bin, you must launch CONTRA with the option --numBin 1. If you don't, the application will fail.

It is a bug in how cn_analysis.v3.R handles bin sizes of 1. It doesn't carry over the actual bin size of your data, but rather the specified bin size from the application launch (20 by default).

**wolfpack14** · 01-27-2014, 11:04 AM

cecile,

I think your BAM files are too large. I am successfully running CONTRA on BAM files that are about 200-300MB. How big are the bins in your target BED file? That may also make a difference in performance.

**cecile75** · 01-28-2014, 05:06 AM

Hi wolfpack14,

Indeed, my BAM files are ~18 GB.
The intervals in my target BED file vary from 200 bp to 1000 bp. I didn't specify the --numBin option, so I ran CONTRA with the default (20).

I ran it after splitting my BAM files by chromosome (using bamtools split), and it also failed. The error was about the BAM file, which seemed to be malformed.

I did another test, splitting the file with samtools view -bh file.bam chr${num}, and it worked.
But there seems to be no significant result (nothing in CNATable.10rd.10bases.20bins.DetailsFILTERED), although there were 6439 targets.
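
For reference, the per-chromosome split can be scripted roughly as below. This is a sketch only: the helper names `split_commands` and `run_all` and the output naming scheme are made up, and it assumes the BAM is coordinate-sorted and indexed (samtools needs an index to extract a region). It passes -o to name each output file instead of redirecting stdout.

```python
import subprocess


def split_commands(bam_path, chromosomes, out_prefix):
    """Build one 'samtools view' command per chromosome."""
    commands = []
    for chrom in chromosomes:
        out_path = "{}.{}.bam".format(out_prefix, chrom)
        # -b writes BAM, -h keeps the header, -o names the output file.
        commands.append(
            ["samtools", "view", "-b", "-h", "-o", out_path, bam_path, chrom]
        )
    return commands


def run_all(commands):
    """Run each command in turn, raising on the first failure."""
    for cmd in commands:
        subprocess.check_call(cmd)


# hg19-style chromosome names, as printed in the CONTRA log earlier in the thread.
chroms = ["chr{}".format(n) for n in list(range(1, 23)) + ["X", "Y"]]
cmds = split_commands("file.bam", chroms, "file")
print(" ".join(cmds[0]))
```

Calling `run_all(cmds)` would then execute the 24 commands one after another; it is left out of the example so the sketch runs without samtools installed.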

**wolfpack14** · 01-28-2014, 07:23 AM

Have you tried playing around with the minExon setting? We have ours set at 100 so we can make appropriate bin sizes. If we left it at 2000 (the default), CONTRA had a tough time splitting the data up into enough bins.

We also forced our input BED intervals to width 20. We are getting pretty granular results from this combination.
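
One way to force intervals to a fixed width is to cut each target into consecutive windows. The sketch below is only an illustration of that idea and not necessarily how we did it; the function name `window_intervals` is hypothetical, and the last window of a target can be shorter than 20 bases when the target length is not a multiple of the width.

```python
def window_intervals(bed_lines, width=20):
    """Cut each BED interval into consecutive windows of at most `width` bases."""
    windows = []
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip header lines and anything without chrom/start/end columns.
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        for win_start in range(start, end, width):
            windows.append((chrom, win_start, min(win_start + width, end)))
    return windows


example = ["chr1\t100\t150\ttargetA"]
for chrom, start, end in window_intervals(example):
    print("{}\t{}\t{}".format(chrom, start, end))
```

For the 50-base example target this yields the windows 100-120, 120-140 and 140-150.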

**wolfpack14** · 01-30-2014, 10:42 AM

Just solved another issue we were having with using multiple input BED files on a CONTRA-built baseline.

You have to build a new baseline for each input bed you want to use. CONTRA will only work if the input bed file matches the one used to create the original baseline. I think it has to do with non-matching base pair intervals. I kept getting "list out of index" errors on the control sample.
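
If the cause really is non-matching intervals (that part is my guess), a quick comparison of the two BED files before launching would catch it. A minimal sketch, with hypothetical function names `read_intervals` and `bed_mismatch`:

```python
def read_intervals(bed_lines):
    """Collect (chrom, start, end) tuples from BED lines, ignoring extra columns."""
    intervals = set()
    for line in bed_lines:
        fields = line.split()
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            intervals.add((fields[0], int(fields[1]), int(fields[2])))
    return intervals


def bed_mismatch(baseline_lines, input_lines):
    """Return the intervals present in only one of the two BED inputs."""
    baseline = read_intervals(baseline_lines)
    current = read_intervals(input_lines)
    return sorted(baseline ^ current)


baseline_bed = ["chr1\t0\t20", "chr1\t20\t40"]
other_bed = ["chr1\t0\t20", "chr1\t20\t45"]
print(bed_mismatch(baseline_bed, other_bed))
```

An empty list means the two files describe the same set of intervals; anything else means a new baseline is needed for that BED file.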

**wolfpack14** · 01-30-2014, 10:47 AM

Cecile,

I think you forgot to specify that your input file is in BED format. I can tell because your output has this:

bedInput : False

Use the --bed parameter when you launch CONTRA. I think it will solve your problem.

If it doesn't solve your issue, try running without the -l option (CBS Large Variation detection). That is probably what is taking so long.

**nielsk** · 01-31-2014, 06:30 AM

Hi,

I'm trying to get CONTRA working.
However, using the test files I get the following error:

DEBUG 266b

fastaFromBed -fi reference/human_g1k_v37.fasta -bed testfix2/buf/CNATable.10rd.10bases.20bins.BED -fo testfix2/buf/CNATable.10rd.10bases.20bins.fastaOut.txt -name
Error: The requested bed file (testfix2/buf/CNATable.10rd.10bases.20bins.BED) could not be opened. Exiting!
Creating VCF file ...
testfix2/table/CNATable.10rd.10bases.20bins.vcf created.
Done...

Command used:

python CONTRA.v2.0.4/contrafix.py -t test_files/0247401_D_BED_20090724_hg19_MERGED.bed -s test_files/P0667T_GATKrealigned_duplicates_marked.bam -c test_files/P0667N_GATKrealigned_duplicates_marked.bam -f reference/human_g1k_v37.fasta -o testfix2

I was wondering if someone could help me to solve this error.

I am planning to determine CNV on IonTorrent PGM data.
Do you think CONTRA would be suitable for this?
Or do you suggest another program?

Nevertheless, I hope someone can help me to fix this!

Thanks in advance

**nielsk** · 02-04-2014, 05:53 AM

Are there still people using CONTRA?

**dnusol** · 11-13-2014, 07:49 AM

Hi nielsk,

Are you still running into this error?

If so, I am just trying out CONTRA for the first time, so this may be of no help to you.

It seems CONTRA cannot find your .bed file in the testfix2/buf/ directory. Can you check that your paths are correct? You seem to be executing CONTRA from your home dir or something similar, and you also have your test files and your reference in the same directory.

From my initial tests, CONTRA copies your target BED file to the buf directory and renames it target.BED.

But in your case CONTRA seems to be looking for the original name of the file. I wonder if copying the file manually into the buf directory will do the trick.
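
To see what is actually in the buf directory, something like this sketch would do. The directory and file names are taken from your error message; the helper name `check_buf` is made up.

```python
from pathlib import Path


def check_buf(out_dir, expected_name):
    """Report whether `expected_name` exists in <out_dir>/buf and list what is there."""
    buf_dir = Path(out_dir) / "buf"
    present = sorted(p.name for p in buf_dir.iterdir()) if buf_dir.is_dir() else []
    return expected_name in present, present


found, present = check_buf("testfix2", "CNATable.10rd.10bases.20bins.BED")
print("found:", found)
print("files in buf:", present)
```

Run it from the same directory you launch CONTRA from, since the output folder is given as a relative path.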

Anyway, have you managed to get any CNV results from your PGM data? It would be interesting to know if it works on single-end reads with a targeted resequencing experiment.

HTH

Dave

**DNAmethylome** · 08-14-2015, 01:11 PM

Hi Dr. Li,

I have some problems running CONTRA, and I have posted a thread here:

CONTRA error messages (need urgent help!) - SEQanswers

http://seqanswers.com/forums/showthread.php?t=62000

Could you please help me on that?

Thanks a lot!

-J
