**valeu** · 07-14-2014, 06:31 AM

There is a problem with this SNP file. Please try another one from the FREEC website.

**bhdavis1978** · 08-11-2014, 01:11 PM

Disadvantages of setting degree to 7 or 9, and of small window and step sizes

Hello,

Beyond the issue of additional running time, are there any reasons not to run control-freec with the degree setting set to 7 or 9 instead of 3 or 5?

Similarly, are there any problems with setting the window and step size relatively small (I was thinking of 500 bp and 250 bp respectively)?

Thanks

**valeu** · 08-11-2014, 01:32 PM

Originally posted by bhdavis1978

Beyond the issue of additional running time, are there any reasons not to run control-freec with the degree setting set to 7 or 9 instead of 3 or 5?

I did not try, but I think with degree >3 the result will be very similar to that with degree==3.

Similarly, are there any problems with setting the window and step size relatively small (I was thinking of 500 bp and 250 bp respectively)?
Thanks

The ideal window size depends on read density. It is good to have about 400 reads per window. Alternatively, FREEC can evaluate the window size automatically from the read count, assuming a Poisson distribution.

**bhdavis1978** · 08-11-2014, 01:42 PM

Hi Valeu,

I want to use the copy number data generated by Control-FREEC as input for a regression of copy number against other genomic and epigenetic features, so higher precision is very useful to me.

Assuming 30X coverage and a read length of 100 bp, a target of 400 reads per window suggests a minimum recommended window size of about 1333 bp (= 400 × 100 / 30). I was hoping to set the window size to about 500 bp, which would imply about 150 reads per window.

What would be the consequences of this? More variability in the copy number estimation? More breakpoints? Less confidence in identifying break points?
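
The arithmetic in the post above can be checked with a short sketch. It assumes reads are spread uniformly over the genome, so the expected read count per window is window size × coverage / read length; the function names are illustrative and are not part of Control-FREEC.

```python
def reads_per_window(window_bp, coverage, read_length_bp):
    """Expected number of reads per window, assuming uniform coverage."""
    return window_bp * coverage / read_length_bp


def window_for_target_reads(target_reads, coverage, read_length_bp):
    """Window size (bp) needed to reach a target read count per window."""
    return target_reads * read_length_bp / coverage


coverage = 30       # 30X, as in the post
read_length = 100   # bp, as in the post

# Window needed for about 400 reads per window
print(round(window_for_target_reads(400, coverage, read_length)))  # 1333

# Reads expected in a 500 bp window
print(reads_per_window(500, coverage, read_length))  # 150.0
```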

**valeu** · 08-12-2014, 12:17 AM

Originally posted by bhdavis1978

Hi Valeu,
What would be the consequences of this? More variability in the copy number estimation? More breakpoints? Less confidence in identifying break points?

More variability in the normalized read count signal => less confidence in breakpoints.

Anyway, you can try and then visually check the resulting profile.
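
To put a rough number on "more variability": if the read count per window is treated as Poisson (an idealization; real data are usually noisier because of GC content and mappability effects), the coefficient of variation is 1/sqrt(N) for N expected reads. The sketch below only illustrates that relation and does not reproduce FREEC's own normalization.

```python
import math


def poisson_cv(mean_reads):
    """Coefficient of variation of a Poisson count with the given mean."""
    return 1.0 / math.sqrt(mean_reads)


# 400 reads per window (the suggested density) vs. 150 (a 500 bp window at 30X)
for n in (400, 150):
    print(n, round(poisson_cv(n), 4))
# 400 0.05
# 150 0.0816
```

Under this assumption, going from 400 to 150 reads per window raises the relative noise from about 5% to about 8%, before any additional sources of variation.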

**tatinhawk** · 09-23-2014, 08:20 AM

Question about the _CNV output

Dear Valeu,

I would like to ask you something about the "_CNVs" output of Control-FREEC. I have a set of mouse cancer whole genomes that have been sequenced at high depth (~45X) using Illumina. I have used Control-FREEC to call CNVs on the samples as well as the BAF (using the set of SNPs identified by the mouse resequencing project on the same mouse strain). I noticed that in the "_CNVs" output file there are overlapping CNVs. For instance (as reported in the _CNVs output file):

1 2960000 3029999 2 normal AA 20.8697
1 2990000 3389999 8 gain AAAAABBB 5.57241
1 3350000 3499999 3 gain AAB 44.5596
1 3460000 3549999 11 gain AAAAAAAAABB 100
1 3510000 3739999 3 gain AAB 7.9066
1 11890000 12709999 3 gain AAB 2.14849
1 12670000 16909999 3 gain AAB 0.411016

In most of the cases that I have encountered so far, the overlapping CNV windows have either different predicted genotypes and copy numbers (as in the first example) or only different percentages of uncertainty of the predicted genotype.

In the former case I assume the overlapping CNVs are due to the prediction of different genotypes (is this correct?), and a filter by percentage of uncertainty would remove them. However, in the latter case the predicted genotypes and copy numbers are the same and the percentages of uncertainty are low as well.

Do you have any clues as to why this might be occurring? Also, would you recommend filtering out CNVs based on the percentage of uncertainty until one ends up with non-overlapping CNVs?

Thanks and I hope that you have a good day!

**valeu** · 09-23-2014, 08:43 AM

Originally posted by tatinhawk

I noticed that in the "_CNVs" output file there are overlapping CNVs.

FREEC uses overlapping windows to scan the genome (if step < window). This is why you may have overlapping predictions. The breakpoint should be located somewhere in the overlapping part.
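
As a minimal sketch of how one might locate these overlapping parts, the code below takes the first three records from the _CNVs excerpt above and reports the interval shared by each pair of consecutive calls. The `Cnv` tuple and `overlap_regions` function are hypothetical helpers, not part of Control-FREEC, and only the first four columns of the file are used.

```python
from typing import List, NamedTuple, Tuple


class Cnv(NamedTuple):
    chrom: str
    start: int
    end: int
    copy_number: int


def overlap_regions(cnvs: List[Cnv]) -> List[Tuple[str, int, int]]:
    """Return (chrom, start, end) of the overlap between consecutive calls."""
    ordered = sorted(cnvs, key=lambda c: (c.chrom, c.start))
    overlaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        # Two calls on the same chromosome overlap if the next one starts
        # before the previous one ends.
        if prev.chrom == cur.chrom and cur.start <= prev.end:
            overlaps.append((prev.chrom, cur.start, min(prev.end, cur.end)))
    return overlaps


records = [
    Cnv("1", 2960000, 3029999, 2),
    Cnv("1", 2990000, 3389999, 8),
    Cnv("1", 3350000, 3499999, 3),
]
print(overlap_regions(records))
# [('1', 2990000, 3029999), ('1', 3350000, 3389999)]
```

Each reported interval is where, per the reply above, the breakpoint between the two neighbouring calls should lie.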

**AnweshaM7** · 10-04-2014, 12:15 AM

Hi, I would like to download the SNP file (All Tracks > SNP130 if you're using hg18; for hg19 it's 131). Could you provide the hg18 SNP130 txt file? I checked UCSC but could not work out which filters to select. Secondly, how do I change the order of the columns? I checked the tutorial but could not find an option to do so.

Thanks
Anwesha

**valeu** · 10-05-2014, 04:56 AM

Originally posted by AnweshaM7

Hi, I would like to download the SNP file (All Tracks > SNP130 if you're using hg18; for hg19 it's 131). Could you provide the hg18 SNP130 txt file? I checked UCSC but could not work out which filters to select. Secondly, how do I change the order of the columns? I checked the tutorial but could not find an option to do so.

I will try to add it. But I assure you that the results will be the same as if you use hg18_snp130.SingleDiNucl.1based.txt

**shruti** · 11-12-2014, 06:56 AM

Hi,

I am running Control-FREEC on matched tumor/normal pairs from whole-exome sequencing.
However, for one sample I always get this error:

Initial guess for polynomial:
Error: variation in read count per window is too small.
Unable to proceed..
Wed Nov 12 14:41:11 GMT 2014

I have tried to increase the window size but still get the same problem. Last setting for window size was 1500.

The average coverage for the normal and tumor is 107x and 24x respectively.

I am a bit clueless here.. should I increase or decrease the window size?

Thanks

Regards
Shruti

**AnweshaM7** · 11-12-2014, 09:56 PM

Error while running Control-FREEC

Hi,
I have BAM files for my sample. I ran Control-FREEC (WGS) for all chromosomes and got _CNV output for them.
However, for chromosomes X and Y I get this error:
'Unable to proceed..
Try to rerun the program with higher number of reads'

The tumor data has 27x coverage (hg18).
I have tried window lengths of 1000, 1500 and 3000 but still get the same error.

I am not able to understand the reason for getting this error.

Thanks
Anwesha

**vd4mindia** · 02-05-2015, 02:34 AM

What should the parameters be for normal/tumor clone pairs with differing coverage?

I would like to discuss the samples I am using to infer CNVs from exome data with Control-FREEC. I am using WES tumor data. I have a tumor sample with a coverage of 70X (polyclonal) and its matched normal (blood) with the same coverage. I used window = 500 and step = 250 to infer the CNVs. I found 120 significant CNVs, with a median size of 42 kb for a region called as a CNV.

I am applying the same parameters to infer CNVs from my tumor-derived reprogrammed clones, which are sequenced at 35X since they are single clones; the normal control in that case is again the 70X blood sample. Can you suggest a window length for this? Should it be the same as for the tumor/normal pair? I ran it with the same window and found that the median size in bases is higher for the single-clone iPSCs than for the tumor. Should I double the window and step size for the single clone, or halve them? Also, since the normal blood is at 70X while the iPSC clone is at 35X, won't the results be spurious if I use the same window and step as for the tumor/normal pair, where both are at 70X? What would be the ideal window and step if the control has double the coverage of the tumor sample? Or is it preferable to use coefficientOfVariation? If so, what value of coefficientOfVariation should I use? I would also like to know which breakPointType and breakPointThreshold to use.

I am attaching the config file that I used for my normal/tumor pair (both at 70X coverage). I have used the same config file for the normal/tumor-iPSC pair (70X/35X coverage). The results look promising, but I am wondering whether I am compromising sensitivity. As far as I know, the read depths are normalized for both samples before the CNVs are calculated. Still, I would like some suggestions about which parameters I should change for differing normal/tumor depths.

Should I also use intercept = 0 and readCountThreshold >= 50 since it is WES data? I would appreciate suggestions on whether I am compromising sensitivity by keeping the parameters the same for normal/tumor and normal/iPSC, which have different coverage.

Code:

[general]

chrLenFile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/hs19_chr.len
window = 500

step = 250
ploidy = 2

outputDir = /scratch/GT/vdas/pietro/exome_seq/results/control_freec_out/output_S313_tumor/
BedGraphOutput=TRUE
breakPointType=4

gemMappabilityFile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/out100m1_hg19.gem

chrFiles =  /scratch/GT/vdas/test_exome/exome/

maxThreads=6

breakPointThreshold=1.5
noisyData=TRUE
printNA=FALSE
#breakPointThreshold = -.002;
#window = 50000
#chrFiles = hg18/hg18_per_chromosome
#outputDir = test
#degree=3
#intercept = 0

[sample]

mateFile = /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998.realigned.recal.bam
inputFormat = bam
mateOrientation = FR

[control]

mateFile = /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
inputFormat = bam
mateOrientation = FR

[BAF]

SNPfile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/hg19_snp137.SingleDiNucl.1based.txt
minimalCoveragePerPosition = 5

[target]

captureRegions = /scratch/GT/vdas/referenceBed/hg19/ss_v4/Exon_SSV4_clean.bed

**morrowliu** · 06-02-2015, 02:23 PM

Originally posted by smapdy

I ended up figuring out what was going on. I had some multiallelic variants in the .snp file that were causing it to fail to load, and my sex variable in the configuration file didn't match up with the actual sample sex which caused problems as well. I ended up dropping the sex argument and using the following general configuration file for my samples:
[general]
window = 8000
step = 2500
samtools = samtools
minCNAlength = 4
BedGraphOutput = TRUE
chrLenFile = NCBIM37_um.fa.len
chrFiles = chrfiles
outputDir = 31208T_31668N_FREEC_V1
printNA = FALSE
maxThreads = 6
ploidy = 2
breakPointType = 4
contaminationAdjustment = TRUE
noisyData = TRUE

[sample]
mateFile = 31208_EXOME.pileup.gz
inputFormat = pileup
mateOrientation = 0

[control]
mateFile = 31668_EXOME.pileup.gz
inputFormat = pileup
mateOrientation = 0

[target]
captureRegions = S0276129_Merged_Sorted_Probes.bed

[BAF]
SNPfile = snp128.singlebases.monoalleleic.freec_baf.txt
minimalCoveragePerPosition = 5

If anyone is interested I also have the commands I used to generate the pileups from the .bams, as well as the script I used to generate a working Mm9 and Mm10 .snp file.
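
The quoted post attributes the loading failure to multiallelic variants in the .snp file. As a rough sketch of that kind of clean-up (not the poster's actual script), the code below keeps only lines whose allele field lists exactly two single-base alleles. The tab-separated layout, the position of the allele column (`allele_col`), and the example lines are all assumptions made for illustration; check them against the real SNP file before use.

```python
VALID_BASES = {"A", "C", "G", "T"}


def is_biallelic_snv(alleles):
    """True for fields like 'A/G': exactly two alleles, each a single base."""
    parts = alleles.split("/")
    return len(parts) == 2 and all(p in VALID_BASES for p in parts)


def filter_snp_lines(lines, allele_col=2):
    """Keep tab-separated lines whose allele column is a biallelic SNV.

    allele_col is a hypothetical column index; adjust it to the file layout.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > allele_col and is_biallelic_snv(fields[allele_col]):
            kept.append(line)
    return kept


# Made-up example lines: one biallelic SNV, one multiallelic site, one indel
example = [
    "chr1\t3000100\tA/G\n",
    "chr1\t3000200\tA/C/T\n",
    "chr1\t3000300\t-/T\n",
]
print(len(filter_snp_lines(example)))  # 1
```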

Hi, Smapdy,

I am also working on a mouse project and want to use FREEC to call CNVs. However, when I use the Snp137 file I get the same error message that you mentioned above.
I realize it has been 2 years, but could you still send me the mm10 .snp file?

Thank you very much!
Best,
Yihua

**CLFougner** · 07-28-2016, 03:15 PM

Segmentation fault (core dumped)

Hi Valeu,

I'm trying to run Control-FREEC on mouse exome sequencing data, but I've run into an issue! It works fine when I run Control-FREEC without the BAF analysis, but when I enable it I get the error "Segmentation fault (core dumped)". I'm wondering if this is an issue you've run into before and if you know how to sort it out?

The full output from when I run Control-FREEC:

Code:

Control-FREEC v9.1 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
MT-mode using 4 threads
..Breakpoint threshold for segmentation of copy number profiles is 0.8
..telocenromeric set to 50000
..FREEC is not going to output normalized copy number profiles into a BedGraph file (for example, for visualization in the UCSC GB). Use "[general] BedGraphOutput=TRUE" if you want a BedGraph file
..FREEC is not going to adjust profiles for a possible contamination by normal cells
..Window = 0 was set
..Output directory:     /data2/christian/Sequencing/Output/
..Sample file:  /data2/christian/Sequencing/Output/DeduppedBams/123_14_6_correctRGs_mm10_BQSR.sorted.dedupped.bam
..Sample input format:  BAM
..will use this instance of samtools: 'samtools' to read BAM files
..Control file: /data2/christian/Sequencing/Output/DeduppedBams/123_14_8_correctRGs_mm10_BQSR.sorted.dedupped.bam
..Input format for the control file:    BAM
FREEC will create a pileup to compute BAF profile! 
...File with SNPs : /data2/christian/Sequencing/ReferenceFiles/hg19_snp142.SingleDiNucl.1based.bed
..Polynomial degree for "Sample ReadCount ~ Control ReadCount" normalization is 1
..Minimal CNA length (in windows) is 5
..File with chromosome lengths: /data2/christian/Sequencing/ReferenceFiles/mm10_chrom_lengths.fa
..Mappability and GC-content won't be used
..Control-FREEC won't use minimal mappability. All windows overlaping capture regions will be considered
..Mappability file/data2/christian/Sequencing/ReferenceFiles/GEM_mapp_GRCm38_68_mm10.gem be used: all low mappability positions will be discarded
..uniqueMatch = FALSE
..average ploidy set to 2
..break-point type set to 4
..noisyData set to 1
..minimal number of reads per window in the control sample is set to 10
Creating Pileup file to compute BAF profile...
..will increase flanking regions by 100 bp
Segmentation fault (core dumped)

My config file is as follows:

Code:

[general]
chrLenFile = /data2/christian/Sequencing/ReferenceFiles/mm10_chrom_lengths.fa
bedtools=/data2/christian/Sequencing/Frameworks/bedtools2/bedtools
ploidy = 2
gemMappabilityFile = /data2/christian/Sequencing/ReferenceFiles/GEM_mapp_GRCm38_68_mm10.gem
noisyData=TRUE
outputDir=/data2/christian/Sequencing/Output/
printNA=FALSE
samtools=samtools
window=0
telocentromeric=50000
breakPointType=4
breakpointThreshold=0.6
minCNAlength=5
maxThreads=4


[sample]
mateFile = /data2/christian/Sequencing/Output/DeduppedBams/123_14_6_correctRGs_mm10_BQSR.sorted.dedupped.bam
inputFormat = BAM
mateOrientation = FR


[control]
mateFile = /data2/christian/Sequencing/Output/DeduppedBams/123_14_8_correctRGs_mm10_BQSR.sorted.dedupped.bam
inputFormat = BAM
mateOrientation = FR

[BAF]
SNPfile=/data2/christian/Sequencing/ReferenceFiles/mm10_dbSNP137.ucsc.freec.txt
fastaFile=/data2/christian/Sequencing/ReferenceFiles/mm10.fa
makePileup=/data2/christian/Sequencing/ReferenceFiles/mm10_dbSNP137.ucsc.freec.bed
minimalCoveragePerPosition=5

[target]
captureRegions=/data2/christian/Sequencing/ReferenceFiles/S0276129/S0276129_AllTracks.bed

Specifically, the error disappears when I remove the 'makePileup=' line (although then the BAF analysis isn't performed). The file is generated according to the instructions on the FREEC website (awk-ing the SNP-file for mm10 that's posted on the website).

I'm running the analysis on exome data from mouse tumors, sequenced on an Illumina HiSeq in paired-end mode using the Agilent Mouse All Exon kit. The files have been aligned to mm10 using BWA-MEM and deduplicated with Picard. I'm running the analysis on Ubuntu (64-bit). I downloaded the Control-FREEC framework and the relevant SNP and mappability files from your website 2-3 days ago.

Any help is much appreciated!

**valeu** · 08-01-2016, 01:18 AM

Hi, I do not see any evident mistake in the config file. If you want me to debug it, please share your config and corresponding files with me. Valentina.Boeva%at%inserm.fr
