  • #61
    There is a problem with this SNP file. Please try another one from the FREEC website.


    • #62
      Disadvantages to setting degree to 7, or 9 and small window and step sizes

      Hello,

      Beyond the issue of additional running time, are there any reasons not to run control-freec with the degree setting set to 7 or 9 instead of 3 or 5?

      Similarly, are there any problems with setting the window and step size relatively small (I was thinking of 500 bp and 250 bp respectively)?

      Thanks


      • #63
        Originally posted by bhdavis1978 View Post
        Beyond the issue of additional running time, are there any reasons not to run control-freec with the degree setting set to 7 or 9 instead of 3 or 5?
        I did not try, but I think with degree >3 the result will be very similar to that with degree==3.
        Similarly, are there any problems with setting the window and step size relatively small (I was thinking of 500 bp and 250 bp respectively)?
        Thanks
        The ideal window size depends on read density. It is good to have about 400 reads per window. Alternatively, FREEC can choose the window size automatically from the read count, based on a Poisson distribution.


        • #64
          Hi Valeu,

          I want to be able to use the copy number data generated using control freec as input for a regression of copy number against other genomic and epigenetic features, so having higher precision is very useful to me.

          Assuming 30X coverage and a read length of 100 bp, a target of 400 reads per window suggests a minimum recommended window size of about 1333 bp (400 × 100 / 30). I was hoping to set the window size to about 500 bp, which would imply about 150 reads per window.

          What would be the consequences of this? More variability in the copy number estimation? More breakpoints? Less confidence in identifying break points?
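The arithmetic in this post can be written out as a small sketch. It assumes uniform coverage and counts a read as belonging to a window in proportion to its length; the function names are hypothetical and are not part of Control-FREEC.

```python
def reads_per_window(coverage, window_bp, read_length_bp):
    """Expected reads per window under uniform coverage (an idealisation)."""
    return coverage * window_bp / read_length_bp


def window_for_target_reads(target_reads, coverage, read_length_bp):
    """Window size (bp) needed to reach a target number of reads per window."""
    return target_reads * read_length_bp / coverage


# Values from the post: 30X coverage, 100 bp reads.
print(round(window_for_target_reads(400, 30, 100)))  # 1333
print(round(reads_per_window(30, 500, 100)))         # 150
```

Exome data will not follow this uniform model, since coverage is concentrated in the capture regions, so the numbers are only a rough guide there.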


          • #65
            Originally posted by bhdavis1978 View Post
            Hi Valeu,
            What would be the consequences of this? More variability in the copy number estimation? More breakpoints? Less confidence in identifying break points?
            More variability in the normalized read count signal => less confidence in breakpoints.

            Anyway, you can try and then visually check the resulting profile.
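To give a feel for "more variability", here is a minimal sketch that assumes read counts per window are Poisson distributed (the distribution mentioned earlier in the thread). Under that assumption the coefficient of variation of the count is 1/sqrt(expected reads). Real data are usually noisier than Poisson, so treat these values as a lower bound; the function name is hypothetical.

```python
import math


def poisson_cv(expected_reads):
    """Coefficient of variation (sd / mean) of a Poisson count."""
    return 1.0 / math.sqrt(expected_reads)


for n in (400, 150):
    print(n, round(poisson_cv(n), 4))
# 400 0.05
# 150 0.0816

# Relative increase in noise when going from 400 to 150 reads per window.
print(round(poisson_cv(150) / poisson_cv(400), 2))  # 1.63
```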


            • #66
              Question about the _CNV output

              Dear Valeu,

              I would like to ask you something about the "_CNVs" output of Control-FREEC. I have a set of mouse cancer whole genomes that have been sequenced at high depth (~45X) using Illumina. I have used Control-FREEC to call CNVs on the samples as well as the BAF (using the set of SNPs identified by the mouse resequencing project on the same mouse strain). I noticed that in the "_CNVs" output file there are overlapping CNVs. For instance (highlighted in bold below, as reported in the _CNVs output file):

              1 2960000 3029999 2 normal AA 20.8697
              1 2990000 3389999 8 gain AAAAABBB 5.57241
              1 3350000 3499999 3 gain AAB 44.5596

              1 3460000 3549999 11 gain AAAAAAAAABB 100
              1 3510000 3739999 3 gain AAB 7.9066
              1 11890000 12709999 3 gain AAB 2.14849
              1 12670000 16909999 3 gain AAB 0.411016


              In most of the cases that I have encountered so far, the overlapping CNV windows have either different predicted genotypes and copy numbers (as in the first example) or only different percentages of uncertainty of the predicted genotype.

              In the former case I assume the overlapping CNVs are due to the prediction of different genotypes (is this correct?), and a filter on the percentage of uncertainty would remove them. However, in the latter case the predicted genotypes and copy numbers are the same and the percentages of uncertainty are low as well.

              Do you have any clues as to why this might be occurring? Also, would you recommend filtering out CNVs based on the percentage of uncertainty up to the point where one ends up with non-overlapping CNVs?

              Thanks and I hope that you have a good day!


              • #67
                Originally posted by tatinhawk View Post
                I noticed that in the "_CNVs" output file there are overlapping CNVs.
                FREEC uses overlapping windows to scan the genome (if step < window). This is why you may have overlapping predictions. The breakpoint should be located somewhere in the overlapping part.
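As an illustration of this answer, the sketch below takes the example rows posted above and lists the region shared by each pair of consecutive overlapping calls, which is where the breakpoint would be expected. It assumes the rows are sorted by chromosome and start, and that the first three columns are chromosome, start and end; the helper names are hypothetical.

```python
CALLS = """\
1 2960000 3029999 2 normal AA 20.8697
1 2990000 3389999 8 gain AAAAABBB 5.57241
1 3350000 3499999 3 gain AAB 44.5596
1 3460000 3549999 11 gain AAAAAAAAABB 100
1 3510000 3739999 3 gain AAB 7.9066
1 11890000 12709999 3 gain AAB 2.14849
1 12670000 16909999 3 gain AAB 0.411016
"""


def parse_calls(text):
    """Return (chrom, start, end) tuples from whitespace-separated rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            rows.append((fields[0], int(fields[1]), int(fields[2])))
    return rows


def shared_regions(rows):
    """Overlap between each pair of consecutive calls on the same chromosome."""
    regions = []
    for prev, cur in zip(rows, rows[1:]):
        if prev[0] == cur[0] and cur[1] <= prev[2]:
            regions.append((cur[0], cur[1], min(prev[2], cur[2])))
    return regions


for chrom, start, end in shared_regions(parse_calls(CALLS)):
    print(chrom, start, end, end - start + 1)
```

For these rows every shared region is 40,000 bp long, which fits calls made with overlapping windows rather than independent duplicate calls.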


                • #68
                  Hi, I would like to download the SNP track under "all tracks > SNP130" (SNP130 if you are using hg18; for hg19 it is 131). Could you provide the hg18 SNP130 txt file? I checked UCSC but could not work out which filters to select. Secondly, how do I change the order of the columns? I checked the tutorial but did not find an option to do this.

                  Thanks
                  Anwesha


                  • #69
                    Originally posted by AnweshaM7 View Post
                    Hi, I would like to download the SNP track under "all tracks > SNP130" (SNP130 if you are using hg18; for hg19 it is 131). Could you provide the hg18 SNP130 txt file? I checked UCSC but could not work out which filters to select. Secondly, how do I change the order of the columns? I checked the tutorial but did not find an option to do this.
                    I will try to add it. But I assure you that the results will be the same as if you use hg18_snp130.SingleDiNucl.1based.txt


                    • #70
                      Hi,

                      I am running Control-FREEC on whole-exome sequencing data from matched tumor/normal pairs.
                      However, for one sample I always get this error:

                      Initial guess for polynomial:
                      Error: variation in read count per window is too small.
                      Unable to proceed..
                      Wed Nov 12 14:41:11 GMT 2014

                      I have tried to increase the window size but still get the same problem. Last setting for window size was 1500.

                      The average coverage for the normal and tumor is 107x and 24x respectively.

                      I am a bit clueless here.. should I increase or decrease the window size?

                      Thanks

                      Regards
                      Shruti


                      • #71
                        Error while running Control-FREEC

                        Hi,
                        I have BAM files for my sample. I ran Control-FREEC (WGS) on all chromosomes and got _CNV output for them.
                        However, for chromosomes X and Y I get this error:
                        'Unable to proceed..
                        Try to rerun the program with higher number of reads'

                        The data (tumor) have 27x coverage on the hg18 track.
                        I have tried window lengths of 1000, 1500 and 3000 but still get the same error.

                        I do not understand the reason for this error.

                        Thanks
                        Anwesha


                        • #72
                          What should the parameters be for normal/tumor clones with varying coverage?

                          I would like to discuss the samples I am using to infer CNVs from exome data with Control-FREEC. I am using WES tumor data: a polyclonal tumor sample at 70X coverage and its matched normal (blood) at the same coverage. I used window = 500 and step = 250 to infer the CNVs and found 120 significant CNVs, with a median length of 42 kb per called region.

                          I am now applying the same parameters to my reprogrammed tumor clones, which are sequenced at 35X because they are single clones; the normal control is again the 70X blood sample. Should the window length be the same as for the tumor/normal pair? With the same window, I found that the median size (in bases) of the called regions is higher for the single-clone iPSCs than for the tumor. Should I double the window and step size for the single clone, or halve them?

                          Also, since the normal blood is at 70X while the iPSC clone is at 35X, won't the results be spurious if I use the same window and step as for the tumor/normal pair where both are at 70X? What would be the ideal window and step if the control has double the coverage of the tumor sample? Or is it preferable to use coefficientOfVariation, and if so, which value would you suggest? Which breakPointType and breakPointThreshold should I use?

                          I am attaching the config file that I already used for my normal/tumor pair (both at 70X). I have used the same config file for the normal/tumor-iPSC pair (70X/35X). The results look promising, but I wonder whether I am compromising sensitivity by keeping the parameters the same for normal/tumor and normal/iPSC pairs with different coverage. As far as I know, the read depths of both samples are normalized before the CNVs are calculated. Still, I would like some suggestions on which parameters to change for varying normal/tumor depth.

                          Should I also use intercept = 0 and readCountThreshold >= 50, since it is WES data?

                          Code:
                          [general]
                          
                          chrLenFile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/hs19_chr.len
                          window = 500
                          
                          step = 250
                          ploidy = 2
                          
                          outputDir = /scratch/GT/vdas/pietro/exome_seq/results/control_freec_out/output_S313_tumor/
                          BedGraphOutput=TRUE
                          breakPointType=4
                          
                          gemMappabilityFile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/out100m1_hg19.gem
                          
                          chrFiles =  /scratch/GT/vdas/test_exome/exome/
                          
                          maxThreads=6
                          
                          breakPointThreshold=1.5
                          noisyData=TRUE
                          printNA=FALSE
                          #breakPointThreshold = -.002;
                          #window = 50000
                          #chrFiles = hg18/hg18_per_chromosome
                          #outputDir = test
                          #degree=3
                          #intercept = 0
                          
                          [sample]
                          
                          mateFile = /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998.realigned.recal.bam
                          inputFormat = bam
                          mateOrientation = FR
                          
                          [control]
                          
                          mateFile = /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
                          inputFormat = bam
                          mateOrientation = FR
                          
                          [BAF]
                          
                          SNPfile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/hg19_snp137.SingleDiNucl.1based.txt
                          minimalCoveragePerPosition = 5
                          
                          [target]
                          
                          captureRegions = /scratch/GT/vdas/referenceBed/hg19/ss_v4/Exon_SSV4_clean.bed


                          • #73
                            Originally posted by smapdy View Post
                            I ended up figuring out what was going on. I had some multiallelic variants in the .snp file that were causing it to fail to load, and my sex variable in the configuration file didn't match up with the actual sample sex which caused problems as well. I ended up dropping the sex argument and using the following general configuration file for my samples:
                            [general]
                            window = 8000
                            step = 2500
                            samtools = samtools
                            minCNAlength = 4
                            BedGraphOutput = TRUE
                            chrLenFile = NCBIM37_um.fa.len
                            chrFiles = chrfiles
                            outputDir = 31208T_31668N_FREEC_V1
                            printNA = FALSE
                            maxThreads = 6
                            ploidy = 2
                            breakPointType = 4
                            contaminationAdjustment = TRUE
                            noisyData = TRUE

                            [sample]
                            mateFile = 31208_EXOME.pileup.gz
                            inputFormat = pileup
                            mateOrientation = 0

                            [control]
                            mateFile = 31668_EXOME.pileup.gz
                            inputFormat = pileup
                            mateOrientation = 0

                            [target]
                            captureRegions = S0276129_Merged_Sorted_Probes.bed

                            [BAF]
                            SNPfile = snp128.singlebases.monoalleleic.freec_baf.txt
                            minimalCoveragePerPosition = 5

                            If anyone is interested I also have the commands I used to generate the pileups from the .bams, as well as the script I used to generate a working Mm9 and Mm10 .snp file.
                            Hi, Smapdy,

                            I am also working on a mouse project and want to use FREEC to call CNVs. However, when I use the Snp137 file I get the same error message that you mentioned above.
                            I realise it has been two years, but I am still wondering whether you could send me the mm10 .snp file.

                            Thank you very much!
                            Best,
                            Yihua


                            • #74
                              Segmentation fault (core dumped)

                              Hi Valeu,

                              I'm trying to run Control-FREEC on mouse exome sequencing data, but I've run into an issue! It works fine when I run Control-FREEC without the BAF analysis, but when I enable it I get the error "Segmentation fault (core dumped)". I'm wondering if this is an issue you've run into before and if you know how to sort it out?

                              The full output from when I run Control-FREEC:
                              Code:
                              Control-FREEC v9.1 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
                              MT-mode using 4 threads
                              ..Breakpoint threshold for segmentation of copy number profiles is 0.8
                              ..telocenromeric set to 50000
                              ..FREEC is not going to output normalized copy number profiles into a BedGraph file (for example, for visualization in the UCSC GB). Use "[general] BedGraphOutput=TRUE" if you want a BedGraph file
                              ..FREEC is not going to adjust profiles for a possible contamination by normal cells
                              ..Window = 0 was set
                              ..Output directory:     /data2/christian/Sequencing/Output/
                              ..Sample file:  /data2/christian/Sequencing/Output/DeduppedBams/123_14_6_correctRGs_mm10_BQSR.sorted.dedupped.bam
                              ..Sample input format:  BAM
                              ..will use this instance of samtools: 'samtools' to read BAM files
                              ..Control file: /data2/christian/Sequencing/Output/DeduppedBams/123_14_8_correctRGs_mm10_BQSR.sorted.dedupped.bam
                              ..Input format for the control file:    BAM
                              FREEC will create a pileup to compute BAF profile! 
                              ...File with SNPs : /data2/christian/Sequencing/ReferenceFiles/hg19_snp142.SingleDiNucl.1based.bed
                              ..Polynomial degree for "Sample ReadCount ~ Control ReadCount" normalization is 1
                              ..Minimal CNA length (in windows) is 5
                              ..File with chromosome lengths: /data2/christian/Sequencing/ReferenceFiles/mm10_chrom_lengths.fa
                              ..Mappability and GC-content won't be used
                              ..Control-FREEC won't use minimal mappability. All windows overlaping capture regions will be considered
                              ..Mappability file/data2/christian/Sequencing/ReferenceFiles/GEM_mapp_GRCm38_68_mm10.gem be used: all low mappability positions will be discarded
                              ..uniqueMatch = FALSE
                              ..average ploidy set to 2
                              ..break-point type set to 4
                              ..noisyData set to 1
                              ..minimal number of reads per window in the control sample is set to 10
                              Creating Pileup file to compute BAF profile...
                              ..will increase flanking regions by 100 bp
                              Segmentation fault (core dumped)

                              My config file is as follows:
                              Code:
                              [general]
                              chrLenFile = /data2/christian/Sequencing/ReferenceFiles/mm10_chrom_lengths.fa
                              bedtools=/data2/christian/Sequencing/Frameworks/bedtools2/bedtools
                              ploidy = 2
                              gemMappabilityFile = /data2/christian/Sequencing/ReferenceFiles/GEM_mapp_GRCm38_68_mm10.gem
                              noisyData=TRUE
                              outputDir=/data2/christian/Sequencing/Output/
                              printNA=FALSE
                              samtools=samtools
                              window=0
                              telocentromeric=50000
                              breakPointType=4
                              breakpointThreshold=0.6
                              minCNAlength=5
                              maxThreads=4
                              
                              
                              [sample]
                              mateFile = /data2/christian/Sequencing/Output/DeduppedBams/123_14_6_correctRGs_mm10_BQSR.sorted.dedupped.bam
                              inputFormat = BAM
                              mateOrientation = FR
                              
                              
                              [control]
                              mateFile = /data2/christian/Sequencing/Output/DeduppedBams/123_14_8_correctRGs_mm10_BQSR.sorted.dedupped.bam
                              inputFormat = BAM
                              mateOrientation = FR
                              
                              [BAF]
                              SNPfile=/data2/christian/Sequencing/ReferenceFiles/mm10_dbSNP137.ucsc.freec.txt
                              fastaFile=/data2/christian/Sequencing/ReferenceFiles/mm10.fa
                              makePileup=/data2/christian/Sequencing/ReferenceFiles/mm10_dbSNP137.ucsc.freec.bed
                              minimalCoveragePerPosition=5
                              
                              [target]
                              captureRegions=/data2/christian/Sequencing/ReferenceFiles/S0276129/S0276129_AllTracks.bed
                              Specifically, the error disappears when I remove the 'makePileup=' line (although then the BAF analysis isn't performed). The file is generated according to the instructions on the FREEC website (awk-ing the SNP-file for mm10 that's posted on the website).
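One detail worth a quick check: the log above reports a breakpoint threshold of 0.8, while the config sets breakpointThreshold=0.6, and the earlier config in this thread spells the key breakPointThreshold. Whether Control-FREEC treats key names as case-sensitive is an assumption here, not something confirmed in the thread, and this would not by itself explain the segmentation fault. The sketch below flags keys whose spelling differs only in case from spellings seen in the other configs quoted in this thread; the key list and function name are hypothetical and are not an authoritative list of parameters.

```python
import configparser

# Spellings copied from other config files in this thread (not authoritative).
KEYS_SEEN_IN_THREAD = {
    "breakPointThreshold", "breakPointType", "minCNAlength",
    "chrLenFile", "gemMappabilityFile", "maxThreads", "noisyData",
    "outputDir", "printNA", "mateFile", "inputFormat", "mateOrientation",
    "SNPfile", "minimalCoveragePerPosition", "captureRegions",
}


def case_mismatched_keys(config_text, known=KEYS_SEEN_IN_THREAD):
    """Return (section, key, expected) for keys differing only by letter case."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the original case of key names
    parser.read_string(config_text)
    by_lower = {name.lower(): name for name in known}
    hits = []
    for section in parser.sections():
        for key in parser[section]:
            expected = by_lower.get(key.lower())
            if expected is not None and expected != key:
                hits.append((section, key, expected))
    return hits


example = """\
[general]
ploidy = 2
breakPointType=4
breakpointThreshold=0.6
minCNAlength=5
"""
print(case_mismatched_keys(example))
# [('general', 'breakpointThreshold', 'breakPointThreshold')]
```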

                              I'm running the analysis on exome data from mouse tumors, sequenced on an Illumina HiSeq in paired-end mode using the Agilent Mouse All Exon kit. The files have been aligned to mm10 using BWA-MEM and dedupped with Picard. I'm running the analysis on Ubuntu (64 bit). I downloaded the Control-FREEC framework and the relevant SNP and mappability files from your website 2-3 days ago.

                              Any help is much appreciated!


                              • #75
                                Hi, I do not see any evident mistake in the config file. If you want me to debug it, please share your config and corresponding files with me. Valentina.Boeva%at%inserm.fr
