  • Jane M
    Senior Member
    • Aug 2011
    • 239

    CNV + LOH detection on NGS data with Control FREEC

    Hello everybody !

    I am using Control-FREEC (FREEC v6.4 (Control-FREEC v3.4)) on paired (tumor-control) exome data from about 25 patients.
    I have several issues I would like to share here, hopefully to get some help.

    First, for CNV of the control sample, I got a line at 2 for all my 25 patients (attached file "CNV.mpileup_normal_ratio.txt.png"). I wonder why there is no noise around 2 and why I cannot get points as I do for the tumor sample (attached file "CNV.mpileup_ratio.txt.png"). I changed the option noisyData but it doesn't seem to be the explanation.

    My second problem concerns the BAF calculations. As you can see in the control sample ("baf.mpileup_normal_BAF.txt.png"; it's the same for the tumor), there is something wrong on chromosome 7, for all patients. There is one point beyond the end of the chromosome, and I don't understand why. I can indeed see this point in the BAF.txt file:

    7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
    7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
    7 94941038 0.5 -1 2 -1 2 -1
    7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
    8 182807 0.333333 -1 2 -1 2 -1
    Finally, for some patients, I get erratic results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are real... I am wondering whether these kinds of results could be the consequence of altered DNA...

    Could you please tell me whether you have faced any of these problems and how you solved them?
    Thank you in advance,
    Jane
    Attached Files
    Last edited by Jane M; 06-03-2013, 12:24 AM.
  • Jane M
    Senior Member
    • Aug 2011
    • 239

    #2
    Are there no Control-FREEC users among the members?


    • valeu
      Member
      • Sep 2008
      • 69

      #3
      Hi Jane,

      First, for CNV of the control sample, I got a line at 2 for all my 25 patients (attached file "CNV.mpileup_normal_ratio.txt.png"). I wonder why there is no noise around 2 and why I cannot get points as I do for the tumor sample (attached file "CNV.mpileup_ratio.txt.png"). I changed the option noisyData but it doesn't seem to be the explanation.

      In exome data (unlike whole genome sequencing data) FREEC cannot call CNVs in the control sample. This is because capture bias is so strong that normalizing only with GC-content and mappability does not help. Thus, FREEC assumes that the whole control sample is present in two copies.
      To understand why the ratios are not plotted, please check the format of the _ratio.txt files for the control and tumor samples.

      My second problem concerns the BAF calculations. As you can see in the control sample ("baf.mpileup_normal_BAF.txt.png"; it's the same for the tumor), there is something wrong on chromosome 7, for all patients. There is one point beyond the end of the chromosome, and I don't understand why. I can indeed see this point in the BAF.txt file:

      Quote:
      7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
      7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
      7 94941038 0.5 -1 2 -1 2 -1
      7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
      8 182807 0.333333 -1 2 -1 2 -1


      Check the file with chromosome lengths you use to run FREEC.

      Finally, for some patients, I get erratic results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are real... I am wondering whether these kinds of results could be the consequence of altered DNA...


      Sometimes, exome profiles look very noisy. Try adjusting the "degree" of the polynomial (e.g., try "1" instead of "3") and use the noisyData option.


      • Jane M
        Senior Member
        • Aug 2011
        • 239

        #4
        Thank you a lot for your reply!

        Originally posted by valeu View Post

        Quote:
        7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
        7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
        7 94941038 0.5 -1 2 -1 2 -1
        7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
        8 182807 0.333333 -1 2 -1 2 -1

        Check the file with chromosome lengths you use to run FREEC.



        My chromosome length file should be fine:
        1 chr1 249250621
        2 chr2 243199373
        3 chr3 198022430
        4 chr4 191154276
        5 chr5 180915260
        6 chr6 171115067
        7 chr7 159138663
        8 chr8 146364022
        9 chr9 141213431
        10 chr10 135534747
        11 chr11 135006516
        12 chr12 133851895
        13 chr13 115169878
        14 chr14 107349540
        15 chr15 102531392
        16 chr16 90354753
        17 chr17 81195210
        18 chr18 78077248
        19 chr19 59128983
        20 chr20 63025520
        21 chr21 48129895
        22 chr22 51304566
        23 chrX 155270560
        24 chrY 59373566
        I think that an outlier is generated during the computation, as the output file shows: in hg19.len I provided a length of 159138663 for chr7, but the output contains a point at 892613944...
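A quick way to confirm that this is the only such point is to compare every position in the BAF file against the chromosome length file. The snippet below is a minimal sketch, assuming the three-column length layout shown above (index, name, length) and a whitespace-separated BAF file whose first two columns are chromosome and position. The function names are hypothetical and are not part of Control-FREEC.

```python
import io


def strip_chr(name):
    """Normalize 'chr7' and '7' to the same key."""
    return name[3:] if name.startswith("chr") else name


def load_chr_lengths(handle):
    """Parse a length file with columns: index, name, length."""
    lengths = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        lengths[strip_chr(fields[1])] = int(fields[2])
    return lengths


def find_out_of_range(baf_handle, lengths):
    """Return (chromosome, position, length) for positions past the chromosome end."""
    bad = []
    for line in baf_handle:
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # skip a header line or blank lines
        chrom = strip_chr(fields[0])
        pos = int(fields[1])
        if chrom in lengths and pos > lengths[chrom]:
            bad.append((chrom, pos, lengths[chrom]))
    return bad


# Rows copied from this thread; real files would be opened with open(path).
LEN_TEXT = """7 chr7 159138663
8 chr8 146364022
"""
BAF_TEXT = """7 94941038 0.5 -1 2 -1 2 -1
7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
8 182807 0.333333 -1 2 -1 2 -1
"""

lengths = load_chr_lengths(io.StringIO(LEN_TEXT))
outliers = find_out_of_range(io.StringIO(BAF_TEXT), lengths)
print(outliers)  # [('7', 892613944, 159138663)]
```

If the check reports only this one row per sample, the length file itself is probably not the cause, which matches what is described above.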


        Finally, for some patients, I get erratic results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are real... I am wondering whether these kinds of results could be the consequence of altered DNA...

        Sometimes, exome profiles look very noisy. Try adjusting the "degree" of the polynomial (e.g., try "1" instead of "3") and use the noisyData option.
        I tried degrees of 1 and 2 (and I always use the noisyData option). The results are very similar for these two values but very different from those obtained with degree=3. It is strange to get such different results. I have attached the new result. I will also try to increase the coverage (to more than 10).
        Attached Files


        • ymc
          Senior Member
          • Mar 2010
          • 496

          #5
          If I have a large number of control (i.e., germline) exome samples, can I pool them so that I can call CNVs for each individual germline exome? I heard that ExomeCNV can do that, and I wonder whether it is also doable in Control-FREEC.


          • valeu
            Member
            • Sep 2008
            • 69

            #6
            Originally posted by ymc View Post
            If I have a large number of control (i.e., germline) exome samples, can I pool them so that I can call CNVs for each individual germline exome? I heard that ExomeCNV can do that, and I wonder whether it is also doable in Control-FREEC.
            No, so far there is no such option, although it could greatly improve the results. I think CONTRA does it the right way. You can check. Otherwise, you can simply merge the BAM files and use FREEC.


            • ymc
              Senior Member
              • Mar 2010
              • 496

              #7
              Originally posted by valeu View Post
              No, so far there is no such option, although it could greatly improve the results. I think CONTRA does it the right way. You can check. Otherwise, you can simply merge the BAM files and use FREEC.
              Merge them and treat the result as the normal, then treat the sample we are interested in as the tumor, correct?


              • valeu
                Member
                • Sep 2008
                • 69

                #8
                Originally posted by ymc View Post
                Merge them and treat the result as the normal, then treat the sample we are interested in as the tumor, correct?
                Yes, you are right.
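
The pooling step agreed on above could be sketched as follows. The thread does not name a merging tool, so using `samtools merge` is an assumption, and the file names and the helper `build_merge_command` are hypothetical. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_merge_command(output_bam, input_bams):
    """Build a 'samtools merge' command that pools several control BAM files."""
    if len(input_bams) < 2:
        raise ValueError("need at least two BAM files to merge")
    # samtools merge takes the output file first, then the inputs.
    return ["samtools", "merge", output_bam, *input_bams]


# Hypothetical germline exome BAM files to pool into one control.
cmd = build_merge_command(
    "pooled_normals.bam",
    ["normal_01.bam", "normal_02.bam", "normal_03.bam"],
)
print(shlex.join(cmd))
# samtools merge pooled_normals.bam normal_01.bam normal_02.bam normal_03.bam
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)`. The pooled BAM would then be given to FREEC as the control, and the individual exome of interest as the tumor sample.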
