  • SVDetect v/s BreakDancer

    Hi,
    Has anyone had any experience with SVDetect and BreakDancer? I need some thoughts on their comparison in terms of performance and detection accuracy. I am mainly interested in detecting large INDELs, inversions and translocations using paired-end whole genome data.

    Thanks.

  • #2
    I have just tried both of these on a whole-genome-sequenced sample with a balanced translocation identified through cytogenetics. We ran a single HiSeq2000 lane: paired-end sequencing of ~600 bp fragments, 100 bp read length, 8x average coverage. Reads were aligned with BWA, and duplicates were removed with Picard.

    We knew roughly where to look for the break points, and with IGV's coloured display of reads where pairs map to different chromosomes, we were able to quickly identify the breakpoint, with 7 reads providing evidence (mapping either side of the breakpoint), and further 3+1 reads mapping across the breakpoints, so we got to the actual base from WGS. But could we identify this translocation if we didn't know where to look?

    I first tried BreakDancer and got strange results (posted a question on seqanswers but no replies), lots of interchromosomal calls with the actual translocation looking no more real than ~100 false positives, and the output saying only 3 reads providing support.

    Then tried SVDetect, and I'm much happier with the results and their format. Still got lots of FP calls but the actual translocation got called correctly, output is much simpler than BreakDancer and listed that 7 read pairs map across it.

    Suggestions to filter out FPs for inter-chromosomal translocations (after selecting only those with the highest score), based on my observations after manually checking some of them in IGV:
    - those where multiple types of rearrangements map in close proximity
    - those mapping to centromeres
    - those with too large number of reads providing evidence (in my case say >14, or twice the average coverage)
    - those where start-end distance is considerably different between chr1 and chr2, which suggests mismapping of reads. This tells me I should probably remove any reads mapping to multiple locations from the bam file first, before running SVDetect.
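
    The filters above can be sketched in Python as follows. This is only a rough illustration, not SVDetect's own logic: the `Call` record and its field names (apart from `nb_pairs`), the `max_ratio` and `window` thresholds, and the centromere interval are all assumptions made up for the example and would need to be adapted to the real output columns and genome annotation.

```python
from dataclasses import dataclass


@dataclass
class Call:
    # Hypothetical record; field names are assumptions, not SVDetect's columns.
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    sv_type: str
    nb_pairs: int
    score: float


def overlaps(chrom, start, end, regions):
    """True if chrom:start-end overlaps any (chrom, start, end) tuple in regions."""
    return any(c == chrom and start <= r_end and r_start <= end
               for c, r_start, r_end in regions)


def sides(call, pad=0):
    """Both ends of a call as (chrom, start, end), optionally padded by `pad` bp."""
    return [(call.chrom1, call.start1 - pad, call.end1 + pad),
            (call.chrom2, call.start2 - pad, call.end2 + pad)]


def span_ratio(call):
    """Larger start-end span divided by the smaller one (spans floored at 1 bp)."""
    a = max(call.end1 - call.start1, 1)
    b = max(call.end2 - call.start2, 1)
    return max(a, b) / min(a, b)


def near_other_type(call, calls, window):
    """True if a call of a different type lies within `window` bp of either end."""
    for other in calls:
        if other.sv_type == call.sv_type:
            continue
        if any(overlaps(c, s, e, sides(other, pad=window)) for c, s, e in sides(call)):
            return True
    return False


def filter_translocations(calls, coverage, centromeres,
                          min_score=1.0, max_ratio=3.0, window=1000):
    """Keep inter-chromosomal calls that pass all four heuristics."""
    kept = []
    for call in calls:
        if call.chrom1 == call.chrom2 or call.score < min_score:
            continue
        if call.nb_pairs > 2 * coverage:          # implausibly deep support
            continue
        if any(overlaps(c, s, e, centromeres) for c, s, e in sides(call)):
            continue
        if span_ratio(call) > max_ratio:          # uneven spans, possible mismapping
            continue
        if near_other_type(call, calls, window):  # mixed signals in one region
            continue
        kept.append(call)
    return kept


# Made-up example calls; coordinates and type labels are illustrative only.
calls = [
    Call("chr2", 1000, 1500, "chr7", 5000, 5480, "TRANSLOC", 7, 1.0),
    Call("chr4", 2000, 2500, "chr8", 9000, 9500, "TRANSLOC", 40, 1.0),
    Call("chr1", 121_000_000, 121_000_500, "chr5", 3000, 3500, "TRANSLOC", 6, 1.0),
    Call("chr6", 1000, 1500, "chr10", 7000, 12000, "TRANSLOC", 5, 1.0),
    Call("chr3", 100_000, 100_500, "chr9", 4000, 4500, "TRANSLOC", 6, 1.0),
    Call("chr3", 100_600, 101_000, "chr3", 150_000, 150_400, "INVERSION", 8, 1.0),
    Call("chr11", 1000, 1500, "chr12", 2000, 2500, "TRANSLOC", 6, 0.8),
]
centromeres = [("chr1", 120_000_000, 125_000_000)]  # made-up interval
kept = filter_translocations(calls, coverage=8, centromeres=centromeres)
print(kept)
```

    With 8x coverage the read-support cutoff is 16 pairs, and only the first example call survives all the filters. The cutoffs are deliberately exposed as parameters, since sensible values will depend on the sample and library.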

    Perhaps not so many should have the perfect score?

    These are just some of my initial observations for one particular type of structural aberration, and I'd also be happy to hear from anyone else using these programs.

    • #3
      Hello,

      I have tried BreakDancer and SVDetect, and now I am trying to filter out false positives from my results. I am following the steps you suggested, HLA, but I have some questions that you may be able to help with. For the moment I am only interested in deletions. My questions:
      1. "(after selecting only those with the highest score)": what counts as a high score? My scores range from 0.8 to 1, and since many calls have a score of 1, I think I will keep only those. Is that OK?
      2. "those where multiple types of rearrangements map in close proximity": do you mean, for instance, a deletion and an insertion in close proximity?
      3. "those with too large number of reads providing evidence (in my case say >14, or twice the average coverage)": my coverage is 55, so do you mean that I should filter out SVs with nb_pairs (in the SV text file) > 110? My values range over [2:95], so nothing would be removed.
      4. "those where start-end distance is considerably different between chr1 and chr2, which suggests mismapping of reads. This tells me I should probably remove any reads mapping to multiple locations from the bam file first, before running SVDetect.": sorry for being a bit picky, but what counts as a considerable difference?

      I was thinking of merging the results from BreakDancer and SVDetect. There is also this tool, http://svmerge.sourceforge.net/, which is useful for copy number as well.
      By the way, do you have experience with any other tools, such as GASV or Hydra?
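
      On merging calls from two callers: one simple way to do it, sketched below in Python, is to keep deletions that both callers report with a sufficient reciprocal overlap. This is not how SVMerge works and does not parse real BreakDancer or SVDetect output; the `(chrom, start, end)` tuples, the 50% threshold, and the example coordinates are assumptions for illustration.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions for intervals (chrom, start, end)."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def concordant_calls(calls_a, calls_b, min_overlap=0.5):
    """Pairs (a, b) of calls that overlap reciprocally by at least min_overlap."""
    return [(a, b) for a in calls_a for b in calls_b
            if reciprocal_overlap(a, b) >= min_overlap]


# Made-up deletion calls from two hypothetical caller outputs.
caller_a = [("chr1", 1000, 2000), ("chr1", 50000, 50300), ("chr2", 700, 900)]
caller_b = [("chr1", 1100, 2100), ("chr2", 5000, 5200), ("chr1", 50250, 60000)]

shared = concordant_calls(caller_a, caller_b)
print(shared)
```

      Only the chr1:1000-2000 / chr1:1100-2100 pair passes here (90% reciprocal overlap). The all-against-all comparison is quadratic, which is fine for a few thousand calls but would need sorting or an interval index for larger sets.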

      Thanks in advance!

      • #4
        Hi ralonso,

        my suggested filters were for balanced inter-chromosomal translocations only, and I couldn't be very specific as I had a single sample only and just looked at the distributions of various attributes given in the output.

        You seem to be looking for deletions, and have much higher coverage than I do. In that case you need to look for read pairs with insert sizes larger than expected, probably starting with the calls supported by the highest number of reads, and possibly the largest deletions. If you are working on a human sample and looking for something disease-causing, then obviously look for the ones not in DGV. You should also have quite a few reads spanning across the deletion itself (rather than read pairs either side of it), so you may be able to use one of the coverage-based methods, if these two programs don't already take it into account.
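
        As a rough illustration of "insert sizes larger than expected", the Python sketch below derives a cutoff from the observed insert sizes using the median and the median absolute deviation (MAD), then flags pairs above it. This is an assumed approach for illustration, not what SVDetect or BreakDancer do internally; the `n_mads=5.0` setting and the example insert sizes are made up.

```python
import statistics


def insert_size_cutoff(insert_sizes, n_mads=5.0):
    """Upper cutoff: median + n_mads * 1.4826 * MAD of the insert sizes."""
    median = statistics.median(insert_sizes)
    mad = statistics.median([abs(x - median) for x in insert_sizes])
    # 1.4826 scales the MAD to match a standard deviation for normal data.
    return median + n_mads * 1.4826 * mad


def oversized_pairs(insert_sizes, n_mads=5.0):
    """Insert sizes above the cutoff, i.e. candidate deletion-supporting pairs."""
    cutoff = insert_size_cutoff(insert_sizes, n_mads)
    return [x for x in insert_sizes if x > cutoff]


# Made-up insert sizes (bp) for a ~600 bp library, with two stretched pairs.
sizes = [580, 590, 600, 600, 610, 620, 595, 605, 3600, 3650]
cutoff = insert_size_cutoff(sizes)
flagged = oversized_pairs(sizes)
print(round(cutoff, 1), flagged)
```

        The two flagged pairs are stretched by about 3 kb relative to the median, which is the approximate size of the deletion they would suggest. The median and MAD are used instead of the mean and standard deviation so that the outlying pairs themselves do not inflate the cutoff.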

        As it happens next week we'll do whole genome sequencing of a patient to confirm and accurately map a ~100kb deletion identified by CNV analysis of exome sequence data. So I can run SVDetect on this sample and post my observations again regarding filtering deletion calls. Though my coverage will again be ~8x, not sure if this means I'll get more or fewer calls than if I had 50x coverage.

        • #5
          Hi HLA,

          thanks for your reply. I am not working on human but on a plant. We would be very glad if you posted your results. On another note, have you tried GASV (https://code.google.com/p/gasv/)? It performs quite well and also focuses on false positives, which is very important in my case since I have a lot of deletions.

          thanks!!

          • #6
            I haven't tried gasv yet, I'd be happy to give it a go. I have to say we mainly work on exome and targeted sequencing experiments, and have so far only used low-ish coverage whole genome sequencing (whatever we get from single HiSeq2000 lane) on a couple of samples to map the structural variant breakpoints when we already know roughly where to look. So my current testing of various software is only out of curiosity for potential future experiments when we might go straight to WGS for structural variant detection.

            • #7
              Hi all,

              I am currently looking for software to detect CNVs. I am using targeted resequencing data (approximately twenty genes). I have tested CONTRA, but I would like to know your opinion. What do you think is the best software to detect all types of CNV (deletion, duplication, inversion and translocation)?

              Thank you in advance for your help
              Last edited by tonio100680; 10-30-2012, 12:36 AM.

                • #9
                  Problem about using SVDetect

                  When I used the Perl script BAM_preprocessingPairs.pl to extract anomalously mapped mate-pair/paired-end reads, I got two files, **.ab.bam and **.norm.bam, both of which were empty. Why did the files not contain anything?
                  Thanks!

                  • #10
                    Originally posted by binlangman
                    When I used the Perl script BAM_preprocessingPairs.pl to extract anomalously mapped mate-pair/paired-end reads, I got two files, **.ab.bam and **.norm.bam, both of which were empty. Why did the files not contain anything?
                    Thanks!
                    You should check the stdout output for parameters such as the number of correctly mapped reads, anomalously mapped reads, etc.
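
                    As an independent sanity check on the input, the pairing status of the reads can also be tallied from the SAM FLAG field. The Python sketch below is an assumption-laden illustration and not part of SVDetect: it expects SAM text lines (for example as printed by `samtools view`), and the category names are made up. If no reads carry the paired flag, or nearly all have an unmapped mate, that would be one possible explanation for empty output files.

```python
from collections import Counter

# FLAG bits from the SAM specification.
PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8


def tally_flags(sam_lines):
    """Count alignment records by pairing status, skipping header lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        counts["total"] += 1
        if not flag & PAIRED:
            counts["unpaired"] += 1
        elif flag & (UNMAPPED | MATE_UNMAPPED):
            counts["read_or_mate_unmapped"] += 1
        elif flag & PROPER_PAIR:
            counts["proper_pair"] += 1
        else:
            counts["both_mapped_not_proper"] += 1
    return counts


def fake_record(name, flag):
    """Minimal made-up SAM record with the given FLAG (other fields are dummies)."""
    fields = [name, str(flag), "chr1", "100", "60", "10M", "=", "700", "600",
              "ACGTACGTAC", "IIIIIIIIII"]
    return "\t".join(fields)


example = ["@HD\tVN:1.0\tSO:coordinate"] + [
    fake_record(f"read{i}", flag)
    for i, flag in enumerate([99, 147, 65, 129, 73, 0])
]
counts = tally_flags(example)
print(dict(counts))
```

                    In this made-up input, two records are proper pairs, two are paired with both ends mapped but not flagged as proper (the kind of pair an SV caller is interested in), one has an unmapped mate, and one is unpaired.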
