  • orcy
    Junior Member
    • Jan 2010
    • 8

    #16
    OK. I think that problem was simply the viewer telling me that read was a problem. I've got past that, but now get a

    [main_samview] fail to get the reference name. Continue anyway.

    error, and nothing in the output.

    Does anyone know what that means? It happens during the sharpenedges part of the script.

    cheers


    • megnetz
      Junior Member
      • Jul 2010
      • 4

      #17
      Hello!

      I'm trying to make structural variation calls from 1000 Genomes data. I thought I might try Breakway but ran into problems :/. When calculating PED values with the DNAA program dbampairedenddist, I need to specify a range based on the PED predicted from library generation. As far as I know, the 1000 Genomes BAM files combine input from several different raw read files, so how can I know which range to choose? Or does this make 1000 Genomes data incompatible with Breakway SV detection?

      Thank you very much!


      • Lee Sam
        Member
        • Oct 2008
        • 57

        #18
        Originally posted by Michael.James.Clark
        While I haven't tested it on such datasets, it ought to work on them. The key will be in the reference genome used.

        Breakway functions by looking for clusters of aberrantly spaced paired reads, so the key is to have an appropriate reference genome for it to compare to.

        For exon capture, it should work with the normal reference genome just as well as it will with whole genomes.

        For RNAseq, and I'm not an expert so I welcome other suggestions, the transcriptome will probably be best used as the reference genome.
        I'm very interested in using this with RNA-Seq. I figure aligning against the transcriptome is an issue because it limits the size of the indel that you can have (e.g. no 2-transcript mappings, where one end maps to one transcript and the other maps to a completely different transcript).


        • Michael.James.Clark
          Senior Member
          • Apr 2009
          • 207

          #19
          Originally posted by orcy
          OK. I think that problem was simply the viewer telling me that read was a problem. I've got past that, but now get a

          [main_samview] fail to get the reference name. Continue anyway.

          error, and nothing in the output.

          Does anyone know what that means? It happens during the sharpenedges part of the script.

          cheers
          Sorry for the late reply on this.

          Sharpenedges uses samtools as part of its activity, and this is a samtools error.

          Make sure that you've properly indexed the BAM file, and that the file is in BAM format.

          If you still have a problem, please run samtools view and post an example read here for me to look at.
          Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
          Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
          Projects: U87MG whole genome sequence [Website] [Paper]


          • Michael.James.Clark
            Senior Member
            • Apr 2009
            • 207

            #20
            Originally posted by megnetz
            Hello!

            I'm trying to make structural variation calls from 1000 Genomes data. I thought I might try Breakway but ran into problems :/. When calculating PED values with the DNAA program dbampairedenddist, I need to specify a range based on the PED predicted from library generation. As far as I know, the 1000 Genomes BAM files combine input from several different raw read files, so how can I know which range to choose? Or does this make 1000 Genomes data incompatible with Breakway SV detection?

            Thank you very much!
            Breakway works on a library-by-library basis. One can combine libraries with very similar PEDs in a single analysis and it will still function.

            If you have libraries with very different PEDs, it will have difficulty working correctly. You can isolate reads with very different PEDs from each other and run it independently on each one, then combine the results, though. This is what I have done.

            I'm not very familiar with 1000 genomes data, but if they use the read group flag in their BAM files with the library field clarifying which library specific RGs are sourced from, you can use that to isolate the reads.
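
As a rough illustration of the read-group idea, here is a minimal Python sketch (not part of Breakway) that groups read-group IDs by library from SAM header text, such as the output of `samtools view -H`. The function name and the example header are hypothetical; it assumes the BAM header carries `@RG` lines with `ID` and `LB` fields.

```python
from collections import defaultdict

def read_groups_by_library(header_text):
    """Map each library (LB) to the read-group IDs (ID) declared in @RG lines."""
    libraries = defaultdict(list)
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # @RG fields are tab-separated TAG:VALUE pairs; split on the first colon only.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "ID" in fields:
            # Read groups without an LB field are collected under "unknown".
            libraries[fields.get("LB", "unknown")].append(fields["ID"])
    return dict(libraries)

# Hypothetical header text, standing in for `samtools view -H sample.bam`.
example_header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@RG\tID:rg1\tLB:libA\tSM:sample1\n"
    "@RG\tID:rg2\tLB:libA\tSM:sample1\n"
    "@RG\tID:rg3\tLB:libB\tSM:sample1\n"
)
print(read_groups_by_library(example_header))
```

With the IDs grouped this way, the reads for one library could then be extracted into their own BAM file (samtools view documents read-group and library filters; check the options available in your version) and each library analysed separately.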

            Sorry I can't be more help--Breakway was designed to function optimally on a sample-by-sample basis, not on a batch of samples.

            • Michael.James.Clark
              Senior Member
              • Apr 2009
              • 207

              #21
              Originally posted by Lee Sam
              I'm very interested in using this with RNA-Seq. I figure aligning against the transcriptome is an issue because it limits the size of the indel that you can have (e.g. no 2-transcript mappings, where one end maps to one transcript and the other maps to a completely different transcript).
              True, it would be blind to fusion transcripts if you were to use the transcriptome.

              An alternative might be using all possible fusions as a reference.

              I believe TopHat/Cufflinks are very popular for this type of analysis, so you may want to take a look at them!

              • megnetz
                Junior Member
                • Jul 2010
                • 4

                #22
                I'll try that, thanks!


                • Jon_Keats
                  Senior Member
                  • Mar 2010
                  • 279

                  #23
                  Hi Michael,

                  Shouldn't Breakway.ReadCluster.pl find both clusters of reads implicating insertions or deletions exceeding the floor-pe-length and ceiling-pe-length and translocations? In a quick test of some Illumina mate-pair data you only see the intra-chromosomal events but not the inter-chromosomal events, even though a quick parsing of the dtranslocations table clearly identifies positive control events that should meet the -mincs and -maxcs options used.


                  • Michael.James.Clark
                    Senior Member
                    • Apr 2009
                    • 207

                    #24
                    Originally posted by Jon_Keats
                    Hi Michael,

                    Shouldn't Breakway.ReadCluster.pl find both clusters of reads implicating insertions or deletions exceeding the floor-pe-length and ceiling-pe-length and translocations? In a quick test of some Illumina mate-pair data you only see the intra-chromosomal events but not the inter-chromosomal events, even though a quick parsing of the dtranslocations table clearly identifies positive control events that should meet the -mincs and -maxcs options used.
                    Hi Jon,

                    Sorry for the late reply, I've been otherwise occupied, but I hope I can help solve this with you.

                    I'm a little bit unclear on what you're seeing. Are you observing that an event that should pass your parameters is not being reported by Breakway? If so, would it be possible to provide the library design (insert size, read length, sequence depth, etc.), parameters you used in dtranslocations and Breakway and the segment of the dtranslocations file in question?

                    Usually if this type of thing happens, I find it's due to the dtranslocations spot being sporadic to the point that the event doesn't meet the minimum requirements for Breakway. These minimums are determined by mincs/maxcs, so you can decrease mincs and increase maxcs and often they will then come through.

                    • shu
                      Junior Member
                      • Jan 2010
                      • 6

                      #25
                      Dear Michael,

                      We are trying to install Breakway. We successfully installed BFAST and SAMtools in the root as suggested, but are having issues installing DNAA. During ./configure, it shows "fatal: Not a git repository", and when we run make it gives this error:

                      make all-recursive
                      make[1]: Entering directory `/storage/Software/dnaa-0.1.2'
                      Making all in dkbaseencoding
                      make[2]: Entering directory `/storage/Software/dnaa-0.1.2/dkbaseencoding'
                      make[2]: *** No rule to make target `all'. Stop.
                      make[2]: Leaving directory `/storage/Software/dnaa-0.1.2/dkbaseencoding'
                      make[1]: *** [all-recursive] Error 1
                      make[1]: Leaving directory `/storage/Software/dnaa-0.1.2'
                      make: *** [all] Error 2

                      We are using 64bit Debian.

                      Could you please help?


                      • Michael.James.Clark
                        Senior Member
                        • Apr 2009
                        • 207

                        #26
                        Hm, not sure what's going on. I'm not the author of DNAA, I'm afraid, but I have gotten it to install successfully myself.

                        I assume you got the tar.gz from here:

                        Then obviously followed the INSTALL.
                        If you got it through git, maybe that is a problem and you should try making it from the tarball.

                        A Google search for the error "fatal: Not a git repository" turns up a number of hits that you might want to look at.

                        Just to let you know, I just successfully installed DNAA from scratch on my Mac Pro here.
                        Last edited by Michael.James.Clark; 10-29-2010, 12:59 PM.

                        • Michael.James.Clark
                          Senior Member
                          • Apr 2009
                          • 207

                          #27
                          The most common mistake I find people making is forgetting to index their BAM file. Always index your BAM file! Breakway looks in the same folder as the BAM file for a file with exactly the same name plus ".bai" appended to the end, which is the standard output of the samtools index program.
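
A quick pre-flight check for this can be sketched in a few lines of Python. This is only an illustration of the naming rule described above (BAM path plus ".bai" in the same folder), not code from Breakway; the function name and the `sample.bam` path are hypothetical.

```python
import os

def has_bam_index(bam_path):
    """Return True if '<bam_path>.bai' exists alongside the BAM file."""
    return os.path.isfile(bam_path + ".bai")

# Hypothetical usage before starting an analysis.
if not has_bam_index("sample.bam"):
    print("No index found; run: samtools index sample.bam")
```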

                          • Michael.James.Clark
                            Senior Member
                            • Apr 2009
                            • 207

                            #28
                            Hi all,

                            Breakway has been updated to version 0.7.

                            In this update:

                            -The breakway.parameters.pl script has been improved. It no longer requires the dbampairedenddist program from DNAA to run. Now BAM files can be directly passed to breakway.parameters.pl along with insert size range and the program will report mean, standard deviation and 95% bounds of the entire BAM file. See The Breakway Compendium at breakway.sf.net for usage.

                            -A bug in breakway.sharpenedges.pl has been fixed. Though it was supposed to default the --score parameter to zero, it was actually undefined, so if one ran the program with this optional parameter, it would crash. Now the script can be run with --score default parameter successfully.
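
To make the first item concrete, here is a minimal Python sketch of the kind of summary described there: mean, standard deviation, and 95% bounds of insert sizes restricted to a given range. It is not the breakway.parameters.pl implementation. The parameter names `floor` and `ceiling` are hypothetical, and computing the bounds as mean ± 1.96 standard deviations is an assumption; the script may calculate them differently.

```python
from statistics import mean, stdev

def insert_size_summary(insert_sizes, floor, ceiling):
    """Summarise absolute insert sizes that fall inside [floor, ceiling]."""
    kept = [abs(s) for s in insert_sizes if floor <= abs(s) <= ceiling]
    if len(kept) < 2:
        raise ValueError("need at least two insert sizes inside the range")
    m = mean(kept)
    sd = stdev(kept)  # sample standard deviation
    # Assumption: 95% bounds taken as mean +/- 1.96 sd (normal approximation).
    return {
        "n": len(kept),
        "mean": m,
        "sd": sd,
        "lower95": m - 1.96 * sd,
        "upper95": m + 1.96 * sd,
    }

# Made-up insert sizes; 5000 and 0 fall outside the range and are ignored.
summary = insert_size_summary([190, 200, 210, -205, 195, 5000, 0], floor=100, ceiling=400)
print(summary)
```

Restricting to a range first keeps chimeric or unpaired reads from inflating the standard deviation, which is presumably why the script asks for an insert size range.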

                            • unagaswamy
                              Member
                              • May 2010
                              • 13

                              #29
                              Hi,
                              We have the same problem. Breakway chokes at:
                              samtools view -X sample.bam chr1:56-230|egrep "pPUr[0-9]d"| head -5
                              286_89_1940 pPUr1d chr1 97 16 50M ...
                              since the string
                              pPUr1d
                              is not captured in its entirety by this line in the load_alignments function
                              if($line =~ m/^(\S+)\s+([pPrRuU12]*)\s+(\S+)\s+(\d+)\s+\d+\s+\S+\s+\S+\s+(\d+)\s+-?([0-9]+)\s+(\w+)/)
                              in breakway.sharpenedges.pl.

                              Is there a particular reason for accepting only strings of type "pPrR1"?


                              • Michael.James.Clark
                                Senior Member
                                • Apr 2009
                                • 207

                                #30
                                Thanks for pointing that out! I honestly was at a loss for what this bug was as I hadn't seen the "d" before.

                                Can I ask what version of Samtools you've been using? I have only tested it against an old version that Breakway was designed to work with (v0.1.6 (r453) as stated in the Breakway script headers).

                                This quick fix should work. You can change that line to the following:

                                Code:
                                if($line =~ m/^(\S+)\s+(.*[pPrRuU12]*.*)\s+(\S+)\s+(\d+)\s+\d+\s+\S+\s+\S+\s+(\d+)\s+-?([0-9]+)\s+(\w+)/)
                                That way, it should be robust against anything else in the flag field that might get added subsequently.
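
For anyone who wants to reproduce the flag-field problem outside Perl, here is a minimal Python sketch (not Breakway code). It compares the strict character-class pattern quoted earlier in the thread with a lenient variant that accepts any non-whitespace flag string; using `\S+` for that field is an alternative assumed here, not something proposed in the thread. The sample line is made up: it extends the truncated example with hypothetical mate, insert-size, sequence, and quality fields.

```python
import re

# Strict pattern from the thread: the flag field may only contain the listed characters.
STRICT_RE = re.compile(
    r"^(\S+)\s+([pPrRuU12]*)\s+(\S+)\s+(\d+)\s+\d+\s+\S+\s+\S+\s+(\d+)\s+-?([0-9]+)\s+(\w+)"
)
# Assumption: treat the flag field as any run of non-whitespace characters.
LENIENT_RE = re.compile(
    r"^(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s+\d+\s+\S+\s+\S+\s+(\d+)\s+-?([0-9]+)\s+(\w+)"
)

def parse_alignment(line, pattern=LENIENT_RE):
    """Return the captured alignment fields as a dict, or None if the line does not match."""
    m = pattern.match(line)
    if m is None:
        return None
    name, flags, chrom, pos, mate_pos, isize, seq = m.groups()
    return {
        "name": name,
        "flags": flags,
        "chrom": chrom,
        "pos": int(pos),
        "mate_pos": int(mate_pos),
        "isize": int(isize),  # sign is dropped, as in the original pattern
        "seq": seq,
    }

# Made-up line in the layout of `samtools view -X` output (fields after 50M are hypothetical).
sample = "286_89_1940\tpPUr1d\tchr1\t97\t16\t50M\t=\t350\t303\tACGTACGT\tIIIIIIII"
print(parse_alignment(sample, STRICT_RE))  # None: the trailing "d" is not in the character class
print(parse_alignment(sample)["flags"])    # pPUr1d
```

Matching the flag field with `\S+` keeps each capture confined to a single column, so flag letters added by later samtools versions would not stop a line from being parsed.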

                                I have uploaded the program with that bug fix to the Breakway site, so alternatively you can just download and extract it (the only difference is that line!).

                                Download Breakway for free. A project dedicated to the identification of genomic breakpoints utilizing freely available tools and custom analytical techniques. THIS PROJECT IS NO LONGER SUPPORTED.