  • quinlana
    Senior Member
    • Sep 2008
    • 119

    #16
    Originally posted by tinacai View Post
    Hi Quinlana,
    I still have no idea how to do the last step of the workflow in your paper, which uses BWA, NOVOALIGN, HYDRA and BEDTOOLS, as described at http://code.google.com/p/hydra-sv/wiki/TypicalWorkflow. I don't think that page is a good guideline.
    Thank you for letting me know. When I find some time I will try to make the approach more intuitive.

    Comment

    • zee
      NGS specialist
      • Apr 2008
      • 249

      #17
      Perhaps a python/shell script or a Makefile would simplify things. Any plans for this in the future? Perhaps I could help out on this task since we are working on improving use cases for our aligner.

      Comment

      • quinlana
        Senior Member
        • Sep 2008
        • 119

        #18
        Hi Zee,
        Thanks for the suggestion. However, I am reluctant to automate anything about the process, as it very much depends on the quality and variability of the user data, the organism, etc. I think it's best for me to merely improve the documentation and give thorough explanations of why things are done the way they are.

        Aaron

        Comment

        • quinlana
          Senior Member
          • Sep 2008
          • 119

          #19
          Hi all,
          I just posted a script on the Hydra site that allows one to convert the Hydra breakpoint calls (in BEDPE format) to BED12 for visualization on IGV, UCSC, etc. I've had several requests for such a tool and finally got around to doing it.
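
For readers who want a feel for what such a conversion involves, here is a minimal sketch in Python. It is not the script posted on the Hydra site. It assumes the standard BEDPE column order (chrom1, start1, end1, chrom2, start2, end2, name, score, ...), and the function name `bedpe_to_bed12` and the default colour are made up for illustration. Breakpoints whose two ends lie on different chromosomes cannot be drawn as a single BED12 feature, so the sketch simply skips them.

```python
def bedpe_to_bed12(line, rgb="255,0,0"):
    """Convert one intrachromosomal BEDPE record to a BED12 line.

    Returns None when the two ends are on different chromosomes.
    """
    f = line.rstrip("\n").split("\t")
    chrom1, start1, end1 = f[0], int(f[1]), int(f[2])
    chrom2, start2, end2 = f[3], int(f[4]), int(f[5])
    name = f[6] if len(f) > 6 else "."
    score = f[7] if len(f) > 7 else "0"  # passed through unchanged
    if chrom1 != chrom2:
        return None

    # Order the two ends by start coordinate.
    (s_a, e_a), (s_b, e_b) = sorted([(start1, end1), (start2, end2)])
    chrom_start, chrom_end = s_a, max(e_a, e_b)

    if s_b < e_a:
        # BED12 blocks may not overlap, so draw overlapping ends as one block.
        sizes, starts = [chrom_end - chrom_start], [0]
    else:
        # Two blocks, one per end; block starts are relative to chrom_start.
        sizes = [e_a - s_a, e_b - s_b]
        starts = [0, s_b - chrom_start]

    fields = [
        chrom1, chrom_start, chrom_end, name, score, ".",
        chrom_start, chrom_end, rgb, len(sizes),
        ",".join(map(str, sizes)), ",".join(map(str, starts)),
    ]
    return "\t".join(map(str, fields))
```

The strand column is written as "." because a BEDPE record carries one strand per end and BED12 has room for only one.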



          Best,
          Aaron

          Comment

          • wdt
            Member
            • Oct 2009
            • 19

            #20
            Aaron,

            This is a samtools question, but since the output is going into Hydra, I thought I would post it here.

            Following the workflow on http://code.google.com/p/hydra-sv/wiki/TypicalWorkflow, for extracting the discordant reads using

            samtools view -uF 2 sample.tier1.bam | \
            bamToFastq -bam stdin \
            -fq1 sample.tier1.disc.1.fq \
            -fq2 sample.tier1.disc.2.fq

            Should we see the exact same number of reads (and identical pairwise read IDs) in the 1.fq and 2.fq files?

            In the 1.fq and 2.fq files I have, the read IDs do not match up pairwise. Do
            you require the BAM to be sorted by read ID?


            Thanks in advance.
            Last edited by wdt; 02-08-2011, 12:17 PM.

            Comment

            • quinlana
              Senior Member
              • Sep 2008
              • 119

              #21
              Originally posted by wdt View Post
              Aaron,

              This is a samtools question, but since the output is going into Hydra, I thought I would post it here.

              Following the workflow on http://code.google.com/p/hydra-sv/wiki/TypicalWorkflow, for extracting the discordant reads using

              samtools view -uF 2 sample.tier1.bam | \
              bamToFastq -bam stdin \
              -fq1 sample.tier1.disc.1.fq \
              -fq2 sample.tier1.disc.2.fq

              Should we see the exact same number of reads (and identical pairwise read IDs) in the 1.fq and 2.fq files?

              In the 1.fq and 2.fq files I have, the read IDs do not match up pairwise. Do
              you require the BAM to be sorted by read ID?


              Thanks in advance.
              Assuming you have used BWA for Tier 1, your Tier 1 BAM file should be in "query order". That is, the order of the alignments in the BAM file should be in the order of the input FASTQ files. Using the settings described in the workflow, you should have a single alignment for each end of every pair. Thus, when creating 1.fq and 2.fq, you should have the exact same number of reads in each. If not, I suspect you have either a) used a different aligner for Tier1 or b) not used a "query-ordered" BAM file.

              I have updated the workflow to indicate that bamToFastq expects query-ordered BAM files.
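
A quick way to test the "same number of reads, same IDs" expectation is to walk both FASTQ files in parallel. The following is a minimal sketch, assuming plain (uncompressed) four-line FASTQ records and read names that differ only by a trailing /1 or /2; the function names are hypothetical and not part of Hydra or BEDTools.

```python
from itertools import zip_longest


def read_ids(path):
    """Yield read IDs from a 4-line-per-record FASTQ file, minus any /1 or /2 suffix."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue  # only the header line of each record carries the ID
            fields = line[1:].split()
            if not fields:
                continue  # tolerate a stray blank line at the end of the file
            rid = fields[0]
            if rid.endswith(("/1", "/2")):
                rid = rid[:-2]
            yield rid


def first_mismatch(fq1, fq2):
    """Return (record_index, id1, id2) for the first unpaired record, or None."""
    pairs = zip_longest(read_ids(fq1), read_ids(fq2))
    for n, (a, b) in enumerate(pairs):
        if a != b:  # also catches one file being shorter (its ID is None)
            return n, a, b
    return None
```

If `first_mismatch("sample.tier1.disc.1.fq", "sample.tier1.disc.2.fq")` returns something other than None, the record index tells you where the two files first fall out of step.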

              Best,
              Aaron

              Comment

              • gpcr
                Member
                • May 2010
                • 18

                #22
                missing reads

                Aaron,
                I have a merged BAM file resulting from multiple lanes of paired-end alignments. When I extracted FASTQ from the alignment, I got unequal numbers of reads in pair 1 and pair 2. When I examined the reads in the BAM file, I could see that for some pairs only one mate (either forward or reverse) is aligned and the other is missing. Do I need to append the missing reads from the raw lane data, or exclude them from the analysis?

                Comment

                • epigen
                  Senior Member
                  • May 2010
                  • 101

                  #23
                  get discordant pairs

                  Originally posted by gpcr View Post
                  Aaron,
                  I have a merged BAM file resulting from multiple lanes of paired-end alignments. When I extracted FASTQ from the alignment, I got unequal numbers of reads in pair 1 and pair 2. When I examined the reads in the BAM file, I could see that for some pairs only one mate (either forward or reverse) is aligned and the other is missing. Do I need to append the missing reads from the raw lane data, or exclude them from the analysis?
                  If you extract the discordant pairs from your BAM file like this:

                  samtools view -hb -F 1038 orig.bam > discordant.bam

                  you get reads that have neither flag 2 (proper pair) nor flag 4 (read itself unmapped) nor flag 8 (mate unmapped) nor flag 1024 (is duplicate).
                  If the number of reads1 and reads2 is still not equal, maybe your aligner messed up? As Aaron wrote above, you need to have exactly one alignment per read.

                  By the way, the above also works for coordinate-sorted BAMs. Afterwards you just have to name-sort discordant.bam with the samtools sort -n option.
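
To see why 1038 is the right value, it helps to decode it bit by bit. Here is a small sketch using the flag bits defined in the SAM specification; the helper names `decode` and `passes_filter` are just for illustration.

```python
# Flag bits as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode(flag):
    """List the names of all bits set in a SAM flag, lowest bit first."""
    return [name for bit, name in sorted(SAM_FLAGS.items()) if flag & bit]


def passes_filter(flag, exclude=1038):
    """Mimic `samtools view -F exclude`: keep a read only if no excluded bit is set."""
    return (flag & exclude) == 0


print(decode(1038))       # the four bits that -F 1038 filters out
print(passes_filter(97))  # paired, mate reverse, read1: kept
print(passes_filter(99))  # same but flagged as proper pair: dropped
```

Since 2 + 4 + 8 + 1024 = 1038, the first call lists exactly proper_pair, unmapped, mate_unmapped and duplicate.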

                  Comment

                  • gpcr
                    Member
                    • May 2010
                    • 18

                    #24
                    thanks @epigen

                    Comment

                    • plichel
                      Junior Member
                      • Mar 2010
                      • 9

                      #25
                      breakpoints

                      Does Hydra also use partially mapped reads?
                      I see a lot of soft-clipped alignments in my BWA-aligned SAM file. I am wondering whether this information is used when searching for breakpoints.

                      Edit1:
                      It seems that soft-clipped reads are used implicitly, since their edit distance is often abnormal after clipping (depending on which end is clipped, the outer or the inner distance changes...).

                      It seems to me that simply by looking at soft-clipped reads it might be possible to detect the exact breakpoint position (at single-nucleotide resolution). Why does nobody use this? Am I missing something?
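
To make the idea concrete, here is a minimal sketch of how a candidate breakpoint could be read off a single alignment's POS and CIGAR fields. It is only an illustration of the idea and not something Hydra does; the function name is hypothetical. It reports the aligned reference base that sits directly next to the clipped part of the read.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def softclip_breakpoints(pos, cigar):
    """Return candidate breakpoints as (1-based reference position, side) tuples.

    pos is the SAM POS field (1-based leftmost aligned base).
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips can flank soft clips; they carry no sequence, so drop them.
    core = [(n, op) for n, op in ops if op != "H"]
    # Only these operations consume reference bases.
    ref_len = sum(n for n, op in core if op in "MDN=X")

    found = []
    if core and core[0][1] == "S":
        found.append((pos, "left"))                 # first aligned base
    if len(core) > 1 and core[-1][1] == "S":
        found.append((pos + ref_len - 1, "right"))  # last aligned base
    return found


print(softclip_breakpoints(1000, "20S55M"))  # clip on the left side
print(softclip_breakpoints(1000, "60M15S"))  # clip on the right side
```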
                      Last edited by plichel; 05-19-2011, 08:28 AM.

                      Comment

                      • epigen
                        Senior Member
                        • May 2010
                        • 101

                        #26
                        identify breakpoints with softclipped reads

                        Originally posted by plichel View Post
                        It seems to me that simply by looking at soft-clipped reads it might be possible to detect the exact breakpoint position (at single-nucleotide resolution). Why does nobody use this? Am I missing something?
                        That's a very good question. Maybe softclipped reads don't give enough evidence?

                        Comment

                        • plichel
                          Junior Member
                          • Mar 2010
                          • 9

                          #27
                          I guess it will depend greatly on the coverage and the uniqueness of the alignment.
                          Suppose you have 30x coverage and do indeed see 30 or more soft-clipped reads supporting the breakpoint at a particular position, and the mapper/aligner doesn't report multiple hits. What could lead to a false-positive conclusion here?
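
The counting step in that scenario could be sketched as below. This assumes the clip positions have already been extracted as (chromosome, position) pairs, one per soft-clipped read; the function name and the `min_support` threshold are hypothetical choices for illustration, not values taken from any published tool.

```python
from collections import Counter


def supported_breakpoints(clip_positions, min_support=5):
    """Tally clip positions and keep those seen in at least min_support reads.

    clip_positions: iterable of (chrom, pos) tuples, one per soft-clipped read.
    Returns a sorted list of ((chrom, pos), read_count) tuples.
    """
    counts = Counter(clip_positions)
    return sorted((key, n) for key, n in counts.items() if n >= min_support)


# Toy input: 30 reads clipped at one position, 2 stray reads at another.
clips = [("CHROMOSOME_II", 9172809)] * 30 + [("CHROMOSOME_II", 9172900)] * 2
print(supported_breakpoints(clips))
```

With the toy input only the position backed by 30 reads survives the threshold.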

                          Comment

                          • tez
                            Junior Member
                            • Jul 2011
                            • 4

                            #28
                            The CREST algorithm uses soft-clipped reads, but requires reads of 75 bp or longer (so it is out for my SOLiD project). It doesn't appear to use discordance information though, so there may be some benefit in using multiple tools as part of a workflow.

                            Comment

                            • idan.gabdank
                              Junior Member
                              • Jul 2011
                              • 1

                              #29
                              I am trying to run the typical Hydra workflow, but after Tier 3 I am stuck at the script "pairDiscordants.py".

                              I am getting:

                              Traceback (most recent call last):
                                File "/usr/local/bin/pairDiscordants.py", line 294, in <module>
                                  sys.exit(main())
                                File "/usr/local/bin/pairDiscordants.py", line 290, in main
                                  pairReads(opts.inFile, opts.numMappings, opts.order, opts.dist, opts.minSpan, opts.minConcRange, opts.maxConcRange, opts.mode, opts.anchorThresh, opts.multiThresh, opts.editSlop)
                                File "/usr/local/bin/pairDiscordants.py", line 28, in pairReads
                                  printHydraMappings(pairs, editDistance, editSlop)
                              UnboundLocalError: local variable 'editDistance' referenced before assignment

                              The input file looks like:

                              CHROMOSOME_II 9172809 9172833 000_1000_1326_R3/2 0 -
                              CHROMOSOME_IV 1641291 1641314 000_1000_171_F3/1 0 +

                              The command was:

                              > cat result | pairDiscordants.py -i stdin -m hydra -z 800

                              What am I doing wrong?
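
As general background on the error itself: Python raises UnboundLocalError when a function reads a local variable that was only assigned inside a branch or loop that never ran. The sketch below is a generic illustration and is not the code of pairDiscordants.py. The idea that the variable is set only when a line has enough columns is purely an assumption made for the example, as are the function names and the column index.

```python
def last_edit_distance(lines):
    """Toy function that only assigns edit_distance when a 7th column exists."""
    for line in lines:
        fields = line.split()
        if len(fields) >= 7:
            edit_distance = int(fields[6])  # assigned in this branch only
    # If no line had 7 columns, the name was never bound and this line fails.
    return edit_distance


def try_last_edit_distance(lines):
    """Run the toy function and report the error instead of crashing."""
    try:
        return last_edit_distance(lines)
    except UnboundLocalError as err:
        return "UnboundLocalError: " + str(err)


six_columns = ["CHROMOSOME_II 9172809 9172833 000_1000_1326_R3/2 0 -"]
seven_columns = ["CHROMOSOME_II 9172809 9172833 000_1000_1326_R3/2 0 - 2"]
print(try_last_edit_distance(six_columns))    # reports the error
print(try_last_edit_distance(seven_columns))  # returns the 7th column as an int
```

So one thing worth checking is whether the input has all the columns and the read-name format the script expects, since a code path that is silently skipped would produce exactly this kind of traceback.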

                              Comment

                              • aquinom85
                                Research Bioinformaticist
                                • Dec 2011
                                • 19

                                #30
                                Does Hydra report zygosity of SVs it calls?

                                Comment
