
  • #16
    Originally posted by ecSeq Bioinformatics
    Dear ntn12,

    thanks for your comments and questions.

    segemehl itself is not a fusion finder. It is a mapping tool that can detect split reads, and the resulting set of split reads can be used to call fusion genes. However, this has to be done in a separate downstream analysis and is not part of the segemehl algorithm. I hope that makes things clearer.
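To make the "separate downstream analysis" concrete, here is a minimal sketch of how split reads could be aggregated into fusion gene candidates. It is not segemehl's own method or output format: the record layout, the gene annotation of each segment, the function name `call_fusion_candidates` and the `min_support` threshold are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical split-read records: (read_id, gene of first segment, gene of second segment).
# In practice these would come from parsing a mapper's split-read output and
# annotating each segment with a gene; that step is not shown here.
split_reads = [
    ("r1", "BCR", "ABL1"),
    ("r2", "BCR", "ABL1"),
    ("r3", "BCR", "ABL1"),
    ("r4", "TP53", "TP53"),   # both segments in one gene: ordinary splicing
    ("r5", "GAPDH", "ACTB"),  # single supporting read: possibly noise
]


def call_fusion_candidates(records, min_support=2):
    """Count reads per ordered gene pair and keep pairs with enough support."""
    support = Counter(
        (left, right) for _, left, right in records if left != right
    )
    return {pair: n for pair, n in support.items() if n >= min_support}


candidates = call_fusion_candidates(split_reads)
print(candidates)
```

A split read only gives a fusion junction. Calling a fusion gene needs at least this kind of grouping and filtering on top, and real pipelines typically apply many more filters.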

    Ok. I understand now that SEGEMEHL is not a fusion gene finder and that it has never been used as one. It has the same potential to be used for fusion finding as, for example, BLAT/BOWTIE/BWA.

    I got confused because the authors of SEGEMEHL claim in the title of their paper:

    Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014.

    Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from …


    that SEGEMEHL does FUSION DETECTION when actually it does not.



    • #17
      Originally posted by Paul Newport
      Sorry, but I don't understand the list shown on the linked page.

      My questions would be:
      1. Where do these 40 fusion genes come from?
      2. Why does only FusionCatcher find all of these?
      3. Why is this list on the FusionCatcher website?
      I do not know. We have not yet used FusionCatcher. We have been testing TopHat-fusion, FusionMap, ChimeraScan, and FusionFinder. We found it puzzling that all four of these give thousands of candidate fusion genes per sample (some even hundreds of thousands) when we know from the medical literature that there should not be more than 1-3 fusion genes per sample!!! This means that more than 99% of the candidates are false positives.
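As a quick check of that arithmetic, here is a small sketch. It assumes, as stated above, that at most 3 of the candidates in a sample are real; the function name and the example candidate counts are made up for illustration.

```python
def false_positive_fraction(n_candidates, n_true_max=3):
    """Lower bound on the false-positive fraction if at most n_true_max calls are real."""
    if n_candidates <= 0:
        raise ValueError("n_candidates must be positive")
    n_true = min(n_true_max, n_candidates)
    return (n_candidates - n_true) / n_candidates


# Example candidate counts per sample (illustrative, not measured values).
for n in (1_000, 100_000):
    print(n, f"{false_positive_fraction(n):.3%}")
```

With 1,000 candidates and at most 3 true fusions, at least 997 of the calls must be false.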

      UPDATE: We have started testing SOAPfuse and we are starting to like it!
      Last edited by ntn12; 05-19-2014, 05:48 AM.



      • #18
        Originally posted by ntn12
        I got confused because the authors of SEGEMEHL claim in the title of their paper:

        Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014.

        Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from …


        that SEGEMEHL does FUSION DETECTION when actually it does not.
        Dear ntn12,

        please tread lightly here. The title of the paper is very clear and all claims are met. Before reading something into the title, you should actually read the paper. Everything is written in a very clear manner and all claims are confirmed by publicly available data.

        Nevertheless, I do not understand your frustrations here. Perhaps you should directly contact the developers of the algorithm and seek a dialogue.
        ecSeq Bioinformatics is Europe’s leading provider of hands-on bioinformatics workshops and professional data analysis in the field of Next-Generation Sequencing (NGS).



        • #19
          Originally posted by ecSeq Bioinformatics
          Dear ntn12,

          please tread lightly here. The title of the paper is very clear and all claims are met. Before reading something into the title, you should actually read the paper. Everything is written in a very clear manner and all claims are confirmed by publicly available data.
          I am confused about SEGEMEHL even after reading the paper.

          The authors of this paper:

          Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014. http://www.ncbi.nlm.nih.gov/pubmed/24512684

          clearly state in the title and in three other places throughout their article that:

          "Here, we present a unified unbiased algorithm to detect splicing, trans-splicing and gene fusion events from single-end read data..."

          "The algorithmic strategy to identify splicing, trans-splicing or gene fusion sites is based on a greedy, score-based seed chaining followed by a Smith-Waterman-like transition alignment."

          "Implemented in the segemehl mapping tool, it readily identifies conventional splice junctions, collinear and non-collinear fusion transcripts, and trans-spliced RNAs, without the need for separate post-processing or an extensive computational overhead."


          Also, I did not find even a single fusion gene or fusion transcript found by SEGEMEHL in that article. According to the last statement, SEGEMEHL should readily identify fusion transcripts without the need for separate post-processing.

          We will use SOAPfuse for finding fusion genes because it performed really well in our tests.
          Last edited by ntn12; 05-19-2014, 06:04 AM.



          • #20
            Dear ntn12,

            I herewith take notice of your assumption that the segemehl developers wrote some statements which are confusing for you, so you will use SOAPfuse.
            Last edited by ecSeq Bioinformatics; 05-19-2014, 06:59 AM.


            • #21
              Originally posted by ecSeq Bioinformatics
              Dear ntn12,

              I herewith take notice of your assumption that the segemehl developers wrote some statements which are confusing for you, so you will use SOAPfuse.
              That is not an assumption. It is a fact.
              Indeed the authors of "Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014. http://www.ncbi.nlm.nih.gov/pubmed/24512684"

              clearly state in their article that:

              "Implemented in the segemehl mapping tool, it readily identifies conventional splice junctions, collinear and non-collinear fusion transcripts, and trans-spliced RNAs, without the need for separate post-processing or an extensive computational overhead."

              I did not write that. The authors wrote that! Anybody can check this! Please, check here:
              Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from …


              Originally posted by ecSeq Bioinformatics
              I herewith take notice of your assumption that the segemehl developers wrote some statements which are confusing for you, so you will use SOAPfuse.
              I am not the only one who got confused about SEGEMEHL. There are at least two others who are confused about SEGEMEHL and finding fusion genes here:
              https://www.biostars.org/p/45986/
              Last edited by ntn12; 05-19-2014, 06:54 AM.



              • #22
                Originally posted by ntn12
                I am not the only one who got confused about SEGEMEHL. There are at least two others who are confused about SEGEMEHL and finding fusion genes here:
                https://www.biostars.org/p/45986/
                Oh, please! Give me a break! Same statements, same time stamp! Too obvious, man!



                • #23
                  Originally posted by Paul Newport
                  Oh, please! Give me a break! Same statements, same time stamp! Too obvious, man!
                  ???



                  • #24
                    As already mentioned before in this thread:

                    If any of you are interested in learning how to use segemehl to detect fusion transcripts and/or circularized RNAs, I can recommend the following hands-on course:

                    Discovering standard and non-standard RNA transcripts - How to detect canonical splicing, circular RNAs, trans-splicing, and fusion transcripts

                    The developers of the algorithm will explain to you, step by step, how you can use segemehl to detect standard and non-standard transcripts. They will make sure that all of you understand the difference between 'fusion junctions' and 'fusion genes', and what exactly you can do with segemehl and its downstream analysis tools (such as lack or haarz). You will understand the implications of splicing and fusion events and the concept of split reads, learn how to detect splice sites using split-read information, and in the end be able to find circularized RNAs or fusion transcripts.

                    The cool thing about this course: You will not just use (and trust) a tool with pre-defined parameters (like SOAPfuse, etc.), but understand everything from scratch!
                    Last edited by ecSeq Bioinformatics; 05-20-2014, 12:14 AM.


                    • #25
                      I'm interested in giving segemehl a shot, but so far it's taking prohibitively long to run. In my cluster-computer environment I reserved 60 nodes for 24 hours to run:

                      segemehl.x -q 8Gb_single_end.fastq -t 60 -d chromosome1.fa -i chr1.idx -S -s -o chr1.sam

                      It took over 24 hours without completing. No errors were reported, and it did create a SAM file, although an incomplete one. Do you have any tips to make the software run more quickly?



                      • #26
                        Originally posted by NKAkers
                        I'm interested in giving segemehl a shot, but so far it's taking prohibitively long to run. In my cluster-computer environment I reserved 60 nodes for 24 hours to run:

                        segemehl.x -q 8Gb_single_end.fastq -t 60 -d chromosome1.fa -i chr1.idx -S -s -o chr1.sam

                        It took over 24 hours without completing. No errors were reported, and it did create a SAM file, although an incomplete one. Do you have any tips to make the software run more quickly?
                        This very long runtime is probably due to the common mapping strategy of RNA aligners, which first attempt to map reads contiguously (i.e. without splits) and then apply a more expensive split-read mapping strategy to the reads that remain unmapped. Because you map your data to only one chromosome instead of the entire genome, most of your reads cannot be mapped contiguously and are passed to split-read mapping, which results in this huge runtime.

                        Thus, we recommend using the entire genome as the database. This results in a faster runtime and, moreover, more reliable hits, since by default segemehl reports only the best ones.
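To illustrate the argument, here is a toy cost model of such a two-stage strategy. It is only a sketch and not a measurement of segemehl: the per-read costs, the fractions of contiguously mappable reads and the function name `relative_runtime` are all assumed values. It also ignores that a whole-genome index is larger than a single-chromosome one.

```python
def relative_runtime(n_reads, frac_contiguous, cost_contiguous=1.0, cost_split=50.0):
    """Toy two-stage cost: every read gets a contiguous attempt; failures get a split attempt."""
    n_split = n_reads * (1.0 - frac_contiguous)
    return n_reads * cost_contiguous + n_split * cost_split


n_reads = 1_000_000
# Assumed fractions of reads that map contiguously; illustrative values only.
chr1_only = relative_runtime(n_reads, frac_contiguous=0.08)
whole_genome = relative_runtime(n_reads, frac_contiguous=0.90)
print(f"chr1 only is about {chr1_only / whole_genome:.1f}x the whole-genome cost")
```

Under these assumptions the single-chromosome run is several times more expensive, because almost every read falls through to the costly split-read stage.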


                        • #27
                          Originally posted by ntn12
                          I do not know. We have not yet used FusionCatcher. We have been testing TopHat-fusion, FusionMap, ChimeraScan, and FusionFinder. We found it puzzling that all four of these give thousands of candidate fusion genes per sample (some even hundreds of thousands) when we know from the medical literature that there should not be more than 1-3 fusion genes per sample!!! This means that more than 99% of the candidates are false positives.

                          UPDATE: We have started testing SOAPfuse and we are starting to like it!
                          Hi!
                          Is it possible to use SOAPfuse with hg38? If so, how would I do this? I am a bit lost.

                          Thanks in advance!
