SEQanswers

Old 04-07-2012, 11:27 AM   #21
andrewm
Junior Member
 
Location: vancouver, canada

Join Date: Nov 2010
Posts: 8
Default

This bug has now been fixed in deFuse version 0.5.0.

Quote:
Originally Posted by srividya View Post
Hello,

I am trying to use deFuse. I tried to create the 2bit reference genome file but got some errors. Can anyone check whether there is a mistake in the config file that I changed?

/Apps/serial/defuse/defuse-0.4.3/scripts/create_reference_dataset.pl -c config.txt

This is the error I got when I ran create_reference_dataset.pl:

Use of uninitialized value in concatenation (.) or string at /Apps/serial/defuse/defuse-0.4.3/scripts/cmdrunner.pm line 39.

Thanks,
Srividya
Old 04-08-2012, 12:52 AM   #22
NicoBxl
not just another member
 
Location: Belgium

Join Date: Aug 2010
Posts: 264
Default

I'm curious about a comparison of these fusion gene discovery algorithms!
For different genome sizes and genera? And for different types of data?

Is anyone aware of a paper that covers this?
Old 04-18-2012, 11:39 AM   #23
Ichinichi
Member
 
Location: Philadelphia

Join Date: Mar 2010
Posts: 10
Default

Quote:
Originally Posted by salzberg View Post
Several people asked about TopHat. Since last summer (2011), we have released TopHat-Fusion, which can detect fusion transcripts from either single reads or paired end reads. It is very fast and highly effective at filtering out the numerous false positives that plague these types of tools.
Unless you have a master list of all actual fusion events in the cell lines MCF7 and SKBR3, the claim that TopHat-Fusion has "fewer false positives" assumes that the known fusions are the ONLY FUSIONS, a claim that is both unsubstantiated and misleading. How do you distinguish TopHat-Fusion being over-fitted to known data from deFuse being non-specific?

Furthermore, why didn't you guys run PCR on the 42 novel fusions and on 42 randomly selected deFuse fusions, to at least demonstrate the sensitivity and specificity in discovering new fusions? At the very least, rerun TopHat-Fusion on the same data as Table 3 in the deFuse paper and make use of their PCR results.

How many of the 42 TopHat-Fusion novel fusions agreed with the 1670 deFuse novel fusions?
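One way to answer an overlap question like this is to reduce each tool's calls to unordered gene pairs and intersect the two sets. The following is only a minimal sketch: the gene names are made-up placeholders, the helper names are hypothetical, and neither tool emits its calls in this form without some parsing first.

```python
def normalize_pair(gene_a, gene_b):
    """Return an order-independent, case-insensitive key for a gene pair."""
    return tuple(sorted((gene_a.upper(), gene_b.upper())))


def shared_fusions(calls_a, calls_b):
    """Return the set of gene pairs that appear in both call lists."""
    keys_a = {normalize_pair(a, b) for a, b in calls_a}
    keys_b = {normalize_pair(a, b) for a, b in calls_b}
    return keys_a & keys_b


# Placeholder inputs (not real predictions from either tool).
tool_a_calls = [("GENE1", "GENE2"), ("GENE3", "GENE4")]
tool_b_calls = [("gene2", "gene1"), ("GENE5", "GENE6")]

overlap = shared_fusions(tool_a_calls, tool_b_calls)
print(len(overlap), sorted(overlap))
```

Matching on gene pairs ignores the exact breakpoints, so it is a coarse comparison; matching on breakpoint coordinates within some tolerance would be stricter.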

edit: I am in no way affiliated with either group; I just cannot stand lopsided claims in the literature.

Last edited by Ichinichi; 04-18-2012 at 11:46 AM.
Old 04-18-2012, 11:59 AM   #24
salzberg
Member
 
Location: Baltimore

Join Date: Nov 2011
Posts: 11
Default

Ichinichi, if you think there are really 1670 unknown fusion events in one of these cell lines, by all means spend some time trying to validate them.

We don't make "lopsided claims" in our papers, and I'm not sure what your statement refers to. We stand by the results in the TopHat-Fusion paper.
Old 05-01-2012, 07:40 PM   #25
andrewm
Junior Member
 
Location: vancouver, canada

Join Date: Nov 2010
Posts: 8
Default tophat-fusion deFuse comparison

As the deFuse author, I'll add my comments to this discussion.

For the tophat-fusion paper, the authors set the anchor length for deFuse to 13bp. I never intended this parameter to be modified by users, so it isn't propagated everywhere in the pipeline, and changing it has a strange effect on the probabilities calculated for the fusions. The result is more false predictions with a probability that exceeds the threshold (default 0.5).

I reran the MCF7 and SKBR3 libraries setting anchor length to 4bp, and also setting discord_read_trim for MCF7 to 42 as suggested by a warning message produced by deFuse.

With these settings, MCF7 and SKBR3 produce 66 and 126 fusions, respectively, with probability > 0.5.

Generally, I think it's a strength of deFuse that we report more, including events such as novel 5’ exons or unexpected intergenic splicing. If people are not interested in these, they can filter them out. People can also choose their own sensitivity / specificity trade-off by thresholding on the probability score. I think this is preferable to having your results filtered for you.
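Thresholding on the probability score can be sketched roughly as below. This assumes a tab-separated results table with a header row and a column named `probability`; the column names and the example rows are illustrative assumptions, so check them against the header of your own deFuse output.

```python
import csv
import io


def filter_by_probability(tsv_handle, threshold=0.5, column="probability"):
    """Keep rows whose probability score is strictly above the threshold."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    return [row for row in reader if float(row[column]) > threshold]


# Made-up rows standing in for a results file (assumed column names).
example = io.StringIO(
    "cluster_id\tgene_name1\tgene_name2\tprobability\n"
    "1\tGENE1\tGENE2\t0.93\n"
    "2\tGENE3\tGENE4\t0.41\n"
    "3\tGENE5\tGENE6\t0.67\n"
)

# A higher cutoff trades sensitivity for specificity.
for cutoff in (0.5, 0.9):
    example.seek(0)
    kept = filter_by_probability(example, threshold=cutoff)
    print(cutoff, [row["cluster_id"] for row in kept])
```

For a real file, pass an open file handle instead of the `io.StringIO` object.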

Finally, I think tophat-fusion is an important contribution, especially since, as the paper says, it doesn't depend on annotations. It would have been nice to see examples of fusions predicted by tophat-fusion and missed by deFuse because of its reliance on annotated genes. Instead, the authors focused on the specificity of deFuse, using results from an incorrect run of the software. A quick email would have resolved this issue. As a rule, I believe comparisons should be done with the full cooperation of those being compared against, as that would certainly add some reliability to the results.
Old 05-02-2012, 03:46 AM   #26
salzberg
Member
 
Location: Baltimore

Join Date: Nov 2011
Posts: 11
Default

Regarding the comparison to DeFuse in our paper: we ran that at the insistence of the reviewers. As many people know, reviewers often ask for these kinds of "bake-off" evaluations even when not appropriate or necessary. We did not include these in our original manuscript. However I must disagree with andrewm that comparisons "should be done with full cooperation of those being compared to" - this almost never happens. I cannot count the number of times programs of my group have been used in publications as a comparison to some new piece of software, and I don't think I've ever been informed in advance, much less asked to "cooperate."
Old 10-03-2012, 05:09 AM   #27
tankman
Member
 
Location: usa

Join Date: Sep 2012
Posts: 22
Default problem with tophat-fusion-post

Hi Salzberg,

I've been having a lot of trouble getting any kind of nontrivial output from tophat-fusion-post, as per my many posts on SEQanswers, even on the MCF7 example given on the webpage. I am trying to get this example to work so that I can find some reliable fusion events from purely single-end data, and it's been a really frustrating experience.

At this stage, I would very much appreciate a detailed command history that makes that example work, or some tips that you, as a seasoned tophat-fusion user, could provide on why the output of tophat-fusion-post 2.0.4 (or 2.0.3 or earlier) may be empty even though alignment with tophat-fusion ran without problems and fusions.out contains obvious candidates that should pass filtering. Assuming I eventually overcome these problems (hopefully!), can you also advise me on how to proceed with purely SE data? E.g., must num-fusion-pairs = 0 be set, etc.?

Thanks a lot!

tm


Quote:
Originally Posted by salzberg View Post
Several people asked about TopHat. Since last summer (2011), we have released TopHat-Fusion, which can detect fusion transcripts from either single reads or paired end reads. It is very fast and highly effective at filtering out the numerous false positives that plague these types of tools.
Old 02-03-2013, 01:56 AM   #28
ndaniel
Member
 
Location: Helsinki

Join Date: Feb 2009
Posts: 33
Default FusionCatcher

Have you tried FusionCatcher?

http://code.google.com/p/fusioncatcher/

It has found many novel fusion genes in RNA-seq data which were validated later using RT-PCR.

FusionCatcher has found plenty of fusion genes in BT-474, SKBR3, KPL4, and MCF7. Some of the fusion genes it found turned out to be alternatively spliced, and these were also later validated with RT-PCR!

FusionCatcher has been used for finding novel and known fusion genes in the following articles:
- S. Kangaspeska, S. Hultsch, H. Edgren, D. Nicorici, A. Murumägi, O.P. Kallioniemi, Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms, PLOS One, Oct. 2012. http://dx.plos.org/10.1371/journal.pone.0048745
- H. Edgren, A. Murumagi, S. Kangaspeska, D. Nicorici, V. Hongisto, K. Kleivi, I.H. Rye, S. Nyberg, M. Wolf, A.L. Borresen-Dale, O.P. Kallioniemi, Identification of fusion genes in breast cancer by paired-end RNA-sequencing, Genome Biology, Vol. 12, Jan. 2011. http://genomebiology.com/2011/12/1/R6

Quote:
Originally Posted by salzberg View Post
Regarding the comparison to DeFuse in our paper: we ran that at the insistence of the reviewers. As many people know, reviewers often ask for these kinds of "bake-off" evaluations even when not appropriate or necessary. We did not include these in our original manuscript. However I must disagree with andrewm that comparisons "should be done with full cooperation of those being compared to" - this almost never happens. I cannot count the number of times programs of my group have been used in publications as a comparison to some new piece of software, and I don't think I've ever been informed in advance, much less asked to "cooperate."

Last edited by ndaniel; 02-03-2013 at 02:02 AM. Reason: adding info
Old 01-04-2016, 12:00 AM   #29
ninni
Junior Member
 
Location: london

Join Date: Jun 2012
Posts: 8
Default

Quote:
Originally Posted by mjn138 View Post
You may also try out FusionMap, from Omicsoft. http://omicsoft.com/fusionmap/

It works with single end reads as well. The paper is at http://bioinformatics.oxfordjournals...tr310.abstract

It runs on both windows and linux.
How can I use FusionMap with hg38? It does not seem to be possible.
Any help would be highly appreciated.
Old 01-04-2016, 09:15 AM   #30
mjn138
Junior Member
 
Location: Philadelphia, PA

Join Date: Apr 2011
Posts: 3
Default

Just change the reference library to Human.B38 or Human.hg38. You can see a list of precompiled libraries and gene models here: http://www.arrayserver.com/wiki/inde..._from_OmicSoft
Old 01-04-2016, 11:51 PM   #31
ninni
Junior Member
 
Location: london

Join Date: Jun 2012
Posts: 8
Default

Thank you!

Last edited by ninni; 01-05-2016 at 12:07 AM.
All times are GMT -8.