Thread | Thread Starter | Forum | Replies | Last Post |
Fusion or Chimeric gene discovery tools | Debu_013 | Bioinformatics | 3 | 07-10-2013 01:05 PM |
Gene discovery | point blank | Bioinformatics | 3 | 04-25-2013 11:32 PM |
Cuffcompare codes and use in novel non-coding gene discovery | warrenemmett | Bioinformatics | 0 | 04-11-2013 06:53 AM |
PubMed: In silico analysis of the exome for gene discovery. | Newsbot! | Literature Watch | 0 | 11-09-2011 11:20 AM |
#1 |
Member
Location: Boston Join Date: Nov 2013
Posts: 13
I have two questions regarding novel gene discovery using Cufflinks:
1.) Does anyone have a good answer for what the appropriate FPKM cutoff should be during gene discovery using Cufflinks? I have approximately 13,000 "novel genes" identified using Cufflinks, but many of them have very low FPKMs. What FPKM should a gene have in order to be accepted as a true novel gene? Is there a consensus? I was thinking of using an FPKM of 5 but was unsure whether this is too high.
2.) I have run RNA-seq data from 4 insect stages (all with coverage of approximately 140-200 million reads) through Cufflinks and Cuffcompare. According to the file [name].tracking, no novel gene is present in more than one of the time points. Am I misinterpreting the results?
Thanks everyone!
#2 |
Junior Member
Location: China Join Date: Jul 2013
Posts: 3
Hi, AJenkins. How did you find the 13,000 novel genes? Can you describe the procedure?
Thx
#4 |
Member
Location: Boston Join Date: Nov 2013
Posts: 13
I aligned upwards of 600 million reads to the reference genome with the Tuxedo suite of programs and ran a RABT (reference annotation based transcript) assembly. This gives me around 10,000 completely unknown genes that aren't associated with any isoforms or known genes. I want to be able to say, with confidence, that the FPKM of an unknown gene indicates that the gene has full coverage and is fully represented.
The problem is that I cannot find a good way to set this cutoff point. Any ideas?
#5 |
Member
Location: Berlin Join Date: Oct 2010
Posts: 71
Whether a gene counts as novel depends on which reference annotation you are using. If there is more than one source, start by filtering out everything that overlaps any known gene. Next, you can filter for genes with more than one exon, as Cufflinks reports a lot of false-positive (FP) exons and spliced transcripts are easier to validate in the lab. Beyond that, there is no general rule for an FPKM cut-off. In fact, there is still ongoing discussion about whether FPKM is a good representation of expression at all, so I would just look at the FPKM distribution and then choose a cut-off that seems reasonable.
Best, René
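
The filtering steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the record fields (`id`, `num_exons`, `fpkm`, `overlaps_known`) and the function names are hypothetical and are not part of Cufflinks or Cuffcompare output, so filling them in from your own assembly files is left to you. The cut-off of 1.0 used at the end is an arbitrary placeholder, not a recommendation.

```python
import math
from collections import Counter


def filter_candidates(transcripts):
    """Keep transcripts that overlap no known gene and have more than one exon."""
    return [
        t for t in transcripts
        if not t["overlaps_known"] and t["num_exons"] > 1
    ]


def log10_fpkm_histogram(transcripts, pseudocount=0.01):
    """Count transcripts per integer log10(FPKM) bin.

    The pseudocount (an assumption of this sketch) avoids log10(0)
    for transcripts with an FPKM of exactly zero.
    """
    bins = Counter(
        math.floor(math.log10(t["fpkm"] + pseudocount)) for t in transcripts
    )
    return dict(sorted(bins.items()))


def apply_cutoff(transcripts, min_fpkm):
    """Keep transcripts whose FPKM is at least min_fpkm."""
    return [t for t in transcripts if t["fpkm"] >= min_fpkm]


# Toy records, invented purely for illustration.
toy = [
    {"id": "T1", "num_exons": 3, "fpkm": 12.0, "overlaps_known": False},
    {"id": "T2", "num_exons": 1, "fpkm": 40.0, "overlaps_known": False},
    {"id": "T3", "num_exons": 2, "fpkm": 0.3, "overlaps_known": False},
    {"id": "T4", "num_exons": 4, "fpkm": 8.0, "overlaps_known": True},
    {"id": "T5", "num_exons": 2, "fpkm": 1.5, "overlaps_known": False},
]

candidates = filter_candidates(toy)        # drops T2 (single exon) and T4 (overlap)
hist = log10_fpkm_histogram(candidates)    # inspect this before picking a cut-off
kept = apply_cutoff(candidates, 1.0)       # 1.0 is a placeholder value

print(hist)
print([t["id"] for t in kept])
```

Looking at the histogram (or a proper density plot of log10 FPKM) before choosing `min_fpkm` follows the suggestion above: the cut-off is picked from the distribution rather than from a fixed rule.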