SEQanswers

Old 01-07-2013, 07:41 AM   #1
syintel87
Member
 
Location: Universe

Join Date: Dec 2012
Posts: 81
DEG analysis without gff/gtf file

My goal is to see differentially expressed genes across different time points.
However, I want to map all reads based solely on the genome sequence and not on annotated features, because I am not certain that my annotation of the genome is correct or complete. So I do not want to use an annotation.

In this case, after running TopHat without the "-G" (annotation) option,
what approaches could be used in the next step other than HTSeq or Cufflinks/Cuffdiff?

I have been told that Cufflinks/Cuffdiff is not very powerful for detecting DEGs, and have been advised to use HTSeq with edgeR/DESeq instead. However, HTSeq requires a GFF file as input, so I need to take another approach. Could you please suggest other programs that could be used in my case?

Thanks in advance.

Last edited by syintel87; 01-07-2013 at 08:06 AM.
Old 01-07-2013, 07:50 AM   #2
bernardo_bello
Member
 
Location: Spain

Join Date: May 2012
Posts: 51

Hi syintel87,

I have recently been looking for an RNA-Seq analysis pipeline and had the same question as you. As far as I know, in all cases (whether de novo assembly or reference-based mapping) you are going to need a GFF3/GTF file.

Bernardo
Old 01-07-2013, 10:19 AM   #3
syintel87
Member
 
Location: Universe

Join Date: Dec 2012
Posts: 81
How to get DEG without gtf/gff?

Is there a way to achieve my goal, which is to see differentially expressed genes across different time points, without a gff/gtf file?

If I use the annotation file, reads will only be assigned to annotated genes. This will exclude any reads that map to genes that have yet to be annotated.
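
To make the concern concrete, here is a toy Python sketch of annotation-based counting. It is not how any particular tool is implemented; the gene name, coordinates, and reads are made up for illustration. Reads that overlap no annotated gene end up in a "no_feature" bin and never reach the differential expression test.

```python
def assign_reads(reads, genes):
    """Tally reads per annotated gene; reads overlapping no gene go to 'no_feature'."""
    counts = {name: 0 for name in genes}
    counts["no_feature"] = 0
    for chrom, start, end in reads:
        # Half-open intervals: a read overlaps a gene if the two ranges intersect.
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and end > g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif not hits:
            counts["no_feature"] += 1
        # Reads hitting several genes are ambiguous and skipped in this toy version.
    return counts


# Hypothetical annotation with a single known gene.
genes = {"geneA": ("chr1", 100, 500)}

# Two reads fall in geneA; three fall in an unannotated locus around chr1:2000-2400.
reads = [
    ("chr1", 120, 170),
    ("chr1", 300, 350),
    ("chr1", 2000, 2050),
    ("chr1", 2100, 2150),
    ("chr1", 2300, 2350),
]

counts = assign_reads(reads, genes)
```

In this toy data the unannotated locus has more reads than the known gene, yet it would be invisible to a gene-level test built only on the annotation.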
Old 01-07-2013, 10:57 AM   #4
bernardo_bello
Member
 
Location: Spain

Join Date: May 2012
Posts: 51

Quote:
Originally Posted by syintel87
Is there a way to achieve my goal, which is to see differentially expressed genes across different time points, without a gff/gtf file?

If I use the annotation file, reads will only be assigned to annotated genes. This will exclude any reads that map to genes that have yet to be annotated.

Well, at some point programs like rQuant, rDiff, DESeq or Cuffdiff are going to need a file with transcripts in order to quantify them in the *.bam files.

Maybe there are other GTF/GFF3-independent tools that I don't know about yet.


Bernardo
Old 01-10-2013, 03:17 PM   #5
adumitri
Member
 
Location: Cambridge, MA

Join Date: Jan 2010
Posts: 27

Even if you do not use Cuffdiff for the DE analysis, you can run Cufflinks on your samples to get sample-specific .gtf files. These annotations (which can contain novel transcripts/genes) can be merged afterwards with a reference .gtf file that you prefer (e.g. Ensembl's) using Cuffmerge, and you can use the resulting merged .gtf file for the DESeq/edgeR analyses.
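
A minimal sketch of that workflow, written as Python that only assembles the command lines and does not run anything. The sample names, directory layout, and reference file paths are hypothetical. The flags used (`cufflinks -o`, `cuffmerge -g/-s/-o`) are the commonly documented ones, but please check them against the versions you have installed.

```python
def cufflinks_cmd(bam, out_dir):
    """Command to assemble transcripts for one sample, with no reference GTF supplied."""
    return ["cufflinks", "-o", out_dir, bam]


def cuffmerge_cmd(manifest, reference_gtf, genome_fasta, out_dir):
    """Command to merge the per-sample assemblies with a reference annotation."""
    return [
        "cuffmerge",
        "-g", reference_gtf,  # reference annotation to merge in (e.g. from Ensembl)
        "-s", genome_fasta,   # genome sequence
        "-o", out_dir,        # merged.gtf is written into this directory
        manifest,             # text file listing one transcripts.gtf path per line
    ]


# Hypothetical sample names and paths, for illustration only.
samples = ["t0", "t6", "t12"]

per_sample = [
    cufflinks_cmd(f"tophat/{s}/accepted_hits.bam", f"work/{s}_clout")
    for s in samples
]

# Cufflinks writes transcripts.gtf into each sample's output directory;
# these lines would be saved as the manifest file passed to Cuffmerge.
manifest_text = "\n".join(f"work/{s}_clout/transcripts.gtf" for s in samples)

merge = cuffmerge_cmd(
    "work/assemblies.txt", "ref/genes.gtf", "ref/genome.fa", "work/merged"
)
```

Each list can be handed to `subprocess.run(cmd, check=True)` once the paths are real. The resulting merged.gtf is then the annotation you give to your read-counting step before DESeq/edgeR.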
Old 01-10-2013, 03:50 PM   #6
syintel87
Member
 
Location: Universe

Join Date: Dec 2012
Posts: 81

Quote:
Originally Posted by adumitri
Even if you do not use Cuffdiff for the DE analysis, you can run Cufflinks on your samples to get sample-specific .gtf files. These annotations (which can contain novel transcripts/genes) can be merged afterwards with a reference .gtf file that you prefer (e.g. Ensembl's) using Cuffmerge, and you can use the resulting merged .gtf file for the DESeq/edgeR analyses.

Oh, that is so helpful!
Thank you so much!
That merged GTF file is exactly what I want to have!
Old 01-11-2013, 01:06 AM   #7
syfo
Just a member
 
Location: Southern EU

Join Date: Nov 2012
Posts: 103

Quote:
Originally Posted by syintel87
Is there a way to achieve my goal, which is to see differentially expressed genes across different time points, without a gff/gtf file?

Well... it may sound silly, but to identify *differentially expressed* genes you need to identify *genes*.
Either you provide them as known data in the form of an annotation file (GTF/GFF/BED/etc.) or you'll have to infer them from the reads, which is a very challenging task if you expect complete gene models. What you typically get are differentially expressed "genomic regions", also called "transcribed fragments" (transfrags) or "transcriptionally active regions" (TARs), rather than complete "genes".

As adumitri indicated, you can use Cufflinks (or BEDTools) to extract those transcribed regions from the mapped reads and merge them with some reference annotation so that you can probe known and unknown regions.
I would just recommend pooling the reads from all the samples together, along with the reference annotation, so that the statistical method you choose next considers the exact same set of regions across samples/conditions. You should then find differentially expressed regions. Deciding whether two transcribed regions belong to the same gene/transcript is another question.
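
The idea of a single shared region set can be sketched in a few lines of Python. This is a deliberately simplified toy: it merges plain intervals, whereas Cuffmerge works on transcript structures, and all coordinates and sample names below are invented for illustration.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals (half-open)."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def count_reads(regions, reads):
    """Count reads overlapping each region (a read spanning two regions counts in both)."""
    return [
        sum(1 for c, s, e in reads if c == chrom and s < end and e > start)
        for chrom, start, end in regions
    ]


# Hypothetical transcribed regions called separately in each sample.
sample_regions = {
    "t0": [("chr1", 100, 200), ("chr1", 500, 600)],
    "t6": [("chr1", 150, 260), ("chr2", 10, 90)],
}
# Hypothetical reference annotation.
reference = [("chr1", 480, 620)]

# Pool everything first, then merge once, so every sample is scored
# against the exact same set of regions.
all_regions = reference + [r for regs in sample_regions.values() for r in regs]
common = merge_intervals(all_regions)

# Hypothetical mapped reads per sample.
sample_reads = {
    "t0": [("chr1", 120, 170), ("chr1", 510, 560), ("chr1", 520, 570)],
    "t6": [("chr1", 180, 230), ("chr2", 20, 70)],
}
count_table = {name: count_reads(common, reads) for name, reads in sample_reads.items()}
```

Because `common` is built once from the pooled input, each sample's count vector has the same length and the same meaning per position, which is what the downstream statistics need. The chr2 region seen only in t6 still gets an explicit zero in t0 instead of being missing.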