Hi,
I am analyzing RNA-Seq data to discover new genes, exons and polymorphic sites. I recently aligned RNA-Seq reads from 7 different tissue samples against a reference genome using TopHat, and cleaned up the alignments through a pipeline of GATK/samtools programs. I ran Cufflinks separately on each sample and merged the resulting GTF files with Cuffmerge.
For both Cufflinks and Cuffmerge, I supplied a GFF3 file of known annotations. Each Cufflinks run produced an annotation file in GTF format containing both transcript and exon records. The Cuffmerge output, however, contains only exon records. The transcript records from the Cufflinks-generated files, and the gene and transcript records from the GFF3 file of known annotations, were not included.
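To make the observation above easy to reproduce, here is a minimal sketch of how the feature types in a GTF/GFF file can be tallied. It only assumes the standard tab-separated layout with the feature type in column 3; the function name `count_feature_types` and the command-line usage are my own illustration, not part of Cufflinks.

```python
from collections import Counter
import sys


def count_feature_types(lines):
    """Tally the feature type (column 3) of each GTF/GFF record."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A valid GTF/GFF record has 9 tab-separated columns.
        if len(fields) >= 9:
            counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: python count_features.py <path to a GTF/GFF file>
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for feature, n in count_feature_types(handle).most_common():
                print(feature, n, sep="\t")
```

Run on a merged file that shows the behavior I describe, this would print a single line for `exon` and nothing for `transcript` or `gene`.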
Is this normal behavior for Cuffmerge? If so, do I need to use Cuffquant and/or Cuffdiff to recover annotations above the level of the exon (i.e., genes, mRNAs and transcripts)?