I've performed RNA-Seq analysis for A. thaliana using the TopHat/Cufflinks pipeline for 2 conditions. How can I extract the sequences of all isoforms of the genes which are differentially alternatively spliced b/w the two conditions? (Significant genes as output by cuffdiff in the splicing.diff file)
While browsing through the splicing.diff output file, I found that the coordinates for genes mentioned in the file are nowhere to be found in the transcript.gtf file produced by Cufflinks.
Sample splicing.diff file:
Sample transcript.gtf file:
As you can see, the gene_id in the transcript.gtf file is also not same as that in the splicing.diff file, which further complicates extraction.
I've tried gffreads, but since my GeneID's in gtf and splicing.diff files are different, I'm unable to extract the isoforms for only those genes which are differentially spliced b/w the two conditions.
While browsing through the splicing.diff output file, I found that the coordinates for genes mentioned in the file are nowhere to be found in the transcript.gtf file produced by Cufflinks.
Sample splicing.diff file:
TSS1 XLOC_000001 AT1G01010 Chr1:3630-5899 WT20 WT37 NOTEST 0 0 0 0 1 1 yes
TSS10 XLOC_000007 AT1G01180 Chr1:75582-76758 WT20 WT37 NOTEST 0 0 0 0 1 1 yes
TSS10 XLOC_000007 AT1G01180 Chr1:75582-76758 WT20 WT37 NOTEST 0 0 0 0 1 1 yes
Chr1 Cufflinks transcript 11903 12897 1000 - . gene_id "WT_37.1"; transcript_id "WT_37.1.1"; FPKM "0.6185344748"; frac "1.000000"; conf_lo "0.341917"; conf_hi "0.895152"; cov "2.561616";
Chr1 Cufflinks exon 11903 12897 1000 - . gene_id "WT_37.1"; transcript_id "WT_37.1.1"; exon_number "1"; FPKM "0.6185344748"; frac "1.000000"; conf_lo "0.341917"; conf_hi "0.895152"; cov "2.561616";
Chr1 Cufflinks exon 11903 12897 1000 - . gene_id "WT_37.1"; transcript_id "WT_37.1.1"; exon_number "1"; FPKM "0.6185344748"; frac "1.000000"; conf_lo "0.341917"; conf_hi "0.895152"; cov "2.561616";
I've tried gffreads, but since my GeneID's in gtf and splicing.diff files are different, I'm unable to extract the isoforms for only those genes which are differentially spliced b/w the two conditions.