Hi
I used Cufflinks to discover novel isoforms in several RNA-seq libraries using the --GTF-guide option. I then used cuffmerge to merge the reference annotation file with all the transcripts.gtf files generated in the previous step. However, cuffmerge produced a GTF file in which transcripts with different start sites were assigned the same tss_id; see below for an example.
7 Cufflinks exon 90442729 90446104 . + . gene_id "XLOC_038504"; transcript_id "TCONS_00105985"; exon_number "1"; gene_name "Crebzf"; oId "ENSMUST00000061767"; contained_in "TCONS_00105986"; nearest_ref "ENSMUST00000061767"; class_code "="; tss_id "TSS67411"; p_id "P36527";
7 Cufflinks exon 90442781 90447717 . + . gene_id "XLOC_038504"; transcript_id "TCONS_00105986"; exon_number "1"; gene_name "Crebzf"; oId "ENSMUST00000107206"; nearest_ref "ENSMUST00000107206"; class_code "="; tss_id "TSS67411"; p_id "P36526";
Since I'm interested in differential TSS usage, I would like each tss_id to be bound to a unique genomic position, which I think should be the expected result. Please let me know if you have any comments.
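To make the expected result concrete, here is a minimal Python sketch of what I mean by position-bound IDs. It is not part of Cufflinks or cuffmerge: the function name `position_based_tss_ids` and the `POS_TSS` label prefix are hypothetical. It assumes a tab-delimited GTF with exon records, and takes the TSS to be the lowest exon start for transcripts on the + strand and the highest exon end for transcripts on the - strand.

```python
import re


def position_based_tss_ids(gtf_lines):
    """Map each transcript_id to a label that is unique per (chrom, strand, TSS).

    Hypothetical helper for illustration only; the labels it produces are
    unrelated to the tss_id values written by cuffmerge.
    """
    spans = {}  # transcript_id -> [chrom, strand, lowest exon start, highest exon end]
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        tid = match.group(1)
        start, end = int(fields[3]), int(fields[4])
        if tid not in spans:
            spans[tid] = [fields[0], fields[6], start, end]
        else:
            spans[tid][2] = min(spans[tid][2], start)
            spans[tid][3] = max(spans[tid][3], end)

    labels = {}  # (chrom, strand, tss position) -> label
    result = {}
    # Sort by chromosome, start, and transcript id so numbering is reproducible.
    ordered = sorted(spans.items(), key=lambda kv: (kv[1][0], kv[1][2], kv[0]))
    for tid, (chrom, strand, start, end) in ordered:
        tss = end if strand == "-" else start
        key = (chrom, strand, tss)
        if key not in labels:
            labels[key] = "POS_TSS{}".format(len(labels) + 1)
        result[tid] = labels[key]
    return result
```

Under this scheme, the two Crebzf transcripts in the example (starting at 90442729 and 90442781) would receive different labels, while transcripts that share the exact same start would share one.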
Thanks,
Marcelo