  • Trying to make sense of cufflinks output (--GTF vs. --GTF-guide)

    Hi guys,

    This isn't the first time this question has been posted on this board, but I haven't found an answer yet.

    I tested Cufflinks on a TCGA data set, using both the --GTF (-G) and --GTF-guide (-g) options:
    Code:
    # --GTF option
    cufflinks -o Test01G -p 4 -G genes.gtf -u --library-type fr-unstranded TCGA.bam
    
    # --GTF-guide option
    cufflinks -o Test11g -p 4 -g genes.gtf -u --library-type fr-unstranded TCGA.bam

    Then I looked at the results for some "housekeeping" genes commonly used in qPCR, such as GAPDH and ACTB.

    First, --GTF option:

    When I grep for 'ACTB|GAPDH' in the isoforms.fpkm_tracking file:
    Code:
    tracking_id	class_code	nearest_ref_id	gene_id	gene_short_name	tss_id	locus	length	coverage	FPKM	FPKM_conf_lo	FPKM_conf_hi	FPKM_status
    NM_002046	-	-	GAPDH	GAPDH	TSS28915	chr12:6643584-6647537	1383	181.604	95.7623	90.8633	100.661	OK
    NM_001256799	-	-	GAPDH	GAPDH	TSS12547	chr12:6644408-6647537	1407	0.432855	0.22825	0.0308549	0.425646	OK
    NM_001101	-	-	ACTB	ACTB	TSS29883	chr7:5566778-5570232	1812	694.784	366.533	361.681	371.386	OK
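
For anyone repeating this lookup, here is a minimal sketch of a search that keeps the header line and matches on the gene columns only. It assumes the tab-separated column order shown above (gene_id in column 4, gene_short_name in column 5). The function name show_genes is hypothetical and not part of Cufflinks.

```shell
# Hypothetical helper (not part of Cufflinks): print the header line plus
# every row whose gene_id (column 4) or gene_short_name (column 5)
# matches the given extended regular expression.
show_genes() {
    pattern="$1"
    tracking_file="$2"
    awk -F '\t' -v pat="$pattern" \
        'NR == 1 || $4 ~ pat || $5 ~ pat' "$tracking_file"
}

# Example call; the path assumes the "-o Test01G" run shown earlier:
# show_genes '^(ACTB|GAPDH)$' Test01G/isoforms.fpkm_tracking
```

Anchoring the pattern with ^ and $ avoids picking up other genes whose names merely contain ACTB or GAPDH.
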
    When I grep for 'ACTB|GAPDH' in the transcripts.gtf file:
    Code:
    chr12	Cufflinks	transcript	6643585	6647537	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6643585	6643735	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "1"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6643976	6644027	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "2"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6645660	6645759	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "3"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6645850	6645956	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "4"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6646086	6646176	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "5"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6646267	6646382	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "6"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6646475	6646556	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "7"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6646750	6647162	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "8"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	exon	6647267	6647537	1000	+	.	gene_id "GAPDH"; transcript_id "NM_002046"; exon_number "9"; FPKM "95.7623252884"; frac "0.997574"; conf_lo "90.863255"; conf_hi "100.661396"; cov "181.604009";
    chr12	Cufflinks	transcript	6644409	6647537	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr12	Cufflinks	exon	6644409	6644635	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; exon_number "1"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr12	Cufflinks	exon	6645660	6645759	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; exon_number "2"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr12	Cufflinks	exon	6645850	6645956	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; exon_number "3"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr12	Cufflinks	exon	6646086	6646176	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; exon_number "4"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr12	Cufflinks	exon	6646267	6646382	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; exon_number "5"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr12	Cufflinks	exon	6646475	6646556	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; exon_number "6"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr12	Cufflinks	exon	6646750	6647162	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; exon_number "7"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr12	Cufflinks	exon	6647267	6647537	2	+	.	gene_id "GAPDH"; transcript_id "NM_001256799"; exon_number "8"; FPKM "0.2282504343"; frac "0.002426"; conf_lo "0.030855"; conf_hi "0.425646"; cov "0.432855";
    chr7	Cufflinks	transcript	5566779	5570232	1000	-	.	gene_id "ACTB"; transcript_id "NM_001101"; FPKM "366.5331600779"; frac "1.000000"; conf_lo "361.680624"; conf_hi "371.385696"; cov "694.783972";
    chr7	Cufflinks	exon	5566779	5567522	1000	-	.	gene_id "ACTB"; transcript_id "NM_001101"; exon_number "1"; FPKM "366.5331600779"; frac "1.000000"; conf_lo "361.680624"; conf_hi "371.385696"; cov "694.783972";
    chr7	Cufflinks	exon	5567635	5567816	1000	-	.	gene_id "ACTB"; transcript_id "NM_001101"; exon_number "2"; FPKM "366.5331600779"; frac "1.000000"; conf_lo "361.680624"; conf_hi "371.385696"; cov "694.783972";
    chr7	Cufflinks	exon	5567912	5568350	1000	-	.	gene_id "ACTB"; transcript_id "NM_001101"; exon_number "3"; FPKM "366.5331600779"; frac "1.000000"; conf_lo "361.680624"; conf_hi "371.385696"; cov "694.783972";
    chr7	Cufflinks	exon	5568792	5569031	1000	-	.	gene_id "ACTB"; transcript_id "NM_001101"; exon_number "4"; FPKM "366.5331600779"; frac "1.000000"; conf_lo "361.680624"; conf_hi "371.385696"; cov "694.783972";
    chr7	Cufflinks	exon	5569166	5569294	1000	-	.	gene_id "ACTB"; transcript_id "NM_001101"; exon_number "5"; FPKM "366.5331600779"; frac "1.000000"; conf_lo "361.680624"; conf_hi "371.385696"; cov "694.783972";
    chr7	Cufflinks	exon	5570155	5570232	1000	-	.	gene_id "ACTB"; transcript_id "NM_001101"; exon_number "6"; FPKM "366.5331600779"; frac "1.000000"; conf_lo "361.680624"; conf_hi "371.385696"; cov "694.783972";
    I think those results are quite reasonable.


    Then, I looked at the results for the --GTF-guide option:
    When I grep for 'ACTB|GAPDH' in the isoforms.fpkm_tracking file:
    Code:
    NM_002046	-	-	GAPDH	-	-	chr12:6643584-6647537	1383	182.046	95.9885	91.0819	100.895	OK
    NM_001256799	-	-	GAPDH	-	-	chr12:6644408-6647537	1407	0	0	0	0.0420819	OK
    There is no ACTB entry, but when I searched by its coordinates instead:
    Code:
    NM_001101	-	-	CUFF.10954	-	-	chr7:5566778-5570232	1812	0	0	0	0.0314856	OK
    Why isn't there any sign of ACTB with the --GTF-guide option? ACTB is typically used as a housekeeping gene in qPCR, so I figured it should be there.

    ACTB does not appear in the transcripts.gtf file either.
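
In the --GTF-guide output above, NM_001101 is listed under gene_id CUFF.10954 with an empty gene_short_name, so a search by gene name misses it. A lookup by tracking_id or locus could be sketched as follows, assuming the same tab-separated layout (tracking_id in column 1, locus in column 7). The function name find_transcript is hypothetical.

```shell
# Hypothetical helper (not part of Cufflinks): print the header line plus
# every row whose tracking_id (column 1) or locus (column 7) is exactly
# equal to the query string.
find_transcript() {
    query="$1"
    tracking_file="$2"
    awk -F '\t' -v q="$query" \
        'NR == 1 || $1 == q || $7 == q' "$tracking_file"
}

# Example calls; the path assumes the "-o Test11g" run shown earlier:
# find_transcript NM_001101 Test11g/isoforms.fpkm_tracking
# find_transcript 'chr7:5566778-5570232' Test11g/isoforms.fpkm_tracking
```

Running the same lookup against both output directories makes it easy to put the two FPKM values for one transcript side by side.
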


    Can someone explain the differences between --GTF and --GTF-guide?

    Thanks in advance.
