
**hsmart** · 02-01-2011, 08:00 AM

Dear All,

I am trying to use Cufflinks to analyze RNA-seq data from two cell lines. I used the following commands:
cuffcompare -i ~/Cufflink_files.txt -r ~/Homo_sapiens.GRCh37.60.gtf -R -p cell1_cell2 -o ~/cell1_cell2_results.txt
cuffdiff -L cell1,cell2 -p 4 -N --FDR 0.05 -r ~/hg19.fa ~/cell1_cell2_results.combined.gtf ~/cell1_accepted_hits.bam ~/cell2_accepted_hits.bam -o ~/Cuffdiff/
I have a few questions regarding the Cufflinks output:
(1) None of the .diff output files (gene, cds, isoform, promoters, splicing, etc.) have a gene name or gene ID associated with them. How can I generate output with gene names? Do I need to change any parameter in cuffcompare?
(2) The locus region in gene_exp.diff seems quite large (for example, about 32 kb, covering a cluster of 3 genes). How does Cufflinks define a gene and its boundaries?
(3) The locus region in isoform_exp.diff also seems quite large (for example, about 10 kb, covering an entire gene). How does Cufflinks define an isoform and its boundaries?
(4) What statistical method does Cufflinks use to calculate the uncorrected p-value?
(5) What is the meaning of columns 7 and 8 (Reserved, with a value of 0) in the splicing.diff file?
(6) How do FPKM and RPKM compare in absolute terms when deciding whether a gene is expressed above background?

I would really appreciate your help with these issues.

Thanks,

Rakesh
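
On question (6), it may help to write the definition out. The sketch below computes FPKM from its standard definition: fragments mapped to a transcript, divided by the transcript length in kilobases and by the total number of mapped fragments in millions. RPKM uses the same formula with reads in place of fragments, so for single-end data the two values coincide. The function name `fpkm` and the numbers in the example are made up for illustration. Cufflinks estimates fragment abundances with a statistical model rather than by simple counting, so this is only the definition, not its implementation.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    length_kb = transcript_length_bp / 1_000
    library_millions = total_mapped_fragments / 1_000_000
    return fragments / (length_kb * library_millions)


# Hypothetical numbers: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
example = fpkm(500, 2_000, 10_000_000)
```

Because both units are normalized the same way, any "above background" cutoff has to be chosen for the data set at hand; the formula itself does not supply one.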

**honey** · 02-01-2011, 09:31 AM

GTF file

Rakesh,

One of the first things you may want to do is use the correct reference annotation GTF file, as described in this post:

GTF reference files that work with TopHat/Cufflinks - SEQanswers

http://seqanswers.com/forums/showthread.php?t=8247


This will add gene names etc. and will solve some of your issues. As for the 0 in columns 7 and 8, those are simply reserved columns in the output format.

Best

**hsmart** · 02-01-2011, 12:12 PM

Hi,

Thank you so much for your help.
It worked perfectly with
awk '{print "chr"$0}' Homo_sapiens.GRCh37.60.gtf | sed 's/chrMT/chrM/g' > hg19.ensembl-for-tophat.gtf
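
The one-liner above can also be sketched in Python. This is a minimal sketch, assuming an Ensembl-style GTF whose first tab-separated column holds chromosome names without the `chr` prefix. The function names `to_ucsc_line` and `convert_gtf` are hypothetical and are not part of Cufflinks or TopHat. Unlike the awk/sed version, it leaves `#` comment lines untouched and renames only the first column.

```python
def to_ucsc_line(line):
    """Convert one Ensembl-style GTF line to UCSC-style chromosome naming."""
    # Leave comment/header lines and blank lines unchanged.
    if line.startswith("#") or not line.strip():
        return line
    seqname, sep, rest = line.partition("\t")
    # Ensembl names the mitochondrial genome "MT"; UCSC hg19 uses "chrM".
    seqname = "chrM" if seqname == "MT" else "chr" + seqname
    return seqname + sep + rest


def convert_gtf(src_path, dst_path):
    """Stream src_path to dst_path, renaming chromosomes line by line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(to_ucsc_line(line))
```

Calling `convert_gtf("Homo_sapiens.GRCh37.60.gtf", "hg19.ensembl-for-tophat.gtf")` would write the renamed file. As with the one-liner, unplaced scaffolds simply get a `chr` prefix, which may not match the names in a UCSC FASTA file.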

I really appreciate your help,

Best wishes,

Rakesh

Cufflinks question
