SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
Different fpkm values for cuffdiff and cuffcompare madsaan Bioinformatics 3 12-12-2012 04:14 PM
run cufflink, cuffcompare and cuffdiff workflow Robin RNA Sequencing 5 07-16-2012 03:21 PM
questions for cuffcompare and cuffdiff output liuxq Bioinformatics 0 01-04-2011 04:17 AM
compare expression? cuffcompare or cuffdiff vebaev RNA Sequencing 6 12-21-2010 09:59 PM
cuffcompare vs cuffdiff? jli525 Bioinformatics 0 06-08-2010 10:40 AM

Reply
 
Thread Tools
Old 12-02-2010, 10:25 AM   #1
sterding
Member
 
Location: Boston

Join Date: Sep 2010
Posts: 36
Question issues found in using cufflinks/cuffcompare/cuffdiff

I already wrote to Cole for the following questions. Post here in case others have the relevant questions (or answer).

-----------------------
Hi, Cole

THanks first for the great tools you designed.

I have found several issues during using the tools. I'm not sure it's reported by others already or not. Just let you know in case:

1. Cufflinks -M mask.gtf seems not really mask reads mapped to mask.gtf file, when -G is set.
I have checked the results from the following command:
cufflinks -o \$HOME/scratch/HD_$inputfile/cufflinks_allknown -j 0.05 -p 8 -Q 0 -G $BOWTIE_INDEXES/hg19.gtf -M $BOWTIE_INDEXES/hg19.rRNA_MtRNA.gtf -r $BOWTIE_INDEXES/hg19.fa \$HOME/scratch/HD_$inputfile/tophat.mm2.accepted_hits.hg19.sam

In the output file ../cufflinks_allknown/transcripts.gtf, I still can see rRNA and mitochondria RNAs there. Shouldn't they be masked out from the output gtf when -M is set?

2. What's the suggested parameters for detecting novel transcripts using cufflinks?

I think detecting novel transcripts is one of the popular tasks for RAN-seq processing. But I found only very short "transcripts" were assembled by cufflinks if without setting option -G. I compared the output from cufflinks (without -G) with annotated transcript in UCSC, and it's obviously seen that only a small fraction of reads are used to assemble the 'transcripts' (and they are shorter than the annotated ones). OK, if I switched to use -G in cufflinks, the output will of course show FPKM for each annotated transcript in the GTF; in this case, I cannot know how many reads are used to calculated that FPKM value (I can't find way to track this. Educate me if there is a way, please) So, by comparing these two results, I have an intuitive concern: even I use -G option, it might use only a small fraction of reads for calculating the FPKM? Can the fraction of used reads be increased if I turn on the -G option? I am slightly doubting this

3. Option of "-o <outprefix>" in Cuffcompare should not contain '.' in the outprefix path.

For example, if I set "cuffcompare -o /home/projects/HD.neuro/results/cuffcompare ...", the output of cuffcompare will be output to
/home/projects/HD.combined.gtf
/home/projects/HD.loci
/home/projects/HD.tracking

I guess this is because you wrongly stop until the first '.' in the path.

4. How is the Test status NOTEST decided?

In my cuffdiff output file gene_exp.diff, I saw many comparisons with test status NOTEST. It says "not enough alignments for testing" in the manual, but in my result file, I saw some genes with comparable FPKM values in both samples, but they are marked as NOTEST; while other genes with 0 FPKM in both sample, and they are marked as OK (e.g. the 2nd record in below example)

Another confusion for the 'yes/no' significant column, It says "depending on whether p is greater then the FDR after Benjamini-Hochberg correction for multiple-testing". However, for cases with 0 uncorrected p-value, some are marked as 'yes', some are marked as 'no'. I am confused which one I should use to select differentially expression genes. The last column ?

$ tail gene_exp.diff
XLOC_048056 J01415.24 chrM:8293-8364 HD_B4430 HD_B4189 NOTEST 0 0 0 0 1 no
XLOC_048057 J01415.19,J01415.2,J01415.21,J01415.23,J01415.5,J01415.8,MT-ATP6,MT-CO3,MT-ND3,MT-ND4,MT-ND4L,MT-ND5 chrM:8364-14742 HD_B4430 HD_B4189 OK 0 0 0 0 1no
XLOC_048058 J01415.15,MT-CYB chrM:14745-15953 HD_B4430 HD_B4189 NOTEST 0 0 0 0 1 no
XLOC_048059 J01415.22 chrM:3305-4400 HD_B4430 HD_B4189 NOTEST 3695.47 2047.85 -0.590317 21.4283 0 no
XLOC_048060 J01415.1 chrM:5585-5655 HD_B4430 HD_B4189 NOTEST 1949.56 1234.71 -0.456771 12.5587 0 no
XLOC_048061 J01415.18 chrM:5655-5729 HD_B4430 HD_B4189 NOTEST 970.36 667.756 -0.373745 7.43324 1.05915e-13 no
XLOC_048062 J01415.11,J01415.14 chrM:5759-5891 HD_B4430 HD_B4189 OK 7858.87 5882.96 -0.289583 16.7969 0 yes
XLOC_048063 J01415.13 chrM:5902-7514 HD_B4430 HD_B4189 OK 14129.9 10391.4 -0.30732 23.7808 0 yes
XLOC_048064 J01415.16,MT-ND6 chrM:8364-14742 HD_B4430 HD_B4189 OK 55722.5 25892.8 -0.766417 77.8562 0 yes
XLOC_048065 J01415.20 chrM:15954-16023 HD_B4430 HD_B4189 OK 14668.5 5896.61 -0.911327 59.1021 0 yes

So many questions, I know. Sorry for that.

Regards,

Xianjun
sterding is offline   Reply With Quote
Old 12-08-2010, 01:46 PM   #2
sterding
Member
 
Location: Boston

Join Date: Sep 2010
Posts: 36
Default

I hope Cole have time for these questions
sterding is offline   Reply With Quote
Old 01-12-2011, 10:36 AM   #3
labrat73
Member
 
Location: Louisiana

Join Date: Nov 2010
Posts: 12
Smile annotated rRNA/MtRNA .gtf file

Greetings sterding-

I noticed that you have an annotated human rRNA_MtRNA file. I am trying to acquire such a file so that I may use the -M option when running cufflinks. Could you please advise on how to acquire/generate the file? Any assistance you could offer would be greatly appreciated.

Many thanks
labrat73 is offline   Reply With Quote
Old 01-12-2011, 12:48 PM   #4
sterding
Member
 
Location: Boston

Join Date: Sep 2010
Posts: 36
Default

I simply extracted the record with keywords 'rRNA' or 'Mt' from hg19.gtf file by using command 'grep'.
sterding is offline   Reply With Quote
Old 01-12-2011, 01:25 PM   #5
labrat73
Member
 
Location: Louisiana

Join Date: Nov 2010
Posts: 12
Default

Many, many, thanks sterding for your quick reply! Works great!
labrat73 is offline   Reply With Quote
Old 06-01-2011, 08:04 PM   #6
edge
Senior Member
 
Location: China

Join Date: Sep 2009
Posts: 199
Default

Hi sterding,

Is it better you using -g instead of -G in your situation?
edge is offline   Reply With Quote
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 04:15 PM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO