SEQanswers

Old 02-07-2012, 08:26 PM   #1
nwalwort
Junior Member
 
Location: CA

Join Date: Feb 2012
Posts: 4
Cufflinks reporting different FPKMs for the same gene

Hello,

I am analyzing some bacterial RNA-seq data with Cufflinks. Since splicing isn't an issue in bacterial RNA-seq, I am using a mapping program that does not take splicing into account. I just want FPKMs for my genes so I can do some differential expression, even though I am aware that Cufflinks was created for eukaryotic RNA-seq. The FAQ states that Cufflinks will work with bacterial RNA-seq provided that I map against a FASTA file of already annotated genes. I know Cufflinks assembles transcripts, but when I feed it my SAM file (generated with the mapping program PerM by mapping reads to a multifasta file of the genes in the genome), Cufflinks reports several regions of a single gene in separate rows, each with its own FPKM. I just want one FPKM per gene. Does anyone know a way to resolve this? Cheers.
Old 02-08-2012, 01:56 PM   #2
Nicolas
Member
 
Location: new york city

Join Date: Apr 2009
Posts: 40

Could it be multiple isoforms of the same gene?
Old 02-08-2012, 02:02 PM   #3
nwalwort
Junior Member
 
Location: CA

Join Date: Feb 2012
Posts: 4

Hey Nicolas,

That's what I was thinking it might be. However, I looked at the read pileups in IGV, and Cufflinks is assembling separate transcripts de novo within the same gene based on where reads cluster. For example, for one gene that I am mapping reads to, instead of calculating a single FPKM from all reads hitting that gene, Cufflinks splits the gene into thirds based on where reads pile up, calculates a separate FPKM for each of the three regions, and reports them as different "genes" (transcripts). So rather than different isoforms, it looks like it is just splitting up genes based on where reads fall. I am also using a mapping program that is unaware of splicing. I am trying my luck with a few other programs to compare.
Old 02-08-2012, 02:13 PM   #4
Nicolas
Member
 
Location: new york city

Join Date: Apr 2009
Posts: 40

Could you post the command you're using?
Which Cufflinks mode are you using, de novo (default), with a reference annotation (-G) or RABT (-g)?
Is there complete coverage of your gene? If not (and if you're using de novo mode), then Cufflinks has no evidence that the three regions actually belong to one single gene...

Please provide more info.
Old 02-08-2012, 02:28 PM   #5
nwalwort
Junior Member
 
Location: CA

Join Date: Feb 2012
Posts: 4

Hello Nicolas, my command is below:

cufflinks -N -u seq1_380-380_r1_out.sorted.sam

It is in default mode, I think. I aligned my reads to a multifasta file of annotated genes in the hope that this would be enough for Cufflinks to assign reads only to these genes, but I was wrong: Cufflinks assembled transcripts because no GTF was supplied. I have been searching for a way to obtain or generate a reference GTF file for my bacterium, but I cannot find one. That would probably fix my problem. Do you know how I might generate one from an annotated reference genome in FASTA format? Thank you for your questions! I am still quite new to this.
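
For readers with the same question, here is a rough sketch of one way to build such a GTF when the reads were mapped to a multifasta with one entry per gene. It assumes each FASTA entry can be treated as its own reference sequence carrying a single exon that spans the whole entry, on the "+" strand. The function name `gtf_lines_from_fasta` and the `source` label are made up for illustration; nothing here comes from the Cufflinks documentation, and whether Cufflinks then reports one FPKM per gene has not been tested.

```python
def gtf_lines_from_fasta(fasta_text, source="multifasta"):
    """Yield one GTF exon line per FASTA entry, spanning the whole entry.

    Assumption: every entry is one gene, given in sense orientation,
    so the exon runs from 1 to the entry length on the '+' strand.
    """
    lengths = {}  # entry name -> sequence length (insertion-ordered)
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header without a name")
            name = header[0]  # aligners usually keep only the first token
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)

    for name, length in lengths.items():
        attrs = f'gene_id "{name}"; transcript_id "{name}";'
        # GTF columns: seqname, source, feature, start, end,
        # score, strand, frame, attributes
        yield "\t".join(
            [name, source, "exon", "1", str(length), ".", "+", ".", attrs]
        )
```

The lines could be written to a file and passed with the -G option Nicolas mentions above. The sequence names in the GTF have to match the reference names in the SAM header exactly.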
Old 02-09-2012, 07:08 AM   #6
Nicolas
Member
 
Location: new york city

Join Date: Apr 2009
Posts: 40

Does your multifasta file contain one entry per gene?
If so, it should be easy to count the number of reads mapping to each entry (samtools idxstats <aln.bam>, for instance). You can then normalize by the length of each entry and by library depth to get something similar to FPKM.
I don't think Cufflinks could do what you want, but I am also not sure you really need it!
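
A minimal sketch of that normalization, assuming the tab-separated `samtools idxstats` output (reference name, sequence length, mapped count, unmapped count) has been saved as text. The function name `fpkm_from_idxstats` is hypothetical, and library depth is taken here as the total number of mapped reads across all entries, which is one possible choice. For single-end data this is strictly an RPKM; for paired-end data idxstats counts read segments rather than fragments, so the values are only FPKM-like.

```python
def fpkm_from_idxstats(idxstats_text):
    """Return {entry_name: FPKM-like value} from `samtools idxstats` text.

    value = mapped_reads * 1e9 / (entry_length * total_mapped_reads)
    """
    counts = {}  # entry name -> (length in bp, mapped reads)
    for line in idxstats_text.strip().splitlines():
        name, length, mapped, _unmapped = line.split("\t")
        if name == "*":  # summary line for reads without a reference
            continue
        counts[name] = (int(length), int(mapped))

    # Library depth: total mapped reads over all entries (an assumption).
    total_mapped = sum(mapped for _length, mapped in counts.values())
    if total_mapped == 0:
        return {name: 0.0 for name in counts}

    return {
        name: mapped * 1e9 / (length * total_mapped)
        for name, (length, mapped) in counts.items()
        if length > 0  # skip zero-length entries to avoid dividing by zero
    }
```

Usage would look like `fpkm_from_idxstats(open("idxstats.txt").read())`, where `idxstats.txt` is a placeholder for wherever the `samtools idxstats` output was saved.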
Old 02-09-2012, 08:47 AM   #7
nwalwort
Junior Member
 
Location: CA

Join Date: Feb 2012
Posts: 4

Thank you for the replies, Nicolas. Much appreciated. Good luck with everything.