SEQanswers

Old 01-15-2013, 08:11 AM   #1
narges
Member
 
Location: Finland

Join Date: Aug 2012
Posts: 29
Get the RPKM value of the genes analyzed using DESeq or edgeR

Hi,

I have analyzed RNA-seq data with edgeR and DESeq to find DE genes (BAM files -> HTSeq -> edgeR and DESeq).
For some comparisons I need the RPKM value of each gene. What is the best way to get it?

Thank you in advance.
Old 01-21-2013, 02:22 PM   #2
Gordon Smyth
Member
 
Location: Melbourne, Australia

Join Date: Apr 2011
Posts: 91

The version of edgeR on the Bioconductor developmental repository has a function rpkm().
Old 01-22-2013, 10:07 AM   #3
ThePresident
Member
 
Location: Sherbrooke / Canada

Join Date: Jun 2012
Posts: 72

Quote:
Originally Posted by Gordon Smyth
The version of edgeR on the Bioconductor developmental repository has a function rpkm().
Is it v3.08?
Old 01-22-2013, 03:09 PM   #4
Gordon Smyth
Member
 
Location: Melbourne, Australia

Join Date: Apr 2011
Posts: 91

Quote:
Originally Posted by ThePresident
Is it v3.08?
No, the *development* repository:

http://bioconductor.org/packages/2.1...tml/edgeR.html
Old 01-22-2013, 11:31 PM   #5
narges
Member
 
Location: Finland

Join Date: Aug 2012
Posts: 29

Quote:
Originally Posted by Gordon Smyth
The version of edgeR on the Bioconductor developmental repository has a function rpkm().
Thank you, but may I ask how this function calculates the gene length? My problem is that I do not know how to get the gene length needed to calculate the RPKM values. The GTF file I used is the latest hg19 version from the UCSC Genome Browser. I could do something like this: endposition - startposition + 1. But there are different transcripts for each gene. Which of these transcripts should be used as the source for the gene length? The average? The longest?
Old 01-23-2014, 04:36 PM   #6
sindrle
Senior Member
 
Location: Norway

Join Date: Aug 2013
Posts: 266

I'm wondering the exact same thing.
Old 01-23-2014, 04:48 PM   #7
Gordon Smyth
Member
 
Location: Melbourne, Australia

Join Date: Apr 2011
Posts: 91

An appropriate measure of gene length must be input to rpkm(). Computing gene length is a job for the read count software rather than for the differential expression software because the appropriate measure of gene length depends on the way the reads have been counted.

I use subread and featureCounts:

http://www.ncbi.nlm.nih.gov/pubmed/24227677

to count reads. For most RNA-seq analyses, I count reads that overlap any exon for each gene, so the appropriate measure of gene length is the total exon length. Gene length is returned as part of the output from featureCounts.
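
As a rough illustration of the point above, here is a minimal Python sketch (not featureCounts or edgeR code) that takes total exon length to be the length of the union of a gene's exon intervals and then applies the usual RPKM formula, count * 1e9 / (length in bp * library size). The function names `total_exon_length` and `rpkm` and all the numbers are hypothetical. Whether merging overlapping exons matches the length a given counting tool reports is an assumption that should be checked against that tool's documentation.

```python
def total_exon_length(exons):
    """Length in bp of the union of 1-based, inclusive (start, end) exon intervals."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(exons):
        if current_end is None or start > current_end + 1:
            # Gap before this exon: close the previous merged block, if any.
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent exon: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


def rpkm(count, length_bp, library_size):
    """Reads per kilobase of gene length per million mapped reads."""
    return count * 1e9 / (length_bp * library_size)


# Toy gene whose transcripts share part of an exon (made-up coordinates).
exons = [(100, 199), (150, 249), (400, 499)]
length = total_exon_length(exons)
value = rpkm(count=500, length_bp=length, library_size=20_000_000)
print(length, value)
```

Because shared exon bases are counted once, this avoids having to pick one transcript (longest, average) as the source of the gene length.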
Old 01-23-2014, 09:48 PM   #8
sindrle
Senior Member
 
Location: Norway

Join Date: Aug 2013
Posts: 266

Thanks!

Is there a difference between choosing CPM, logCPM, or FPKM to represent gene expression level if I want to correlate the expression of one gene against, e.g., body weight?

Or the change in expression of one gene against the change in body weight, etc.?
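
One way to explore this question is a small Python sketch. For a single gene, RPKM is just CPM divided by a constant (the gene length in kb), so a Pearson correlation with another variable is the same for both, while a log transform can change it. All numbers below are made up, the helper `pearson` is hypothetical, and the pseudocount of 1 inside the log is an assumption for illustration, not edgeR's prior-count scheme.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up data for one gene across five samples.
counts = [120, 300, 80, 950, 410]
library_sizes = [18e6, 22e6, 15e6, 25e6, 20e6]
body_weight = [61.0, 72.5, 58.0, 95.0, 80.0]
gene_length_bp = 2500

cpm = [c / n * 1e6 for c, n in zip(counts, library_sizes)]
rpkm = [v / (gene_length_bp / 1000) for v in cpm]   # CPM per kb of gene length
log_cpm = [math.log2(v + 1) for v in cpm]           # pseudocount of 1 is an assumption

r_cpm = pearson(cpm, body_weight)
r_rpkm = pearson(rpkm, body_weight)
r_log = pearson(log_cpm, body_weight)
print(r_cpm, r_rpkm, r_log)
```

The length normalization only starts to matter when expression levels are compared between different genes.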
Old 01-29-2014, 09:17 AM   #9
sindrle
Senior Member
 
Location: Norway

Join Date: Aug 2013
Posts: 266

Quote:
Originally Posted by Gordon Smyth
An appropriate measure of gene length must be input to rpkm(). Computing gene length is a job for the read count software rather than for the differential expression software because the appropriate measure of gene length depends on the way the reads have been counted.

I use subread and featureCounts:

http://www.ncbi.nlm.nih.gov/pubmed/24227677

to count reads. For most RNA-seq analyses, I count reads that overlap any exon for each gene, so the appropriate measure of gene length is the total exon length. Gene length is returned as part of the output from featureCounts.
I used HTSeq for counting, so I guess I can run featureCounts on one sample just to get the gene lengths from my UCSC GTF file.

Then import the lengths into edgeR/DESeq fpkm().

Or am I missing something here?
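
A minimal Python sketch of the length-extraction step, assuming the usual featureCounts tabular output: a leading comment line starting with `#`, then a tab-separated header with `Geneid`, `Chr`, `Start`, `End`, `Strand`, `Length`, followed by one count column per BAM file. The function name `read_gene_lengths` and the example rows are hypothetical.

```python
import csv
import io


def read_gene_lengths(handle):
    """Return {gene_id: length_bp} from featureCounts-style tabular text."""
    # Skip the leading '#' comment line(s); the next line is the header.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    return {row["Geneid"]: int(row["Length"]) for row in reader}


# Made-up example in the assumed layout.
example = (
    "# Program:featureCounts (made-up example)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample1.bam\n"
    "geneA\tchr1;chr1\t100;400\t249;499\t+;+\t250\t500\n"
    "geneB\tchr2\t1000\t2999\t-\t2000\t120\n"
)
lengths = read_gene_lengths(io.StringIO(example))
print(lengths)
```

The lengths would still need to be matched by gene ID to the HTSeq count table, and they are only appropriate if HTSeq counted reads over the same exon features.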

Tags
deseq, edger, rpkm
