SEQanswers

Old 09-22-2014, 02:59 AM   #1
LeonDK
Member
 
Location: Denmark

Join Date: Sep 2014
Posts: 69
Preferred method for differential gene expression analysis?

Hi all,

Based on [1], I have set up an edgeR analysis, except that I use the robust method of estimating the dispersion [2].

However, having previously worked with non-parametric statistical methods, my attention has turned to [3].

I am new to the whole NGS/RNA-seq world, so it would be great to get some input on which method you prefer for differential gene expression analysis, and why.

Cheers,
Leon

1. http://www.ncbi.nlm.nih.gov/pubmed/23975260
2. http://www.ncbi.nlm.nih.gov/pubmed/24753412
3. http://www.ncbi.nlm.nih.gov/pubmed/23981227
Old 09-22-2014, 11:23 PM   #2
bastianwur
Member
 
Location: Germany/Netherlands

Join Date: Feb 2014
Posts: 98

Review: http://www.ncbi.nlm.nih.gov/pubmed/24020486
Slightly related review: http://www.ncbi.nlm.nih.gov/pubmed/22988256

Personally: I use cuffdiff, mainly because it works without a hassle, is fast, and doesn't require much additional effort for the input.
And I don't know R ^^.
Old 09-23-2014, 10:03 PM   #3
LeonDK
Member
 
Location: Denmark

Join Date: Sep 2014
Posts: 69

Quote:
Originally Posted by bastianwur View Post
Review: http://www.ncbi.nlm.nih.gov/pubmed/24020486
Slightly related review: http://www.ncbi.nlm.nih.gov/pubmed/22988256

Personally: I use cuffdiff, mainly because it works without a hassle, is fast, and doesn't require much additional effort for the input.
And I don't know R ^^.
Hi bastianwur,

Thanks for the references. The consensus seems to be that no method is better than another, each having its own strengths and weaknesses.

Which basically means that scientists are likely to choose the one they "like" for whatever reason: ease of use, prior experience, etc. Not a very scientific approach, imho...

Cheers,
Leon
Old 09-23-2014, 10:10 PM   #4
LeonDK
Member
 
Location: Denmark

Join Date: Sep 2014
Posts: 69

No offence, by the way. It's just frustrating not being able to get a clear-cut answer regarding which method is better.

Cheers,
Leon
Old 09-24-2014, 12:12 AM   #5
kopi-o
Senior Member
 
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319

That's because there *is* no clear cut answer

- If you have a complex design (more than one experimental factor varies), you want to use limma, edgeR or DESeq2. All reviews tend to agree that these are OK. I tend to favor DESeq2.

- Most reviews agree that CuffDiff is pretty bad.

- When you have many biological replicates, SAMSeq (a nonparametric method) is a good alternative.

- Ballgown can do DE analysis on novel transcripts and isoforms (because it's run on an assembly).

- Single-cell RNA-seq needs special considerations.

And so on ...
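
To illustrate what "nonparametric" means in the SAMSeq point above, here is a minimal sketch of a rank-based two-group comparison (the Mann-Whitney U statistic) for a single gene. This is only the underlying rank idea, not SAMSeq's actual procedure, and the function names and toy counts are made up for illustration.

```python
def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """U for group_a: number of (a, b) pairs with a > b, ties counting 0.5."""
    ranks = rank_with_ties(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical normalized counts of one gene (assumption, not real data).
cases = [12, 15, 20, 22]
controls = [3, 5, 8, 9, 12]
u = mann_whitney_u(cases, controls)
print(u, "out of a maximum of", len(cases) * len(controls))
```

Because only ranks are used, a single extreme count cannot dominate the statistic, but with few replicates per group the number of distinct rank configurations is small, which is one reason such methods are usually recommended only when many biological replicates are available.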
Old 09-24-2014, 01:03 AM   #6
LeonDK
Member
 
Location: Denmark

Join Date: Sep 2014
Posts: 69

Quote:
Originally Posted by kopi-o View Post
That's because there *is* no clear cut answer

- If you have a complex design (more than one experimental factor varies), you want to use limma, edgeR or DESeq2. All reviews tend to agree that these are OK. I tend to favor DESeq2.

- Most reviews agree that CuffDiff is pretty bad.

- When you have many biological replicates, SAMSeq (a nonparametric method) is a good alternative.

- Ballgown can do DE analysis on novel transcripts and isoforms (because it's run on an assembly).

- Single-cell RNA-seq needs special considerations.

And so on ...
I understand and accept that "there *is* no clear cut answer", my acceptance however does not dampen my frustrations

My setup is the following: 96 RNA-seq samples run on an Illumina HiSeq 2000 using the Illumina TruSeq Stranded mRNA Sample Prep Kit, with ~1 case per 4 controls.

I have done automated QC using Trim Galore!, followed by mapping to UCSC hg19 using TopHat2, and then counted mapped reads using HTSeq. Each raw FASTQ file contains ~60-80 million reads.
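
As a rough sketch of the step between counting and the DE analysis: htseq-count writes one tab-separated `gene_id<TAB>count` line per gene, followed (in recent versions) by summary rows whose names start with `__`, such as `__no_feature`. The snippet below assumes that layout, drops the summary rows, and computes counts per million for quick inspection. The gene names and numbers are invented.

```python
import io


def read_htseq_counts(handle):
    """Parse one htseq-count output: 'gene_id<TAB>count' per line."""
    counts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        gene, value = line.split("\t")
        # Summary rows such as __no_feature are not genes; skip them.
        if gene.startswith("__"):
            continue
        counts[gene] = int(value)
    return counts


def counts_per_million(counts):
    """Scale raw counts by library size (sum of gene counts) to CPM."""
    library_size = sum(counts.values())
    return {gene: 1e6 * c / library_size for gene, c in counts.items()}


# Hypothetical file contents standing in for real htseq-count output.
toy = "GENE_A\t30\nGENE_B\t70\n__no_feature\t12\n__ambiguous\t3\n"
sample = read_htseq_counts(io.StringIO(toy))
cpm = counts_per_million(sample)
print(sample)
print(cpm)
```

Note that edgeR and DESeq2 expect the raw integer counts as input; CPM values like these are useful for eyeballing or filtering lowly expressed genes, not as a replacement for the count matrix.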

My current analysis of differentially expressed genes has been performed using edgeR Robust by Zhou et al. (doi:10.1093/nar/gku310) and the workflow described by Anders et al. (doi:10.1038/nprot.2013.099).

Newbie here... Am I all good or do you have "crucial" input? ...and could you elaborate on the "complex" versus "simple" design?

Cheers,
Leon

Last edited by LeonDK; 09-24-2014 at 01:08 AM. Reason: Forgot to include mapper
Old 09-24-2014, 01:11 PM   #7
bastianwur
Member
 
Location: Germany/Netherlands

Join Date: Feb 2014
Posts: 98

Quote:
Originally Posted by LeonDK View Post
No offence, by the way. It's just frustrating not being able to get a clear-cut answer regarding which method is better.

Cheers,
Leon
No offence taken, because as kopi-o said: There's no science yet in that part.
Just use the things which work for you and which give you the results you want (yes, I'm maybe naive, lazy, and a bad scientist, but well...in that case I'm fine with it).

Workflow seems so far normal, incorporates everything (no idea if trimgalore also does adapter trimming, but if so: good)...wait...besides rRNA filtering.
That seems to be missing.

A complex design probably means a time series or complicated relations between the conditions. I have no idea what your 96 samples are, but they probably qualify.
Old 09-25-2014, 01:44 AM   #8
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Quote:
Originally Posted by bastianwur View Post
(no idea if trimgalore also does adapter trimming, but if so: good)
It trims adapters.
Old 09-25-2014, 06:07 AM   #9
kopi-o
Senior Member
 
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319

Quote:
Newbie here... Am I all good or do you have "crucial" input? ...and could you elaborate on the "complex" versus "simple" design?
I think your approach seems sound.

By "complex" design I mean that there is more than one experimental factor that varies. For example, let's say you are looking at RNA-seq of tumor and paired normal tissue samples in several individuals. You *could* just compare the tumor vs normal groups, but you would get more statistical power by also considering which patient each sample is from - in other words, you'd model two factors, "individual" and "tumor/normal". This particular case would be a paired design ("paired" because you have "paired" tumor and normal samples from the same patient). This can be done in edgeR, limma, DESeq2, and in fact SAMSeq as well. If you do not use a paired analysis, natural variation between individuals can easily overwhelm the specific signal from the tumor vs normal differences.
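
The power argument for pairing can be seen in a toy calculation. The sketch below compares an unpaired (Welch) t statistic with a paired t statistic for one gene, using made-up log2 expression values in which patients differ a lot from each other but the tumor sample is consistently about one unit above the matched normal. It uses plain t statistics for illustration only; edgeR, limma and DESeq2 fit their own models rather than t-tests like these.

```python
import math
from statistics import mean, stdev


def welch_t(a, b):
    """Unpaired (Welch) t statistic for two independent groups."""
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se


def paired_t(a, b):
    """Paired t statistic computed on within-patient differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical log2 expression of one gene in 5 patients (assumption).
# Position i in both lists refers to the same patient.
normal = [5.0, 7.0, 9.0, 11.0, 13.0]
tumor = [6.1, 7.9, 10.2, 11.8, 14.0]

t_unpaired = welch_t(tumor, normal)
t_paired = paired_t(tumor, normal)
print(round(t_unpaired, 2), round(t_paired, 2))
```

In the unpaired comparison the between-patient spread ends up in the denominator and swamps the one-unit shift; in the paired comparison each patient serves as their own control, so only the variation of the tumor-minus-normal differences matters and the statistic is far larger.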

A more complex design could be a case where you have tumor cultures that have been treated with 3 different drugs at 3 different time points, with matched normals. Here you would perhaps want to model three different factors. It's this kind of scenario that you really need edgeR/DESeq2/limma for.
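
To make "modelling three different factors" concrete, here is a small sketch that builds an intercept plus treatment-coded (dummy) design matrix for an additive model with drug, time point and tissue. The factor names and levels are hypothetical and only loosely follow the example above. In R, which is what edgeR, DESeq2 and limma actually use, the equivalent would be a formula along the lines of `~ drug + time + tissue`.

```python
from itertools import product


def design_matrix(samples, factors):
    """Build an intercept + treatment-coded (dummy) design matrix.

    samples: list of dicts mapping factor name -> level
    factors: dict mapping factor name -> ordered levels (first = reference)
    """
    columns = ["(Intercept)"]
    for name, levels in factors.items():
        columns += [f"{name}{level}" for level in levels[1:]]
    rows = []
    for s in samples:
        row = [1]
        for name, levels in factors.items():
            # One indicator column per non-reference level.
            row += [1 if s[name] == level else 0 for level in levels[1:]]
        rows.append(row)
    return columns, rows


# Hypothetical factors (assumption): 3 drugs x 3 time points x 2 tissues.
factors = {
    "drug": ["A", "B", "C"],
    "time": ["t1", "t2", "t3"],
    "tissue": ["normal", "tumor"],
}
samples = [
    {"drug": d, "time": t, "tissue": s}
    for d, t, s in product(*factors.values())
]
columns, rows = design_matrix(samples, factors)
print(columns)
print(len(rows), "samples x", len(columns), "coefficients")
```

Each coefficient is then the effect of one factor level relative to its reference level, with the other factors held fixed. Interaction terms (for example a drug effect that changes over time) would add further columns and are left out of this sketch.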
Old 10-05-2014, 09:59 PM   #10
LeonDK
Member
 
Location: Denmark

Join Date: Sep 2014
Posts: 69

Quote:
Originally Posted by kopi-o View Post
I think your approach seems sound.

By "complex" design I mean that there is more than one experimental factor that varies. For example, let's say you are looking at RNA-seq of tumor and paired normal tissue samples in several individuals. You *could* just compare the tumor vs normal groups, but you would get more statistical power by also considering which patient each sample is from - in other words, you'd model two factors, "individual" and "tumor/normal". This particular case would be a paired design ("paired" because you have "paired" tumor and normal samples from the same patient). This can be done in edgeR, limma, DESeq2, and in fact SAMSeq as well. If you do not use a paired analysis, natural variation between individuals can easily overwhelm the specific signal from the tumor vs normal differences.

A more complex design could be a case where you have tumor cultures that have been treated with 3 different drugs at 3 different time points, with matched normals. Here you would perhaps want to model three different factors. It's this kind of scenario that you really need edgeR/DESeq2/limma for.
Super description - Appreciate it!

My setup is RNA-seq data from two groups: group A from sick individuals and group B from healthy individuals. I want to profile any gene expression differences between them.

Cheers,
Leon
Old 10-05-2014, 11:33 PM   #11
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Keep in mind that batch effects (they're inevitable) are easily handled by more complex designs as well. You also might benefit from stratifying patients and controls by gender, ethnicity, age, and so on. An initially simple design can quickly become more complicated if someone forgot to take care of a factor during sample acquisition.
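
One practical caveat: a batch term can only be added to the design if batch and condition are not completely confounded. The sketch below is a simple check for that, cross-tabulating two columns of a sample sheet. The column names `batch` and `group` and the toy sample sheets are assumptions for illustration.

```python
from collections import Counter


def crosstab(samples, factor_a, factor_b):
    """Count samples for every observed (factor_a, factor_b) combination."""
    return Counter((s[factor_a], s[factor_b]) for s in samples)


def is_confounded(samples, factor_a, factor_b):
    """True if each level of factor_a occurs with only one level of factor_b."""
    seen = {}
    for a, b in crosstab(samples, factor_a, factor_b):
        seen.setdefault(a, set()).add(b)
    return all(len(levels) == 1 for levels in seen.values())


# Hypothetical sample sheets (assumption, not the poster's data).
balanced = [
    {"batch": "run1", "group": "case"},
    {"batch": "run1", "group": "control"},
    {"batch": "run2", "group": "case"},
    {"batch": "run2", "group": "control"},
]
confounded = [
    {"batch": "run1", "group": "case"},
    {"batch": "run1", "group": "case"},
    {"batch": "run2", "group": "control"},
    {"batch": "run2", "group": "control"},
]

print(crosstab(balanced, "batch", "group"))
print(is_confounded(balanced, "batch", "group"))
print(is_confounded(confounded, "batch", "group"))
```

If every batch contains only cases or only controls, the batch effect and the disease effect cannot be separated by any model, so this is worth checking before sequencing rather than after. The check only flags complete confounding; a strongly unbalanced but not fully confounded layout would pass it and still cost power.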