SEQanswers

03-25-2012, 11:02 PM   #1
anandksrao
Junior Member
 
Location: Sacramento

Join Date: Jun 2011
Posts: 9
Ref column for 2 factor RNA-Seq time course

Dear all,

I am seeking help with a normalization strategy for my experiment. The experimental design is described below.

Tissue was gathered from one wild type and 8 mutants, so genotype may be considered 'factor #1'.
For each of these genotypes, 4 time points were used for library generation, so time may be considered 'factor #2'.
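
To make the layout concrete, here is a minimal Python sketch that enumerates the libraries implied by this design, together with the 4 replicates per library mentioned in point 2 below. The genotype and time point labels are hypothetical placeholders; the post does not name them.

```python
from itertools import product

# Hypothetical labels: the post names neither the mutants nor the time points.
genotypes = ["WT"] + [f"mut{i}" for i in range(1, 9)]   # 1 wild type + 8 mutants
timepoints = ["t1", "t2", "t3", "t4"]                   # 4 time points
replicates = [1, 2, 3, 4]                               # 4 replicate libraries each

# One entry per library: (genotype, time point, replicate).
libraries = list(product(genotypes, timepoints, replicates))

# One group label per genotype-by-time cell of the design.
groups = sorted({f"{g}.{t}" for g, t, _ in libraries})

print(len(libraries))  # 9 * 4 * 4 = 144
print(len(groups))     # 9 * 4 = 36
```

Under these assumptions the experiment has 36 genotype-by-time groups and 144 libraries in total.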

The reason I do not pool libraries across ALL genotypes is that I think it would be like comparing apples to oranges to bananas to peaches...
Since the statistical validity of RNA-Seq comparisons rests on only a small fraction of genes being DE against a background of largely unchanged gene expression, I think that comparing the different mutants and the wild type will violate this assumption. Do you agree? Or do you think I'd have to prove this theoretical prediction empirically before I conclude it's a 2-factor experiment?

Now, IF you agree that the experiment is indeed a 2-factor one, then how do I go about

1. choosing the reference column / library for TMM normalization - should this TMM normalization be performed for one genotype at a time? I am leaning towards TMM normalization for each genotype separately.

2. since each library has 4 replicates, does edgeR allow ALL of the independently replicated reference libraries to be used as the normalization reference, or can I use just one of the 4 reference libraries at any one time (which would defeat the purpose of replication)?
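
For context on what a reference library does in this kind of normalization, here is a rough Python sketch of scaling one library against one reference using a trimmed mean of log ratios. It is a simplified illustration in the spirit of TMM, not edgeR's implementation: it trims only on the log ratios, uses no precision weights, and the function name and counts are made up.

```python
import math

def trimmed_log_ratio_factor(obs, ref, trim=0.3):
    """Scaling factor of library `obs` relative to library `ref`.

    Simplified: unweighted mean of log2 ratios of proportions after
    trimming a fraction `trim` from each tail. Unlike full TMM, there is
    no trimming on absolute abundance and no precision weighting.
    """
    n_obs, n_ref = sum(obs), sum(ref)
    # Log2 fold changes of proportions, for genes seen in both libraries.
    m = sorted(
        math.log2((o / n_obs) / (r / n_ref))
        for o, r in zip(obs, ref)
        if o > 0 and r > 0
    )
    k = int(len(m) * trim)            # genes dropped from each tail
    kept = m[k:len(m) - k]
    return 2 ** (sum(kept) / len(kept))

# Made-up counts: 10 genes; the last gene is strongly up in `sample`.
reference = [100] * 10
sample = [100] * 9 + [1100]

factor = trimmed_log_ratio_factor(sample, reference)
print(factor)
```

In this toy case the single highly expressed gene doubles the total count of `sample`, and the trimmed factor rescales its effective library size so that the nine unchanged genes line up with the reference again. Which library plays the role of `reference` is exactly the choice being asked about.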

To create a ref library, I am thinking of making an RLE-based (geometric mean) pseudo-library from the 4 replicate libraries for the reference 'conditions'.
However, before calculating the pseudo-ref library, I am first considering removing genes / rows from the 4 ref libs where any expression value falls beyond +/- 3 SDs. Your opinions on this?
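
The pseudo-library idea can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not edgeR code: the function names and toy counts are invented, genes with a zero count are simply left out of the reference, and the +/- 3 SD filter is not implemented because its exact definition is not given here.

```python
import math
from statistics import median

def geometric_mean_reference(counts):
    """Per-gene geometric mean across libraries.

    counts: list of rows, one row per gene, one column per library.
    Genes with a zero count in any library get None, because the
    log of zero is undefined.
    """
    ref = []
    for row in counts:
        if all(c > 0 for c in row):
            ref.append(math.exp(sum(math.log(c) for c in row) / len(row)))
        else:
            ref.append(None)
    return ref

def size_factors(counts, ref):
    """Median-of-ratios scaling factor for each library against `ref`."""
    n_libs = len(counts[0])
    factors = []
    for j in range(n_libs):
        ratios = [row[j] / r for row, r in zip(counts, ref) if r is not None]
        factors.append(median(ratios))
    return factors

# Toy counts (made up): 4 genes x 4 replicate reference libraries.
# Each library is sequenced twice as deeply as the one before it.
toy = [
    [10, 20, 40, 80],
    [100, 200, 400, 800],
    [5, 10, 20, 40],
    [0, 3, 1, 2],      # excluded from the reference because of the zero
]
ref = geometric_mean_reference(toy)
sf = size_factors(toy, ref)
print(sf)
```

Here all 4 replicates contribute to the pseudo-reference, and each replicate then gets its own scaling factor relative to it, so no single replicate has to be singled out as the reference.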

To provide a context for my questions, the final goals of my research are to:
a. cluster and identify co-expressed genes within a genotype, and
b. identify genes with variant expression patterns across genotypes

Last edited by anandksrao; 03-25-2012 at 11:10 PM. Reason: clarity

Tags
reference, replication, rle, time-course, tmm
