We are working with 6 bacterial samples: 2 sets of 3 biological replicates (condition A: 1, 2, 3; condition B: 1, 2, 3), on which we performed stranded RNA-Seq. We are completely new to NGS and are still working out pipelines and workflows, so before running the differential gene expression analysis between conditions A and B, we decided to first compare the biological replicates against each other within condition A and within condition B. We are using Rockhopper (http://cs.wellesley.edu/~btjaden/Rockhopper/) for these comparisons because it is designed specifically to handle bacterial RNA-Seq data. In theory, there should be fewer differentially expressed genes between biological replicates than between conditions, right?
When we look at the number of differentially expressed genes between biological replicates in condition A (1 vs 2 = 87; 1 vs 3 = 80; 2 vs 3 = 132), it is much greater than the number we get when we compare conditions A and B, each with 3 biological replicates (58 genes). This was VERY surprising to us, and we are concerned that our data may not be usable.
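To make the comparison layout explicit, here is a minimal Python sketch of what we ran. The variable names (`within_a_pairs`, `de_counts_within_a`, and so on) are ours and purely illustrative; this is not Rockhopper code, it only enumerates the replicate pairs and records the counts Rockhopper reported to us.

```python
from itertools import combinations

# Replicate labels for condition A (hypothetical labels for illustration).
replicates_a = ["A1", "A2", "A3"]

# Every one-vs-one comparison between biological replicates in condition A.
within_a_pairs = list(combinations(replicates_a, 2))

# Differentially expressed gene counts as reported by Rockhopper.
de_counts_within_a = {
    ("A1", "A2"): 87,
    ("A1", "A3"): 80,
    ("A2", "A3"): 132,
}

# Condition A vs condition B, each with 3 biological replicates.
de_count_a_vs_b = 58

for pair in within_a_pairs:
    print(f"{pair[0]} vs {pair[1]}: {de_counts_within_a[pair]} genes")

mean_within = sum(de_counts_within_a.values()) / len(de_counts_within_a)
print(f"Mean within condition A: {mean_within:.1f} genes")
print(f"A vs B (3 vs 3 replicates): {de_count_a_vs_b} genes")
```

Note that each within-condition comparison is a single sample against a single sample, while the A vs B comparison uses all 3 replicates per condition, so the two kinds of counts do not come from the same experimental design.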
Is it normal to have more differentially expressed genes between biological replicates than between experimental conditions? Could a batch effect be playing a role here? Does anyone have a lot of experience with BACTERIAL RNA-Seq and could give me some pointers? Any advice on how to proceed? Is the data usable or not?
Thank you!