Hi all,
Our lab recently performed RNAseq on bacteria, and found that 70-80% of the ribo-depleted reads come from one particular region of the genome in question. It's not any of the common house-keeping genes but something novel. It's also not a sequencing artifact, as we see it with qPCR.
We have now generated a knock-out of this region and are trying to determine the best way to assess differential expression in the knock-out versus the wild-type strain. We have arrays (preferred) but can also do RNAseq if necessary. The concern is as follows: in the wild-type strain, 70-80% of the ribo-depleted RNA comes from this one region, so only 20-30% of the RNA comes from the remainder of the genome, whereas in the knock-out, 100% of the RNA comes from the remainder of the genome. So the question is how best to compare a sample of 20-30% to one of 100%.
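To put rough numbers on this concern, here is a minimal Python sketch. It is only my own illustration: the function name is made up, and it assumes that every gene outside the region has the same absolute abundance in both strains, which is exactly what we do not know yet.

```python
def apparent_fold_change(region_fraction: float) -> float:
    """Apparent knock-out/wild-type ratio for a gene whose absolute
    abundance is unchanged, when equal amounts of RNA are compared.

    Assumption (hypothetical): the region contributes `region_fraction`
    of the wild-type RNA and nothing in the knock-out, and all other
    genes are unchanged in absolute terms.
    """
    if not 0.0 <= region_fraction < 1.0:
        raise ValueError("region_fraction must be in [0, 1)")
    # The remainder of the genome is (1 - region_fraction) of wild-type
    # RNA but all of the knock-out RNA, so unchanged genes look inflated.
    return 1.0 / (1.0 - region_fraction)


for f in (0.70, 0.75, 0.80):
    ratio = apparent_fold_change(f)
    print(f"region = {f:.0%}: unchanged genes look {ratio:.2f}x higher in the knock-out")
```

Under that assumption, a 4x factor corresponds to the region making up about 75% of the wild-type RNA; at 70% or 80% the factor would be roughly 3.3x or 5x.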
Our options are as follows:
1) Use 2-colour arrays with 4x as much RNA from the wild-type strain as from the knock-out, to hopefully compensate for the dominance of the one region in the wild-type.
2) Use separate 1-colour arrays, and try to normalize by housekeeping genes.
3) Use RNAseq, allotting 4x as many reads to the wild-type strain as to the knock-out.
I know there are bioinformatic approaches for normalization, but many assume ~1:1 amounts of RNA in the samples. So I guess I have multiple questions.
a) Can we get away with a 2-colour array that has 4x as much cDNA from one sample as from the other?
b) If so, what's the best way to normalize in that situation, when the ratio of input RNA is definitely not 1:1?
c) Are there any other issues I need to be concerned about if I take this approach, or if we resort to RNAseq?
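For (b), the simplest thing I can think of for RNAseq counts is to compute the scaling factors from the reads outside the knocked-out region only. The sketch below is a rough illustration, not an established method: the function name, sample names, gene names, and counts are all made up, and it assumes that the total expression of the rest of the genome is comparable between strains.

```python
def scale_factors(counts, region_genes):
    """Per-sample scaling factors based on reads outside the region.

    counts: {sample: {gene: read_count}} (toy structure, hypothetical)
    region_genes: set of gene names in the knocked-out region
    Assumption: total expression outside the region is comparable
    between samples, so those totals can be equalized.
    """
    totals = {
        sample: sum(n for gene, n in genes.items() if gene not in region_genes)
        for sample, genes in counts.items()
    }
    if min(totals.values()) <= 0:
        raise ValueError("a sample has no reads outside the region")
    reference = min(totals.values())  # scale every sample down to the smallest
    return {sample: reference / total for sample, total in totals.items()}


# Toy counts: the region takes 75% of the wild-type reads, none in the knock-out.
counts = {
    "WT": {"regionX": 7500, "geneA": 1500, "geneB": 1000},
    "KO": {"regionX": 0, "geneA": 6000, "geneB": 4000},
}
factors = scale_factors(counts, region_genes={"regionX"})
normalized = {
    sample: {gene: n * factors[sample] for gene, n in genes.items()}
    for sample, genes in counts.items()
}
print(factors)
print(normalized["WT"]["geneA"], normalized["KO"]["geneA"])
```

In this toy case geneA and geneB come out equal in both strains after scaling, but only because the toy data were built so that nothing outside the region changes. If the knock-out shifts many other genes in one direction, this kind of scaling would hide it.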
Any suggestions would be greatly appreciated!
Thank you!