Hi all,
this is already my second thread in a few days, but I have another question about analyzing RNA-seq data with the cufflinks package.
We have used cuffdiff and other approaches to look for differences in promoter usage etc. So far, cuffdiff has given the most accurate results when we compare them with those of other experiments in the same setting.
However, as a next step we need to run a large number of additional tests: ~600 comparisons of ~15 vs 15 samples each. Each sample has about 140 million 2x100bp reads. You can imagine that this is not really feasible with the usual approach.
My question: would it be possible to
a) estimate the FPKM values of all samples once and then run the cuffdiff tests with these as input (instead of the raw reads),
or
b) extract only the reads from the genes of interest and test just those? I imagine normalizing these reads could be a problem when the whole dataset is not there.
Basically, we have 30 samples and want to test 600 different ways of distributing them into two groups.
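To give a rough sense of the scale, here is a back-of-the-envelope sketch in Python. It uses only the approximate figures above and rests on two assumptions: that "140 million 2x100bp reads" means read pairs, and that each comparison would otherwise re-process all of its samples from the reads. The variable names are just for illustration.

```python
# Rough workload estimate using the approximate figures from the post.
n_comparisons = 600
samples_per_comparison = 15 + 15      # ~15 vs 15
total_samples = 30
read_pairs_per_sample = 140_000_000   # assumption: the 140 million are read pairs

# Assumption: every comparison re-quantifies all of its samples from the reads.
naive_sample_passes = n_comparisons * samples_per_comparison
naive_read_pairs = naive_sample_passes * read_pairs_per_sample

# If abundances could be estimated once per sample and reused (option a),
# each sample would only need to be processed a single time.
reused_sample_passes = total_samples
reduction_factor = naive_sample_passes // reused_sample_passes

print(f"sample passes, re-processing every time: {naive_sample_passes:,}")
print(f"read pairs processed in that case:       {naive_read_pairs:,}")
print(f"sample passes, quantifying once:         {reused_sample_passes}")
print(f"reduction factor:                        {reduction_factor}x")
```

This is why option a) would help so much if it is valid: under these assumptions the expensive per-sample step would be done 30 times instead of 18,000 times.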
Thanks,
Seb