Hi all,
First of all, I have to say this site is amazing with all the discussions going on. I found new ideas and great hints here for writing up my diploma thesis!
My question:
I am trying to build a library of differentially expressed genes for further analysis, i.e. I don't need information on which genes are specific for which tissue.
My data is Illumina HiSeq2000 with 16 tissues (adipose, adrenal, ...) and two technical replicates (75bp single-end, 50bp paired-end). I have already mapped the data with TopHat, counted reads for annotated RefSeq genes, and normalized for transcript length.
My idea is that genes which show high variance across all 16 samples in both experiments are likely to be tissue-specific. Genes with low variance are simply broadly expressed across the samples and don't contribute much to the overall variance of the experiment.
Do you think it's a good approach to simply identify the genes that account for the largest share of the overall variance in each experiment (75bp, 50bp_PE) and then take the intersection to rule out some false positives?
The drawback is that I don't have any p-value to tell me how reliable an assignment is.
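To make the idea concrete, here is a rough sketch of what I have in mind, written in Python with NumPy. It runs on made-up toy data, not my real counts. The function name `top_variance_genes`, the 10% cutoff, and the log2 transform with a pseudocount of 1 are all just assumptions for illustration, so please treat it as a sketch rather than a finished method.

```python
import numpy as np

def top_variance_genes(expr, gene_ids, top_fraction=0.1):
    """Return the IDs of the genes with the highest variance across tissues.

    expr: array of shape (n_genes, n_tissues) with length-normalized expression.
    top_fraction: share of genes to keep (arbitrary cutoff, an assumption).
    """
    # Assumption: work on log2 scale so highly expressed genes do not dominate.
    log_expr = np.log2(expr + 1.0)
    # Sample variance of each gene across the tissues.
    variances = log_expr.var(axis=1, ddof=1)
    n_top = max(1, int(round(top_fraction * len(gene_ids))))
    # Indices of the n_top genes with the largest variance.
    top_idx = np.argsort(variances)[::-1][:n_top]
    return {gene_ids[i] for i in top_idx}

# Toy data: 100 genes x 16 tissues, flat expression per gene.
rng = np.random.default_rng(0)
gene_ids = [f"gene{i}" for i in range(100)]
base = rng.uniform(5, 50, size=(100, 1)) * np.ones((1, 16))
# Make the first 5 genes "tissue-specific": strongly up in one tissue.
base[:5, 3] *= 200

# Two technical replicates, each with its own small multiplicative noise.
expr_75bp = base * rng.uniform(0.9, 1.1, size=base.shape)
expr_50bp_pe = base * rng.uniform(0.9, 1.1, size=base.shape)

top_75 = top_variance_genes(expr_75bp, gene_ids)
top_50 = top_variance_genes(expr_50bp_pe, gene_ids)

# Keep only genes that rank high in both experiments.
candidates = top_75 & top_50
print(sorted(candidates))
```

The intersection only removes genes that rank high in one experiment by chance; it still gives no measure of significance for the genes that remain.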
Thanks for your thoughts,
Johannes