Hi,
I recently started writing my diploma thesis on differential gene expression analysis based on RNA-seq data. Before I can begin, I have several questions about the data I have so far.
At the moment I have one data set of Illumina reads (single-end) covering the total transcription in one specific cell. The other data set I have to get from the internet (the GEO database, for example from the ENCODE Project). My professor suggests looking for a cell line that also expresses the same protein, to see where the differences between these cells lie and which pathways are involved in the expression of pathogenes in both cells.
My first question is about the replicates I need. The data set I have has no replicates, but all the analyses I have read to learn about this problem use 2 or 3 replicates of the same condition. Are these technical or biological replicates? Do I need different sequencing runs of the same biological sample, or do I need different biological samples?
I ask because the data sets available from the Gene Expression Omnibus (GEO), especially the ENCODE Project ones I would prefer, also have only one data set per cell line.
Please help me.