Hi Everyone,
About a month ago I started working on a project involving targeted RNA-seq and have been rapidly learning about the techniques and analyses…
I have, however, run into some theoretical questions specific to our set-up that, after extensive online searches, I can’t seem to find an answer to. I hope that some of you may be able to give some advice… I apologise if any of my questions are due to my inadequate knowledge; I am trying very hard to catch up on all there is to know about RNA-seq…
Our assay set-up is the following:
The assay is a targeted RNA next-generation sequencing assay to screen for differential expression in a limited number of gene targets (88) between tumor samples and adjacent ‘normal/healthy’ samples from patients. For this, targets are enriched by amplification with specific primers in both the tumor and the adjacent sample and then run on a MiSeq, similar to the TruSeq Targeted RNA Expression kits available from Illumina. Illumina offers kits with different panels of genes related to certain pathways.
Illumina then offers the possibility to analyse the data using DESeq. To normalise for library size, you can either let DESeq estimate the size factors from all genes, or indicate reference genes, whose geometric mean is then used to estimate the size factor. DESeq is then used to perform the differential expression analysis (or that’s how I understood it, because DESeq normally needs the raw counts as input).
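To make the reference-gene option concrete, here is a minimal Python sketch of how I imagine it works. This is my own assumption, not Illumina’s or DESeq’s actual code: each sample’s size factor is the geometric mean of its reference-gene counts, rescaled so that the size factors have a geometric mean of 1 across samples. The sample names, gene names and counts are made up for illustration, and the sketch assumes every reference gene has a non-zero count.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def reference_size_factors(counts, reference_genes):
    """Estimate one size factor per sample from reference genes only.

    counts: dict mapping sample name -> dict mapping gene name -> raw count.
    reference_genes: list of gene names assumed to be stably expressed.
    """
    # Geometric mean of the reference-gene counts within each sample.
    per_sample = {
        sample: geometric_mean([gene_counts[g] for g in reference_genes])
        for sample, gene_counts in counts.items()
    }
    # Rescale so the size factors have geometric mean 1 across samples.
    centre = geometric_mean(list(per_sample.values()))
    return {sample: value / centre for sample, value in per_sample.items()}


# Hypothetical toy data: two reference genes and one target gene.
counts = {
    "tumor_1": {"REF_A": 200, "REF_B": 800, "TARGET_X": 50},
    "normal_1": {"REF_A": 100, "REF_B": 400, "TARGET_X": 100},
}

size_factors = reference_size_factors(counts, ["REF_A", "REF_B"])

# Normalised counts are used for inspection only; as far as I understand,
# DESeq itself still takes the raw counts plus the size factors.
normalized = {
    sample: {gene: count / size_factors[sample] for gene, count in genes.items()}
    for sample, genes in counts.items()
}
```

In this toy example the tumor sample has twice the reference-gene signal of the normal sample, so its size factor comes out twice as large and the normalised counts for the reference genes become equal between the two samples.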
DESeq (and many other packages available for RNA-seq) assumes that the majority of genes are not differentially expressed. Since my panel is mostly composed of genes that are differentially expressed (according to microarray data), 4 reference genes were included to bypass this issue.
The reference genes were selected from a list of 11 candidates using NormFinder on microarray data.
I have the following questions:
1. Is there any way that one can validate the stable expression of reference genes in NGS data? How have other people selected reference genes for targeted RNA-seq and subsequently confirmed that they are indeed stable?
2. I would like to use a package like DESeq for the differential expression analysis. But for the estimation of the dispersion of the genes, it again assumes that the majority of the genes are not differentially expressed. For our data, that would render it too conservative, possibly giving a lot of false negatives. Would it be a valid approach to use a number of ‘healthy’ samples to create an artificial ‘reference’ sample and estimate the dispersions from this artificial reference sample? Can I then put these parameters into DESeq so that, when I compare a tumor sample to its healthy sample, the dispersion of the healthy sample is based on that reference?
I have a lot more questions, but I guess this is a start… I hope I’m making sense.
Looking forward to hearing from any of you…