Actually, it only does call more genes/transcripts because of better fit, not because of multiple testing. The reason is that Partek Flow is not picking the model with the best p-value, it is picking the model with the best fit. As for making biological sense, it absolutely does in my mind, as it does not assume that all genes/transcripts are influenced by the same biological factors and it informs me which genes are influenced by which biological factors and how many genes are altered by which biological factors.
-
Originally posted by rfilbert: There are multiple options for normalization, but I believe the default option is to simply normalize to the total number of reads for each sample. I don't think any normalization based on the length of the transcript (like RPKM) matters, as for this analysis you are comparing the same transcript in different groups of samples.
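To make the quoted normalization concrete, here is a minimal sketch of scaling raw counts to each sample's total reads. The function name, the per-million scale, and the toy counts are all assumptions for illustration; the thread does not describe how Partek Flow implements its default.

```python
def normalize_to_total(counts_by_sample, scale=1_000_000):
    """Scale each sample's raw counts to reads per `scale` total reads.

    counts_by_sample maps sample name -> {gene: raw read count}.
    The per-million scale is an illustrative choice, not a documented default.
    """
    normalized = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        normalized[sample] = {
            gene: count * scale / total for gene, count in counts.items()
        }
    return normalized


# Hypothetical toy counts, for illustration only.
raw = {
    "tumor_1": {"geneA": 50, "geneB": 150},
    "normal_1": {"geneA": 100, "geneB": 900},
}
cpm = normalize_to_total(raw)

# Fold change for the same gene across samples. Dividing both values by the
# same transcript length would leave this ratio unchanged.
fold_change_a = cpm["tumor_1"]["geneA"] / cpm["normal_1"]["geneA"]
```

Because the same transcript has the same length in every sample, a length term cancels out of the between-sample ratio, which is the point the quote makes about RPKM.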
Originally posted by rfilbert: Actually, it only does call more genes/transcripts because of better fit, not because of multiple testing. The reason is that Partek Flow is not picking the model with the best p-value, it is picking the model with the best fit. As for making biological sense, it absolutely does in my mind, as it does not assume that all genes/transcripts are influenced by the same biological factors and it informs me which genes are influenced by which biological factors and how many genes are altered by which biological factors.
Comment
-
Really? Are you sure? In my case, the samples are from a breast cancer study. Some of the samples are tumor and some are normal. Some are ER+ and some are ER-. Partek Flow tested each gene to see whether it was differentially expressed between tumor vs. normal, ER+ vs. ER-, or both (interaction) - or whether the gene was not affected by either factor. What problem do you have with that?
Comment
-
I guess you are saying that as a researcher, I can only have 1 hypothesis, and only about 1 factor, and it must be the same for every gene. That is not how a biologist thinks. We want to learn what is going on with the biology and not tell the biology what hypothesis it must answer.
Comment
-
Originally posted by joxcargator73: I believe that replicates are very important for good-quality results. RNA-seq is becoming cheaper and cheaper but is still quite expensive for small labs. In this case I also believe that RNA-seq without replicates could be used as a screen, then confirmed by replicated qRT-PCR, basing your conclusions on those results.
Comment
-
Well, you could estimate variance if you really wanted, and could rank genes by p-value and/or fold change. One way you could do that is to assume a Poisson distribution, where the variance is equal to the mean, and use something like a log-likelihood test. All that said, I certainly agree that replicates (that is, INDEPENDENT BIOLOGICAL REPLICATES) are required to estimate variance within the biological population that you wish to make an inference about.
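A minimal sketch of that idea, assuming the "log-likelihood test" meant here is a likelihood ratio test on two unreplicated samples. The function name and the example counts are hypothetical. Each count is modeled as Poisson with mean equal to library size times a per-gene rate, and the statistic is compared against a chi-square distribution with 1 degree of freedom.

```python
import math


def poisson_lrt(k1, n1, k2, n2):
    """Likelihood ratio test that a gene has the same rate in two samples.

    k1, k2: read counts for the gene; n1, n2: total reads per sample.
    Returns (statistic, p_value). Only Poisson counting noise is modeled,
    so biological variability between individuals is NOT captured.
    """
    total = k1 + k2
    if total == 0:
        return 0.0, 1.0
    # Expected counts under the null hypothesis of one shared rate.
    mu1 = n1 * total / (n1 + n2)
    mu2 = n2 * total / (n1 + n2)

    def term(k, mu):
        # Convention: 0 * log(0) is taken as 0.
        return k * math.log(k / mu) if k > 0 else 0.0

    stat = max(2.0 * (term(k1, mu1) + term(k2, mu2)), 0.0)
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: 100 vs 200 reads at equal sequencing depth.
stat, p_value = poisson_lrt(100, 1_000_000, 200, 1_000_000)
```

The p-values from a test like this are useful for ranking genes, but they reflect counting noise only, which is why they tend to be overconfident when no biological replicates are available.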
Comment
-
Originally posted by rfilbert: I guess you are saying that as a researcher, I can only have 1 hypothesis, and only about 1 factor, and it must be the same for every gene. That is not how a biologist thinks. We want to learn what is going on with the biology and not tell the biology what hypothesis it must answer.
Originally posted by rfilbert: One way you could do that is to assume a Poisson distribution, where the variance is equal to the mean, and use something like a log-likelihood test.
Comment
-
Originally posted by rfilbert: I guess you are saying that as a researcher, I can only have 1 hypothesis, and only about 1 factor, and it must be the same for every gene. That is not how a biologist thinks. We want to learn what is going on with the biology and not tell the biology what hypothesis it must answer.
What Biological justification is there to fit each gene individually? What is the Biological explanation for this? I too am a Biologist and I know of no Biological reason to justify this. However, as a Biologist, I also know that the average Biologist knows little statistics and how a Biologist thinks is not justification for choosing your statistics.
One thing I think you are ignoring is that RNA-seq data, like all measures of gene expression, is subject to its own kinds of biases and technical variation. The variation you see in RNA-seq data is not purely Biological and Biological reasoning cannot justify all the variation, particularly for genes with fewer reads.
Last edited by chadn737; 01-07-2013, 09:49 PM.
Comment
-
In my case, some of the genes were differentially expressed between tumor and normal, while others were not, but some of them were differentially expressed between ER+ and ER-. As a fellow biologist, why is this so confusing for you? I also asked a professor of statistics at UCSD that I work with about this and he said he thought it made perfect sense. It certainly made sense in my research, and obviously made sense to the statisticians at Partek who have always been very helpful to me as well.
Comment
-
Originally posted by rfilbert: In my case, some of the genes were differentially expressed between tumor and normal, while others were not, but some of them were differentially expressed between ER+ and ER-. As a fellow biologist, why is this so confusing for you? I also asked a professor of statistics at UCSD that I work with about this and he said he thought it made perfect sense. It certainly made sense in my research, and obviously made sense to the statisticians at Partek who have always been very helpful to me as well.
Having multiple factors is different than individually fitting each gene to one of five distributions which you claim Partek does. Both jwfoley and I have criticized the latter while you keep talking about the former.
As was pointed out to you earlier, a program like DESeq is capable of multifactorial designs like yours. Of course it makes biological sense that you would find differentially expressed genes under one set of factors and not another. That's not the issue. What I am asking you to support is your claim that fitting each gene individually to a different distribution makes biological sense, let alone statistical sense.
Comment
-
Well, if you want to debate this with the statisticians at Partek, feel free. I use it and like it, as do many of my colleagues. As you said, you are not a statistician either, and I don't know about jwfoley, but I have spoken with statisticians at Partek and I find them quite friendly, helpful, and knowledgeable.
Regarding the confusion between us, Partek Flow not only fits 5 different distributions to each gene, but also multiple factors, so for example, if my candidate models are:
1. Tumor status
2. ER status
3. Tumor + ER
4. Tumor + ER + Tumor*ER (interaction)
then they fit 5 (distributions) x 4 (model designs) = 20 statistical tests
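The 5 x 4 enumeration above can be sketched as follows. The thread does not name the five distributions or the criterion used to pick the best fit, so the distribution labels are placeholders and the use of AIC is purely an assumption for illustration; the toy log-likelihoods are made up.

```python
from itertools import product

# Placeholder labels: the thread does not say which five distributions are used.
DISTRIBUTIONS = ["dist_1", "dist_2", "dist_3", "dist_4", "dist_5"]

# The four model designs listed in the post.
DESIGNS = ["Tumor", "ER", "Tumor + ER", "Tumor + ER + Tumor*ER"]

# Every (distribution, design) pair that would be fitted for one gene.
candidates = list(product(DISTRIBUTIONS, DESIGNS))


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower means better fit after penalty."""
    return 2 * n_params - 2 * log_likelihood


def best_candidate(fits):
    """Pick the candidate with the lowest AIC.

    fits maps (distribution, design) -> (log_likelihood, n_params).
    AIC is an assumed selection rule, not one confirmed by the thread.
    """
    return min(fits, key=lambda cand: aic(*fits[cand]))


# Hypothetical fit results for one gene, for illustration only.
toy_fits = {
    ("dist_1", "Tumor"): (-100.0, 2),
    ("dist_1", "Tumor + ER"): (-99.5, 3),
}
chosen = best_candidate(toy_fits)
```

In this toy case the simpler model wins because the extra parameter improves the log-likelihood by less than the penalty. Whatever criterion is actually used, the selection is repeated independently for every gene, which is the step the other posters are questioning.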
Hopefully that clears up the confusion. Now, on to your argument that all genes or all transcripts follow the same distribution: I don't think I need to be a card-carrying statistician to know that is ridiculous. If it were obvious which distribution they all follow, why hasn't the community agreed on which distribution that is?
Comment