SEQanswers

Old 07-26-2011, 05:57 AM   #141
mgolo
Member
 
Location: Denmark

Join Date: Apr 2011
Posts: 10
Default

Quote:
Originally Posted by Xi Wang View Post
Hi Maria

1&2. The methods for DEG detection, and the normalization applied beforehand, should depend on how your data are distributed. You may try all of them and choose the best one.

3. For biological replicates, it's better not to pool them together.

4. Raw read counts have nothing to do with gene annotation. In our documents, the opposite of 'raw read counts' is RPKM values. For the unannotated non-RNAs, you'd better analyze the gene structure first and then the DEGs.


Btw, we are working on a new version of DEGseq, which will be more suitable for biological replicates.
Thanks for your reply Xi

I'll try all the methods when I have my annotation file. But what are the criteria for deciding which one is the best?

Looking forward to your new version of DEGseq!
Old 07-26-2011, 06:58 AM   #142
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

Quote:
Originally Posted by mgolo View Post
Thanks for your reply Xi

I'll try all the methods when I have my annotation file. But what are the criteria for deciding which one is the best?

Looking forward to your new version of DEGseq!
I think one of the most important criteria is how consistent the detected DEGs are with previous knowledge, although unexpected results may also be genuine novel discoveries. From a statistical point of view, the best method is one whose assumptions your data do not violate.
__________________
Xi Wang
Old 08-03-2011, 09:32 PM   #143
townway
Member
 
Location: Rockville

Join Date: May 2009
Posts: 40
Default

Hi Xi,
My data are time-course data with 6 time points but without replicates. I wonder if I can try your DEGseq.

If not, could you suggest some alternative approaches?

Thank you in advance!

Townway
Old 08-04-2011, 10:12 PM   #144
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

Quote:
Originally Posted by townway View Post
Hi Xi,
My data are time-course data with 6 time points but without replicates. I wonder if I can try your DEGseq.

If not, could you suggest some alternative approaches?

Thank you in advance!

Townway
Sorry Townway, DEGseq is currently not suitable for time-series data. Please try Cufflinks (http://cufflinks.cbcb.umd.edu/) instead. Thanks.
__________________
Xi Wang
Old 09-22-2011, 09:18 PM   #145
wangleibio
Junior Member
 
Location: shanghai

Join Date: Nov 2009
Posts: 8
Default DEGdseq problem

Hi Xi,
I have a problem using DEGseq:
DEGexp(geneExpMatrix1 = geneExpMatrix1, geneCol1 = 1,expCol1 = 2, groupLabel1 = "roottip",geneExpMatrix2 = geneExpMatrix2,geneCol2 = 1,expCol2 = 2,groupLabel2 = "hypocotyl",outputDir= "./roothypocoty",method = "MARS")

Please wait...
gene id column in geneExpMatrix1 for sample1: 1
expression value column(s) in geneExpMatrix1: 2
total number of reads uniquely mapped to genome obtained from sample1: 62747041
gene id column in geneExpMatrix2 for sample2: 1
expression value column(s) in geneExpMatrix2: 2
total number of reads uniquely mapped to genome obtained from sample2: 69469907

method to identify differentially expressed genes: MARS
pValue threshold: 0.001
output directory: ./roothypocoty

Please wait ...
Identifying differentially expressed genes ...
Please wait patiently ...
output ...

Done ...
The results can be observed in directory: ./roothypocoty



problem:


It produces the output directory (outputDir) but does not produce the MA-plot.
Additionally, my two samples do not have replicates.


Hope you can help!
Thanks!
lei
Old 09-22-2011, 09:42 PM   #146
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

Quote:
Originally Posted by wangleibio View Post
Hi Xi,
I have a problem using DEGseq:
DEGexp(geneExpMatrix1 = geneExpMatrix1, geneCol1 = 1,expCol1 = 2, groupLabel1 = "roottip",geneExpMatrix2 = geneExpMatrix2,geneCol2 = 1,expCol2 = 2,groupLabel2 = "hypocotyl",outputDir= "./roothypocoty",method = "MARS")

Please wait...
gene id column in geneExpMatrix1 for sample1: 1
expression value column(s) in geneExpMatrix1: 2
total number of reads uniquely mapped to genome obtained from sample1: 62747041
gene id column in geneExpMatrix2 for sample2: 1
expression value column(s) in geneExpMatrix2: 2
total number of reads uniquely mapped to genome obtained from sample2: 69469907

method to identify differentially expressed genes: MARS
pValue threshold: 0.001
output directory: ./roothypocoty

Please wait ...
Identifying differentially expressed genes ...
Please wait patiently ...
output ...

Done ...
The results can be observed in directory: ./roothypocoty



problem:


It produces the output directory (outputDir) but does not produce the MA-plot.
Additionally, my two samples do not have replicates.


Hope you can help!
Thanks!
lei
Thanks for using DEGseq.

To figure out your problem, please try
(1) Run the example provided in the help document. Simply type "?DEGexp" in the R console, and copy/paste the Examples at the end of the document. Then check whether the example works properly.
(2) Run "sessionInfo()" in the R console, and paste the result here or, better, email it to me at "[email protected]".

Thanks.
__________________
Xi Wang
Old 03-25-2012, 10:45 PM   #147
AsoBioInfo
Member
 
Location: KSA

Join Date: Dec 2011
Posts: 37
Default DEGseq Question

Hello,

I have a question regarding DEGseq. I am not understanding the syntax of layout:
layout(matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE))

I am seeing my graphs but it is not interpreting anything. For my data only three rows were considered and their log fold changes were calculated. But for the remaining data, no histogram was built.

The first chunk of data is able to read the whole data, I think something is wrong in only fixing the layout and matrix.

Thanks for your help!
Aso
Old 03-26-2012, 04:13 AM   #148
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

Quote:
Originally Posted by AsoBioInfo View Post
Hello,

I have a question regarding DEGseq. I am not understanding the syntax of layout:
layout(matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE))

I am seeing my graphs but it is not interpreting anything. For my data only three rows were considered and their log fold changes were calculated. But for the remaining data, no histogram was built.

The first chunk of data is able to read the whole data, I think something is wrong in only fixing the layout and matrix.

Thanks for your help!
Aso

Dear Aso, thanks for your questions.

The "layout" call only controls how the DEGseq output plots are drawn. Specifically, that command sets up a figure with 6 panels arranged in 3 rows and 2 columns.

For your problem, could you copy and paste the first few lines of your data and the commands you ran here? Then I will be able to diagnose the issue. Thanks.
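
For reference, here is a minimal base-R sketch of what that layout() call does on its own. It does not use DEGseq; the placeholder plots and the temporary PDF file are assumptions made only so the example runs anywhere.

```r
# Build the same panel matrix that is passed to layout() in the thread.
panels <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = TRUE)
print(panels)  # row 1 holds panels 1-2, row 2 panels 3-4, row 3 panels 5-6

# Draw to a temporary PDF so the sketch also runs without a screen device.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
n_panels <- layout(panels)   # layout() returns the number of panels it set up
par(mar = c(2, 2, 2, 2))     # small margins, as in the command from the thread
for (i in seq_len(n_panels)) {
  # Placeholder plots; each plotting call fills the next free panel in order.
  plot(1:10, main = paste("panel", i))
}
dev.off()
```

The point of the sketch is that layout() only divides the drawing area; it never reads or subsets the expression data.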
__________________
Xi Wang
Old 03-26-2012, 04:38 AM   #149
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 310
Default

Quote:
Originally Posted by AsoBioInfo View Post
Hello,

I have a question regarding DEGseq. I am not understanding the syntax of layout:
layout(matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE))

I am seeing my graphs but it is not interpreting anything. For my data only three rows were considered and their log fold changes were calculated. But for the remaining data, no histogram was built.

The first chunk of data is able to read the whole data, I think something is wrong in only fixing the layout and matrix.

Thanks for your help!
Aso
Are you analyzing RNA-seq data? If so, the overwhelming opinion of the community is that the Poisson model of DEGseq is invalid and you should use edgeR or DESeq instead.
__________________
--------------
Ethan
Old 03-26-2012, 05:38 AM   #150
AsoBioInfo
Member
 
Location: KSA

Join Date: Dec 2011
Posts: 37
Default

Quote:
Originally Posted by Xi Wang View Post
Dear Aso, thanks for your questions.

The "layout" call only controls how the DEGseq output plots are drawn. Specifically, that command sets up a figure with 6 panels arranged in 3 rows and 2 columns.

For your problem, could you copy and paste the first few lines of your data and the commands you ran here? Then I will be able to diagnose the issue. Thanks.

Thanks Xi for your reply!

The output score data looks like this:
"GeneNames" "value1" "value2" "log2(Fold_change)"
00000000000000 6 10 -0.736 -0.643
11111111111111 68 69 -0.02 0.072
22222222222222 1 1 0 0.095
33333333333333 NA NA NA NA NA NA NA NA FALSE
44444444444444 NA NA NA NA NA NA NA NA FALSE

Note: There are other scores also.

The fold change is calculated for only three rows, although the matrix does contain all the values, since printing it outputs the whole matrix. The commands I used are:

-> library(DEGseq)
geneExpFile <- "D:/data/MyData.txt"
geneExpMatrix1 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(7,9,11))
geneExpMatrix2 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(8,10,12))
write.table(geneExpMatrix1[1:13,],row.names=FALSE)
write.table(geneExpMatrix2[1:13,],row.names=FALSE)

-> layout(matrix(c(1,2,3,4,5,6), 3, 2, byrow=TRUE))
par(mar=c(2, 2, 2, 2))
DEGexp(geneExpMatrix1=geneExpMatrix1, geneCol1=1, expCol1=c(2,3,4,5,6), groupLabel1="Label1",
geneExpMatrix2=geneExpMatrix2, geneCol2=1, expCol2=c(2,3,4,5,6), groupLabel2="Label2",
method="MARS")

Hope this helps!

Thanks!
Old 03-26-2012, 05:41 AM   #151
AsoBioInfo
Member
 
Location: KSA

Join Date: Dec 2011
Posts: 37
Default

Quote:
Originally Posted by ETHANol View Post
Are you analyzing RNA-seq data? If so, the overwhelming opinion of the community is that the Poisson model of DEGseq is invalid and you should use edgeR or DESeq instead.

Thanks for your reply!

Yup.. it is RNA-seq data.... Okay, I'll try DESeq and edgeR

Thanks once again....
Old 03-26-2012, 03:53 PM   #152
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

Quote:
Originally Posted by AsoBioInfo View Post
Thanks Xi for your reply!

The output score data looks like this:
"GeneNames" "value1" "value2" "log2(Fold_change)"
00000000000000 6 10 -0.736 -0.643
11111111111111 68 69 -0.02 0.072
22222222222222 1 1 0 0.095
33333333333333 NA NA NA NA NA NA NA NA FALSE
44444444444444 NA NA NA NA NA NA NA NA FALSE

Note: There are other scores also.

The fold change is calculated for only three rows, although the matrix does contain all the values, since printing it outputs the whole matrix. The commands I used are:

-> library(DEGseq)
geneExpFile <- "D:/data/MyData.txt"
geneExpMatrix1 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(7,9,11))
geneExpMatrix2 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(8,10,12))
write.table(geneExpMatrix1[1:13,],row.names=FALSE)
write.table(geneExpMatrix2[1:13,],row.names=FALSE)

-> layout(matrix(c(1,2,3,4,5,6), 3, 2, byrow=TRUE))
par(mar=c(2, 2, 2, 2))
DEGexp(geneExpMatrix1=geneExpMatrix1, geneCol1=1, expCol1=c(2,3,4,5,6), groupLabel1="Label1",
geneExpMatrix2=geneExpMatrix2, geneCol2=1, expCol2=c(2,3,4,5,6), groupLabel2="Label2",
method="MARS")

Hope this helps!

Thanks!
Hi, from your code I gather that you want to compare gene expression levels between two groups, each with 3 replicates. The expression values for Group 1 come from columns 7, 9 and 11 of your MyData.txt file, while the values for Group 2 come from columns 8, 10 and 12. Is that right? If so, you set up a 3 versus 3 comparison. However, in the line starting with DEGexp you asked for a 5 versus 5 comparison, because you listed 5 columns (2 to 6) for each group, although each matrix holds only 3 value columns. Perhaps you were confused by "layout". As I said before, layout only formats the output figure and has nothing to do with your data matrix.
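
The column mismatch can be checked with a few lines of base R. This is only a sketch: the small matrix below is a made-up stand-in for what readGeneExp() returns, assuming (as the usage in this thread implies) that the result holds the gene ID in column 1 followed by the selected value columns. The DEGexp call is left commented out because it needs DEGseq and the real data.

```r
# Hypothetical stand-in for a matrix read with three value columns:
# column 1 is the gene ID, columns 2-4 are the three replicates.
geneExpMatrix1 <- cbind(
  GeneNames = c("geneA", "geneB", "geneC"),
  rep1 = c(6, 68, 1),
  rep2 = c(7, 70, 2),
  rep3 = c(5, 66, 1)
)

exp_cols_used  <- c(2, 3, 4, 5, 6)        # indices from the posted DEGexp call
exp_cols_valid <- 2:ncol(geneExpMatrix1)  # indices that actually exist

# Indices requested in the call that are beyond the last column of the matrix.
out_of_range <- exp_cols_used[exp_cols_used > ncol(geneExpMatrix1)]
print(out_of_range)
print(exp_cols_valid)

# A 3-versus-3 call would then look like this (needs DEGseq and both real matrices):
# DEGexp(geneExpMatrix1 = geneExpMatrix1, geneCol1 = 1, expCol1 = c(2, 3, 4), groupLabel1 = "Label1",
#        geneExpMatrix2 = geneExpMatrix2, geneCol2 = 1, expCol2 = c(2, 3, 4), groupLabel2 = "Label2",
#        method = "MARS")
```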

Besides, I'd like to make it clear that DEGseq works well with technical replicates from the same experimental treatment. It has been shown in our paper that the variance among technical replicates can be almost entirely explained by Poisson models.
In Hardcastle and Kelly (2010), DEGseq was shown to perform better than the other tools compared on a real-world dataset (Figure 5 of that paper). The choice of methods/tools is your decision, but it is best to have a more comprehensive understanding of these tools as well as of your data.

Any further questions please let me know.

Ref:
Hardcastle, T.J. and Kelly, K.A. (2010) baySeq: empirical Bayesian methods for identifying differential expression in sequence count data, BMC Bioinformatics, 11, 422.
__________________
Xi Wang

Last edited by Xi Wang; 03-26-2012 at 04:06 PM.
Old 06-14-2012, 09:24 AM   #153
amdic2
Junior Member
 
Location: Quebec city

Join Date: Jul 2011
Posts: 6
Default Print q-value with SamWrapper

Dear all,
I am using the samWrapper function from DEGseq.
I would like to be able to get the q-values in the output of the method, as I need them in order to make a volcano plot. The problem is that for low q-values (e.g. 10e-4) samWrapper outputs "0". Can anybody help?
Thank you,
Anne-Marie
Old 06-14-2012, 08:56 PM   #154
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

Thanks for your question. The q-values are calculated by a function in the 'samr' package, and we didn't change anything in that calculation. You may have to add a small number (say 1e-6) to the q-values before taking logarithms to make your volcano plot work.
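
A rough base-R sketch of that workaround follows. The data frame and its column names (log2fc, qvalue) are invented for illustration and are not the actual samWrapper output format; only the 1e-6 offset comes from the reply.

```r
# Hypothetical results table with some q-values reported as exactly 0.
res <- data.frame(
  log2fc = c(-2.1, 0.3, 1.8, 0.0),
  qvalue = c(0, 0.5, 0, 0.9)
)

offset <- 1e-6                                  # the small constant suggested above
res$neg_log10_q <- -log10(res$qvalue + offset)  # finite even where q is 0

print(-log10(res$qvalue))  # infinite where q == 0, which a plot cannot place
print(res$neg_log10_q)

# Volcano-style scatter plot, written to a temporary PDF so it runs headless.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
plot(res$log2fc, res$neg_log10_q,
     xlab = "log2 fold change", ylab = "-log10(q + 1e-6)")
dev.off()
```

Note that the offset caps the vertical axis: every q-value reported as 0 lands at the same height (about 6 for an offset of 1e-6), so those points can no longer be ranked against each other.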
__________________
Xi Wang
Old 03-05-2015, 07:32 AM   #155
a0909
Junior Member
 
Location: France

Join Date: Nov 2014
Posts: 3
Default z-scores

Quote:
Originally Posted by Xi Wang View Post
Hi Sol, Thanks for using DEGseq.

In the output file, there are 2 columns for fold change: "log2(Fold_change)" and "log2(Fold_change) normalized". log2(Fold_change) = log2(value1/value2), and the normalized version is computed from the normalized value1 and value2. From the fold-change value, you can judge whether a gene is up-regulated or down-regulated. For example, if a gene has log2(Fold_change) > 0, which means value1 > value2, and its signature = TRUE, then this gene is significantly down-regulated in condition 2. Also, you can look into z-scores.

Hope this helps.
Hello Xi,
This is regarding the last line of the quoted answer ("you can look into z-scores"). I would like to know whether Zscore > 0 is equivalent to log2(Fold_change) > 0, implying that genes with negative Zscores are down-regulated in condition 2 (as per the example quoted in your answer).
I would appreciate your help.

Thanks
Old 03-05-2015, 12:38 PM   #156
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

Quote:
Originally Posted by a0909 View Post
Hello Xi,
This is regarding the last line of the quoted answer ("you can look into z-scores"). I would like to know whether Zscore > 0 is equivalent to log2(Fold_change) > 0, implying that genes with negative Zscores are down-regulated in condition 2 (as per the example quoted in your answer).
I would appreciate your help.

Thanks
1) Zscore > 0 is equivalent to log2(Fold_change) > 0.
2) A negative Zscore goes with a negative log2(Fold_change): expression in condition 1 is lower than in condition 2, so the gene is up-regulated in condition 2.
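
The sign convention can be made concrete with a small base-R sketch. The counts are made up, and the sketch only covers the direction implied by log2(value1/value2); it says nothing about significance, which DEGseq reports separately.

```r
# Made-up expression values for three genes in two conditions.
value1 <- c(geneA = 6,  geneB = 68, geneC = 40)  # condition 1
value2 <- c(geneA = 10, geneB = 69, geneC = 10)  # condition 2

log2fc <- log2(value1 / value2)

# Positive log2 fold change: value1 > value2, so lower in condition 2 ("down").
# Negative log2 fold change: value1 < value2, so higher in condition 2 ("up").
direction_in_cond2 <- ifelse(log2fc > 0, "down",
                             ifelse(log2fc < 0, "up", "unchanged"))

print(round(log2fc, 3))
print(direction_in_cond2)
```

According to the reply above, the Zscore carries the same sign as log2(Fold_change), so the same reading applies to it.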
__________________
Xi Wang