SEQanswers

05-11-2011, 07:37 AM   #1
ericamica
Junior Member
 
Location: Milan, Italy

Join Date: May 2011
Posts: 4
miRNA differential expression (DESeq)

Hi everyone!
I'm currently analyzing small-RNA-seq Illumina data from 4 different samples, each with 2 biological replicates (so 8 libraries in total).
We have found novel and conserved miRNA hairpins and defined the best miRNA/miRNA* duplex for each of them.
Now I'm trying to use DESeq and edgeR for differential expression analysis.
I have 2 different tissues, each under stress and control conditions. My comparisons are: Tissue1 stress vs Tissue1 control; Tissue2 stress vs Tissue2 control; but also Tissue1 stress vs Tissue2 stress; Tissue1 control vs Tissue2 stress.

I'm having big trouble determining:
1. Which tags to load into DESeq as raw data: should I consider all the different tags from the whole library (too many, I think), only the tags mapping to putative hairpins, or only the tags mapping to the predicted duplexes?

2. Using tags from many putative hairpins, I tried running DESeq, but I got very bad SCV plots and, what is more, many tags are not significant according to their FDR value, even when the fold change is very high! Sometimes resVarA or resVarB is high too...

I cannot tell whether the problem is in my replicates (being real biological replicates, they are not highly similar) or in the statistical model, which may not fit miRNA data...

Sometimes the tag that is most expressed in a pre-miRNA and seems to have a high fold change is not significant, while another tag on the same hairpin shows a very low FDR and p-value...


Can anybody help me?

Thank you very much

Erica
05-11-2011, 08:40 AM   #2
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

1. Taking only the tags mapping to putative hairpins should be fine. For mRNA, you don't count reads mapping to intergenic regions either. I guess you see each of the other tags only two or three times anyway, at least if you have removed low-quality reads.

2. What is the SCV value? (Read off at the most common expression strength, i.e., where the black line peaks.) What are typical fold changes? You know that if alpha is the SCV value, you can interpret 1+sqrt(alpha) roughly as the typical fold change between replicates for strongly expressed genes. If your fold changes are not a good deal larger than the fold change between replicates, you cannot do much.

That is, unless your samples are paired. Are the "tissue1stressed" and the "tissue1" samples the same sample, used twice in different ways, or are they two different samples of the same tissue? In the former case, you gain statistical power by introducing a blocking factor.
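
To make the rule of thumb in point 2 concrete, here is a minimal Python sketch of the arithmetic only. It is not part of DESeq; the function names, the example SCV values, and the `margin` threshold are assumptions chosen purely for illustration.

```python
import math


def replicate_fold_change(scv: float) -> float:
    """Rule of thumb from the post: typical replicate fold change ~ 1 + sqrt(SCV)."""
    if scv < 0:
        raise ValueError("SCV is a squared quantity and cannot be negative")
    return 1.0 + math.sqrt(scv)


def exceeds_replicate_noise(observed_fc: float, scv: float, margin: float = 2.0) -> bool:
    """Hypothetical check: is the observed fold change at least `margin` times
    the typical replicate-to-replicate fold change?

    `margin` is an arbitrary illustration value, not a DESeq parameter.
    """
    return observed_fc >= margin * replicate_fold_change(scv)


if __name__ == "__main__":
    # Illustrative SCV values only; read the real one off your own SCV plot.
    for scv in (0.01, 0.25, 1.0):
        fc = replicate_fold_change(scv)
        print(f"SCV = {scv:<5} -> typical replicate fold change ~ {fc:.2f}")
```

The larger the SCV, the larger a fold change between conditions has to be before it stands out from the variation between replicates.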
05-12-2011, 12:57 AM   #3
ericamica
Junior Member
 
Location: Milan, Italy

Join Date: May 2011
Posts: 4

Thanks for your reply!
1: I've taken all tags mapping to hairpins, even though that means running DESeq with fewer than 6000 tags. I hope that won't create any problems...

2: The SCV value (peak of the black line) is around 10 on the X axis and a little less than 1.5 on the Y axis. Sorry for the stupid question, but which one is the SCV value: the X or the Y value? Anyway, the typical fold change is between 6 and 22, with some genes going up to 35 or 70! Some genes with a fold change around 10 have a high FDR. I attach the SCV plot here; it looks very strange to me...

As for the tissues, maybe I was not clear enough: we collected 4 different samples: tissue1-control, tissue1-stress, tissue2-control and tissue2-stress.
Tissue1-stress and tissue1-control come from two distinct groups of plants, one subjected to stress and one kept in control conditions. Tissue1 and tissue2 in the same condition (let's say control) come from the same plants, but are 2 different parts of the plant.
Given this, I don't understand what you wrote about the blocking factor.

One last question: when running DESeq, I load 8 different libraries into countsTable and define the vector of the 4 different conditions. After this, when performing the nbinomTest, I compare 2 conditions at a time: could this somehow influence the analysis? Would it be different if I ran DESeq with only 2 conditions and a countsTable of 4 libraries?
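
The two setups in this question can be laid out as follows. This is a minimal Python sketch of the bookkeeping only, not of DESeq itself; the condition labels and helper names are hypothetical, and the column order is an assumption. It shows which library columns each option would use and says nothing about which option gives the better variance estimate.

```python
from collections import defaultdict

# Hypothetical layout: 2 tissues x 2 treatments x 2 biological replicates,
# one entry per library column of the count table (8 columns in total).
conditions = [
    "T1_control", "T1_control",
    "T1_stress", "T1_stress",
    "T2_control", "T2_control",
    "T2_stress", "T2_stress",
]


def columns_by_condition(conds):
    """Map each condition label to the column indices of its replicates."""
    groups = defaultdict(list)
    for idx, cond in enumerate(conds):
        groups[cond].append(idx)
    return dict(groups)


def columns_for_comparison(conds, cond_a, cond_b):
    """Column indices of a reduced table holding only two conditions."""
    groups = columns_by_condition(conds)
    return groups[cond_a] + groups[cond_b]


if __name__ == "__main__":
    # Option 1: keep all 8 columns and pick two condition labels per test.
    print(columns_by_condition(conditions))
    # Option 2: subset to the 4 columns of one comparison before the analysis.
    print(columns_for_comparison(conditions, "T1_stress", "T1_control"))
```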

cheers,
Erica
Attached Files
File Type: pdf SCV PLOT.pdf (41.8 KB, 136 views)
05-16-2011, 07:18 AM   #4
ericamica
Junior Member
 
Location: Milan, Italy

Join Date: May 2011
Posts: 4

Could anybody answer my question?

Tags
deseq, differential expression, fdr, mirnas

All times are GMT -8.