Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • DEG analysis with paired sample without replication

    I have a question regarding to analyze the RNAseq data.

    My data is exactly looks like this.(total 12 samples)

    disease1 patient1 pre-hormone
    disease1 patient1 post-hormone
    disease1 patient2 pre-hormone
    disease1 patient2 post-hormone
    disease1 patient3 pre-hormone
    disease1 patient3 post-hormone
    disaese2 patient4 pre-hormone
    disaese2 patient4 post-hormone
    dissease3 patient5 pre-hormone
    disease3 patient5 post-hormone
    disease3 patient6 pre-hormone
    disease3 patient6 post-hormone..

    I would like to find the genes that are affected after hormone treatment ( such as differentially expressed genes).
    Since, it is not easy to handle multiple diseases, multiple patients within disease, and hormone, I could not create the design matrix for GLM analysis.

    The easist way to handle this might separate the data per each disease.
    (e.g. disease1, disease2, disease3)

    Then, I made three different experiment set (for disease 1, disease2, disease3) and test with patient and hormone factor. Does it make sense?? Three different experiment set is called data1(disease1), data2(disease2), data3(disease3).

    data1

    disease1 patient1 pre-hormone
    disease1 patient1 post-hormone
    disease1 patient2 pre-hormone
    disease1 patient2 post-hormone
    disease1 patient3 pre-hormone
    disease1 patient3 post-hormone

    data2

    disaese2 patient4 pre-hormone
    disaese2 patient4 post-hormone

    data3

    dissease3 patient5 pre-hormone
    disease3 patient5 post-hormone
    disease3 patient6 pre-hormone
    disease3 patient6 post-hormone..


    so, to test the differentially expressed genes after hormone treatment for disease 2 with DESeq

    d_hunter <- estimateDispersions(data2, method="blind", sharingMode="fit-only") // I put this one since I only have one sample for data2(disease2)
    plotDispEsts(data2)

    plotPCA(varianceStabilizingTransformation(data2), intgroup=c("patient", "hormone"))

    dh_fit1 = fitNbinomGLMs(data2, count ~ patient + hormone)
    dh_fit0 = fitNbinomGLMs(data2, count ~ patient)
    dh_pvalsGLM = nbinomGLMTest( dh_fit1, dh_fit0 )
    dh_padjGLM = p.adjust( dh_pvalsGLM, method="BH" )

    ===============================

    Same for EdgeR
    dg_edesign <- model.matrix(~patient+hormone)

    dg_e <- DGEList(counts=data2)
    dg_e <- calcNormFactors(dg_e)
    dg_e <- estimateGLMCommonDisp(dg_e, dg_edesign)
    dg_e <- estimateGLMTrendedDisp(dg_e, dg_edesign)
    dg_e <- estimateGLMTagwiseDisp(dg_e, dg_edesign)
    dg_efit <- glmFit(dg_e, dg_edesign)
    dg_efit <- glmLRT(dg_efit, coef=3)

    ==================================

    This code is right???????????? I tried to do the same thing for DESeq and EdgeR. So, I expect these two codes did the same thing... even though the result could be different!!

    =================================================
    In addition I woud like to consider all samples at the same time instead of separating the data by disease, how can I do it?

    design2 <- model.matrix (~disease+diseaseatient+disease:hormone)

    would be a one way.. or

    design2 <- model.matrix (~0+disease+diseaseatient+disease:hormone)


    Does it make sense?

    Could you please somebody give comments?

    Thanks in advance
    Last edited by younko; 05-06-2014, 05:08 PM. Reason: adding more comments

Latest Articles

Collapse

  • seqadmin
    Essential Discoveries and Tools in Epitranscriptomics
    by seqadmin




    The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
    04-22-2024, 07:01 AM
  • seqadmin
    Current Approaches to Protein Sequencing
    by seqadmin


    Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
    04-04-2024, 04:25 PM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, Today, 11:49 AM
0 responses
13 views
0 likes
Last Post seqadmin  
Started by seqadmin, Yesterday, 08:47 AM
0 responses
16 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-11-2024, 12:08 PM
0 responses
61 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-10-2024, 10:19 PM
0 responses
60 views
0 likes
Last Post seqadmin  
Working...
X