Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • EdgeR or GLM model design question

    Dear,

    I have a question related with my design matrix.
    My RNA-seq experimental design looks like this. (paried design)


    patient gender drug
    sample1 patient1 F Pre
    sample2 patinet1 F Post
    sample3 patient2 M Pre
    sample4 patient2 M Post
    sample5 patient3 F Pre
    sample6 patient3 F Post
    sample7 patient4 F Pre
    sample8 patient4 F Post
    sample9 patient5 F Pre
    sample10 patient5 F Post
    sample11 patient6 M Pre
    sample12 patient6 M Post

    =====================================

    What I want to test is this.

    Q1) DEG responsing to drug... (regardless of gender)
    Q2) DEG responsing to drug on female
    DEG responsing to drug on male..

    More ultimately, Does genes reponse to the drug differently based on gender??? (or not really?? ) In other words, gender is the significant factor to drug response??

    Q3) Which gender is more sensitive to the drug...??

    ==================================================
    In order to test or answer Q1 , my model looks like this.

    design<- model.matrix(~drug + drug:gender, data=data)

    colnames(design)
    ##################################################
    [1] "(Intercept)" "drugpost" "drugpre:genderM" "drugpost:genderM"
    ###################################################

    lrt1 <- glmLRT(fabry_fit, coef=2) # drugpost

    returns the genes responsing to the drug.. (Q1)


    ======================================


    lrt2 <- glmLRT(fabry_fit, coef=4) # drugpost:genderM

    does it returns that genes responding the drug depending on gender differently ?? In this case, we can answer "DEG responsing to drug on female
    DEG responsing to drug on male.. " ?? Q2 ...........

    =======================================================

    Then, what is the meaing of drugpre:genderF???

    Could you please somebody to explain the meaning of this two model??

    Again, I would like to differentally expressed genes under the drug treatment as well as gender.
    (e.g. up after drug treatment for male, but not for female.. something like this... )

    thanks in advance,
    Last edited by younko; 08-29-2014, 12:03 AM.

  • #2
    One more question:

    I am trying to do several changes..

    In the same experimental design,

    I created two model..

    MODEL1
    # model.matrix(~drug + drug:gender)
    #[1] "(Intercept)" "drugpost" "drugpre:genderM" "drugpost:genderM"

    MODEL2
    # model.matrix(~drug + gender + drug:gender)
    #[1] "(Intercept)" "drugpost" "genderM" "drugpost:genderM"

    When I apply with these two models, I thought that last coefficient should test the same thing....

    BUT MAYBE NOT

    When, I compare the result from these two coefficient results, the significant genes are totally different.. which I am confusing..

    out1 <- glmLRT(dd_model1, coef=4)
    out2 <- glmLRT(dd_model2, coef=4)


    Please help me with this!
    Last edited by younko; 08-29-2014, 12:04 AM.

    Comment


    • #3
      Your experimental design is the same as that covered in Section 3.5 of the edgeR User's Guide called "Comparisons Both Between and Within Subjects". You can simply follow the advice and code given in the User's Guide.

      Comment


      • #4
        Thank you so much.

        However, I still have a problem. After reading that example, I changed my model like this

        model.matrix(~drug + gender + gender:drug)
        #[1] "(Intercept)" "drugpost" "genderM" "drugpost:genderM"

        I exepct to see the following coefficients such as

        genderF:drugpost : genes responding to the drug in female:
        genderM : drugpost : genes responding to the drug in male


        But I can only see "drugpost:genderM" coefficient

        genes responding to the gender in drug treatment.


        Why this happens? Do I misunderstand??

        Comment


        • #5
          You haven't followed the example in the User's Guide. The User's Guide tells you to fit two interaction terms but you have fitted only one. You have to follow the example exactly.

          Your gender variable corresponds to Disease in the example, while your drug variable corresponds to Treatment in the example.

          Comment


          • #6
            Thank you Gorden,

            I tried as you suggest.. but it seems my result looks weird..

            when I try my model looks like this

            # model.matrix(~drug + drug:gender)
            #[1] "(Intercept)" "drugpost" "drugpre:genderM" "drugpost:genderM"

            * cpm value for top ranked gene based on drugpost:gender coefficent looks like this.

            KYM_pre(F) KYM_post(F) GDY_pre(M) GDY_post(M) HKS_pre(F) HKS_post(F) KWK_pre(F) KWK_post(F) PYM_pre(F) PYM_post(F) SHM_pre(M) SHM_post(M)
            RPS4Y1 0 0 149.5080373 79.3694069 0 0 0 0 0 0 201.5823721 265.40717
            KDM5D 0 0 63.02199964 50.10588553 0 0 0 0.186331781 0 0 30.12833879 88.66478468
            PRKY 0 0 35.19605708 37.68467861 0 0 0 0 0 0 15.26238215 54.41238442
            UTY 0 0 20.75664905 26.10558742 0 0 0 0 0 0 4.55889337 13.1137761
            DDX3Y 0 0 29.78127907 54.94805093 0 0.191243459 0 0 0 0 4.757106125 15.46251212

            But when I used the model like this as you suggested (If I understand correctly..

            # model.matrix(~gender + gender:drug)

            cpm value for top ranked genes for this coefficients genderM:drugpost looks like this.

            KYM_pre(F) KYM_post(F) GDY_pre(M) GDY_post(M) HKS_pre(F) HKS_post(F) KWK_pre(F) KWK_post(F) PYM_pre(F) PYM_post(F) SHM_pre(M) SHM_post(M)
            IL4I1 2.9742981 1.7673777 6.9276722 1.686450 2.626656 2.2986379 3.536214 1.679694 1.2629356 1.6349216 9.1263648 2.5477597
            FKBP5 75.6793627 52.6678567 99.2464349 669.520561 148.162849 86.5820270 113.480321 87.530734 65.2516736 72.1409155 40.8702425 65.2618450
            ARG1 5.2876411 4.5951821 2.7108283 23.610297 5.156028 3.4479568 4.982847 8.585104 6.1041888 3.6785736 0.7935969 1.9598152
            MYO5B 0.9914327 0.0000000 0.1506016 1.264837 2.821223 2.4901910 2.571792 1.306429 0.8419571 0.6130956 0.0000000 1.9598152
            LOC100130855 1.6523878 2.1208533 0.7530079 6.956605 6.226147 3.0648505 4.661373 3.359389 0.4209785 0.8174608 0.0000000 0.7839261

            If I undestand corrently,

            coefficents genderF:drugpost returns the genes responding the drug at gender F

            coefficents genderM:drugpost returns the genes responding the drug at gender M.

            But what I looked at the result cpm value, it seems that coeffient drugpost:gender results looks more reasonable..

            Am I missing something??
            Last edited by younko; 08-29-2014, 12:45 AM.

            Comment


            • #7
              You haven't tried what I suggested.

              Comment


              • #8
                Hm..

                This model is the one you suggest.. Isn't it?

                model.matrix(~gender + gender:drug)
                #[1] "(Intercept)" "genderM" "genderF:drugpost" "genderM:drugpost"

                In case, I also tested with this one (which is exactly same with 3.5 EdgeR example.

                model.matrix(~gender + genderatient + gender:drug)
                #[1] "(Intercept)" "genderM" "genderFatient" "genderMatient" "genderF:drugpost"
                #[6] "genderM:drugpost"
                ###########################
                Am I missing?

                =================================


                coefficient for "genderF:drugpost" and "genderM:drugpost" represents that the the drug response genes in Female, drug response genes in Male respectively..Isn't it?


                The first model and second model reports the similar genes as significant ones... which is similar results what I report above.. ;(
                Last edited by younko; 08-31-2014, 09:41 PM.

                Comment


                • #9
                  Originally posted by younko View Post
                  Hm..

                  This model is the one you suggest.. Isn't it?

                  model.matrix(~gender + gender:drug)
                  #[1] "(Intercept)" "genderM" "genderF:drugpost" "genderM:drugpost"
                  I have no idea where you got this model from. I did not suggest it. It is obviously wrong because it ignores the patient information.

                  In case, I also tested with this one (which is exactly same with 3.5 EdgeR example.

                  model.matrix(~gender + genderatient + gender:drug)
                  #[1] "(Intercept)" "genderM" "genderFatient" "genderMatient" "genderF:drugpost"
                  #[6] "genderM:drugpost"
                  ###########################
                  Yes, this is what I have been suggesting all along.

                  coefficient for "genderF:drugpost" and "genderM:drugpost" represents that the the drug response genes in Female, drug response genes in Male respectively..Isn't it?
                  Yes, that is what the edgeR User's Guide tells you.

                  The first model and second model reports the similar genes as significant ones... which is similar results what I report above.. ;(
                  The two models are not the same.

                  Have you defined the patient factor correctly as described in the edgeR User's Guide?

                  Comment


                  • #10
                    Thank you Gordon..

                    Now, I have a brief idea..

                    Yes. I followed the exact format as EdgeR user guide.


                    gender patient drug
                    1 M 1 pre
                    2 M 1 post
                    3 M 2 pre
                    4 M 2 post
                    5 F 1 pre
                    6 F 1 post
                    7 F 2 pre
                    8 F 2 post
                    9 F 3 pre
                    10 F 3 post
                    11 F 4 pre
                    12 F 4 post

                    This is my data structure.

                    My model is "~gender + genderatient + gender:drug"

                    In here, I have one more question regaring to the edgeR manual for 3.5

                    They say that Disease and treatment should be defined as a factor..
                    But, I don't think the patient columne should be a factor......

                    So, without defining the patient as factor., my model "~gender + genderatient + gender:drug" returns that

                    "[1] "(Intercept)" "genderM" "genderFatient" "genderMatient" "genderF:enzymepost"
                    #[6] "genderM:enzymepost"
                    ############################"

                    BUT, the edgeR manual shows that
                    > design <- model.matrix(~Disease+Disease:Patient+Disease:Treatment)
                    > colnames(design)
                    [1] "(Intercept)" "DiseaseDisease1"
                    [3] "DiseaseDisease2" "DiseaseHealthy:Patient2"
                    [5] "DiseaseDisease1:Patient2" "DiseaseDisease2:Patient2"
                    [7] "DiseaseHealthy:Patient3" "DiseaseDisease1:Patient3"
                    [9] "DiseaseDisease2:Patient3" "DiseaseHealthy:TreatmentHormone"
                    [11] "DiseaseDisease1:TreatmentHormone" "DiseaseDisease2:TreatmentHormone"
                    it seems that Patient is also defined as factor!!!

                    Do I have to define patient as factor?? ( I don't think so.. ;()

                    Also, in my case, since the number of patient is different in female (4) vs male (2) , when I defined as factor for patient column, I got the following error...

                    "Error in glmFit.default(y, design, dispersion = par^4, offset = offset, :
                    Design matrix not of full rank. The following coefficients not estimable:
                    genderMatient3 genderMatient4"
                    Last edited by younko; 08-31-2014, 11:18 PM.

                    Comment


                    • #11
                      Oh.. I got it.. It seems that we posted at the same time..
                      I have updated my posting.. but I got your points!!!!!!!!!

                      I really appreciate your helps!!!!!!!!!!!!!!!! Thank you SO MUCH!!

                      I have learned a lot through this thread!! Again, thank you so much!

                      Youn
                      Last edited by younko; 08-31-2014, 11:29 PM.

                      Comment

                      Latest Articles

                      Collapse

                      • seqadmin
                        Current Approaches to Protein Sequencing
                        by seqadmin


                        Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                        04-04-2024, 04:25 PM
                      • seqadmin
                        Strategies for Sequencing Challenging Samples
                        by seqadmin


                        Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                        03-22-2024, 06:39 AM

                      ad_right_rmr

                      Collapse

                      News

                      Collapse

                      Topics Statistics Last Post
                      Started by seqadmin, 04-11-2024, 12:08 PM
                      0 responses
                      25 views
                      0 likes
                      Last Post seqadmin  
                      Started by seqadmin, 04-10-2024, 10:19 PM
                      0 responses
                      28 views
                      0 likes
                      Last Post seqadmin  
                      Started by seqadmin, 04-10-2024, 09:21 AM
                      0 responses
                      24 views
                      0 likes
                      Last Post seqadmin  
                      Started by seqadmin, 04-04-2024, 09:00 AM
                      0 responses
                      52 views
                      0 likes
                      Last Post seqadmin  
                      Working...
                      X