  • Design for normalization using DESeq

    Hi all, I am analyzing 16S sequencing data from human fecal samples. I processed my data with QIIME and have been using the R package phyloseq for the analysis. I would like to use the phyloseq_to_deseq2 command to pass my data to the R package DESeq2, so that I can normalize my data to stabilize the variance and avoid rarefaction.

    My question is that I am not sure which design I should use. Below are 10 rows of my sample variables.

    #SampleID  Treatment  Treatment1  Time  Sex  Age  Individual
    1.1        P          P0          T0    F    33   1
    1.2        P          P1          T1    F    33   1
    2.1        O          O0          T0    F    28   2
    2.2        O          O1          T1    F    28   2
    3.1        Control    C0          T0    M    24   3
    3.2        Control    C1          T1    M    24   3
    4.1        Control    C0          T0    M    28   4
    4.2        Control    C1          T1    M    28   4
    5.1        O+P        OP0         T0    M    24   5
    5.2        O+P        OP1         T1    M    24   5

    I had n=40 participants, whom I randomly assigned to 4 groups (3 treatments: O, P, and the combination O+P; plus a control group). For each participant I sequenced a fecal sample before the treatment (T0) and after it (T1), so in total I ended up with 80 libraries from 80 samples.
    What I want to compare is the difference in composition/abundance 1) between the T0 and T1 measurements within each treatment, and 2) between the treatments.

    At first I used Treatment1 for the design, which is a variable that combines treatment and time. Afterwards I saw in tutorials that people keep these variables separate and also include a patient variable, so I used the design ~ Individual + Time + Treatment.
    But R throws the error:

    error in DESeqDataSet(se, design = design, ignoreRank) :
    the model matrix is not full rank, so the model cannot be fit as specified.
    one or more variables or interaction terms in the design formula
    are linear combinations of the others and must be removed


    This also happens when I put only Treatment and Individual in the design, but other combinations, such as Time and Individual or Treatment and Time, work just fine.
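
    A note on the pattern of which designs fail: in the sample table every Individual belongs to exactly one Treatment, so each Treatment indicator column is a sum of Individual indicator columns. That kind of linear dependence is what the "not full rank" message describes. The sketch below is plain Python, not DESeq2 code; it rebuilds the model matrices from the ten posted rows and compares rank to column count. It assumes dummy coding with an intercept and the first level of each factor dropped (which mirrors R's default treatment contrasts), and the helper names `model_matrix`, `matrix_rank` and `check` are hypothetical.

```python
from fractions import Fraction

# The ten posted rows, reduced to (Individual, Treatment, Time).
samples = [
    ("1", "P", "T0"), ("1", "P", "T1"),
    ("2", "O", "T0"), ("2", "O", "T1"),
    ("3", "Control", "T0"), ("3", "Control", "T1"),
    ("4", "Control", "T0"), ("4", "Control", "T1"),
    ("5", "O+P", "T0"), ("5", "O+P", "T1"),
]
FIELDS = {"Individual": 0, "Treatment": 1, "Time": 2}


def model_matrix(rows, terms):
    """Intercept plus 0/1 dummy columns per factor, first level dropped (assumed coding)."""
    columns = [[1] * len(rows)]
    for term in terms:
        idx = FIELDS[term]
        levels = sorted({row[idx] for row in rows})
        for level in levels[1:]:
            columns.append([1 if row[idx] == level else 0 for row in rows])
    # Transpose so that each inner list is one sample (one matrix row).
    return [list(row) for row in zip(*columns)]


def matrix_rank(matrix):
    """Rank by exact Gauss-Jordan elimination on fractions (no rounding issues)."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column depends on the columns already processed
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def check(terms):
    """Return (rank, number of columns) for the design built from `terms`."""
    x = model_matrix(samples, terms)
    return matrix_rank(x), len(x[0])


for terms in (
    ["Individual", "Time", "Treatment"],
    ["Individual", "Treatment"],
    ["Individual", "Time"],
    ["Time", "Treatment"],
):
    rank, n_cols = check(terms)
    status = "full rank" if rank == n_cols else "NOT full rank"
    print("~ " + " + ".join(terms), "->", rank, "of", n_cols, "columns:", status)
```

    On these ten rows the two designs that contain both Individual and Treatment come out rank deficient, while ~ Individual + Time and ~ Time + Treatment are full rank, which matches the combinations that fail and work above.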

    I thought it would be good to add the individuals as a variable in the design, because the samples in the same treatment-time group are the biological replicates, yet they show great variability in their abundance counts (the differences in microbiota composition among individuals are very large in some cases).
    I thought that adding the Individual variable to the design would help account for that variability. However, by adding this variable, might the model use up more degrees of freedom, so that the effects of Treatment and Time on the abundances become non-significant if the changes are subtle?


    I have been trying to understand this by myself by reading the DESeq papers, but I lack the statistical knowledge, and I would like to ask for your help in understanding the error and in choosing the best design for the analysis.

    Thank you!!! Cheers
