  • DESeq2 collinearity issue

    Hi everybody,

    I have been trying to use DESeq2 for analyzing RNA-seq data, and ran into a problem.

    The conditions table looks like this:

    Code:
    sample	donor	virus	vpu
    DonorA1_01	A1	none	mock
    DonorA1_02	A1	CH293	wt
    DonorA1_03	A1	CH293	stop
    DonorA1_04	A1	CH293	R50K
    DonorA1_05	A1	CH293	teth_count
    DonorA1_06	A1	CH077	wt
    DonorA1_07	A1	CH077	stop
    DonorA1_08	A1	CH077	R50K
    DonorA1_09	A1	CH077	teth_count
    DonorA1_10	A1	STC01	wt
    DonorA1_11	A1	STC01	stop
    DonorA1_12	A1	STC01	R50K
    DonorA1_13	A1	STC01	teth_count
    DonorX_01	X	none	mock
    DonorX_02	X	CH293	wt
    DonorX_03	X	CH293	stop
    DonorX_04	X	CH293	R50K
    DonorX_05	X	CH293	teth_count
    DonorX_06	X	CH077	wt
    DonorX_07	X	CH077	stop
    DonorX_08	X	CH077	R50K
    DonorX_09	X	CH077	teth_count
    DonorX_10	X	STC01	wt
    DonorX_11	X	STC01	stop
    DonorX_12	X	STC01	R50K
    DonorX_13	X	STC01	teth_count
    DonorY_01	Y	none	mock
    DonorY_02	Y	CH293	wt
    DonorY_03	Y	CH293	stop
    DonorY_04	Y	CH293	R50K
    DonorY_05	Y	CH293	teth_count
    DonorY_06	Y	CH077	wt
    DonorY_07	Y	CH077	stop
    DonorY_08	Y	CH077	R50K
    DonorY_09	Y	CH077	teth_count
    DonorY_10	Y	STC01	wt
    DonorY_11	Y	STC01	stop
    DonorY_12	Y	STC01	R50K
    DonorY_13	Y	STC01	teth_count
    DonorZ_01	Z	none	mock
    DonorZ_02	Z	CH293	wt
    DonorZ_03	Z	CH293	stop
    DonorZ_04	Z	CH293	R50K
    DonorZ_05	Z	CH293	teth_count
    DonorZ_06	Z	CH077	wt
    DonorZ_07	Z	CH077	stop
    DonorZ_08	Z	CH077	R50K
    DonorZ_09	Z	CH077	teth_count
    DonorZ_10	Z	STC01	wt
    DonorZ_11	Z	STC01	stop
    DonorZ_12	Z	STC01	R50K
    DonorZ_13	Z	STC01	teth_count
    When I specify the model for differential expression analysis as dds <- DESeqDataSetFromTximport(txi, samples, ~vpu+donor+virus), I get an error message:

    Error in checkFullRank(modelMatrix) :
    the model matrix is not full rank, so the model cannot be fit as specified.
    One or more variables or interaction terms in the design formula are linear
    combinations of the others and must be removed.

    The same error occurs if the design is only ~vpu+virus.

    I understand that this is because of collinearity among the variables but I am not sure how to resolve the issue.

    Any help would be highly appreciated!

    Thanks!
    Chris

  • #2
    The problem is that "virus==none" is the same as "vpu==mock".
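
To see the dependency concretely, here is a minimal sketch in plain Python (not R or DESeq2 itself) that rebuilds the conditions table from the first post, dummy-codes the design ~vpu+donor+virus, and compares the number of columns with the rank of the resulting matrix. The helper names (build_samples, design_matrix, matrix_rank) are made up for this illustration, and the reference level picked for each factor is arbitrary; it does not change the rank.

```python
from fractions import Fraction
from itertools import product

DONORS = ["A1", "X", "Y", "Z"]
VIRUSES = ["CH293", "CH077", "STC01"]
VPUS = ["wt", "stop", "R50K", "teth_count"]


def build_samples():
    """Recreate the conditions table: per donor, 1 mock + 3 viruses x 4 vpu variants."""
    rows = []
    for donor in DONORS:
        rows.append({"donor": donor, "virus": "none", "vpu": "mock"})
        for virus, vpu in product(VIRUSES, VPUS):
            rows.append({"donor": donor, "virus": virus, "vpu": vpu})
    return rows


def design_matrix(samples, factors):
    """Intercept plus one 0/1 column per non-reference level of each factor."""
    names = ["(Intercept)"]
    matrix = [[Fraction(1)] for _ in samples]
    for factor in factors:
        levels = sorted({s[factor] for s in samples})
        for level in levels[1:]:  # first sorted level is the (arbitrary) reference
            names.append(f"{factor}{level}")
            for row, s in zip(matrix, samples):
                row.append(Fraction(int(s[factor] == level)))
    return names, matrix


def matrix_rank(matrix):
    """Exact rank via Gaussian elimination on fractions (no rounding issues)."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # this column depends on the ones already processed
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            scale = rows[i][col] / rows[rank][col]
            if scale != 0:
                rows[i] = [a - scale * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


samples = build_samples()

# The two indicators are identical on every sample, which is the collinearity.
same_indicator = all((s["virus"] == "none") == (s["vpu"] == "mock") for s in samples)

names_full, x_full = design_matrix(samples, ["vpu", "donor", "virus"])
names_small, x_small = design_matrix(samples, ["vpu", "virus"])

print(len(samples), same_indicator)            # 52 True
print(len(names_full), matrix_rank(x_full))    # 11 10
print(len(names_small), matrix_rank(x_small))  # 8 7
```

In both designs the rank is one less than the number of columns, which is exactly the situation checkFullRank complains about. In R, a comparable check would be to compare qr(model.matrix(~vpu+donor+virus, samples))$rank with the number of columns of that model matrix.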


    • #3
      Thanks a lot!
      For the most important comparison, vpu "wt" vs. "stop", leaving out the non-infected samples works.
      But if I want to compare non-infected vs. infected samples, how could this be resolved? By using an additional dummy variable?
      What would that look like?

      Thanks!
      Chris


      • #4
        My guess (since I haven't a clue about the background to your experiment) is that you want vpu="wt" for the mock-infected samples. The combination of vpu and virus would then be distinct.

        *Edit*: Alternatively, set the virus="none" samples to what they're mock infected with (e.g., CH293), which I suspect will be clearer.
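
The options discussed in this thread can be compared with a small rank check. The following is a sketch in plain Python (not R or DESeq2), assuming the conditions table from the first post; design_rank and the variant names are hypothetical helpers for illustration. It only shows whether each recoding gives a full-rank design for ~vpu+donor+virus. Whether the resulting coefficients answer the biological question is a separate matter.

```python
from fractions import Fraction
from itertools import product


def build_samples():
    """Recreate the conditions table: per donor, 1 mock + 3 viruses x 4 vpu variants."""
    rows = []
    for donor in ["A1", "X", "Y", "Z"]:
        rows.append({"donor": donor, "virus": "none", "vpu": "mock"})
        for virus, vpu in product(["CH293", "CH077", "STC01"],
                                  ["wt", "stop", "R50K", "teth_count"]):
            rows.append({"donor": donor, "virus": virus, "vpu": vpu})
    return rows


def design_rank(samples, factors):
    """Return (number of columns, rank) of an intercept + dummy-coded design."""
    matrix = [[Fraction(1)] for _ in samples]
    for factor in factors:
        levels = sorted({s[factor] for s in samples})
        for level in levels[1:]:  # drop one (arbitrary) reference level per factor
            for row, s in zip(matrix, samples):
                row.append(Fraction(int(s[factor] == level)))
    n_cols = len(matrix[0])
    rank = 0
    for col in range(n_cols):  # exact Gaussian elimination
        pivot = next((i for i in range(rank, len(matrix)) if matrix[i][col] != 0), None)
        if pivot is None:
            continue
        matrix[rank], matrix[pivot] = matrix[pivot], matrix[rank]
        for i in range(rank + 1, len(matrix)):
            scale = matrix[i][col] / matrix[rank][col]
            if scale != 0:
                matrix[i] = [a - scale * b for a, b in zip(matrix[i], matrix[rank])]
        rank += 1
    return n_cols, rank


original = build_samples()
variants = {
    "original": original,
    # Post #3: leave out the non-infected samples entirely.
    "drop_mock": [s for s in original if s["virus"] != "none"],
    # Post #4, first suggestion: label the mock samples as vpu = "wt".
    "mock_as_wt": [dict(s, vpu="wt") if s["vpu"] == "mock" else s for s in original],
    # Post #4, edit: give the virus = "none" samples a virus label (CH293 as the example).
    "none_as_CH293": [dict(s, virus="CH293") if s["virus"] == "none" else s
                      for s in original],
}

factors = ["vpu", "donor", "virus"]
results = {name: design_rank(rows, factors) for name, rows in variants.items()}
for name, (n_cols, rank) in results.items():
    status = "full rank" if rank == n_cols else "NOT full rank"
    print(f"{name}: {n_cols} columns, rank {rank} -> {status}")
```

Under these assumptions only the original coding is rank deficient (11 columns, rank 10). Dropping the mock samples gives 9 columns with rank 9, and both recodings give 10 columns with rank 10.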


        • #5
          Perfect, thanks a lot for your help, Devon!

          Best,
          Chris
