Go Back   SEQanswers > Bioinformatics > Bioinformatics

Similar Threads
Thread Thread Starter Forum Replies Last Post
Problem in replacing the non-ACGT IUPAC codes to Ns with awk lovenlong Bioinformatics 2 12-09-2013 04:45 AM
Replacing FASTA headers for TopHat & Cufflinks brachysclereid Bioinformatics 2 02-16-2011 05:44 AM
replacing read id in bam file blu78 Bioinformatics 2 06-23-2010 12:14 PM

Thread Tools
Old 05-26-2014, 10:30 AM   #1
Location: Netherlands

Join Date: Apr 2014
Posts: 14
Default Replacing aspects of the DESeqDataSet

Upon analysing my RIPseq data with DESeq2, I noticed I had a large number of genes for which the fold change was read as "NA". This is because many of the genes have no reads in them. I wanted to change all the reads in my DESeqDataSet with a value of "0" to a value of "1" in order to avoid divide-by-zero errors resulting in "NA" fold changes. I thought it would work something like this:

Trimmed_BIP_dds[Trimmed_BIP_dds$assay == 0] <- 1
But that gives me this error, probably because DESeqDataSets can't be indexed like normal data frames:

Error in Trimmed_BIP_dds[Trimmed_BIP_dds$assay == 0] <- 1 : 
  object of type 'S4' is not subsettable
The help file on SummarizedExperiment instances (of which DESeqDataSet is a subtype) says this:

x[i,j], x[i,j] <- value:
Create or replace a subset of x. i, j can be numeric, logical, character, or missing. value must be a SummarizedExperiment instance with dimensions, dimension names, and assay elements consistent with the subset x[i,j] being replaced.
My problem is that I don't really understand what that means. I thought I should do something like this:

Trimmed_BIP_dds$assay[==0] <-1
But that doesn't work either. I also tried extracting the counts with the assay function and then changing them inside the matrix:

Trimmed_BIP_matrix <- assay(Trimmed_BIP_dds)
Trimmed_BIP_matrix[Trimmed_BIP_matrix == 0] <- 1
But then I couldn't figure out how to get the new matrix back into the DESeqDataSet object.

Can anyone help me with this?
Rivalyn is offline   Reply With Quote
Old 05-26-2014, 12:54 PM   #2
Devon Ryan
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Given a DeseqDataSet named dds:
IDX <- which(assay(dds) == 0)
assay(dds)[IDX] <- 1
Having said that, if the genes don't have any reads for any of the samples then a fold-change of NA would seem completely appropriate. Further, I really wouldn't recommend mucking with raw counts like this, you'll inevitably change a few of the results (so don't blame me if you shoot yourself in the foot).
dpryan is offline   Reply With Quote
Old 05-26-2014, 01:59 PM   #3
Location: Netherlands

Join Date: Apr 2014
Posts: 14

A lot of our read counts look like this:

Control = 0
Treated = tons

In this case it seems to me that we're discarding perfectly useful data due to a simple math error. I see your point, though.

Thanks for the help, I'll give it a whirl once I get back to work tomorrow.
Rivalyn is offline   Reply With Quote

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

All times are GMT -8. The time now is 06:40 PM.

Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO