**dpryan** · 06-08-2014, 01:12 AM

The functions in DESeq2 that load those files actually remove those lines for you.

**JonB** · 06-08-2014, 01:16 AM

When I do

tail(counts(cds, normalized=TRUE))

I see that these lines are there, but maybe they are not taken into account when doing analyses in DESeq?

**dpryan** · 06-08-2014, 01:21 AM

What function did you use to load the files and what's the output of sessionInfo()?

**JonB** · 06-08-2014, 01:29 AM

> library("DESeq")
> sampleTable = read.csv(file="Gene_count_files/sampletable.txt", header=TRUE, sep="\t")
> cds = newCountDataSetFromHTSeqCount(sampleTable, directory="Gene_count_files/")
> cds = estimateSizeFactors(cds)

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] C

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] DESeq_1.14.0 lattice_0.20-29 locfit_1.5-9.1 Biobase_2.22.0
[5] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 BiocGenerics_0.8.0

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.24.0 DBI_0.2-7 RColorBrewer_1.0-5 RSQLite_0.11.4
[5] XML_3.95-0.2 annotate_1.40.1 genefilter_1.44.0 geneplotter_1.40.0
[9] grid_3.0.2 splines_3.0.2 stats4_3.0.2 survival_2.37-7
[13] tools_3.0.2 xtable_1.7-3

**dpryan** · 06-08-2014, 02:07 AM

Try using the DESeqDataSetFromHTSeqCount() function. At least in the most recent version it strips those lines.

**JonB** · 06-08-2014, 02:18 AM

Ok, thanks!

Is it also safe to remove these lines from the raw count files or will this mess up the normalization later?

**gringer** · 06-08-2014, 02:43 AM

Unless you have a specific reason not to, you should probably be using DESeq2 rather than DESeq -- it has better statistical models, is more flexible, and makes the process a bit easier.

That said, I would expect that removing the lines will be fine, given that other ways of getting counts into a DESeq structure don't require unmapped read counts to be specified.

**dpryan** · 06-08-2014, 04:28 AM

Go ahead and remove them; they should be removed prior to normalization anyway. And as David said, switch to DESeq2, which has a number of improvements.

**JonB** · 06-08-2014, 02:09 PM

Thanks guys,
I actually didn't know there was a DESeq2. I will check it out ASAP.

**super0925** · 06-09-2014, 03:20 AM

I remove these lines manually. A short script should be enough for you.
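
A minimal sketch of such a script in base R, for anyone who wants to drop the lines after loading rather than editing the count files. It assumes the lines in question are the summary counters that htseq-count appends after the per-gene rows (`no_feature`, `ambiguous`, `too_low_aQual`, `not_aligned`, `alignment_not_unique`, with a `__` prefix in newer HTSeq releases). The matrix `cnt` and its gene and sample names are invented for illustration and only stand in for a real count matrix.

```r
# Toy count matrix standing in for a real one; gene names and values are invented.
cnt <- matrix(
  c(10L, 0L, 5L, 200L, 50L,
    12L, 1L, 7L, 180L, 40L),
  ncol = 2,
  dimnames = list(
    c("geneA", "geneB", "geneC", "__no_feature", "__ambiguous"),
    c("sample1", "sample2")
  )
)

# Summary counters that htseq-count appends after the per-gene rows.
special <- c("no_feature", "ambiguous", "too_low_aQual",
             "not_aligned", "alignment_not_unique")

# Strip an optional leading "__" so both naming styles are matched.
is_special <- sub("^__", "", rownames(cnt)) %in% special

# Keep only the per-gene rows; drop = FALSE preserves the matrix shape.
filtered <- cnt[!is_special, , drop = FALSE]
print(rownames(filtered))
```

Matching on the row names rather than on row position avoids relying on the counters being the last five lines. Following the advice above, the filtering should happen before size factors are estimated.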

DESeq: question about using HTSeq counts