using DEXSeq one gene with only one exon - problem

Jane M

Senior Member

Join Date: Aug 2011

Posts: 239
- Share
- Tweet
#1

using DEXSeq one gene with only one exon - problem

08-04-2011, 04:11 AM

Hello,

I have posted the problem that I meet on the bioconductor mailing list bioc-sig- sequencing but I have not got any answer yet. So I will try here too !

I am using DEXseq for exon differential analysis. The analysis seems to work
well on genes with several exons, but I have a problem with genes with only
one exon.
Let me present you my dataset:

pData(ecs)
sizeFactor condition replicate type
Raman_1 0.9638822 nonvir 1 paired-end
HM1_1 1.4000715 vir 1 paired-end
Raman_2 1.0314049 nonvir 2 paired-end
Raman_3 1.0734133 nonvir 3 paired-end
HM1_2 0.7801269 vir 2 paired-end
HM1_3 0.9488409 vir 3 paired-end

table( table (geneIDs(ecs)))
1 2 3 4 5 6 7 8
3149 893 874 206 100 42 18 4

As you can see, I have 3149 genes (out of 5286) which have only 1 exon.
There are 9291 exons.

res1 <- DEUresultTable(ecs)

> summary(res1)
geneID exonID dispersion_CR_est
EHI_051420 : 8 E001 :5286 Min. :0.000e+00
EHI_070690 : 8 E002 :2137 1st Qu.:7.176e-03
EHI_169670 : 8 E003 :1244 Median :1.693e-02
EHI_175010+EHI_175000.1: 8 E004 : 370 Mean :5.546e-02
EHI_005010 : 7 E005 : 164 3rd Qu.:3.641e-02
EHI_042710 : 7 E006 : 64 Max. :1.300e+01
(Other) :9245 (Other): 26 NA's :3.153e+03
dispersion pvalue padjust
Min. :2.010e-02 Min. : 0.0000 Min. : 0.0000
1st Qu.:2.202e-02 1st Qu.: 0.1150 1st Qu.: 0.4597
Median :2.628e-02 Median : 0.4054 Median : 0.8086
Mean :3.229e+04 Mean : 0.4269 Mean : 0.6775
3rd Qu.:4.214e-02 3rd Qu.: 0.7133 3rd Qu.: 0.9510
Max. :1.000e+08 Max. : 1.0000 Max. : 1.0000
NA's :3153.0000 NA's :3153.0000

There are 3153 NA's values. When checking out what were those NA's values, I
realized that they were all the genes with only one exon (and only 4 more
genes in the other cases). I doubt that it is by chance... And they don't
have weird read counts. Check for example EHI_000010, EHI_000410,
EHI_000420, EHI_000430, EHI_000440:

geneCountTable(ecs)[1:21,]
Raman_1 HM1_1 Raman_2 Raman_3 HM1_2 HM1_3
EHI_000010 391 1364 464 449 457 560
EHI_000130 18806 8116 21890 24994 6509 14067
EHI_000240 24057 21825 15365 26903 21101 27841
EHI_000250 3374 2689 3938 4854 2310 3556
EHI_000290 458 3721 437 550 1716 1054
EHI_000410 896 1041 1086 978 905 745
EHI_000420 140 220 244 117 140 78
EHI_000430 583 2043 304 634 1398 1197
EHI_000440 3103 15309 5358 3619 9152 10392
EHI_000450 358 559 991 516 281 564
EHI_000460.1 357 118 449 425 43 746
EHI_000470.1 408 2657 297 397 946 909
EHI_000480 842 1702 858 778 841 847
EHI_000490 187 648 221 239 370 249
EHI_000500 329 506 454 402 244 555
EHI_000520 482 1615 883 440 925 751
EHI_000530 1794 1588 1751 1966 1099 1860
EHI_000540 2857 2504 3706 4152 1479 2547
EHI_000550 1967 3999 1241 1904 2505 2448
EHI_000560 152 434 239 165 273 175
EHI_000570 1209 2142 2183 1213 1905 2662

res1[1:20,]
geneID exonID dispersion_CR_est dispersion pvalue
EHI_000010:001 EHI_000010 E001 NA 0.02377712 NA
EHI_000130:001 EHI_000130 E001 0.074669451 0.07466945 0.48673539
EHI_000130:002 EHI_000130 E002 0.128972813 0.12897281 0.94279527
EHI_000130:003 EHI_000130 E003 0.019602219 0.02031871 0.84327914
EHI_000130:004 EHI_000130 E004 0.067337196 0.06733720 0.70429765
EHI_000240:001 EHI_000240 E001 0.043753997 0.04375400 0.74939865
EHI_000240:002 EHI_000240 E002 0.058562700 0.05856270 0.74939865
EHI_000250:001 EHI_000250 E001 0.031440630 0.03144063 0.70444841
EHI_000250:002 EHI_000250 E002 0.037457450 0.03745745 0.70444841
EHI_000290:001 EHI_000290 E001 0.021113888 0.02396819 0.05100089
EHI_000290:002 EHI_000290 E002 0.007626346 0.03471905 0.95676220
EHI_000290:003 EHI_000290 E003 0.036889359 0.03688936 0.03057067
EHI_000410:001 EHI_000410 E001 NA 0.02235347 NA
EHI_000420:001 EHI_000420 E001 NA 0.03395573 NA
EHI_000430:001 EHI_000430 E001 NA 0.02219522 NA
EHI_000440:001 EHI_000440 E001 NA 0.02037263 NA
EHI_000450:001 EHI_000450 E001 0.024495906 0.04492663 0.64717121
EHI_000450:002 EHI_000450 E002 0.020356260 0.02483641 0.64717121
EHI_000460.1:001 EHI_000460.1 E001 NA 0.02602179 NA
EHI_000470.1:001 EHI_000470.1 E001 NA 0.02254333 NA
padjust
EHI_000010:001 NA
EHI_000130:001 0.8639624
EHI_000130:002 0.9927663
EHI_000130:003 0.9756197
EHI_000130:004 0.9503086
EHI_000240:001 0.9611397
EHI_000240:002 0.9611397
EHI_000250:001 0.9503086
EHI_000250:002 0.9503086
EHI_000290:001 0.2890521
EHI_000290:002 0.9962043
EHI_000290:003 0.2087239
EHI_000410:001 NA
EHI_000420:001 NA
EHI_000430:001 NA
EHI_000440:001 NA
EHI_000450:001 0.9278993
EHI_000450:002 0.9278993
EHI_000460.1:001 NA
EHI_000470.1:001 NA

I checked the model in the available documentation and I don't see why the test does
not give a result in this case, ie why the Cox-Reid likelihood estimation cannot be computed. Has anyone got the same problem?

Thanks for your help,
Jane

Last edited by Jane M; 08-04-2011, 04:21 AM.
Tags: None
Jane M

Senior Member

Join Date: Aug 2011

Posts: 239
- Share
- Tweet
#2

08-04-2011, 06:09 AM

I got a response finally. So, here is what I got, it could help someone:

DEXSeq is not a tool to find differentially expressed genes, but removes the effect of overall changes to a gene's expression. Its purpose is to test whether the relative usage of exons in agene depends on a condition.

Let's say you have a gene with three exons, a cassette exon flanked by consitutive exons. In one condition 20% of all the reads that fall onto the gene overlap with the middle exon, in the other condition, it is 40%. This is what DEXSeq looks for. If the counts for all exons of a gene increases by the sae factor from one condition to the other, the gene may be differentially expressed but not differentially spliced, and DEXSeq does not aim to detect the former.

Hence, single-exon genes cannot be tested and are skipped.
Comment

Previous template Next

Essential Discoveries and Tools in Epitranscriptomics

by seqadmin

The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
- Channel: Articles
Yesterday, 07:01 AM
Current Approaches to Protein Sequencing

by seqadmin

Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
- Channel: Articles
04-04-2024, 04:25 PM

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 58 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 54 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 46 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 55 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

using DEXSeq one gene with only one exon - problem

Comment

Latest Articles

ad_right_rmr

News