Hello everyone,
I am using the DEXSeq library in R. To get the read counts per exon, I used the script dexseq_count.py. When executing this script I get many warnings like this one:
Code:
/usr/local/lib64/python2.7/site-packages/HTSeq/__init__.py:592: UserWarning: Read 1350_690_366_F3 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
  "which could not be found. (Is the SAM file properly sorted?)" )
The data is paired-end data in color-space format. The BAM file was created with TopHat. When I run samtools flagstat on it, I get this output:
Code:
9158190 properly paired (6.38%)
89796971 singletons (62.59%)
I converted the data to SAM after sorting the BAM file with samtools sort. The conversion to SAM was done with samtools view -o <outFile> <inFile>.
When I ignore the warnings of dexseq_count.py, I can load the data in R. But when I call the function estimateDispersions, I get this error:
Code:
Dispersion estimation. (Progress report: one dot per 100 genes)
Error in FUN(c("ENSG00000000003", "ENSG00000000419", "ENSG00000000457", :
  Underdetermined model; cannot estimate dispersions. Maybe replicates have not been properly specified.
In addition: Warning messages:
1: In .local(object, ...) :
  Exons with less than 10 counts will not be tested. For more details please see the manual page of 'estimateDispersions', parameter 'minCount'
2: In .local(object, ...) :
  Genes with more than 70 testable exons will be omitted from the analysis. For more details please see the manual page of 'estimateDispersions', parameter 'maxExon'.
This is my R code:
Code:
samples <- data.frame(condition,type,row.names=condition)
pairedGenes <- read.HTSeqCounts(countfiles = c(paired,single), design = samples, flattenedfile = annotationfile)
pairedExons <- estimateSizeFactors(pairedGenes)
pairedExons <- estimateDispersions(pairedExons)
How can I solve this error?
Does this error occur because of the warnings from dexseq_count.py?
And do the warnings from dexseq_count.py occur because of the singletons from TopHat?
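On the sorting question raised by the warning: samtools sort orders by coordinate by default, while, as far as I understand, the HTSeq paired-end handling in this version expects the two mates of a pair on adjacent lines, which usually means sorting by read name (samtools sort -n). Below is a minimal sketch for checking this on the SAM text, assuming both mates share the same read name. The function name and the toy records are made up for illustration and are not part of HTSeq or DEXSeq.

```python
from itertools import groupby

# SAM flag bits used below
PAIRED = 0x1         # read is part of a pair
UNMAPPED = 0x4       # the read itself is unmapped
MATE_UNMAPPED = 0x8  # the mate is unmapped


def lonely_reads(sam_lines):
    """Return names of records that claim an aligned mate but have no
    record with the same name on an adjacent line."""
    records = (
        line.split("\t")
        for line in sam_lines
        if line.strip() and not line.startswith("@")  # skip header lines
    )
    lonely = []
    # groupby only merges consecutive records with the same read name
    for name, group in groupby(records, key=lambda fields: fields[0]):
        group = list(group)
        if len(group) > 1:
            continue  # a neighbour with the same name exists
        flag = int(group[0][1])
        claims_mate = (
            flag & PAIRED and not flag & UNMAPPED and not flag & MATE_UNMAPPED
        )
        if claims_mate:
            lonely.append(name)
    return lonely


# Toy records (hypothetical): flag 99 = first mate, flag 147 = second mate
a1 = "readA\t99\tchr1\t100\t50\t4M\t=\t300\t204\tACGT\tIIII"
a2 = "readA\t147\tchr1\t300\t50\t4M\t=\t100\t-204\tACGT\tIIII"
b1 = "readB\t99\tchr1\t150\t50\t4M\t=\t350\t204\tACGT\tIIII"
b2 = "readB\t147\tchr1\t350\t50\t4M\t=\t150\t-204\tACGT\tIIII"

position_sorted = [a1, b1, a2, b2]  # mates separated by other reads
name_sorted = [a1, a2, b1, b2]      # mates adjacent

print(lonely_reads(position_sorted))
print(lonely_reads(name_sorted))
```

True singletons (mate unmapped, flag bit 0x8 set) are not reported by this check, so a long list here would point at the sort order rather than at the singletons. If the two mates carry different name suffixes (the _F3 in the warning suggests that may be the case for this color-space data), the names would have to be normalized first; the sketch does not handle that.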
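On the estimateDispersions error: the message itself points at replicates. The design in the R code is built with row.names=condition, and R requires row names to be unique, so every sample appears to have its own condition value, which would leave no replicates per condition. Below is a small sketch of that check, written in Python only to keep it separate from the R session. The condition labels are hypothetical, since the actual values of condition are not shown.

```python
from collections import Counter


def conditions_without_replicates(conditions):
    """Return the condition labels that are used by fewer than two samples."""
    counts = Counter(conditions)
    return sorted(label for label, n in counts.items() if n < 2)


# Hypothetical labels: one distinct value per sample (what unique row names imply)
unique_labels = ["paired1", "paired2", "single1", "single2"]
# Hypothetical labels: samples grouped into two conditions with two replicates each
grouped_labels = ["paired", "paired", "single", "single"]

print(conditions_without_replicates(unique_labels))
print(conditions_without_replicates(grouped_labels))
```

If this is the cause, the condition vector would need repeated values for replicate samples, with the unique sample names used as row names instead.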