Dear All,
I am getting completely different results for the count of mapped reads on transcript features using the countOverlaps and summarizeOverlaps.
CountOverlaps on using a gappedAlignment and Granges object gives slightly different results. SummarizeOverlaps gives totally different results for the modes "Union" and "intersectionNotEmpty".
Can someone please let me know where I am making a mistake. I have attached a screenshot of the transcripts. All the transcripts belong to the same gene.
Following steps were performed by me on a small subset of my data
1. I built a gapped alignment object of mapped reads using the command
reads=readBamGappedAlignments("path / to / bam/sorted.bam")
2. From the gapped alignment I built a genomic ranges object
ranges=GRanges(seqnames=rname(reads),
ranges=IRanges(start=start(reads),
end=end(reads)),strand=rep("*",length(reads)))
3. I built a genomic ranges list from a custom genedb object (4 transcripts).
geneE=exonsBy(genedb,'gene')
4. Used the count and summarize overlaps functions
a). count1=countOverlaps(geneE, reads)
b). count2=countOverlaps(geneE, ranges)
c). count3=assays(summarizeOverlaps(geneE, reads,
mode="UNION"))$counts[,1]
c). count4=assays(summarizeOverlaps(geneE, reads,
mode="intersectionNotEmpty"))$counts[,1]
5. I get four different count results
"T1, T2, T3, T4"
a). count1 "55972, 41029, 18270, 18734"
b). count2 "55989, 41046, 18287, 18751"
c). count3 "0, 2, 0, 1092"
d). count4 "0, 3712, 0, 2550"
I am getting completely different results for the count of mapped reads on transcript features using the countOverlaps and summarizeOverlaps.
CountOverlaps on using a gappedAlignment and Granges object gives slightly different results. SummarizeOverlaps gives totally different results for the modes "Union" and "intersectionNotEmpty".
Can someone please let me know where I am making a mistake. I have attached a screenshot of the transcripts. All the transcripts belong to the same gene.
Following steps were performed by me on a small subset of my data
1. I built a gapped alignment object of mapped reads using the command
reads=readBamGappedAlignments("path / to / bam/sorted.bam")
2. From the gapped alignment I built a genomic ranges object
ranges=GRanges(seqnames=rname(reads),
ranges=IRanges(start=start(reads),
end=end(reads)),strand=rep("*",length(reads)))
3. I built a genomic ranges list from a custom genedb object (4 transcripts).
geneE=exonsBy(genedb,'gene')
4. Used the count and summarize overlaps functions
a). count1=countOverlaps(geneE, reads)
b). count2=countOverlaps(geneE, ranges)
c). count3=assays(summarizeOverlaps(geneE, reads,
mode="UNION"))$counts[,1]
c). count4=assays(summarizeOverlaps(geneE, reads,
mode="intersectionNotEmpty"))$counts[,1]
5. I get four different count results
"T1, T2, T3, T4"
a). count1 "55972, 41029, 18270, 18734"
b). count2 "55989, 41046, 18287, 18751"
c). count3 "0, 2, 0, 1092"
d). count4 "0, 3712, 0, 2550"