SEQanswers (Bioinformatics forum)

lpn 05-10-2011 10:31 AM

converting bam files to non-normalized read counts
I am new to this business, so I would like to apologize in advance if this is trivial.

I have BAM files produced by TopHat and need to run a differential gene expression analysis. I can do that with Cufflinks, but I would also like to compare the results with, e.g., edgeR. However, edgeR uses raw counts, not normalized FPKMs. I can naively convert FPKMs to counts by multiplying by the gene length and the total number of reads, but is there a better way to convert BAM files to non-normalized read counts?
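For reference, the naive back-conversion described above can be sketched as follows. This is only an illustration of the arithmetic, assuming the standard FPKM definition (fragments per kilobase of gene per million mapped fragments); the function name and the example numbers are hypothetical. The result is an estimate rather than an integer count, and it may not match the fragments actually assigned to the gene, which is one reason counting directly from the BAM file is usually preferable for edgeR.

```python
def fpkm_to_count(fpkm, gene_length_bp, total_mapped_fragments):
    """Approximate fragment count implied by an FPKM value (hypothetical helper)."""
    # FPKM = fragments / (length in kb) / (mapped fragments in millions),
    # so the naive inversion multiplies both scaling factors back in.
    return fpkm * (gene_length_bp / 1e3) * (total_mapped_fragments / 1e6)


# Made-up numbers: FPKM 10, a 2 kb gene, 20 million mapped fragments.
estimate = fpkm_to_count(10.0, 2000, 20_000_000)
print(round(estimate))
```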

kopi-o 05-10-2011 12:45 PM

Check out the following:

2004rs 05-10-2011 02:23 PM

how to do this in Galaxy?
Another newbie question - is this capability integrated into Galaxy? I am having trouble figuring out what the Galaxy "Operating on Genomic Intervals" tools can do. I have sequencing reads mapped using bowtie and would like to find the read counts for reads that fall into genomic interval tables downloaded from the UCSC Table Browser, such as sno/miRNAs or RefSeq genes.
Alternatively I can go outside Galaxy and use some of the tools that kopi-o suggested.
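Outside Galaxy, the underlying operation (counting reads that fall into a table of intervals) can be sketched in plain Python. This is a minimal illustration, not any particular tool's method. It assumes reads and intervals are already available as tuples in 0-based, half-open coordinates (the BED convention); the function name and the example data are made up. A read that overlaps two intervals is counted in both, and the nested scan is only suitable for small inputs.

```python
from collections import defaultdict


def count_reads_in_intervals(reads, intervals):
    """Count reads overlapping each named interval (hypothetical helper).

    reads:     iterable of (chrom, start, end), 0-based half-open
    intervals: iterable of (chrom, start, end, name), 0-based half-open
    """
    intervals = list(intervals)
    # Group intervals by chromosome so each read is only compared
    # against intervals on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end, name in intervals:
        by_chrom[chrom].append((start, end, name))

    counts = {name: 0 for _, _, _, name in intervals}
    for chrom, r_start, r_end in reads:
        for i_start, i_end, name in by_chrom.get(chrom, []):
            # Two half-open ranges overlap if each starts before the other ends.
            if r_start < i_end and i_start < r_end:
                counts[name] += 1
    return counts


# Made-up example data.
example_intervals = [
    ("chr1", 100, 200, "geneA"),
    ("chr1", 300, 400, "geneB"),
    ("chr2", 0, 50, "geneC"),
]
example_reads = [
    ("chr1", 90, 110),
    ("chr1", 150, 180),
    ("chr1", 200, 230),
    ("chr1", 390, 420),
    ("chr3", 10, 40),
]
example_counts = count_reads_in_intervals(example_reads, example_intervals)
print(example_counts)
```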

wangxj 10-09-2012 05:50 PM

I have a question: how can you count the number of reads that mapped to each transcript? I now have the BAM file from TopHat.
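Because TopHat aligns reads to the genome, per-transcript counting generally means comparing each alignment's genomic position against an annotation. As a first building block, here is a hedged sketch that turns one SAM-format text line (for example, as printed by `samtools view`) into a genomic interval. The function name and the example lines are made up. Note that for a spliced read the returned span includes the skipped intron (`N` in the CIGAR string), so a simple overlap test on the whole span can overcount.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def sam_line_to_interval(line):
    """Return (chrom, start, end), 0-based half-open, or None if unmapped."""
    fields = line.rstrip("\n").split("\t")
    flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    # Flag bit 0x4 marks an unmapped read; "*" means no CIGAR is available.
    if flag & 0x4 or cigar == "*":
        return None
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    start = pos - 1  # SAM POS is 1-based; convert to 0-based
    return chrom, start, start + ref_len


# Made-up alignment lines: one spliced read and one unmapped read.
spliced = "read1\t0\tchr1\t101\t50\t30M100N20M\t*\t0\t0\t*\t*"
unmapped = "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*"
print(sam_line_to_interval(spliced))
print(sam_line_to_interval(unmapped))
```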


fubar 10-09-2012 07:52 PM


Originally Posted by 2004rs (Post 41328)
Another newbie question - is this capability integrated into Galaxy?

The answer depends on exactly what you mean by "into Galaxy" :)
1. No, not on the public Galaxy instances run by the Galaxy Team.
2. Yes, if you have your own Galaxy. There's at least one suitable tool I seem to remember seeing in the main Toolshed - but it required Matlab (!) for some strange reason... OTOH, you can find a version that uses pysam (warning: work in progress!) at and if you're game to try it. That repo also has a bunch of other downstream tools that can use the resulting count matrices, including edgeR and DESeq and, further downstream, GSEA and SPIA wrappers. Enjoy but remember, YMMV.

All times are GMT -8.
