

Old 05-10-2011, 10:31 AM   #1
Location: west coast

Join Date: May 2011
Posts: 17
Converting BAM files to non-normalized read counts

I am new to this business, so I would like to apologize in advance if this is trivial.

I have BAM files produced by TopHat and need to run a differential gene expression analysis. I can do that with Cufflinks, but I would also like to compare the results with another tool, e.g. edgeR. However, edgeR uses raw counts, not normalized FPKMs. I can naively convert FPKMs to counts by multiplying by the gene length and the total number of reads, but is there a better way to convert BAM files to non-normalized read counts?
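For reference, the naive back-conversion mentioned above can be sketched as follows. It simply inverts the usual definition FPKM = fragments / (length in kb × mapped fragments in millions). The function and parameter names are made up for illustration, and the result is only an approximation: Cufflinks works with estimated fragment assignments, so the numbers will generally not match counts taken directly from the BAM file.

```python
def fpkm_to_counts(fpkm, length_bp, total_mapped_reads):
    """Approximate a fragment count from an FPKM value.

    Inverts FPKM = count / ((length_bp / 1e3) * (total_mapped_reads / 1e6)).
    All names here are illustrative, not part of any tool's API.
    """
    length_kb = length_bp / 1e3                   # gene/transcript length in kilobases
    mapped_millions = total_mapped_reads / 1e6    # library size in millions
    return fpkm * length_kb * mapped_millions


# Hypothetical gene: FPKM of 10, 2 kb long, 20 million mapped reads in the library.
approx = fpkm_to_counts(fpkm=10.0, length_bp=2000, total_mapped_reads=20_000_000)
print(round(approx))  # 400
```

Because the values coming out of this are estimates rather than observed counts, counting reads directly from the alignments is usually the safer input for count-based tools.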
lpn is offline   Reply With Quote
Old 05-10-2011, 12:45 PM   #2
Senior Member
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319

Check out the following:
kopi-o is offline   Reply With Quote
Old 05-10-2011, 02:23 PM   #3
Junior Member
Location: CA

Join Date: May 2011
Posts: 2
How to do this in Galaxy?

Another newbie question - is this capability integrated into Galaxy? I am having trouble figuring out what the Galaxy "Operating on Genomic Intervals" tools can do. I have sequencing reads mapped using bowtie and would like to find the read counts for reads that fall into genomic interval tables downloaded from the UCSC Table Browser, like sno/miRNAs or RefSeq genes.
Alternatively I can go outside Galaxy and use some of the tools that kopi-o suggested.
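To make the task concrete, here is a minimal pure-Python sketch of counting reads per interval. It assumes the read positions and the interval table have already been parsed into tuples with 0-based, half-open coordinates (as in BED); the function name and the example data are hypothetical. It is a brute-force illustration of the overlap logic, not a replacement for a dedicated counting tool, and a read that overlaps two intervals is counted once for each.

```python
from collections import defaultdict


def count_reads_in_intervals(reads, intervals):
    """Count reads overlapping each named interval.

    reads:     iterable of (chrom, start, end)
    intervals: iterable of (chrom, start, end, name)
    Coordinates are assumed 0-based, half-open (BED-style).
    """
    intervals = list(intervals)
    # Start every interval at zero so intervals with no reads still appear.
    counts = {name: 0 for _chrom, _start, _end, name in intervals}

    # Group intervals by chromosome to avoid comparing across chromosomes.
    by_chrom = defaultdict(list)
    for chrom, start, end, name in intervals:
        by_chrom[chrom].append((start, end, name))

    for chrom, r_start, r_end in reads:
        for start, end, name in by_chrom.get(chrom, ()):
            # Two half-open ranges overlap if each starts before the other ends.
            if r_start < end and start < r_end:
                counts[name] += 1
    return counts


# Made-up example data
reads = [("chr1", 100, 150), ("chr1", 140, 190), ("chr2", 10, 60)]
intervals = [("chr1", 120, 160, "geneA"), ("chr2", 100, 200, "geneB")]
print(count_reads_in_intervals(reads, intervals))  # {'geneA': 2, 'geneB': 0}
```

For real data the nested loop gets slow quickly, so an indexed BAM file or a sorted sweep over the intervals would be the more practical route.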
2004rs is offline   Reply With Quote
Old 10-09-2012, 05:50 PM   #4
Junior Member
Location: china

Join Date: Sep 2012
Posts: 7

I have a question: how can I count the number of reads that mapped to each transcript? I have the BAM file from TopHat.

wangxj is offline   Reply With Quote
Old 10-09-2012, 07:52 PM   #5
Crusty old bioinformatician
Location: Melbourne, Australia

Join Date: Oct 2010
Posts: 8

Originally Posted by 2004rs
Another newbie question - is this capability integrated into Galaxy?
The answer depends on exactly what you mean by "into Galaxy":
1. No, not on the public Galaxy instances run by the Galaxy Team.
2. Yes, if you have your own Galaxy. I seem to remember seeing at least one suitable tool in the main Toolshed, but it required Matlab (!) for some strange reason... OTOH, you can find a version that uses pysam (warning: work in progress!) at and if you're game to try it. There's a bunch of other downstream tools in that repo that can use the resulting count matrices, including edgeR and DESeq and, further downstream, GSEA and SPIA wrappers too. Enjoy, but remember, YMMV.
fubar is offline   Reply With Quote
