TopHat (Tuxedo Suite) - AWS vs Local

drsnafu1831

Junior Member

Join Date: Dec 2016

Posts: 1
#1

12-14-2016, 09:24 AM

Hello all,
Sorry, this one is a bit long.
I have come here after some testing.

I am trying to run the Tuxedo suite for RNA-Seq analysis on AWS.

After alignment with TopHat, I see a significant difference in the size of "accepted_hits.bam" between runs on AWS (Amazon Web Services) and on a local system.

Details
Paired-end FASTQ files
Size after adapter and quality thresholding: 7 GB (2 x 3.5 GB)
Human sample

AWS instance
15 cores of an m4.16xlarge (64 cores / 256 GB RAM)
Time: ~45 min
Output accepted_hits.bam: ~100 MB

Local server
15 cores of 36 cores / 256 GB RAM
Time: ~52 min
Output accepted_hits.bam: ~1.2 GB

Questions
Why is there such a significant difference? What could be the reasons?

I found one thread that is somewhat related:

https://www.biostars.org/p/102874/

As suggested in the above thread, I have tried running TopHat with the same parameters and input files:
on 1, 5, 10, 15, 20, 30 and 64 cores on the local server
on 30 and 64 cores on AWS

Interestingly, the size of "accepted_hits.bam" stayed the same (1.2 GB) up to 30 cores on the local server (specs mentioned above) and dropped to ~120 MB at 64 cores.
On AWS, as said above, both 30 and (later) 64 cores give an accepted_hits.bam of ~100 MB.
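
For anyone who wants to repeat this kind of thread-count sweep, here is a minimal sketch of how the runs could be scripted so that only the thread count changes between them. It assumes TopHat's standard `-p` (threads) and `-o` (output directory) options. The index name, FASTQ file names, and output prefix are hypothetical placeholders, not the ones used in this post.

```python
import shlex


def tophat_commands(index, reads1, reads2, thread_counts, out_prefix="tophat_p"):
    """Build one TopHat command per thread count.

    Each run gets its own output directory so the accepted_hits.bam
    files from different thread counts do not overwrite each other.
    """
    commands = []
    for n in thread_counts:
        commands.append(
            ["tophat", "-p", str(n), "-o", f"{out_prefix}{n}", index, reads1, reads2]
        )
    return commands


# Hypothetical inputs: replace with your own Bowtie index and FASTQ files.
sweep = tophat_commands(
    "hg_index", "sample_R1.fastq", "sample_R2.fastq", [1, 5, 10, 15, 20, 30, 64]
)
for cmd in sweep:
    # Print the shell command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Keeping every argument except `-p` identical makes it easier to attribute any change in output to the thread count alone.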

Any input, suggestions and comments are welcome.
Thank you for your time!
Tags: rnaseq alignment tuxedo, tophat, tuxedo suite

Brian Bushnell

Super Moderator

Join Date: Jan 2014

Posts: 2709
#2

12-14-2016, 09:36 AM

Sounds like a bug to me... maybe you should try a different version of TopHat. The number of cores should not affect the size of the output BAM by more than a tiny amount.

Alternatively, you could try BBMap; it's faster than TopHat and produces correct output for any number of cores.
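
File size is only a rough proxy for how much was aligned, so one way to confirm that two runs really differ is to compare the number of alignment records in each BAM. The sketch below assumes `samtools` is installed and on the PATH and uses `samtools view -c` to count records. The helper names and the BAM paths in the usage comment are hypothetical.

```python
import subprocess


def count_records(bam_path):
    """Return the number of alignment records in a BAM via `samtools view -c`."""
    result = subprocess.run(
        ["samtools", "view", "-c", bam_path],
        capture_output=True,
        text=True,
        check=True,  # raise if samtools exits with an error
    )
    return int(result.stdout.strip())


def relative_difference(count_a, count_b):
    """Fraction by which the smaller count falls short of the larger one."""
    larger = max(count_a, count_b)
    if larger == 0:
        return 0.0
    return abs(count_a - count_b) / larger


# Example usage (hypothetical paths, requires samtools):
#   local = count_records("local_p15/accepted_hits.bam")
#   aws = count_records("aws_p15/accepted_hits.bam")
#   print(f"record counts differ by {relative_difference(local, aws):.1%}")
```

If the record counts differ by roughly the same factor as the file sizes, the smaller run is most likely missing alignments rather than just being compressed differently.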