SEQanswers

12-14-2016, 08:24 AM   #1
drsnafu1831
Junior Member
 
Location: US

Join Date: Dec 2016
Posts: 1
TopHat (Tuxedo Suite) - AWS vs Local

Hello all,
Sorry, this one is a bit long.
I have come here after doing some testing.

I am trying to run the Tuxedo suite for RNA-Seq analysis on AWS (Amazon Web Services).

After alignment with TopHat, I see a significant difference in the size of "accepted_hits.bam" between runs on AWS and on a local system.

Details
Paired-end FASTQs
Size after adapter removal and quality thresholding: 7 GB (2 x 3.5 GB)
Human sample

AWS instance
15 cores of an m4.16xlarge (64 cores / 256 GB RAM)
Time: ~45 min
Output: accepted_hits.bam ~100 MB

Local server
15 cores of 36 (36 cores / 256 GB RAM)
Time: ~52 min
Output: accepted_hits.bam ~1.2 GB

Questions
Why is there such a significant difference? What could be the reasons?

One somewhat related thread I found:
https://www.biostars.org/p/102874/

As suggested in the above thread, I have tried running TopHat with the same parameters and input files:
on 1, 5, 10, 15, 20, 30 and 64 cores on the local server
on 30 and 64 cores on AWS

Interestingly, the size of "accepted_hits.bam" stayed the same (1.2 GB) up to 30 cores on the local server (specs above) and dropped to ~120 MB on 64 cores.
On AWS, as noted above, both 30 and (later) 64 cores give an accepted_hits.bam of ~100 MB.
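
A test like the one described above (same inputs, different core counts) could be scripted roughly as follows. This is only a sketch: it assumes `tophat` is on the PATH and uses TopHat's `-p` (threads) and `-o` (output directory) options. The function names, the `tophat_p<N>` directory naming and the `out_root` parameter are made up for illustration and are not from the original post.

```python
import os
import subprocess


def tophat_command(threads, out_dir, index, reads_1, reads_2):
    """Build a TopHat command line for one paired-end run."""
    return [
        "tophat",
        "-p", str(threads),  # -p/--num-threads
        "-o", out_dir,       # -o/--output-dir
        index, reads_1, reads_2,
    ]


def sweep(thread_counts, index, reads_1, reads_2, out_root=".", run=subprocess.run):
    """Run TopHat once per thread count; return {threads: BAM size in bytes}."""
    sizes = {}
    for n in thread_counts:
        # Hypothetical naming scheme: one output directory per thread count.
        out_dir = os.path.join(out_root, f"tophat_p{n}")
        run(tophat_command(n, out_dir, index, reads_1, reads_2), check=True)
        sizes[n] = os.path.getsize(os.path.join(out_dir, "accepted_hits.bam"))
    return sizes
```

Calling `sweep([1, 5, 10, 15, 20, 30, 64], index, reads_1, reads_2)` with a real Bowtie index prefix and FASTQ paths would give one size per thread count. Keeping each run in its own directory also leaves the TopHat logs available for comparing a "large" run against a "small" one.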

any input, suggestions and comments are welcome
thank you for your time !!
12-14-2016, 08:36 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Sounds like a bug to me... maybe you should try a different version of TopHat. The number of cores should not affect the size of the output BAM by more than a tiny amount.
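
One thing that might be worth ruling out before switching versions is whether the small BAM is simply truncated. This is an assumption, not a diagnosis of this particular case. A complete BAM file ends with a fixed 28-byte empty BGZF block, so a minimal check using only the Python standard library could look like this (the function name is made up for illustration):

```python
import os

# The 28-byte empty BGZF block that marks the end of a complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "42430200"
    "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the last 28 bytes of the file.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

If `has_bgzf_eof("accepted_hits.bam")` is False for the small files, the run was probably cut short while writing. If it is True, the file is at least structurally complete, and comparing the number of alignments in the two BAMs would be the next step, since file size alone is a rough proxy.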

Alternatively, you could try BBMap; it's faster than TopHat and produces correct output for any number of cores.
Tags
rnaseq alignment tuxedo, tophat, tuxedo suite
