**pbluescript** · 05-25-2012, 04:14 AM

Originally posted by EBER

Hello, I am analyzing a couple of paired-end datasets (75 bp), each containing about 450 million reads.
TopHat 1.4.1 does well, as long as I use the "--no-novel-juncs" flag.
TopHat 2, however, fails during the "merge all bam files" step, right at the end.
I am using a 12-core server with 64 GB of RAM.

It has been suggested that I partition each dataset, run TopHat on each part, and then merge the resulting accepted_hits.bam files before running Cufflinks.

I have two questions:
1) Will running TopHat on a partitioned dataset compromise the quality of the alignment, and therefore of the transcript assembly done by Cufflinks?
2) What's the best tool to merge these accepted_hits.bam files? Will the Picard tools do this appropriately? Any considerations when doing this?

Many thanks.
EBER

I would recommend you try STAR. In my experience it works much better for large datasets like this, and it is MUCH faster than any TopHat version. It requires a good amount of RAM, but you have enough.

http://gingeraslab.cshl.edu/STAR/

If you want to stick with TopHat, you could merge them in a number of ways. Picard works, as would bamtools merge, or even converting them to SAM and using the Unix cat command.
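One caveat with the plain cat route: every SAM file carries its own header, so concatenating them as-is repeats the header lines in the middle of the output. Below is a minimal Python sketch of the same idea that keeps the header from the first file only. The function name `concat_sam` and the file names are made up for illustration, and it assumes every part was aligned against the same reference, so the headers are interchangeable.

```python
def concat_sam(sam_paths, out_path):
    """Concatenate SAM files, keeping header lines from the first file only.

    Assumes all inputs were aligned against the same reference, so the
    headers of the later files can be dropped safely.
    """
    with open(out_path, "w") as out:
        for index, path in enumerate(sam_paths):
            with open(path) as handle:
                for line in handle:
                    # SAM header lines start with '@'; alignment lines never do.
                    if line.startswith("@") and index > 0:
                        continue
                    # Guard against a missing newline at the end of a file.
                    if not line.endswith("\n"):
                        line += "\n"
                    out.write(line)


if __name__ == "__main__":
    # Hypothetical file names, one SAM file per partition.
    concat_sam(["part1.sam", "part2.sam"], "merged.sam")
```

The result is not sorted across partitions, so it would still need to be coordinate-sorted (for example with samtools sort) before going into Cufflinks. Picard or bamtools do the merge directly on the BAM files, which is probably less hassle for files this size.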

**EBER** · 05-25-2012, 05:09 AM

Thanks for your answer!

TopHat doesn't handle 450 million reads
