SEQanswers thread: indexing tophat bam files (http://seqanswers.com/forums/showthread.php?t=48767)

stormin 12-05-2014 11:40 AM

indexing tophat bam files
 
Hi,

I am having trouble using samtools to index my TopHat output for viewing in IGV. The TopHat output BAM should already be sorted, but I am also having trouble sorting it with samtools.
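Whether a BAM is already coordinate-sorted can usually be read from its header. The sketch below is a minimal helper, assuming the file carries an `@HD` line with an `SO:` tag (TopHat normally writes one); the function name `bam_sort_order` is made up for illustration.

```shell
# Print the sort order recorded in a BAM header (e.g. "coordinate").
# Prints nothing if the header has no @HD line or no SO: tag.
# bam_sort_order is a hypothetical helper name, not a samtools command.
bam_sort_order() {
    samtools view -H "$1" | sed -n 's/^@HD.*SO:\([A-Za-z]*\).*/\1/p'
}

# Example use, only attempted when samtools and the file are present.
if command -v samtools >/dev/null 2>&1 && [ -f accepted_hits.bam ]; then
    bam_sort_order accepted_hits.bam
fi
```

A header that reports `coordinate` only says what the writer claimed, so a failed `samtools index` can still point to a problem elsewhere.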

This is how I call TopHat:
tophat2 -M --b2-very-sensitive --GTF ~/Documents/transcriptome_gtf/genes.gtf -p 7 --read-realign-edit-dist 0 --output-dir ./example ~/Documents/genome_UCSC/genome ~/Documents/Data/example.fastq

I then call samtools indexing using:
samtools index accepted_hits.bam

But I get this error:
[bam_index_build2] fail to create the index file.

Sorting with the command below gives me this error:
samtools sort ./accepted_hits.bam sort.prefix

[bam_sort_core] merging from 12 files...
open: No such file or directory
[bam_merge_core] fail to open file sort.prefix.0000.bam

At this point, I'm not sure what is going on. Please help!

Zach

GenoMax 12-05-2014 04:03 PM

Can you sort using this command

Code:

$ samtools sort ./accepted_hits.bam accepted_hits_sorted
and then try indexing the sorted file.
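The two steps can be chained as in the sketch below. It assumes the legacy samtools 0.1.x argument order (`sort <in.bam> <out.prefix>`), where the sorted file is written as `<out.prefix>.bam`; the wrapper name `sort_and_index` is made up for illustration.

```shell
# Sort a BAM by coordinate, then index the sorted copy.
# Assumes legacy samtools 0.1.x syntax: sort <in.bam> <out.prefix>.
# sort_and_index is a hypothetical wrapper name.
sort_and_index() {
    in_bam=$1
    prefix=$2
    samtools sort "$in_bam" "$prefix" || return 1   # writes "$prefix.bam"
    samtools index "$prefix.bam"                    # writes "$prefix.bam.bai"
}

# Example use, only attempted when samtools and the file are present.
if command -v samtools >/dev/null 2>&1 && [ -f accepted_hits.bam ]; then
    sort_and_index accepted_hits.bam accepted_hits_sorted
fi
```

Stopping when the sort fails avoids indexing a stale or missing output file.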

stormin 12-05-2014 04:14 PM

Quote:

Originally Posted by GenoMax (Post 155956)
Can you sort using this command

Code:

$ samtools sort ./accepted_hits.bam accepted_hits_sorted
and then try indexing the sorted file.

Nope, I get the same error message.

GenoMax 12-05-2014 04:38 PM

Which version of samtools are you using?

Is the sorting process creating temporary files (with names containing 0001.bam etc.) before you get that error?
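Both questions can be checked from the shell. The sketch below assumes that samtools prints a `Version:` line in its usage text when run with no arguments, and that sort chunks are named `<prefix>.NNNN.bam`, as in the error message above. The helper names are made up for illustration.

```shell
# Print the "Version:" line from the samtools usage text.
# With no arguments samtools writes its usage to stderr, hence 2>&1.
samtools_version() {
    samtools 2>&1 | grep 'Version'
}

# List temporary sort chunks for a given output prefix, if any exist.
# Prints nothing when no chunk files match.
list_sort_chunks() {
    ls "$1".[0-9][0-9][0-9][0-9].bam 2>/dev/null
}

# Example use, only attempted when samtools is on PATH.
if command -v samtools >/dev/null 2>&1; then
    samtools_version
    list_sort_chunks sort.prefix
fi
```

Running `list_sort_chunks` in a second terminal while the sort is in progress shows whether chunks are being written at all.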

stormin 12-05-2014 05:24 PM

Quote:

Originally Posted by GenoMax (Post 155958)
Which version of samtools are you using?

Version number is 0.1.19-4428cd

Quote:

Originally Posted by GenoMax (Post 155958)
Is the sorting process creating temporary files (with names containing 0001.bam etc.) before you get that error?

It looks like no temporary files are created. The command throws the error after less than a minute of running (I'm not sure how long a sort typically takes). It seems to stop after loading the file, since the same command reaches the error much faster when given the unmapped BAM file as its argument.

GenoMax 12-05-2014 05:41 PM

Is this the version bundled with TopHat code (which is the one tested to work)?

stormin 12-05-2014 06:52 PM

Quote:

Originally Posted by GenoMax (Post 155960)
Is this the version bundled with TopHat code (which is the one tested to work)?

I think I installed samtools before TopHat. Everything actually works with TopHat, and I am able to use the BAM files for HTSeq and then DESeq2.

blancha 12-05-2014 07:41 PM

Check how much free disk space you have.
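A quick way to check, as a minimal sketch: it uses the POSIX `df -P` output format, where the fourth column of the second line is the available space, and also tests that the working directory is writable, since both sorting and indexing write new files there. The helper names are made up for illustration.

```shell
# Print free space, in kilobytes, on the filesystem holding a directory.
# Defaults to the current directory. free_kb is a hypothetical helper name.
free_kb() {
    df -Pk "${1:-.}" | awk 'NR==2 {print $4}'
}

# Succeed if the directory is writable by the current user.
dir_writable() {
    [ -w "${1:-.}" ]
}

echo "free space (KB): $(free_kb .)"
if dir_writable .; then
    echo "current directory is writable"
else
    echo "current directory is NOT writable"
fi
```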

stormin 12-05-2014 09:27 PM

Quote:

Originally Posted by blancha (Post 155962)
Check how much free disk space you have.

That shouldn't be a problem; there is more than 700 GB left on the hard drive.

blancha 12-05-2014 09:46 PM

Devon Ryan seems to describe the bug here.
https://www.biostars.org/p/93368/

I would just install samtools 1.1 which has many interesting new features anyway.
It should fix the issue.

GenoMax 12-06-2014 03:07 AM

No harm in trying the latest samtools, but the TopHat page has this to say:

Quote:

Removed SAMtools as an external dependency in order to avoid incompatibility issues with recent and future changes of SAMtools and its code library (an older, stable SAMtools version is now packaged with TopHat)
I also see a v0.1.20 on the samtools download page, so if you want to stay with the old series, give that a try.

blancha 12-06-2014 03:55 AM

Right, you should also get the latest version of TopHat that comes bundled with the appropriate version of samtools required by TopHat.

You'll then have the best of both worlds: the latest version of TopHat running with a tried and tested version of samtools, and the latest version of samtools with all the new bells and whistles.

I'm basing all these assumptions on Devon Ryan's post, but his explanations are quite convincing and his description of the bug corresponds to yours.

My advice:
1- Install the very latest version of samtools with all the new bells and whistles, and without the bug.
2- Install the latest version of TopHat2 which comes bundled with a tried and tested version of samtools, that has been tested for compatibility with TopHat2. (This version will be used internally by TopHat.)

blancha 12-06-2014 04:05 AM

Incidentally, you will still need to sort the BAM file before indexing it, as GenoMax pointed out.
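One caveat when upgrading: the `sort` arguments changed in later releases. As far as I know, samtools 1.3 and newer dropped the `<in.bam> <out.prefix>` form in favour of `-o <out.bam>`, while 1.1 still accepts the old form. The sketch below assumes the newer syntax; the wrapper name `sort_and_index_new` is made up for illustration.

```shell
# Sort and index with the newer samtools syntax (assumed: 1.3 or later),
# where the output file is named explicitly with -o.
# sort_and_index_new is a hypothetical wrapper name.
sort_and_index_new() {
    in_bam=$1
    out_bam=$2
    samtools sort -o "$out_bam" "$in_bam" || return 1
    samtools index "$out_bam"                 # writes "$out_bam.bai"
}

# Example use, only attempted when samtools and the file are present.
if command -v samtools >/dev/null 2>&1 && [ -f accepted_hits.bam ]; then
    sort_and_index_new accepted_hits.bam accepted_hits_sorted.bam
fi
```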

stormin 12-08-2014 12:38 PM

Thanks for all the input. It looks like updating fixed this bug!

