SEQanswers

Old 08-06-2013, 02:35 AM   #1
bjoernoest
Member
 
Location: Potsdam

Join Date: Feb 2013
Posts: 11
Problem with Resuming TopHat pipeline with unmapped reads

Hi, I have a problem I can't seem to solve. When I run TopHat on my own machine I have no problems, but when I run either a prebuilt binary or a version compiled on the server on our cluster, I get the following error.
I have tried several versions of TopHat without luck.

So the big question: does anyone know which directory it is missing?

The run.log is not much help:

[2013-08-06 11:52:08] Building transcriptome data files..
[2013-08-06 11:52:21] Building Bowtie index from btindex.fa
[2013-08-06 12:01:30] Mapping left_kept_reads to transcriptome btindex with Bowtie2
[2013-08-06 12:05:17] Resuming TopHat pipeline with unmapped reads
Traceback (most recent call last):
  File "/package/bio/tophat/src/tophat", line 4072, in ?
    sys.exit(main())
  File "/package/bio/tophat/src/tophat", line 4038, in main
    user_supplied_deletions)
  File "/package/bio/tophat/src/tophat", line 3446, in spliced_alignment
    if not nonzeroFile(initial_reads[0]) and \
  File "/package/bio/tophat/src/tophat", line 1155, in nonzeroFile
    samtools_view = subprocess.Popen(samtools_view_cmd, stdout=subprocess.PIPE)
  File "/usr/lib64/python2.4/subprocess.py", line 550, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.4/subprocess.py", line 996, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory


I am running TopHat v2.0.9, Bowtie 2.1.0.0 and Samtools 0.1.18.0.
__________________
Bjørn Øst
Old 08-06-2013, 03:07 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,973

Since you refer to getting the error on the cluster have you checked to make sure that the filesystem your files reside on is available on the relevant cluster nodes?

You could ssh into the cluster node(s) your job is failing on and look to see if the filesystem is mounted there (or available through autofs depending on how the cluster admins have set things up). Sometimes filesystems that are mounted on the head node may not be accessible on the worker nodes.
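
One way to script that check, as a minimal sketch: run something like the following on the compute node itself (for example from inside the submitted job) and see which paths are visible there. The function name `check_paths` is made up for illustration, and the example path is simply the TopHat install directory from the traceback above; substitute your own input, index and output locations.

```python
import os


def check_paths(paths):
    """Return (path, exists, readable) for each path as seen from this node."""
    results = []
    for path in paths:
        exists = os.path.exists(path)
        readable = exists and os.access(path, os.R_OK)
        results.append((path, exists, readable))
    return results


if __name__ == "__main__":
    # Example paths only: the TopHat install directory from the traceback
    # and the current working directory. Replace with your own locations.
    examples = ["/package/bio/tophat/src", os.getcwd()]
    for path, exists, readable in check_paths(examples):
        print(f"{path}: exists={exists} readable={readable}")
```

If a path shows up on the log-in node but not on the worker node, the filesystem is probably not mounted there.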
Old 08-06-2013, 03:18 AM   #3
bjoernoest
Member
 
Location: Potsdam

Join Date: Feb 2013
Posts: 11

I can access all the directories with full rights when I ssh to my cluster, and I built everything while I was on the cluster. That is why I don't know why it doesn't work.
__________________
Bjørn Øst
Old 08-06-2013, 03:22 AM   #4
bjoernoest
Member
 
Location: Potsdam

Join Date: Feb 2013
Posts: 11

Is it possible to see in any of the log files which directory is causing the trouble?
__________________
Bjørn Øst
Old 08-06-2013, 03:31 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,973

Quote:
Originally Posted by bjoernoest
I can access all the directories with full rights when I ssh to my cluster, and I built everything while I was on the cluster. That is why I don't know why it doesn't work.

When you refer to "ssh to my cluster" I assume that is referencing the head node (or log-in node). Before we go any further, let us verify that this is a real compute cluster with a job scheduling system (e.g. LSF or SGE).

If that is true, then start your job again and see which exact node(s) it is running on. SSH into one of those nodes (this can be done from the log-in node) and see if the file system your files reside on is visible/available on that node.
Old 08-06-2013, 04:02 AM   #6
bjoernoest
Member
 
Location: Potsdam

Join Date: Feb 2013
Posts: 11

We use the Torque Portable Batch System (PBS) for job scheduling, so when I submit my job using qsub I get a xxx.mycluster.adress... so I ssh to this, and everything seems fine.
__________________
Bjørn Øst
Old 08-06-2013, 04:20 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,973

Have you looked to see if there is a "logs/run.log" file in the original output directory you had specified? That should have additional information available.
Old 08-06-2013, 04:36 AM   #8
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

It's complaining that samtools isn't in the available $PATH on whatever node this is being run on (you would get a samtools error instead if samtools itself couldn't find the files specified). There are a few possible reasons why this could occur, most obviously that the correct $PATH isn't being set (or isn't set correctly) or that the mount point isn't actually mounted on the affected node (which seems to happen frequently on some clusters). You might check the documentation for your cluster (or just bug the admin) to figure out which of the possibilities is correct. In the latter case, whoever administers the cluster will have to fix the issue.
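
To narrow that down, a short sketch along these lines can be run on the affected node. It uses Python's standard `shutil.which` to look up each tool on the current $PATH, and shows that launching a missing executable with `subprocess.Popen` fails with the same `[Errno 2]` seen in the traceback. The helper names and the tool list are assumptions for illustration. Note that `shutil.which` needs Python 3.3 or later, which is much newer than the Python 2.4 shown in the traceback.

```python
import errno
import shutil
import subprocess


def find_missing_tools(names):
    """Return the tool names that cannot be found on the current $PATH."""
    return [name for name in names if shutil.which(name) is None]


def popen_errno(cmd):
    """Try to start cmd; return None on success or the errno of the OSError."""
    try:
        proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except OSError as exc:
        return exc.errno
    proc.wait()
    return None


if __name__ == "__main__":
    # Tool names are illustrative; list whatever your TopHat run calls.
    missing = find_missing_tools(["samtools", "bowtie2", "tophat"])
    print("not on $PATH:", missing or "none")
    for name in missing:
        # A missing executable fails with ENOENT ("No such file or directory").
        is_enoent = popen_errno([name]) == errno.ENOENT
        print(name, "-> ENOENT" if is_enoent else "-> other result")
```

Running it both on the log-in node and inside a submitted job should show whether the job environment has a different $PATH.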
Old 08-06-2013, 04:40 AM   #9
bjoernoest
Member
 
Location: Potsdam

Join Date: Feb 2013
Posts: 11

Hmm, when I ran it again it crashed again, but now this is in the run.log:

/package/bio/tophat/src/bam2fastx --all --fastq test/tmp/left_kept_reads.bam|/package/bio/tophat/src/bowtie2-align -q -k 60 -D 15 -R 2 -N 0 -L 20 -i S,1,1.25 --gbar 4 --mp 6,2 --np 1 --rdg 5,3 --rfg 5,3 --score-min C,-14,0 -p 2 --sam-no-hd -x test/tmp/btindex -|/package/bio/tophat/src/fix_map_ordering --bowtie2-min-score 15 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --sam-header test/tmp/btindex.bwt.samheader.sam - - test/tmp/left_kept_reads.m2g_um.bam | /package/bio/tophat/src/map2gtf --sam-header test/tmp/btindex_genome.bwt.samheader.sam test/tmp/btindex.fa.tlst - test/tmp/left_kept_reads.m2g.bam > test/logs/m2g_left_kept_reads.out

All the files exist.
__________________
Bjørn Øst
Old 08-06-2013, 06:35 AM   #10
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,973

That is just the last command that tophat saved when the crash occurred. That is done for later use if you wanted to resume the run with the special tophat option -R (resume).

The following may cause you some grief from the cluster admins, but what happens if you try to run the above command outside PBS on the command line? (Be ready to kill the job in case it overwhelms the head node.)

Is there a comparable amount of RAM available on the cluster nodes compared to your personal machine?
Old 08-13-2013, 05:46 AM   #11
bjoernoest
Member
 
Location: Potsdam

Join Date: Feb 2013
Posts: 11

Sorry for the late answer, I have been sick.
That does not work either. Is there a way to see which command it tries to execute?
__________________
Bjørn Øst
Old 08-13-2013, 05:55 AM   #12
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,973

Can you be more specific as to what does not work? Trying to "resume" the TopHat job, or trying to run the command line saved in the "run.log" file outside PBS?

I am not familiar with PBS but there is a way to capture the standard out and error output into files (-o and -e options). Have you tried that?
Old 08-13-2013, 06:07 AM   #13
bjoernoest
Member
 
Location: Potsdam

Join Date: Feb 2013
Posts: 11

Sorry, I tried to run it outside PBS, and it still crashes at the resuming step.
Yes, but they do not provide much information; the error file just contains the error I posted above.
I have tested the commands from a run on my local machine, and it seems that it finishes the #>map_start step, but something is missing in #>map_segments:
/pgzip -cd< test/tmp/left_kept_reads.m2g_um_seg1.fq.z|/package/bio/tophat/src/bowtie2-align -q -k 41 -N 1 -L 20 -p 8 --sam-no-hd -x genome/btindex -|/package/bio/tophat/src/fix_map_ordering --bowtie2-min-score 10 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --index-outfile test/tmp/left_kept_reads.m2g_um_seg1.bam.index --sam-header test/tmp/btindex_genome.bwt.samheader.sam - test/tmp/left_kept_reads.m2g_um_seg1.bam test/tmp/left_kept_reads.m2g_um_seg1_unmapped.bam

tmp/left_kept_reads.m2g_um_seg1.fq.z: No such file or directory.

But I cannot seem to find where that file is generated. These are the only left_kept_reads files I have in my tmp:
left_kept_reads.m2g_um_seg1
left_kept_reads.m2g_um_seg1.bam
left_kept_reads.m2g_um_seg1.bam.index
left_kept_reads.m2g_um_seg1_unmapped.bam
left_kept_reads.m2g_um_seg1_unmapped.bam.index
left_kept_reads.m2g_um_seg1_unmapped.bam:q
left_kept_reads.m2g_um_seg1_unmapped.bam:q.index
.
__________________
Bjørn Øst
Old 08-13-2013, 06:35 AM   #14
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,973

Is the "/tmp" on the local file system on the cluster node? I wonder if that is filling up as the job progresses.

One way to check is to re-run the original TopHat job (with all parameters) and watch the /tmp usage on the node where it is running.

Another thing to check is if you are bumping up against any "limits" set by the sys admins on your account. (check by running "limit" or "ulimit").
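
Both checks can be sketched in a few lines, to be run on the node while the job is active. `shutil.disk_usage` reports free space for the filesystem holding a directory, and the Unix-only `resource` module reports the same soft and hard limits that "ulimit" shows. The function names and the choice of limits are illustrative assumptions; point `disk_report` at whichever temporary directory your job actually writes to.

```python
import resource
import shutil


def disk_report(path):
    """Return total, used and free bytes for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return {"total_bytes": usage.total, "used_bytes": usage.used,
            "free_bytes": usage.free}


def current_limits():
    """Return (soft, hard) limits; resource.RLIM_INFINITY means unlimited."""
    return {
        "max_file_size": resource.getrlimit(resource.RLIMIT_FSIZE),
        "max_open_files": resource.getrlimit(resource.RLIMIT_NOFILE),
        "max_address_space": resource.getrlimit(resource.RLIMIT_AS),
    }


if __name__ == "__main__":
    # "/tmp" is only an example; use the directory your run writes to.
    report = disk_report("/tmp")
    print("free GiB in /tmp:", round(report["free_bytes"] / 2**30, 2))
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

Calling `disk_report` repeatedly while the job runs would show whether free space drops toward zero before the crash.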

Tags
bowtie, rna-seq, tophat
