Yes, and it's not the only problematic name I have seen in use in the forums. I guess we should avoid such usernames and/or avoid using them in the text of post replies so that they do not pollute search results...
-
Originally posted by malachig:
Our cluster uses Sun Grid Engine (sge). Submitting jobs to the cluster is accomplished using a wrapper for the 'qsub' utility of sge. Basically the submission command is just pointing to a batch file containing bash commands (one job per line). I assume this is a somewhat common theme in cluster job submission. If this is the case for you, it shouldn't be too hard to modify the 'createAnalysisCommands' step. You would just need to modify all the lines containing 'mqsub' to match the submission style of your cluster and then when you run createAnalysisCommands use the option '--cluster_commands=1'.
Originally posted by obig:
I guess there are too many different cluster configurations for alexa-seq to anticipate. So, simple bash files are produced which can be run serially (for very small libraries) or submitted to your cluster according to its protocols. You will probably have to work with your cluster administrator to get things running optimally.
Our cluster here (lawrencium) uses the PBS Torque resource manager and the Moab job scheduler. With some work, I have been able to submit alexa-seq jobs to it, and I have processed four projects with over 100 libraries to date. So, it is doable.
Instead of trying to edit all the parts of the alexa-seq pipeline code that produce job batch files and submission commands, I created a simple Perl script that takes an alexa-seq job batch file (essentially just an sh file with one "task/command" per line) and produces submission files compatible with our scheduler. I strongly recommend this strategy; changing the alexa-seq code will be a lot more work.
What I do is run the alexa-seq pipeline as instructed for steps 0 to 5B. Step 5C (submitMapBatch.sh) is the first step that requires submitting to a cluster. That sh file contains a whole bunch of bash commands that run additional sh files (e.g., blast_vs_intergenics.sh). It is those files that should be submitted to a cluster, not the parent submitMapBatch.sh file. You can do them individually or cat them into combined files.
I create one combined batch file for all libraries, separated only by feature type (repeats, transcripts, etc.), because the feature types have different memory and runtime requirements. I can thus optimize cluster submission parameters for each of the 6 feature types. This is necessary because our cluster uses wallclock estimates and task number to determine job priority in the queue. Maybe your cluster has a simpler setup and this step will be unnecessary for you. Once I have combined the bash files, I run my submitjobs.pl script on the combined file and wait for it to finish.
In later steps, whenever alexa says to submit some jobs to a cluster, the bash file typically contains the tasks/commands themselves (rather than calls to additional sh files, as above). I just run my submitjobs.pl script on each of those bash files, check the .output and .error files for problems, and then proceed to the next step.
For each project, once the alexa-seq .commands file is produced, I make a new copy of this file and edit it to add my own commands that are necessary for job submission. This file can then be used as a template for running future projects.
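The batch-file wrapper described above could be sketched roughly as follows. This is not the poster's submitjobs.pl; it is a minimal Python illustration that assumes a Torque-style `qsub`, one task per line in the batch file, and hypothetical names (`read_tasks`, `write_pbs_scripts`, `submit`) and default resource values that you would need to adapt to your own cluster.

```python
import subprocess
from pathlib import Path

# Assumed Torque/PBS job template; the resource values are placeholders.
PBS_TEMPLATE = """#!/bin/bash
#PBS -N {name}
#PBS -l walltime={walltime}
#PBS -l mem={mem}
#PBS -o {name}.output
#PBS -e {name}.error
cd $PBS_O_WORKDIR
{command}
"""


def read_tasks(batch_file):
    """Return the non-empty, non-comment lines of a batch file (one task per line)."""
    tasks = []
    for line in Path(batch_file).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            tasks.append(line)
    return tasks


def write_pbs_scripts(batch_file, out_dir, prefix, walltime="01:00:00", mem="4gb"):
    """Write one PBS script per task and return the list of script paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    scripts = []
    for i, command in enumerate(read_tasks(batch_file), start=1):
        name = f"{prefix}_{i:04d}"
        script = out / f"{name}.pbs"
        script.write_text(
            PBS_TEMPLATE.format(name=name, walltime=walltime, mem=mem, command=command)
        )
        scripts.append(script)
    return scripts


def submit(scripts, dry_run=True):
    """Submit each script with qsub; with dry_run=True only print the commands."""
    for script in scripts:
        cmd = ["qsub", str(script)]
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `write_pbs_scripts` once per combined feature-type batch file, each time with its own `walltime` and `mem`, mirrors the per-feature-type tuning described above. Running `submit(scripts)` with the default `dry_run=True` only prints the `qsub` commands, so the generated scripts can be inspected first.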
-
I have the same issue with my data and analysis servers not being mounted on the head nodes. I've found rsync useful for this. I was given a decent-sized data folder accessible by the cluster. I do some serial steps, rsync to the cluster-mounted server, run the parallel jobs, and then rsync back. But a 15-minute maximum job limit will be a problem; some jobs probably take longer than that.
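The rsync round trip described above might look something like this sketch. The function names and the example paths are hypothetical, and the only rsync option used is `-a` (archive mode); add whatever options your site needs.

```python
import subprocess


def rsync_command(src, dest):
    """Build an rsync command that mirrors the contents of src into dest."""
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    return ["rsync", "-a", src.rstrip("/") + "/", dest.rstrip("/") + "/"]


def run(cmd, dry_run):
    """Run a command, or only print it when dry_run is True."""
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)


def round_trip(local_dir, cluster_dir, run_parallel_jobs, dry_run=True):
    """Push inputs to the cluster-mounted folder, run jobs, pull results back."""
    run(rsync_command(local_dir, cluster_dir), dry_run)   # push after serial steps
    run_parallel_jobs()                                    # cluster-side work
    run(rsync_command(cluster_dir, local_dir), dry_run)   # pull results back
```

For example, `round_trip("/data/project", "user@cluster:/scratch/project", submit_jobs)` would print the two rsync commands around a call to a `submit_jobs` function of your own; both paths here are placeholders.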
-
Oh, sorry, I see what you were saying now. This is indeed the exact situation I have, then. I found that rsyncing back and forth was sometimes pretty slow (many files). So I found myself submitting even many of the individual serial jobs to the cluster and then just rsyncing back at the end. If I could go back and do it over, though, I probably would have pushed harder to get a decent box installed and mounted on the cluster for serial processing steps. It's just much easier that way, and such boxes are not that expensive these days.
-
database issue
Hi! We are currently experimenting with the alexa-seq VM image and trying to run a demo analysis. The first attempt failed with this persistent error message: 'DBD::mysql::st execute failed: Table 'ALEXA_hs_53_36o.Gene' doesn't exist at /home/alexa-seq/ALEXA/alexa_seq/utilities/ALEXA_DB.pm line 196.'
We did all the steps exactly as described in the DEMO.txt file, except for the '#4.) Import annotation database' step. We downloaded hs_53_36o.tar.gz manually and moved it to /home/alexa-seq/ALEXA/sequence_databases/
After that, the step 4 command was executed: '/home/alexa-seq/ALEXA/alexa_seq/alternativeExpressionDatabase/installAnnotationDb.pl --annotation_dir=/home/alexa-seq/ALEXA/sequence_databases/ --db_build=hs_53_36o --server=localhost --user=alexa-seq --password=alexa-seq'
and so on.
Do you have any idea what could have gone wrong and caused the apparent absence of the ALEXA_hs_53_36o.Gene table?
Thanks!
-
parseRepeats.sh terminates on the first blast.gz file. Could you please help?
Begin parsing 137 blast results files
Parsing ..../final/blast_results/A/694_Lane1/repeats/blast_0000.gz for blast results
Multiple Paired Reads - Unambiguous. Subject ID: AluSx|SINE1/7SL|Primates READ1: 694_1_1101_4534_2459_R1 READ2: 694_1_1101_4534_2459_R2$VAR1 = {};
$VAR1 = undef;