#1 |
Member
Location: USA Join Date: Jul 2014
Posts: 26
I am running STAR to align wheat RNA-Seq data against the Ensembl reference. The reference file is 4 GB, and the genome directory created in the first step is 42 GB. The mapping step took more than 50 hours, and some jobs are still running after more than 75 hours.
I used 5 nodes with 100 GB each on our university cluster. Here is the script I used:
Code:
    #!/bin/sh
    #SBATCH --job-name=STAR
    #SBATCH --nodes=5
    #SBATCH --ntasks-per-node=1
    #SBATCH --time=120:00:00
    #SBATCH --mem=100g
    #SBATCH --error=<Error File Name>
    #SBATCH --output=<Output File Name>
    cd /Dir_PATH/STAR
    ./STAR_2.4.0b/STAR --genomeDir /Dir_PATH/STAR/index \
        --readFilesIn /File_PATH/L001_R1_001.fastq,/File_PATH/L002_R1_001.fastq File_PATH/L001_R2_001.fastq,/File_PATH/_L002_R2_001.fastq \
        --outFileNamePrefix /Dir_PATH/<Prefix_Name>/ \
        --runThreadN 10
I ran BWA-MEM on the same data and it took less than 10 hours to complete the mapping. Am I doing something wrong, or do I need to choose some other parameters?
#2 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
That seems quite odd. You're giving each of the nodes different files, yes?
#3 |
Member
Location: USA Join Date: Jul 2014
Posts: 26
I think it uses a total of 500 GB (100 GB x 5 nodes) for this job. It does not distribute different files to different nodes.
#4 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
You're just running the same thing on all of the nodes. If you're doing the same with bwa mem then that's happening there as well.
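One way to give each job its own input, rather than repeating the same command across an allocation, is a Slurm job array with one task per sample. The sketch below is only an illustration: the sample prefixes (`sampleA` and so on) and the `_R1.fastq`/`_R2.fastq` naming are assumptions that do not come from the thread, and `nth_sample` and `run` are hypothetical helpers. `run` only prints the STAR command unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
#SBATCH --job-name=STAR
#SBATCH --array=0-2            # one array task per sample (3 hypothetical samples)
#SBATCH --nodes=1              # each task runs on a single node
#SBATCH --cpus-per-task=10
#SBATCH --mem=100g
#SBATCH --time=120:00:00

# Hypothetical sample prefixes; replace with your own.
SAMPLES="sampleA sampleB sampleC"

# Print the word at 0-based position $1 from the remaining arguments.
nth_sample() {
    want=$1
    shift
    pos=0
    for name in "$@"; do
        if [ "$pos" -eq "$want" ]; then
            echo "$name"
            return 0
        fi
        pos=$((pos + 1))
    done
    return 1                   # index out of range
}

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" -eq 1 ]; then echo "$@"; else "$@"; fi
}

# Slurm sets SLURM_ARRAY_TASK_ID for each array task; default to 0 elsewhere.
task=${SLURM_ARRAY_TASK_ID:-0}
sample=$(nth_sample "$task" $SAMPLES)   # unquoted on purpose: split into words

# Each task reads its own FASTQ files and writes under its own prefix.
run STAR --genomeDir /Dir_PATH/STAR/index \
    --readFilesIn "/File_PATH/${sample}_R1.fastq" "/File_PATH/${sample}_R2.fastq" \
    --outFileNamePrefix "/Dir_PATH/${sample}/" \
    --runThreadN "${SLURM_CPUS_PER_TASK:-10}"
```

Because every task uses a different `--outFileNamePrefix`, the tasks do not write to the same output files.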
#5 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
I'll add that the I/O overhead of loading the index, together with the output constantly overwriting itself, could cause a slowdown (I limit STAR to 4 concurrent instances on our cluster when outputting to SAM, since otherwise I can't guarantee that the drives can keep up if any other jobs are running).
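A minimal sketch of capping the number of simultaneous STAR runs, assuming several samples are launched from one shell script. The sample names, the `_R1.fastq`/`_R2.fastq` naming, and the helpers `run`, `align_one` and `run_in_batches` are all hypothetical; the batching is deliberately simple (a whole batch finishes before the next starts), and `run` only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" -eq 1 ]; then echo "$@"; else "$@"; fi
}

# Hypothetical per-sample alignment; each sample gets its own output prefix.
align_one() {
    run STAR --genomeDir /Dir_PATH/STAR/index \
        --readFilesIn "/File_PATH/${1}_R1.fastq" "/File_PATH/${1}_R2.fastq" \
        --outFileNamePrefix "/Dir_PATH/${1}/" \
        --runThreadN 10
}

# Start one alignment per sample name, never more than $1 at a time.
run_in_batches() {
    max=$1
    shift
    active=0
    for sample in "$@"; do
        align_one "$sample" &
        active=$((active + 1))
        if [ "$active" -ge "$max" ]; then
            wait               # let the whole batch finish first
            active=0
        fi
    done
    wait                       # collect the final, possibly partial, batch
}

run_in_batches 4 sampleA sampleB sampleC sampleD sampleE sampleF
```

Whether several instances fit on one machine depends on its memory, since each instance normally loads its own copy of the index. On Slurm, the `%N` suffix of a job array (for example `--array=0-15%4`) is another way to get a similar cap across the cluster.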
#6 |
Member
Location: USA Join Date: Jul 2014
Posts: 26
So I should try with this?
Code:
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --time=120:00:00
    #SBATCH --mem=100g
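One thing worth checking with a single-node request is that the reserved cores match `--runThreadN`. Slurm allocates one CPU per task unless `--cpus-per-task` is given, so on clusters that confine jobs to their allocated cores, 10 STAR threads could end up sharing a single core. The sketch below assumes such a cluster; `star_threads` and `run` are hypothetical helpers, the paths are placeholders modelled on the original post, and `run` only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
#SBATCH --job-name=STAR
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10     # cores reserved for the single STAR process
#SBATCH --mem=100g
#SBATCH --time=120:00:00

# Threads to hand to STAR: the cores Slurm granted, or 1 outside of Slurm.
star_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" -eq 1 ]; then echo "$@"; else "$@"; fi
}

# Mates are given as two space-separated lists; lanes are comma-separated.
run /Dir_PATH/STAR/STAR_2.4.0b/STAR --genomeDir /Dir_PATH/STAR/index \
    --readFilesIn /File_PATH/L001_R1_001.fastq,/File_PATH/L002_R1_001.fastq \
        /File_PATH/L001_R2_001.fastq,/File_PATH/L002_R2_001.fastq \
    --outFileNamePrefix /Dir_PATH/out/ \
    --runThreadN "$(star_threads)"
```

Taking the thread count from `SLURM_CPUS_PER_TASK` keeps the two numbers in sync if the request is changed later.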
#7 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
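Sure, though you don't need to specify --ntasks-per-node when you just use one node. For reference, here is the start of mine:
Code:
    #!/bin/bash
    #SBATCH -J STAR-align
    #SBATCH -t 4:00:00
    #SBATCH -N 4
    #SBATCH -A ryand
    #SBATCH --exclusive
    #SBATCH --partition=work

    # Set after the #SBATCH block: sbatch ignores directives that follow the first command.
    nNodes=4    # keep in sync with -N above
    BIN=$WORK/bin

    # Launch one worker per node; each gets its 0-based index and the node count.
    for i in $(seq "$nNodes")
    do
        j=$((i - 1))
        srun -N 1 --relative "$j" "$BIN/slurm_STAR.sh" "$j" "$nNodes" &
    done
    wait

    # Remove leftover STAR files from the working directory.
    rm Aligned.out.sam Log.out Log.progress.out
    rm -rf _STARtmp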
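The `slurm_STAR.sh` worker is not shown in the thread. The following is a hypothetical stand-in, not the poster's script: it assumes the worker receives its 0-based index and the number of workers as its two arguments, and that each worker should take every Nth sample from a shared list. The sample names, the file naming, and the helpers `samples_for_worker` and `run` are all illustrative; `run` only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Hypothetical worker, called as: worker.sh <index, 0-based> <number of workers>

# Hypothetical sample prefixes shared by all workers.
SAMPLES="sampleA sampleB sampleC sampleD sampleE sampleF"

# Print every name whose 0-based position modulo $2 equals $1.
samples_for_worker() {
    mine=$1
    count=$2
    shift 2
    pos=0
    for name in "$@"; do
        if [ $((pos % count)) -eq "$mine" ]; then
            echo "$name"
        fi
        pos=$((pos + 1))
    done
}

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" -eq 1 ]; then echo "$@"; else "$@"; fi
}

# With no arguments, act as the only worker and take every sample.
for sample in $(samples_for_worker "${1:-0}" "${2:-1}" $SAMPLES); do
    run STAR --genomeDir /Dir_PATH/STAR/index \
        --readFilesIn "/File_PATH/${sample}_R1.fastq" "/File_PATH/${sample}_R2.fastq" \
        --outFileNamePrefix "/Dir_PATH/${sample}/" \
        --runThreadN 10
done
```

With 4 workers and 6 samples, worker 1 would take the second and sixth samples, and no two workers would share an output prefix.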
#8 |
Member
Location: USA Join Date: Jul 2014
Posts: 26
Thanks! I will try that.
Tags: bwa-mem, star