SEQanswers

Old 09-22-2014, 10:50 AM   #1
sbdk82
Member
 
Location: USA

Join Date: Jul 2014
Posts: 26
STAR Aligner

I am running STAR to align wheat RNA-Seq data against an Ensembl reference. The reference file is 4 GB, and the genome directory created in the first step is 42 GB. The mapping step has taken more than 50 hours, and some jobs have been running for more than 75 hours.

I used 5 nodes with 100 GB each on our university cluster. Here is the script I used:
Code:
#!/bin/sh
#SBATCH --job-name=STAR
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=1
#SBATCH --time=120:00:00
#SBATCH --mem=100g
#SBATCH --error=<Error File Name>
#SBATCH --output=<Output File Name>

cd  /Dir_PATH/STAR

./STAR_2.4.0b/STAR --genomeDir /Dir_PATH/STAR/index --readFilesIn /File_PATH/L001_R1_001.fastq,/File_PATH/L002_R1_001.fastq /File_PATH/L001_R2_001.fastq,/File_PATH/_L002_R2_001.fastq --outFileNamePrefix /Dir_PATH/<Prefix_Name>/ --runThreadN 10

I ran BWA-MEM on the same data and it took less than 10 hours to complete the mapping. Am I doing something wrong, or do I need to choose some other parameters?
Old 09-22-2014, 11:21 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

That seems quite odd. You're giving each of the nodes different files, yes?
Old 09-22-2014, 12:10 PM   #3
sbdk82
Member
 
Location: USA

Join Date: Jul 2014
Posts: 26

I think it uses a total of 500 GB (100 GB x 5 nodes) for this job. It does not distribute different files to different nodes.
Old 09-22-2014, 12:20 PM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

You're just running the same thing on all of the nodes. If you're doing the same with bwa mem then that's happening there as well.
Old 09-22-2014, 12:22 PM   #5
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

I'll add that the I/O overhead of loading the index and constantly overwriting its own output could cause a slowdown (I limit STAR to 4 concurrent instances on our cluster when outputting to SAM, since otherwise I can't guarantee that the drives can keep up if any other jobs are running).
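
One way to get a similar cap of 4 concurrent instances is SLURM's job-array throttle (the `%4` suffix on `--array`). This is only a sketch and not necessarily how the limit above is enforced: the array size, the `samples.txt` list file, the `pick_sample` helper, and the `<sample>_R1_001.fastq` naming are all assumptions made for illustration. Each task writes to its own output directory so instances do not overwrite one another.

```shell
#!/bin/bash
#SBATCH --job-name=STAR-array
#SBATCH --array=0-15%4        # 16 tasks (placeholder count), at most 4 running at once
#SBATCH --nodes=1
#SBATCH --mem=100g
#SBATCH --time=120:00:00

# Hypothetical helper: print line (index + 1) of a list file, one sample name per line.
pick_sample() {
    sed -n "$(( $1 + 1 ))p" "$2"
}

# Only run the alignment when started as an array task by SLURM.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    sample=$(pick_sample "$SLURM_ARRAY_TASK_ID" samples.txt)   # samples.txt is assumed
    mkdir -p "/Dir_PATH/${sample}"                             # separate output per sample
    ./STAR_2.4.0b/STAR --genomeDir /Dir_PATH/STAR/index \
        --readFilesIn "/File_PATH/${sample}_R1_001.fastq" "/File_PATH/${sample}_R2_001.fastq" \
        --outFileNamePrefix "/Dir_PATH/${sample}/" \
        --runThreadN 10
fi
```

Submitted once with `sbatch`, the scheduler starts new array tasks only as earlier ones finish, so the drives never see more than 4 STAR instances from this job at a time.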
Old 09-22-2014, 12:23 PM   #6
sbdk82
Member
 
Location: USA

Join Date: Jul 2014
Posts: 26

So I should try with this?

Code:
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=120:00:00
#SBATCH --mem=100g
Old 09-22-2014, 12:27 PM   #7
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Sure, though you don't need to specify --ntasks-per-node when you just use one node. For reference, here is the start of mine:

Code:
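#!/bin/bash
#SBATCH -J STAR-align
#SBATCH -t 4:00:00
#SBATCH -N 4
#SBATCH -A ryand
#SBATCH --exclusive
#SBATCH --partition=work
# All #SBATCH directives must precede the first command; sbatch ignores any that follow it.
nNodes=4    # keep in sync with -N above
BIN=$WORK/bin
# Launch one slurm_STAR.sh instance per node, each on a different node of the allocation.
for i in $(seq "$nNodes")
do
    j=$((i-1))    # zero-based node offset
    srun -N 1 --relative "$j" "$BIN/slurm_STAR.sh" "$j" "$nNodes" &
done
wait    # block until every instance has finished
# Clean up STAR output and temporary files left in the working directory.
rm Aligned.out.sam Log.out Log.progress.out
rm -rf _STARtmp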
The slurm_STAR.sh shell script will align every Nth pair of fastq files (or single fastq file, as appropriate) in a preset directory. Every instance is run on an individual node. Note that I highly recommend using --exclusive if that's not otherwise the default on your cluster.
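
The "every Nth pair" split described above could be sketched roughly as follows. This is not the actual slurm_STAR.sh, which is not shown in the thread. The function names `every_nth` and `align_share`, the variables `FASTQ_DIR`, `GENOME_DIR`, `OUT_DIR` and `THREADS`, and the `*_R1_001.fastq` / `*_R2_001.fastq` naming are assumptions for illustration; only the STAR options already used earlier in the thread appear here.

```shell
#!/bin/bash
# Hypothetical per-node worker. Usage: slurm_STAR.sh <offset> <nNodes>
# Assumes FASTQ_DIR, GENOME_DIR and OUT_DIR are set in the environment.

# Print every n-th argument, starting at zero-based position <offset>.
every_nth() {
    local offset=$1 n=$2 k=0 f
    shift 2
    for f in "$@"; do
        if [ $((k % n)) -eq "$offset" ]; then
            printf '%s\n' "$f"
        fi
        k=$((k + 1))
    done
}

# Align this node's share of the read pairs, one STAR run per pair.
align_share() {
    local offset=$1 n=$2 r1 r2 sample
    every_nth "$offset" "$n" "$FASTQ_DIR"/*_R1_001.fastq | while read -r r1; do
        r2=$(printf '%s' "$r1" | sed 's/_R1_/_R2_/')    # mate file, assumed naming
        sample=$(basename "$r1" _R1_001.fastq)
        STAR --genomeDir "$GENOME_DIR" \
             --readFilesIn "$r1" "$r2" \
             --outFileNamePrefix "$OUT_DIR/$sample." \
             --runThreadN "${THREADS:-10}"
    done
}

# Run only when called with the two arguments the launcher passes.
if [ "$#" -eq 2 ]; then
    align_share "$1" "$2"
fi
```

With 4 nodes, the instance started with offset 1 would take the 2nd, 6th, 10th, ... files in glob order, so no two nodes align the same pair, and the per-sample prefix keeps their outputs from colliding.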
Old 09-22-2014, 12:29 PM   #8
sbdk82
Member
 
Location: USA

Join Date: Jul 2014
Posts: 26

Thanks! I will try that.

Tags
bwa-mem, star
