SEQanswers

Old 04-24-2019, 10:20 AM   #1
chayan
Member
 
Location: USA

Join Date: Nov 2012
Posts: 51
Multiple fastq alignment with bowtie2 on a server

Hi!
I'm trying to map multiple SRA-derived FASTQ files (>6500) with bowtie2 against my reference genome, using a SLURM script on a server. Mapping a single sequence file works fine, but when I run the bash loop I keep getting the following error:

"path/to/slurm_script: line 16: path/to/file1.fastq: Permission denied"

Here is my SLURM script:

#!/bin/bash
#SBATCH --job-name=ERR1135336.clean.reads.Assembly
#SBATCH -N 1 # Number of nodes, not cores
#SBATCH -t 2-00:00:00 # Walltime
#SBATCH --ntasks-per-node 40 # Number of cores
#SBATCH --output=out-%j.log # Output (console)
#SBATCH --partition=test # Queue

module use /gpfs/shared/modulefiles_local
module use /gpfs/shared/modulefiles_local/bio
module load bio/bowtie2/2.3.4

for i in $(path/to/*.fastq)
do
bowtie2 -x PC_805 --threads 40 -U ${i} -S path/to/${i%%.fastq}.sam
done


I am not sure whether this is really a permission issue or a bash scripting issue.
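For reference: the `$( ... )` wrapped around the glob asks bash to *execute* the expansion as a command, so the shell tries to run the first `.fastq` file as a program, which produces exactly this "Permission denied". A minimal sketch of a corrected loop (paths are hypothetical, and `echo` keeps it as a dry run):

```shell
# Iterate the glob directly; $( ... ) would try to execute the first .fastq file.
# 'echo' makes this a dry run -- drop it to actually align.
for i in path/to/*.fastq; do
    base=$(basename "$i" .fastq)    # file1.fastq -> file1 (also strips the directory)
    echo bowtie2 -x PC_805 --threads 40 -U "$i" -S "path/to/${base}.sam"
done
```

Using basename also avoids the doubled directory that `path/to/${i%%.fastq}.sam` would produce, since `$i` already contains `path/to/`.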

Output of ls -l for the directory from where I am running slurm job

drwxr-xr-x 2 chayan.roy domain users 4096 Apr 23 10:14 PC_805


Output of ls -l for the directory where I am storing my fastq is

drwxr-xr-x 22 chayan.roy domain users 4096 Apr 22 14:44 HMP_2017

Any help will be much appreciated.

Thanks
Old 04-24-2019, 10:32 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

You can't run a bash loop inside one SLURM job and expect the iterations to be parallelized. Instead, you should run a bash script on the command line that in turn submits multiple individual SLURM jobs.

"path/to/" — I assume this is a real path on your system that you are obfuscating here? If not, you need to put a real value there.
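The per-file submission described above could be sketched like this, using sbatch's --wrap option so no separate wrapper script is needed. The paths, index name, and job-name scheme are assumptions, and `echo` keeps it as a dry run:

```shell
# Print one sbatch command per FASTQ file; remove the 'echo' inside
# submit_one to actually submit. Each file becomes its own SLURM job.
submit_one() {
    fq=$1
    sample=$(basename "$fq" .fastq)    # SRR123.fastq -> SRR123
    echo sbatch --job-name="bt2_${sample}" --cpus-per-task=40 \
        --wrap="bowtie2 -x PC_805 --threads 40 -U ${fq} -S path/to/${sample}.sam"
}

for fq in path/to/*.fastq; do
    submit_one "$fq"
done
```

With >6500 files this submits >6500 jobs; the scheduler will run as many as the partition allows and leave the rest pending, as noted above.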
Old 04-24-2019, 10:58 AM   #3
chayan
Member
 
Location: USA

Join Date: Nov 2012
Posts: 51

Thanks for your prompt response.

If I understood correctly, I have to submit a >6500-job SLURM array? This particular server has 56 nodes, each with 40 threads, and every single job takes more than 3 hours. Is there any other way to make it faster?

p.s. I have shortened the long real path in my post.
Old 04-24-2019, 11:48 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

If you want true parallelization then yes, you would need to submit 6500 jobs to the queue. You are likely not the only user, so most of them will pend, but they will finish eventually.
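Alternatively, a single SLURM job array can cover all the files, with the %N throttle capping how many tasks run at once. A sketch of such a batch script — the "jobs" file (one FASTQ path per line), the index name, and the throttle value are assumptions:

```shell
#!/bin/bash
#SBATCH --job-name=bt2_array
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=40
#SBATCH --array=0-6499%100    # 6500 tasks, at most 100 running concurrently

# 'jobs' is assumed to hold one FASTQ path per line
names=($(cat jobs))
fq=${names[$SLURM_ARRAY_TASK_ID]}
bowtie2 -x PC_805 --threads 40 -U "$fq" -S "${fq%.fastq}.sam"
```

This keeps the queue to one array entry instead of thousands of separate job records, which some clusters prefer.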
Old 04-29-2019, 02:08 PM   #5
archana87
Junior Member
 
Location: Canada

Join Date: Jul 2018
Posts: 6

Hi,
Instead of giving the path in the for loop, you can first add the serial number as a prefix to all your fastq files and then try something like this:

for i in $(seq 1 6500); do
    bowtie2 -x PC_805 --threads 40 -U path/to/${i}_.fastq -S path/to/${i}_.fastq.sam
done

Hoping it will help.

Last edited by archana87; 04-29-2019 at 02:10 PM.
Old 05-01-2019, 07:12 AM   #6
chayan
Member
 
Location: USA

Join Date: Nov 2012
Posts: 51

Hi

I am running parallel jobs now, but all of them are getting the following error, and I am not sure whether it comes from my array script or from something else.

Slurm Array

Code:
#!/bin/bash

#SBATCH --job-name=Bowtie_Array    # Job name
#SBATCH --nodes=12                 # Number of nodes
#SBATCH --ntasks-per-node=40       # CPUs per node (MAX=40 for CPU nodes and 80 for GPU)
#SBATCH --output=bowtie-%A_%a.out  # Standard output (log file)
#SBATCH --partition=test           # Partition/Queue
#SBATCH --time=7-00:00:00          # Maximum walltime
#SBATCH --array=0-12               # Job array index

module use /cm/shared/modulefiles_local
module use /gpfs/shared/modulefiles_local/bio
module load bio/bowtie2/2.3.4

names=($(cat jobs))

echo ${names[${SLURM_ARRAY_TASK_ID}]}

bowtie2 --threads 40 -x /gpfs/scratch/chayan.roy/Pc_project/HGM_Genomes/Index/PC_1969.fasta -U ${names[${SLURM_ARRAY_TASK_ID}]} -S alignments/${names[${SLURM_ARRAY_TASK_ID}]}.sam

Error message

Quote:
SRR1789035.fastq
/gpfs/shared/apps_local/bowtie2/2.3.4.3/bin/bowtie2-align-s: error while loading shared libraries: libtbb.so.2: cannot open shared object file: No such file or directory
(ERR): Description of arguments failed!
Exiting now ...

Any help?
Old 05-01-2019, 07:30 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

Did you download the bowtie2 binaries or compile the program yourself? It looks like the Threading Building Blocks (TBB) library is missing on your cluster. See the section on "Building from source" in the manual.
Old 05-01-2019, 08:05 AM   #8
chayan
Member
 
Location: USA

Join Date: Nov 2012
Posts: 51

I don't have installation access, and while I have asked the admins, I know they will take a month to respond. In the meantime I am trying to bypass it using Anaconda. Do let me know if there is a better way to do it.

Thanks
Old 05-01-2019, 10:01 AM   #9
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

If you use the conda option, make sure to remove "module load bio/bowtie2/2.3.4" from your script.

Hopefully your home directory is available on all cluster nodes because conda will install programs in your home directory by default.
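The conda route might look like the following; the environment name and channel order are assumptions, and a conda/Miniconda install in your home directory is a prerequisite:

```shell
# Create an environment with bowtie2 from bioconda (conda-forge for dependencies),
# then activate it and confirm the binary resolves.
conda create -y -n bt2 -c conda-forge -c bioconda bowtie2
conda activate bt2
bowtie2 --version
```

Bioconda's bowtie2 package ships with its own dependencies, so the missing libtbb.so.2 from the cluster module should no longer be an issue.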