SEQanswers

02-06-2013, 08:21 AM   #1
babi2305
Member
Location: Barcelona
Join Date: Feb 2013
Posts: 14

Using STAR/bowtie on cluster

Hello everyone,

I am just starting a project on RNA-seq data analysis. My sequences are in .fastq format; they are trimmed and of good quality. Now I need to map them with a tool such as bowtie or STAR, but on a cluster. Can anyone guide me on how to start from scratch? Maybe someone could share the scripts they use on a cluster. I need a basic idea so I can figure everything out; I am clueless right now.

Please help!!

Thanks.

02-06-2013, 08:41 AM   #2
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,992

Do you know what kind of cluster you are going to be working with? What is the job-scheduling software that is being used on the cluster?

02-06-2013, 08:54 AM   #3
babi2305
Member
Location: Barcelona
Join Date: Feb 2013
Posts: 14

I have replied in the next message.

Last edited by babi2305; 02-06-2013 at 09:01 AM.

02-06-2013, 08:59 AM   #4
babi2305
Member
Location: Barcelona
Join Date: Feb 2013
Posts: 14

Quote:
Originally Posted by GenoMax View Post
Do you know what kind of cluster you are going to be working with? What is the job-scheduling software that is being used on the cluster?
Hello GenoMax,

I am working on Gencluster. Right now it has the following compute nodes:

Nodes 1-10: each node has 24 cores, 96 GB RAM and a 500 GB local hard drive.
Nodes 11-20: each node has 8 cores, 32 GB RAM and a local hard drive whose size varies from 500 GB to 2 TB.

All in all, there are 320 cores available.

These are the queue names:

QueueName   Cores   WaitTime
forever     8       unlimited
long        24      160h
normal      176     36h
short       112     6h

STAR is installed on the cluster.

I hope this answers your question. Is any other info required?

02-06-2013, 11:22 AM   #5
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,992

If you are not familiar with Unix/Linux, then this is not going to be as simple as us providing you with a set of command lines that you can run on your cluster. It would be extremely useful to spend some time learning basic Unix. An excellent guide is located here.

You have access to a cluster that has a fully adequate configuration to do RNAseq analysis. There is a nice guide for RNAseq analysis here.

Missing from the info you provided is which job-scheduling software this cluster uses. The exact job submission procedure depends on it (e.g. Sun/Oracle Grid Engine and Load Sharing Facility (LSF) have different submission procedures and syntax).

It may be best to find some local help for the actual job submission procedures since each cluster may have its own set of procedures.

You can find the commands to run STAR in the program manual here: http://code.google.com/p/rna-star/downloads/list
The manual for bowtie is here: http://bowtie-bio.sourceforge.net/manual.shtml

You are going to need genome indexes. For a common genome, prebuilt indexes may be available; if you are working with an organism that is not common, you will need to build your own.
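As a rough illustration of the index-building step, here is a minimal Python sketch that only assembles the commands. The file and directory names (`genome.fa`, `star_index`, `bowtie_index/genome`) are hypothetical placeholders, and the flags should be checked against the manual for the installed version of each tool.

```python
import shlex


def star_index_cmd(genome_fasta, index_dir, threads=8):
    """Build the argument list for generating a STAR genome index.

    Some STAR versions expect index_dir to exist already; check the manual.
    """
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", index_dir,
        "--genomeFastaFiles", genome_fasta,
        "--runThreadN", str(threads),
    ]


def bowtie_index_cmd(genome_fasta, index_prefix):
    """Build the argument list for bowtie-build (reference FASTA, then index prefix)."""
    return ["bowtie-build", genome_fasta, index_prefix]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    print(shlex.join(star_index_cmd("genome.fa", "star_index")))
    print(shlex.join(bowtie_index_cmd("genome.fa", "bowtie_index/genome")))
```

The index only has to be built once per genome; every later mapping job can point at the same index directory or prefix.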

The general idea is to encapsulate your program (STAR or bowtie) commands in a way the job scheduling software will understand.
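To make the idea of wrapping a mapping command concrete, here is a hedged Python sketch that renders a small job script for either Grid Engine or LSF. Everything site-specific is an assumption: the parallel environment name `smp`, the job name `star_map`, and the file names are placeholders, and the queue name `short` is simply taken from the list in post #4. The scheduler on this cluster is not known, so treat this as a template rather than something to submit unchanged.

```python
import shlex

# Directive lines per scheduler. "smp" is a placeholder parallel-environment
# name; real Grid Engine installations define their own names.
HEADERS = {
    "sge": ["#$ -N {name}", "#$ -cwd", "#$ -q {queue}", "#$ -pe smp {threads}"],
    "lsf": ["#BSUB -J {name}", "#BSUB -q {queue}", "#BSUB -n {threads}",
            "#BSUB -o {name}.%J.out"],
}


def star_align_cmd(index_dir, fastq, out_prefix, threads):
    """Argument list for a single-end STAR mapping run."""
    return [
        "STAR",
        "--genomeDir", index_dir,
        "--readFilesIn", fastq,
        "--runThreadN", str(threads),
        "--outFileNamePrefix", out_prefix,
    ]


def job_script(scheduler, name, queue, threads, command):
    """Return the text of a bash job script that wraps one command."""
    header = [line.format(name=name, queue=queue, threads=threads)
              for line in HEADERS[scheduler]]
    return "\n".join(["#!/bin/bash", *header, "", shlex.join(command), ""])


if __name__ == "__main__":
    cmd = star_align_cmd("star_index", "reads.fastq", "sample1_", threads=8)
    with open("star_map.sh", "w") as handle:
        handle.write(job_script("sge", "star_map", "short", 8, cmd))
```

The resulting file would typically be submitted with `qsub star_map.sh` on Grid Engine or `bsub < star_map.sh` on LSF. Requesting 8 threads is a conservative choice because it fits on both node types described in post #4.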

A general guide to job submission for Sun/Oracle Grid Engine is here.

You can google for similar guides for LSF job submission if you find out that your cluster uses LSF.

Last edited by GenoMax; 02-06-2013 at 11:29 AM.

02-06-2013, 11:55 AM   #6
babi2305
Member
Location: Barcelona
Join Date: Feb 2013
Posts: 14

Quote:
Originally Posted by GenoMax View Post
In case you are not familiar with unix/Linux then this is not going to be as simple as us providing you with a set of command lines that you can run on your cluster. It would be extremely useful to spend some time learning basic unix. An excellent guide is located here..
Thank you so much. Yes, I know shell scripting, Perl, Linux and so on, and I have a basic bash script ready for submitting the job. What I do not know is how to map NGS reads on a cluster, i.e. the exact commands that I should put in my Perl or bash script.

Has anyone here done this before? If someone could paste the relevant part of their script here, it would be a huge help.

02-06-2013, 12:08 PM   #7
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,992

The commands for mapping are not going to be different if you run the program standalone or on a cluster.
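A small Python sketch of that point, using bowtie as the example: the same argument list can be run directly on a workstation or rendered as the command line of a job script. The index prefix and file names are hypothetical, and the options shown (`-S` for SAM output, `-p` for threads) should be checked against the bowtie manual for your version.

```python
import shlex
import subprocess


def bowtie_cmd(index_prefix, fastq, sam_out, threads=8):
    """Argument list for a single-end bowtie run that writes SAM output."""
    return ["bowtie", "-S", "-p", str(threads), index_prefix, fastq, sam_out]


def run_standalone(command):
    """Run the command directly, e.g. on a workstation or an interactive node."""
    return subprocess.run(command, check=True)


def as_job_line(command):
    """Render the identical command as one line for a scheduler job script."""
    return shlex.join(command)


if __name__ == "__main__":
    cmd = bowtie_cmd("bowtie_index/genome", "reads.fastq", "reads.sam")
    # Only print here; call run_standalone(cmd) where bowtie is installed.
    print(as_job_line(cmd))
```

Only the wrapper around the command changes between the two settings; the mapping options themselves stay the same.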

You can find the commands for bowtie alignments in the RNAseq analysis guide that I had linked in post #5 (http://en.wikibooks.org/wiki/Next_Ge..._%28NGS%29/RNA). Refer to the STAR manual for the commands for that program.

As with most programs, the default parameters may be adequate for your needs, but that is something you are going to have to decide (and experiment with) after running some tests.

02-06-2013, 12:11 PM   #8
babi2305
Member
Location: Barcelona
Join Date: Feb 2013
Posts: 14

Thank you. I am reading it and will get back to you.
All times are GMT -8.