  • Using STAR/bowtie on a cluster

    Hello everyone,

    I am just starting a project on RNAseq data analysis. My sequences are in .fastq format; they are trimmed and of good quality. Now I need to map them with a tool such as bowtie or STAR, but on a cluster. Can anyone guide me on how to start from scratch? Perhaps someone could share example scripts for a cluster. I need a basic idea so that I can figure everything out; I am clueless right now.

    Please help!!

    Thanks.

  • #2
    Do you know what kind of cluster you are going to be working with? What is the job-scheduling software that is being used on the cluster?



    • #3
      I have replied in the next message.
      Last edited by babi2305; 02-06-2013, 09:01 AM.



      • #4
        Originally posted by GenoMax
        Do you know what kind of cluster you are going to be working with? What is the job-scheduling software that is being used on the cluster?
        Hello Genomax,

        I am working on Gencluster. It currently has the following compute nodes:

        Nodes 1-10: each node has 24 cores, 96 GB RAM and a 500 GB local hard drive.

        Nodes 11-20: each node has 8 cores, 32 GB RAM and a local hard drive ranging from 500 GB to 2 TB.

        All in all, there are 320 cores available.

        These are the queue names:

        QueueName   Cores   WaitTime
        forever     8       unlimited
        long        24      160h
        normal      176     36h
        short       112     6h


        The STAR package is installed on the cluster.

        I hope this answers your question. Is any other information required?



        • #5
          If you are not familiar with Unix/Linux, this is not going to be as simple as us providing you with a set of command lines that you can run on your cluster. It would be extremely useful to spend some time learning basic Unix. An excellent guide is located here.

          You have access to a cluster that has a fully adequate configuration to do RNAseq analysis. There is a nice guide for RNAseq analysis here.

          What is missing from the information you provided is which job-scheduling software this cluster uses. The exact job submission procedure depends on it (e.g. Sun/Oracle Grid Engine and Load Sharing Facility (LSF) have different job submission procedures and syntax).
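One quick way to narrow down the scheduler is to check which job submission commands are available on the login node. The snippet below is only a rough sketch and not part of the original reply: the list of commands is an assumption (Slurm's `sbatch` is added for completeness), and `qsub` is shared by Grid Engine and PBS/Torque, so finding it does not settle the question. The cluster's own documentation or administrator remains the authoritative source.

```python
import shutil

# Submit commands for a few widely used schedulers (an illustrative, incomplete list).
# `qsub` exists in both Grid Engine and PBS/Torque, so it only narrows the choice.
SUBMIT_COMMANDS = {
    "qsub": "Sun/Oracle Grid Engine or PBS/Torque",
    "bsub": "LSF",
    "sbatch": "Slurm",
}


def guess_schedulers(which=shutil.which):
    """Return descriptions of schedulers whose submit command is found on PATH."""
    return [name for cmd, name in SUBMIT_COMMANDS.items() if which(cmd)]


if __name__ == "__main__":
    found = guess_schedulers()
    if found:
        print("Possible scheduler(s):", ", ".join(found))
    else:
        print("No known submit command found; ask the cluster administrator.")
```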

          It may be best to find some local help for the actual job submission procedures since each cluster may have its own set of procedures.

          You can find the commands to run STAR in the program manual here: http://code.google.com/p/rna-star/downloads/list. The manual for bowtie is here: http://bowtie-bio.sourceforge.net/manual.shtml. You are going to need existing genome indexes (if it is a common genome), or you will need to build your own if you are working with an organism that is not common.
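For the "build your own index" case, the commands look roughly like the ones assembled below. This is a minimal sketch rather than a recommended setup: the file and directory names (`genome.fa`, `star_index`, `bowtie_index/genome`) are placeholders, the thread count is arbitrary, and the manuals linked above should be checked for the options that match the installed versions. The code only builds and prints the command lines; it does not run them.

```python
import shlex


def star_index_cmd(genome_dir, fasta, threads=8):
    """Command line for STAR genome index generation (paths are placeholders)."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--runThreadN", str(threads),
    ]


def bowtie_index_cmd(fasta, index_prefix):
    """Command line for bowtie index generation: bowtie-build <reference> <index prefix>."""
    return ["bowtie-build", fasta, index_prefix]


def as_shell_line(cmd):
    """Join an argument list into a single shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    print(as_shell_line(star_index_cmd("star_index", "genome.fa")))
    print(as_shell_line(bowtie_index_cmd("genome.fa", "bowtie_index/genome")))
```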

          The general idea is to encapsulate your program (STAR or bowtie) commands in a way the job scheduling software will understand.
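To make "encapsulate" concrete, here is a rough sketch that wraps one command line in a job script for Grid Engine and for LSF. Several things are assumptions and will differ between clusters: the parallel environment name `smp`, the job name, the use of the `normal` queue mentioned earlier in the thread, and the placeholder STAR command. The directive lines (`#$ ...` and `#BSUB ...`) are read by the scheduler, while bash treats them as comments.

```python
def sge_script(command, job_name, queue, slots, pe="smp"):
    """Job script for Sun/Oracle Grid Engine. The PE name is site-specific (assumed 'smp')."""
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",      # job name
        f"#$ -q {queue}",         # queue to submit to
        f"#$ -pe {pe} {slots}",   # parallel environment and number of slots
        "#$ -cwd",                # run in the submission directory
        "#$ -j y",                # merge stderr into stdout
        command,
    ]
    return "\n".join(lines) + "\n"


def lsf_script(command, job_name, queue, slots):
    """Job script for LSF."""
    lines = [
        "#!/bin/bash",
        f"#BSUB -J {job_name}",   # job name
        f"#BSUB -q {queue}",      # queue to submit to
        f"#BSUB -n {slots}",      # number of job slots
        "#BSUB -o %J.out",        # output file named after the job ID
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder mapping command; see the STAR manual for the real options you need.
    cmd = "STAR --runThreadN 8 --genomeDir star_index --readFilesIn sample.fastq"
    print(sge_script(cmd, "star_map", "normal", 8))
    print(lsf_script(cmd, "star_map", "normal", 8))
```

Saved to a file such as `job.sh`, the Grid Engine version would typically be submitted with `qsub job.sh` and the LSF version with `bsub < job.sh`.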

          A general guide for job submissions for Sun/Oracle Grid Engine is here.

          You can google for similar guides for LSF job submissions if you find out that your cluster uses LSF.
          Last edited by GenoMax; 02-06-2013, 11:29 AM.



          • #6
            Originally posted by GenoMax
            In case you are not familiar with unix/Linux then this is not going to be as simple as us providing you with a set of command lines that you can run on your cluster. It would be extremely useful to spend some time learning basic unix. An excellent guide is located here..
            Thank you so much. Yes, I know shell scripting, Perl, Linux and so on, and I have a basic bash script ready for submitting the job. What I do not know is how to map NGS reads on the cluster, i.e. the exact commands that I should put in my Perl or bash script.

            Has anyone here done this before? If someone could paste the relevant part of a script here, that would be a huge help.



            • #7
              The commands for mapping are not going to be different if you run the program standalone or on a cluster.
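As a small illustration of that point, the sketch below builds one STAR alignment command and shows that the same command can either be run directly or pasted as a line into a job script. It is an assumption-laden example, not a tested pipeline: the paths (`star_index`, `sample.fastq`, `sample_`) and the thread count are placeholders, and `run_standalone` is a hypothetical helper that is defined but not called here because it needs STAR on the PATH.

```python
import shlex
import subprocess


def star_map_cmd(genome_dir, fastq, out_prefix, threads=8):
    """STAR alignment command for single-end reads; all paths are placeholders."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq,
        "--outFileNamePrefix", out_prefix,
    ]


def as_shell_line(cmd):
    """The same command as one shell-safe line, ready to paste into a job script."""
    return " ".join(shlex.quote(arg) for arg in cmd)


def run_standalone(cmd):
    """Run the command directly on the current machine (requires STAR to be installed)."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = star_map_cmd("star_index", "sample.fastq", "sample_")
    # Identical text whether typed at a prompt or placed in a scheduler job script.
    print(as_shell_line(cmd))
```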

              You can find the commands for bowtie alignments in the RNAseq analysis guide that I had linked in post #5 (http://en.wikibooks.org/wiki/Next_Ge..._%28NGS%29/RNA). Refer to the STAR manual for the commands for that program.

              As with most programs, the default parameters may be adequate for your needs, but that is something you will have to decide (and experiment with) after running some tests.
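For quick test runs it can help to map a small subset of the reads first, so each parameter experiment finishes in minutes. The function below is a minimal sketch of that idea and makes a few assumptions: the name `head_fastq` is made up for this example, the FASTQ file uses the usual four lines per record, and taking the first reads of a file is acceptable for a smoke test even though it is not a random sample.

```python
import gzip


def head_fastq(src, dst, n_reads):
    """Copy the first n_reads FASTQ records (4 lines each) from src to dst.

    Reads plain or gzip-compressed input (chosen by the .gz suffix) and
    returns the number of complete records written.
    """
    opener = gzip.open if src.endswith(".gz") else open
    copied_lines = 0
    with opener(src, "rt") as fin, open(dst, "w") as fout:
        for line in fin:
            if copied_lines == 4 * n_reads:
                break
            fout.write(line)
            copied_lines += 1
    return copied_lines // 4
```

For example, `head_fastq("sample.fastq", "sample_test.fastq", 100000)` would write the first 100,000 reads to a smaller file that can be mapped with different parameter settings before committing to the full run.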



              • #8
                Thank you. I am reading it and will get back to you.
