  • Genome generate STAR

    Hi
    I wanted to generate my own genome using STAR. It started yesterday around 4 PM (CET) and hasn't finished yet. Any ideas how long it should take?

    Command line was:
    Code:
    ./STAR_2.3.0e/STAR --runMode genomeGenerate --genomeDir ./star_hg19_ref --genomeFastaFiles ./ --sjdbGTFfile ./gene_splits.gtf --runThreadN 6
    std.out was:
    Code:
    Jul 10 15:55:16 ..... Started STAR run
    Jul 10 15:55:16 ... Starting to generate Genome files
    Maybe it broke?
    Any suggestions?

    Thanks,

    Phil

  • #2
    I don't recall how long my last run of that took, but it certainly didn't take overnight! You might check to see if it's actually using resources or if something else is already using all of the CPU.

    • #3
      Nope, nothing else is running, and it's still going. It isn't using 6 threads, though. I think I'll download the provided version now. The one I used was the normal UCSC version; I don't know why it doesn't work.

      • #4
        Have you made the genomeDir? Also maybe specify the fasta files. I remember I had a little difficulty until I spoon-fed it.

        • #5
          Hi Phil,

          you need to list all the .fasta files on the command line, not just the directory name:
          --genomeFastaFiles g1.fa g2.fa ...
          You can try to use wildcards:
          --genomeFastaFiles ./*.fa

          Also, to generate a genome with annotations you need to specify
          --sjdbOverhang <L>
          where <L> is ideally the read (mate) length minus 1; you can also set it generically to 100.

          The run should take a few hours. If the fasta files were processed without a problem, within the first few minutes it should get to the " ... starting to sort Suffix Array. This may take a long time ..." message.

          Cheers
          Alex
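
The advice above can be put together as a small dry-run sketch. It only prints the command rather than running STAR, so it can be checked first. The helper name `build_star_cmd`, the FASTA file names, and the read length of 100 are illustrative assumptions, not values from this thread; the STAR options are the ones already mentioned in the posts.

```shell
#!/usr/bin/env bash
# Sketch: assemble a STAR genomeGenerate command following the suggestions above.
# build_star_cmd is a hypothetical helper; paths and read length are placeholders.

build_star_cmd() {
    local genome_dir="$1"
    local gtf="$2"
    local read_len="$3"
    local threads="$4"
    shift 4
    # Remaining arguments are the FASTA files themselves, not a directory.
    # sjdbOverhang is ideally read (mate) length minus 1.
    local overhang=$(( read_len - 1 ))
    echo "STAR --runMode genomeGenerate --genomeDir $genome_dir --genomeFastaFiles $* --sjdbGTFfile $gtf --sjdbOverhang $overhang --runThreadN $threads"
}

# Dry run: print the command for 100-base reads and 6 threads.
build_star_cmd ./star_hg19_ref ./gene_splits.gtf 100 6 chr1.fa chr2.fa
```

Before running the printed command, it may help to create the output directory with `mkdir -p ./star_hg19_ref`, as suggested in post #4, and to pass the real FASTA files (for example via a `./*.fa` wildcard) in place of the placeholder names.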

          • #6
            Thanks a lot, will give it a try!

            • #7
              Issues with STAR genomeGenerate on hg19

              I am trying to generate genome files for hg19 (using hg19 annotations downloaded from UCSC). I have to use this genome build, but I am having trouble with the genomeGenerate command. I have used STAR without problems in the past with the ENSEMBL genome and gtf file, so I am not sure why I am having problems now. It has been running for over 24 hours and seems stalled. I cannot find prebuilt genome files for hg19 that include annotations. The log file is attached.

              The command I am using is:

              --runMode genomeGenerate --genomeDir ./STARoutput_1 --genomeFastaFiles genome.fa --sjdbGTFfile genome_ref_notID.gtf --sjdbOverhang 99 --runThreadN 20
