  • MosaikJump not working properly

    Hello all,

    I have been trying to create a mosaik jump database from a reference fasta file (hg18, NCBI build 36). I'm using the following command:

    Code:
    ./MosaikJump -ia hg18_combined.dat -out hg18_combined.jmp -hs 15
    It starts off running fine, but at about 5% completed the ETA starts climbing and the "number of hashes" also climbs rapidly. This continues until both values roll over to 0, and then the program freezes. Any help is much appreciated.

    -Rahul Dhodapkar

  • #2
    How much RAM do you have?

    Code:
    MosaikJump -ia human_g1k_v37_chr_ucsc.fasta.dat -out human_g1k_v37_chr_ucsc.fasta_15 -hs 15 -mem 6
    works fine for me, but I ran into memory problems again when I tried to align.
    http://kevin-gattaca.blogspot.com/

    • #3
      I have 10 GB of RAM. Is that insufficient for this task?

      -Rahul Dhodapkar
      Last edited by rahul.m.dhodapkar; 08-05-2010, 11:54 AM.

      • #4
        Hello all,

        I have been trying to build a MosaikBuild reference sequence archive from a reference FASTA file (hg18, NCBI build 36). I'm using the following command:
        ./Mosaikbuild -fr hg18.fa -oa hg18.dat
        It fails with this error:
        ERROR: Could not open FASTA file (/home/database/hg18/hg18.fa) when performing integrity check
        However, ./Mosaikbuild -fr chr1.fa -oa chr1.dat runs without problems. I have 32 GB of RAM, and I want to know whether that is too little.

        • #5
          32 Gigabytes of RAM should be plenty to convert hg18.fa to hg18.dat

          • #6
            mosaik-aligner/bin/MosaikBuild -fr /home/database/hg18/hg18.fa.gz -oa /home/share2/chenchong/hg18.dat
            ------------------------------------------------------------------------------
            MosaikBuild 1.0.1388 2010-02-01
            Michael Stromberg Marth Lab, Boston College Biology Department
            ------------------------------------------------------------------------------

            - converting /home/database/hg18/hg18.fa.gz to a reference sequence archive.

            - parsing reference sequences:
            ref seqs: 49 (0.1205 ref seqs/s)

            - writing reference sequences:
            100%[==================================] 1.71 ref seqs/s in 28 s

            - calculating MD5 checksums:
            100%[==================================] 3.90 ref seqs/s in 12 s

            - writing reference sequence index:
            100%[==================================] 49.0 ref seqs/s in 1 s
            ERROR: Unable to allocate enough memory (3087005219 bytes) to create the concatenated reference sequence.
            I have 32 GB of RAM, but the run ends with the error above: unable to allocate enough memory (3087005219 bytes) to create the concatenated reference sequence.

            • #7
              Wait, why do you have 49 ref seqs? Shouldn't there be only 25 (22 autosomes + X, Y, M)? That might be the source of the problem. What exactly is in your hg18.fa file?
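
One way to answer that question is to list the header lines of the FASTA file. The following is a minimal Python sketch, not part of Mosaik; the function name `fasta_headers` is hypothetical, and it assumes a plain or gzip-compressed FASTA file in which each record starts with a `>` line.

```python
import gzip
from typing import List


def fasta_headers(path: str) -> List[str]:
    """Return the sequence names found in a FASTA file.

    The name is taken as the text after '>' up to the first whitespace.
    Files ending in '.gz' are read through gzip (an assumption about naming).
    """
    opener = gzip.open if path.endswith(".gz") else open
    names = []
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                names.append(fields[0] if fields else "")
    return names
```

Calling `len(fasta_headers("hg18.fa"))` should then match the "ref seqs" count that MosaikBuild reports while parsing, and printing the list shows which extra entries are present.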

              • #8
                >chr10
                >chr10_random
                >chr11
                >chr11_random
                >chr12
                >chr13
                >chr13_random
                >chr14
                >chr15
                >chr15_random
                >chr16
                >chr16_random
                >chr17
                >chr17_random
                >chr18
                >chr18_random
                >chr19
                >chr19_random
                >chr1
                >chr1_random
                >chr20
                >chr21
                >chr21_random
                >chr22
                >chr22_h2_hap1
                >chr22_random
                >chr2
                >chr2_random
                >chr3
                >chr3_random
                >chr4
                >chr4_random
                >chr5
                >chr5_h2_hap1
                >chr5_random
                >chr6_cox_hap1
                >chr6
                >chr6_qbl_hap2
                >chr6_random
                >chr7
                >chr7_random
                >chr8
                >chr8_random
                >chr9
                >chr9_random
                >chrM
                >chrX
                >chrX_random
                >chrY

                • #9
                  These are the sequences in my hg18.fa.

                  • #10
                    I'm not sure how useful anything that aligns to _random will be, since the _random entries contain sequence that is known to be on those chromosomes but has not been fitted in yet. I would try removing those, running MosaikBuild again, and seeing where that puts you. It may well be that those additional sequences are eating up an enormous amount of memory. Let me know how that goes.

                    -Rahul Dhodapkar
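
Removing the _random entries can be done with a small filter over the FASTA file. This is a minimal Python sketch under stated assumptions: the function names `drop_sequences` and `filter_fasta` are hypothetical, the input is an uncompressed FASTA file, and a record is dropped when its name contains any of the given substrings.

```python
from typing import Iterable, Iterator, Sequence


def drop_sequences(lines: Iterable[str],
                   unwanted: Sequence[str] = ("_random",)) -> Iterator[str]:
    """Yield FASTA lines, skipping every record whose name contains
    one of the substrings in `unwanted`."""
    keep = True
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            # Decide once per record; sequence lines follow the header's fate.
            keep = not any(tag in name for tag in unwanted)
        if keep:
            yield line


def filter_fasta(src: str, dst: str,
                 unwanted: Sequence[str] = ("_random",)) -> None:
    """Copy `src` to `dst`, leaving out the unwanted records."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(drop_sequences(fin, unwanted))
```

Applied to the 49 names posted above, the default `("_random",)` removes 20 entries; passing `("_random", "_hap")` also removes the four haplotype entries and leaves the 25 primary sequences.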

                    • #11
                      [chenchong@node03 chenchong]$ mosaik-aligner/bin/MosaikBuild -fr /home/share2/chenchong/hg18/hg18.fa -oa /home/share2/chenchong/hg18.dat
                      ------------------------------------------------------------------------------
                      MosaikBuild 1.0.1388 2010-02-01
                      Michael Stromberg Marth Lab, Boston College Biology Department
                      ------------------------------------------------------------------------------

                      - converting /home/share2/chenchong/hg18/hg18.fa to a reference sequence archive.

                      - parsing reference sequences:
                      ref seqs: 25 (0.1938 ref seqs/s)

                      - writing reference sequences:
                      100%[==================================] 0.8736 ref seqs/s in 28 s

                      - calculating MD5 checksums:
                      100%[==================================] 1.99 ref seqs/s in 12 s

                      - writing reference sequence index:
                      100%[==================================] 25.0 ref seqs/s in 1 s
                      ERROR: Unable to allocate enough memory (3080448052 bytes) to create the concatenated reference sequence.
                      I have removed the _random sequences, but the result is much the same.

                      • #12
                        tinacai,

                        Do you have a portal you can use to monitor the memory usage of your node? How much memory is the process using when it fails? Is there a memory usage spike?

                        -Rahul Dhodapkar

                        • #13
                          Rahul,
                          I'm sure memory is not the problem: the node has 32 GB of RAM and I am the only one using it.

                          • #14
                            It's possible that there is some sort of internal memory ceiling that prevents your process from using all of the RAM that you physically have available, which is why I'm asking whether the memory usage is highest at the last step of the process.
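
Two common sources of such a ceiling on Linux are a per-process address-space limit and a 32-bit executable. The Python sketch below checks both; it is an illustration under assumptions, not a diagnosis of this case. The function names are hypothetical, the `resource` module is Unix-only, and the limits reported are those of the Python process itself (a program started from the same shell would normally inherit them).

```python
import resource
from typing import Tuple


def describe_limit(value: int) -> str:
    """Render a resource limit as text."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"


def address_space_limits() -> Tuple[str, str]:
    """Return the (soft, hard) address-space limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe_limit(soft), describe_limit(hard)


def elf_class(path: str) -> str:
    """Report whether an ELF executable is 32-bit or 64-bit.

    Byte 4 of the ELF header (EI_CLASS) is 1 for 32-bit and 2 for 64-bit.
    """
    with open(path, "rb") as handle:
        header = handle.read(5)
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return "not an ELF file"
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")
```

For example, `elf_class("mosaik-aligner/bin/MosaikBuild")` would show whether that particular build is a 32-bit binary, which could not use most of a 32 GB machine no matter how much RAM is free.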

                            • #15
                              Hi Rahul,
                              I ran the program again and monitored the memory: usage is highest at the last step of the process, where it reaches 9.5%.
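
To put the 9.5% figure next to the failed allocation, the numbers can be converted to the same unit. This is a small arithmetic sketch in Python; it assumes that "32G" means 32 GiB and that the percentage refers to total physical RAM, neither of which the thread states. The function name `summarize` is hypothetical.

```python
GIB = 1024 ** 3  # bytes per GiB


def summarize(total_ram_gib: float, peak_percent: float,
              failed_request_bytes: int) -> dict:
    """Compare peak memory use with the size of the failed allocation."""
    return {
        # Peak usage implied by the reported percentage.
        "peak_gib": total_ram_gib * peak_percent / 100,
        # Size of the allocation named in the error message.
        "request_gib": failed_request_bytes / GIB,
        # Whether the request is larger than a signed 32-bit integer can hold.
        "request_exceeds_int32": failed_request_bytes > 2 ** 31 - 1,
    }


report = summarize(32, 9.5, 3080448052)
for key, value in report.items():
    print(key, value)
```

Under these assumptions the peak is about 3 GiB, the same order as the roughly 2.9 GiB request that failed, and far below the physical 32 GB. The request is also larger than 2^31 - 1 bytes, which would be consistent with (though it does not prove) a 32-bit limit somewhere in the process.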
