  • craczy
    Junior Member
    • Jan 2010
    • 8

    #16
    Originally posted by sklages View Post
    But format of index files has not changed from version 1 to 2?
    Unfortunately it has. The index contains extra information about the reference, and with isaac2 that information has changed. Specifically, in the isaac2 index we keep track, for each position in the reference genome, of whether there are similar sequences elsewhere in the reference.

    Comment

    • GenoMax
      Senior Member
      • Feb 2008
      • 7142

      #17
      I did not specify a value for seed-length, so the process is creating all possible combinations [--annotation-seed-lengths arg (=16 20 24 28 32 36 40 44 48 52 56 60 64 68 72 76 80)]. It looks like the end may be in sight today for the process I am running, since the files for 80 are being made now.

      @sven: Expect a multi-day turnaround.
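
The default list quoted above runs from 16 to 80 in steps of 4, which is 17 seed lengths in total. A minimal sketch that enumerates them; the helper name default_seed_lengths is made up for illustration, and whether passing a single value to --annotation-seed-lengths shortens the run proportionally is an assumption this thread does not confirm.

```shell
# Print the default annotation seed lengths from the quoted help text:
# 16, 20, ..., 80 (one per line).
default_seed_lengths() {
    len=16
    while [ "$len" -le 80 ]; do
        printf '%s\n' "$len"
        len=$((len + 4))
    done
}

# Show the values, then how many there are.
default_seed_lengths
default_seed_lengths | wc -l
```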

      Comment

      • sklages
        Senior Member
        • May 2008
        • 628

        #18
        I haven't either .. should use 32.
        But .. I am optimistic :-)

        Comment

        • GenoMax
          Senior Member
          • Feb 2008
          • 7142

          #19
          @Semyon/Come: Can one of you confirm if the following files represent the correct isaac2 index for hg19 genome? My isaac-sort-reference job appeared to have finished (no errors) but these are the only files I see in the top level directory (Temp directory is still there with files within)
          Code:
          1.1G 2uniqueness.16bpb.gz
           47G kmer-positions-32-0.dat
           50K sorted-reference.xml

          Comment

          • sklages
            Senior Member
            • May 2008
            • 628

            #20
            Originally posted by sklages View Post
            OK .. index creation is running for hg19 ... I'll report back tomorrow.
            Well, .. for now .. the server crashed overnight, just three hours ago ..
            We now have to investigate what event caused this crash. Maybe it is just "Murphy's Law" .. we'll see.

            Comment

            • sklages
              Senior Member
              • May 2008
              • 628

              #21
              Originally posted by sklages View Post
              Well, .. for now .. the server crashed overnight, just three hours ago ..
              We now have to investigate what event caused this crash. Maybe it is just "Murphy's Law" .. we'll see.
              Well, .. it was indeed Murphy's law :-)
              We had a failure on a network interface .. that sent at least one process into a frenzy and pushed the load beyond 1000...

              So I'll restart indexing today.

              Comment

              • craczy
                Junior Member
                • Jan 2010
                • 8

                #22
                Originally posted by GenoMax View Post
                @Semyon/Come: Can one of you confirm if the following files represent the correct isaac2 index for hg19 genome? My isaac-sort-reference job appeared to have finished (no errors) but these are the only files I see in the top level directory (Temp directory is still there with files within)
                Code:
                1.1G 2uniqueness.16bpb.gz
                 47G kmer-positions-32-0.dat
                 50K sorted-reference.xml
                This looks correct, but surprising. Did you specify something like "-w 1" on the command line by any chance?

                All the kmers are indexed in one single data file (kmer-positions-32-0.dat), which is not ideal because it prevents parallelisation when searching for mapping candidates.

                You can use "isaac-pack-reference" and then "isaac-unpack-reference -w 6" to split the index into smaller files without having to redo the reference sorting.
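
A quick way to see which layout an index directory has is to count the kmer-positions files in it. A minimal sketch, assuming the kmer-positions-*.dat naming seen in the listing above also applies to a split index (this thread does not show one); count_kmer_files is a hypothetical helper name.

```shell
# Count the kmer-positions data files directly inside an index directory.
count_kmer_files() {
    dir=${1:-.}
    n=0
    for f in "$dir"/kmer-positions-*.dat; do
        # When nothing matches, the unexpanded pattern is not a regular
        # file, so it is skipped here.
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

n=$(count_kmer_files .)
if [ "$n" -le 1 ]; then
    echo "$n kmer-positions file(s): candidate search cannot be spread across files"
else
    echo "$n kmer-positions files"
fi
```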

                Comment

                • GenoMax
                  Senior Member
                  • Feb 2008
                  • 7142

                  #23
                  Originally posted by craczy View Post
                  This looks correct, but surprising. Did you specify something like "-w 1" on the command line by any chance?
                  Thanks for confirming that. I had only done this

                  Code:
                  $ isaac-sort-reference -g /path_to/HG19_UCSC/Sequence/WholeGenomeFasta/genome.fa -o .
                  Is there a better command-line for future reference?

                  Originally posted by craczy View Post
                  You can use "isaac-pack-reference" and then "isaac-unpack-reference -w 6" to split the index into smaller files without having to redo the reference sorting.
                  I did the isaac-pack-reference thinking that it would "compress" the index but nothing appeared to change except the date stamps.

                  Update: I think I need to move the "Temp" directory out of the way (just realized that and trying it now) for "pack-reference" to work.

                  Comment

                  • sklages
                    Senior Member
                    • May 2008
                    • 628

                    #24
                    Well, I can confirm that.

                    It took ~64h on a 48-core "Opteron 6176 SE" (fast local storage, RAID) to build an hg19 index.

                    Code:
                    isaac-sort-reference --genome-file fa_hg19/genome.fa --jobs 1 --output-directory iSAAC2Index.32 --quiet
                    The result is:
                    Code:
                    938M 2015.07.27 06:21:35 2uniqueness.16bpb.gz
                     42G 2015.07.27 06:54:45 kmer-positions-32-0.dat
                     15K 2015.07.27 06:54:51 sorted-reference.xml
                    8.0K 2015.07.27 06:54:51 Temp
                    with 'Temp' being 1.1TiB (!) in size ... (btw, why don't you clean Temp automatically after successfully finishing a job?).
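
Note that the 8.0K shown for Temp in the listing is only the size of the directory entry itself; ls does not add up what is inside. A small sketch for getting the real footprint with du, assuming the Temp subdirectory name from the listing above; report_temp_usage is a hypothetical helper name.

```shell
# Report the total disk usage of the Temp directory inside an index directory.
report_temp_usage() {
    index_dir=${1:-.}
    if [ -d "$index_dir/Temp" ]; then
        # -s: one total for the whole tree, -h: human-readable units
        du -sh "$index_dir/Temp"
    else
        echo "no Temp directory in $index_dir"
    fi
}

report_temp_usage .
```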

                    Comment

                    • GenoMax
                      Senior Member
                      • Feb 2008
                      • 7142

                      #25
                      @come:

                      I tried the "isaac-unpack-reference" (relevant part of the command line below)

                      Code:
                      $ isaac-unpack-reference -j 8 -w 6 -i .
                      Resulted in this error

                      Code:
                      tar: .: Cannot read: Is a directory
                      tar: At beginning of tape, quitting now
                      tar: Error is not recoverable: exiting now
                      make: *** [Temp/sorted-reference.xml] Error 2
                      @sven: Can you see if it works for you?

                      BTW: "Temp" directory is required for the unpack-reference.

                      Comment

                      • sklages
                        Senior Member
                        • May 2008
                        • 628

                        #26
                        Just tried,
                        Code:
                        isaac-unpack-reference -j 1 -w 6 -i . --dry-run
                        This (basically) results in this error:
                        Code:
                        warning: failed to load external entity "Temp/sorted-reference.xml"
                        unable to parse Temp/sorted-reference.xml
                        warning: failed to load external entity "Temp/sorted-reference.xml"
                        unable to parse Temp/sorted-reference.xml
                        Without dry-run:
                        Code:
                        isaac-unpack-reference -j 1 -w 6 -i .
                        tar fails:
                        Code:
                        tar -C Temp --touch -xvf .
                        tar: .: Cannot read: Is a directory
                        tar: At beginning of tape, quitting now
                        tar: Error is not recoverable: exiting now
                        make: *** [Temp/sorted-reference.xml] Error 2
                        Even when I copy sorted-reference.xml to Temp, I get an error:

                        Code:
                        make[1]: Entering directory `/path/to/iSAACindexBuildDir/iSAAC2Index.32'
                        make[1]: *** No rule to make target `Temp/genome.fa', needed by `/path/to/iSAACindexBuildDir/iSAAC2Index.32/genome.fa'.  Stop.
                        make[1]: Leaving directory `/path/to/iSAACindexBuildDir/iSAAC2Index.32'
                        make: *** [all] Error 2

                        Comment

                        • sklages
                          Senior Member
                          • May 2008
                          • 628

                          #27
                          Originally posted by GenoMax View Post
                          BTW: "Temp" directory is required for the unpack-reference.
                          That's funny though .. under normal circumstances I'd remove this folder as it occupies quite a lot of disk space ..

                          Comment

                          • GenoMax
                            Senior Member
                            • Feb 2008
                            • 7142

                            #28
                            @sven: A new thread has been created for posts related to isaac2 genome index creation.

                            Comment

                            • craczy
                              Junior Member
                              • Jan 2010
                              • 8

                              #29
                              The input file should be the 'sorted-reference.xml', not the current directory:

                              This should work:

                              Code:
                              isaac-unpack-reference -j 1 -w 6 -i sorted-reference.xml
                              Remember to remove the already existing Temp directory, if any

                              Come

                              Comment

                              • GenoMax
                                Senior Member
                                • Feb 2008
                                • 7142

                                #30
                                Originally posted by craczy View Post
                                The input file should be the 'sorted-reference.xml', not the current directory:

                                This should work:

                                Code:
                                isaac-unpack-reference -j 1 -w 6 -i sorted-reference.xml
                                Remember to remove the already existing Temp directory, if any

                                Come
                                This is not working for me:

                                Code:
                                tar: This does not look like a tar archive
                                tar: Skipping to next header
                                tar: Read 4461 bytes from ./sorted-reference.xml
                                tar: Error exit delayed from previous errors
                                make: *** [Temp/sorted-reference.xml] Error 2
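
Both failures come from tar: the make output earlier in the thread shows the value of -i being passed straight to "tar -C Temp --touch -xvf", so the tool appears to expect a tar archive there, presumably the one written by isaac-pack-reference (an inference from the error messages, not something confirmed in this thread). A minimal sketch for checking a candidate input before running the unpack step; is_tar_archive is a hypothetical helper name.

```shell
# Return success if tar can list the file as an archive.
is_tar_archive() {
    # tar -t lists the contents without extracting; it exits non-zero
    # when the file cannot be read as an archive.
    tar -tf "$1" >/dev/null 2>&1
}

candidate=${1:-sorted-reference.xml}
if is_tar_archive "$candidate"; then
    echo "$candidate: readable as a tar archive"
else
    echo "$candidate: not a tar archive (or not readable)"
fi
```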

                                Comment
