  • lastz infinite loop problem

    Hi there,

    I am trying to run lastz to perform a whole-genome alignment between the turtle and chicken genomes. I have split the list of FASTA filenames for the target (turtle) into 24 files (one per CPU in my server), and I run 24 commands like this:

    for i in `cat ps1/x00.txt`; do for j in gg4/*.fa; do lastz ps1/$i $j B=0 C=0 E=30 H=0 K=3000 L=3000 M=50 O=400 T=1 Y=9400 > lav/`basename $i .fa`-`basename $j .fa`.lav; done; done;

    where x00.txt is the list with 1/24 of all fasta files for the turtle genome. Next is x01.txt, and so on to x23.txt.
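
    As a rough sketch of launching one job per list file and waiting for all of them: run_list below is a hypothetical stand-in that only prints the list name, and the ps1/xNN.txt naming follows the description above.

    ```shell
    #!/bin/bash
    # run_list is a hypothetical placeholder; a real version would hold the nested lastz loop.
    run_list() {
        echo "processing $1"
    }

    started=0
    for n in $(seq -w 0 23); do          # seq -w pads to equal width: 00, 01, ..., 23
        run_list "ps1/x$n.txt" &         # start each list as a background job
        started=$((started + 1))
    done
    wait                                  # returns once every background job has exited
    echo "launched $started jobs"
    ```

    Because of the final wait, the script itself ends only after all 24 jobs have exited, which makes it easier to see whether anything is still running afterwards.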

    My problem is that even after the 24 loops finish all possible alignments (I get the correct number of output files for all the possible combinations), the lastz processes keep going and start over, i.e., they don't stop when the last combination is reached. So I guess something is wrong with my command. I then tried it as in the "how-to" tutorial:

    for i in `cat ps1/x00.txt`; do echo 'for j in gg4/*.fa; do lastz ps1/'$i' $j B=0 C=0 E=30 H=0 K=3000 L=3000 M=50 O=400 T=1 Y=9400 > lav/`basename '$i' .fa`-`basename $j .fa`.lav; done'; done;

    But that just echoes the commands....

    Where am I wrong?

    Cheers,
    champi

  • #2
    Try this:

    Code:
    for i in `cat ps1/x00.txt`; do for j in gg4/*.fa; do lastz ps1/'$i' $j B=0 C=0 E=30 H=0 K=3000 L=3000 M=50 O=400 T=1 Y=9400 > lav/`basename $i .fa`-`basename $j .fa`.lav; done; done;
    You were missing the two tick marks around your ps1/$i.


    • #3
      Thanks for the reply. I am actually trying the following, using a bash script like this:

      Code:
      #!/bin/bash
      
      for i in `cat ps1/x00.txt`;
          do for j in gg4/*.fa;
                  do lastz ps1/$i $j B=0 C=0 E=30 H=0 K=3000 L=3000 M=50 O=400 T=1 Y=9400 > lav/`basename $i .fa`-`basename $j .fa`.lav;
                  done;
          done;
      Do you think it will work? Or will it be an infinite loop? I'll try your change if this continues after finishing all combinations, which will take ~20 days...

      Cheers


      • #4
        If it is going to take 20 days then perhaps you should stop now and test the loop first.

        Use the same change from my earlier post in your bash loop. It was tested with some dummy files and worked there. I suggest you do the same before launching the real lastz jobs.
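
        A dry run along those lines might look like the sketch below. It assumes a throwaway directory with empty dummy files (the names t1.fa, q1.fa and so on are made up) and prints each command with echo instead of calling lastz:

        ```shell
        #!/bin/bash
        # Build a tiny dummy layout in a temporary directory (all file names are made up).
        tmp=$(mktemp -d)
        mkdir -p "$tmp/ps1" "$tmp/gg4"
        touch "$tmp/ps1/t1.fa" "$tmp/ps1/t2.fa"
        touch "$tmp/gg4/q1.fa" "$tmp/gg4/q2.fa" "$tmp/gg4/q3.fa"
        printf 't1.fa\nt2.fa\n' > "$tmp/ps1/x00.txt"

        count=0
        for i in $(cat "$tmp/ps1/x00.txt"); do
            for j in "$tmp"/gg4/*.fa; do
                out="lav/$(basename "$i" .fa)-$(basename "$j" .fa).lav"
                # echo stands in for the real lastz call, so nothing is aligned here
                echo "would run: lastz ps1/$i gg4/$(basename "$j") ... > $out"
                count=$((count + 1))
            done
        done
        echo "total combinations: $count"   # 2 targets x 3 queries
        rm -rf "$tmp"
        ```

        If the total matches the expected number of target/query pairs and the script returns to the prompt, the loop structure itself terminates.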


        • #5
          Yep, you're quite right. I should try with some short fasta files first...

          Thanks!!


          • #6
            It doesn't work using '$i'... I get this 60 times, the number of combinations in my test files.

            FAILURE: fopen_or_die failed to open "ps1_test/$i" for "rb"

            Some bash problem?


            • #7
              Well, without '$i' it looks OK. It finished correctly with the test files... I'll cross my fingers then and try with the real data.


              • #8
                As long as it works...

                BTW, the reason $i is not working is that you are missing the single-quote characters around the $i (shown in the code in my #2 comment). I am not sure where that is on a Japanese keyboard, but it is on the key next to "return/enter" on a US English keyboard.


                • #9
                  Yep, I saw it. I have a US English keyboard. The thing is that when I put them around the $i I got that error... If I write ps1/$i instead of ps1/'$i', it works. I was also editing the test script with nano in the terminal, so a text editor altering the single quotes can't be the problem...

                  What's the difference, by the way, between using the single quotes and not using them?
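
                  For reference, a minimal sketch of how bash treats the three quoting styles. The file name is made up; the literal "$i" in the single-quoted case matches the path shown in the fopen_or_die error above:

                  ```shell
                  #!/bin/bash
                  # Hypothetical file name, used only to show how each quoting style expands.
                  i="scaffold1.fa"

                  unquoted=ps1/$i      # the variable is expanded
                  single=ps1/'$i'      # single quotes suppress expansion: a literal $i
                  double="ps1/$i"      # double quotes expand and also protect spaces

                  echo "unquoted: $unquoted"   # unquoted: ps1/scaffold1.fa
                  echo "single:   $single"     # single:   ps1/$i
                  echo "double:   $double"     # double:   ps1/scaffold1.fa
                  ```

                  Double quotes are usually the safest choice in a loop like this, since they still expand the variable but keep a name containing spaces as one argument.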

                  Thank you!


                  • #10
                    I tested this on a Mac, so perhaps my shell there has some differences. What OS are you using, in case someone happens by this thread later?


                    • #11
                      I am using a server with CentOS 6.3.


                      • #12
                        Tried it on Mac without the single quotes and it does work. Some difference between CentOS and OS X (10.9.x).


                        • #13
                          Thanks!


                          • #14
                            Back ticks in shell scripts are as old as the sticks, and almost as primitive. You can replace any backtick combination, `<statement>` with $(<statement>), with the added bonus that it's nestable.
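
                            A small sketch of that replacement, using made-up paths in place of the real loop variables:

                            ```shell
                            #!/bin/bash
                            # Hypothetical paths, used only to illustrate the syntax.
                            i="ps1/scaffold1.fa"
                            j="gg4/chr1.fa"

                            # Backtick form, as in the loop earlier in the thread
                            old=lav/`basename $i .fa`-`basename $j .fa`.lav

                            # Equivalent $(...) form, with the variables quoted
                            new="lav/$(basename "$i" .fa)-$(basename "$j" .fa).lav"

                            # $(...) nests without escaping: base name of the directory containing $i
                            parent=$(basename "$(dirname "$i")")

                            echo "$old"      # lav/scaffold1-chr1.lav
                            echo "$new"      # lav/scaffold1-chr1.lav
                            echo "$parent"   # ps1
                            ```

                            Nesting the same call with backticks would require escaping the inner pair with backslashes, which is where the older syntax gets hard to read.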

                            One thing that might trip you up doing things like this is not having a newline at the end of your text files. Some programs will produce odd behaviour (like infinitely looping over the file) if a text file doesn't end with a newline.
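
                            One way to check for this, sketched with two throwaway list files (good.txt and bad.txt are made-up names), along with a read loop that tolerates a missing final newline:

                            ```shell
                            #!/bin/bash
                            # Two throwaway list files: one ends with a newline, one does not.
                            tmp=$(mktemp -d)
                            printf 'a.fa\nb.fa\n' > "$tmp/good.txt"
                            printf 'a.fa\nb.fa'   > "$tmp/bad.txt"

                            # Prints "yes" if the last byte of the file is a newline, else "no".
                            ends_with_newline() {
                                # $(...) strips a trailing newline, so an empty result means the
                                # last byte was a newline (or the file is empty)
                                if [ -z "$(tail -c 1 "$1")" ]; then echo yes; else echo no; fi
                            }

                            good=$(ends_with_newline "$tmp/good.txt")
                            bad=$(ends_with_newline "$tmp/bad.txt")

                            # A plain "while read" loop skips an unterminated last line;
                            # the extra [ -n "$line" ] test keeps it.
                            n=0
                            while IFS= read -r line || [ -n "$line" ]; do
                                n=$((n + 1))
                            done < "$tmp/bad.txt"

                            echo "good.txt: $good, bad.txt: $bad, lines read from bad.txt: $n"
                            rm -rf "$tmp"
                            ```

                            A loop of the form for i in $(cat file) splits on whitespace, so it is not affected by a missing final newline; the check matters more for tools that read the file line by line.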


                            • #15
                              Thanks for the info! I have now changed all the backticks to $(command). That issue with newlines could have been a problem. Since I separated the multifasta files of the genomes using a third-party script, it might be that the last sequence lacks the newline at the end...

                              Thanks!
