  • Remove a part of a filename in a Bash loop

    I have many files named like this:

    lib01.GFBAG_UHAU.fastq.sam.bam
    lib02.ABABAB_ZU.fastq.sam.bam
    lib03.ZGAZG_IAUDH.fastq.sam.bam

    Many parts of the filenames are thus variable in length, although they are connected through the same type of punctuation (. or _).
    What I want is to remove the .fastq.sam.bam part from each filename when I loop through these files in Bash. How do I do this?

  • #2
    You want to split the string on a "." delimiter and then keep the first two parts. Or use ".fastq.sam.bam" as a delimiter, I suppose!

    To split a string in Bash on a single-character delimiter (or a set of single-character delimiters), set IFS (the Internal Field Separator) to the delimiter(s) and read the string into an array. To split on a multi-character delimiter, use parameter expansion.
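For what it's worth, here is a minimal sketch of both ideas, using one of the filenames from the question. The variable names (`name`, `parts`, `stem_from_split`, `stem_from_suffix`) are made up for illustration, and it assumes Bash rather than plain sh, since it relies on arrays and here-strings.

```shell
name="lib01.GFBAG_UHAU.fastq.sam.bam"

# Split on "." into an array; IFS is changed only for this one read command.
IFS='.' read -r -a parts <<< "$name"
stem_from_split="${parts[0]}.${parts[1]}"

# Or treat the whole suffix as the thing to cut, using parameter expansion.
stem_from_suffix="${name%.fastq.sam.bam}"

echo "$stem_from_split"
echo "$stem_from_suffix"
```

The split version only works if the part you want to keep always contains exactly one dot. The suffix version makes no such assumption, so it is probably the safer of the two here.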
    Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com



    • #3
      Refer to https://unix.stackexchange.com/quest...ck-of-variable

      An example for changing extension from fastq.sam.bam to txt.

      for file in *.fastq.sam.bam
      do
      mv ${file%.fastq.sam.bam} ${file%.fastq.sam.bam}.txt
      done



      • #4
        Originally posted by ungsik
        Refer to https://unix.stackexchange.com/quest...ck-of-variable

        An example for changing extension from fastq.sam.bam to txt.

        for file in *.fastq.sam.bam
        do
        mv ${file%.fastq.sam.bam} ${file%.fastq.sam.bam}.txt
        done
        Don't you mean:

        for file in *.fastq.sam.bam
        do
            mv "$file" "${file%.fastq.sam.bam}.txt"
        done
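If you want to try this without touching real data, something like the following sketch might help. It assumes `mktemp` is available, creates two dummy files in a throwaway directory (the second name, with a space in it, is invented to show why quoting the variables matters), and runs the loop in a subshell so your current directory is left alone.

```shell
# Throwaway directory so nothing real is renamed.
workdir=$(mktemp -d)
touch "$workdir/lib01.GFBAG_UHAU.fastq.sam.bam" "$workdir/lib02.ABABAB ZU.fastq.sam.bam"

(
    cd "$workdir" || exit 1
    for file in *.fastq.sam.bam; do
        # If nothing matched, the glob stays literal; skip it.
        [ -e "$file" ] || continue
        # "--" keeps mv from treating a name that starts with "-" as an option.
        mv -- "$file" "${file%.fastq.sam.bam}.txt"
    done
)

ls "$workdir"
```

Without the double quotes, the filename containing a space would be split into two arguments and `mv` would fail on it.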

        --
        Phillip



        • #5
          Originally posted by Marius
          I have many files named like this:

          lib01.GFBAG_UHAU.fastq.sam.bam
          lib02.ABABAB_ZU.fastq.sam.bam
          lib03.ZGAZG_IAUDH.fastq.sam.bam

          Many parts of the filenames are thus variable in length, although they are connected through the same type of punctuation (. or _).
          What I want is to remove the .fastq.sam.bam part from each filename when I loop through these files in Bash. How do I do this?
          Using BASH parameter expansion:

          Code:
          for i in *.fastq.sam.bam; do mv "$i" "${i%.fastq.sam.bam}"; done
          Which is pretty fun: "%" means "remove the shortest match of the pattern that follows from the very end of the value stored in variable $i." "#" does the analogous thing, but removes from the very front.

          "%%" instead does a "greedy" removal: it removes the longest match of the pattern that follows it. So:

          Code:
          i=lib01.GFBAG_UHAU.fastq.sam.bam.fastq.sam.bam.fastq.sam.bam
          echo ${i%.fastq.sam.bam*}
          will produce:
          Code:
          lib01.GFBAG_UHAU.fastq.sam.bam.fastq.sam.bam
          whereas:

          Code:
          i=lib01.GFBAG_UHAU.fastq.sam.bam.fastq.sam.bam.fastq.sam.bam
          echo ${i%%.fastq.sam.bam*}
          will produce:
          Code:
          lib01.GFBAG_UHAU
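To see all four operators side by side, here is a small sketch using one of the filenames from the question and the generic patterns `.*` and `*.`. The variable names are just for illustration.

```shell
path="lib01.GFBAG_UHAU.fastq.sam.bam"

shortest_suffix="${path%.*}"    # drop the shortest trailing ".something"
longest_suffix="${path%%.*}"    # drop everything from the first "." onward
shortest_prefix="${path#*.}"    # drop everything up to the first "."
longest_prefix="${path##*.}"    # drop everything up to the last "."

echo "$shortest_suffix"   # lib01.GFBAG_UHAU.fastq.sam
echo "$longest_suffix"    # lib01
echo "$shortest_prefix"   # GFBAG_UHAU.fastq.sam.bam
echo "$longest_prefix"    # bam
```

Note that the patterns are shell globs, not regular expressions, so `.` is a literal dot and `*` matches any run of characters.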
          If you can run Perl, then finding the "rename.pl" script might be less arcane than deploying your Bash powers. Note that the dots are escaped so they match literal dots rather than any character:

          rename.pl 's/\.fastq\.sam\.bam$//' *.fastq.sam.bam

          Find rename.pl here:


          --
          Phillip

            • #7
              A range of options exists for munging the pathnames.

              The approach I would use might well depend on what else I am going to do in the loop.

              FWIW:

              [basename](https://linux.die.net/man/1/basename) can be used to remove a suffix of a filename.

              [Shell Parameter Expansion](https://www.gnu.org/software/bash/ma...Expansion.html) can be used to strip or replace either suffixes or prefixes of pathnames stored in variables.

              [GNU parallel](https://www.gnu.org/software/parallel/) can in effect replace your Bash looping construct, and has simple syntax for referring to the basename of a file or directory, including `{=perl expression=}` to munge the pathname any way you like. It has MANY great features and is well worth exploring and keeping in your toolbelt.

              [rename](https://www.computerhope.com/unix/rename.htm) is very useful for batch renaming of files using regular expressions (if that is all you need to do).
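As a rough sketch of the first option above, `basename` accepts an optional suffix to strip as its second argument. The helper name `strip_suffix` and the `/data/run1` path are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: print the filename without directory or suffix.
strip_suffix() {
    basename "$1" .fastq.sam.bam
}

path="/data/run1/lib01.GFBAG_UHAU.fastq.sam.bam"

stem=$(strip_suffix "$path")          # directory and suffix both removed
with_dir="${path%.fastq.sam.bam}"     # parameter expansion keeps the directory

echo "$stem"
echo "$with_dir"
```

The difference matters inside a loop: `basename` also drops the leading directory, while parameter expansion with `%` leaves it in place, so pick whichever matches where the output should be written.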
