  • kmkocot
    Member
    • Jun 2009
    • 51

    Script to remove gap-only sites from fasta alignment?

    Hi all,

    I am trying to find a script or a program that I can call in a pipeline to remove gap-only sites from a multiple sequence alignment. Can you think of anything I can use? I'd also like to delete the columns at either end of each alignment where only 2 or fewer sequences have non-gap characters, if you can think of anything that can do that.

    Thanks!
    Kevin
  • maubp
    Peter (Biopython etc)
    • Jul 2009
    • 1544

    #2
    I can picture how I would solve this using Biopython, provided the whole alignment can be loaded into RAM.

    How big are the alignments (number of columns, number of rows)?
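
A minimal sketch of the in-memory approach, for illustration only. It uses plain Python rather than Biopython so that it has no dependencies, and it is not code posted in this thread. The function names (`read_fasta`, `trim_alignment`) and the `min_end_seqs` parameter are hypothetical. It assumes the whole alignment fits in RAM, that every sequence has the same length, and that `-` is the only gap character.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def trim_alignment(records, gap_chars="-", min_end_seqs=3):
    """Drop gap-only columns and sparsely populated columns at both ends.

    A column at either end is trimmed while fewer than `min_end_seqs`
    sequences have a non-gap character there (default 3, i.e. "2 or fewer"
    is trimmed). Interior columns are dropped only if they are all gaps.
    """
    if not records:
        return []
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("all sequences in an alignment must have equal length")

    # Number of non-gap characters in each column.
    counts = [
        sum(seq[i] not in gap_chars for _, seq in records)
        for i in range(length)
    ]

    # Walk inward from each end past the sparsely populated columns.
    start = 0
    while start < length and counts[start] < min_end_seqs:
        start += 1
    end = length
    while end > start and counts[end - 1] < min_end_seqs:
        end -= 1

    # Within the retained region, keep every column with at least one residue.
    keep = [i for i in range(start, end) if counts[i] > 0]
    return [(name, "".join(seq[i] for i in keep)) for name, seq in records]


def write_fasta(records):
    """Format (header, sequence) tuples back into FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

For a file on disk, something like `write_fasta(trim_alignment(read_fasta(open(path).read())))` would produce the trimmed alignment as text. Setting `min_end_seqs=1` removes only the gap-only columns and leaves the ends otherwise untouched.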

    • nilshomer
      Nils Homer
      • Nov 2008
      • 1283

      #3
      Originally posted by kmkocot
      Hi all,

      I am trying to find a script or a program that I can call in a pipeline that can remove gap-only sites in a multiple sequence alignment. Can you think of anything I can use? I'd also like to
      delete the columns at either end of each alignment where there are only 2 or fewer sequences with non-gap characters if you can think of anything that can do that.

      Thanks!
      Kevin

      If your alignments are in SAM format, you can remove all alignments that contain indels with the C program "dbamfilter -i". You could also parse the CIGAR field of each SAM record and check for any "I" or "D" operators.

      What format are you working with?
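
The CIGAR check mentioned above could look roughly like the following sketch. It is an illustration and not the behaviour of "dbamfilter". The function names (`has_indel`, `filter_sam_lines`) are hypothetical. It assumes tab-separated SAM text with the CIGAR string in the sixth column, and it keeps records whose CIGAR is `*` (unavailable) because no indel can be detected in them.

```python
import re

# One CIGAR element: a length followed by a single operator character.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar):
    """Return True if the CIGAR string contains an insertion or deletion."""
    return any(op in "ID" for _, op in CIGAR_ELEMENT.findall(cigar))


def filter_sam_lines(lines):
    """Yield header lines and alignment records that contain no indels."""
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            # Not a valid alignment record; skip it.
            continue
        if not has_indel(fields[5]):
            yield line
```

Used on a file, `filter_sam_lines(open("in.sam"))` would yield the lines to write to the filtered output; `in.sam` is a placeholder name.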

      • kmkocot
        Member
        • Jun 2009
        • 51

        #4
        Hi maubp,

        The alignments vary but all have fewer than 30 taxa and fewer than 1000 amino acids.

        Thanks,
        Kevin
