  • hepcat72
    Junior Member
    • Dec 2009
    • 7

    How do matrix weights for amino acid 'X' affect protein alignments in MUSCLE?

    Sorry, this is a cross-post from biostars, but I haven't had any luck there in figuring this out...

    I'm developing a protein weight matrix from scratch to use with MUSCLE. I started with the 20 amino acids and the stop character, but using just those, I get a warning from MUSCLE that makes no sense^. But if I add a row & column for the amino acid code 'X' (with all 0s - everything else ranges from 1-25), the warning goes away. However, even though the sequences I'm aligning have no occurrences of 'X', the alignments produced using the matrix without 'X' and the matrix with 'X' are different. I don't understand why that is.

    How do alignment algorithms treat 'X' in the weight matrix when there are no X's in the sequences being aligned? Why would an alignment produced using a matrix with 'X' differ from an alignment produced using a matrix without 'X'? Are there weights I can insert for 'X' in the matrix that would not affect the alignment?

    Side question - I don't have 'B' or 'Z' in my matrix either (nor do they occur in my sequences). Do they need to be in the matrix for the alignment software to yield good results?

    ^ The warning from muscle is "*** WARNING *** Matrix is not symmetrical, ?->?=5, ?->?=0". This doesn't make sense for 3 reasons: 1. The matrix is symmetrical. 2. There were no 0s in my matrix when the warning occurred. 3. Simply adding a row/column for 'X' silenced the warning. I looked briefly at the code in muscle where the warning is generated and it's simply checking that
    Code:
    matrix[i][j] == matrix[j][i]
    The fact that there's an item that doesn't match is unsettling. Perhaps this is a hidden requirement that 'X' must be present in the matrix - but still, why would the alignment change with/without it?
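For anyone who wants to rule out their own matrix before blaming the aligner, the symmetry test described in this post can be reproduced outside muscle. This is only a minimal sketch, not muscle's code: the function name `find_asymmetries`, the toy three-letter alphabet, and the scores are all made up for illustration.

```python
def find_asymmetries(letters, matrix):
    """Return (a, b, score a->b, score b->a) for every mismatched pair."""
    mismatches = []
    for i in range(len(letters)):
        for j in range(i + 1, len(letters)):
            # Same comparison as the check quoted above.
            if matrix[i][j] != matrix[j][i]:
                mismatches.append(
                    (letters[i], letters[j], matrix[i][j], matrix[j][i])
                )
    return mismatches


# Toy alphabet and scores, purely illustrative.
letters = ["A", "C", "D"]
symmetric = [[5, 1, 2], [1, 6, 3], [2, 3, 7]]
lopsided = [[5, 1, 2], [1, 6, 3], [2, 4, 7]]

print(find_asymmetries(letters, symmetric))  # []
print(find_asymmetries(letters, lopsided))   # [('C', 'D', 3, 4)]
```

If a check like this returns an empty list for the matrix file on disk, an asymmetry warning points at how the matrix is being read rather than at the file itself.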
  • hepcat72
    Junior Member
    • Dec 2009
    • 7

    #2
    The answer is that the presence or absence of 'X' in a protein weight matrix should not affect the alignment of sequences that contain no 'X'. However, there is a bug in muscle, which is fixed by a patch posted on GitHub. Supplying muscle with a custom weight matrix that lacks the ambiguous amino acid codes changes the alignment because muscle was not setting the weights in the last column. That unset column was also the cause of the warning.
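One way to picture the effect of an unset last column is the toy loader below. It is a hypothetical illustration, not muscle's actual code: `load_into_table`, the `skip_last_column` flag, and the scores are assumptions made for the sketch. It copies a matrix into a zero-initialised table but can be told to leave the final column untouched.

```python
def load_into_table(rows, skip_last_column=False):
    """Copy rows into a zero-filled square table, optionally skipping the last column."""
    n = len(rows)
    table = [[0] * n for _ in range(n)]
    last = n - 1 if skip_last_column else n
    for i in range(n):
        for j in range(last):
            table[i][j] = rows[i][j]
    return table


def is_symmetric(table):
    n = len(table)
    return all(table[i][j] == table[j][i] for i in range(n) for j in range(n))


# Toy symmetric matrix with no zeros (letters would be e.g. A, C, *).
rows = [[5, 1, 2], [1, 6, 3], [2, 3, 7]]
# Same matrix with an all-zero row and column appended for 'X'.
rows_with_x = [r + [0] for r in rows] + [[0, 0, 0, 0]]

print(is_symmetric(load_into_table(rows)))                                # True
print(is_symmetric(load_into_table(rows, skip_last_column=True)))         # False
print(is_symmetric(load_into_table(rows_with_x, skip_last_column=True)))  # True
```

In this toy model, skipping the last column turns a nonzero score into a 0 on one side of the diagonal only, which has the same shape as the "?->?=5, ?->?=0" warning. Appending an all-zero 'X' row and column hides the problem, because the skipped column was meant to be zero anyway.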

