SEQanswers

07-18-2017, 08:51 AM   #1
hepcat72
Junior Member
 
Location: Princeton, NJ

Join Date: Dec 2009
Posts: 6
How do matrix weights for amino acid 'X' affect protein alignments in MUSCLE?

Sorry, this is a cross-post from Biostars, but I haven't had any luck figuring this out there...

I'm developing a protein weight matrix from scratch to use with MUSCLE. I started with the 20 amino acids and the stop character, but with just those, MUSCLE emits a warning that makes no sense^. If I add a row and column for the amino acid code 'X' (all 0s; every other entry ranges from 1 to 25), the warning goes away. However, even though the sequences I'm aligning contain no 'X', the alignments produced with the matrix without 'X' and the matrix with 'X' are different. I don't understand why that is.

How do alignment algorithms treat 'X' in the weight matrix when there are no X's in the sequences being aligned? Why would an alignment produced using a matrix with 'X' differ from one produced using a matrix without 'X'? Are there weights I can insert for 'X' in the matrix that would not affect the alignment?

Side question: I don't have 'B' or 'Z' in my matrix either (nor do they occur in my sequences). Do they need to be in the matrix for the alignment software to yield good results?

^ The warning from MUSCLE is "*** WARNING *** Matrix is not symmetrical, ?->?=5, ?->?=0". This doesn't make sense for 3 reasons: 1. The matrix is symmetrical. 2. There were no 0s in my matrix when the warning occurred. 3. Simply adding a row/column for 'X' silenced the warning. I looked briefly at the MUSCLE code where the warning is generated, and it simply checks the following for each pair of indices:
Code:
matrix[i][j] == matrix[j][i]
The fact that there's an entry that doesn't match is unsettling. Perhaps there is a hidden requirement that 'X' must be present in the matrix, but even so, why would the alignment change with or without it?
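
To rule out a problem in the matrix itself before blaming the aligner, the same symmetry check can be run independently. Below is a minimal sketch in Python, assuming the matrix has already been loaded into a list of rows with a matching list of symbols. The function name `find_asymmetry` and the toy values are made up for illustration and are not taken from MUSCLE.

```python
def find_asymmetry(symbols, matrix):
    """Return the first (row_symbol, col_symbol, m_ij, m_ji) mismatch, or None.

    `symbols` and `matrix` are hypothetical in-memory structures: one symbol
    per row/column, and a square list of lists of scores.
    """
    n = len(symbols)
    for i in range(n):
        # Only the upper triangle needs checking; each pair is compared once.
        for j in range(i + 1, n):
            if matrix[i][j] != matrix[j][i]:
                return symbols[i], symbols[j], matrix[i][j], matrix[j][i]
    return None


# Toy 3x3 examples (illustrative values only, not a real substitution matrix).
symbols = ["A", "R", "N"]
symmetric = [[5, 1, 2], [1, 6, 3], [2, 3, 7]]
broken = [[5, 1, 2], [1, 6, 3], [2, 0, 7]]

print(find_asymmetry(symbols, symmetric))
print(find_asymmetry(symbols, broken))
```

If this reports no mismatch for the matrix as written on disk, the asymmetry in the warning is more likely to come from how the program fills in entries that the file does not specify.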
07-26-2017, 04:38 PM   #2
hepcat72
Junior Member
 
Location: Princeton, NJ

Join Date: Dec 2009
Posts: 6

The answer is that the presence or absence of 'X' in a protein weight matrix should not affect the alignment of sequences that contain no 'X'. However, there is a bug in MUSCLE that is fixed by the following patch. When MUSCLE is supplied a custom weight matrix without any of the ambiguous amino acid codes, it does not set the weights in the last column. Those unset weights were causing both the warning and the difference in the alignment.

https://github.com/Bioconductor-mirror/muscle/compare/master...hepcat72atch-2
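
For anyone who cannot apply the patch, the workaround from the original question (adding an explicit all-zero row and column for 'X') at least silenced the warning. Here is a rough sketch of that padding step in Python, assuming the matrix is held in memory as a list of rows. The helper name `add_symbol` is hypothetical, and reading or writing MUSCLE's matrix file format is deliberately left out.

```python
def add_symbol(symbols, matrix, new_symbol="X", fill=0):
    """Return copies of (symbols, matrix) with one extra row and column.

    Hypothetical helper: every new entry is set to `fill`, so the padded
    matrix stays symmetric if the input was symmetric.
    """
    if new_symbol in symbols:
        # Nothing to add; still return copies so the caller's data is untouched.
        return list(symbols), [list(row) for row in matrix]
    padded = [list(row) + [fill] for row in matrix]   # new last column
    padded.append([fill] * (len(symbols) + 1))        # new last row
    return list(symbols) + [new_symbol], padded


# Toy 2x2 example (illustrative values only).
new_symbols, new_matrix = add_symbol(["A", "R"], [[5, 1], [1, 6]])
print(new_symbols)
for row in new_matrix:
    print(row)
```

Whether all-zero weights are the right choice for 'X' is a separate question; the sketch only mirrors what was tried in the first post.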
