Hi,
I am new to R programming. I have two FASTA files, WT and MT, each containing three protein sequences, shown below.
WT:
>seq1
MGFTIIIJKLPADFDEMMJDHGASFWEIHDI
>seq2
IIJHGTIPIOIZTRWERQQMSCVYMMDMMLI
>seq3
KPOIHUTWMMKLIZGTIIWHSAKDFJGVAHD
MT:
>seq1
TGFTHIHJKLPADFDEMTJDHGASFWEIHDH
>seq2
HHJHGTIPIOIZTRWERQQMSCVYTMDTMLH
>seq3
KPOIHUTWTTKLIZGTHHWHSAKDFJGVAHD
WT contains a set of FASTA sequences without any mutated amino acids, while MT contains the same protein sequences as WT but with some amino acids mutated.
I want to compare the two sets of sequences and count particular mutations for every protein, namely positions where M is mutated to T and positions where I is mutated to H, and list them like this:
Seq1: MT_counts: xx; IH_counts: yy
Seq2: MT_counts: xx; IH_counts: yy
Seq3: MT_counts: xx; IH_counts: yy
I have written a small function in R. I read the FASTA sequences using read.fasta from the seqinr package, then use two for loops: one iterating over all sequences, and another iterating over every letter of a sequence in WT and comparing it with its counterpart in MT.
But it gives me counts of 0 for both MT and IH for seq1, seq2 and seq3. I know there are some mutated amino acids, so I am surprised that the function reports the counts as 0. Could someone point out whether there is a mistake in the comparison? Kindly help me. Thanks in advance.
Code:
library(seqinr)
wt=read.fasta("C:/Users/tsekaran/Documents/sample_ref_protein.fasta")
mt=read.fasta("C:/Users/tsekaran/Documents/sample_mut_protein.fasta")
mismatch_finder=function(wt,mt)
{
  for (i in 1:length(names(wt)))
  {
    MT_count=0
    IH_count=0
    #for(j in 1:length(wt$seq1))
    for(j in 1:length(wt[[i]]))
    {
      if(wt[j]=="m" && mt[j]=="t" )
      {
        MT_count=MT_count+1
      }
      else if(wt[j]=="i" && mt[j]=="h" )
      {
        IH_count=IH_count+1
      }
    }
    print(names(wt[i]))
    print(MT_count)
    print(IH_count)
  }
}
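One likely cause, offered as a sketch rather than a definitive fix: inside the inner loop, wt[j] and mt[j] index the list of sequences, not the residues of sequence i, so the comparison with "m" or "i" never matches. Selecting the j-th residue of the i-th sequence would be wt[[i]][j] and mt[[i]][j]. The vectorised version below assumes that read.fasta returned a list of single-character vectors (as.string = FALSE) and that each WT sequence has the same length as its MT counterpart. The function name count_mutations and the toy input are hypothetical. The toy lists are built from the sequences above so the example runs without the FASTA files, and both sides are uppercased so the result does not depend on the letter case read.fasta returns.

```r
# Hypothetical helper: count M->T and I->H substitutions per sequence.
# `wt` and `mt` are assumed to be named lists of single-character vectors,
# in the same order, like the output of seqinr::read.fasta(as.string = FALSE).
count_mutations <- function(wt, mt) {
  stopifnot(length(wt) == length(mt))
  counts <- lapply(seq_along(wt), function(i) {
    # wt[[i]] extracts the i-th sequence itself; wt[i] would be a list
    w <- toupper(as.character(wt[[i]]))
    m <- toupper(as.character(mt[[i]]))
    stopifnot(length(w) == length(m))
    # Element-wise comparison over all positions at once
    c(MT_counts = sum(w == "M" & m == "T"),
      IH_counts = sum(w == "I" & m == "H"))
  })
  names(counts) <- names(wt)
  do.call(rbind, counts)  # one row per sequence, one column per mutation
}

# Toy input standing in for the read.fasta() results
to_chars <- function(s) strsplit(s, "")[[1]]
wt <- list(seq1 = to_chars("MGFTIIIJKLPADFDEMMJDHGASFWEIHDI"),
           seq2 = to_chars("IIJHGTIPIOIZTRWERQQMSCVYMMDMMLI"),
           seq3 = to_chars("KPOIHUTWMMKLIZGTIIWHSAKDFJGVAHD"))
mt <- list(seq1 = to_chars("TGFTHIHJKLPADFDEMTJDHGASFWEIHDH"),
           seq2 = to_chars("HHJHGTIPIOIZTRWERQQMSCVYTMDTMLH"),
           seq3 = to_chars("KPOIHUTWTTKLIZGTHHWHSAKDFJGVAHD"))

result <- count_mutations(wt, mt)
print(result)
```

With the real files, the lists returned by read.fasta could be passed to count_mutations directly, provided the sequences appear in the same order in both files.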