SEQanswers R mismatch position in pairwise alignment
 Register FAQ Members List Calendar Search Today's Posts Mark Forums Read

 Similar Threads Thread Thread Starter Forum Replies Last Post trasver 454 Pyrosequencing 1 03-07-2011 04:31 AM peveralldubois Bioinformatics 0 01-14-2011 07:23 PM fbarreto Bioinformatics 4 02-27-2010 04:36 AM lmf_bill SOLiD 1 09-11-2009 05:21 AM scirocco Bioinformatics 0 12-28-2008 10:33 AM

 10-26-2010, 11:10 PM #1 NicoBxl not just another member   Location: Belgium Join Date: Aug 2010 Posts: 264 R mismatch position in pairwise alignment Hi everybody, I've a little question. In R, I've two sequences ( DNA sequences ) and I want to align them and known the mismatch positions. For example : seq1 <- "ATGC" seq2 <- "ATGG" mismatch.pos <- aCoolFunction(seq1,seq2) mismatch.pos --> 4 Anyone knows such thing ? if it doesn't exist in R, is it easily possible in bioperl ?
 10-27-2010, 01:10 AM #2 dariober Senior Member   Location: Cambridge, UK Join Date: May 2010 Posts: 311 What about... Code: ```x <- "ATGCGACTG" y <- "ATGGNACTG" seqdiff<- function(seq1, seq2){ seq<- strsplit(c(seq1, seq2), split= '') mismatches<- which(seq[[1]] != seq[[2]]) return(mismatches) } seqdiff(x, y) --> [1] 4 5``` (Note: No attempt is made to cope with sequences of different length, which() will throw a warning in such case). Dario
 10-27-2010, 01:15 AM #3 NicoBxl not just another member   Location: Belgium Join Date: Aug 2010 Posts: 264 thanks but in most of the cases the sequences have different length
 10-27-2010, 01:37 AM #4 dariober Senior Member   Location: Cambridge, UK Join Date: May 2010 Posts: 311 ...The variant below will right-pad the shorter sequence with 'X', which in turn will be considered as differences (not sure this is what you want...) I assume that your sequences have been already aligned by some other program and what you want is to pull out the mismatches. If instead you really want to do sequence alignment within R, I accidentally found the package bio3d which has a function called seqaln, see if it helps. Anyway, I think R is not ideal for such jobs, I'd rather go for python or perl after BLAST or other aligner. Dario Code: ```seqdiff<- function(seq1, seq2){ seq<- strsplit(c(seq1, seq2), split= '') ## If the length of the two sequences differs, ## pad the shorter one with X seqlen<- length(seq[[1]]) - length(seq[[2]]) if(seqlen > 0){ seq[[2]]<- append(seq[[2]], rep('X', seqlen)) } if(seqlen < 0){ seq[[1]]<- append(seq[[1]], rep('X', abs(seqlen))) } mismatches<- which(seq[[1]] != seq[[2]]) return(mismatches) }```
 10-27-2010, 08:36 AM #5 svl Member   Location: Netherlands Join Date: Sep 2009 Posts: 43 Are you using bioconductor? This should be doable with bioconductor as well. How to install bioconductor: source(“http://www.bioconductor.org/biocLite.R”) biocLite() How to create an aligment: http://manuals.bioinformatics.ucr.ed...quence-Alignme And here's the alignments vignette/doc: http://bioconductor.org/packages/2.5...Alignments.pdf ... in this pdf focus on the function: mismatchTable -- Creates a table for the mismatching positions Seems to be what you want. Let us know if and how you got it to work! /Stef Last edited by svl; 10-27-2010 at 08:39 AM.