12-19-2014, 09:24 AM   #5
Brian Bushnell

If you are looking for a web tool, I can't really offer any suggestions (hopefully someone else can). You can run BBMap locally, though, and it will report alignments along with their percent identity. And rather than aligning to whole bacterial genomes, you can just align to 16S sequences using one of the datasets mentioned here. However, 16S sequences in public databases are sometimes not full-length, or are too long, so the alignment coordinates will be misleading. You may wish to first filter out the ones that seem anomalous, like this:

reformat.sh in=16S.fasta out=filtered.fasta minlen=1440 maxlen=1640

...which is what I did previously when trying to get rid of bad sequences. I derived the exact length limits empirically by looking at length distributions (using readlength.sh); a tighter band might work better, since you are interested in finding specific coordinates.
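
If it helps to see what that length filter amounts to, here is a rough sketch in plain awk that should behave roughly the same way on a small test file. It is not how reformat.sh or readlength.sh are implemented, just an illustration of the idea. The file names (toy.fasta, toy_filtered.fasta) and the three made-up records are assumptions for the example, not real 16S data.

```shell
#!/bin/sh
# Sketch only: list sequence lengths, then keep records inside a length band.
# toy.fasta and its records are invented illustration data, not real 16S.
set -eu

# Build a toy FASTA with records of length 900, 1500 and 2000,
# wrapped at 70 columns like many database downloads.
awk 'BEGIN {
    n = split("900 1500 2000", len, " ")
    for (i = 1; i <= n; i++) {
        printf ">seq%d\n", i
        for (j = 1; j <= len[i]; j++) {
            printf "A"
            if (j % 70 == 0 || j == len[i]) printf "\n"
        }
    }
}' > toy.fasta

# Step 1: print "name length" per record, summing over wrapped lines,
# to eyeball the length distribution before choosing limits.
awk '
/^>/ { if (name != "") print name, len; name = substr($0, 2); len = 0; next }
     { len += length($0) }
END  { if (name != "") print name, len }
' toy.fasta

# Step 2: keep only records whose total length is within [min, max].
# Kept sequences are written unwrapped, on a single line.
awk -v min=1440 -v max=1640 '
function emit() {
    if (header != "" && length(seq) >= min && length(seq) <= max)
        printf "%s\n%s\n", header, seq
}
/^>/ { emit(); header = $0; seq = ""; next }
     { seq = seq $0 }
END  { emit() }
' toy.fasta > toy_filtered.fasta

# Show which records survived the filter.
grep '^>' toy_filtered.fasta
```

With the toy data, only the 1500 bp record falls inside the 1440 to 1640 band, so it is the only one written to toy_filtered.fasta. Changing min and max is the equivalent of tightening the band.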