We are analyzing whole-genome metagenomic data taken from the surface of the eye. Each sample yields millions of reads, but at most 1-2% of those reads are bacterial. Are there any resources on how the amount of available data affects the reliability of the results, e.g. identifying bacteria down to species level when they are present at greater than 1% relative abundance? Currently we align the reads to whole bacterial genome sequences, but many reads map to multiple locations, and many of those hits may be false positives. We have also tried MetaPhlAn2, which aligns reads against a custom catalog of clade-specific unique markers, but usually only several hundred reads map back, and many of the samples report very few or no species. Specifically, we are looking for methods for analyzing whole-genome metagenomic sequences when the amount of usable data is very low. Any help is greatly appreciated.
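To make the depth question concrete, here is a rough sketch of the kind of estimate we have in mind. It assumes (a simplification) that each mapped read is an independent draw, so the number of reads from a species at a given relative abundance follows a binomial distribution. The function name, the minimum-read thresholds and the read depths are illustrative assumptions, not values from our pipeline.

```python
import math


def prob_at_least(k_min: int, n_reads: int, rel_abundance: float) -> float:
    """P(X >= k_min) for X ~ Binomial(n_reads, rel_abundance).

    n_reads is the number of mapped bacterial reads and rel_abundance is the
    true fraction of those reads that come from the species of interest.
    """
    p = rel_abundance
    # Sum the probabilities of seeing fewer than k_min reads, then take the complement.
    below = sum(
        math.comb(n_reads, k) * p**k * (1 - p) ** (n_reads - k)
        for k in range(k_min)
    )
    return 1.0 - below


# Hypothetical depths: a few hundred mapped reads up to a few thousand.
for n_reads in (300, 1000, 5000):
    expected = n_reads * 0.01  # expected reads from a species at 1% abundance
    p_any = prob_at_least(1, n_reads, 0.01)   # at least one read observed
    p_ten = prob_at_least(10, n_reads, 0.01)  # assumed threshold of 10 reads
    print(f"{n_reads} reads: expected={expected:.0f}, "
          f"P(>=1)={p_any:.3f}, P(>=10)={p_ten:.3f}")
```

With only a few hundred mapped reads, a species at 1% abundance is expected to contribute just a handful of reads, which is why we doubt that our species-level calls are reliable at this depth.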
Daniel