Hi!
Recently I got my hands on data where a single lane of GAIIx was sequenced from genomic DNA. >6 GB output, all high quality. However, the GC plot of the reads shows two distinct peaks (a larger one at 37% -> target genome, a smaller one at around 50%). Given this, and knowing the source of the DNA, the second peak seems to come from an endosymbiont (or bacterial contamination). When I assemble with Velvet (already tested cc=50 and large k-mers) or Ray, I get a genome of around 2 MB (far too small) with bad CEGMA results and none of the stuff that should be in there, although the BLAST hits point to the right organism. Question is: how to separate the endosymbiont from the target, possibly at read level?
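To make the question concrete, here is a minimal sketch of what a naive read-level split by GC content could look like. It assumes strict 4-line FASTQ input; the 0.44 cutoff (roughly midway between the two peaks) and the function names `gc_fraction`, `read_fastq` and `split_by_gc` are illustrative assumptions, not a validated method.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality)


def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T); 0.0 if there are none."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    total = gc + at
    return gc / total if total else 0.0


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from strict 4-line FASTQ records.

    Assumes no blank lines and no wrapped sequences; an incomplete
    trailing record is silently dropped.
    """
    it = iter(lines)
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def split_by_gc(
    records: Iterable[Record], cutoff: float = 0.44
) -> Tuple[List[Record], List[Record]]:
    """Partition reads into (GC below cutoff, GC at or above cutoff).

    The default cutoff is a hypothetical midpoint between the 37% and
    ~50% peaks; it would need tuning against the actual GC histogram.
    """
    low: List[Record] = []
    high: List[Record] = []
    for rec in records:
        (low if gc_fraction(rec[1]) < cutoff else high).append(rec)
    return low, high


if __name__ == "__main__":
    example = [
        "@read1\n", "ATATATATGC\n", "+\n", "IIIIIIIIII\n",
        "@read2\n", "GCGCGCATAT\n", "+\n", "IIIIIIIIII\n",
    ]
    low_gc, high_gc = split_by_gc(read_fastq(example))
    print(len(low_gc), "low-GC reads;", len(high_gc), "high-GC reads")
```

For a full lane the two bins would be streamed to separate files rather than held in lists. Note that per-read GC is noisy for short reads, so the two distributions probably overlap and a hard cutoff will misassign some reads from both genomes.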
Any help highly appreciated.