Hello,
We recently did a paired-end run on an Illumina machine to sequence several bacterial genomes (one genome per lane). When we tried mapping our reads to a reference genome, none of the reads mapped. A de novo assembly produced ~2,000 contigs, and most of them have some sequence identity to viral genomes. Furthermore, the contigs are AT-rich (average of 38% GC) and the predicted open reading frames are short (300-825 bp), all indicative of viral sequence.
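For anyone who wants to run the same GC check on their own contigs, here is a minimal sketch in plain Python. It is not the exact script we used; the function names (`read_fasta`, `gc_fraction`) and the file name `contigs.fasta` in the usage comment are illustrative assumptions.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {contig_name: sequence} (upper-cased)."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def gc_fraction(seq: str) -> float:
    """Return the fraction of G+C among unambiguous bases (A, C, G, T)."""
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


# Usage (hypothetical file name):
# with open("contigs.fasta") as handle:
#     contigs = read_fasta(handle)
# for name, seq in contigs.items():
#     print(name, len(seq), round(100 * gc_fraction(seq), 1))
```

Ambiguous bases such as N are left out of the denominator, so heavily masked contigs do not drag the percentage down.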
We are fairly confident that the viral sequences did not come from the bacterial cultures we extracted DNA from, since infected cells would have been lysed (and we would not have extracted DNA from lysed cultures). Furthermore, one of our lanes contained an environmental sample that had never been cultured. These observations make us think that something may have happened during the Illumina sample prep.
Has anyone had similar contamination issues?
Many thanks in advance.