After preliminary de novo clustering/assembly of transcriptomic data from a non-model organism, I've found what appears to be a pretty good indication of vector contamination (accession AY817672; bitscore: 6318; e-value: 0.0; 3423 out of 3424 identities).
This is how the library was prepared: I extracted high-quality total RNA from the organism and shipped it to the sequencing facility, which generated the library and ran one lane of a flow cell (2x76bp) that produced ~5.1Gb of total data (~34,000,000 paired-end reads).
Now the overall frequency appears to be low, only ~600,000 bases (roughly 0.01% of the total data). And it actually winds up working almost as an assembly "quality control metric" that lets us assess consistency between different assemblers. But as far as I'm concerned, this vector shouldn't be in our library, and it seems like something that our sequencing service provider should be able to account for.
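For anyone who wants to check that fraction, here is a minimal sketch in Python using only the approximate figures quoted above. It assumes the ~600,000 vector bases are counted against the raw read bases from the lane; the function name `contamination_fraction` is just an illustrative helper, not part of any tool.

```python
def contamination_fraction(contaminant_bases: int, total_bases: int) -> float:
    """Return the share of sequenced bases attributed to the contaminant."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    return contaminant_bases / total_bases


# Approximate figures quoted in the post
read_pairs = 34_000_000      # ~34 million paired-end reads
read_length = 76             # 2x76bp run
vector_bases = 600_000       # ~600,000 bases matching the vector

# Two reads per pair, 76 bases each
total_bases = read_pairs * 2 * read_length

fraction = contamination_fraction(vector_bases, total_bases)
print(f"total bases:  {total_bases:,}")
print(f"vector share: {fraction:.4%}")
```

With these inputs the total comes to about 5.17Gb, in line with the ~5.1Gb reported, and the vector share lands at roughly one hundredth of a percent.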
Has anybody else found this sort of thing in their Solexa/Illumina libraries? As far as I'm aware, cloning vectors are not a part of the Illumina protocol so it's unlikely to simply be an artifact from library prep. Am I wrong?
Thanks for the insight.