Hi All,
I recently found an odd artifact in some 100 bp Illumina GA2 reads we got from our sequencing provider. After some initial consternation, I realized that all the raw data contained duplicated bases at specific cycle numbers. More precisely, every read in two of the samples, which were run side by side, had insertions at the 37th and 74th positions that duplicated the bases at the 36th and 73rd positions, respectively. A third sample, run at a later time, had an insertion at the 51st position that was identical to the base at the 50th position in every single read. Once I removed the 37th and 74th bases from all reads in the first two datasets and the 51st base from all reads in the third dataset, everything looked OK.
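In case it helps anyone check their own data, here is a minimal Python sketch of the check and the workaround described above. The function names are hypothetical, positions are assumed to be 1-based in the raw read, and a FASTQ record is assumed to be a plain (header, sequence, plus line, quality) tuple of strings. It is an illustration, not the exact script used on these datasets.

```python
def drop_positions(seq, positions):
    """Return seq with the given 1-based positions removed."""
    skip = {p - 1 for p in positions}  # convert to 0-based indices
    return "".join(ch for i, ch in enumerate(seq) if i not in skip)


def is_duplicated(seq, positions):
    """True if every listed 1-based position repeats the base just before it."""
    return all(
        2 <= p <= len(seq) and seq[p - 1] == seq[p - 2]
        for p in positions
    )


def fix_fastq_record(record, positions):
    """Drop the same positions from both the bases and the quality string."""
    header, seq, plus, qual = record
    return (
        header,
        drop_positions(seq, positions),
        plus,
        drop_positions(qual, positions),
    )
```

For the first two datasets the positions would be [37, 74], and for the third [51]. The quality string has to be trimmed at the same positions, otherwise the sequence and quality lines end up with different lengths. Note that a single read passing is_duplicated means little, since adjacent identical bases occur by chance; the artifact shows up when every read in a file passes.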
Has anyone else experienced this type of artifact before? Any idea what could cause this sort of thing? I brought this to the provider's attention and mentioned that the positions of the inserted bases bore a striking resemblance to the standard 36 bp and 50 bp read lengths, but they insisted that their machines were working properly and that no one else had complained about the data. Thoughts? Thanks.