I clipped an adapter sequence off a set of restriction fragments that were sequenced on an Illumina machine. My expectation was that all the sequences that had contained a full-length adapter sequence (I used a 9-bp motif) would end with a recognizable restriction site. Although the vast majority of reads do meet this expectation, to my surprise, some of the reads do not.
When I examine the sequence prior to clipping, I find that the region where a clip was performed is similar but not identical to the specified motif. In the example below, I show a case of a full-length sequence (103 bp) and the clipped sequence that was obtained.
Unclipped sequence:
CAGCAATACTGCTGGAAAGGACAGTTTCAAGANTTCCCATCCGTAATCCTTTATATAGGCGTTGGGGTGGACGCACACTAAGATGGAAAAGACCTGCTAATAT
Clipped sequence:
CAGCAATACTGCTGGAAAGGACAGTTTCAAGANTTCCCATCCGTAATCCTTTATATAGGCGTTGGGGTGGACGCACACTA
The site where the clipping occurred (AGATGGAA) is similar but not identical to the specified adapter sequence (AGATCGGAA). It seems that a missing C was tolerated and that clipping was allowed to go ahead.
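The discrepancy can be checked with a minimal Python sketch. It only illustrates the comparison between the clipped site and the adapter; it does not describe how the clipper works internally, and the helper name `edit_distance` is a hypothetical one written for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]

# Sequences copied from the example above.
read = ("CAGCAATACTGCTGGAAAGGACAGTTTCAAGANTTCCCATCCGTAATCCTTTATATAGGCG"
        "TTGGGGTGGACGCACACTAAGATGGAAAAGACCTGCTAATAT")
clipped = ("CAGCAATACTGCTGGAAAGGACAGTTTCAAGANTTCCCATCCGTAATCCTTTATATAGGCG"
           "TTGGGGTGGACGCACACTA")
adapter = "AGATCGGAA"

# Is the full 9-bp adapter present anywhere in the read?
has_exact_adapter = adapter in read

# The 8 bases that start where the clip was made.
site = read[len(clipped):len(clipped) + 8]
distance = edit_distance(site, adapter)

print("exact adapter present:", has_exact_adapter)
print("site at clip position:", site)
print("edit distance to adapter:", distance)
```

For this read, the exact adapter is absent, and the site at the clip position differs from the adapter by a single edit (the missing C).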
I performed the clip both in Galaxy and with the Fastx toolkit and got exactly the same result both ways. In neither case did I see an option to control whether mismatches in the adapter are tolerated.
Any insights would be much appreciated.
Francois