Hi everybody,
Recently I sequenced some genes (PCR amplicons; approx. 5.5 kb per gene) from Candida glabrata to find out whether they contain mutations that can be linked to echinocandin resistance. I used 55 different C. glabrata strains and the Ion Torrent PGM (two separate runs, one on a 318C chip and one on a 316 chip) to get the sequences. I am now analyzing the data with CLC Genomics Workbench 7, and I found some unexpected results. I was wondering if anybody has seen something like this or knows what's going on.
When I map the reads to a reference, one of the genes shows a deletion at a site that results in a frameshift and a premature stop codon. There is nothing wrong with a result like that in itself, but the striking thing is that when I check the raw data, the deletion is caused by a missing G on the reverse strand. About 99% of the forward-strand reads show GGG, whereas almost 95% of the reverse-strand reads show only GG. I checked the rest of the sequence and this does not happen anywhere else, even though there are many other places where runs of multiple G's (up to 5 in a row) cause no problems at all. This is true for all 55 samples and both runs.
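To put a number on the strand difference, here is a minimal Python sketch of how the per-strand deletion fraction could be compared. The function names, the 0.5 threshold, and the read depth of 1000 per strand are all made up for illustration; only the rough percentages come from my data, and none of this is what CLC does internally.

```python
def deletion_fraction(reads_with_deletion, total_reads):
    """Fraction of reads on one strand that carry the deletion (GG instead of GGG)."""
    if total_reads == 0:
        raise ValueError("no reads on this strand")
    return reads_with_deletion / total_reads


def is_strand_biased(fwd_del, fwd_total, rev_del, rev_total, max_gap=0.5):
    """Flag a call whose deletion fraction differs between strands by more than max_gap.

    max_gap is an arbitrary illustrative threshold, not a recommended setting.
    """
    fwd = deletion_fraction(fwd_del, fwd_total)
    rev = deletion_fraction(rev_del, rev_total)
    return abs(fwd - rev) > max_gap


# Hypothetical depth of 1000 reads per strand; the proportions mirror the
# observation: about 1% of forward reads and about 95% of reverse reads show GG.
print(is_strand_biased(fwd_del=10, fwd_total=1000, rev_del=950, rev_total=1000))
```

A real variant would be expected to show up at a similar fraction on both strands, so a gap this large is what makes me suspect an artifact rather than a true deletion.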
Any thoughts??