I have a few technical questions about sequencing repetitive sequences, such as homopolymeric stretches of any base, that lie immediately downstream of the sequencing primer on the Illumina sequencing platform.
1. If sequencing starts right at a homopolymeric tail, can the Illumina platform still recognize the clusters? If not, can you suggest a strategy? Would placing a barcode in front of the stretch help?
2. (This second option could be tried using PE adapters.) Suppose sequencing starts near the 5' end of the transcript from one end and later from the other end, as in the diagram below. The two strands in (a) represent the cDNA library. In (b), if we read in direction 1 first, the clusters can certainly be identified (the 5' ends come from different mRNAs). But will the sequencer be able to reuse the cluster information from read 1 when sequencing from the second end, where the C tail is present and the read starts near its first base?
a) cDNA library
----------------------NNNN(cDNA)CCCCCCCCCCCC~~~~~~~~~~~
----------------------NNNN(cDNA')GGGGGGGGGGGG~~~~~~~~~~~
b) Sequencing
----------------------NNNN(cDNA)CCCCCCCCCCCC~~~~~~~~~~~
.....................................................................<-----------2
..........1------------->................
----------------------NNNN(cDNA')GGGGGGGGGGGG~~~~~~~~~~~
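To make the concern in question 1 concrete, here is a rough Python sketch of the per-cycle base composition for two layouts: reads that begin directly in the C tail, and reads that begin with a short barcode placed in front of it. The toy reads, the 4-base barcodes, and the function name `per_cycle_composition` are all made up for illustration, and the sketch assumes that cluster recognition depends on base diversity in the first few cycles; it does not model what the instrument software actually does.

```python
from collections import Counter

BASES = "ACGT"

def per_cycle_composition(reads, n_cycles):
    """Fraction of each base at each of the first n_cycles positions."""
    composition = []
    for cycle in range(n_cycles):
        # Count the base seen at this cycle in every read long enough to have one.
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        total = sum(counts.values())
        if total == 0:
            composition.append({base: 0.0 for base in BASES})
        else:
            composition.append({base: counts.get(base, 0) / total for base in BASES})
    return composition

# Hypothetical toy reads: the read starts directly in the C tail.
tail_first = ["CCCCCCAGTC", "CCCCCCTTGA", "CCCCCCGACT", "CCCCCCATGG"]

# Hypothetical toy reads: a 4-base barcode is read before the C tail.
barcode_first = ["ACGTCCCCCC", "GTACCCCCCC", "TGCACCCCCC", "CATGCCCCCC"]

if __name__ == "__main__":
    for name, reads in (("tail first", tail_first), ("barcode first", barcode_first)):
        print(name)
        for cycle, fractions in enumerate(per_cycle_composition(reads, 4), start=1):
            print(f"  cycle {cycle}: {fractions}")
```

With the tail-first reads every cluster shows the same base in each of the first cycles, while the barcoded reads show a mix of bases at every one of those cycles. That difference is what I am hoping a barcode would provide.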
Please share your views. Suggestions are always welcome.
Biochembug