Hi Everyone,
I'm sequencing PCR amplicons generated from gDNA (HIV proviruses). I've noticed something unusual that appears in a handful of the samples I've sequenced, even my controls (PCR-amplified plasmid).
The following sequence appears as a tandem array in approximately 5% of my samples, and is supported by a significant number of reads, generally more than 100. The tandem repeat appears to be present exclusively at the end of a read, though every base in the repeat has a quality score above Q30. The kicker: the sequence IS present in wildtype HIV, but only once. This leads me to believe that a duplicate is somehow being generated during library prep or sequencing.
Present in wildtype HIV: TAATACCAATAGTAG
Tandem Array (n=2): TAATACCAATAGTAG-TAATACCAATAGTAG
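A rough sketch of how reads carrying the tandem array could be counted, and how to check whether the array sits at the end of the read. It assumes the read sequences are already loaded as plain strings; the function names `tandem_runs` and `summarize`, the `end_slack` parameter, and the toy reads are hypothetical and only for illustration.

```python
import re

MOTIF = "TAATACCAATAGTAG"  # the 15-mer that is present once in wildtype HIV


def tandem_runs(read, motif=MOTIF, min_copies=2):
    """Return (start, end, copies) for each run of back-to-back motif copies.

    Only runs with at least `min_copies` consecutive copies are reported,
    so the single wildtype copy is ignored by default.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_copies))
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(motif))
        for m in pattern.finditer(read)
    ]


def summarize(reads, motif=MOTIF, end_slack=0):
    """Count reads with a tandem run and how many runs reach the read end.

    `end_slack` (an assumption of this sketch) is the number of trailing
    bases allowed after the run for it to still count as "at the end".
    """
    with_tandem = 0
    at_end = 0
    for read in reads:
        runs = tandem_runs(read.upper(), motif)
        if not runs:
            continue
        with_tandem += 1
        if any(len(read) - end <= end_slack for _, end, _ in runs):
            at_end += 1
    return {"reads": len(reads), "with_tandem": with_tandem, "at_read_end": at_end}


# Toy reads, made up for illustration (not real data).
example_reads = [
    "ACGT" + MOTIF + "GGCA",        # single copy, wildtype-like
    "ACGT" + MOTIF * 2,             # tandem array at the end of the read
    "ACGT" + MOTIF * 2 + "GGCATT",  # tandem array followed by more sequence
]
print(summarize(example_reads))
```

This only scans the forward orientation of the motif; for paired-end data the reverse complement would need to be checked as well.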
I'm going to rule out my PCR amplification step by sequencing control plasmid directly - if the tandem repeat still occurs, then it is undoubtedly a result of library prep or sequencing.
I do not see this sequence in the Nextera XT indexes or the Nextera adapter sequences, nor is it present in my PCR primers. I'm stumped - has anyone seen this before in their own work? I am aware of template switching during PCR, but I don't see any sequence homology that would allow a duplication of this repeat to occur.
Thanks
Jake