SEQanswers




Old 05-05-2014, 10:54 AM   #1
canerb
Junior Member
 
Location: izmir

Join Date: May 2014
Posts: 2
bases after adapter contamination

Hi everyone,

I was recently experimenting with simNGS, the NGS simulation tool from EBI (https://www.ebi.ac.uk/goldman-srv/simNGS/), to create a test dataset for myself. I added a custom adapter sequence to the sequencing setup and ran the simulation. When I then grepped for the adapter in the resulting reads, I saw some random nucleotides after the complete adapter sequence at the 3' end. How is this possible?

To my knowledge, adapter contamination occurs when the fragment length is shorter than the read length (which I allowed in my simulation), but shouldn't the reaction terminate after going through the complete adapter sequence? Where could those extra nucleotides be coming from? (They are not multiplexing adapters; they are just random as far as I can tell.)

Thanks for your replies.
Old 05-05-2014, 11:10 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,695

After the adapter sequence finishes at, say, cycle 80 of a 100 bp read, the sequencing machine is still going to run 20 more cycles, and put *something* in the output file. Since there is nothing more to sequence, it will probably just be highly-amplified random ambient light, giving random bases. Hopefully they will have very low quality scores.
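The cycle arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the function name and the insert and adapter lengths (47 and 33) are assumptions chosen so that the adapter ends at cycle 80 of a 100 bp read, as in the example above.

```python
def split_read_cycles(read_len, insert_len, adapter_len):
    """Return (insert, adapter, leftover) cycle counts for one read.

    The machine always runs read_len cycles, whatever the template length.
    """
    insert_cycles = min(read_len, insert_len)
    adapter_cycles = min(read_len - insert_cycles, adapter_len)
    # Cycles with nothing left to sequence still produce base calls.
    leftover_cycles = read_len - insert_cycles - adapter_cycles
    return insert_cycles, adapter_cycles, leftover_cycles


# Hypothetical lengths: 47 bp insert + 33 bp adapter ends at cycle 80.
print(split_read_cycles(100, 47, 33))  # (47, 33, 20)
```

Whenever the third number is greater than zero, the read will contain that many bases that come from neither the insert nor the adapter.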
Old 05-05-2014, 03:21 PM   #3
nucacidhunter
Senior Member
 
Location: Iran

Join Date: Jan 2013
Posts: 1,060

The oligos immobilised on the flow cell, which are complementary to the 5' motif of the P5 and P7 adapters, have a stretch of 10 T (poly-T) at their 5' end. So, in real data, when the sequencing reaction reads through the adapters, it hits the poly-T and calls A bases. After calling 10 A, it somehow calls a mix of predominantly A and some C, probably picking up signals from nearby clusters.
Old 05-05-2014, 04:11 PM   #4
nucacidhunter
Senior Member
 
Location: Iran

Join Date: Jan 2013
Posts: 1,060

Here is a screenshot from read 2 of sequencing runs longer than the insert size. Poly-A is evident after the adapter sequences.
Attached Files
File Type: pdf Poly A after read2.pdf (173.8 KB, 29 views)
Old 05-05-2014, 04:40 PM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,695

Quote:
Originally Posted by nucacidhunter View Post
Here is a screenshot from read 2 of sequencing runs longer than the insert size. Poly-A is evident after the adapter sequences.
Nice picture; thanks! Seems like the poly-A can be as short as 2 in that data. Do you know how universal the presence of post-adapter poly-A is through various library protocols?
Old 05-05-2014, 04:43 PM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,495

This observation is real but it is curious that simNGS has that implemented in a simulator. Kudos to the developer(s).
Old 05-05-2014, 06:09 PM   #7
nucacidhunter
Senior Member
 
Location: Iran

Join Date: Jan 2013
Posts: 1,060

Any library that can be sequenced on Illumina systems would have a poly-A tract after running through the adapters. Poly-T is used as a spacer between the flow cell surface and the oligos complementary to the flow-cell-binding motifs of adapters P7 and P5. In this case, the sequences following "CTCTGTGTAGATCTCGGTGGTCGCCGTATCATT" (P5 partial complement) are non-template sequences.
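For anyone who wants to check their own reads, here is a rough Python sketch that looks for that sequence and reports what follows it. The function names and the example read are made up for illustration, and it only does exact matching, so reads with sequencing errors inside the adapter would be missed.

```python
# Sequence quoted in the post above (P5 partial complement).
P5_PARTIAL_COMPLEMENT = "CTCTGTGTAGATCTCGGTGGTCGCCGTATCATT"


def post_adapter_tail(read, adapter=P5_PARTIAL_COMPLEMENT):
    """Return the bases after the first exact adapter match, or None."""
    idx = read.find(adapter)
    if idx == -1:
        return None
    return read[idx + len(adapter):]


def leading_a_run(tail):
    """Count consecutive A bases at the start of the tail."""
    return len(tail) - len(tail.lstrip("A"))


# Hypothetical read: short insert, adapter, then non-template bases.
read = "ACGTACGT" + P5_PARTIAL_COMPLEMENT + "AAAAAAAAAACAAC"
tail = post_adapter_tail(read)
print(tail, leading_a_run(tail))  # AAAAAAAAAACAAC 10
```

In real data the tail would be expected to start with a run of A; in the simNGS output discussed in this thread it would presumably look random instead.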
Old 05-06-2014, 12:27 AM   #8
canerb
Junior Member
 
Location: izmir

Join Date: May 2014
Posts: 2

Thank you, I appreciate all the answers.

GenoMax, yeah, I am surprised by that as well. They seem to have missed the poly-A's, though.

Tags
adapter, contamination, ngs
