SEQanswers

01-15-2014, 09:37 AM   #1
clintp
Member
 
Location: Georgia

Join Date: Apr 2013
Posts: 19
MiSeq: Trimming and sequencing primers at the beginning of a read

I noticed that after trimming my MiSeq reads for adapter contamination (removing the 3' portion of the read), I can still grep the trimmed reads for ACACTCTTTCCCTACACGAC (the sequencing primer/adapter sequence) and find several thousand hits at the 5' end of Read 1 reads. These shouldn't be there, right? What am I missing?

I used fastq-mcf to trim the 13bp common TruSeq sequence AGATCGGAAGAGC.

Primer sequences do not appear at the beginning of Read 2 reads. In the sample sheet, I did not request any onboard trimming by the MiSeq. For library prep, I used NEBNext Ultra, whose adapters, sequencing primers, and indices are the same as TruSeq's.

So, my questions are: 1) Why am I getting primer sequences in Read 1? and 2) Is the 13 bp sequence sufficient for trimming Illumina reads, or should I be doing this differently? The reads are used for de novo assembly and BLAST-based binning, so aggressively removing adapter sequences is important to me.
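For anyone who wants to reproduce the grep check with a bit more detail, here is a minimal Python sketch that counts how many reads contain the primer sequence and how many start with it. It assumes plain 4-line FASTQ records (no wrapped sequence lines); the function names `count_primer_hits` and `make_fastq` are made up for illustration and are not part of any trimming tool.

```python
import io

# Sequence quoted in the post above (sequencing primer/adapter).
PRIMER = "ACACTCTTTCCCTACACGAC"


def count_primer_hits(fastq_handle, primer=PRIMER):
    """Return (total_reads, reads_starting_with_primer, reads_containing_primer).

    Assumes 4-line FASTQ records, i.e. the sequence is always line 2 of 4.
    """
    total = at_start = anywhere = 0
    for line_number, line in enumerate(fastq_handle):
        if line_number % 4 != 1:  # skip header, '+', and quality lines
            continue
        seq = line.strip().upper()
        total += 1
        if primer in seq:
            anywhere += 1
            if seq.startswith(primer):
                at_start += 1
    return total, at_start, anywhere


def make_fastq(seqs):
    """Build a toy FASTQ string (hypothetical helper, for the demo only)."""
    return "".join(
        f"@read{i}\n{s}\n+\n{'I' * len(s)}\n" for i, s in enumerate(seqs, 1)
    )


if __name__ == "__main__":
    demo = make_fastq([
        PRIMER + "GCTCTTCC",   # primer at the 5' end
        "GGGG" + PRIMER,       # primer present, but not at the start
        "TTTTGGGGCCCCAAAA",    # no primer
    ])
    print(count_primer_hits(io.StringIO(demo)))
```

To run it on real data, open the trimmed Read 1 file and pass the handle to `count_primer_hits` instead of the toy `StringIO` object.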
01-15-2014, 09:46 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,800

First part could be explained by having adapter/primer dimers without any insert.

As for trimming, give Trimmomatic (http://www.usadellab.org/cms/?page=trimmomatic), cutadapt (http://code.google.com/p/cutadapt/), or Trim Galore (http://www.bioinformatics.babraham.a...s/trim_galore/) a try. A recent comparison of trimmers is available at http://www.plosone.org/article/info:...l.pone.0085024.
01-15-2014, 11:13 AM   #3
clintp
Member
 
Location: Georgia

Join Date: Apr 2013
Posts: 19

I like the idea of trimmomatic, but I can't seem to make it trim the adapters--they still show up after the following:

Code:
java -classpath /opt/Trimmomatic-0.32/trimmomatic-0.32.jar org.usadellab.trimmomatic.TrimmomaticPE -threads 8 -trimlog TT.log Pool1_S1_L001_R1_001.fastq Pool1_S1_L001_R2_001.fastq p1r1_TT.fastq p1r1_To.fastq p1r2_TT.fastq p1r2_To.fastq LEADING:3 TRAILING:3 ILLUMINACLIP:adapter_13.fa:2:30:10 SLIDINGWINDOW:4:15 MINLEN:16
I may have the parameters set funny, but I don't know the best way to set it. My adapter sequence is the 13bp common Illumina sequence--is 13bp not scoring high enough to get trimmed?
01-15-2014, 11:50 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,800

Are you using the raw data for trimming? Why not use the TruSeq3 (PE) adapters that Trimmomatic includes (you will find those files in "Trimmomatic-0.30/adapters/") as the ILLUMINACLIP input?
01-15-2014, 12:22 PM   #5
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

@clintp

I think the thresholds you are using for the ILLUMINACLIP step (2:30:10) are too high for Trimmomatic to recognize a match to a 13-base adapter sequence.

You need to either change the values or use a longer adapter sequence.

See the trimmomatic web page,

http://www.usadellab.org/cms/?page=trimmomatic

particularly the discussion under the heading 'Adapter Fasta', from which I have extracted this quote:

'The thresholds used are a simplified log-likelihood approach. Each matching base adds just over 0.6, while each mismatch reduces the alignment score by Q/10. Therefore, a perfect match of a 12 base sequence will score just over 7'
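To make the arithmetic in that quote concrete, here is a rough Python sketch. It assumes a flat 0.6 per matching base (the quote says "just over 0.6", so real scores are slightly higher) and that the last number in 2:30:10 is the threshold applied to a plain adapter match; the function names are made up for illustration. Under those assumptions, a perfect 13-base match scores about 7.8, which is below 10, so the adapter is never clipped.

```python
import math

# Approximate per-base score taken from the quoted text ("just over 0.6").
# This is an assumption for illustration, not Trimmomatic's exact constant.
MATCH_SCORE = 0.6


def perfect_match_score(n_bases, match_score=MATCH_SCORE):
    """Approximate score of a mismatch-free alignment of n_bases."""
    return n_bases * match_score


def min_bases_for_threshold(threshold, match_score=MATCH_SCORE):
    """Approximate adapter length needed for a perfect match to reach threshold."""
    return math.ceil(threshold / match_score)


if __name__ == "__main__":
    threshold = 10  # last value in ILLUMINACLIP:adapter_13.fa:2:30:10
    for length in (12, 13):
        score = perfect_match_score(length)
        verdict = "passes" if score >= threshold else "fails"
        print(f"{length} bases: score ~{score:.1f}, {verdict} threshold {threshold}")
    print("bases needed:", min_bases_for_threshold(threshold))
```

So with a 13-base adapter the options are the two already mentioned: lower the threshold to something a 13-base match can reach, or supply a longer adapter sequence (roughly 17 bases or more for a threshold of 10).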
01-15-2014, 01:38 PM   #6
clintp
Member
 
Location: Georgia

Join Date: Apr 2013
Posts: 19

@mastal
Yep, understanding the cutoff scores helped a lot (durrr). Somehow I missed that discussion on the trimmomatic page.

@GenoMax
Thanks for that reference--very useful. It's too bad they didn't include ea-utils/FastqMcf in that analysis, though.
Tags
miseq, trimming
