01-27-2012, 05:29 AM   #4
Senior Member
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 625

We find that using the first 13 bp of the Illumina adapter ('AGATCGGAAGAGC') efficiently removes adapter contamination from both files of a paired-end run (the adapters on both sides share this sequence before they fork, and any of the Illumina multiplex barcodes should be further downstream of that).

A typical command for Cutadapt could be

./cutadapt -f fastq -O $stringency -q 20 -a AGATCGGAAGAGC input_file.fastq

$stringency defines the minimum overlap (in bp) between the end of the read and the adapter that is required before any sequence is removed; the default is 3, I believe. Because of the -q 20 option, this command would remove poor quality sequence as well as adapters from your FastQ file.
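To make the role of the overlap setting a bit more concrete, here is a minimal Python sketch of 3' adapter trimming with a minimum overlap. It is a simplification and not Cutadapt's actual algorithm: it only accepts exact matches, whereas Cutadapt tolerates a configurable error rate. The function name trim_3prime_adapter and its min_overlap parameter are made up for illustration.

```python
ADAPTER = "AGATCGGAAGAGC"  # first 13 bp of the Illumina adapter


def trim_3prime_adapter(read, adapter=ADAPTER, min_overlap=3):
    """Remove an exact 3' adapter match from a read (illustrative only).

    The adapter may occur in full inside the read, or only partially at
    the very end of the read. A partial match is trimmed only if it
    covers at least min_overlap bases.
    """
    for start in range(len(read)):
        # Number of adapter bases that fit between this position and the read end.
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # remaining candidates are shorter than the required overlap
        if read[start:start + overlap] == adapter[:overlap]:
            return read[:start]
    return read


# Full adapter inside the read: everything from the adapter onwards is removed.
print(trim_3prime_adapter("ACGTACGTAGATCGGAAGAGCTTT"))      # ACGTACGT
# Only 5 bp of adapter at the read end: trimmed with the default overlap of 3 ...
print(trim_3prime_adapter("ACGTACGTAGATC"))                 # ACGTACGT
# ... but left alone when a longer overlap is required.
print(trim_3prime_adapter("ACGTACGTAGATC", min_overlap=6))  # ACGTACGTAGATC
```

A low overlap removes more adapter remnants but also trims a few bases from reads that merely end in 'AGA' by chance; a higher value is more conservative.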

The one thing to be careful with is the option of removing sequences if they become too short: dropping a read from only one of the two files throws off the sequence-by-sequence order of paired-end files, which is required by many aligners.
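One way around the ordering problem is to trim both files without discarding anything, and then apply the length filter to both mates at once. The Python sketch below shows the idea under some assumptions: both trimmed files contain the same reads in the same order, records are plain 4-line FastQ, and the names parse_fastq, filter_pairs and the min_length cutoff are hypothetical, not part of Cutadapt.

```python
import itertools


def parse_fastq(lines):
    """Yield (header, sequence, plus, quality) tuples from 4-line FastQ records."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in itertools.islice(it, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FastQ record")
        yield tuple(record)


def filter_pairs(reads_1, reads_2, min_length=20):
    """Keep a read pair only if BOTH mates are at least min_length long.

    Dropping the pair as a unit keeps the two outputs in the same order.
    """
    for r1, r2 in zip(reads_1, reads_2):
        if len(r1[1]) >= min_length and len(r2[1]) >= min_length:
            yield r1, r2


def write_filtered_pairs(in_1, in_2, out_1, out_2, min_length=20):
    """Filter two trimmed FastQ files in lockstep (file paths are placeholders)."""
    with open(in_1) as f1, open(in_2) as f2, \
            open(out_1, "w") as o1, open(out_2, "w") as o2:
        pairs = filter_pairs(parse_fastq(f1), parse_fastq(f2), min_length)
        for r1, r2 in pairs:
            o1.write("\n".join(r1) + "\n")
            o2.write("\n".join(r2) + "\n")
```

Calling write_filtered_pairs on the two trimmed files (with whatever output names you like) gives two files that still line up read for read.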

I hope this helps
fkrueger