SEQanswers

Old 10-24-2013, 08:46 AM   #1
morning latte
Member
 
Location: MI

Join Date: Jun 2013
Posts: 91
removing adapter sequences

Hello,

I am working with Illumina HiSeq data (100-bp PE) and trying to remove adapter sequences using Trimmomatic. I got the adapter sequences from the sequencing core I used, but when I check the trimmed reads with FastQC, some of the adapter sequences still remain. Any suggestions would be great. Thanks.
Old 10-24-2013, 03:03 PM   #2
jimmybee
Senior Member
 
Location: Adelaide, Australia

Join Date: Sep 2010
Posts: 119

Can you give us an example of the command you ran and the output you're getting? We can't really help unless we get some background.
Old 11-12-2013, 10:45 AM   #3
ahnguyen
Junior Member
 
Location: California

Join Date: Nov 2013
Posts: 1

I am having essentially the same problem as the one originally posted above. I want to remove adapter sequences from Illumina 100 bp PE reads. I ran the following with Trimmomatic:

java -classpath ~/Scripts/Trimmomatic-0.30/trimmomatic-0.30.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 -trimlog trim_2.log R1.fastq R2.fastq T1.fastq T1.unpaired.fastq T2.fastq T2.unpaired.fastq ILLUMINACLIP:adapters.fa:3:40:15 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:15

My adapters.fa file includes the following (in addition to others):
>DNA_primers_1
AGGGAGGACGATGCGG
>DNA_primers_2
CCGCTGGAAGTGACTGACAC
>RNA_linkers_2
GTGTCAGTCACTTCCAGCGG

I ran FastQC prior to trimming the adapters, and the over-represented sequences include:
Sequence Count Percentage
CCGCTGGAAGTGAC... 115468 0.44
AGGGAGGACGATGC... 112267 0.427
CCGCTGGAAGTGAC... 109341 0.416
AGGGAGGACGATGC... 105312 0.401
CCGCTGGAAGTGAC... etc etc
CCGCTGGAAGTGAC...

After running Trimmomatic, I ran FastQC on the new FASTQ files and got the following over-represented sequences:
Sequence Count Percentage
CCGCTGGAAGTGAC... 109151 0.426
AGGGAGGACGATGC... 106128 0.414
CCGCTGGAAGTGAC... 102985 0.402
AGGGAGGACGATGC... 99644 0.389
CCGCTGGAAGTGAC... etc etc
CCGCTGGAAGTGAC...

Isn't this surprising? I think many of the remaining adapter sequences are adapters linked to each other, so they are well represented in the first 10 bp.

- Andrew
Old 11-12-2013, 12:13 PM   #4
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

Hi Andrew (ahnguyen),

I think the adapter sequences you are using (for example the 16 bp DNA_primers_1 and the 20 bp DNA_primers_2) are not long enough for Trimmomatic to recognise a match, given the thresholds you are using (3:40:15, so 40 for palindrome clipping and 15 for simple clipping).

If you look at the Trimmomatic web page,

http://www.usadellab.org/cms/?page=trimmomatic

in the last paragraph of the section titled 'The Adapter Fasta', it explains that 'Each matching base adds just over 0.6' to the score, so even if your read matches a 20 bp adapter sequence perfectly, it would score only about 20 x 0.6 = 12 (and a 16 bp adapter only about 16 x 0.6 = 9.6).
You have set the threshold for simple clipping to 15, a score which none of your reads can reach with these adapters, so Trimmomatic will not recognise any of the reads as containing the adapter sequences you want to trim.
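
To make the arithmetic above concrete, here is a minimal Python sketch. It is not Trimmomatic's own code: it only assumes the "about 0.6 per matching base" figure quoted from the web page, and the helper names `parse_illuminaclip` and `max_simple_score` are made up for illustration. The adapter sequences are the three from the adapters.fa excerpt posted above.

```python
# Rough check: can a perfect adapter match reach the simple-clip threshold?
# ASSUMPTION: each matching base adds ~0.6 to the score (the web page says
# "just over 0.6"), so real scores would be marginally higher than these.

SCORE_PER_MATCH = 0.6


def parse_illuminaclip(step):
    """Split 'ILLUMINACLIP:<fasta>:<seed mismatches>:<palindrome>:<simple>'."""
    fields = step.split(":")
    if fields[0] != "ILLUMINACLIP" or len(fields) < 5:
        raise ValueError("not a complete ILLUMINACLIP step: " + step)
    return {
        "fasta": fields[1],
        "seed_mismatches": int(fields[2]),
        "palindrome_threshold": int(fields[3]),
        "simple_threshold": int(fields[4]),
    }


def max_simple_score(adapter):
    """Approximate best score when every base of the adapter matches."""
    return len(adapter) * SCORE_PER_MATCH


adapters = {
    "DNA_primers_1": "AGGGAGGACGATGCGG",
    "DNA_primers_2": "CCGCTGGAAGTGACTGACAC",
    "RNA_linkers_2": "GTGTCAGTCACTTCCAGCGG",
}

params = parse_illuminaclip("ILLUMINACLIP:adapters.fa:3:40:15")
for name, seq in adapters.items():
    best = max_simple_score(seq)
    verdict = "can" if best >= params["simple_threshold"] else "cannot"
    print(f"{name}: {len(seq)} bp, best score ~{best:.1f}, "
          f"{verdict} reach threshold {params['simple_threshold']}")
```

Under that assumption none of the three adapters can reach a simple clip threshold of 15, which is consistent with FastQC still reporting them after trimming.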
Old 12-12-2013, 04:03 PM   #5
arcolombo698
Senior Member
 
Location: Los Angeles

Join Date: Nov 2013
Posts: 142

How do you create the adapters.fa file?
I have the same problem.
Old 12-13-2013, 01:26 AM   #6
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

The more recent versions of Trimmomatic include adapter FASTA files for Illumina TruSeq v2 and v3.

See the link to the Trimmomatic web page that I gave in the post above if you don't already have Trimmomatic installed on your computer.

Have a look at that, and then if you want to use other adapter sequences you can either add them to the file, or make your own file.
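
If you do make your own file, it is just plain FASTA: a '>name' header line followed by the sequence on the next line, as in the adapters.fa excerpt posted earlier in this thread. A minimal Python sketch for generating that text is below; the function name `format_fasta` is made up for illustration, and the sequences are the ones from the earlier post, so substitute whatever your sequencing core gave you.

```python
# Sketch: build adapter FASTA text from name -> sequence pairs.
# `format_fasta` is a hypothetical helper, not part of Trimmomatic.

def format_fasta(records):
    """Return FASTA text: a '>name' header line, then the sequence line."""
    lines = []
    for name, seq in records.items():
        seq = seq.strip().upper()
        # Reject empty entries and anything that is not a plain DNA base or N.
        if not seq or set(seq) - set("ACGTN"):
            raise ValueError(f"unexpected sequence for {name!r}: {seq!r}")
        lines.append(f">{name}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


custom_adapters = {
    "DNA_primers_1": "AGGGAGGACGATGCGG",
    "DNA_primers_2": "CCGCTGGAAGTGACTGACAC",
    "RNA_linkers_2": "GTGTCAGTCACTTCCAGCGG",
}

fasta_text = format_fasta(custom_adapters)
print(fasta_text, end="")
```

Save the printed text to a file and give that file name in the ILLUMINACLIP step, in the same position as adapters.fa in the command posted above.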

All times are GMT -8.