SEQanswers

08-26-2021, 10:54 PM   #341
emortiz
Junior Member
 
Location: Germany

Join Date: Feb 2015
Posts: 3

I have repeated Brian Bushnell's comparison among adaptor trimmers, this time including the most recent versions of Cutadapt, Trimmomatic, and fastp. Here are my commands:

Code:
# Cutadapt 3.4:
time cutadapt -m 21 -j 0 -b "file:gruseq.fa" -B "file:gruseq.fa" -o cutadapt_R1.fq.gz -p cutadapt_R2.fq.gz dirty_R1.fq.gz dirty_R2.fq.gz

# Trimmomatic 0.39:
time trimmomatic PE -phred33 dirty_R1.fq.gz dirty_R2.fq.gz trimmomatic_R1.fq.gz trimmomatic_U1.fq.gz trimmomatic_R2.fq.gz trimmomatic_U2.fq.gz ILLUMINACLIP:gruseq.fa:2:28:10:2:keepBothReads MINLEN:21

# fastp 0.22.0:
time fastp -w 8 -Q -l 21 --adapter_fasta gruseq.fa --detect_adapter_for_pe --in1 dirty_R1.fq.gz --in2 dirty_R2.fq.gz --out1 fastp_R1.fq.gz --out2 fastp_R2.fq.gz

# bbduk.sh 38.92:
time bbduk.sh in=dirty_R#.fq.gz out=bbduk_R#.fq.gz ref=gruseq.fa ktrim=r mink=12 hdist=1 minlen=21 tpe tbo

# bbduk.sh 38.92 (x2):
time bbduk.sh ktrim=r minlength=21 interleaved=f tpe tbo ref=gruseq.fa in=dirty_R#.fq.gz out=stdout.fq k=21 mink=11 hdist=2 | bbduk.sh ktrim=r minlength=21 interleaved=f tpe tbo ref=gruseq.fa in=stdin.fq out=bbduk_x2_R#.fq.gz k=19 mink=9 hdist=1
And these were the results:
Metric                       dirty     Cutadapt   Trimmomatic  fastp      bbduk     bbduk (x2)
Time to clean                NA        3m42.848s  1m11.250s    1m42.455s  0m9.574s  0m15.249s
Reads retained               100.000   93.345     92.514       92.997     93.002    92.994
Bases retained               100.000   74.53      74.436       73.970     74.268    74.186
Perfectly correct (Reads)    49.970    97.35      80.922       96.256     94.849    95.784
Perfectly correct (Bases)    49.970    96.92      86.426       96.035     93.900    95.099
Incorrect (Reads)            50.030    2.65       19.078       3.744      5.151     4.216
Incorrect (Bases)            50.030    3.08       13.574       3.965      6.100     4.901
Adaptors remaining (Reads)   50.030    2.41       5.846        1.830      3.866     2.798
Adaptors remaining (Bases)   25.182    0.28       0.422        0.049      0.193     0.105
Non-adaptor removed (Reads)  0.000     1.53       13.231       1.914      1.285     1.418
Non-adaptor removed (Bases)  0.000     0.04       0.218        0.566      0.308     0.325
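To make the speed differences easier to compare, the reported timings can be converted to seconds and expressed relative to the single bbduk.sh run. This is only a rough sketch: the helper name to_seconds is made up for illustration, and the times are simply copied from the "Time to clean" row of the results.

```shell
#!/usr/bin/env bash
# Convert a bash `time`-style duration such as "3m42.848s" to seconds.
# (to_seconds is a hypothetical helper, not part of any of the tools.)
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, parts, "m")        # parts[1] = minutes, parts[2] = e.g. "42.848s"
        sub(/s$/, "", parts[2])     # drop the trailing "s"
        printf "%.3f\n", parts[1] * 60 + parts[2]
    }'
}

# Timings copied from the "Time to clean" row of the results.
bbduk=$(to_seconds 0m9.574s)
for entry in Cutadapt:3m42.848s Trimmomatic:1m11.250s fastp:1m42.455s "bbduk(x2):0m15.249s"; do
    tool=${entry%%:*}                   # text before the first ":"
    secs=$(to_seconds "${entry#*:}")    # text after the first ":"
    awk -v tool="$tool" -v s="$secs" -v b="$bbduk" \
        'BEGIN { printf "%-12s %8.3f s  %5.1fx the bbduk time\n", tool, s, s / b }'
done
```

The ratios only describe this one data set and machine, so they are best read as orders of magnitude rather than exact figures.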

I still prefer bbduk.sh for its speed and high accuracy. fastp had slightly higher accuracy, but it sometimes mistakes genomic sequence for adaptor (see this post). However, Cutadapt is now clearly more accurate (though by far the slowest). Can anybody recommend settings that might increase bbduk's accuracy a little more?
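One way to explore this systematically would be to sweep the k, mink and hdist settings and grade each output the same way as the runs in the table. The sketch below only prints the candidate commands (a dry run). The value ranges and the output file naming are arbitrary assumptions for illustration, not recommended settings; all other options are taken from the bbduk.sh commands above.

```shell
#!/usr/bin/env bash
# Print (do not run) one bbduk.sh command per k/mink/hdist combination.
# The ranges below are arbitrary illustrations, not recommended settings,
# and the output file names are a made-up convention.
sweep_bbduk() {
    local k mink hdist
    for k in 19 21 23; do
        for mink in 9 11; do
            for hdist in 1 2; do
                echo "bbduk.sh in=dirty_R#.fq.gz out=bbduk_k${k}_mink${mink}_hd${hdist}_R#.fq.gz" \
                     "ref=gruseq.fa ktrim=r k=${k} mink=${mink} hdist=${hdist} minlen=21 tpe tbo"
            done
        done
    done
}

sweep_bbduk
```

Piping the printed lines to a shell (for example `sweep_bbduk | bash`) would actually run them. Each combination would still need to be graded before drawing any conclusion about accuracy.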

Tags
adapter, bbduk, bbtools, cutadapt, trimmomatic
