SEQanswers

Old 11-27-2012, 04:51 AM   #1
tahamasoodi
Success
 
Location: India

Join Date: May 2012
Posts: 130
Default Trimmomatic explanation

Can anybody explain the following Trimmomatic command?

java -classpath trimmomatic-0.15.jar org.usadellab.trimmomatic.TrimmomaticPE s_1_1_sequence.txt.gz s_1_2_sequence.txt.gz lane1_forward_paired.fq.gz lane1_forward_unpaired.fq.gz lane1_reverse_paired.fq.gz lane1_reverse_unpaired.fq.gz ILLUMINACLIP:illuminaClipping.fa:2:40:15 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
__________________
Thanks,
Old 11-27-2012, 06:58 AM   #2
DunderChief
Junior Member
 
Location: Baltimore, MD

Join Date: Aug 2012
Posts: 6
Default

All of this is explained in detail on the Trimmomatic website, but in general this will:

1. Clip Illumina adapters
2. then trim leading nucleotides whose quality is below 3
3. then trim trailing nucleotides whose quality is below 3
4. then scan the read with a sliding window of 4 nucleotides and cut it once the average quality in the window drops below 15
5. finally remove any remaining reads that are shorter than 36 nt.

You should really try looking at their documentation before posting questions here though.
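
For reference, the arguments after the file names are trimming steps of the form NAME:arg1:arg2, and Trimmomatic applies them in the order they appear on the command line. A minimal Python sketch (not part of Trimmomatic; `parse_steps` is a hypothetical helper made up for illustration) that splits the step string from post #1 into names and parameters:

```python
def parse_steps(step_string):
    """Split 'NAME:a:b ...' trimming steps into (name, [args]) pairs, in order."""
    steps = []
    for token in step_string.split():
        # The first field is the step name, the rest are its parameters.
        name, *args = token.split(":")
        steps.append((name, args))
    return steps


steps = parse_steps(
    "ILLUMINACLIP:illuminaClipping.fa:2:40:15 LEADING:3 TRAILING:3 "
    "SLIDINGWINDOW:4:15 MINLEN:36"
)
for name, args in steps:
    print(name, args)
```

As far as I read the Trimmomatic manual, the three numbers after the adapter file in ILLUMINACLIP are the seed mismatches (2), the palindrome clip threshold (40) and the simple clip threshold (15).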
Old 03-20-2013, 07:43 AM   #3
modi2020
Member
 
Location: New York

Join Date: May 2012
Posts: 22
Default

Hi,

I was wondering, for steps 2 and 3, how many nucleotides would it trim from the start or end of the sequence? My intuition tells me it's just one, but I am not sure.

Old 03-20-2013, 08:35 AM   #4
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381
Default

"trim the leading nucleotides until quality > 3"
It's pretty self-explanatory: it will continue to trim bases that have a quality score lower than 3 until it hits a base where the score is 3 or greater.
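
So the number of bases removed is not fixed at one; it depends on how many consecutive low-quality bases sit at each end. A minimal Python sketch of that logic, assuming the qualities are already decoded to integers (`trim_leading` and `trim_trailing` are illustrative names, not Trimmomatic's own code):

```python
def trim_leading(quals, threshold):
    """Drop bases from the start while their quality is below the threshold."""
    start = 0
    while start < len(quals) and quals[start] < threshold:
        start += 1
    return quals[start:]


def trim_trailing(quals, threshold):
    """Drop bases from the end while their quality is below the threshold."""
    end = len(quals)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return quals[:end]


quals = [2, 2, 30, 35, 2, 38, 2]
trimmed = trim_trailing(trim_leading(quals, 3), 3)
print(trimmed)  # [30, 35, 2, 38]
```

Note that the low-quality base in the middle of the read is kept: these two steps only look at the ends.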

This by itself is pretty useless, as I've never seen an Illumina base with a score this low, and you would want to retain only bases with a quality of Q20 at the very minimum.
Old 03-20-2013, 08:42 AM   #5
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667
Default Trimmomatic explanation

LEADING:3 and TRAILING:3 are useful for removing Ns (which have quality < 3) from the ends of the reads.

maria
Old 03-20-2013, 08:48 AM   #6
modi2020
Member
 
Location: New York

Join Date: May 2012
Posts: 22
Default

Thank you, JackieBadger and mastal, for your explanations.
I get the idea now.
Old 03-20-2013, 11:03 AM   #7
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381
Default

Quote:
Originally Posted by mastal View Post
It's useful for removing Ns (which have quality < 3) from the ends of the reads.

maria
But you shouldn't be keeping any base with a quality between Q3 and Q19 anyway.
Wouldn't it be better to trim off the actual "N"s rather than assume they are all below Q3?
Old 03-20-2013, 11:27 AM   #8
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667
Default

Yes, I agree, but I don't think you can do that with trimmomatic.
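
If you did want to remove the actual N calls regardless of their quality score, that could be done as a separate step outside Trimmomatic. A minimal Python sketch, assuming uppercase "N" and a quality string of the same length as the sequence (`strip_terminal_ns` is a hypothetical helper, not a Trimmomatic feature):

```python
def strip_terminal_ns(seq, qual):
    """Remove runs of 'N' from both ends of a read, keeping seq and qual in sync."""
    start = len(seq) - len(seq.lstrip("N"))  # number of leading Ns
    end = len(seq.rstrip("N"))               # index just past the last non-N base
    if end <= start:                         # the read consisted only of Ns
        return "", ""
    return seq[start:end], qual[start:end]


seq, qual = strip_terminal_ns("NNACGTNAN", "##IIII#I#")
print(seq, qual)  # ACGTNA IIII#I
```

Ns inside the read are left alone here; only the ends are touched.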
Old 03-20-2013, 06:10 PM   #9
modi2020
Member
 
Location: New York

Join Date: May 2012
Posts: 22
Default

Hi Jackie,

I tried to run Trimmomatic with the option -phred20 to get reads with quality scores of Q20, but it doesn't seem to like it. It runs with -phred33, though. Their website says that it only accepts phred33 or phred64. Am I wrong, or is there a way to make it accept the -phred20 option?

Thank you
Old 03-20-2013, 06:55 PM   #10
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381
Default

You are just a little confused.
"phred33" and "phred64" refer to the encoding of the quality score information (the ASCII offset, 33 or 64), not to a quality cutoff: http://en.wikipedia.org/wiki/FASTQ_format

Which of the two encodings your data uses depends on the sequencer/software that produced it.

Every encoding covers a range of quality scores and uses a different character to represent each score (see the wiki).

So you first specify your quality encoding (-phredXX) and then, separately, the minimum quality score you want to enforce (e.g. 20) in the trimming steps.
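
To make the difference between encoding and cutoff concrete, here is a minimal Python sketch of how a quality character maps to a score under each offset (`decode_quality` is an illustrative name; the example strings are made up):

```python
def decode_quality(qual_string, offset=33):
    """Convert FASTQ quality characters to integer Phred scores."""
    return [ord(ch) - offset for ch in qual_string]


# The same three scores written in each encoding.
print(decode_quality("#5I", offset=33))  # [2, 20, 40]
print(decode_quality("BTh", offset=64))  # [2, 20, 40]
```

The -phred33 or -phred64 flag only tells Trimmomatic which offset to subtract. The cutoff of 20 is then given to a trimming step, for example LEADING:20 or SLIDINGWINDOW:4:20.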

Last edited by JackieBadger; 03-20-2013 at 06:57 PM.
Old 03-20-2013, 07:50 PM   #11
modi2020
Member
 
Location: New York

Join Date: May 2012
Posts: 22
Default

Thank you for the explanation, Jackie. I get the idea now.
I tried to specify the quality right after the phred option, but it didn't work. To be specific, I used -phred33 20.
Is that what you meant?

Old 03-20-2013, 08:26 PM   #12
modi2020
Member
 
Location: New York

Join Date: May 2012
Posts: 22
Default

Actually, I think I got it.
The way I did it is with the sliding window option. What I think I did is ask it to go through the sequence in windows of 4 bases and take the average score; if the average score is below 20, the read is trimmed there, otherwise it keeps moving. If the sequence left after trimming is shorter than 60, it is removed. I also used the leading and trailing options to drop low-quality leading or trailing bases.
My complete command is as follows:

java -classpath trimmomatic-0.22.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 -trimlog trimmlog_log.txt R1.fastq R2.fastq Output_R1.fq unpaired_output1.fq Output_R2.fq unpairedoutput2.fq LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:60
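
As documented, SLIDINGWINDOW cuts the read once the average quality within the window falls below the threshold, so the bases after the first failing window are discarded too. A simplified Python sketch of SLIDINGWINDOW:4:20 followed by MINLEN:60, assuming decoded integer qualities; the function names are illustrative and the exact cut position inside Trimmomatic may differ from this version:

```python
def sliding_window_trim(quals, window=4, threshold=20):
    """Return how many bases to keep: cut at the first window whose mean is below threshold."""
    for start in range(0, len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < threshold:
            return start
    return len(quals)  # no window failed, keep the whole read


def passes_minlen(kept_length, minlen=60):
    """A read survives MINLEN if it is at least minlen bases long."""
    return kept_length >= minlen


# 70 good bases, a 10-base low-quality stretch, then 20 good bases.
quals = [30] * 70 + [10] * 10 + [35] * 20
keep = sliding_window_trim(quals)
print(keep, passes_minlen(keep))  # 69 True
```

In this made-up read the 20 good bases after the low-quality stretch are lost as well, because the read is cut rather than having one window removed.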