SEQanswers

Old 02-20-2014, 09:35 AM   #81
jcorn427
Junior Member
 
Location: Phoenix, AZ

Join Date: Jan 2014
Posts: 9
Default

So, that one run completed successfully. I went to run it again on the next set of data and now I get a new exception.

Code:
Exception in thread "main" java.lang.NullPointerException
        at org.usadellab.trimmomatic.fastq.FastqParser.parseOne(FastqParser.java:57)
        at org.usadellab.trimmomatic.fastq.FastqParser.next(FastqParser.java:106)
        at org.usadellab.trimmomatic.TrimmomaticPE.processSingleThreaded(TrimmomaticPE.java:56)
        at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:275)
        at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:347)
        at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:23)
Any hints as to what this might be would be great. Thanks for all of your help.
Old 02-21-2014, 01:57 AM   #82
tonybolger
Senior Member
 
Location: berlin

Join Date: Feb 2010
Posts: 156
Default

Quote:
Originally Posted by jcorn427 View Post
So, that one run completed successfully. I went to run it again on the next set of data and now I get a new exception.

Code:
Exception in thread "main" java.lang.NullPointerException
        at org.usadellab.trimmomatic.fastq.FastqParser.parseOne(FastqParser.java:57)
        at org.usadellab.trimmomatic.fastq.FastqParser.next(FastqParser.java:106)
        at org.usadellab.trimmomatic.TrimmomaticPE.processSingleThreaded(TrimmomaticPE.java:56)
        at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:275)
        at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:347)
        at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:23)
Any hints as to what this might be would be great. Thanks for all of your help.
Generally this is caused by a partial record at the end of the file (FASTQ records always come in groups of 4 lines). Blank lines may also cause it.
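
For anyone wanting to check a file for either problem before running Trimmomatic, here is a minimal sketch. It is not part of Trimmomatic; the function name `check_fastq` is made up for illustration, and it assumes an uncompressed FASTQ file with exactly four lines per record (no wrapped sequences, no zero-length reads).

```python
def check_fastq(path):
    """Return a list of problems found in an uncompressed 4-line FASTQ file.

    Assumption: every record is exactly 4 lines, so any blank line is
    reported as a problem (a zero-length read would also trigger this).
    """
    problems = []
    line_count = 0
    with open(path) as handle:
        for line_count, line in enumerate(handle, start=1):
            text = line.rstrip("\r\n")
            if text == "":
                problems.append(f"line {line_count}: blank line")
            elif line_count % 4 == 1 and not text.startswith("@"):
                problems.append(f"line {line_count}: header does not start with '@'")
            elif line_count % 4 == 3 and not text.startswith("+"):
                problems.append(f"line {line_count}: separator does not start with '+'")
    # A complete file has a line count that is a multiple of 4.
    if line_count % 4 != 0:
        problems.append(f"{line_count} lines in total: last record is incomplete")
    return problems
```

An empty list means none of these simple checks failed; it does not guarantee that the file is otherwise valid.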
Old 03-19-2014, 11:20 PM   #83
bharat_iyengar
Member
 
Location: Delhi, India

Join Date: Dec 2012
Posts: 20
Default

Is it possible to alter the minimum seed length used for adapter clipping in Trimmomatic SE mode?

If there is no explicit option for this, which of the source files has to be edited?
Old 04-13-2014, 11:33 PM   #84
shangzhong0619
Member
 
Location: La Jolla

Join Date: Nov 2013
Posts: 17
Default How does sliding window work

I was just wondering how the sliding window in Trimmomatic works.
The definition says it scans from the 5' end of the read and removes the 3' end of the read when the average quality of a group of bases drops below a specified threshold.
For example, say we have the sequence ATCGATCGATCG and we set SLIDINGWINDOW:4:15.
It begins with the first 4 bases (ATCG) in a window, but if the score is below 15, which base will it trim? Is it the last base in this window? What is the next start position of the window, 2 or 5? Thanks.
Old 04-16-2014, 06:54 AM   #85
tonybolger
Senior Member
 
Location: berlin

Join Date: Feb 2010
Posts: 156
Default

Quote:
Originally Posted by shangzhong0619 View Post
I was just wondering how the sliding window in Trimmomatic works.
The definition says it scans from the 5' end of the read and removes the 3' end of the read when the average quality of a group of bases drops below a specified threshold.
For example, say we have the sequence ATCGATCGATCG and we set SLIDINGWINDOW:4:15.
It begins with the first 4 bases (ATCG) in a window, but if the score is below 15, which base will it trim? Is it the last base in this window? What is the next start position of the window, 2 or 5? Thanks.
The sliding window moves by one position each time. So it starts with positions 1-4, if these are ok, then tries positions 2-5.

Once a 'window' falls below the required average quality, all bases beyond that point are removed, as well as any bases at the end of the window which are below the required quality, until a base of the required quality is found.

This can result in the final trimmed read including none, some or (in very unusual circumstances) all the bases within the failed window - but typically around half the window will be kept.

Hope this helps,

Tony.
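
The behaviour described above can be sketched roughly as follows. This is an illustrative reading of the description in this post, not Trimmomatic's actual source code; the function name, the choice to stop back-trimming at the start of the failed window, and the handling of reads shorter than the window (returned unchanged here) are all assumptions.

```python
def sliding_window_keep_length(quals, window=4, required=15):
    """Return how many bases to keep from the 5' end of a read.

    quals    -- list of per-base quality scores (integers)
    window   -- number of bases per window
    required -- minimum average quality a window must reach
    """
    n = len(quals)
    # The window advances one position at a time: 1-4, 2-5, 3-6, ...
    for start in range(0, n - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < required:
            # Everything beyond the failed window is dropped.
            end = start + window
            # Then trim low-quality bases from the end of the window
            # until a base of the required quality is found.
            while end > start and quals[end - 1] < required:
                end -= 1
            return end
    # No window failed: keep the whole read.
    return n
```

For example, with qualities `[30, 30, 30, 30, 30, 30, 5, 5, 5, 5]`, a window of 4 and a threshold of 15, the first failing window covers positions 6-9 (1-based), and only the first base of that window is kept, giving a kept length of 6.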
Old 04-24-2014, 09:55 AM   #86
Lays Cruz
Junior Member
 
Location: Brazil

Join Date: Apr 2014
Posts: 4
Default

Hi all.
I'm having problems with the Trimmomatic outputs: the PE files contain different numbers of reads, which should not happen.
Does anyone know what I can do?
Thanks.
Old 04-24-2014, 10:03 AM   #87
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,022
Default

Quote:
Originally Posted by Lays Cruz View Post
Hi all.
I'm having problems with the Trimmomatic outputs: the PE files contain different numbers of reads, which should not happen.
Does anyone know what I can do?
Thanks.
Are you trimming the files together using the PE option?
Old 04-24-2014, 10:10 AM   #88
Lays Cruz
Junior Member
 
Location: Brazil

Join Date: Apr 2014
Posts: 4
Default

Yes. My data is Illumina MiSeq and my command line is as follows:
#java -jar /usr/local/bin/trimmomatic-0.30.jar PE -threads 4 -phred33 ./jatoba/Hst_S2_L001_R1_001.fastq ./jatoba/Hst_S2_L001_R2_001.fastq ./jatoba/fq/Hst_S2_PE_1p.fq ./jatoba/fq/Hst_S2_SR_1p.fq ./jatoba/fq/Hst_S2_PE_2p.fq ./jatoba/fq/Hst_S2_SR_2p.fq LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 HEADCROP:18 MINLEN:20

Thanks for answers.
Old 04-24-2014, 10:38 AM   #89
Lays Cruz
Junior Member
 
Location: Brazil

Join Date: Apr 2014
Posts: 4
Default

Quote:
Originally Posted by GenoMax View Post
Are you trimming the files together using the PE option?
Yes. My data is from an Illumina MiSeq and my command line is as follows:
#java -jar /usr/local/bin/trimmomatic-0.30.jar PE -threads 4 -phred33 ./jatoba/Hst_S2_L001_R1_001.fastq ./jatoba/Hst_S2_L001_R2_001.fastq ./jatoba/fq/Hst_S2_PE_1p.fq ./jatoba/fq/Hst_S2_SR_1p.fq ./jatoba/fq/Hst_S2_PE_2p.fq ./jatoba/fq/Hst_S2_SR_2p.fq LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 HEADCROP:18 MINLEN:20

Thanks.
Old 04-24-2014, 10:57 AM   #90
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Hi all,

I'm testing Trimmomatic's performance on adapter removal, and the results are mysteriously bad. So I'd like to make sure I'm not doing anything wrong. This is my command line (modified from the website):

java -Xmx8g -jar trimmomatic-0.32.jar SE -phred33 dirty.fq tclean.fq ILLUMINACLIP:gruseq.fa:2:30:10

...where dirty.fq is a file containing reads with adapter sequences and gruseq.fa is a file containing the adapter sequences. The adapters are inserted synthetically and the reads are tagged, so I know precisely what the correct results should be, and what I'm getting is not really close. Any suggestions?
Old 04-24-2014, 11:17 AM   #91
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,022
Default

Quote:
Originally Posted by Lays Cruz View Post
Yes. My data is from an Illumina MiSeq and my command line is as follows:
#java -jar /usr/local/bin/trimmomatic-0.30.jar PE -threads 4 -phred33 ./jatoba/Hst_S2_L001_R1_001.fastq ./jatoba/Hst_S2_L001_R2_001.fastq ./jatoba/fq/Hst_S2_PE_1p.fq ./jatoba/fq/Hst_S2_SR_1p.fq ./jatoba/fq/Hst_S2_PE_2p.fq ./jatoba/fq/Hst_S2_SR_2p.fq LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 HEADCROP:18 MINLEN:20

Thanks.
How about just running:

Code:
#java -jar /usr/local/bin/trimmomatic-0.30.jar PE -threads 4 -phred33 -basein jatoba/Hst_S2_L001_R1_001.fastq -baseout ./jatoba/Hst_S2_trim LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 HEADCROP:18 MINLEN:20
BTW: Are there 3 pairs of samples or are you just providing names of files for holding the output?
Old 04-24-2014, 11:21 AM   #92
Lays Cruz
Junior Member
 
Location: Brazil

Join Date: Apr 2014
Posts: 4
Default

Quote:
Originally Posted by GenoMax View Post
How about just running:

Code:
#java -jar /usr/local/bin/trimmomatic-0.30.jar PE -threads 4 -phred33 -basein jatoba/Hst_S2_L001_R1_001.fastq -baseout ./jatoba/Hst_S2_trim LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 HEADCROP:18 MINLEN:20
BTW: Are there 3 pairs of samples or are you just providing names of files for holding the output?
No, it all comes from a single sample. The other files are the outputs: two PE files with the paired sequences and two SR files with the single reads removed from the R1 and R2 files.

Much appreciate your suggestion, I will try now.

Thanks.
Old 04-24-2014, 11:29 AM   #93
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,022
Default

Quote:
Originally Posted by Brian Bushnell View Post
Hi all,

I'm testing Trimmomatic's performance on adapter removal, and the results are mysteriously bad. So I'd like to make sure I'm not doing anything wrong. This is my command line (modified from the website):

java -Xmx8g -jar trimmomatic-0.32.jar SE -phred33 dirty.fq tclean.fq ILLUMINACLIP:gruseq.fa:2:30:10

...where dirty.fq is a file containing reads with adapter sequences and gruseq.fa is a file containing the adapter sequences. The adapters are inserted synthetically and the reads are tagged, so I know precisely what the correct results should be, and what I'm getting is not really close. Any suggestions?
The command looks ok, except that you are not providing a minimum adapter length (which defaults to 8). In what way is the output "mysteriously bad"?
Old 04-24-2014, 12:03 PM   #94
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by GenoMax View Post
The command looks ok, except that you are not providing a minimum adapter length (which defaults to 8). In what way is the output "mysteriously bad"?
Most of the adapters didn't get removed.
Old 05-27-2014, 03:58 AM   #95
Corydoras
Member
 
Location: Norwich

Join Date: Jan 2014
Posts: 20
Default

Hi,

Sorry to hijack this thread! I have paired-end 150bp RAD sequencing data that I am currently cleaning with Trimmomatic. I just have two quick questions to make sure I am not going wrong anywhere.

1) By the looks of it, most of my adapter contamination occurs within the read, i.e. I have 'P1-sequence-P1-sequence'. In this case, the simple alignment mode will trim the read back to the start of the second P1 adapter, leaving me with just 'P1-sequence', am I correct? I would actually prefer Trimmomatic to discard these reads entirely, as the reverse read will in all likelihood not come from the same locus as the surviving part of the forward read. Is there an option to do this? The palindrome mode does not appear to pick up these within-read sequences.

2) Just as a very general question, I also appear to have quite a bit of reverse-complement adapter contamination in my data set and I was wondering if anybody has experience with this for RAD data? Is this something I need to worry about?

Many thanks in advance for any feedback

Sarah
Old 06-01-2014, 07:33 AM   #96
patouch74
Member
 
Location: France

Join Date: May 2014
Posts: 16
Default

Hi,

I've chosen Trimmomatic for my study on read processing.
However, I have to justify why I chose this tool over other preprocessing tools.

Are there any articles comparing Trimmomatic to other tools (apart from the article by Trimmomatic's authors) that I could draw on?

thanks
Old 06-01-2014, 08:12 AM   #97
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

I have a comparison of adapter trimming here:

http://seqanswers.com/forums/showthread.php?t=42776

...though it's not published or peer-reviewed. I'll be doing another comparison of quality-trimming soon.

Also, here's a paper comparing quality-trimming methods:

http://www.plosone.org/article/info%...l.pone.0085024

Last edited by Brian Bushnell; 06-01-2014 at 08:15 AM.
Old 06-04-2014, 07:54 AM   #98
Corydoras
Member
 
Location: Norwich

Join Date: Jan 2014
Posts: 20
Default

Just in case anybody ever has a similar problem or is confused and stumbles across my post:

Looking more closely at my files and at where the reverse adapter contamination occurred, it became obvious that the reverse-complement sequences were simply adapter read-through, and everything that followed was nonsense, which Trimmomatic then removed perfectly. This means that in roughly 3% of cases my fragments were too short for the 150bp HiSeq run and the RAD size selection did not work perfectly, but considering it is only 3% and it was my first set of libraries, I am fairly happy with that.

Above I stated that I was concerned the forward read would not match the reverse read. Now I believe this is only the case in 100-1000 fragments that consist of tiny fragments with adapter ligating to other tiny fragments of adapter. The majority of the contamination however presents itself in reverse complementary form.

This all obviously rests upon the understanding that when adapter read-through occurs, it will be reverse complementary of the P2 adapters in the forward reads, and reverse complementary of the P1 adapters in the reverse reads. Please feel free to point out if there is something wrong with my logic!
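
A small sketch of how one might look for this kind of read-through by searching a read for the reverse complement of an adapter. The function names are made up for illustration, only exact matches are found (no mismatches or partial adapters at the read end), and any adapter sequence passed in is the caller's own; none is taken from this thread.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_read_through(read, adapter):
    """Return the index in `read` where the reverse complement of
    `adapter` starts, or -1 if there is no exact match."""
    return read.find(reverse_complement(adapter))
```

Anything from the returned index onwards would then be adapter plus whatever follows it, which is the part a trimmer is expected to remove.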
Old 06-06-2014, 12:40 AM   #99
tsangkl
Member
 
Location: Hong Kong

Join Date: Jun 2014
Posts: 13
Default

Hi, I have found Trimmomatic very useful.
It works well in single-end mode with my HiSeq data, using the Nextera PE adapters.
But the output looks quite strange in paired-end mode:

My output after trimming:
Input Read Pairs: 12484647 Both Surviving: 4943420 (39.60%) Forward Only Surviving: 7297375 (58.45%) Reverse Only Surviving: 16245 (0.13%) Dropped: 227607 (1.82%)

It seems that the numbers of forward and reverse reads surviving after trimming are very unbalanced.
What would cause this?
Thanks.
Old 06-06-2014, 12:58 AM   #100
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667
Default

In trimmomatic's default mode, when it trims adapters from paired end reads, it drops the second read of the pair, because the two reads are reverse complements of each other, so the second read doesn't add any extra information.

In newer versions of trimmomatic this can be turned off, so that it keeps both reads after trimming adapters from paired reads. You need to specify 'TRUE' for the <keepBothReads> parameter of the ILLUMINACLIP command.

ILLUMINACLIP:<fastaWithAdaptersEtc>:<seed mismatches>:<palindrome clip threshold>:<simple clip threshold>:<minAdapterLength>:<keepBothReads>
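
As a concrete illustration of the syntax above, here is a small helper that assembles the step string. The helper itself is hypothetical (it is not part of Trimmomatic), and `adapters.fa` in the usage example is a placeholder file name; the values 2:30:10 and the minimum adapter length of 8 are the ones mentioned earlier in this thread.

```python
def illuminaclip_step(fasta, seed_mismatches, palindrome_threshold,
                      simple_threshold, min_adapter_length=None,
                      keep_both_reads=None):
    """Build an ILLUMINACLIP step string from its colon-separated fields.

    The two trailing fields are optional, but keepBothReads comes after
    minAdapterLength, so it cannot be given without it.
    """
    if keep_both_reads is not None and min_adapter_length is None:
        raise ValueError("keep_both_reads requires min_adapter_length")
    fields = [fasta, seed_mismatches, palindrome_threshold, simple_threshold]
    if min_adapter_length is not None:
        fields.append(min_adapter_length)
    if keep_both_reads is not None:
        fields.append("TRUE" if keep_both_reads else "FALSE")
    return "ILLUMINACLIP:" + ":".join(str(field) for field in fields)


# Keep both reads of a pair after adapter clipping.
step = illuminaclip_step("adapters.fa", 2, 30, 10, 8, True)
```

Here `step` is `ILLUMINACLIP:adapters.fa:2:30:10:8:TRUE`, which would go on the command line in place of a shorter ILLUMINACLIP step.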

Last edited by mastal; 06-06-2014 at 01:12 AM.