SEQanswers

Old 06-14-2014, 08:59 PM   #101
tsangkl
Member
 
Location: Hong Kong

Join Date: Jun 2014
Posts: 13
Default

Hi, I have successfully trimmed the transposase sequence from my HiSeq data using Trimmomatic, but I would like to trim the primer as well. It is only 15 bases long, and Trimmomatic fails to trim it. Do I need to change the seed length of 16 bases? How can that be done? Thanks
Old 06-15-2014, 03:09 AM   #102
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667
Default

What was the trimmomatic command that you used, and what sequences did you use in the adapters fasta file?
Old 06-15-2014, 11:33 AM   #103
kevluv93
Member
 
Location: South Carolina

Join Date: Jun 2014
Posts: 10
Default

Hi! I think I'm entering the Trimmomatic command incorrectly. I'm using the binary download of Trimmomatic, running it on Windows 8 (that might be a problem), with FASTQ files that need paired-end adapter trimming.

Here is the script I'm using:

java -jar C:\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\trimmomatic-0.32.jar PE -phred33 C:\Users\kevluv93\Desktop\L11of2_S1_L001_R1_001.fastq.fq C:\Users\kevluv93\Desktop\L11of2_S1_L001_R2_001.fastq.fq C:\Users\kevluv93\Desktop\output_forward_paired.fq C:\Users\kevluv93\Desktop\output_forward_unpaired.fq C:\Users\kevluv93\Desktop\output_reverse_paired.fq C:\Users\kevluv93\Desktop\output_reverse_unpaired.fq ILLUMINACLIP:C:\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

To make things simple, all the files are on my desktop, so if there is an error in the paths I gave trimmomatic to get the files, keep that in mind.

Here is the error message:

Multiple cores found: Using 4 threads
Trimmomatic PE: Started with arguments: -phred33 C:\Users...
Multiple cores found: Using 4 threads
Exception in thread "main" java.lang.NumberFormatException: For input string: "\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\TruSeq3-PE.fa"
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.makeIlluminaClippingTrimmer(IlluminaClippingTrimmer.java:53)
at org.usadellab.trimmomatic.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:27)

I don't know what is wrong with the script I put in. I used the example on the trimmomatic website as the base for this one, so I thought it would work. Did I input the script correctly? Do the parameters make sense?

New to this, undergrad, greatly appreciative of any input!

Last edited by kevluv93; 06-15-2014 at 11:39 AM.
Old 06-15-2014, 12:02 PM   #104
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667
Default

@kevluv93

I have not used trimmomatic on Windows, but it looks as if it's expecting what comes after 'C:' to be the second parameter for the ILLUMINACLIP command.

Try putting quotes around the path to the fasta file and see if that helps.

Code:
ILLUMINACLIP:'C:\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\TruSeq3-PE.fa':2:30:10
Old 06-15-2014, 12:34 PM   #105
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

You probably have the parameters in the wrong order. Java is parsing "\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\TruSeq3-PE.fa" and expecting an integer, which it is not. I'm not a Trimmomatic expert and I don't know the correct command line, but Trimmomatic should work fine on Windows with the correct command.

If you continue to have problems, send me a pm.
Old 06-15-2014, 12:57 PM   #106
kevluv93
Member
 
Location: South Carolina

Join Date: Jun 2014
Posts: 10
Default

@mastal

Thank you for the quick reply! Unfortunately, it's still displaying the same error message. But at least I know the problem is around there, so that's good news.

So, could there be a way to possibly direct trimmomatic to the file without beginning with C:? Or could this possibly be another issue?

I've tried other symbols like brackets and parentheses just in case 'C:' wasn't it. Same message.
Old 06-15-2014, 01:30 PM   #107
usad
Member
 
Location: aachen

Join Date: Sep 2009
Posts: 53
Default

Hi

Sorry for the inconvenience. At the moment it is probably best to avoid using C: at all, since Trimmomatic treats whatever comes after a ':' as the next parameter.
For example, keep the clip file in the current working directory (yuck).

Best Wishes
Björn
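
To illustrate why the drive letter breaks the ILLUMINACLIP step, here is a minimal Python sketch. It assumes the step is simply split on ':' and that the field after the FASTA path is parsed as an integer, which is what the stack trace earlier in the thread suggests. `parse_illuminaclip` is a hypothetical helper, not Trimmomatic's actual code, and the Windows path is shortened for readability.

```python
def parse_illuminaclip(step):
    """Hypothetical parser: split a trimming step on ':' (assumed behaviour)."""
    fields = step.split(":")
    name, fasta_path = fields[0], fields[1]
    # The field after the FASTA path is expected to be an integer
    # (seed mismatches); int() raises ValueError otherwise.
    seed_mismatches = int(fields[2])
    return name, fasta_path, seed_mismatches


# A relative path contains no extra ':' and parses as intended.
print(parse_illuminaclip("ILLUMINACLIP:TruSeq3-PE.fa:2:30:10"))

# A Windows drive letter adds one more ':', so "C" becomes the path and the
# rest of the path lands in the integer slot.
try:
    parse_illuminaclip(r"ILLUMINACLIP:C:\Users\kevluv93\Desktop\TruSeq3-PE.fa:2:30:10")
except ValueError as err:
    print("failed:", err)
```

Under that assumption, quoting the path would not help, because the colon after the drive letter is still inside the argument.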
Old 06-15-2014, 01:50 PM   #108
kevluv93
Member
 
Location: South Carolina

Join Date: Jun 2014
Posts: 10
Default

I figured it would come to that. I think my best option would be to use Linux and not have to direct trimmomatic using "C:". Don't file paths in Linux all begin with a "/home/..."? I'll try that and see if it works.
Old 06-15-2014, 04:50 PM   #109
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by kevluv93 View Post
I figured it would come to that. I think my best option would be to use Linux and not have to direct trimmomatic using "C:". Don't file paths in Linux all begin with a "/home/..."? I'll try that and see if it works.
No, not really. The path depends on the system... Java works fine in Windows, and as long as the classpath is set correctly, it should work fine anywhere. You just need to set the "-cp" variable correctly.
Old 06-15-2014, 06:19 PM   #110
tsangkl
Member
 
Location: Hong Kong

Join Date: Jun 2014
Posts: 13
Default

Quote:
Originally Posted by mastal View Post
What was the trimmomatic command that you used, and what sequences did you use in the adapters fasta file?
java -jar trimmomatic-0.32.jar PE -threads 11 -trimlog trim_keepbothread.log Hiseq_1.fastq Hiseq_2.fastq Hiseq_1_keeppaired.fastq Hiseq_1_keepunpaired.fastq Hiseq_2_keeppaired.fastq Hiseq_2_keepunpaired.fastq ILLUMINACLIP:NexteraPE-PE.fa:2:30:10:8:TRUE LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

My adapter FASTA file contains the transposase and primer sequences provided by Illumina:

>Trans1_rc
CTGTCTCTTATACACATCTGACGCTGCCGACGA
>Trans2_rc
CTGTCTCTTATACACATCTCCGAGCCCACGAGAC

>Primerprefixi5_rc
GACGCTGCCGACGA
>Primerprefixi7_rc
CCGAGCCCACGAGAC


Thanks
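
For reference, a rough calculation of why such short sequences may never be clipped in simple mode. This sketch assumes the scoring described in the Trimmomatic manual, where each matching base adds log10(4) (about 0.6) to the alignment score, and it uses the simple clip threshold of 10 from the command above. `max_simple_score` is only an illustrative helper name.

```python
import math

MATCH_SCORE = math.log10(4)   # about 0.602 per matching base (per the manual)
SIMPLE_CLIP_THRESHOLD = 10    # the "10" in ILLUMINACLIP:...:2:30:10:8:TRUE


def max_simple_score(sequence):
    """Best possible score: every base of the sequence matches the read."""
    return len(sequence) * MATCH_SCORE


adapters = {
    "Trans1_rc": "CTGTCTCTTATACACATCTGACGCTGCCGACGA",
    "Primerprefixi5_rc": "GACGCTGCCGACGA",
    "Primerprefixi7_rc": "CCGAGCCCACGAGAC",
}

for name, seq in adapters.items():
    score = max_simple_score(seq)
    verdict = "can reach" if score >= SIMPLE_CLIP_THRESHOLD else "cannot reach"
    print(f"{name}: {len(seq)} bases, max score {score:.2f}, {verdict} threshold")

# Shortest perfect match able to reach the threshold.
print(math.ceil(SIMPLE_CLIP_THRESHOLD / MATCH_SCORE))
```

If that scoring assumption holds, the 14- and 15-base primer prefixes top out below 10 even with a perfect match, so lowering the simple clip threshold may be worth trying.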
Old 06-15-2014, 07:22 PM   #111
kevluv93
Member
 
Location: South Carolina

Join Date: Jun 2014
Posts: 10
Default

Could you guys give me an example script of setting up a classpath for "ILLUMINACLIP:" to get access to the adapter files in the trimmomatic binary download? I should probably mention that my experience with scripting languages begins and ends with "making stick figures in JavaScript", so forgive me if I'm asking for baby steps. I'm still using Windows 8 for this trimmomatic job (it's the OS that all the computers in our college use, so I don't really have a choice.)

@ Bjorn, you mentioned "having the clip file in the actual directory", and that trimmomatic looks for documents after ":". With this info, I deleted the "C:" that came after ILLUMINACLIP: but I got an error message telling me, "ArrayIndexOutOfBoundsException: 1".

@Brian Bushnell, you mentioned that trimmomatic will work on windows as long as I specify the classpaths to the files correctly.

So, could someone give me an example of what I need to write after "ILLUMINACLIP:" so that it can find the adapter sequence .fa files saved on my desktop? I'm just right-clicking these icons and copy/pasting their locations into the Trimmomatic command. How else does one make a path to a file?

Thanks for the help! Any input is appreciated!
Old 06-15-2014, 09:31 PM   #112
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

kevluv,

You are not doing anything wrong with Java; your command would fail in any environment. The classpath appears to be correct, you just have the wrong command line.
Old 06-15-2014, 11:48 PM   #113
MikhailFokin
Member
 
Location: NZ

Join Date: Mar 2014
Posts: 15
Default

Hi guys, it seems that Trimmomatic is very useful and performs better than other tools.

The only thing I cannot find, if it exists at all, is a report of whether any adaptor clipping took place. I can analyse the result with FastQC or dig into the log (which still has no clipping info), but is there an obvious place to find the adaptor clipping results?

My second question: what clipping/trimming strategy is applied to internal (junction) Nextera adaptors?
Old 06-16-2014, 12:34 AM   #114
MikhailFokin
Member
 
Location: NZ

Join Date: Mar 2014
Posts: 15
Default

Could you please give a real example of using the -baseout flag?
When I use it like this, "... -phred33 PE S1R1.fatsq S1R2.fastq -baseout outS1.fasta ILLUMINACLIP...", it still tries to find the list of output names afterwards...
Old 06-16-2014, 07:29 AM   #115
kevluv93
Member
 
Location: South Carolina

Join Date: Jun 2014
Posts: 10
Default

Haha! I finally got it to work! It was just some user error on my part.

Changes:
Don't use "C:" at the beginning of files.
It works fine with Windows 8
I removed the ".fq" from the end of my input file names. I thought I was supposed to specify the file type of my input data, but that isn't needed; the file name is enough.

Worked like a charm, it was just user error on my part. Thanks for the help!
Old 06-17-2014, 07:51 AM   #116
usad
Member
 
Location: aachen

Join Date: Sep 2009
Posts: 53
Default

great that it works now.
Old 06-17-2014, 10:14 AM   #117
BFM
Member
 
Location: USA

Join Date: Jun 2014
Posts: 10
Default

Hi, I have used Trimmomatic to clip the adapter sequences, yet I still have problems with the Kmer content and per-base sequence content. My question is: how do we improve the quality if a module fails in the FastQC results?
Old 06-24-2014, 03:24 AM   #118
kevluv93
Member
 
Location: South Carolina

Join Date: Jun 2014
Posts: 10
Default

Hi, back again. When I use trimmomatic I get abnormally high Kmer reads on FastQC, I read out the Kmers and realized that most of my forward adapter was still inside of my cDNA.

I opened the TruSeq2 adapter file and realized that the Prefix PE/1 adapter (I guess that means the forward adapter?) didn't match the forward adapter I was using, which is:
TruSeq Adapter, Index 2
5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG

So I went in and replaced the default forward adapter with the sequence you see above.

Now I have another issue, for some reason trimmomatic has managed to cut 8GB of data into 2GB of data. (if I combine the forward and reverse paired and unpaired files) Fastqc is giving me Kmers that are similar to my sequence, and when I opened the forward paired file I saw that fairly large chunks of my primers are still at the 3' end of my cDNA.

ex.
GATCGGAAGAGCACACGTCTGAACTCCAGTCAC
GATCGGAAGAGCACACGTCTGAACTCC
ATCGGAAGAGCACACGTCTGAACTCCAGTCACCGA

They're just small enough for Trimmomatic to miss. Are there any suggestions you can give me regarding Trimmomatic's settings to help me cut these small bits of adapter off the ends of my reads? Is there a reason why only small fragments of my adapter sequences would be left after using Trimmomatic? Finally, is it usual for such a large portion of the data to be removed when using Trimmomatic, or am I screwing this up? (8GB down to 2GB of data)
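
When posting leftover fragments, it can help to show where each one sits in the adapter. Here is a small Python sketch using only the sequences quoted in this post; `locate` is a hypothetical helper name.

```python
ADAPTER = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG"

FRAGMENTS = [
    "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC",
    "GATCGGAAGAGCACACGTCTGAACTCC",
    "ATCGGAAGAGCACACGTCTGAACTCCAGTCACCGA",
]


def locate(fragment, adapter=ADAPTER):
    """Return the 0-based offset of fragment in adapter, or -1 if absent."""
    return adapter.find(fragment)


for frag in FRAGMENTS:
    # Print fragment length, offset within the adapter, and the fragment.
    print(len(frag), locate(frag), frag)
```

Each of the three fragments matches the start of the adapter (offset 0 or 1).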
Old 06-24-2014, 05:39 AM   #119
tonybolger
Senior Member
 
Location: berlin

Join Date: Feb 2010
Posts: 156
Default

Quote:
Originally Posted by kevluv93 View Post
Hi, back again. When I use trimmomatic I get abnormally high Kmer reads on FastQC, I read out the Kmers and realized that most of my forward adapter was still inside of my cDNA.

I opened the TruSeq2 adapter file and realized that the Prefix PE/1 adapter (I guess that means the forward adapter?) didn't match the forward adapter I was using, which is:
TruSeq Adapter, Index 2
5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG
This sequence looks like a TruSeq-3 adapter. If so, either the TruSeq3-PE or the TruSeq3-PE-2 adapter file should remove the sequencing adapters. The useful part of the reads should survive if the adapters are in the normal position.

If you do have any surviving adapters, can you post a few examples?

Thanks,

Tony.
Old 06-24-2014, 07:34 AM   #120
Mchicken
Member
 
Location: Germany

Join Date: Jan 2014
Posts: 39
Default

Hey guys,
I've got a problem with Trimmomatic:

I have the following 100bp long read:

@HWI-ST365:34625ECACXX:5:1101:3183:2046 1:N:0:TGACCA
NCAGGGGGAACAGGCTGATCTCCCCCAAGAGTCCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCGCAACCTGGGGCGGAAGGACGTCCCC
+
#[email protected]@FHD8)<FHIG8B89>(,5,5(::<CB?'[email protected];5?##############################

I ran only SLIDINGWINDOW:4:15 on this read,

and what I get is:

Log-File:

HWI-ST365:34625ECACXX:5:1101:3183:2046 1:N:0:TGACCA 52 0 52 48

trimmed fastq:

@HWI-ST365:34625ECACXX:5:1101:3183:2046 1:N:0:TGACCA
NCAGGGGGAACAGGCTGATCTCCCCCAAGAGTCCACATCGACGGGGAGGTTT
+
#[email protected]@FHD8)<FHIG8B89>(,5


But when I run my own script on the read, I can see that the window (,5, beginning at position 50 has an average quality of 12.25, which is below the required 15. So the read should survive from position 1 to 49, not to position 52 as determined by Trimmomatic.

So can anyone tell me where I am going wrong?

Thanks
Mchicken
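
The 12.25 figure can be reproduced from the four quality characters alone. This is a minimal Python sketch, assuming Phred+33 encoding (quality = ASCII code minus 33); it only checks the window average and does not model how Trimmomatic picks the cut position.

```python
def phred33(char):
    """Convert one Phred+33 quality character to its numeric score."""
    return ord(char) - 33


def window_mean(quals):
    """Average quality of a window given as a quality substring."""
    scores = [phred33(c) for c in quals]
    return sum(scores) / len(scores)


window = "(,5,"  # the four characters starting at position 50
print([phred33(c) for c in window])  # [7, 11, 20, 11]
print(window_mean(window))           # 12.25
```

Note that the third base of this window has Q20, which is above the threshold of 15 on its own even though the window as a whole is below it.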