SEQanswers

12-01-2013, 08:26 AM   #1
alpha2zee (Member, New York)
Asymmetric trimmomatic output with paired-end RNA-seq data

I have data from two experiments of 6 samples each, both sequenced as 2x100 b paired-end reads on an Illumina HiSeq 2000. One experiment used unstranded RNA-seq libraries and the other used stranded libraries. The average insert/fragment length in both experiments was ~200 b.

I used trimmomatic (0.32; on 64-bit Linux) to remove contaminating adapter sequence as well as poor-quality sub-sequences from the reads.

Code:
java -jar trimmomatic-0.32.jar PE -threads 16 -phred33 sample_1.fastq sample_2.fastq sample_trimmed_paired_1.fastq.gz sample_trimmed_unpaired_1.fastq.gz sample_trimmed_paired_2.fastq.gz sample_trimmed_unpaired_2.fastq.gz ILLUMINACLIP:adapters/TruSeq3-PE-2.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
For both experiments and for all samples, I find that the trimmed_unpaired_1 files are 5-10 times larger (or more) than the trimmed_unpaired_2 files. The trimmed_paired_1 and _2 files are similar in size, as expected. See example file-size listings below.

What could be the reason for this?
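
Compressed file size is only a rough proxy for read count, so one way to check the asymmetry directly is to count the reads in each output file. The following is a minimal Python sketch, not part of Trimmomatic. The helper names count_fastq_reads and unpaired_ratio are made up for illustration, and the code assumes standard four-line FASTQ records.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming exactly 4 lines per record."""
    # Trimmomatic writes gzip output when the file name ends in .gz
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def unpaired_ratio(unpaired_1, unpaired_2):
    """Return reads(unpaired_1) / reads(unpaired_2), or None if the second file is empty."""
    n1 = count_fastq_reads(unpaired_1)
    n2 = count_fastq_reads(unpaired_2)
    return n1 / n2 if n2 else None
```

Calling unpaired_ratio("sample_trimmed_unpaired_1.fastq.gz", "sample_trimmed_unpaired_2.fastq.gz") would then give the asymmetry in reads rather than in bytes.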

12-01-2013, 09:09 AM   #2
lorendarith (Guest)

It could mean that the quality in read 2 was lower than in read 1. That would explain why you retain a lot of unpaired read 1: the corresponding read 2 was dropped because it was trimmed too short or had very low quality.
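
A quick way to test this idea is to compare the average base quality of the two input files. The sketch below is only an illustration under the assumption of Phred+33 encoding (matching the -phred33 flag in the command above). The function names mean_phred33 and file_mean_quality are hypothetical, and a tool such as FastQC gives a much fuller picture.

```python
import gzip


def mean_phred33(quality):
    """Mean Phred score of one FASTQ quality string, assuming Phred+33 encoding."""
    return sum(ord(ch) - 33 for ch in quality) / len(quality)


def file_mean_quality(path):
    """Average of the per-read mean qualities in a FASTQ file (plain or .gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    total, n_reads = 0.0, 0
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 3:  # fourth line of each record holds the qualities
                quality = line.rstrip("\n")
                if quality:
                    total += mean_phred33(quality)
                    n_reads += 1
    return total / n_reads if n_reads else None
```

If file_mean_quality("sample_2.fastq") comes out clearly lower than file_mean_quality("sample_1.fastq"), lower read 2 quality becomes a more plausible explanation.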
12-01-2013, 10:02 AM   #3
alpha2zee (Member, New York)

Quote:
Originally posted by lorendarith:
It could mean that the quality in read 2 was lower than in read 1. That would explain why you retain a lot of unpaired read 1: the corresponding read 2 was dropped because it was trimmed too short or had very low quality.
This is a possibility. However, it seems unlikely in my case (I examined read qualities using FastQC).

I wonder if the asymmetry that I am seeing is because I am not using the keepBothReads option of trimmomatic. From the manual:

After read-through has been detected by palindrome mode, and the adapter sequence removed, the reverse read contains the same sequence information as the forward read, albeit in reverse complement. For this reason, the default behaviour is to entirely drop the reverse read. By specifying 'true' for this parameter, the reverse read will also be retained, which may be useful e.g. if the downstream tools cannot handle a combination of paired and unpaired reads.
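
To illustrate what the manual describes, here is a toy Python model of read-through on a fragment shorter than the read length. It is a sketch only: the insert and the adapter strings are placeholders rather than real TruSeq sequences, the function names are made up, and Trimmomatic's actual palindrome alignment is not reproduced.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G, T and N."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_read_through(fragment, adapter_fwd, adapter_rev, read_length):
    """Toy model: each read is the insert followed by adapter, cut to read_length."""
    read_1 = (fragment + adapter_fwd)[:read_length]
    read_2 = (reverse_complement(fragment) + adapter_rev)[:read_length]
    return read_1, read_2


# Hypothetical 12-base insert and placeholder adapters (not real sequences)
fragment = "ACGTTGCAAGGC"
read_1, read_2 = simulate_read_through(fragment, "GGGGGGGGGG", "CCCCCCCCCC", 20)

# Once the adapter part is clipped, both reads cover the same insert
clipped_1 = read_1[:len(fragment)]
clipped_2 = read_2[:len(fragment)]
print(clipped_1 == reverse_complement(clipped_2))  # True in this toy model
```

In this situation the clipped reverse read carries no extra information, which is the stated reason for dropping it by default and leaving the forward read unpaired.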
12-01-2013, 10:53 AM   #4
alpha2zee (Member, New York)

Quote:
Originally posted by alpha2zee:
I wonder if the asymmetry that I am seeing is because I am not using the keepBothReads option of trimmomatic.
It seems this is not the reason. I tested this with 'ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10:8:TRUE' in place of the ILLUMINACLIP setting used in my first post. As per the manual, the syntax that includes keepBothReads is ILLUMINACLIP:<fastaWithAdaptersEtc>:<seed mismatches>:<palindrome clip threshold>:<simple clip threshold>:<minAdapterLength>:<keepBothReads>.
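
Because the ILLUMINACLIP fields are positional, it is easy to misplace a value. The Python sketch below splits a step string into the field names from the syntax line quoted above. It is a hypothetical helper, not part of Trimmomatic, and it assumes that the adapter path contains no ':' character and that the last two fields may be left out, as in the first post.

```python
ILLUMINACLIP_FIELDS = (
    "fastaWithAdaptersEtc",
    "seedMismatches",
    "palindromeClipThreshold",
    "simpleClipThreshold",
    "minAdapterLength",
    "keepBothReads",
)


def parse_illuminaclip(step):
    """Split an ILLUMINACLIP step string into a dict of named fields.

    Trailing fields that are not given are simply absent from the result.
    """
    name, *values = step.split(":")
    if name != "ILLUMINACLIP":
        raise ValueError("not an ILLUMINACLIP step: " + step)
    if not 4 <= len(values) <= len(ILLUMINACLIP_FIELDS):
        raise ValueError("expected 4 to 6 values, got %d" % len(values))
    return dict(zip(ILLUMINACLIP_FIELDS, values))


for step in ("ILLUMINACLIP:adapters/TruSeq3-PE-2.fa:2:30:10",
             "ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10:8:TRUE"):
    print(parse_illuminaclip(step).get("keepBothReads"))
```

The first step has no keepBothReads field, so the lookup returns None; the second returns the string "TRUE".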
12-01-2013, 04:17 PM   #5
alpha2zee (Member, New York)

I tweaked the trimmomatic run parameters a bit and it seems to have a significant effect: there are fewer unpaired reads, and the asymmetry between the left and right unpaired reads is smaller as well.

Code:
java -jar trimmomatic-0.32.jar PE -threads 16 -phred33 sample_1.fastq sample_2.fastq sample_trimmed_paired_1.fastq.gz sample_trimmed_unpaired_1.fastq.gz sample_trimmed_paired_2.fastq.gz sample_trimmed_unpaired_2.fastq.gz ILLUMINACLIP:adapters/TruSeq3-PE-2.fa:2:30:10:8:TRUE LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:26
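
Compared with the command in the first post, this run changes more than one setting, which is worth keeping in mind before attributing the effect to a single parameter. The following Python sketch, with made-up helper names, lists the trimming steps that differ between the two commands; the step strings are copied from the posts.

```python
def step_settings(steps):
    """Map each step name to its parameter string, e.g. 'MINLEN:36' gives {'MINLEN': '36'}."""
    settings = {}
    for step in steps.split():
        name, _, params = step.partition(":")
        settings[name] = params
    return settings


def changed_steps(old_steps, new_steps):
    """Return {step name: (old params, new params)} for steps that differ."""
    old, new = step_settings(old_steps), step_settings(new_steps)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


first_run = ("ILLUMINACLIP:adapters/TruSeq3-PE-2.fa:2:30:10 LEADING:3 "
             "TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36")
second_run = ("ILLUMINACLIP:adapters/TruSeq3-PE-2.fa:2:30:10:8:TRUE LEADING:3 "
              "TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:26")

for name, (before, after) in changed_steps(first_run, second_run).items():
    print(name, before, "->", after)
```

This reports two differences: the ILLUMINACLIP step gains the minAdapterLength and keepBothReads values (8 and TRUE), and MINLEN drops from 36 to 26.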
12-02-2013, 03:44 AM   #6
mastal (Senior Member, UK)

Yes, it looks like the difference in the size of your R1_unpaired file is due to changing the parameters so that Trimmomatic keeps both reads after trimming adapter sequences in palindrome mode. The default behaviour is to discard R2, which leaves R1 unpaired.
11-19-2014, 04:57 AM   #7
annaprotasio (Junior Member, UK)

Hi,
I just wanted to say that this thread really helped me sort out my trimmomatic call.

I have small RNA libraries and, due to their nature, a large proportion of each read is adapter. I was running the default mode and was quite unhappy with the results.

Code:
java -Xmx1000m -jar ./Trimmomatic-0.32/trimmomatic-0.32.jar PE -threads 4 1.fastq.gz 2.fastq.gz out_1.fq out.unpaired_1.fq out_2.fq out.unpaired_2.fq ILLUMINACLIP:miRNA.neb.solexa.adapters.fasta:2:10:7 MINLEN:15
The file sizes for the output files were quite discouraging:

102M Nov 7 10:53 1.out.unpaired_2.fq
56K Nov 7 10:53 1.out.unpaired_1.fq
56M Nov 7 10:53 1.out_2.fq
52M Nov 7 10:53 1.out_1.fq

After considering the suggested changes in this thread, my new call is:

Code:
java -Xmx1000m -jar ./Trimmomatic-0.32/trimmomatic-0.32.jar PE -threads 4 1.fastq.gz 2.fastq.gz out_1.fq out.unpaired_1.fq out_2.fq out.unpaired_2.fq ILLUMINACLIP:miRNA.neb.solexa.adapters.fasta:2:30:10:8:TRUE LEADING:3 TRAILING:3 SLIDINGWINDOW:4:30 MINLEN:15
And the file sizes look much better:

13M Nov 19 13:41 2.out.unpaired_2.fq
32M Nov 19 13:41 2.out.unpaired_1.fq
484M Nov 19 13:41 2.out_2.fq
517M Nov 19 13:41 2.out_1.fq

PS: I am aware that PE is overkill for miRNAs, but SE was not available to us at the time.
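
To put numbers on the two listings above, the sizes can be converted to bytes and compared. This is a small Python sketch with hypothetical helper names; it assumes the listings use binary units in the style of ls -lh (K = 1024 bytes), which the posts do not state explicitly.

```python
UNIT_BYTES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '56K' or '102M' to bytes, assuming binary units."""
    text = text.strip()
    unit = text[-1].upper()
    if unit in UNIT_BYTES:
        return float(text[:-1]) * UNIT_BYTES[unit]
    return float(text)


def size_ratio(size_a, size_b):
    """Ratio of two human-readable sizes."""
    return parse_size(size_a) / parse_size(size_b)


# First run: unpaired_2 (102M) relative to unpaired_1 (56K)
print(round(size_ratio("102M", "56K")))    # 1865
# Second run: unpaired_1 (32M) relative to unpaired_2 (13M)
print(round(size_ratio("32M", "13M"), 2))  # 2.46
```

Uncompressed file size tracks read count only approximately, since trimmed reads vary in length, so these ratios are a rough guide.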
Tags
adapter, bug, paired end, trimmomatic