SEQanswers

12-01-2015, 09:18 AM   #1
Alex852013
Member
 
Location: Germany

Join Date: Jan 2013
Posts: 17
Struggling with quality trimming

Hello everybody,

this is my first attempt at trimming Illumina (paired-end) reads on the Unix command line.
If I understand correctly, de-multiplexing was already done by the sequencing service.
I guess this also means that the adapters have already been removed.

What I want to do is trim the reads by quality.
I checked the data with FastQC and want to get rid of reads with a quality below 20.

I tried trim_galore with
trim_galore ../name1.fastq -q 20 --paired > trim_name.fastq

which gives me:
"No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)
Please provide an even number of input files for paired-end FastQ trimming! Aborting ..." <- I understand what this line means

But I don't really know how to find out how my data are encoded.
The data look like this:

@NS500339:99:H3H52AFXX:1:11101:5599:1027 2:N:0:GTGAAA
NNNCTGGTTATAGGTACATTGAGCAACTGACTGAAATGCCTCAAAATGTTCTTTACGATGCCATTGGGATATATCAACGGTGGTATATCCAGTGATTTTTTTCTCCATTTTAGCTTCCTTAGCTCCTGAAAATCTCGANANCTCNNAAAA
+
###/AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEE<6/<EEEEEEEEAAE6EEE/EEEEEEEEEEEAAEEEEE/6EE<AEEEEAEA<EEE/EEE/AAAE#<#<A/##/<<6


I also tried the FASTX-Toolkit with the following command:
fastq_quality_filter -q 20 -i ../name.fastq -v -o trimmed_name.fastq

The program runs and reports:
Minimum percentage: 0
Input: 4564772 reads.
Output: 4564772 reads.
discarded 0 (0%) low-quality reads.

but if I check the output again with FastQC, there are still reads with a quality below 20.
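
The "Minimum percentage: 0" line in the output above suggests why nothing was discarded: fastq_quality_filter keeps or drops whole reads based on the percentage of bases that reach the -q threshold (its -p option), and with a minimum of 0 percent every read passes. It also does not shorten reads. A minimal Python sketch of that filtering rule, for illustration only (the function name, the Phred+33 offset, and the handling of empty reads are assumptions, not the tool's code):

```python
def passes_filter(qual, min_q=20, min_percent=0, offset=33):
    """Return True if enough bases in one quality string reach min_q.

    Illustrative only: mimics a percentage-based read filter.
    offset=33 assumes Sanger / Phred+33 encoding.
    """
    scores = [ord(ch) - offset for ch in qual]
    if not scores:
        # Assumption: treat an empty read as failing the filter.
        return False
    good = sum(1 for s in scores if s >= min_q)
    return 100.0 * good / len(scores) >= min_percent


# With min_percent=0 even a read made only of '#' (Q2) is kept.
print(passes_filter("####", min_q=20, min_percent=0))
# Requiring 50 percent of bases at Q20 or better rejects the same read.
print(passes_filter("####", min_q=20, min_percent=50))
```

So a whole-read filter with a 0 percent minimum cannot remove the low-quality bases that FastQC reports; a per-base trimmer is needed for that.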


Could someone please help me with one of these programs?

Thanks a lot, Alex
12-01-2015, 09:39 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,961

Grab a copy of BBMap and then run

Code:
$ zcat yourfile.fastq.gz | head -20 > new.fq
$ testformat.sh in=new.fq
to identify the format of the encoding in your file.
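
For readers curious what such a check involves, here is a minimal Python sketch of one common heuristic. It is not what testformat.sh does internally; the function name and the thresholds are assumptions for illustration. It relies on Phred+33 quality characters going as low as '!' (ASCII 33), while the older +64 encodings do not use characters below ';' (ASCII 59).

```python
def guess_phred_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from FASTQ quality strings.

    Returns 33, 64, or None when the characters seen are ambiguous.
    Heuristic sketch only; thresholds are illustrative assumptions.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters given")
    if min(codes) < 59:
        # Characters this low cannot occur in a +64 encoding.
        return 33
    if max(codes) > 74:
        # Above 'J' (Q41 in Phred+33): assume a +64 encoding.
        return 64
    return None


# A short excerpt in the style of the quality line posted in #1.
sample = "###/AEEEEEE<6/<EEEAAE6EEE/AAAE#<#<A/##/<<6"
print(guess_phred_offset([sample]))
```

Because the quality line in post #1 contains '#' (ASCII 35), this heuristic would report an offset of 33, which agrees with the Sanger default that trim_galore assumed.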

While you have a copy of BBMap handy, use bbduk.sh to do trimming.

Code:
$ bbduk.sh -Xmx1g in=reads.fq.gz out=clean.fq.gz qtrim=r trimq=10
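
To make the idea of right-end quality trimming concrete, here is a naive Python sketch. It is not BBDuk's actual algorithm, and the function name and defaults are illustrative assumptions. It simply removes bases from the 3' end for as long as their Phred score is below a threshold, assuming Phred+33 encoding.

```python
def trim_right(seq, qual, min_q=10, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q.

    Naive illustration of right-end trimming; offset=33 assumes
    Sanger / Phred+33 encoding.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# 'E' is Q36, '/' is Q14, '#' is Q2 under Phred+33.
print(trim_right("ACGTACGT", "EEEEE/##", min_q=10))
print(trim_right("ACGTACGT", "EEEEE/##", min_q=20))
```

With a threshold of 10 only the two '#' bases are removed; with a threshold of 20, as asked for in post #1, the '/' base goes as well. In the bbduk.sh command above, the threshold is set with trimq.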
12-02-2015, 08:05 AM   #3
Alex852013
Member
 
Location: Germany

Join Date: Jan 2013
Posts: 17
Thanks

Thanks a lot. If the following mapping step now works as well, this will be the solution.
12-02-2015, 12:52 PM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,961

You could stay with BBMap and use bbmap.sh to do the mapping as well.

Tags
fastqc, fastx, quality trimming, trim galore
