
**tonybolger** · 05-13-2011, 02:39 AM

Originally posted by Kotoro

What methods are normally used to trim off the unreliable poor-scoring ends of reads? Is there a tool that can statistically assess read scores and make this decision on a per-read basis, or is a cut position globally decided on?

Something like a sliding window is good - so if you get a single bad cycle you don't cut a high quality region, but persistent rubbish is trimmed.

Generally you want to do it on a per-read basis - 'local' factors often influence a particular read, and there's no advantage to 'one size fits all'.
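To make the sliding-window idea concrete, here is a minimal sketch in Python. It is not Trimmomatic's actual implementation; the function names, the window size of 4, the threshold of 15 and the Phred offset of 33 are all assumptions chosen for illustration (older Illumina pipelines used an offset of 64). The read is scanned from the 5' end and cut at the first window whose average quality drops below the threshold, so one isolated bad cycle inside a good region does not trigger a cut.

```python
def phred_scores(qual_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    offset=33 is an assumption; use 64 for older Illumina encodings.
    """
    return [ord(c) - offset for c in qual_string]


def sliding_window_trim(quals, window=4, min_avg=15):
    """Return how many bases to keep from the 5' end of one read.

    The read is cut at the start of the first window whose mean
    quality falls below min_avg. window and min_avg are illustrative
    defaults, not values taken from any particular tool.
    """
    n = len(quals)
    if n < window:
        # Read shorter than one window: judge it as a single window.
        return n if n and sum(quals) / n >= min_avg else 0
    for start in range(n - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            return start
    return n


# A single bad cycle in a good region does not cause a cut.
one_bad_cycle = [30, 30, 30, 2, 30, 30, 30, 30]
keep_a = sliding_window_trim(one_bad_cycle)

# A persistently poor 3' end is trimmed.
bad_tail = [30, 30, 30, 30, 2, 2, 2, 2]
keep_b = sliding_window_trim(bad_tail)
```

Because the decision is made from each read's own scores, the cut position differs from read to read, which is the per-read behaviour described above. Note that this simple version cuts at the start of the failing window, so it can discard a good base or two just before the bad stretch.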

Originally posted by Kotoro

Are there any special considerations for paired-end reads? (they've already been split and de-convoluted by barcode by the pipeline in our university's sequencing core.)

Normally you might want to handle 'unpaired' reads separately - reads that survive QC but whose partners didn't.
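As a rough illustration of handling unpaired reads separately, the sketch below sorts already-trimmed read pairs into three groups. The function name `split_pairs`, the `min_len` cutoff of 20 and the use of read length as the only QC criterion are assumptions for this example, not the behaviour of any specific tool.

```python
def split_pairs(pairs, min_len=20):
    """Sort trimmed read pairs into paired and unpaired groups.

    pairs is an iterable of (forward, reverse) sequences after
    trimming. A read "survives" here if it is at least min_len long
    (an illustrative criterion). Pairs where both reads fail are
    dropped entirely.
    """
    paired, forward_only, reverse_only = [], [], []
    for fwd, rev in pairs:
        fwd_ok = len(fwd) >= min_len
        rev_ok = len(rev) >= min_len
        if fwd_ok and rev_ok:
            paired.append((fwd, rev))
        elif fwd_ok:
            forward_only.append(fwd)
        elif rev_ok:
            reverse_only.append(rev)
    return paired, forward_only, reverse_only


example = [
    ("A" * 50, "C" * 50),  # both survive
    ("A" * 50, "C" * 5),   # only the forward read survives
    ("A" * 5, "C" * 50),   # only the reverse read survives
    ("A" * 5, "C" * 5),    # neither survives
]
paired, forward_only, reverse_only = split_pairs(example)
```

Keeping the paired output in matching order lets downstream tools that expect synchronized paired files work unchanged, while the two unpaired groups can be used as single-end reads.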

Blatant ad: if you're working with Illumina data, I've released a tool, Trimmomatic, found here, which does what you need.

**francois.sabot** · 05-13-2011, 04:37 AM

Hi
Have a look at the latest version of Cutadapt, which uses a nice system for trimming and cuts only if the later bases are of better quality than the previous ones... See their explanation: http://code.google.com/p/cutadapt/

**gaffa** · 05-13-2011, 04:45 AM

A third alternative is SolexaQA (http://solexaqa.sourceforge.net/), which can trim either "to the longest contiguous read segment for which the quality score at each base is greater than a user-supplied quality cutoff" or using the BWA trimming algorithm (which BWA itself can also perform in conjunction with read mapping).
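The "longest contiguous segment" rule quoted above can be sketched as follows. This is an independent illustration of the quoted description, not SolexaQA's code; the function name and the example cutoff of 20 are assumptions.

```python
def longest_good_segment(quals, cutoff):
    """Return (start, end) of the longest run with every score > cutoff.

    end is exclusive, so seq[start:end] is the retained segment.
    Returns (0, 0) if no base exceeds the cutoff. On ties the
    earliest run is kept.
    """
    best_start, best_len = 0, 0
    run_start = None
    for i, q in enumerate(quals):
        if q > cutoff:
            if run_start is None:
                run_start = i
            run_len = i - run_start + 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            # A base at or below the cutoff ends the current run.
            run_start = None
    return best_start, best_start + best_len


quals = [10, 30, 30, 5, 30, 30, 30, 8]
segment = longest_good_segment(quals, cutoff=20)
```

Unlike a sliding window, a single base at or below the cutoff always splits the read here, so this rule is stricter about isolated bad cycles.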

**mmartin** · 07-20-2011, 07:13 AM

Originally posted by gaffa

... or using the BWA trimming algorithm ...

I'd like to point out that the quality trimming in cutadapt is simply a reimplementation of BWA's algorithm. You could use the quality-trimming part of cutadapt without trimming adapters by providing an adapter sequence that's certain to not occur -- just use -a XXXXXXXX or something like this (these are literal "X" characters).

See also the file lib/cutadapt/qualtrim.py in the cutadapt distribution, which shows the algorithm and contains an explanation of it.
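For readers without the source at hand, here is a sketch of the BWA-style 3' quality trimming as it is commonly described: walking from the 3' end, accumulate `cutoff - q` for each base and cut at the position where that running sum is largest, stopping once the sum goes negative. This is written from that description and is not copied from qualtrim.py; the function name and the example cutoff of 20 are assumptions.

```python
def bwa_style_trim_index(quals, cutoff):
    """Return the index at which to cut the 3' end of a read.

    quals is a list of integer Phred scores; keep seq[:index].
    Walking backwards, the running sum of (cutoff - q) grows while
    bases are worse than the cutoff and shrinks while they are
    better. The cut is placed where the sum peaks.
    """
    running = 0
    best_sum = 0
    cut_index = len(quals)
    for i in reversed(range(len(quals))):
        running += cutoff - quals[i]
        if running < 0:
            # Good bases now outweigh the bad ones seen so far.
            break
        if running > best_sum:
            best_sum = running
            cut_index = i
    return cut_index


# Poor 3' tail is removed.
cut_a = bwa_style_trim_index([30, 30, 30, 10, 10], cutoff=20)

# A good base inside a poor tail does not stop the trimming.
cut_b = bwa_style_trim_index([30, 30, 5, 30, 10], cutoff=20)
```

In the second example the high-quality base at index 3 is trimmed along with its poor neighbours, because the bases around it are bad enough that the running sum keeps growing past it.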


trimming unreliable ends of reads
