SEQanswers

Old 08-06-2015, 01:46 PM   #1
antifolate
Member
 
Location: Chesterfield, MO

Join Date: Aug 2015
Posts: 52
Default Looking for a trimming software that does these things

Hello,

I'm looking for a trimming/filtering software that can do the following:

1) Trim both ends until there's at least a certain number of consecutive bases higher than a specific quality score.

2) Remove 3' regions of a certain length if they contain more than a certain percentage of bases below a specific quality score. For example, remove the last 200 bp at the 3' end if more than 10% of those bases have a Phred score below 20.

3) Filter out reads with a certain percentage of bp below a specific quality score.

4) Remove reads with a certain number of consecutive Ns.

5) Be paired-end-aware, i.e. if one read was removed, remove its pair (there're several of these available, but without the other features).

6) If a read is identical to the reverse complement of its pair, remove it.

I'd really appreciate your help.
Old 08-06-2015, 01:57 PM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,091
Default

BBduk.sh (part of BBMap), Trimmomatic, Cutadapt (and perhaps others that I am missing) should fit the bill. Though they may not check every box you have up there they should get the job done.
Old 08-06-2015, 02:15 PM   #3
antifolate
Member
 
Location: Chesterfield, MO

Join Date: Aug 2015
Posts: 52
Default

Thanks. I tried Trimmomatic but not the other two. BBduk.sh seems promising (so does the rest of the BBMap package), but it will take me a while to understand its syntax. I'll post back if it does what I want.
Old 08-06-2015, 04:39 PM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by antifolate View Post
Hello,

I'm looking for a trimming/filtering software that can do the following:

1) Trim both ends until there's at least a certain number of consecutive bases higher than a specific quality score.
BBDuk used to use this strategy, but it's not optimal, and I was able to demonstrate that empirically, so I don't really recommend it. BBDuk currently uses the Phred algorithm for quality trimming, which is optimal, though it's technically possible to disable that with a flag and use the old method instead. BBDuk also supports windowed trimming (trim until the average in a sliding window exceeds some threshold).
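To make the idea of Phred-style trimming concrete, here is a minimal Python sketch. It is an illustration of the general technique (keep the contiguous stretch with the best total score, where each base scores the cutoff error probability minus its own error probability), not BBDuk's actual code. The function names and the `trim_q` parameter are hypothetical.

```python
def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def best_segment(quals, trim_q=10):
    """Return (start, end) of the region to keep, with end exclusive.

    Each base scores (cutoff error - its own error), so bases better than
    trim_q score positive and worse bases score negative. The kept region
    is the contiguous run with the highest total score, found with a
    single maximum-subarray (Kadane) scan. Returns (0, 0) if no base
    is better than the cutoff.
    """
    cutoff = phred_to_error(trim_q)
    best_sum, best_start, best_end = 0.0, 0, 0
    run_sum, run_start = 0.0, 0
    for i, q in enumerate(quals):
        if run_sum <= 0:
            # A run that has dropped to zero or below cannot help; restart here.
            run_sum, run_start = 0.0, i
        run_sum += cutoff - phred_to_error(q)
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i + 1
    return best_start, best_end


# Hypothetical read: poor bases at both ends and one isolated good base.
start, end = best_segment([2, 2, 30, 30, 30, 2, 30, 2], trim_q=10)
print(start, end)
```

Unlike the "N consecutive good bases" rule, this approach weighs how bad a low-quality base is against how much good sequence lies beyond it, so a single poor base does not automatically end the kept region.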

Quote:
3) Filter out reads with a certain percentage of bp below a specific quality score.
The "maq" flag filters by average quality, where average quality is calculated by transforming the quality scores into probabilities, so basically if you set "maq=20" it removes reads with an expected error rate greater than 1%. I don't recommend setting it that high, though.
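As a rough illustration of averaging quality through probabilities, the Python sketch below converts each Phred score to an error probability, averages those, and converts back. The function name is hypothetical and this is not BBDuk's code; it only shows why this average differs from the plain arithmetic mean of the scores.

```python
import math


def average_quality(quals):
    """Average Phred quality computed via error probabilities.

    Assumes quals is a non-empty list of Phred scores.
    """
    mean_error = sum(10 ** (-q / 10) for q in quals) / len(quals)
    return -10 * math.log10(mean_error)


# One Q10 base and one Q30 base: the arithmetic mean is 20, but the
# expected error rate is (0.1 + 0.001) / 2 = 0.0505, i.e. roughly Q13.
print(round(average_quality([10, 30]), 2))
```

Because low-quality bases dominate the expected error rate, a read with a few very poor bases can fail a threshold that its arithmetic-mean quality would pass.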

Quote:
4) Remove reads with a certain number of consecutive Ns.
The "maxns=X" flag will filter reads with at least X Ns, but it doesn't care whether they are consecutive.
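The difference between a total N count and a consecutive N count can be sketched in Python as follows. Both function names are hypothetical; the second is the kind of check the original question asks for, which would have to be done outside BBDuk.

```python
import re


def total_ns(seq):
    """Count all N bases in a read, wherever they occur."""
    return seq.upper().count("N")


def longest_n_run(seq):
    """Length of the longest stretch of consecutive N bases (0 if none)."""
    runs = re.findall(r"N+", seq.upper())
    return max((len(run) for run in runs), default=0)


read = "ANNCGNTN"
print(total_ns(read), longest_n_run(read))
```

A read like this one has four Ns in total but never more than two in a row, so the two criteria can give different filtering decisions.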

Quote:
5) Be paired-end-aware, i.e. if one read was removed, remove its pair (there're several of these available, but without the other features).
Check.

Quote:
6) If a read is identical to the reverse complement of its pair, remove it.
You can do this with BBMerge, by running it but telling it not to join overlapping reads (using the "join=f" flag), and using the "maxlength" flag plus the "out" and "outu" streams. "maxlength=X" will send reads with insert sizes longer than X to outu rather than out. So:

bbmerge.sh in=reads.fq out=short.fq outu=long.fq join=f maxlen=150

(this command assumes pairs are interleaved in one file)
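For comparison, the literal check described in the question (a read that is exactly the reverse complement of its mate) can be sketched in a few lines of Python. This is only an illustration with hypothetical function names; it is not what BBMerge does internally, and an exact-match test like this will miss pairs that differ by even one sequencing error.

```python
# Translation table for complementing DNA, including N and lowercase bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mates_are_reverse_complements(read1, read2):
    """True if read1 is exactly the reverse complement of read2 (case-insensitive)."""
    return read1.upper() == reverse_complement(read2).upper()


print(mates_are_reverse_complements("AACGT", "ACGTT"))
```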
Old 08-11-2015, 07:19 AM   #5
antifolate
Member
 
Location: Chesterfield, MO

Join Date: Aug 2015
Posts: 52
Default

I just got around to trying these commands and, although they're not exactly what I'm trying to do, they worked pretty well. bbmerge would merge my reads so I avoided it.

Thank you!
Old 08-12-2015, 05:11 PM   #6
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
Default try skewer

Another option is skewer. Good luck!

Quote:
Originally Posted by antifolate View Post
I just got around to trying these commands and, although they're not exactly what I'm trying to do, they worked pretty well. bbmerge would merge my reads so I avoided it.

Thank you!
Old 10-06-2015, 05:53 AM   #7
antifolate
Member
 
Location: Chesterfield, MO

Join Date: Aug 2015
Posts: 52
Default

@Brian

"... though it's technically possible to disable that with a flag and use the old method instead."

How can I do this?
Old 10-06-2015, 09:14 AM   #8
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by antifolate View Post
@Brian

"... though it's technically possible to disable that with a flag and use the old method instead."

How can I do this?
Add the flag "otm=f" (otm stands for "optimal trimming mode").
Old 10-06-2015, 10:59 AM   #9
antifolate
Member
 
Location: Chesterfield, MO

Join Date: Aug 2015
Posts: 52
Default

otm=f (outputtrimmedtomatch) Output reads trimmed to shorter
than minlength to outm rather than discarding.


Which bbduk are you talking about?
Old 10-06-2015, 11:51 AM   #10
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Oops, looks like I have an overloaded flag. Thanks for spotting that! I'll rename that one to "ottm" in the next release. Currently, "otm" acts on the quality trimming, so "outputtrimmedtomatch" would have to be fully spelled out in order to function according to that description. To be more specific for now, use the flag "optitrim=f" to turn off optimal trimming, and "outputtrimmedtomatch" to dictate whether trimmed reads shorter than minlen go to outm.
Old 10-06-2015, 11:56 AM   #11
antifolate
Member
 
Location: Chesterfield, MO

Join Date: Aug 2015
Posts: 52
Default

I didn't know bbduk was your work. Thanks for the help and the tool!
All times are GMT -8.