SEQanswers

03-19-2013, 08:08 AM   #1
ramirob
Member
 
Location: Vermont

Join Date: Apr 2012
Posts: 14
Adaptor removal, trimming, vs masking

I have a question about best practices for adaptor removal, trimming the ends of reads, and removing short reads. All of these questions pertain to preparing sequences for alignment with BWA.

1) Adaptor removal: we can either mask the adapter with Casava (which puts N's in its place) or clip it out with cutadapt. Which is the best practice for BWA, masking or removing?
2) Trimming at the ends: I have heard that it is important to trim a few bases at the beginning and at the end of each read (say 10 and 10). We can either mask those bases with Casava or trim them with some program. What is a standard program for trimming the ends of reads?
3) Removal of short reads: clipping out the adapter sometimes leaves a few very small reads (ligated adapters, perhaps), and those small reads seem to cause problems for BWA. We wrote a program to remove them, but it is slow. Is there any existing software that does this?

Thanks in advance,
Ramiro
03-20-2013, 09:20 AM   #2
MeganS
Member
 
Location: US

Join Date: Sep 2010
Posts: 14

1) I think removing the adapters is probably better. Check the BWA documentation, but I think it treats Ns as mismatches to the reference.
2) I trim 6 bases off the 5' end of RNA-seq reads (because of a concern over high error rates from hexamer priming). For genome sequencing, I only trim the ends if the quality is low.
3) I use Trimmomatic to remove reads that are too short (and to remove adapters and trim low-quality bases). I think FastX-toolkit has something as well. Whatever you use, make sure it handles orphaned reads: BWA determines read pairs by their position in the file, not by the sequence identifier.
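To illustrate the orphaned-read point in 3), here is a minimal Python sketch of a paired-end length filter that keeps the two output files in the same order. It is not how Trimmomatic or FastX-toolkit work internally. The function names (`read_fastq`, `filter_pairs`), the `min_len` default, and the separate orphans output are assumptions made for the example, and it assumes plain 4-line FASTQ records with both input files listing mates in the same order.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line-per-record FASTQ."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def filter_pairs(r1_in, r2_in, r1_out, r2_out, orphans_out, min_len=20):
    """Keep a pair only if both mates have at least min_len bases.

    If exactly one mate passes, it is written to orphans_out so the two
    paired outputs stay in step. Returns (pairs_kept, orphans, pairs_dropped).
    Assumption: both inputs hold the same number of records in the same order.
    """
    kept = orphans = dropped = 0
    for rec1, rec2 in zip(read_fastq(r1_in), read_fastq(r2_in)):
        ok1 = len(rec1[1]) >= min_len
        ok2 = len(rec2[1]) >= min_len
        if ok1 and ok2:
            r1_out.write("\n".join(rec1) + "\n")
            r2_out.write("\n".join(rec2) + "\n")
            kept += 1
        elif ok1 or ok2:
            survivor = rec1 if ok1 else rec2
            orphans_out.write("\n".join(survivor) + "\n")
            orphans += 1
        else:
            dropped += 1
    return kept, orphans, dropped
```

The handles can be ordinary files opened in text mode. Because a mate is never written to a paired output without its partner, record N in the first output always matches record N in the second, which is what a position-based pairing needs.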
03-27-2013, 07:30 AM   #3
thomasvangurp
Member
 
Location: Wageningen

Join Date: Jan 2009
Posts: 11

Hi MeganS,

Do you have any references that indicate the source of this hexamer mispriming?

Cheers,
Thomas
03-27-2013, 10:31 PM   #4
MeganS
Member
 
Location: US

Join Date: Sep 2010
Posts: 14

Here are a couple of posts/blogs with some references included:

http://seqanswers.com/forums/showthread.php?t=11843

http://www.genomesunzipped.org/2012/...-in-humans.php

After reading what I could find on the topic, I am unsure whether trimming the 5' end of RNA-seq reads is necessary, but I decided there was enough of a concern that for de novo assembly I trim 6 bases off the 5' end (a case could be made for as many as 15 bases). I do not trim for alignment to a reference.
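For anyone who wants to see what a fixed 5' trim does to a single read, here is a small Python sketch. The function name `trim_five_prime` and its `n_bases` parameter are made up for the example; the default of 6 just mirrors the number mentioned above, and real trimming tools do this per record across a whole FASTQ file.

```python
def trim_five_prime(sequence, quality, n_bases=6):
    """Drop n_bases from the 5' end (the start) of a read and its quality string."""
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    if n_bases < 0:
        raise ValueError("n_bases must be non-negative")
    # Slicing past the end simply yields empty strings for very short reads.
    return sequence[n_bases:], quality[n_bases:]
```

The sequence and quality strings have to be trimmed together so they stay the same length. Reads shorter than `n_bases` come back empty, so a length filter would normally run after this step.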