SEQanswers

Old 05-15-2011, 02:12 PM   #1
gibsongenetics
Junior Member
 
Location: ny

Join Date: Aug 2010
Posts: 3
Removal of poor quality reads before alignment

Hello,

So I have recently received a whole slide's worth of paired-end SOLiD ChIP-seq data, and have run into problems at the earliest of steps. I have tried to use Bowtie to align these reads to the hg19 genome, but it is running extremely slowly. On further inspection, it seems that a number of reads have "no-calls" or otherwise poor quality values that may or may not be "gumming up" the Bowtie aligner. So, the question:

Does anyone have suggestions on how to pre-process the data to remove reads that are VERY unlikely to give unique alignments prior to mapping to a reference genome? I'm thinking something like: anything with an average quality value below 8 gets thrown out.

Any help with how to do this, or pointers to aligners that do this or something similar, would be great.

Many thanks,
Bryan
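
A minimal sketch of the mean-quality filter proposed above, in Python. It assumes the qualities are in a SOLiD-style .qual text file (a ">" header line followed by one line of space-separated integers, with -1 marking a no-call); the function names and the read IDs in the example are hypothetical, and the threshold of 8 is simply the value suggested in the post, not a recommended setting.

```python
from statistics import mean


def parse_qual(lines):
    """Yield (read_id, [quality values]) from SOLiD-style .qual text lines.

    Assumed layout: '>read_id' header, then one line of space-separated ints.
    """
    read_id = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#' comment headers
        if line.startswith(">"):
            read_id = line[1:]
        elif read_id is not None:
            yield read_id, [int(q) for q in line.split()]
            read_id = None


def passing_reads(lines, min_mean_q=8):
    """Return the set of read IDs whose mean quality is at least min_mean_q."""
    return {
        rid
        for rid, quals in parse_qual(lines)
        if quals and mean(quals) >= min_mean_q
    }


# Hypothetical example records; -1 stands for a no-call.
example = [
    "# comment line",
    ">1_10_20_F3",
    "30 28 25 31",
    ">1_10_21_F3",
    "5 -1 3 4",
]
print(passing_reads(example))  # only the first read passes
```

In practice the lines would come from an open file handle rather than a list, and the returned IDs would then be used to select the matching records from the sequence file.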
Old 05-15-2011, 03:40 PM   #2
gibsongenetics
Junior Member
 
Location: ny

Join Date: Aug 2010
Posts: 3

I guess I should qualify this by asking: does anyone know of a way to pre-process the paired files without losing the pairing info that aligners like Bowtie would use? I don't think the FASTX-Toolkit / Galaxy is capable of this, right?
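
One way to keep the pairing intact is to drop a pair whenever either mate fails the filter, and to write both outputs in the same order. The Python sketch below illustrates that idea only; it assumes the two mates share a read ID up to a trailing tag after the last underscore (for example _F3 and _R3), which should be checked against the actual files. All function names and example IDs are hypothetical.

```python
def pair_key(read_id):
    """Strip the mate tag (assumed to follow the last underscore)."""
    return read_id.rsplit("_", 1)[0]


def mean_q(quals):
    """Mean quality of a read; an empty read counts as 0."""
    return sum(quals) / len(quals) if quals else 0.0


def filter_pairs(mate1, mate2, min_mean_q=8):
    """Keep only pairs in which BOTH mates meet the mean-quality threshold.

    mate1 and mate2 are iterables of (read_id, [quality values]) tuples.
    Returns two lists in mate1 order, so the outputs stay synchronized.
    """
    # Index the passing second mates by their shared pair key.
    good2 = {
        pair_key(rid): (rid, quals)
        for rid, quals in mate2
        if mean_q(quals) >= min_mean_q
    }
    kept1, kept2 = [], []
    for rid, quals in mate1:
        key = pair_key(rid)
        if mean_q(quals) >= min_mean_q and key in good2:
            kept1.append((rid, quals))
            kept2.append(good2[key])
    return kept1, kept2


# Hypothetical example: only the first pair has two passing mates.
f3 = [
    ("1_10_20_F3", [30, 28, 25]),
    ("1_10_21_F3", [30, 30, 30]),
    ("1_10_22_F3", [2, 3, -1]),
]
r3 = [
    ("1_10_20_R3", [27, 26, 30]),
    ("1_10_21_R3", [4, -1, 5]),
    ("1_10_22_R3", [30, 30, 30]),
]
kept_f3, kept_r3 = filter_pairs(f3, r3)
print([rid for rid, _ in kept_f3], [rid for rid, _ in kept_r3])
```

Discarding the whole pair is the simplest choice; a variant could instead write the surviving mate of a broken pair to a separate file for single-end alignment.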
Old 05-16-2011, 05:22 AM   #3
kmkocot
Member
 
Location: Alabama

Join Date: Jun 2009
Posts: 48

Hi Bryan,

You might check out lucy:
http://lucy.sourceforge.net/

seqclean:
http://compbio.dfci.harvard.edu/tgi/software/

and RepeatMasker:
http://www.repeatmasker.org/

I hope these are helpful.

Kevin
Tags
alignment, bad reads
