SEQanswers

01-31-2013, 07:42 AM   #1
winsettz
Member
 
Location: US

Join Date: Sep 2012
Posts: 91
Telomere filtering of MiSeq PE data

Working on de novo assembly, but this particular problem is more about pre-filtering telomeric reads (the sequence of the repeat is known).

Any robust workflows to filter 150 nt MiSeq reads for telomeric reads? Or more abstractly, for filtering reads with a particular known kmer of varying repeat length, such as (NNNNN)X? Thanks.
02-01-2013, 09:50 AM   #2
tonybolger
Senior Member
 
Location: berlin

Join Date: Feb 2010
Posts: 156

Quote:
Originally Posted by winsettz View Post
Working on de novo assembly, but this particular problem is more about pre-filtering telomeric reads (the sequence of the repeat is known).

Any robust workflows to filter 150 nt MiSeq reads for telomeric reads? Or more abstractly, for filtering reads with a particular known kmer of varying repeat length, such as (NNNNN)X? Thanks.

Trimmomatic might be able to do this. Can you send me some example reads and what sequence you'd like to remove?
02-01-2013, 12:58 PM   #3
winsettz
Member
 
Location: US

Join Date: Sep 2012
Posts: 91

My starting assumption (borne out by the literature) is that telomeric sequence is a repeat of "TTAGGG".

The proposed workflow is to pass each fastq file through fastx_clipper

Quote:
fastx_clipper -Q33 -A TTAGGG -i input.fastq -o output.fastq

I think fastx_clipper is actually removing reads that contain telomeric sequence, and to use the output downstream in velvet as paired reads I need to ensure that both mates of each pair are present.
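
As a cross-check on what the clipper is doing, telomeric reads could also be flagged directly. This is only a minimal sketch, not part of fastx or khmer: the function names are made up for illustration, and the threshold of 4 tandem copies is an arbitrary assumption that would need tuning. It looks for the repeat unit on either strand (TTAGGG or its reverse complement CCCTAA).

```python
import re


def revcomp(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def make_repeat_pattern(unit="TTAGGG", min_repeats=4):
    """Compile a regex matching >= min_repeats tandem copies of `unit`
    or of its reverse complement. min_repeats=4 is an assumed cutoff."""
    unit = unit.upper()
    units = sorted({unit, revcomp(unit)})
    alternatives = "|".join(
        "(?:%s){%d,}" % (re.escape(u), min_repeats) for u in units
    )
    return re.compile(alternatives)


TELOMERE = make_repeat_pattern()


def is_telomeric(seq, pattern=TELOMERE):
    """True if the read contains a long enough run of the repeat."""
    return pattern.search(seq.upper()) is not None
```

The same pattern builder should work for any known repeat unit, e.g. make_repeat_pattern("TTTAGGG", 5). A run that starts out of phase (GGGTTAGGGTTA...) is still caught, at the cost of one copy at the edge.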

I figured I would borrow khmer's pair-sorting functionality, but it works on interleaved files, so interleave then sort.

Quote:
python ~/khmer/sandbox/interleave.py read1.fastq read2.fastq > interleaved.fastq

python ~/khmer/sandbox/strip-and-split-for-assembly.py interleaved.fastq
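
For anyone without khmer installed, the re-pairing step could be sketched in plain Python along these lines. This is not what the khmer scripts do internally, just an illustration of the idea: match mates by read ID after each file has been filtered on its own, and set aside reads whose mate was dropped. The function names are hypothetical, and it assumes 4-line FASTQ records and IDs that differ only by a trailing /1 or /2 or by the text after the first space.

```python
def read_id(header):
    """Pair-independent part of a FASTQ header line."""
    name = header.lstrip("@").split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def parse_fastq(lines):
    """Yield (header, seq, qual) tuples from 4-line FASTQ records.
    A truncated final record is silently skipped."""
    it = iter(lines)
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def split_pairs(records1, records2):
    """Return (paired, orphans): mates present in both inputs, and
    reads whose mate is missing. Holds all of records2 in memory."""
    by_id2 = {read_id(r[0]): r for r in records2}
    paired, orphans, seen = [], [], set()
    for r1 in records1:
        rid = read_id(r1[0])
        mate = by_id2.get(rid)
        if mate is None:
            orphans.append(r1)
        else:
            paired.append((r1, mate))
            seen.add(rid)
    orphans.extend(r for rid, r in by_id2.items() if rid not in seen)
    return paired, orphans
```

Usage would be something like split_pairs(parse_fastq(open("read1.fastq")), parse_fastq(open("read2.fastq"))), then writing the paired records out in read1 order so both files stay in sync for velvet.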
All times are GMT -8.