SEQanswers

06-24-2013, 12:18 PM   #1
Guigra
Member
 
Location: Brazil

Join Date: Apr 2013
Posts: 17
How to remove reads contaminants?

I'm working with a plant genome. By aligning my reads against the chloroplast and mitochondrial genome sequences, I realized that there are contaminants in the sequencing data.
How can I remove them?
06-24-2013, 04:20 PM   #2
jimmybee
Senior Member
 
Location: Adelaide, Australia

Join Date: Sep 2010
Posts: 119

What are you trying to do? Do you just want the cp and mtDNA?

Why do you say you have contaminants?
06-24-2013, 09:17 PM   #3
hanshart
Member
 
Location: Germany

Join Date: Nov 2011
Posts: 27

Quote:
Originally Posted by Guigra
... realized that there are contaminants in sequencing. How to remove them?

Hi Guigra,
without a deep understanding of your problem: if you know the type of contaminant, you can always build an index of its corresponding genome/identifier sequences and map all reads to this index first. The unmapped reads can then be used for the mapping against the genome. But I'm not sure if this is the answer you were looking for.
07-02-2013, 05:57 AM   #4
Guigra
Member
 
Location: Brazil

Join Date: Apr 2013
Posts: 17

Hi hanshart,

That is exactly what I want. How do I do that?
07-02-2013, 07:16 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,079

Making an index of the contaminants should be straightforward. Once you have the BAM files from those alignments, you can recover the unmapped reads by following the suggestions in these threads: http://seqanswers.com/forums/showthread.php?t=12283 and http://seqanswers.com/forums/showthread.php?t=30528
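
As a rough illustration of what "recover the unmapped reads" means: in SAM/BAM, an unmapped read has bit 0x4 set in its FLAG field (with samtools this is usually selected with `samtools view -f 4`). The sketch below does the same selection on plain SAM text using only awk and writes the reads back out as FASTQ. The function name `sam_unmapped_to_fastq` and the file names are my own placeholders, and it assumes single-end reads with base qualities present.

```sh
# Hypothetical helper (not part of samtools): read SAM text on stdin,
# write the unmapped reads as FASTQ on stdout.
# Assumes single-end reads and that the QUAL column is not "*".
sam_unmapped_to_fastq() {
    awk -F '\t' '
        /^@/ { next }                 # skip SAM header lines
        int($2 / 4) % 2 == 1 {        # FLAG bit 0x4 set: read is unmapped
            printf "@%s\n%s\n+\n%s\n", $1, $10, $11
        }
    '
}

# Example (file names are placeholders):
#   samtools view contaminants.bam | sam_unmapped_to_fastq > NO_CONTAMINANTS.fastq
```

The reads written this way are the ones that did not hit the contaminant index, so they are the ones to carry forward.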
07-02-2013, 11:04 AM   #6
hanshart
Member
 
Location: Germany

Join Date: Nov 2011
Posts: 27

An example for Bowtie:

1. Determine the sequences of your contaminants and write them to a FASTA file such as:

Quote:
>seq1
ACGT...
>seq2
GCAG...

(or directly use an available FASTA file describing your contaminants)

2. Build a Bowtie index of this FASTA file in a folder called IDX (or similar):

Quote:
bowtie-build FASTA IDX/contaminants_idx

3. Map your reads (READFILE) against this contaminant reference and write the unmapped reads (= non-contaminants) directly to a FASTQ file (NO_CONTAMINANTS.fastq) with the --un flag:

Quote:
bowtie --un NO_CONTAMINANTS.fastq IDX/contaminants_idx READFILE OUTPUTFILE

The reads in NO_CONTAMINANTS.fastq can finally be mapped against the reference of interest.

A non-Bowtie approach would follow the same steps: 1. build an index for the contaminants, 2. map against this index, 3. extract the unmapped reads from the alignment file to a new FASTQ file, and 4. use only the reads in the new file for the mapping against your reference.
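
Whichever mapper is used, it can be worth checking how many reads the filtering step actually removed. A minimal sketch, assuming uncompressed FASTQ files with exactly four lines per record; the helper name `count_fastq_reads` is hypothetical:

```sh
# Hypothetical helper: count the records in an uncompressed FASTQ file,
# assuming the usual four lines per read.
count_fastq_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Example (READFILE and NO_CONTAMINANTS.fastq as in the steps above):
#   count_fastq_reads READFILE
#   count_fastq_reads NO_CONTAMINANTS.fastq
```

The difference between the two counts is the number of reads that mapped to the contaminant index.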
07-02-2013, 11:26 AM   #7
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

As hanshart says, using bowtie (I actually use bowtie2) is a good -- and easy -- method.
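
For reference, a sketch of how the same three steps might look with bowtie2, assuming single-end reads in FASTQ format. The wrapper function `remove_contaminants_bowtie2`, the IDX folder and the output name contaminant_hits.sam are placeholders of mine; the options used (`-x`, `-U`, `--un`, `-S`) are standard bowtie2 options.

```sh
# Hypothetical wrapper around the bowtie2 equivalent of the Bowtie steps.
# Arguments: contaminant FASTA, single-end FASTQ reads, output FASTQ for clean reads.
remove_contaminants_bowtie2() {
    contaminants_fasta=$1
    reads_fastq=$2
    clean_fastq=$3

    # The index folder must exist before bowtie2-build writes into it.
    mkdir -p IDX || return 1
    # Build the bowtie2 index of the contaminant sequences.
    bowtie2-build "$contaminants_fasta" IDX/contaminants_idx || return 1
    # Map the reads; --un writes the reads that failed to align,
    # i.e. the non-contaminants, to the given FASTQ file.
    bowtie2 -x IDX/contaminants_idx -U "$reads_fastq" \
        --un "$clean_fastq" -S contaminant_hits.sam
}

# Example (file names are placeholders):
#   remove_contaminants_bowtie2 contaminants.fasta READFILE NO_CONTAMINANTS.fastq
```

For paired-end data, bowtie2 has a corresponding `--un-conc` option for pairs that fail to align concordantly; the sketch above does not cover that case.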
07-03-2013, 09:04 AM   #8
Guigra
Member
 
Location: Brazil

Join Date: Apr 2013
Posts: 17

Thank you all. You were of great help!