SEQanswers

Old 06-08-2010, 06:47 AM   #1
gzentner
Junior Member
 
Location: Cleveland, OH, USA

Join Date: Jun 2010
Posts: 5
Default Aligning only unique reads in Bowtie

Hi everyone,

I am aligning some ChIP-seq data using Bowtie. I have been using the -m option to throw out any reads with more than one reportable alignment, but I would also like to try omitting non-unique reads, i.e. reads whose sequences are identical to one another. Is there a command-line option to throw out any reads that are identical?

Thanks!
Old 06-08-2010, 08:53 AM   #2
john_mu
Member
 
Location: Stanford, CA

Join Date: May 2010
Posts: 88
Default

If you just have the raw reads, you can use the "uniq" command in Linux to extract the unique reads (after sorting).

http://en.wikipedia.org/wiki/Uniq
__________________
SpliceMap: De novo detection of splice junctions from RNA-seq
Download SpliceMap Comment here
Old 06-08-2010, 03:27 PM   #3
gzentner
Junior Member
 
Location: Cleveland, OH, USA

Join Date: Jun 2010
Posts: 5
Default

Thanks John!

That sounds like it should be a useful command. I was just wondering if you could give me a little more detail.

I have the ChIP-seq data as FASTQ files, which I align using bowtie. Would I use the uniq command on the FASTQ file prior to alignment to generate another FASTQ containing only unique reads?

i.e., prior to alignment, run uniq -u on the FASTQ?

Thanks!
Old 06-08-2010, 03:35 PM   #4
john_mu
Member
 
Location: Stanford, CA

Join Date: May 2010
Posts: 88
Default

No worries, but the method I suggested is a bit of a hack... It will require you to fiddle with the data a bit.

Firstly, do you need to preserve the read-quality information? If so, then it is probably best to write your own Python or Perl script to do it. I'm pretty sure there are existing tools to do this though... I just can't recall them off the top of my head.
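As a rough illustration of what such a script might look like, here is a minimal Python sketch that removes duplicate reads while keeping the header and quality lines. It assumes plain four-line FASTQ records (no wrapped sequences) and holds the whole file in memory; the function names and the `keep_first` switch are hypothetical and not part of any existing tool.

```python
from collections import Counter
from itertools import islice


def read_fastq(lines):
    """Yield (header, sequence, plus, quality) tuples from FASTQ lines.

    Assumes simple four-line records; incomplete trailing records are ignored.
    """
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if len(record) < 4:
            return
        yield tuple(record)


def drop_duplicate_reads(records, keep_first=True):
    """Remove reads whose sequence occurs more than once.

    keep_first=True  keeps the first record of each duplicated sequence.
    keep_first=False discards every copy of a duplicated sequence.
    """
    records = list(records)
    counts = Counter(rec[1] for rec in records)
    seen = set()
    kept = []
    for rec in records:
        seq = rec[1]
        if counts[seq] > 1 and (not keep_first or seq in seen):
            continue  # a duplicate we do not want to keep
        seen.add(seq)
        kept.append(rec)
    return kept


if __name__ == "__main__":
    example = [
        "@r1\n", "ACGT\n", "+\n", "IIII\n",
        "@r2\n", "ACGT\n", "+\n", "HHHH\n",
        "@r3\n", "TTGA\n", "+\n", "IIII\n",
    ]
    for rec in drop_duplicate_reads(read_fastq(example)):
        print("\n".join(rec))
```

Which quality string to keep for a duplicated sequence is a judgment call; this sketch simply keeps the first one it encounters.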

-----

The method I suggested is to first extract the raw reads from the FASTQ file by using the instructions here

http://www-stat.stanford.edu/~kinfai...ess.html#fastq

Then sort the reads with http://en.wikipedia.org/wiki/Sort_(Unix)

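sort input_file > sorted_file

Finally, use "uniq" on the sorted file (note that -u prints only the lines that are not repeated, so every copy of a duplicated read is discarded):

uniq -u sorted_file > output_file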

After you do this, you can align your reads using bowtie with the "-r" option for raw reads.
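The steps above (extract the sequences, sort, then run uniq) can be sketched in Python as follows. This is only an illustration under the assumption of four-line FASTQ records; the function names are made up for the example. It also contrasts `uniq -u`, which keeps only reads that occur exactly once, with plain `uniq`, which keeps one copy of every distinct read.

```python
from collections import Counter


def extract_sequences(fastq_lines):
    """Return the sequence line (2nd of every 4) from four-line FASTQ records."""
    return [line.strip() for i, line in enumerate(fastq_lines) if i % 4 == 1]


def uniq_u(sequences):
    """Mimic `sort | uniq -u`: keep only sequences that occur exactly once."""
    counts = Counter(sequences)
    return sorted(seq for seq, n in counts.items() if n == 1)


def uniq_collapse(sequences):
    """Mimic `sort | uniq`: keep one copy of every distinct sequence."""
    return sorted(set(sequences))


if __name__ == "__main__":
    fastq = [
        "@a", "ACGT", "+", "IIII",
        "@b", "ACGT", "+", "IIII",
        "@c", "GGCA", "+", "IIII",
    ]
    seqs = extract_sequences(fastq)
    print("uniq -u :", uniq_u(seqs))
    print("uniq    :", uniq_collapse(seqs))
```

Writing either result out one sequence per line gives a raw-reads file of the kind described for the "-r" option. Which behaviour is appropriate depends on whether duplicated reads should be dropped entirely or collapsed to a single copy.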
Old 06-10-2010, 09:48 PM   #5
lifeng.tian
Member
 
Location: Philadelphia

Join Date: Jul 2009
Posts: 16
Default

You can try fastx_collapser from http://hannonlab.cshl.edu/fastx_toolkit/
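To give a feel for what collapsing identical reads means, here is a small Python sketch of the general idea: each distinct sequence becomes a single FASTA record whose header carries its copy count. This is not the toolkit's implementation, and the `>rank-count` header naming used here is an assumption chosen for illustration.

```python
from collections import Counter


def collapse_to_fasta(sequences):
    """Collapse identical sequences into FASTA lines with a copy count.

    Each distinct sequence is emitted once, most abundant first, with a
    header of the form >rank-count (an illustrative naming scheme).
    """
    counts = Counter(sequences)
    # Most abundant first; ties broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    lines = []
    for rank, (seq, n) in enumerate(ranked, start=1):
        lines.append(f">{rank}-{n}")
        lines.append(seq)
    return lines


if __name__ == "__main__":
    reads = ["ACGT", "GGCA", "ACGT"]
    print("\n".join(collapse_to_fasta(reads)))
```

Keeping the copy count in the header means the duplication level of each read is not lost after collapsing.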
Old 09-17-2010, 07:37 AM   #6
sridharacharya
Member
 
Location: Institute, WV

Join Date: May 2010
Posts: 24
Default Re: Aligning only unique reads in Bowtie

I have a few questions about best practices, from a biological standpoint, for dealing with multiple alignments from a single read and with identical reads in the data:

How important is it to deal with identical reads? Does having many identical reads in the data mean that something went wrong with the experiment?

What would be a reasonable maximum cutoff for the number of identical reads, above which those reads should be discarded?

In the other case, where a single read aligns at multiple places in the genome, what should the cutoff be for the number of alignments, above which those reads should be discarded?
Old 07-07-2019, 07:07 AM   #7
brojee
Member
 
Location: Bhopal

Join Date: Jul 2019
Posts: 19
Default

No worries, but the method I suggested is a bit of a hack... It will require you to fiddle with the data a bit.

Firstly, do you need to preserve the read-quality information? If so, then it is probably best to write your own Python or Perl script to do it. I'm pretty sure there are existing tools to do this though... I just can't recall them off the top of my head.
Tags
alignment, bowtie, chip-seq, unique reads
