SEQanswers

Old 12-30-2015, 11:28 AM   #1
is007
Junior Member
 
Location: USA

Join Date: Oct 2015
Posts: 2
htseq-count overlap modes

Hi! I'm using htseq-count to count the reads from my RNA-Seq experiment that map to genes. I have an alignment file where all reads are mapped to the genome, and now I am just counting reads for each gene. It seems to work fine, but I'm trying to understand a specific detail about how it works. Basically, it has 3 overlap modes and I am using 'union':

[Figure: htseq-count overlap modes, from the documentation linked below]

This is the documentation: http://www-huber.embl.de/users/ander...doc/count.html

I'm wondering if anyone knows to what extent it counts reads that only partially overlap a gene (2nd row of the figure), i.e. by how much must a read overlap for it to be counted?

The figure implies it counts partial overlaps too, but does not specify what exactly the minimum overlap needs to be. I'm not finding this information in the documentation.

Thanks!
Old 12-30-2015, 02:30 PM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

A one base overlap counts as an overlap.
Old 12-30-2015, 02:53 PM   #3
is007
Junior Member
 
Location: USA

Join Date: Oct 2015
Posts: 2

Thanks, looks like you're right. I've come to the same conclusion just now inspecting their python source code. How's that meaningful?

I guess if there's a very small overlap (extreme case: overlapping just the 'A' of 'ATG' at the start of a gene), then the read is likely overlapping multiple genes, so it's classified as 'ambiguous' and not counted?
Old 12-31-2015, 01:57 PM   #4
fanli
Senior Member
 
Location: California

Join Date: Jul 2014
Posts: 198

Yes, but that depends on how dense the genes are in your genome and how long your reads are.
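
To make the dependence on gene density and read length concrete, here is a rough sketch that counts how many read start positions would touch two neighbouring genes at once. The coordinates, the half-open interval convention, and the function name `count_ambiguous_starts` are all made up for illustration; none of this is htseq-count code.

```python
def count_ambiguous_starts(read_len, gene_a, gene_b):
    """Count read start positions where a read overlaps both genes.

    Genes are hypothetical (start, end) tuples, 0-based and half-open.
    A read starting at s covers [s, s + read_len).
    """
    count = 0
    # Scan every start position that could possibly touch either gene.
    for s in range(gene_a[0] - read_len, gene_b[1] + 1):
        e = s + read_len
        hits_a = s < gene_a[1] and e > gene_a[0]  # at least one shared base
        hits_b = s < gene_b[1] and e > gene_b[0]
        if hits_a and hits_b:
            count += 1
    return count


# Two genes 10 bp apart (dense) versus 500 bp apart (sparse).
dense = ((1000, 2000), (2010, 3000))
sparse = ((1000, 2000), (2500, 3500))

for read_len in (50, 100):
    print(read_len,
          count_ambiguous_starts(read_len, *dense),
          count_ambiguous_starts(read_len, *sparse))
```

With these toy numbers, closely spaced genes and longer reads both increase the number of placements that hit two genes, while genes 500 bp apart are never bridged by a 50 or 100 bp read.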
Old 01-02-2016, 09:12 AM   #5
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Quote:
Originally Posted by is007
Thanks, looks like you're right. I've come to the same conclusion just now inspecting their python source code. How's that meaningful?
There's no real meaning there; it just falls out of how the sets of overlapping features are dealt with.
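
For anyone reading later, the set handling can be sketched roughly as follows. This follows the description in the htseq-count documentation (for each read position, take the set of features covering it, then take the union of those sets), but it is a simplified illustration and not the actual HTSeq source. The function name `classify_read`, the gene coordinates, and the 0-based half-open intervals are assumptions of this sketch.

```python
def classify_read(read_start, read_end, features):
    """Classify one read under 'union'-style rules.

    Coordinates are 0-based, half-open [start, end). `features` maps a
    gene name to a list of (start, end) exon intervals. Both conventions
    are assumptions made for this sketch.
    """
    hits = set()
    for pos in range(read_start, read_end):
        # S(i): every feature that covers this single position.
        s_i = {name for name, exons in features.items()
               if any(start <= pos < end for start, end in exons)}
        hits |= s_i  # union mode: merge the per-position sets
    if not hits:
        return "__no_feature"
    if len(hits) == 1:
        return next(iter(hits))  # exactly one feature: the read counts for it
    return "__ambiguous"


# Hypothetical annotation: two genes separated by a 5 bp gap.
genes = {"geneA": [(100, 200)], "geneB": [(205, 300)]}

print(classify_read(60, 101, genes))   # touches geneA by a single base
print(classify_read(199, 206, genes))  # touches both genes
print(classify_read(10, 50, genes))    # touches nothing
```

Because a single covered position is enough to put a gene into the set, a one-base overlap is treated the same as a full overlap; there is no minimum-overlap threshold anywhere in this logic.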