**gringer** · 04-11-2014, 07:00 AM

So, um, the title of this is "Read Counter accounting for Overlapping Genes", but in the description of your program, I see this:

since expression counts cannot be unambiguously defined in regions where genes are overlapping, ReCOG does not count read-pairs mapped to these regions.

Accounting for things by not counting them is a bit of an oxymoron. I would advise you to either change the name of the program, or change that description to be a bit more in tune with the program name.

Also, the description of this algorithm seems to be similar to HTSeq-count in union mode:

http://www-huber.embl.de/users/anders/HTSeq/doc/count.html#count

Which is itself a toy demonstration of what can be done with a little bit of python programming in combination with HTSeq:

http://www-huber.embl.de/users/anders/HTSeq/doc/tour.html#counting-reads-by-genes

So... I notice that you're using pysam (from the looks of the files in the archive) just like HTSeq. What differentiates your program from HTSeq / HTSeq-count?

I also notice you're including a sample BAM file to test out the program, which makes it a 77MB download[!], rather than something a bit closer to HTSeq's 350kb installer. You should probably change that, and have a script do the download (if desired) after ReCOG is installed.
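For readers unfamiliar with the counting rule being compared here, the following is a minimal sketch of union-style assignment, in which a read that touches exactly one gene is counted for it and a read that touches several genes is set aside as ambiguous. It is an illustration only, not the code of HTSeq-count or ReCOG: the function names, the `(chrom, start, end)` tuples with half-open coordinates, and the `__ambiguous` / `__no_feature` labels are assumptions made for this example, and strand, splicing and read pairing are ignored.

```python
from collections import Counter

def assign_read(read, genes):
    """Assign one read to a gene, or flag it as ambiguous / no feature.

    read  -- hypothetical (chrom, start, end) tuple, half-open coordinates
    genes -- hypothetical dict: gene_id -> (chrom, start, end), half-open
    """
    chrom, start, end = read
    hits = set()
    for gene_id, (g_chrom, g_start, g_end) in genes.items():
        # Two half-open intervals overlap if each starts before the other ends.
        if g_chrom == chrom and g_start < end and start < g_end:
            hits.add(gene_id)
    if not hits:
        return "__no_feature"
    if len(hits) > 1:
        # The read touches more than one gene, so it is not counted for any.
        return "__ambiguous"
    return hits.pop()

def count_reads(reads, genes):
    """Tally the assignments for an iterable of reads."""
    return Counter(assign_read(read, genes) for read in reads)

# Toy data: geneA and geneB overlap between positions 400 and 500.
genes = {"geneA": ("chr1", 100, 500), "geneB": ("chr1", 400, 900)}
reads = [
    ("chr1", 120, 170),   # inside geneA only
    ("chr1", 450, 500),   # inside the overlap
    ("chr1", 600, 650),   # inside geneB only
    ("chr1", 950, 1000),  # outside both genes
    ("chr1", 380, 430),   # starts in geneA, runs into the overlap
]
counts = count_reads(reads, genes)
for name in sorted(counts):
    print(name, counts[name])
```

With this rule, reads in the shared region end up in the ambiguous bin rather than in either gene's count, which is the behaviour the quoted description refers to.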

**dpryan** · 04-11-2014, 08:56 AM

@David Eccles: It's like you read my mind.

I also wonder how this differs from just munging the annotation file so that it contains only the 5'-most and 3'-most bounds of each gene, and then using htseq-count.
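The annotation munging suggested above could look roughly like this. It is a sketch under stated assumptions: the input is tab-separated GTF with a `gene_id "..."` attribute, the function name `collapse_to_gene_spans` and the `collapsed` source label are made up for this example, and each gene is written out as a single `exon` record so that a counter keyed on exon features and `gene_id` would see one interval per gene. Note that this turns introns into countable gene body, which may or may not be what is wanted.

```python
import re

# Matches the gene_id attribute in the ninth GTF column.
GENE_ID = re.compile(r'gene_id "([^"]+)"')

def collapse_to_gene_spans(gtf_lines):
    """Reduce GTF records to one record per gene spanning its outermost bounds."""
    spans = {}  # (gene_id, chrom, strand) -> (min_start, max_end)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        chrom, _source, _feature, start, end, _score, strand, _frame, attrs = fields[:9]
        match = GENE_ID.search(attrs)
        if match is None:
            continue  # no gene_id, nothing to group on
        key = (match.group(1), chrom, strand)
        start, end = int(start), int(end)
        if key in spans:
            old_start, old_end = spans[key]
            spans[key] = (min(old_start, start), max(old_end, end))
        else:
            spans[key] = (start, end)

    collapsed = []
    for (gene_id, chrom, strand), (start, end) in spans.items():
        collapsed.append("\t".join([
            chrom, "collapsed", "exon", str(start), str(end),
            ".", strand, ".", f'gene_id "{gene_id}";',
        ]))
    return collapsed

example = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t300\t450\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t400\t600\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
]
for record in collapse_to_gene_spans(example):
    print(record)
```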

**eszter.ari** · 04-24-2014, 03:56 AM

Dear David,

Thanks for your remarks and questions. We will improve the description of the ReCOG script!

> Originally posted by gringer:
>
> So, um, the title of this is "Read Counter accounting for Overlapping Genes", but in the description of your program, I see this:
>
> Accounting for things by not counting them is a bit of an oxymoron. I would advise you to either change the name of the program, or change that description to be a bit more in tune with the program name.

You are absolutely right!

> Originally posted by gringer:
>
> Also, the description of this algorithm seems to be similar to HTSeq-count in union mode:
>
> http://www-huber.embl.de/users/anders/HTSeq/doc/count.html#count
>
> Which is itself a toy demonstration of what can be done with a little bit of python programming in combination with HTSeq:
>
> http://www-huber.embl.de/users/anders/HTSeq/doc/tour.html#counting-reads-by-genes
>
> So... I notice that you're using pysam (from the looks of the files in the archive) just like HTSeq. What differentiates your program from HTSeq / HTSeq-count?

The concepts behind HTSeq and ReCOG don't differ much. I first tried HTSeq-count and ran into some problems:

- HTSeq-count uses SAM files, which take up a lot of disk space. ReCOG uses BAM files.
- When a genome's annotation is not very advanced, so that it contains hundreds of "chromosomes" (contigs), HTSeq just doesn't work (at least for the D. simulans annotations from FlyBase).
- I found some cases where HTSeq gave counts that differed from what they should be (I checked this in IGV). I discussed these problems with other researchers, and they had also found examples where HTSeq gave wrong counts.

> Originally posted by gringer:
>
> I also notice you're including a sample BAM file to test out the program, which makes it a 77MB download[!], rather than something a bit closer to HTSeq's 350kb installer. You should probably change that, and have a script do the download (if desired) after ReCOG is installed.

This is also useful advice!

Bests,
Eszter

ReCOG - Read Counter accounting for Overlapping Genes