Remove overlapping feature from GFF

#1, 06-20-2014, 04:33 AM

I'm analysing some stranded RNA-seq data and trying to identify genuine antisense transcripts. What I'm finding with my default STAR / htseq-count pipeline is that htseq-count (quite correctly) reports antisense transcription for features that overlap each other on opposite strands.

The 3' ends of TLDC2 and SAMHD1 are a good example of such a scenario:

Ensembl region

Sense transcription from SAMHD1 is appearing as antisense transcription from TLDC2 in the htseq-count output. Whilst this is technically correct, it means that if SAMHD1 (sense) is differentially expressed, I also see TLDC2 antisense as differentially expressed. The entire signal is driven by the SAMHD1 overlap, so I do not believe the TLDC2 call is biologically meaningful.
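To make the effect concrete, here is a minimal Python sketch of how one stranded read can be sense for one gene and antisense for its opposite-strand neighbour. The coordinates and the names `gene_plus` and `gene_minus` are made up for illustration (they are not the real TLDC2/SAMHD1 positions), and the read strand is assumed to be already corrected for the library protocol. This is not how htseq-count is implemented, only the underlying logic.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one base."""
    return a.start <= b.end and b.start <= a.end


def classify_read(read: Interval, gene: Interval) -> str:
    """Label a read relative to a gene: sense, antisense or no_overlap."""
    if not overlaps(read, gene):
        return "no_overlap"
    return "sense" if read.strand == gene.strand else "antisense"


# Hypothetical pair of genes whose 3' ends overlap on opposite strands.
gene_plus = Interval(1000, 5000, "+")
gene_minus = Interval(4500, 9000, "-")

# A read transcribed from gene_plus that falls inside the shared region.
read = Interval(4600, 4700, "+")

print(classify_read(read, gene_plus))   # sense
print(classify_read(read, gene_minus))  # antisense
```

The same read is counted once as sense and once as antisense, so any change in expression of the plus-strand gene shows up as an apparent antisense change on the minus-strand gene.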

From reading the htseq-count manual again, I don't see an option to treat reads mapping to these kinds of overlaps as ambiguous: the union/intersection_strict/intersection_nonempty modes seem to only consider features on the same strand.

So my question is: a) is there some clever way I am missing in htseq-count to treat features on opposite strands as overlapping (and thereby ignore the reads)? and b) if not, is there any tool out there that can filter overlapping features from a GFF file and give me a reduced GFF with these regions excluded?
ruggedtextile
#2, 06-20-2014, 05:23 AM
The answer to b) appears to be:

bedtools intersect -S -v -a genes_index.gff -b genes_index.gff >
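For anyone wanting to see what that filter does: `-S` restricts hits to features on the opposite strand and `-v` reports only the entries in A that have no such hit. Below is a rough pure-Python sketch of the same logic on toy data. The `Feature` type, the function names and the coordinates are all hypothetical, and treating unstranded (`.`) features as never matching is an assumption of this sketch rather than a statement about bedtools.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    chrom: str
    start: int   # 1-based, inclusive (GFF convention)
    end: int     # 1-based, inclusive
    strand: str  # "+", "-" or "."
    name: str


def overlaps_on_opposite_strand(a: Feature, b: Feature) -> bool:
    """True if a and b share at least one base and lie on opposite strands."""
    if a.chrom != b.chrom:
        return False
    # Assumption: only "+" versus "-" counts as opposite strands.
    if {a.strand, b.strand} != {"+", "-"}:
        return False
    return a.start <= b.end and b.start <= a.end


def without_antisense_overlap(features: List[Feature]) -> List[Feature]:
    """Keep features that no other feature overlaps on the opposite strand."""
    return [
        a for a in features
        if not any(overlaps_on_opposite_strand(a, b) for b in features)
    ]


# Made-up annotation: geneA/geneB overlap tail to tail on opposite strands,
# geneC/geneD overlap on the same strand.
toy = [
    Feature("chr1", 100, 500, "+", "geneA"),
    Feature("chr1", 450, 900, "-", "geneB"),
    Feature("chr1", 2000, 3000, "+", "geneC"),
    Feature("chr1", 2500, 2800, "+", "geneD"),
]

kept = without_antisense_overlap(toy)
print([f.name for f in kept])  # ['geneC', 'geneD']
```

Note that this drops the whole feature whenever any part of it overlaps something on the other strand; it does not trim out just the overlapping region.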
ruggedtextile

htseq-count, rnaseq, strand specific
