SEQanswers

Old 02-19-2016, 06:48 PM   #1
tirohia
Member
 
Location: Auckland, NZ

Join Date: Nov 2011
Posts: 46
Default Count reads mapping to non-feature area

I have a BAM file containing Arabidopsis short RNA reads that I have mapped to the TAIR10 genome. Is there an easy way to get counts of reads that overlap each other, along with the region/sequence that they overlap? Essentially, I want to group the reads in each BAM file by the region of the genome they map to and get a pseudo-count per group. (Don't worry, I'm not using this for anything serious; I'm just exploring at the moment.)

I mapped them to the sRNA from miRBase and essentially nothing maps (0 to 20 reads). A reasonable number of reads map to the genome, so I want to find areas where significant numbers of those reads cluster.

I have a nagging feeling that this is a silly question with an easy answer, but the only way I can think of doing it at the moment is to align all of the mapped sequences against each other and count how many have decent overlaps. Which seems silly.

Cheers
Ben.
tirohia is offline   Reply With Quote
Old 02-19-2016, 06:58 PM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,046
Default

http://bedtools.readthedocs.org/en/l...genomecov.html

or featureCounts: http://bioinf.wehi.edu.au/featureCounts/
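
To make the grouping in the question concrete, here is a minimal pure-Python sketch of collapsing overlapping reads into clusters and counting the reads in each. It is not how bedtools or featureCounts work internally. The function name `cluster_reads`, the `Cluster` tuple and the toy intervals are made up for illustration, and extracting (chrom, start, end) from the BAM file is assumed to have been done already.

```python
from collections import namedtuple

# Hypothetical record type for this sketch: one merged region plus its read count.
Cluster = namedtuple("Cluster", ["chrom", "start", "end", "count"])

def cluster_reads(reads):
    """Group overlapping reads into clusters and count the reads in each.

    `reads` is an iterable of (chrom, start, end) tuples in 0-based,
    half-open coordinates (the BED convention).
    """
    clusters = []
    # Sorting by chromosome, then start, lets us merge in a single pass.
    for chrom, start, end in sorted(reads):
        last = clusters[-1] if clusters else None
        if last and last.chrom == chrom and start < last.end:
            # Read overlaps the current cluster: extend it and bump the count.
            clusters[-1] = Cluster(chrom, last.start, max(last.end, end), last.count + 1)
        else:
            # No overlap: start a new cluster.
            clusters.append(Cluster(chrom, start, end, 1))
    return clusters

# Toy intervals, invented for illustration only.
reads = [
    ("Chr1", 100, 121),
    ("Chr1", 110, 131),
    ("Chr1", 125, 146),
    ("Chr1", 500, 521),
    ("Chr2", 100, 121),
]

for c in cluster_reads(reads):
    print(c.chrom, c.start, c.end, c.count, sep="\t")
```

Reads that merely touch end-to-start are kept in separate clusters here; that is a choice of this sketch, and other tools may treat adjacent intervals differently.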
Old 02-21-2016, 02:04 AM   #3
Michael.Ante
Senior Member
 
Location: Vienna

Join Date: Oct 2011
Posts: 123
Default

Hi Ben,
you can also try htseq-count here.
If your transcript/gene annotation has additional information like gene_biotype, you can count how many (uniquely mapping) reads map to protein-coding genes, etc. All reads which do not overlap with your annotation are counted as "no-feature".
Cheers
Michael
Old 02-23-2016, 04:44 PM   #4
tirohia
Member
 
Location: Auckland, NZ

Join Date: Nov 2011
Posts: 46
Default

Thanks Geno - I've managed to get something sort of close to what I'm trying to do out of bedtools - I should have thought of that.

Michael - I'm actually looking for areas with no annotation that have disproportionately large numbers of reads, which, if I understand HTSeq correctly, would all just get lumped into the no-feature category.

Ben.
Old 02-23-2016, 05:05 PM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,046
Default

Did you create an "anti-bed" file (sort of like the whole genome minus the real BED) to get the regions you are interested in?
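
A rough sketch of the "anti-bed" idea in plain Python: given annotated features and chromosome lengths, return everything that is not annotated. The function name `complement`, the chromosome lengths and the feature coordinates are all hypothetical placeholders, not real TAIR10 values.

```python
def complement(features, chrom_sizes):
    """Return the regions of each chromosome NOT covered by any feature.

    `features`: iterable of (chrom, start, end), 0-based half-open.
    `chrom_sizes`: dict mapping chromosome name to its length.
    """
    by_chrom = {}
    for chrom, start, end in features:
        by_chrom.setdefault(chrom, []).append((start, end))

    gaps = []
    for chrom, size in chrom_sizes.items():
        cursor = 0  # first position not yet known to be annotated
        for start, end in sorted(by_chrom.get(chrom, [])):
            if start > cursor:
                gaps.append((chrom, cursor, start))
            cursor = max(cursor, end)
        if cursor < size:
            # Unannotated tail of the chromosome.
            gaps.append((chrom, cursor, size))
    return gaps

# Hypothetical chromosome lengths and features, for illustration only.
chrom_sizes = {"Chr1": 1000, "Chr2": 500}
features = [("Chr1", 100, 300), ("Chr1", 250, 400), ("Chr1", 800, 1000)]

anti_bed = complement(features, chrom_sizes)
for chrom, start, end in anti_bed:
    print(chrom, start, end, sep="\t")
```

Reads can then be counted against these gap intervals instead of against the annotation.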

You could also use some kind of moving window read count and then exclude regions that are coding etc. See this thread for some inspiration: https://www.biostars.org/p/58781/
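
A minimal sketch of the window-count idea, assuming fixed, non-overlapping windows as the simplest stand-in for a true moving window. Each read is assigned to the window containing its start, and windows that overlap any annotated feature are dropped. The function names, the window size, the `min_reads` threshold and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter

def window_counts(reads, window=100):
    """Count reads per fixed-size window, keyed by (chrom, window_start).

    Each read is assigned to the window that contains its start position.
    """
    counts = Counter()
    for chrom, start, _end in reads:
        counts[(chrom, (start // window) * window)] += 1
    return counts

def overlaps_any(chrom, start, end, features):
    """True if [start, end) on `chrom` overlaps any (chrom, start, end) feature."""
    return any(c == chrom and s < end and start < e for c, s, e in features)

def unannotated_hotspots(reads, features, window=100, min_reads=3):
    """Windows with at least `min_reads` reads that overlap no annotated feature."""
    hits = []
    for (chrom, w_start), n in sorted(window_counts(reads, window).items()):
        w_end = w_start + window
        if n >= min_reads and not overlaps_any(chrom, w_start, w_end, features):
            hits.append((chrom, w_start, w_end, n))
    return hits

# Toy data, invented for illustration only.
reads = [
    ("Chr1", 105, 126), ("Chr1", 110, 131), ("Chr1", 150, 171),  # inside a feature
    ("Chr1", 620, 641), ("Chr1", 630, 651), ("Chr1", 640, 661),  # unannotated cluster
    ("Chr1", 900, 921),                                          # lone read
]
features = [("Chr1", 100, 300)]

hotspots = unannotated_hotspots(reads, features)
for chrom, start, end, n in hotspots:
    print(chrom, start, end, n, sep="\t")
```

The linear scan in `overlaps_any` is fine for a toy example; for a whole genome an interval index or a dedicated tool would be the more sensible choice.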
Old 02-23-2016, 10:23 PM   #6
Michael.Ante
Senior Member
 
Location: Vienna

Join Date: Oct 2011
Posts: 123
Default

Hi Ben,
First, I like GenoMax's idea.
If you have non-annotated clusters attracting a lot of reads, you may want to look at RepeatMasker. This tool gives you regions defined by repetitive sequences: inter alia rRNA, simple repeats, tandem repeats, and many more.
AFAIK, there's a precomputed annotation available for Arabidopsis. If you can't find it, you need to download the software and run it on your genome sequence.
Cheers
Michael

Tags
srna
