SEQanswers

Old 01-30-2014, 06:47 AM   #1
roll
Member
 
Location: UK

Join Date: Aug 2009
Posts: 38
Default mouse mm9 miRNA in gtf format

Hi,
Where can I find mouse miRNA reference annotation for mm9 in GTF format?
I know of miRBase, but I could only find it there in GFF format.
I would like to use it with htseq-count.
Old 01-31-2014, 12:43 AM   #2
A.N.Other
Member
 
Location: London, UK

Join Date: Feb 2012
Posts: 25
Default

Ensembl will let you download whole legacy GTFs from their FTP (ftp://ftp.ensembl.org/pub/ --> choose the correct release --> gtf --> Mus musculus). You can filter out the miRNAs from the whole list, but if you're using htseq-count it might just be easier to give it the complete GTF and do the filtering afterwards.

Last edited by A.N.Other; 01-31-2014 at 03:58 AM.
Old 01-31-2014, 09:04 AM   #3
roll
Member
 
Location: UK

Join Date: Aug 2009
Posts: 38
Default

Quote:
Originally Posted by A.N.Other View Post
Ensembl will let you download whole legacy GTFs from their FTP (ftp://ftp.ensembl.org/pub/ --> choose the correct release --> gtf --> Mus musculus). You can filter out the miRNAs from the whole list, but if you're using htseq-count it might just be easier to give it the complete GTF and do the filtering afterwards.
What I am failing to understand is this: my boss wants a table like the following, with one row per miRNA and one column per condition:
miRNA    Cond1  Cond2  Cond3
mirna1   3      5      7
mirna2   2      0      6


When I downloaded the gene GTF from Ensembl, I only got counts gene-wise.

How can I obtain the counts for each miRNA under the 3 conditions (like the table I mentioned above)?
Old 01-31-2014, 10:08 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,060
Default

You are going to use "htseq-count" with the GFF file to get that table (http://www-huber.embl.de/users/ander...unt.html#count), after you filter the GFF file to leave only the miRNAs, as A.N.Other suggested.

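This is how you can do that on a Unix machine. Get the right GTF file for the build you used for your alignments (the example below uses release 74, which is GRCm38, so pick an NCBIM37 release instead if you aligned to mm9).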

Code:
$ wget ftp://ftp.ensembl.org/pub/release-74/gtf/mus_musculus/Mus_musculus.GRCm38.74.gtf.gz

$ gunzip Mus_musculus.GRCm38.74.gtf.gz

$ cat Mus_musculus.GRCm38.74.gtf | grep "miRNA" > file_name_with_miRNA.gtf

$ cat Mus_musculus.GRCm38.74.gtf | grep "snoRNA" > file_name_with_snoRNA.gtf

$ cat Mus_musculus.GRCm38.74.gtf | grep "lincRNA" > file_name_with_lincRNA.gtf

$ cat Mus_musculus.GRCm38.74.gtf | grep "snRNA" > file_name_with_snRNA.gtf
You can then use http://www.sequenceontology.org/cgi-bin/converter.cgi to convert the gtf files to gff.
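One caution with the grep approach: it keeps every line that contains the string miRNA anywhere, not only the records annotated as miRNA genes. A stricter filter, sketched below, matches the gene_biotype attribute in column 9 exactly. This assumes your GTF carries a gene_biotype attribute (check a few lines of your own file first); the toy records, the miRNA_pseudogene label and the file names are made up for illustration.

```shell
# Toy GTF records (made up; tab-separated, 9 columns).
printf '1\tmiRNA\texon\t100\t180\t.\t+\t.\tgene_id "G1"; gene_name "Mir1"; gene_biotype "miRNA";\n' > toy.gtf
printf '1\tmiRNA_pseudogene\texon\t300\t380\t.\t+\t.\tgene_id "G2"; gene_name "Mir2-ps"; gene_biotype "miRNA_pseudogene";\n' >> toy.gtf
printf '2\tprotein_coding\texon\t500\t900\t.\t-\t.\tgene_id "G3"; gene_name "Abc1"; gene_biotype "protein_coding";\n' >> toy.gtf

# Plain substring match: counts both the miRNA and the miRNA_pseudogene record.
grep -c 'miRNA' toy.gtf

# Exact match on the attribute in column 9: keeps only the miRNA record.
awk -F'\t' '$9 ~ /gene_biotype "miRNA";/' toy.gtf > toy_miRNA.gtf
wc -l < toy_miRNA.gtf
```

The same awk pattern works for the other RNA types by swapping the quoted biotype.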

Last edited by GenoMax; 02-02-2014 at 04:43 AM. Reason: Added other RNA types
Old 02-02-2014, 01:44 AM   #5
roll
Member
 
Location: UK

Join Date: Aug 2009
Posts: 38
Default

Quote:
Originally Posted by GenoMax View Post
You are going to use "HTSeq-count" with the GFF file to get that table: http://www-huber.embl.de/users/ander...unt.html#count after you filter the GFF file leaving only miRNA's like A.N.Other suggested.

This is how you can do that from a unix machine. Get the right GTF file for the build you used for your alignments.

Code:
$ wget ftp://ftp.ensembl.org/pub/release-74/gtf/mus_musculus/Mus_musculus.GRCm38.74.gtf.gz

$ gunzip Mus_musculus.GRCm38.74.gtf.gz

$ cat Mus_musculus.GRCm38.74.gtf | grep "miRNA" > file_name_with_miRNA.gtf
You can then use http://www.sequenceontology.org/cgi-bin/converter.cgi to convert the gtf file to gff.
My understanding is that if I do the above, I will get the miRNA count per gene, whereas I am interested in counting each miRNA type in each sample, like the table above. Is it possible to do that with htseq-count?
Old 02-02-2014, 02:07 AM   #6
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

How easy this is depends on how the annotation is constructed. htseq-count (or featureCounts, for a generally quicker alternative) will count according to whatever feature you tell it to. So, if your annotation has a field that specifies the miRNA type, then htseq-count can be told to count according to that.
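To see which field is available to count by, it helps to look at the attribute column of the annotation. The sketch below lists the distinct values of one attribute key for one feature type. The toy lines imitate the general layout of a miRBase GFF3 file (feature types miRNA_primary_transcript and miRNA, with a Name attribute), but the IDs, names, coordinates and file name are made up, and list_attr is a hypothetical helper.

```shell
# Toy GFF3 records (made up; tab-separated, 9 columns).
printf 'chr1\t.\tmiRNA_primary_transcript\t100\t180\t.\t+\t.\tID=pre1;Name=mmu-mir-toy1\n' > toy.gff3
printf 'chr1\t.\tmiRNA\t105\t126\t.\t+\t.\tID=mat1;Name=mmu-miR-toy1-5p;Derives_from=pre1\n' >> toy.gff3
printf 'chr1\t.\tmiRNA\t150\t171\t.\t+\t.\tID=mat2;Name=mmu-miR-toy1-3p;Derives_from=pre1\n' >> toy.gff3

# Print the distinct values of attribute KEY for records of feature type TYPE.
list_attr() {  # usage: list_attr FILE TYPE KEY
    awk -F'\t' -v type="$2" -v key="$3" '
        $3 == type {
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++) {
                eq = index(attrs[i], "=")
                if (substr(attrs[i], 1, eq - 1) == key)
                    print substr(attrs[i], eq + 1)
            }
        }' "$1" | sort -u
}

list_attr toy.gff3 miRNA Name
# With a real annotation and alignment file, the matching call would be along
# the lines of:  htseq-count -t miRNA -i Name sample.sam annotation.gff3
```

The feature type and attribute key picked this way are what htseq-count takes through its -t and -i options.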

One caveat is that it will still ignore multimappers, which may be quite prevalent in your situation. The proper handling would be to increment the count of the feature by one if all of the multiple mappings of a particular read fall only in a single feature. I don't think htseq-count will do that for you.
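The multimapper rule described above can be sketched on a hypothetical intermediate table with one line per alignment: the read ID, then the feature that alignment overlaps (or none). Producing that table from the alignments is not shown here, and the file names and layout are assumptions made only for illustration.

```shell
# One line per alignment: read ID <TAB> overlapped feature (toy data).
printf 'r1\tmiR-a\nr1\tmiR-a\nr2\tmiR-a\nr2\tmiR-b\nr3\tmiR-b\nr4\tmiR-b\nr4\tnone\n' > hits.tsv

awk -F'\t' '
    !($1 in first)  { first[$1] = $2; ok[$1] = 1 }  # first alignment seen for this read
    $2 != first[$1] { ok[$1] = 0 }                  # a later alignment hits something else
    END {
        # Count a read once if all of its alignments fall in the same feature.
        for (r in first)
            if (ok[r] && first[r] != "none")
                count[first[r]]++
        for (f in count)
            print f "\t" count[f]
    }' hits.tsv | sort > unique_feature_counts.tsv

cat unique_feature_counts.tsv
```

Here r1 (two hits, both in miR-a) and r3 (a single hit in miR-b) are counted, while r2 (hits in two different features) and r4 (one hit outside any feature) are dropped.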
Old 02-02-2014, 02:25 AM   #7
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

I should add that featureCounts might allow this (at least Wei Shi has mentioned in the past that this is the case).
Old 02-02-2014, 05:05 AM   #8
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,060
Default

HTSeq will not make that table for you. After taking into consideration Devon's suggestion you can easily create the table yourself once you have the counts for each sample.
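A minimal sketch of building that table, assuming one two-column, tab-separated count file per sample (feature name, then count), which is the layout htseq-count writes. The file names, the .counts suffix and the merge_counts helper are made up for illustration. The summary rows at the end of each file (no_feature, ambiguous and so on, with or without a leading double underscore depending on the HTSeq version) are skipped.

```shell
# Toy per-sample count files (made up).
printf 'mirna1\t3\nmirna2\t2\n__no_feature\t10\n' > Cond1.counts
printf 'mirna1\t5\nmirna2\t0\n__no_feature\t4\n'  > Cond2.counts
printf 'mirna1\t7\nmirna2\t6\n__no_feature\t9\n'  > Cond3.counts

merge_counts() {  # usage: merge_counts FILE...
    awk -F'\t' '
        # New input file: start a new column named after the file.
        FNR == 1 { nfiles++; name = FILENAME; sub(/\.counts$/, "", name); header = header "\t" name }
        # Skip the summary rows.
        $1 ~ /^(__)?(no_feature|ambiguous|too_low_aQual|not_aligned|alignment_not_unique)$/ { next }
        {
            if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }  # keep first-seen row order
            cnt[$1, nfiles] = $2
        }
        END {
            print "miRNA" header
            for (i = 1; i <= n; i++) {
                line = order[i]
                for (j = 1; j <= nfiles; j++)
                    line = line "\t" (((order[i], j) in cnt) ? cnt[order[i], j] : 0)
                print line
            }
        }' "$@"
}

merge_counts Cond1.counts Cond2.counts Cond3.counts > count_table.tsv
cat count_table.tsv
```

A feature missing from one sample's file is filled in as 0 for that column.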

Last edited by GenoMax; 02-04-2014 at 08:27 AM. Reason: correction
Old 02-03-2014, 06:03 AM   #9
sudders
Member
 
Location: Sheffield, UK

Join Date: Dec 2011
Posts: 32
Default

I believe the Kraken pipeline will process miRNA-seq data into the format you require; it includes options to summarise counts per gene or per mature miRNA.
Old 02-04-2014, 07:35 AM   #10
roll
Member
 
Location: UK

Join Date: Aug 2009
Posts: 38
Default

Quote:
Originally Posted by GenoMax View Post
HTSeq (or featureCounts) will not make that table for you. After taking into consideration Devon's suggestion you can easily create the table yourself once you have the counts for each sample.
I used featureCounts with the GFF file from miRBase. After selecting the right options, it returned the format I wanted (the count of each miRNA in each sample). Do you think there is a mistake?
Old 02-04-2014, 08:27 AM   #11
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,060
Default

I should not have included featureCounts in post #8. My apologies (post amended).

Devon (post #7) had said that featureCounts may do this (and it indeed seems to). He has more experience with analysis.
Tags
download, gtf, mirna, mouse
