  • mouse mm9 miRNA in gtf format

    Hi,
    Where can I find mouse miRNA reference annotation for mm9 in GTF format?
    I know miRBase, but I could only find it there in GFF format.
    I would like to use it with htseq-count.

  • #2
    Ensembl lets you download complete legacy GTFs from their FTP site (ftp://ftp.ensembl.org/pub/ --> choose the correct release --> gtf --> Mus musculus). You can extract the miRNAs from the whole list, but if you're using htseq-count it might just be easier to give it the complete GTF and do the filtering afterwards.
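    A minimal sketch of the "filter afterwards" route, under two assumptions that are not stated above: the biotype sits in column 2 of the GTF (as in older Ensembl releases), and the count file has one `gene_id<TAB>count` row per gene. All file names and the two demo records are hypothetical.

```shell
# Hypothetical two-gene annotation and count file, for illustration only.
printf '1\tmiRNA\texon\t100\t180\t.\t+\t.\tgene_id "ENSMUSG_A"; gene_name "Mir_a";\n' > all_genes_demo.gtf
printf '1\tprotein_coding\texon\t500\t900\t.\t+\t.\tgene_id "ENSMUSG_B"; gene_name "Gene_b";\n' >> all_genes_demo.gtf
printf 'ENSMUSG_A\t12\nENSMUSG_B\t340\n' > all_genes_counts.txt

# 1. Collect the gene_id of every record whose biotype (column 2) is miRNA.
awk -F '\t' '$2 == "miRNA"' all_genes_demo.gtf \
    | sed 's/.*gene_id "\([^"]*\)".*/\1/' \
    | sort -u > mirna_gene_ids.txt

# 2. Keep only the count rows whose first column is one of those IDs.
awk -F '\t' 'FILENAME == ARGV[1] { keep[$1] = 1; next } $1 in keep' \
    mirna_gene_ids.txt all_genes_counts.txt > mirna_counts.txt
```

    With the demo files, only the ENSMUSG_A row ends up in mirna_counts.txt.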
    Last edited by A.N.Other; 01-31-2014, 04:58 AM.


    • #3
      Originally posted by A.N.Other
      Ensembl lets you download complete legacy GTFs from their FTP site (ftp://ftp.ensembl.org/pub/ --> choose the correct release --> gtf --> Mus musculus). You can extract the miRNAs from the whole list, but if you're using htseq-count it might just be easier to give it the complete GTF and do the filtering afterwards.
      What I am failing to understand is this: my boss wants a table like the following, with one row per miRNA and one column per condition.

                Cond1  Cond2  Cond3
      mirna1    3      5      7
      mirna2    2      0      6

      When I use the gene.gtf downloaded from Ensembl, I only get counts per gene.

      How can I obtain the counts for each miRNA under the 3 conditions (as in the table above)?


      • #4
        You are going to use "HTSeq-count" with the GFF file to get that table: http://www-huber.embl.de/users/ander...unt.html#count after you filter the GFF file, leaving only the miRNAs, as A.N.Other suggested.

        This is how you can do that on a Unix machine. Get the right GTF file for the build you used for your alignments.

        Code:
        # Release 74 is GRCm38 (mm10); for mm9 (NCBIM37) pick release 67 or earlier.
        $ wget ftp://ftp.ensembl.org/pub/release-74/gtf/mus_musculus/Mus_musculus.GRCm38.74.gtf.gz
        
        $ gunzip Mus_musculus.GRCm38.74.gtf.gz
        
        # Keep only the lines that mention each RNA biotype, one output file per type.
        $ grep "miRNA" Mus_musculus.GRCm38.74.gtf > file_name_with_miRNA.gtf
        
        $ grep "snoRNA" Mus_musculus.GRCm38.74.gtf > file_name_with_snoRNA.gtf
        
        $ grep "lincRNA" Mus_musculus.GRCm38.74.gtf > file_name_with_lincRNA.gtf
        
        $ grep "snRNA" Mus_musculus.GRCm38.74.gtf > file_name_with_snRNA.gtf
        You can then use http://www.sequenceontology.org/cgi-bin/converter.cgi to convert the gtf files to gff.
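        A plain grep keeps any line that contains the string anywhere, including in a gene name. A slightly stricter variant, as a minimal sketch: it assumes the biotype is either in column 2 (older Ensembl GTFs) or in a `gene_biotype "..."` attribute in column 9. The function name `filter_biotype` and the three demo records are hypothetical.

```shell
# Print only the GTF records whose biotype field equals the requested value.
# usage: filter_biotype BIOTYPE FILE.gtf
filter_biotype() {
    awk -F '\t' -v bt="$1" '
        /^#/ { next }                                   # skip header comments
        $2 == bt || index($9, "gene_biotype \"" bt "\"") { print }
    ' "$2"
}

# Hypothetical demo records: two real miRNA records and one protein-coding
# gene whose name merely contains the string "miRNA".
printf '1\tmiRNA\texon\t100\t180\t.\t+\t.\tgene_id "G1"; gene_name "Mir_a";\n' > biotype_demo.gtf
printf '1\tprotein_coding\texon\t500\t900\t.\t+\t.\tgene_id "G2"; gene_name "miRNA_host_demo";\n' >> biotype_demo.gtf
printf '2\tensembl\texon\t10\t90\t.\t-\t.\tgene_id "G3"; gene_biotype "miRNA";\n' >> biotype_demo.gtf

filter_biotype miRNA biotype_demo.gtf > biotype_demo_miRNA.gtf
```

        On the demo file, grep "miRNA" would keep all three records, while filter_biotype keeps only G1 and G3.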
        Last edited by GenoMax; 02-02-2014, 05:43 AM. Reason: Added other RNA types


        • #5
          Originally posted by GenoMax
          You are going to use "HTSeq-count" with the GFF file to get that table: http://www-huber.embl.de/users/ander...unt.html#count after you filter the GFF file, leaving only the miRNAs, as A.N.Other suggested.

          This is how you can do that on a Unix machine. Get the right GTF file for the build you used for your alignments.

          Code:
          $ wget ftp://ftp.ensembl.org/pub/release-74/gtf/mus_musculus/Mus_musculus.GRCm38.74.gtf.gz
          
          $ gunzip Mus_musculus.GRCm38.74.gtf.gz
          
          $ grep "miRNA" Mus_musculus.GRCm38.74.gtf > file_name_with_miRNA.gtf
          You can then use http://www.sequenceontology.org/cgi-bin/converter.cgi to convert the gtf file to gff.
          My understanding is that if I do the above, I will get the miRNA counts per gene, whereas I am interested in counting each miRNA type in each sample, as in the table above. Is it possible to do that with htseq-count?


          • #6
            How easy this is depends on how the annotation is constructed. htseq-count (or featureCounts, for a generally quicker alternative) will count according to whatever feature you tell it to. So, if your annotation has a field that specifies the miRNA type, then htseq-count can be told to count according to that.
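            To check whether an annotation has a usable field, one option is to list the attribute keys it contains. The sketch below is only an illustration: the helper name `list_attr_keys` and the two demo records (laid out like a miRBase-style GFF3) are hypothetical, and the parsing assumes attribute values contain no semicolons.

```shell
# Count how often each attribute key occurs in column 9 of a GTF or GFF3 file.
# usage: list_attr_keys FILE
list_attr_keys() {
    awk -F '\t' '
        /^#/ { next }
        {
            n = split($9, parts, ";")
            for (i = 1; i <= n; i++) {
                sub(/^ +/, "", parts[i])          # trim leading spaces
                if (parts[i] == "") continue
                split(parts[i], kv, "[ =]")       # GTF: key "v"   GFF3: key=v
                seen[kv[1]]++
            }
        }
        END { for (k in seen) print k "\t" seen[k] }
    ' "$1" | sort
}

# Hypothetical records: a hairpin precursor and one mature miRNA derived from it.
printf 'chr1\t.\tmiRNA_primary_transcript\t100\t180\t.\t+\t.\tID=MI_demo1;Name=mmu-mir-demo\n' > attr_demo.gff3
printf 'chr1\t.\tmiRNA\t110\t131\t.\t+\t.\tID=MIMAT_demo1;Name=mmu-miR-demo-5p;Derives_from=MI_demo1\n' >> attr_demo.gff3

list_attr_keys attr_demo.gff3 > attr_demo_keys.txt
```

            The feature type (column 3) and one of the listed keys are what htseq-count takes through its -t (--type) and -i (--idattr) options; for a file laid out like the demo, that would presumably be the miRNA type and the Name key.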

            One caveat is that htseq-count will still ignore multimappers, which may be quite prevalent in your situation. The proper handling would be to increment the count of the feature by one if all of the multiple mappings of a particular read fall only in a single feature. I don't think htseq-count will do that for you.
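            The multimapper rule described above can be sketched as follows. This is not something htseq-count provides; it assumes a hypothetical two-column table (read ID, feature hit by one alignment of that read, with `__no_feature` marking alignments outside any feature) that would have to be produced separately. The function name `count_unambiguous` is made up for the example.

```shell
# Count each read once for a feature only if ALL of its alignments hit that
# same feature; every other read is tallied under "__ambiguous".
# usage: count_unambiguous HITS.tsv
count_unambiguous() {
    awk -F '\t' '
        {
            if (!($1 in feat))        feat[$1] = $2             # first alignment
            else if (feat[$1] != $2)  feat[$1] = "__ambiguous"  # conflicting hit
        }
        END {
            for (r in feat) n[feat[r]]++
            for (f in n) print f "\t" n[f]
        }
    ' "$1" | sort
}

# Hypothetical alignments: readA maps twice inside mir1, readB hits two
# different miRNAs, readC maps once, readD maps inside and outside a feature.
printf 'readA\tmir1\nreadA\tmir1\nreadB\tmir1\nreadB\tmir2\n' > hits_demo.tsv
printf 'readC\tmir2\nreadD\tmir2\nreadD\t__no_feature\n' >> hits_demo.tsv

count_unambiguous hits_demo.tsv > hits_demo_counts.txt
```

            With the demo input, mir1 and mir2 each receive one read (readA and readC), and readB and readD are set aside as ambiguous.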


            • #7
              I should add that featureCounts might allow this (at least Wei Shi has mentioned in the past that this is the case).


              • #8
                HTSeq will not make that table for you. After taking into consideration Devon's suggestion you can easily create the table yourself once you have the counts for each sample.
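                Building the table from per-sample count files could look like the sketch below. It assumes each file has `feature<TAB>count` rows and that the special counter rows start with `__` (older HTSeq versions print them without that prefix, so adjust the pattern if needed). The function name `merge_counts` and the demo files are hypothetical.

```shell
# Merge several two-column count files into one table: one row per feature,
# one column per input file, 0 where a feature is missing from a file.
# usage: merge_counts cond1.txt cond2.txt ...
merge_counts() {
    awk -F '\t' '
        BEGIN { OFS = "\t" }
        FNR == 1 { names[++nfile] = FILENAME }          # a new input file starts
        /^__/ { next }                                  # skip special counters
        !($1 in seen) { seen[$1] = 1; ids[++nid] = $1 } # remember first-seen order
        { count[$1, nfile] = $2 }
        END {
            header = "miRNA"
            for (f = 1; f <= nfile; f++) header = header OFS names[f]
            print header
            for (i = 1; i <= nid; i++) {
                row = ids[i]
                for (f = 1; f <= nfile; f++)
                    row = row OFS (((ids[i], f) in count) ? count[ids[i], f] : 0)
                print row
            }
        }
    ' "$@"
}

# Hypothetical per-condition count files matching the table asked for above.
printf 'mirna1\t3\nmirna2\t2\n__no_feature\t100\n' > cond1.txt
printf 'mirna1\t5\nmirna2\t0\n__no_feature\t80\n'  > cond2.txt
printf 'mirna1\t7\nmirna2\t6\n__no_feature\t90\n'  > cond3.txt

merge_counts cond1.txt cond2.txt cond3.txt > mirna_table.txt
```

                The resulting mirna_table.txt has a header row naming the input files, followed by one row each for mirna1 and mirna2.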
                Last edited by GenoMax; 02-04-2014, 09:27 AM. Reason: correction


                • #9
                  I believe the Kraken pipeline will process miRNA-seq data into the format you require - it includes options to summarise counts per gene or per mature miRNA.


                  • #10
                    Originally posted by GenoMax
                    HTSeq (or featureCounts) will not make that table for you. After taking into consideration Devon's suggestion you can easily create the table yourself once you have the counts for each sample.
                    I used featureCounts with the GFF file from miRBase. After selecting the right options, it returned the format I wanted (the count of each miRNA in each sample). Do you think there is a mistake?


                    • #11
                      I should not have included featureCounts in post #8. My apologies (post amended).

                      Devon (post #7) had said that featureCounts may do this (and it indeed seems to). He has more experience with analysis.
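                      For reference, featureCounts writes a single table covering every BAM file it is given, which is why it produces the per-miRNA, per-sample layout directly. The sketch below trims such a table down to the ID and count columns. It assumes the usual layout (one leading comment line, then Geneid, Chr, Start, End, Strand, Length, then one count column per BAM); the demo file is hypothetical and stands in for real featureCounts output.

```shell
# Hypothetical stand-in for a featureCounts result over three BAM files.
printf '# Program:featureCounts (demo header, not real output)\n' > fc_demo.txt
printf 'Geneid\tChr\tStart\tEnd\tStrand\tLength\tcond1.bam\tcond2.bam\tcond3.bam\n' >> fc_demo.txt
printf 'mirna1\tchr1\t110\t131\t+\t22\t3\t5\t7\n' >> fc_demo.txt
printf 'mirna2\tchr2\t500\t521\t-\t22\t2\t0\t6\n' >> fc_demo.txt

# Drop the comment line, then keep column 1 (ID) and columns 7 onward (counts).
grep -v '^#' fc_demo.txt | cut -f 1,7- > fc_demo_table.txt
```

                      The trimmed file keeps the header row, so each count column stays labelled with its BAM file name.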
