  • GTF file for rat

    I want to run TopHat on rat samples. Where can I download the GTF file?
    Thanks.

  • #2
    You can find it here:



    --
    Phillip



    • #3
      That file is much bigger than the one from UCSC. Why?



      • #4
        Originally posted by HSV-1
        That file is much bigger than the one from UCSC. Why?
        How did you get your file from UCSC? If you make a RefGene-based GTF from the Table Browser, it only includes coding features. The pre-built GTF from Ensembl includes all coding and non-coding features. The annotations themselves are also longer text strings (all the Ensembl accessions for gene ID, exon ID, transcript ID, name, biotype, ...), so as raw text the Ensembl file will be larger.

        Also note that the UCSC file uses the notation "chr1", etc., while the first column in the Ensembl file will just be "1", etc. (some software expects the "chr" prefix).
        Michael Black, Ph.D.
        ScitoVation LLC. RTP, N.C.
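
One way to handle the naming difference described above is to rewrite the first column of the Ensembl GTF so that it carries the "chr" prefix. The following is a minimal sketch, not a tested pipeline: the function name `add_chr_prefix` and the `rename` parameter are hypothetical, and it assumes a tab-delimited GTF whose comment lines start with `#`. Whichever naming you choose, the sequence names in the GTF need to match the names in the reference index that TopHat aligns against.

```python
def add_chr_prefix(lines, prefix="chr", rename=None):
    """Yield GTF lines with `prefix` prepended to the sequence name (column 1).

    `rename` is an optional dict of explicit name mappings for sequences
    that do not follow the simple prefix rule (for example the
    mitochondrial genome, which Ensembl calls "MT" and UCSC calls "chrM").
    """
    rename = rename or {}
    for line in lines:
        # Leave comment lines and blank lines unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        seqname, sep, rest = line.partition("\t")
        if seqname in rename:
            seqname = rename[seqname]
        elif not seqname.startswith(prefix):
            # Only add the prefix if it is not already there.
            seqname = prefix + seqname
        yield seqname + sep + rest


# Made-up example records, for illustration only.
example = [
    "# example header line\n",
    '1\tprotein_coding\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'MT\tprotein_coding\texon\t10\t50\t.\t+\t.\tgene_id "G2"; transcript_id "T2";\n',
]

for converted in add_chr_prefix(example, rename={"MT": "chrM"}):
    print(converted, end="")
```

To convert a real file, the same generator can be fed an open file handle and its output written to a new file. Scaffold or unplaced contig names often differ between Ensembl and UCSC in ways a prefix cannot fix, so those may need explicit entries in `rename` or may have to be dropped.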



        • #5
          That is probably the reason.
          How do I fix it?
          With the same sequence data, I should get more accepted hits from TopHat using the Ensembl GTF.



          Originally posted by mbblack
          How did you get your file from UCSC? If you make a RefGene-based GTF from the Table Browser, it only includes coding features. The pre-built GTF from Ensembl includes all coding and non-coding features. The annotations themselves are also longer text strings (all the Ensembl accessions for gene ID, exon ID, transcript ID, name, biotype, ...), so as raw text the Ensembl file will be larger.

          Also note that the UCSC file uses the notation "chr1", etc., while the first column in the Ensembl file will just be "1", etc. (some software expects the "chr" prefix).



          • #6
            Originally posted by HSV-1
            With the same sequence data, I should get more accepted hits from TopHat using the Ensembl GTF.
            No, not for a reasonably mature genome such as the rat. Ensembl's build may include a handful of novel and/or predicted coding genes, but not many. Ensembl Rat rel. 66.34 had 22,938 coding genes, 22,921 of which were known and have RefSeq annotation (I only know this because I'm writing up data that used 66.34 as the reference; you would have to look on Ensembl's web site for the stats for the current release).

            The annotation really should not have any significant effect on your summarized mapping results for a mature feature set like the rat. It would only matter if there were a large number of novel, unknown or predicted genes in one annotation versus another, or if the splice boundaries of the annotation features were still largely undetermined. Once summarized by gene, your mapped count data should be unaffected, given that the genome build is fairly well characterized and stable at this point.
            Michael Black, Ph.D.
            ScitoVation LLC. RTP, N.C.
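
To check how two annotation files differ in gene content, as discussed above, you can tally distinct genes per biotype in each GTF. This is only a sketch under assumptions: the function name `count_genes_by_biotype` is hypothetical, and it assumes each record's attribute column contains `gene_id "..."` and, where available, `gene_biotype "..."`. Some Ensembl releases store the biotype elsewhere, and a UCSC Table Browser GTF may not carry it at all, in which case genes are counted as "unknown".

```python
import re
from collections import Counter

# Assumed attribute syntax: key "value"; pairs in column 9 of the GTF.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')
BIOTYPE_RE = re.compile(r'gene_biotype "([^"]+)"')


def count_genes_by_biotype(lines):
    """Return a Counter mapping biotype to the number of distinct gene IDs."""
    biotype_of = {}
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        gene = GENE_ID_RE.search(fields[8])
        if gene is None:
            continue
        biotype = BIOTYPE_RE.search(fields[8])
        # Record each gene once, using the first biotype seen for it.
        biotype_of.setdefault(
            gene.group(1), biotype.group(1) if biotype else "unknown"
        )
    return Counter(biotype_of.values())


# Made-up records, for illustration only.
records = [
    '1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";\n',
    '1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";\n',
    '2\tsrc\texon\t10\t90\t.\t-\t.\tgene_id "G2"; gene_biotype "miRNA";\n',
]

counts = count_genes_by_biotype(records)
print(counts["protein_coding"], counts["miRNA"])
```

Running this on both files and comparing the protein-coding counts gives a quick sense of whether one annotation really contains many more coding genes than the other.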
