  • GTF file for cuffdiff 0.9.1

    The GTF file I'm using to run cuffdiff has transcript IDs but no p_ids. Consequently, cuffdiff is unable to make the cds, promoters, splicing, and tss_groups files. Is there a database where I could get an improved GTF file? If not, what table schema in the UCSC genome browser have people used to construct their GTF files?

    Thanks!

  • #2
    Have you run cuffcompare on your samples first? Cuffcompare attaches p_ids and tss_ids to the combined GTF file that you can then use as input for Cuffdiff.
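To check whether a GTF actually carries these attributes, you can scan its 9th column. Below is a minimal Python sketch, assuming the usual GTF layout (tab-separated fields, attributes written as `key "value";`). The function names and the two example records are made up for illustration and are not part of Cufflinks.

```python
import re

# Matches key "value" pairs in the 9th (attributes) column of a GTF record.
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(column9):
    """Return the attributes of one GTF record as a dict."""
    return dict(ATTR_RE.findall(column9))


def transcripts_missing(gtf_lines, attribute):
    """Return the sorted transcript_ids for which no record carries `attribute`."""
    seen, have = set(), set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = parse_attributes(fields[8])
        tid = attrs.get("transcript_id")
        if tid is None:
            continue
        seen.add(tid)
        if attribute in attrs:
            have.add(tid)
    return sorted(seen - have)


# Hypothetical records in the style of a cuffcompare combined GTF.
example = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; tss_id "TSS1"; p_id "P1";',
    'chr1\tCufflinks\texon\t300\t400\t.\t+\t.\t'
    'gene_id "XLOC_000002"; transcript_id "TCONS_00000002"; tss_id "TSS2";',
]

missing_p_id = transcripts_missing(example, "p_id")
missing_tss_id = transcripts_missing(example, "tss_id")
print(missing_p_id)
print(missing_tss_id)
```

For a real file, pass an open file handle instead of the `example` list, since the function only needs an iterable of lines.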


    • #3
      Thanks! I tried using the .gtf file from cuffcompare as my reference gtf for cuffdiff as you suggested. This solved some problems, but created others. The isoforms, promoters, splicing, and tss files are now populated, but the cds files still aren't. The other thing that happened is that there were no recognizable gene names in any of the files created by cuffdiff with the cuffcompare .gtf file. Instead, the gene names were "XLOC..". I'm thinking there is a problem with my reference gtf file that I used in cuffcompare. Where can I find a better reference gtf for mm9?


      • #4
        Hi,
        I'm facing the same issue with the mouse GTF file.
        It would be good if GTF files for the major organisms were made available on the Cufflinks page.
        Best,
        Ramzi
        Research Scientist - Bioinformatics
        Sidra Medical and Research Center


        • #5
          From my understanding, to get a p_id you need to run cuffcompare with the -s option. Also, gene names are probably not showing up because the GTF you are supplying does not have a gene_name attribute in the 9th column. You should try the Ensembl GTF, which has gene names: http://uswest.ensembl.org/info/data/ftp/index.html
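To see which attributes a given GTF offers in its 9th column (and whether gene_name is among them), a tally of attribute keys is usually enough. This is a rough Python sketch, assuming standard `key "value";` attribute syntax. The function name and the example records, including their gene names and IDs, are invented for illustration.

```python
import re
from collections import Counter

# Matches key "value" pairs in the 9th (attributes) column of a GTF record.
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def attribute_counts(gtf_lines):
    """Count how many records carry each attribute key."""
    counts = Counter()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        counts.update(key for key, _ in ATTR_RE.findall(fields[8]))
    return counts


# Hypothetical records: the first has a gene_name, the second does not.
example = [
    'chr1\tsource_a\texon\t100\t200\t.\t+\t.\t'
    'gene_id "GENE_A"; transcript_id "TX_A1"; gene_name "GeneA";',
    'chr1\tsource_b\texon\t300\t400\t.\t+\t.\t'
    'gene_id "GENE_B"; transcript_id "TX_B1";',
]

counts = attribute_counts(example)
print(counts["gene_id"], counts["gene_name"])
```

If the gene_name count is zero, or far below the gene_id count, that would be consistent with the missing gene names described above.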


          • #6
            fkuo: Thanks! I tried running cuffcompare with the -s option and was able to generate a p_id. Unfortunately, my troubles didn't stop there. My combined.gtf file contained tss_ids and p_ids that didn't really make much sense. This resulted in lots of NO TEST error messages when I ran cuffdiff. What did you use as your -r .gtf file? Another .gtf? Also, did you use the -p option? If so, how do you specify the prefix?


            • #7
              Hi kalidaemon,

              For -r, I used a combined reference GTF (UCSC, Ensembl, RefSeq). For the -p option, you just use -p4 for 4 threads, or --num-threads 4. Hope this helps!
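One thing that may help here: as far as I can tell from the usage messages, -p means different things in the two tools. In cuffcompare it sets the name prefix for consensus transcripts in the combined GTF (TCONS by default), while in cuffdiff it is the short form of --num-threads. The Python sketch below only builds the argument lists and does not run anything. The helper names and all file names are hypothetical, and the flags are worth checking against the help output of your installed version.

```python
def cuffcompare_args(reference_gtf, genome_path, out_prefix, sample_gtfs, id_prefix=None):
    """Build a cuffcompare argument list (assumed flags: -r, -s, -o, -p)."""
    args = ["cuffcompare", "-r", reference_gtf, "-s", genome_path, "-o", out_prefix]
    if id_prefix is not None:
        # In cuffcompare, -p is the consensus transcript name prefix.
        args += ["-p", id_prefix]
    return args + list(sample_gtfs)


def cuffdiff_args(transcripts_gtf, alignment_files, threads=1):
    """Build a cuffdiff argument list; here -p is the thread count."""
    return ["cuffdiff", "-p", str(threads), transcripts_gtf] + list(alignment_files)


# Hypothetical file names, for illustration only.
compare_cmd = cuffcompare_args(
    "reference.gtf", "genome.fa", "cmp", ["sample1.gtf", "sample2.gtf"], id_prefix="MM9"
)
diff_cmd = cuffdiff_args("cmp.combined.gtf", ["sample1.sam", "sample2.sam"], threads=4)

print(" ".join(compare_cmd))
print(" ".join(diff_cmd))
```

The lists can be passed to `subprocess.run` once the paths point at real files.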


              • #8
                no p_id attribute

                Originally posted by kalidaemon
                fkuo: Thanks! I tried running cuffcompare with the -s option and was able to generate a p_id. Unfortunately, my troubles didn't stop there. My combined.gtf file contained tss_ids and p_ids that didn't really make much sense. This resulted in lots of NO TEST error messages when I ran cuffdiff. What did you use as your -r .gtf file? Another .gtf? Also, did you use the -p option? If so, how do you specify the prefix?
                Hi kalidaemon,

                I see that you ran cuffcompare with the -s option and were able to generate a p_id. I tried this and still wasn't able to generate the attribute. Could you offer any tips?
                Many thanks!
