

Old 12-23-2017, 05:13 AM   #1
Location: Netherlands

Join Date: May 2015
Posts: 20
Problem with UCSC GTF files?


I would like to ask for opinions and advice about the different sources of GTF files for annotated genes (mm10, but other assemblies as well).
I searched first to avoid posting a duplicate (sorry if this still is one).
The topic is briefly mentioned on other forums, but I never found a thorough discussion that gave a satisfactory explanation.

I wanted to download GTF files (mm10) from the UCSC Genome Browser to get reference genes and transcript variants for differential transcript variant expression and splicing analyses.

However, no matter how I set up the Table Browser (UCSC genes, NCBI RefSeq, etc.), the GTF files I obtained from UCSC were not suitable for such analyses.
I noticed that these UCSC GTF files treat each transcript variant as a separate gene, since the "transcript ID" is identical to the "gene ID" in these files. (Did I do something wrong?)
For these analyses I need a GTF file in which each gene ID is shared by (i.e. repeated across) all of that gene's transcript variants, if it has any. The only sources I found for such GTF files are Gencode and Ensembl.
However, these files contain approx. 50,000 genes and 150,000 transcript variants, which seems like too many to me, probably because predicted transcripts are included. UCSC has approx. 38,000 entries, which might be less redundant and speculative? (No idea.)
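One way to check whether a given GTF file has the problem described above is to count transcripts per gene ID. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the usual 9-column, tab-separated GTF layout with quoted `gene_id` and `transcript_id` attributes in the last column.

```python
import re
from collections import defaultdict

# Matches attribute pairs such as: gene_id "XYZ"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict of quoted key/value pairs."""
    return dict(ATTR_RE.findall(field))


def transcripts_per_gene(gtf_lines):
    """Map each gene_id to the set of transcript_ids seen with it."""
    genes = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        attrs = parse_attributes(cols[8])
        gene_id = attrs.get("gene_id")
        transcript_id = attrs.get("transcript_id")
        if gene_id and transcript_id:
            genes[gene_id].add(transcript_id)
    return genes


def looks_flattened(genes):
    """True if every gene has exactly one transcript with the same ID."""
    return bool(genes) and all(tx == {gene} for gene, tx in genes.items())
```

Passing an open file handle, e.g. `looks_flattened(transcripts_per_gene(open(path)))` with `path` pointing at your own download, returns `True` when every gene ID simply repeats its transcript ID, which is the pattern reported here for the UCSC files.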

Could you advise me on where to find, or how to make, a GTF file that is suitable for differential splicing / transcript variant expression analyses?
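If a transcript-to-gene table is available, one possible way to "make" such a file is to rewrite the `gene_id` attribute so that all variants of a gene share it. This is a sketch under stated assumptions, not a tested pipeline: `regroup_gene_ids` and the `tx_to_gene` dictionary are hypothetical names, and where the mapping comes from (for example, transcript-name and gene-name columns exported alongside the track) is something you would need to verify for your own source.

```python
import re

TX_RE = re.compile(r'transcript_id "([^"]*)"')
GENE_RE = re.compile(r'gene_id "[^"]*"')


def regroup_gene_ids(gtf_lines, tx_to_gene):
    """Yield GTF lines with gene_id replaced via a transcript -> gene map.

    tx_to_gene is a hypothetical dict, e.g. {"TX1": "GeneA", "TX2": "GeneA"}.
    Records whose transcript is not in the map are passed through unchanged.
    """
    for line in gtf_lines:
        stripped = line.rstrip("\n")
        cols = stripped.split("\t")
        if stripped.startswith("#") or len(cols) < 9:
            yield stripped  # comments and incomplete lines stay as they are
            continue
        match = TX_RE.search(cols[8])
        gene = tx_to_gene.get(match.group(1)) if match else None
        if gene is not None:
            # A function replacement avoids backslash handling in gene names.
            cols[8] = GENE_RE.sub(lambda m: f'gene_id "{gene}"', cols[8], count=1)
        yield "\t".join(cols)
```

The output lines can be written to a new file and checked again by counting transcripts per gene ID; only the `gene_id` attribute is touched, so coordinates and transcript IDs stay exactly as downloaded.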

Would you recommend avoiding UCSC GTF files for expression analyses in general?

Thank you for your help.

krapulaxdoctor
Old 12-25-2017, 11:55 PM   #2
Junior Member
Location: Australia

Join Date: Nov 2013
Posts: 2


I'm not an expert and my knowledge is limited to human genes, although I'd like to think that the principles outlined here extend to mouse genes as well.

1) RefSeq - transcripts are well supported by evidence and heavily used (NM_... for known protein-coding transcripts)
2) Ensembl / Gencode Comprehensive - contains both automatically annotated and manually curated transcripts
3) Ensembl / Gencode Basic - contains manually curated transcripts only

I'm not terribly familiar with UCSC. In the literature I have come across so far, the authors have almost always leaned towards using RefSeq or Ensembl.

So the choice of which transcript annotation to go with depends on what you're trying to do.

If you're interested in performing variant analysis of transcripts and want to ensure that they're supported by evidence, RefSeq or Gencode Basic is your friend.

If you're concerned that limiting yourself to evidence-supported annotations might mean missing other, possibly novel transcripts, then Gencode Comprehensive is the way to go.
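If you only have the Comprehensive file and want to try the smaller set, one option is to filter records by their tag. The sketch below assumes the file marks those records with a `tag "basic"` attribute in the 9th column, as GENCODE GTF files do; `keep_tagged` is a made-up helper name, and records without any tag (gene-level lines, for example) are dropped by this simple filter.

```python
import re

# A record can carry several tag attributes, so collect all of them.
TAG_RE = re.compile(r'\btag "([^"]*)"')


def keep_tagged(gtf_lines, wanted="basic"):
    """Yield only the GTF records that carry the requested tag."""
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header comments are not carried over
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        if wanted in TAG_RE.findall(cols[8]):
            yield line.rstrip("\n")
```

Comparing the number of distinct transcript IDs before and after filtering gives a quick feel for how much the choice of annotation set changes the reference.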

These two papers go into significantly more detail on the pros and cons of using one annotation set versus another.
doraemon
Old 01-12-2018, 09:06 AM   #3
Location: Netherlands

Join Date: May 2015
Posts: 20

Dear doraemon,

Thank you for the response. I ended up with a similar conclusion. It is a bit confusing for a non-bioinformatician like me.
krapulaxdoctor

gene, gtf, transcript variant, ucsc

All times are GMT -8.
