SEQanswers

Old 08-05-2012, 08:08 PM   #1
inukj
Junior Member
 
Location: Seoul

Join Date: Aug 2012
Posts: 2
How to get exon (cds) annotation of a genome?

Hi,

I need an exon-only annotation (coding sequence region information) for the chicken genome (Gallus gallus).
What is the simplest way to get it? I've looked through the tables provided by the UCSC browser without much success (db=galGal4&hgta_group=allTables&hgta_track=galGal4&hgta_table=cds&hgta_regionType=genome&position=chr5%3A55031036-55105194&hgta_outputType=primaryTable&).

UCSC provides a table named cds. Does it hold the coding sequence information? I'm not able to understand the table, though: there is no extra information about which chromosome each line refers to, etc.

I'd be grateful for any suggestions!

Thanks.
Inuk
Old 08-05-2012, 10:25 PM   #2
Wallysb01
Senior Member
 
Location: San Francisco, CA

Join Date: Feb 2011
Posts: 286

What form do you want it in, exactly? Are you thinking of a multi-FASTA file of CDS sequences? Or do you want a GFF3/GTF of just the CDS regions? Both are possible and not that difficult. For a FASTA file, you could just download it from Ensembl. The CDS-only annotation file (GFF3/GTF) would take a little manipulation, but it wouldn't be too hard.
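
For the GFF3/GTF route, the manipulation can be as small as keeping the rows whose third column (the feature type) is CDS. A minimal Python sketch, assuming a standard tab-separated, nine-column file; the function name `filter_gtf_features` and the toy records are made up for illustration and are not real chicken annotation:

```python
from typing import Iterable, Iterator


def filter_gtf_features(lines: Iterable[str], feature: str = "CDS") -> Iterator[str]:
    """Yield GTF/GFF3 records whose feature column (column 3) equals `feature`."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) >= 9 and cols[2] == feature:
            yield line


# Toy records, invented for illustration only.
toy_gtf = [
    "#!genome-build toy",
    "1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";",
    "1\ttoy\tCDS\t150\t200\t.\t+\t0\tgene_id \"g1\"; transcript_id \"t1\";",
    "1\ttoy\texon\t300\t400\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";",
    "1\ttoy\tCDS\t300\t350\t.\t+\t0\tgene_id \"g1\"; transcript_id \"t1\";",
]

cds_records = list(filter_gtf_features(toy_gtf))
exon_records = list(filter_gtf_features(toy_gtf, feature="exon"))
```

On a real file you would pass an open file handle instead of the list and write the yielded lines back out. Passing `feature="exon"` keeps the UTR-containing exons rather than only the coding parts.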
Old 08-06-2012, 12:12 AM   #3
inukj
Junior Member
 
Location: Seoul

Join Date: Aug 2012
Posts: 2

Thanks for the prompt reply Wallysb01!

I figured that I should work with exons instead of CDS, since UTR regions need to be considered for my analysis. I'd be happy to have it in FASTA format. I think I may have gotten the data from the UCSC browser, but I'm unsure. Each record looks something like this (a single line):
585 NM_001031401 chr1 + 77009 89017 79071 88071 15 77009,79071,80408,80609,81604,82298,83944,84189,84590,85139,85419,86072,86552,87495,87934, 77075,79155,80482,80739,81715,82353,84055,84300,84646,85209,85553,86180,86847,87567,89017, 0 HCLS1 cmpl cmpl -1,0,0,2,0,0,1,1,1,0,1,0,0,1,1,

According to UCSC, the format is as follows:
bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, cdsStartStat, cdsEndStat, exonFrames.

Does this mean that I can use txStart/txEnd as the whole exon region? What exactly do txStart and txEnd stand for? Transcription start and end?
Btw, the introns are filtered out.

For my analysis, I'm looking for small RNA matches within the exon region.

Regards.
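
On the txStart/txEnd question: in UCSC's genePred-style tables these are the transcription start and end, so the txStart..txEnd span still includes the introns. The individual exons are the pairs taken from exonStarts and exonEnds, and clipping those pairs to cdsStart..cdsEnd gives the coding parts. A minimal Python sketch using the record quoted above, assuming a strand column after chrom (the record has a + in that position) and UCSC's 0-based, half-open table coordinates; `parse_refgene_line` and `coding_exons` are hypothetical helper names:

```python
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exons: List[Tuple[int, int]]  # (start, end), 0-based half-open


def parse_refgene_line(line: str) -> Transcript:
    """Parse one refGene-style record (whitespace- or tab-separated)."""
    f = line.split()
    # exonStarts / exonEnds are comma-separated lists with a trailing comma.
    starts = [int(x) for x in f[9].split(",") if x]
    ends = [int(x) for x in f[10].split(",") if x]
    return Transcript(f[1], f[2], f[3], int(f[4]), int(f[5]),
                      int(f[6]), int(f[7]), list(zip(starts, ends)))


def coding_exons(t: Transcript) -> List[Tuple[int, int]]:
    """Clip each exon to the CDS; exons lying entirely in a UTR drop out."""
    clipped = []
    for start, end in t.exons:
        s, e = max(start, t.cds_start), min(end, t.cds_end)
        if s < e:
            clipped.append((s, e))
    return clipped


record = (
    "585 NM_001031401 chr1 + 77009 89017 79071 88071 15 "
    "77009,79071,80408,80609,81604,82298,83944,84189,84590,85139,85419,86072,86552,87495,87934, "
    "77075,79155,80482,80739,81715,82353,84055,84300,84646,85209,85553,86180,86847,87567,89017, "
    "0 HCLS1 cmpl cmpl -1,0,0,2,0,0,1,1,1,0,1,0,0,1,1,"
)

tx = parse_refgene_line(record)
coding = coding_exons(tx)
```

For small RNA matching that should include the UTRs, `tx.exons` is the list to use; `coding` is only the CDS portion. The coordinates are always given on the forward strand, so sequences for transcripts on the minus strand would still need reverse-complementing.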
Old 09-13-2012, 03:37 PM   #4
billstevens
Senior Member
 
Location: Baltimore

Join Date: Mar 2012
Posts: 120

Hey,

Does anyone know of, or have, an annotation of hg19 with only Entrez Gene IDs?
Old 09-13-2012, 04:54 PM   #5
kaboroevich
Junior Member
 
Location: Japan

Join Date: Mar 2012
Posts: 6

NCBI provides the seq_gene.md file for the galGal4 assembly. The description of the columns is provided in the README. Is that acceptable?
Old 09-13-2012, 06:26 PM   #6
husamia
Member
 
Location: cinci

Join Date: Apr 2010
Posts: 66

Quote:
Originally Posted by inukj
I need an exon-only annotation (coding sequence region information) for the chicken genome (Gallus gallus). What is the simplest way to get it?

I use Mutalyzer, which provides a web service: https://mutalyzer.nl/positionConverter
Old 09-15-2012, 11:31 AM   #7
billstevens
Senior Member
 
Location: Baltimore

Join Date: Mar 2012
Posts: 120

Sorry, I was asking about the human genome, hg19.

Tags
cds, exons location, gallus
