#1 |
Member
Location: Japan Join Date: Mar 2015
Posts: 23
Hi everybody
I’d like to ask about TSS (transcription start site) annotation.

I am analyzing total RNA-seq data from the ENCODE project, and I want to check the reported relationship between gene expression and the DNA methylation state upstream of genes. For that I need the approximate TSS coordinates of my target genes. I don’t need exact coordinates from CAGE-seq or similar.

So far I have downloaded the GTF annotation file from the ENCODE project (URL below) and looked at its contents (excerpt at the end of this post):
https://www.encodeproject.org/data-s...nce-sequences/

Now I am trying to work out which side of a gene (5’ or 3’) is close to the TSS. For example:
chr1 HAVANA gene 11869 (5’) 14409 (3’)

Is it right that the side nearest the coordinates of exon_number 1 is the one near the TSS? If an annotation file or an easy method exists, could you tell me about it?

Best regards

##description: evidence-based annotation of the human genome (GRCh38), version 24 (Ensembl 83)
##provider: GENCODE
##contact: gencode-help@sanger.ac.uk
##format: gtf
##date: 2015-12-03
chr1 HAVANA gene 11869 14409 . + . gene_id "ENSG00000223972.5"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "DDX11L1"; level 2; havana_gene "OTTHUMG00000000961.2";
chr1 HAVANA transcript 11869 14409 . + . gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "DDX11L1-002"; level 2; tag "basic"; transcript_support_level "1"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
chr1 HAVANA exon 11869 12227 . + . gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "DDX11L1-002"; exon_number 1; exon_id "ENSE00002234944.1"; level 2; tag "basic"; transcript_support_level "1"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
chr1 HAVANA exon 12613 12721 . + . gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "DDX11L1-002"; exon_number 2; exon_id "ENSE00003582793.1"; level 2; tag "basic"; transcript_support_level "1"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
#2 |
Senior Member
Location: uk Join Date: Mar 2009
Posts: 667
In your GTF file, compare the exon annotations of genes on the + strand with those of genes on the - strand. The gene you have shown above is on the + strand.
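The comparison suggested here could be scripted roughly as follows. This is only a minimal sketch: the helper names `parse_exon` and `first_exon_side` are made up for illustration, the records are the exon lines quoted in this thread with the attribute column shortened, and it assumes the file is tab-separated with an `exon_number` attribute on every exon line.

```python
import re

# Exon records from the excerpts quoted in this thread, attributes shortened.
# Real GTF files are tab-separated, so tabs are written explicitly here.
GTF_LINES = [
    'chr1\tHAVANA\texon\t11869\t12227\t.\t+\t.\ttranscript_id "ENST00000456328.2"; exon_number 1;',
    'chr1\tHAVANA\texon\t12613\t12721\t.\t+\t.\ttranscript_id "ENST00000456328.2"; exon_number 2;',
    'chr1\tHAVANA\texon\t817373\t817712\t.\t-\t.\ttranscript_id "ENST00000447500.4"; exon_number 1;',
    'chr1\tHAVANA\texon\t810067\t810170\t.\t-\t.\ttranscript_id "ENST00000447500.4"; exon_number 2;',
]


def parse_exon(line):
    """Return (transcript_id, strand, start, end, exon_number), or None for non-exon lines."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[2] != "exon":
        return None
    attrs = cols[8]
    tid = re.search(r'transcript_id "([^"]+)"', attrs).group(1)
    number = int(re.search(r'exon_number "?(\d+)', attrs).group(1))
    return tid, cols[6], int(cols[3]), int(cols[4]), number


def first_exon_side(lines):
    """Per transcript, report whether exon 1 sits at the low or high coordinate end."""
    exons = {}
    for line in lines:
        if line.startswith("#"):  # skip header comments
            continue
        rec = parse_exon(line)
        if rec is None:
            continue
        tid, strand, start, end, number = rec
        exons.setdefault((tid, strand), []).append((start, end, number))

    result = {}
    for (tid, strand), items in exons.items():
        if len(items) == 1:
            # A single exon touches both ends, so the check says nothing.
            result[tid] = (strand, "single")
            continue
        first = next(e for e in items if e[2] == 1)
        lowest_start = min(e[0] for e in items)
        result[tid] = (strand, "low" if first[0] == lowest_start else "high")
    return result


for tid, (strand, side) in first_exon_side(GTF_LINES).items():
    print(tid, strand, side)
```

On the four records above, the + transcript reports exon 1 at the low end and the - transcript reports it at the high end.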
#3 |
Member
Location: Japan Join Date: Mar 2015
Posts: 23
Dear mastal
Thank you for your answer, and sorry for my basic question.

Do you mean that, given the transcription mechanism (see the site below), once the strand of a gene is known, the side close to the TSS is determined with "no exception"? That is: if the strand of the gene is +, the 5' side; if the strand of the gene is -, the 3' side.

Sorry, I should be more careful. Thank you for the quick answer.
https://www.ncbi.nlm.nih.gov/books/N...ort=objectonly
#4 |
Senior Member
Location: uk Join Date: Mar 2009
Posts: 667
I meant that you should check whether, for genes on the - strand, the exon closest to the rightmost end of the gene is labelled as exon 1 or not.
Chromosomal coordinates for a gene are usually given from left to right, so even for genes on the minus strand the start coordinate in the file is lower than the end coordinate.
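If the check confirms that exon 1 sits at the rightmost end for minus-strand genes, picking an approximate TSS from the gene lines reduces to choosing a column by strand. A minimal sketch under that assumption follows; `tss_from_gtf_fields` and `gene_tss` are illustrative names, and the two records are the gene lines quoted in this thread with shortened attributes. Note that a gene-level coordinate is only the outermost boundary, and individual transcripts of the same gene may start elsewhere.

```python
# Gene records from the excerpts quoted in this thread, attributes shortened.
GENE_LINES = [
    'chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972.5"; gene_name "DDX11L1";',
    'chr1\tHAVANA\tgene\t800879\t817712\t.\t-\t.\tgene_id "ENSG00000230092.7"; gene_name "RP11-206L10.8";',
]


def tss_from_gtf_fields(start, end, strand):
    """Pick the coordinate at the 5' end of the feature.

    GTF start is always <= end, so the strand decides which column is the TSS.
    """
    if strand == "+":
        return start
    if strand == "-":
        return end
    raise ValueError(f"unexpected strand: {strand!r}")


def gene_tss(lines):
    """Map gene_id -> (chromosome, approximate TSS, strand) using gene lines only."""
    tss = {}
    for line in lines:
        if line.startswith("#"):  # skip header comments
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        gene_id = cols[8].split('gene_id "')[1].split('"')[0]
        position = tss_from_gtf_fields(int(cols[3]), int(cols[4]), cols[6])
        tss[gene_id] = (cols[0], position, cols[6])
    return tss


for gene_id, (chrom, position, strand) in gene_tss(GENE_LINES).items():
    print(gene_id, chrom, position, strand)
```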
#5 |
Member
Location: Japan Join Date: Mar 2015
Posts: 23
Dear mastal
Thank you for your answer.

I checked: for genes on the minus strand, the exon closest to the rightmost end of the gene (column B in the example below) is labelled exon_number 1, and the transcript start coordinate (column A) is lower than the transcript end coordinate (column B). That is, this GTF file follows the standard you described (chromosomal coordinates for a gene are usually given from left to right).

So I should interpret it as follows:
If the strand of the gene is +, the coordinate in column A is close to the TSS.
If the strand of the gene is -, the coordinate in column B is close to the TSS.

Is that correct? Thank you in advance for your help.

chr1 HAVANA gene 800879(column A) 817712(column B) . - . gene_id "ENSG00000230092.7"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "RP11-206L10.8"; level 2; tag "overlapping_locus"; havana_gene "OTTHUMG00000002403.3";
chr1 HAVANA transcript 800879 817712 . - . gene_id "ENSG00000230092.7"; transcript_id "ENST00000447500.4"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "RP11-206L10.8"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "RP11-206L10.8-002"; level 2; tag "basic"; transcript_support_level "5"; havana_gene "OTTHUMG00000002403.3"; havana_transcript "OTTHUMT00000448550.2";
chr1 HAVANA exon 817373 817712 . - . gene_id "ENSG00000230092.7"; transcript_id "ENST00000447500.4"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "RP11-206L10.8"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "RP11-206L10.8-002"; exon_number 1; exon_id "ENSE00001746491.1"; level 2; tag "basic"; transcript_support_level "5"; havana_gene "OTTHUMG00000002403.3"; havana_transcript "OTTHUMT00000448550.2";
chr1 HAVANA exon 810067 810170 . - . gene_id "ENSG00000230092.7"; transcript_id "ENST00000447500.4"; gene_type "transcribed_unprocessed_pseudogene"; gene_status "KNOWN"; gene_name "RP11-206L10.8"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "RP11-206L10.8-002"; exon_number 2; exon_id "ENSE00001674926.2"; level 2; tag "basic"; transcript_support_level "5"; havana_gene "OTTHUMG00000002403.3"; havana_transcript "OTTHUMT00000448550.2";
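Under the interpretation above (column A for +, column B for -), the upstream region to compare against methylation data could be derived as in the sketch below. The function name `upstream_window` and the 1000 bp default are assumptions chosen for illustration, not values from the thread; coordinates are treated as 1-based and inclusive, as in GTF.

```python
def upstream_window(tss, strand, size=1000):
    """Return (start, end) of the `size` bases immediately upstream of the TSS.

    "Upstream" means lower coordinates on the + strand and higher
    coordinates on the - strand. The window is clipped at position 1 on
    the + strand; it is not clipped at the chromosome end on the - strand,
    because chromosome lengths are not known here.
    """
    if strand == "+":
        return max(1, tss - size), tss - 1
    if strand == "-":
        return tss + 1, tss + size
    raise ValueError(f"unexpected strand: {strand!r}")


# The two genes quoted in this thread: column A for +, column B for -.
print(upstream_window(11869, "+"))
print(upstream_window(817712, "-"))
```

For a + strand TSS closer than `size` bases to the chromosome start, the returned window is simply shorter.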
#6 |
Senior Member
Location: uk Join Date: Mar 2009
Posts: 667
#7 |
Member
Location: Japan Join Date: Mar 2015
Posts: 23
Dear mastal
I really appreciate your sincere response to my question.
Tags: annotation, gtf, method, tss