SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
tophat error: Converting left_kept_reads.m2g to genomic coordinates (map2gtf), Failed capricy Bioinformatics 2 03-07-2014 09:08 PM
Conversion of RefSeq coordinates to genomic coordinates ngssupporter Bioinformatics 2 02-15-2014 12:59 PM
Converting an early genome assembly to current coordinates? rareaquaticbadger Bioinformatics 1 10-24-2013 12:09 PM
Converting mRNA coordinates to chromosome coordinates david.tamborero Bioinformatics 3 10-04-2013 05:46 AM
Converting contig coordinates to genomic coordinates aurimas Bioinformatics 0 03-06-2013 11:06 AM

Reply
 
Thread Tools
Old 10-25-2014, 09:53 PM   #1
amias
Junior Member
 
Location: united states

Join Date: Sep 2014
Posts: 3
Default converting genomic coordinates to transcript coordinates

Hello,
I need to convert the genomic coordinates for UTR and CDS features in the arabidopsis gff3 gene annotations to transcript coordinates. Can anyone suggest a way to do this using awk or other languages? I have tried to first calculate the length of the features for each ID, so that I can calculate the feature start and end positions for each transcript based on the length. However I am not sure how to write the code for the second part. I have searched for related threads but none of the solution seemed to provide what I want. Is there already a package or script for this kind of conversion? Any suggestion is most welcome!

Here is a sample of my input data:
Code:
more Athaliana_167_TAIR10.gene.gff3 
##gff-version 3
##annot-version TAIR10
Chr1    phytozomev10    gene    3631    5899    .       +       .       ID=AT1G01010.TAIR10;Name=AT1G01010
Chr1    phytozomev10    mRNA    3631    5899    .       +       .       ID=AT1G01010.1.TAIR10;Name=AT1G01010.1;pacid=19656964;longest=1;Parent=AT1G01010.TAIR10
Chr1    phytozomev10    five_prime_UTR  3631    3759    .       +       .       ID=AT1G01010.1.TAIR10.five_prime_UTR.1;Parent=AT1G01010.1.TAIR10;pacid=19656964
Chr1    phytozomev10    CDS     3760    3913    .       +       0       ID=AT1G01010.1.TAIR10.CDS.1;Parent=AT1G01010.1.TAIR10;pacid=19656964
Chr1    phytozomev10    CDS     3996    4276    .       +       2       ID=AT1G01010.1.TAIR10.CDS.2;Parent=AT1G01010.1.TAIR10;pacid=19656964
Chr1    phytozomev10    CDS     4486    4605    .       +       0       ID=AT1G01010.1.TAIR10.CDS.3;Parent=AT1G01010.1.TAIR10;pacid=19656964
Chr1    phytozomev10    CDS     4706    5095    .       +       0       ID=AT1G01010.1.TAIR10.CDS.4;Parent=AT1G01010.1.TAIR10;pacid=19656964
Chr1    phytozomev10    CDS     5174    5326    .       +       0       ID=AT1G01010.1.TAIR10.CDS.5;Parent=AT1G01010.1.TAIR10;pacid=19656964
Chr1    phytozomev10    CDS     5439    5630    .       +       0       ID=AT1G01010.1.TAIR10.CDS.6;Parent=AT1G01010.1.TAIR10;pacid=19656964
Chr1    phytozomev10    three_prime_UTR 5631    5899    .       +       .       ID=AT1G01010.1.TAIR10.three_prime_UTR.1;Parent=AT1G01010.1.TAIR10;pacid=19656964
Chr1    phytozomev10    gene    5928    8737    .       -       .       ID=AT1G01020.TAIR10;Name=AT1G01020
Chr1    phytozomev10    mRNA    5928    8737    .       -       .       ID=AT1G01020.1.TAIR10;Name=AT1G01020.1;pacid=19655142;longest=1;Parent=AT1G01020.TAIR10
Chr1    phytozomev10    CDS     8571    8666    .       -       0       ID=AT1G01020.1.TAIR10.CDS.1;Parent=AT1G01020.1.TAIR10;pacid=19655142
Here is my working code based on the thread here for calculating the UTR and CDS feature length for each transcript:

Code:
awk '$3!="gene" && $3!="mRNA"' <Athaliana_167_TAIR10.gene.gff3|awk  '$9~/Parent/{sub(/.*Parent=/,"",$9);sub(/.TAIR10;.*/,"",$9);ID=$9;L[ID]=$5-$4+1}{for(i in L){print $0,L[i]}}'
amias is offline   Reply With Quote
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 05:48 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO