Quote:
Originally Posted by glados
Yes, by using a reference annotation gtf. How do you suggest I use it? If I have a long list with genes I want to look at. Perhaps I can compare them somehow. I'm new to bioinformatics so it's not intuitive to me yet.
If you want to do it in R, this sample code will read the gtf file and extract the rows matching your list of genes:
Code:
## List (vector) of differentially expr. genes
degenes<- c('TNFRSF18', 'WASH7P')
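## Read the GTF (tab-separated, no header row); column V9 holds the attributes
gtf<- read.table('genes.gtf', stringsAsFactors= FALSE, sep= '\t', quote= '')
## Strip everything up to and including 'gene_name "', then everything from the closing quote on
## NOTE: Replace gene_name with the attribute to extract (e.g. gene_id, gene_symbol)
gene_id<- sub('.*(gene_name \")', '', gtf$V9, perl= TRUE)
gene_id<- sub('\".*', '', gene_id, perl=TRUE)
gtf$gene_id<- gene_id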
## All features in the GTF file for each DE gene
degtf<- gtf[gtf$gene_id %in% degenes,]
## Get chromosome, strand, start and end coordinates for each DE gene
## (GTF columns: V1 = chromosome, V4 = start, V5 = end, V7 = strand)
decoords<- data.frame(aggregate(degtf[, c('V1', 'V7', 'V4')], by= list(gene_id= degtf$gene_id), min),
    gene_end= aggregate(degtf$V5, by= list(gene_id= degtf$gene_id), max)$x)
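If you want to check what the pattern matching does before running it on your full file, here is a minimal sketch on three toy GTF lines. The gene names (GENEA, GENEB), the source label and the coordinates are made up for illustration, and I am assuming your file has a gene_name attribute in column 9. It uses tapply instead of aggregate only to keep the example short.

```r
## Toy GTF lines (made-up names and coordinates, for illustration only)
gtf_lines <- c(
  'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENEA";',
  'chr1\ttoy\texon\t300\t450\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENEA";',
  'chr2\ttoy\texon\t50\t80\t.\t-\t.\tgene_id "G2"; transcript_id "T2"; gene_name "GENEB";'
)
gtf <- read.table(text = gtf_lines, stringsAsFactors = FALSE, sep = '\t', quote = '')

## Pull the gene_name value out of the attribute column (V9)
gene_name <- sub('.*gene_name "', '', gtf$V9)  # drop everything before the value
gene_name <- sub('".*', '', gene_name)         # drop everything after the value
gtf$gene_name <- gene_name

## Keep only the genes of interest (hypothetical list)
degenes <- c('GENEA', 'GENEB')
degtf <- gtf[gtf$gene_name %in% degenes, ]

## Smallest start (V4) and largest end (V5) per gene
gene_start <- tapply(degtf$V4, degtf$gene_name, min)
gene_end <- tapply(degtf$V5, degtf$gene_name, max)
decoords <- data.frame(gene_name = names(gene_start),
                       start = as.vector(gene_start),
                       end = as.vector(gene_end))
print(decoords)
```

In this toy case GENEA has two exons (100-200 and 300-450), so its row in decoords should span 100 to 450.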
Hope it helps!
Dario