SEQanswers

01-29-2012, 11:25 AM   #1
SF_mallish
Member
 
Location: Champaign-Urbana

Join Date: Jan 2011
Posts: 10
Gene annotation in sequenced cancer genome

Hi all,

I am working with ChIP-seq data for a small-cell lung cancer sample. The genome sequence of this sample is available in the paper "A small-cell lung cancer genome with complex signatures of tobacco exposure". We are trying to build a sample-specific reference genome from the whole-genome sequence provided in that paper and map our ChIP-seq data to it.
My problem is that gene annotation files such as GTF files will differ substantially between this sample-specific reference genome and the common reference genome (hg19) because of the somatic variation (insertions, deletions, rearrangements) in this cancer genome. The transcript coordinates will change a lot. Is there software or a program that solves this problem?
Ideally, given the hg19 GTF file and the information on all somatic variants as listed in this link, it would output a gene annotation file for this cancer-specific reference genome.
If not, could someone give me a clue about how to do this cleverly?

Thanks a lot!
01-29-2012, 04:05 PM   #2
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

If this is human (I'm assuming it's not one of these fellows: http://www.holytaco.com/25-smoking-monkeys/ ), you can get the sequences for the human genes from GenBank/NCBI/Entrez and BLAT them against your custom(?) genome. This is fairly easy to set up if you know how to script and process a big data set. It is still a big job in terms of the horsepower needed, but it is easily parallelizable if you have access to many computers. You may need to fine-tune the BLAT alignment parameters and filter the results for good hits. The results will give you the coordinates of the genes.
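
The "filter the results for good hits" step could be sketched roughly as follows in Python. This is only an illustration under assumptions, not a tested pipeline: it assumes BLAT was run separately with its default PSL output (for example `blat custom_genome.fa genes.fa genes.psl`, where the file names are hypothetical), and the function names `parse_psl`, `best_hits`, `to_bed` and the 0.9 coverage cutoff are made up for this example. It keeps, for each gene, the single hit with the most matching bases and writes BED-style coordinates, so genes split by rearrangements would need extra handling.

```python
from collections import namedtuple

# One BLAT alignment, reduced to the PSL columns needed here.
Hit = namedtuple("Hit", "gene chrom start end strand matches q_size")


def parse_psl(lines):
    """Yield Hit records from PSL lines, skipping the PSL header lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Data rows have 21 tab-separated columns and start with a match count.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        yield Hit(
            gene=fields[9],         # qName: the query (gene) name
            chrom=fields[13],       # tName: sequence in the custom genome
            start=int(fields[15]),  # tStart (0-based)
            end=int(fields[16]),    # tEnd (exclusive)
            strand=fields[8],
            matches=int(fields[0]),
            q_size=int(fields[10]),
        )


def best_hits(hits, min_coverage=0.9):
    """Keep the hit with the most matches per gene.

    min_coverage is an assumed cutoff: the fraction of the query
    that must match for a hit to be considered at all.
    """
    best = {}
    for hit in hits:
        if hit.matches / hit.q_size < min_coverage:
            continue
        current = best.get(hit.gene)
        if current is None or hit.matches > current.matches:
            best[hit.gene] = hit
    return best


def to_bed(best):
    """Format the chosen hits as BED lines sorted by position."""
    ordered = sorted(best.values(), key=lambda h: (h.chrom, h.start))
    return [
        f"{h.chrom}\t{h.start}\t{h.end}\t{h.gene}\t{h.matches}\t{h.strand}"
        for h in ordered
    ]
```

With a PSL file on disk, usage would look something like `with open("genes.psl") as fh: print("\n".join(to_bed(best_hits(parse_psl(fh)))))`. The cutoff would likely need tuning for a heavily rearranged genome.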

Tags
cancer genome, gene annotation

All times are GMT -8.

