  • gene annotation in sequenced cancer genome

    Hi all,

    I am working with ChIP-seq data from a small-cell lung cancer sample. The genomic sequence of this cancer sample is available in this paper: "A small-cell lung cancer genome with complex signatures of tobacco exposure". We are trying to build a sample-specific reference genome from the whole-genome sequence provided in this paper and map our ChIP-seq data back to it.
    My problem is that gene annotation files such as GTF files will differ considerably between this specific reference genome and the common reference genome (hg19), because of the somatic variation (insertions, deletions, rearrangements) in this cancer genome. The coordinates of the transcripts will change a lot. Is there a software program that solves this problem?
    Something that takes the hg19 GTF file and the information for all somatic variations as listed in this link, and outputs a gene annotation file for this specific cancer reference genome.
    If not, could someone give a clue about how to do that cleverly?

    Thanks a lot!

  • #2
    If this is human (I'm assuming it's not one of these fellows: http://www.holytaco.com/25-smoking-monkeys/ ), you can get the sequences for the human genes from GenBank/NCBI/Entrez and BLAT them against your custom(?) genome. This is fairly easy to set up if you know how to script and process a big data set. It is still a big job in terms of the horsepower needed, but it is easily parallelizable if you have access to many computers. You may need to fine-tune the BLAT alignment parameters and filter the results for good hits. The results will give you the coordinates for the genes.
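
For the "filter the results for good hits" step, here is a minimal sketch of what the filtering could look like, assuming BLAT was run with its default tab-separated PSL output and that each query is named after its gene. The function names and the identity and coverage cutoffs (0.95 and 0.90) are illustrative assumptions, not recommended values; you would tune them for your data.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    gene: str        # query name (PSL column 10, qName)
    chrom: str       # target name in the custom genome (tName)
    start: int       # tStart, 0-based as in PSL
    end: int         # tEnd, end-exclusive as in PSL
    strand: str
    matches: int     # matches + repMatches
    identity: float  # matched bases / aligned bases
    coverage: float  # matched bases / query length


def parse_psl_line(line: str) -> Hit:
    """Turn one PSL data line (21 tab-separated fields) into a Hit."""
    f = line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = int(f[0]), int(f[1]), int(f[2])
    q_size = int(f[10])
    matched = matches + rep_matches
    aligned = matched + mismatches
    return Hit(
        gene=f[9],
        chrom=f[13],
        start=int(f[15]),
        end=int(f[16]),
        strand=f[8],
        matches=matched,
        identity=matched / aligned if aligned else 0.0,
        coverage=matched / q_size if q_size else 0.0,
    )


def best_hits(
    lines: Iterable[str],
    min_identity: float = 0.95,  # assumed cutoff, tune as needed
    min_coverage: float = 0.90,  # assumed cutoff, tune as needed
) -> Dict[str, Hit]:
    """Keep, per gene, the passing hit with the most matched bases."""
    best: Dict[str, Hit] = {}
    for line in lines:
        # PSL header lines do not start with a digit; data lines do.
        if not line.strip() or not line[0].isdigit():
            continue
        hit = parse_psl_line(line)
        if hit.identity < min_identity or hit.coverage < min_coverage:
            continue
        current = best.get(hit.gene)
        if current is None or hit.matches > current.matches:
            best[hit.gene] = hit
    return best
```

You would call `best_hits(open("genes_vs_custom.psl"))` (a hypothetical file name) and write the resulting coordinates out in whatever annotation format you need. Keeping only the single best hit per gene is a simplification: genes split by a rearrangement would need all of their partial hits kept instead.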
