  • How to determine the proportion of overlapping genes in a genome?

    Hello to all,

    I'm working on stranded RNA-seq data (human) and I would like to justify the use of this type of protocol by showing the proportion of genes that overlap on the genome. Indeed, thanks to the strand information, a read can be assigned to the gene it really comes from, which gives better quantification. Does anyone have a bioinformatics solution to determine this proportion?

    Thank you.

  • #2
    Hi,
    Try bedtools intersect; it should do the trick for finding the overlapping sequences.
    Of course stranded RNA-seq is great! You don't need to justify it.

    • #3
      If you are familiar with R, you could use the GenomicRanges package. Convert the gene coordinates to a GRanges object, then use the function "findOverlaps"; finally, remove the overlaps where a gene is compared with itself.
      Last edited by diego diaz; 02-26-2015, 05:21 AM.
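
The suggestion in #3 could be sketched roughly as below. This is only an illustration on toy data: the four genes, their coordinates and the gene_id values are invented, and a real analysis would start from the GRanges object built from the annotation. Overlaps are computed with ignore.strand = TRUE so that genes on opposite strands are also paired, and self-comparisons are dropped by keeping only hits where the query index is lower than the subject index (which also keeps each pair once).

```r
library(GenomicRanges)

# Toy annotation: four hypothetical genes on chr1 (coordinates invented for illustration)
genes <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(100, 150, 400, 900), end = c(200, 300, 500, 950)),
  strand   = c("+", "-", "+", "-"),
  gene_id  = c("geneA", "geneB", "geneC", "geneD")
)

# Compare the gene set with itself, ignoring strand so antisense overlaps are reported
hits <- findOverlaps(genes, genes, ignore.strand = TRUE)

# Drop self-comparisons (A vs A) and keep each pair only once (A-B, not also B-A)
hits <- hits[queryHits(hits) < subjectHits(hits)]

# Indices of genes involved in at least one overlap
overlapping <- unique(c(queryHits(hits), subjectHits(hits)))

genes$gene_id[overlapping]           # "geneA" "geneB"
length(overlapping) / length(genes)  # 0.5
```

Here only geneA and geneB overlap, so two of the four genes (50%) are counted as overlapping.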

      • #4
        Hi,

        Thank you for your different ideas.

        I used R to do that:

        I imported my .gtf annotation file from ensembl:
        # my Ensembl GTF file
        GTFfile = "Homo_sapiens.GRCh37.75.gtf"

        # load libraries
        library(GenomicRanges)
        library(rtracklayer)

        # import the gene records of this GTF file as a GRanges object
        # (the asRangedData argument is not needed in recent rtracklayer releases)
        GTF = import(GTFfile, format = "gtf", feature.type = "gene")

        GTF_ok = as.data.frame(GTF)
        dim(GTF_ok)
        # 57,773 genes initially

        # separate genes on the plus strand from genes on the minus strand
        GTF_strand_plus = GTF[which(GTF_ok$strand == "+")]
        GTF_strand_minus = GTF[which(GTF_ok$strand == "-")]

        # 28,997 + 28,776 = 57,773 genes
        # (dim() returns NULL on GRanges and Hits objects, so length() is used)
        length(GTF_strand_plus)
        # 28,997
        length(GTF_strand_minus)
        # 28,776

        # find overlaps between plus-strand and minus-strand genes
        # (each hit is one plus/minus gene pair)
        overlap_with_strand = findOverlaps(GTF_strand_plus, GTF_strand_minus, ignore.strand = TRUE)

        length(overlap_with_strand)
        # 16,249 overlaps
        My original GTF file contained 57,773 annotated genes.
        With these commands I found 16,249 overlaps, that is to say ~28% overlap in the human genome: (16249*100)/57773 = 28%. Or should one overlap be counted as two genes, giving (16249*2*100)/57773 = 56% of overlapping genes?

        Does this method seem right to you?
        Last edited by a.kmg; 02-26-2015, 05:23 AM.
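
A note on the two percentages asked about in #4: each hit returned by findOverlaps is a pair of genes, and a single gene can appear in several pairs, so neither the number of hits nor twice that number is necessarily the number of overlapping genes. One possible way to count distinct genes is sketched below on toy data (the coordinates are invented, and the object names plus, minus, n_pairs and n_genes are only illustrative). Note also that comparing the plus set against the minus set only finds opposite-strand overlaps; same-strand overlaps would need a separate comparison.

```r
library(GenomicRanges)

# Toy data (invented coordinates): one plus-strand gene spanning two minus-strand genes
plus  <- GRanges("chr1", IRanges(start = c(100, 1000), end = c(500, 1200)), strand = "+")
minus <- GRanges("chr1", IRanges(start = c(150, 400, 5000), end = c(200, 450, 5100)), strand = "-")

# Each hit is one plus/minus gene pair
hits <- findOverlaps(plus, minus, ignore.strand = TRUE)

# Number of overlapping pairs
n_pairs <- length(hits)

# Number of distinct genes involved in at least one pair
n_genes <- length(unique(queryHits(hits))) + length(unique(subjectHits(hits)))

# Total number of genes in the toy annotation
total <- length(plus) + length(minus)

n_pairs          # 2
n_genes          # 3
n_genes / total  # 0.6
```

In this toy case there are 2 pairs but 3 overlapping genes out of 5, so the proportion is 60%, which matches neither 2/5 nor (2*2)/5. Applied to the real objects from #4, the same counting on overlap_with_strand would give the number of distinct genes to divide by 57,773.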
