
  • Cuffdiff FPKM question

    Hello,

    I am fairly new to the world of bioinformatics. I have data from an Illumina RNA-seq run we did with Arabidopsis RNA. I ran the reads through Cufflinks/Cuffdiff to get transcript levels for genes. I've noticed that for some genes, the program groups a few (2-4) genes/loci together and gives them all the same FPKM. But if I look these genes up, they are clearly separate genes and not isoforms of one gene. I believe the grouping happens because these genes are highly similar in sequence. When I look at my aligned reads in a genome browser, most of the reads often align to only one of the genes, although in some instances both or all of the genes show equal numbers of aligned reads. So my question is: what exactly is the FPKM for these "grouped" genes? Is it for only one of the genes, or is it distributed among all the genes in the group? How do I find out what the FPKM for each individual gene is? Thanks in advance to anyone with any ideas!

  • #2
    Did you find a solution to your problem? I am facing the same issue.



    • #3
      Hi,

      Yes, I actually did. When I ran my Cuffdiff analysis, I had used my merged.gtf file as the reference annotation, not the original reference annotation. My basic understanding is that the merged file takes into account your experimental data as well as the original reference annotation, and creates a "merged" version of the two. So in the case of the multiple loci, the merge decided, based on my data, that some reads span the entire region covering those two or more loci, which is why the multiple loci were reported together with one FPKM. To fix this, I ran Cuffdiff again with the original reference annotation instead of the merged one. Hope that makes sense...
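      In case it helps anyone check how many loci are affected before rerunning: below is a minimal Python sketch, not part of Cuffdiff itself. It assumes a tab-delimited table with `gene_id` and `gene` columns, where a grouped locus lists its gene names comma-separated in `gene` (as I believe gene_exp.diff does). The sample rows, IDs, and values are made up for illustration, and the helper name `grouped_loci` is hypothetical.

```python
import csv
import io

# Hypothetical excerpt; the column names are an assumption about the
# Cuffdiff output table, and the rows are invented for illustration.
SAMPLE = (
    "gene_id\tgene\tlocus\tvalue_1\tvalue_2\n"
    "XLOC_000001\tAT1G01010\tChr1:3630-5899\t12.5\t10.1\n"
    "XLOC_000002\tAT1G01020,AT1G01030\tChr1:5927-13714\t8.2\t8.9\n"
)


def grouped_loci(handle):
    """Return (gene_id, [gene names]) for rows that name more than one gene."""
    reader = csv.DictReader(handle, delimiter="\t")
    grouped = []
    for row in reader:
        # Split the comma-separated name list; skip empty or "-" placeholders.
        names = [n for n in row["gene"].split(",") if n and n != "-"]
        if len(names) > 1:
            grouped.append((row["gene_id"], names))
    return grouped


if __name__ == "__main__":
    for gene_id, names in grouped_loci(io.StringIO(SAMPLE)):
        print(gene_id, ", ".join(names))
```

      To use it on real output, pass an open file handle instead of the `io.StringIO` sample. Any locus it reports is one where a single FPKM covers several annotated genes.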
