  • How to decide FPKM for known genes

    Hi, everyone.

    I am wondering how I should determine the FPKM of reference genes.

    I can get the FPKM of cufflinks-predicted transcripts from the cuffcompare output (transcript.tmap).

    Like this:
    ref_gene_id ref_id class_code cuff_gene_id cuff_id FMI FPKM FPKM_conf_lo FPKM_conf_hi cov len major_iso_id
    AT1G01020 AT1G01020.1 e CUFF.1 CUFF.1.1 100 0.823930 0.000000 1.989144 1.195219 251 CUFF.1.1
    AT1G01030 AT1G01030.1 p CUFF.3 CUFF.3.1 100 2.209471 0.233260 4.185682 2.136752 234 CUFF.3.1
    AT1G01030 AT1G01030.1 c CUFF.5 CUFF.5.1 100 7.940817 5.599198 10.282437 8.514190 599 CUFF.5.1
    AT1G01020 AT1G01020.1 c CUFF.8 CUFF.8.1 100 9.703712 5.618407 13.789018 10.301606 846 CUFF.8.1
    AT1G01020 AT1G01020.2 j CUFF.8 CUFF.8.2 51 4.949442 1.006100 8.892785 5.254402 816 CUFF.8.1

    But when a ref_gene_id appears more than once (for example, AT1G01030),
    how can I calculate the FPKM of that ref_gene_id?

    Or are there other ways to get the FPKM of known genes from RNA-seq data?

    Please help me!!
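
To make the question concrete, here is a minimal Python sketch that groups the tmap rows shown above by ref_gene_id. It is not part of cuffcompare, and the function name `group_by_ref_gene` is made up for illustration. It only lists the candidate rows for each reference gene together with their class_code; it does not decide which FPKM is the right one to report.

```python
from collections import defaultdict

# Rows copied from the transcript.tmap excerpt in the post above.
TMAP_TEXT = """\
ref_gene_id ref_id class_code cuff_gene_id cuff_id FMI FPKM FPKM_conf_lo FPKM_conf_hi cov len major_iso_id
AT1G01020 AT1G01020.1 e CUFF.1 CUFF.1.1 100 0.823930 0.000000 1.989144 1.195219 251 CUFF.1.1
AT1G01030 AT1G01030.1 p CUFF.3 CUFF.3.1 100 2.209471 0.233260 4.185682 2.136752 234 CUFF.3.1
AT1G01030 AT1G01030.1 c CUFF.5 CUFF.5.1 100 7.940817 5.599198 10.282437 8.514190 599 CUFF.5.1
AT1G01020 AT1G01020.1 c CUFF.8 CUFF.8.1 100 9.703712 5.618407 13.789018 10.301606 846 CUFF.8.1
AT1G01020 AT1G01020.2 j CUFF.8 CUFF.8.2 51 4.949442 1.006100 8.892785 5.254402 816 CUFF.8.1
"""


def group_by_ref_gene(tmap_text):
    """Map each ref_gene_id to a list of (cuff_id, class_code, FPKM) tuples."""
    lines = tmap_text.strip().splitlines()
    header = lines[0].split()
    groups = defaultdict(list)
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        groups[row["ref_gene_id"]].append(
            (row["cuff_id"], row["class_code"], float(row["FPKM"]))
        )
    return dict(groups)


groups = group_by_ref_gene(TMAP_TEXT)
for gene, hits in groups.items():
    print(gene, hits)
```

The rows for one reference gene carry different class codes, so simply adding their FPKM values would mix different kinds of matches.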

  • #2
    Hi Zun,

    I guess that to get the FPKM of reference genes you can run cufflinks with the relevant GTF file, without needing cuffcompare (if you have reference genes, you should be able to download or create the GTF file). Cufflinks will read the GTF file and determine the FPKM of each gene/transcript without assembling new ones. That is, if you run something like:
    Code:
    cufflinks accepted_hits.sam -G yourgtffile.gtf  [other options...]
    You should get a file called transcripts.expr that looks like this:
    Code:
    trans_id	bundle_id	chr	left	right	FPKM	FMI	frac	FPKM_conf_lo	FPKM_conf_hi	coverage	length
    ENSSSCT00000004429	303522	1	349787	389816	2.73911	1	1	2.013	3.46522	1.29032	1544
    ENSSSCT00000004430	303526	1	390644	391199	1.1352	1	1	0.355534	1.91487	0.534763	555
    ENSSSCT00000004431	303530	1	399818	708754	0.0708618	1	1	0	0.160517	0.0556351	2620
    ...
    Is this what you need?

    Hope it helps!
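
Once you have transcripts.expr, looking up the FPKM of a given reference transcript is a plain table lookup. The Python sketch below is only an illustration: it assumes the file is tab-separated with the header shown above, and the function name `read_fpkm` is hypothetical. The sample rows are the ones from the output above.

```python
import csv
import io

# Sample rows from the transcripts.expr excerpt above, rebuilt as tab-separated text.
ROWS = [
    "trans_id bundle_id chr left right FPKM FMI frac FPKM_conf_lo FPKM_conf_hi coverage length",
    "ENSSSCT00000004429 303522 1 349787 389816 2.73911 1 1 2.013 3.46522 1.29032 1544",
    "ENSSSCT00000004430 303526 1 390644 391199 1.1352 1 1 0.355534 1.91487 0.534763 555",
    "ENSSSCT00000004431 303530 1 399818 708754 0.0708618 1 1 0 0.160517 0.0556351 2620",
]
EXPR_TEXT = "\n".join("\t".join(row.split()) for row in ROWS)


def read_fpkm(handle):
    """Return {trans_id: (FPKM, FPKM_conf_lo, FPKM_conf_hi)} from a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["trans_id"]: (
            float(row["FPKM"]),
            float(row["FPKM_conf_lo"]),
            float(row["FPKM_conf_hi"]),
        )
        for row in reader
    }


# For a real file, replace io.StringIO(...) with open("transcripts.expr").
fpkm = read_fpkm(io.StringIO(EXPR_TEXT))
print(fpkm["ENSSSCT00000004430"])
```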

    • #3
      Thanks!

      Hi, dariober!

      That's exactly what I want!!
      Thanks a lot!!

      But the number of genes differs between the reference and the output...

      Reference (extracted ID only):
      ID=Os01t0100100-01;Name=Os01t0100100-01;GO=Molecular
      ID=Os01t0100200-01;Name=Os01t0100200-01;Alias=AK059894;ID_converter=Os01g0100200;Locus_id=Os01g0100200;NIAS_FLcDNA=006-208-E01;Note=Conserved
      ID=Os01t0100400-01;Name=Os01t0100400-01;Alias=AK101455;GO=Molecular
      ID=Os01t0100500-01;Name=Os01t0100500-01;Alias=AK067316;ID_converter=Os01g0100500;Link_to=Gene
      gene.expr
      gene_id bundle_id chr left right FPKM FPKM_conf_lo FPKM_conf_hi
      Os01t0100100-01 48157 chr01 1982 9815 14.5909 12.9796 16.2022
      Os01t0100400-01 48157 chr01 11720 14685 4.77297 3.69897 5.84697
      Os01t0100500-01 48157 chr01 15398 19144 1.9978 1.31256 2.68303
      The gene whose ID is Os01t0100200-01 is not in gene.expr...
      This time, I used the reference gene GTF file for Chr01 only, just to see how it would work.

      cufflinks accept_hist.sam -G reference_gene_chr01.gtf
      The number of reference genes was 5928, but the output was reduced to 3848...
      Why did cufflinks skip some genes?
      Last edited by zun; 12-01-2010, 05:43 PM.
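
One way to see exactly which reference genes are absent from gene.expr is to compare the two ID sets. The Python sketch below uses only the lines quoted in this post. It assumes the reference ID is the `ID=` attribute and that gene.expr is whitespace-separated with the gene ID in the first column. The helper names are hypothetical.

```python
import re

# Attribute strings copied from the reference excerpt above (truncated as posted).
REFERENCE_ATTRS = [
    "ID=Os01t0100100-01;Name=Os01t0100100-01;GO=Molecular",
    "ID=Os01t0100200-01;Name=Os01t0100200-01;Alias=AK059894;ID_converter=Os01g0100200;"
    "Locus_id=Os01g0100200;NIAS_FLcDNA=006-208-E01;Note=Conserved",
    "ID=Os01t0100400-01;Name=Os01t0100400-01;Alias=AK101455;GO=Molecular",
    "ID=Os01t0100500-01;Name=Os01t0100500-01;Alias=AK067316;ID_converter=Os01g0100500;Link_to=Gene",
]

# Lines copied from the gene.expr excerpt above.
GENE_EXPR_LINES = [
    "gene_id bundle_id chr left right FPKM FPKM_conf_lo FPKM_conf_hi",
    "Os01t0100100-01 48157 chr01 1982 9815 14.5909 12.9796 16.2022",
    "Os01t0100400-01 48157 chr01 11720 14685 4.77297 3.69897 5.84697",
    "Os01t0100500-01 48157 chr01 15398 19144 1.9978 1.31256 2.68303",
]


def reference_ids(attr_lines):
    """Extract the value of the ID= attribute (not ID_converter=) from each line."""
    ids = []
    for line in attr_lines:
        match = re.search(r"(?:^|;)ID=([^;]+)", line)
        if match:
            ids.append(match.group(1))
    return ids


def reported_ids(expr_lines):
    """Collect the first column of gene.expr, skipping the header line."""
    return {line.split()[0] for line in expr_lines[1:] if line.strip()}


reported = reported_ids(GENE_EXPR_LINES)
missing = [gene for gene in reference_ids(REFERENCE_ATTRS) if gene not in reported]
print(missing)
```

On a full annotation, the length of `missing` should account for the gap between the reference count and the number of rows in gene.expr.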

      • #4
        Hi Zun,

        Originally posted by zun:
        Hi, dariober!

        That's exactly what I want!!
        Thanks a lot!!
        Glad to hear that!

        The number of reference genes was 5928, but the output was reduced to 3848...
        Why did cufflinks skip some genes?
        I'm pretty sure that cufflinks doesn't report genes that are not expressed at all. 3848 expressed genes out of 5928 sounds about right, although I know nothing about your experiment.
        (This might be trivial to say, but also make sure that, in the GTF file, you are counting the features that cufflinks uses as reference.)

        Dario
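
As a rough way to check the counting point above, the Python sketch below tallies GTF records per feature type (column 3) and the distinct gene_id values seen for each type. The sample lines, gene names and function name are invented for illustration. Which feature type cufflinks reads is not stated in this thread, so the sketch simply reports every type.

```python
import re
from collections import Counter, defaultdict

# Hypothetical GTF records (9 tab-separated columns), for illustration only.
FIELDS = [
    ("chr01", "src", "transcript", "1982", "9815", "geneA"),
    ("chr01", "src", "exon", "1982", "2500", "geneA"),
    ("chr01", "src", "exon", "3000", "9815", "geneA"),
    ("chr01", "src", "transcript", "11720", "14685", "geneB"),
    ("chr01", "src", "exon", "11720", "14685", "geneB"),
    ("chr01", "src", "CDS", "11800", "14600", "geneB"),
]
GTF_LINES = [
    "\t".join([chrom, source, feature, start, end, ".", "+", ".",
               'gene_id "%s"; transcript_id "%s.1";' % (gene, gene)])
    for chrom, source, feature, start, end, gene in FIELDS
]


def count_features(gtf_lines):
    """Return (records per feature type, distinct gene_ids per feature type)."""
    feature_counts = Counter()
    genes_by_feature = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        columns = line.rstrip("\n").split("\t")
        feature, attributes = columns[2], columns[8]
        feature_counts[feature] += 1
        match = re.search(r'gene_id "([^"]+)"', attributes)
        if match:
            genes_by_feature[feature].add(match.group(1))
    return feature_counts, dict(genes_by_feature)


feature_counts, genes_by_feature = count_features(GTF_LINES)
for feature, count in feature_counts.items():
    print(feature, count, "records,", len(genes_by_feature[feature]), "genes")
```

If the gene count differs between feature types, the total you compare against the cufflinks output depends on which type you counted.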

        • #5
          I got it!

          Hi, dariober
          I appreciate your prompt reply!

          I'm pretty sure that cufflinks doesn't report genes that are not expressed at all. 3848 expressed genes out of 5928 sounds about right, although I know nothing about your experiment.
          (This might be trivial to say, but also make sure that, in the GTF file, you are counting the features that cufflinks uses as reference.)
          I see! I confirmed that no reads were mapped to the Os01t0100200-01 gene at all.
          I am just an informatician supporting a wet-lab experiment, so I don't know the conditions of this RNA-seq run. But I want to check whether the many cuffcompare-predicted genes are really expressed or not!

          I hope your work goes well, too!
          THANKS!
