
  • How to decide FPKM for known genes

    Hi, everyone.

    I am wondering how I should determine the FPKM of reference genes.

    I can get the FPKM of Cufflinks-predicted transcripts from the cuffcompare output (transcript.tmap).

    It looks like this:
    ref_gene_id ref_id class_code cuff_gene_id cuff_id FMI FPKM FPKM_conf_lo FPKM_conf_hi cov len major_iso_id
    AT1G01020 AT1G01020.1 e CUFF.1 CUFF.1.1 100 0.823930 0.000000 1.989144 1.195219 251 CUFF.1.1
    AT1G01030 AT1G01030.1 p CUFF.3 CUFF.3.1 100 2.209471 0.233260 4.185682 2.136752 234 CUFF.3.1
    AT1G01030 AT1G01030.1 c CUFF.5 CUFF.5.1 100 7.940817 5.599198 10.282437 8.514190 599 CUFF.5.1
    AT1G01020 AT1G01020.1 c CUFF.8 CUFF.8.1 100 9.703712 5.618407 13.789018 10.301606 846 CUFF.8.1
    AT1G01020 AT1G01020.2 j CUFF.8 CUFF.8.2 51 4.949442 1.006100 8.892785 5.254402 816 CUFF.8.1

    But when a ref_gene_id appears several times (for example, AT1G01030),
    how can I calculate the FPKM of that ref_gene_id??

    Or are there other ways to get the FPKM of known genes from RNA-seq data??

    Please help me!!
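
One naive way to collapse the repeated rows is sketched below. It groups the .tmap rows by ref_gene_id and sums the FPKM of each distinct Cufflinks transcript. Two things here are assumptions, not anything cuffcompare prescribes: the function name `sum_fpkm_by_ref_gene` is made up for illustration, and only rows with class codes `=`, `c` and `j` are kept (the `e` and `p` rows are skipped because they flag possible pre-mRNA or polymerase run-on fragments). Whether a summed value is meaningful for your data is a judgment call.

```python
from collections import defaultdict

# The five rows shown in the post above, copied as-is.
EXAMPLE_TMAP = """\
ref_gene_id ref_id class_code cuff_gene_id cuff_id FMI FPKM FPKM_conf_lo FPKM_conf_hi cov len major_iso_id
AT1G01020 AT1G01020.1 e CUFF.1 CUFF.1.1 100 0.823930 0.000000 1.989144 1.195219 251 CUFF.1.1
AT1G01030 AT1G01030.1 p CUFF.3 CUFF.3.1 100 2.209471 0.233260 4.185682 2.136752 234 CUFF.3.1
AT1G01030 AT1G01030.1 c CUFF.5 CUFF.5.1 100 7.940817 5.599198 10.282437 8.514190 599 CUFF.5.1
AT1G01020 AT1G01020.1 c CUFF.8 CUFF.8.1 100 9.703712 5.618407 13.789018 10.301606 846 CUFF.8.1
AT1G01020 AT1G01020.2 j CUFF.8 CUFF.8.2 51 4.949442 1.006100 8.892785 5.254402 816 CUFF.8.1
"""


def sum_fpkm_by_ref_gene(tmap_lines, keep_codes=("=", "c", "j")):
    """Sum FPKM per ref_gene_id over distinct Cufflinks transcripts.

    keep_codes is an assumed filter: rows with other class codes are ignored.
    """
    totals = defaultdict(float)
    seen = set()      # (ref_gene_id, cuff_id) pairs already counted
    header = None
    for line in tmap_lines:
        fields = line.split()   # works for tab- or space-delimited input
        if not fields:
            continue
        if header is None:      # first non-empty line holds the column names
            header = fields
            continue
        row = dict(zip(header, fields))
        if row["class_code"] not in keep_codes:
            continue
        key = (row["ref_gene_id"], row["cuff_id"])
        if key in seen:         # do not count the same transcript twice
            continue
        seen.add(key)
        totals[row["ref_gene_id"]] += float(row["FPKM"])
    return dict(totals)


if __name__ == "__main__":
    for gene, fpkm in sorted(sum_fpkm_by_ref_gene(EXAMPLE_TMAP.splitlines()).items()):
        print(gene, round(fpkm, 6))
```

With this filter, AT1G01030 keeps only CUFF.5.1, while AT1G01020 gets the sum of CUFF.8.1 and CUFF.8.2.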

  • #2
    Hi Zun,

    I guess that to get the FPKM of reference genes you can run cufflinks with the relevant GTF file, without needing cuffcompare (if you have reference genes, you should be able to download or create the GTF file). Cufflinks will read the GTF file and estimate the FPKM of each gene/transcript without assembling new ones. That is, if you run something like:
    Code:
    cufflinks accepted_hits.sam -G yourgtffile.gtf  [other options...]
    You should get a file called transcripts.expr that looks like this:
    Code:
    trans_id	bundle_id	chr	left	right	FPKM	FMI	frac	FPKM_conf_lo	FPKM_conf_hi	coverage	length
    ENSSSCT00000004429	303522	1	349787	389816	2.73911	1	1	2.013	3.46522	1.29032	1544
    ENSSSCT00000004430	303526	1	390644	391199	1.1352	1	1	0.355534	1.91487	0.534763	555
    ENSSSCT00000004431	303530	1	399818	708754	0.0708618	1	1	0	0.160517	0.0556351	2620
    ...
    Is this what you need?

    Hope it helps!



    • #3
      Thanks!

      Hi, dariober!

      That's exactly what I wanted!!
      Thanks a lot!!

      But the number of genes differs between the reference and the output...

      reference (extracted IDs only)
      ID=Os01t0100100-01;Name=Os01t0100100-01;GO=Molecular
      ID=Os01t0100200-01;Name=Os01t0100200-01;Alias=AK059894;ID_converter=Os01g0100200;Locus_id=Os01g0100200;NIAS_FLcDNA=006-208-E01;Note=Conserved
      ID=Os01t0100400-01;Name=Os01t0100400-01;Alias=AK101455;GO=Molecular
      ID=Os01t0100500-01;Name=Os01t0100500-01;Alias=AK067316;ID_converter=Os01g0100500;Link_to=Gene
      gene.expr
      gene_id bundle_id chr left right FPKM FPKM_conf_lo FPKM_conf_hi
      Os01t0100100-01 48157 chr01 1982 9815 14.5909 12.9796 16.2022
      Os01t0100400-01 48157 chr01 11720 14685 4.77297 3.69897 5.84697
      Os01t0100500-01 48157 chr01 15398 19144 1.9978 1.31256 2.68303
      The gene with ID Os01t0100200-01 is not in gene.expr...
      This time, I used the reference gene GTF file for Chr01 only, just to see how it would work.

      cufflinks accept_hist.sam -G reference_gene_chr01.gtf
      The reference contained 5928 genes, but the output had only 3848...
      Why did cufflinks skip some genes??
      Last edited by zun; 12-01-2010, 05:43 PM.



      • #4
        Hi Zun,

        Originally posted by zun
        Hi, dariober!

        That's exactly what I wanted!!
        Thanks a lot!!
        Glad to hear that!

        The reference contained 5928 genes, but the output had only 3848...
        Why did cufflinks skip some genes??
        I'm pretty sure that cufflinks doesn't report genes that are not expressed at all. 3848 expressed genes out of 5928 sounds about right, although I know nothing about your experiment.
        (This might be trivial to say, but also make sure that what you are counting in the GTF file is the same feature type that cufflinks uses as reference.)

        Dario
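
To see exactly which reference genes are absent from the output, the two ID lists can be compared directly. The sketch below is only an illustration using the lines posted in this thread. The helper names `reference_ids` and `reported_ids` are made up, and it assumes that the reference ID is the `ID=` attribute and that the first column of gene.expr holds the gene ID.

```python
import re

# Attribute strings and gene.expr rows copied from the post above.
REFERENCE_ATTRIBUTES = """\
ID=Os01t0100100-01;Name=Os01t0100100-01;GO=Molecular
ID=Os01t0100200-01;Name=Os01t0100200-01;Alias=AK059894;ID_converter=Os01g0100200;Locus_id=Os01g0100200;NIAS_FLcDNA=006-208-E01;Note=Conserved
ID=Os01t0100400-01;Name=Os01t0100400-01;Alias=AK101455;GO=Molecular
ID=Os01t0100500-01;Name=Os01t0100500-01;Alias=AK067316;ID_converter=Os01g0100500;Link_to=Gene
"""

GENE_EXPR = """\
gene_id bundle_id chr left right FPKM FPKM_conf_lo FPKM_conf_hi
Os01t0100100-01 48157 chr01 1982 9815 14.5909 12.9796 16.2022
Os01t0100400-01 48157 chr01 11720 14685 4.77297 3.69897 5.84697
Os01t0100500-01 48157 chr01 15398 19144 1.9978 1.31256 2.68303
"""

# Match "ID=" only at the start of the string or right after ";",
# so that keys such as "ID_converter=" or "Locus_id=" are not picked up.
ID_PATTERN = re.compile(r"(?:^|;)ID=([^;\s]+)")


def reference_ids(attribute_lines):
    """Collect the ID= value from each attribute string."""
    ids = set()
    for line in attribute_lines:
        match = ID_PATTERN.search(line.strip())
        if match:
            ids.add(match.group(1))
    return ids


def reported_ids(expr_lines):
    """Collect the first column of gene.expr, skipping the header row."""
    rows = [line.split() for line in expr_lines if line.strip()]
    return {row[0] for row in rows[1:]}


missing = sorted(
    reference_ids(REFERENCE_ATTRIBUTES.splitlines())
    - reported_ids(GENE_EXPR.splitlines())
)

if __name__ == "__main__":
    print("Reference genes not reported:", missing)
```

For a real run, the same two functions could be fed the lines of the full reference file and of gene.expr instead of the pasted excerpts.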



        • #5
          I got it!

          Hi, dariober
          I appreciate your prompt reply!

          I'm pretty sure that cufflinks doesn't report genes that are not expressed at all. 3848 expressed genes out of 5928 sounds about right, although I know nothing about your experiment.
          (This might be trivial to say, but also make sure that what you are counting in the GTF file is the same feature type that cufflinks uses as reference.)
          I see! I confirmed that no reads were mapped to the Os01t0100200-01 gene at all.
          I am just an informatician supporting a wet-lab experiment, so I don't know the conditions of this RNA-seq run. But I want to check whether the many genes predicted by cuffcompare are really expressed or not!

          I hope your work goes well, too!
          THANKS!
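
A check like the one described above (no reads mapped to a gene) can be sketched by counting SAM alignments that overlap the gene's interval. This is a rough illustration, not the method used in the thread. The function names and the toy SAM records are hypothetical; the interval 1982-9815 on chr01 is the one listed for Os01t0100100-01 in gene.expr, and the second interval is invented. The count includes every mapped record, so multi-mapped or secondary alignments are not filtered out.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_end(pos, cigar):
    """Last reference position covered by an alignment starting at pos.

    Only M, D, N, = and X consume reference bases.
    """
    span = sum(int(n) for n, op in CIGAR_ELEMENT.findall(cigar) if op in "MDN=X")
    return pos + span - 1


def count_overlapping(sam_lines, chrom, left, right):
    """Count mapped alignments overlapping chrom:left-right (1-based, inclusive)."""
    count = 0
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 4 or rname != chrom or cigar == "*":   # unmapped or other chromosome
            continue
        if pos <= right and reference_end(pos, cigar) >= left:
            count += 1
    return count


# Hypothetical records, for illustration only.
TOY_SAM = [
    "@SQ\tSN:chr01\tLN:50000",
    "r1\t0\tchr01\t2000\t255\t36M\t*\t0\t0\t*\t*",
    "r2\t0\tchr01\t9800\t255\t10M100N26M\t*\t0\t0\t*\t*",
    "r3\t16\tchr01\t11000\t255\t36M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

if __name__ == "__main__":
    print(count_overlapping(TOY_SAM, "chr01", 1982, 9815))
    print(count_overlapping(TOY_SAM, "chr01", 10000, 10900))
```

A count of zero for a gene's interval would be consistent with the gene being left out of the expression output.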
