  • sum of FPKMs?

    I'm analyzing the expression levels of certain genes in different tissues using data from a database, and I need to count two different genes as one because I know from experimental data that they were erroneously annotated.

    The expression levels in the database are in FPKM, and I know I can't simply sum the values of the two genes to count them as one.

    If I had raw counts, I would simply add them:

    gene A = 400, gene B = 300, counted as a single gene = 700.

    What would be the best way to do this with FPKMs?

    gene A = 12 FPKM
    gene B = 20 FPKM
    as single gene = x ?
    Last edited by dlepe; 07-23-2014, 11:51 AM.

  • #2
    FPKM is fragments per kilobase of transcript per million mapped reads.

    So then
    x = total number of fragments / ((total number of bases of transcript / 1000) * (mapped fragments / 1000000))
    = (fragments mapped to gene A + fragments mapped to gene B) / (((bases of gene A + bases of gene B) / 1000) * (mapped fragments / 1000000))

    This would be if gene A and gene B did not overlap (and by that I mean that no read is mapped to both gene A and gene B). If they do, you'll have to use something like the inclusion-exclusion principle. I don't think you can simply add the two FPKM values, like you mentioned.
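The formula above can be sketched in Python for the non-overlapping case. The raw counts (400 and 300) come from the question; the gene lengths, the library size, and the function names `fpkm` and `merged_fpkm` are hypothetical, chosen only for illustration.

```python
def fpkm(fragments: float, length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = length_bp / 1000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


def merged_fpkm(frag_a: float, len_a: float,
                frag_b: float, len_b: float,
                total_mapped_fragments: float) -> float:
    """FPKM of genes A and B treated as one gene.

    Assumes no fragment is counted for both A and B (no overlap).
    """
    return fpkm(frag_a + frag_b, len_a + len_b, total_mapped_fragments)


# Counts from the question; lengths and library size are made-up values.
LEN_A, LEN_B = 2000, 1000        # hypothetical gene lengths in bases
TOTAL_MAPPED = 10_000_000        # hypothetical number of mapped fragments

merged = merged_fpkm(400, LEN_A, 300, LEN_B, TOTAL_MAPPED)
print(f"merged FPKM: {merged:.2f}")
```

This route needs the raw counts and the library size, which a database that reports only FPKM values may not provide.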



    • #3
      The thing is, I don't have the total number of mapped fragments for the libraries. I would have to see whether the raw data is available somewhere and do the mapping myself.

      Since I'm only trying to estimate how the expression of the gene in question correlates with that of another gene, a friend suggested simply using the average of gene A and gene B as the expression value.

      His reasoning is that, since FPKMs are normalized by length, and assuming the raw counts for gene A and gene B are similar, the FPKM for gene A or gene B alone should be very close to the FPKM we'd get by treating both as a single gene.



      • #4
        I suppose you could do an average. I think a weighted average would be better suited for this. You could weight each FPKM value by the length of the corresponding gene.



        • #5
          Yeah I guess, I'll see how that goes, thanks.



          • #6
            I just did the math, and the weighted average is what you want, provided the genes don't overlap, as I stated previously. So if gene A has FPKM a, and gene B has FPKM b, you want:

            x = (a * |A| + b * |B|) / (|A| + |B|)

            where |x| is the length of gene x.
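As a minimal sketch, the length-weighted average can be computed directly from the two FPKM values. The FPKMs (12 and 20) are the ones from the question; the gene lengths and the function name `merged_fpkm_from_fpkms` are hypothetical, since the thread gives no lengths.

```python
def merged_fpkm_from_fpkms(fpkm_a: float, len_a: float,
                           fpkm_b: float, len_b: float) -> float:
    """Length-weighted average of two FPKM values.

    Assumes genes A and B do not overlap, so no fragment is counted twice.
    """
    if len_a <= 0 or len_b <= 0:
        raise ValueError("gene lengths must be positive")
    return (fpkm_a * len_a + fpkm_b * len_b) / (len_a + len_b)


# FPKMs from the question; the lengths (in bases) are made-up values.
x = merged_fpkm_from_fpkms(12, 3000, 20, 1000)
print(x)  # 14.0

# With equal lengths the weighted average reduces to the plain average.
print(merged_fpkm_from_fpkms(12, 1500, 20, 1500))  # 16.0
```

Only the FPKM values and the gene lengths are needed here, not the total number of mapped fragments.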

            Edit: If you want, I can type up my reasoning in LaTeX. I just don't know of a nice way to display fractions on SEQanswers.
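One way to sketch that reasoning, as a reconstruction rather than the poster's own write-up: the symbols c_A and c_B (fragments mapped to each gene) and N (total mapped fragments) are introduced here as assumptions, and the genes are assumed not to overlap.

```latex
\begin{align*}
a &= \frac{10^{9}\, c_A}{|A|\, N}
  \quad\Rightarrow\quad c_A = \frac{a\,|A|\,N}{10^{9}},
\qquad
b = \frac{10^{9}\, c_B}{|B|\, N}
  \quad\Rightarrow\quad c_B = \frac{b\,|B|\,N}{10^{9}} \\[4pt]
x &= \frac{10^{9}\,(c_A + c_B)}{(|A| + |B|)\, N}
   = \frac{10^{9}}{(|A| + |B|)\, N} \cdot \frac{\bigl(a\,|A| + b\,|B|\bigr)\, N}{10^{9}}
   = \frac{a\,|A| + b\,|B|}{|A| + |B|}
\end{align*}
```

N cancels in the last step, so under these assumptions the total number of mapped fragments is not needed, only the two FPKM values and the two gene lengths.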



            • #7
              awesome, I'll look into it, thanks again.
