
  • FPKM=RPKM in single-read RNA-seq?

    Sorry, this may be a silly old question.
    I have seen several places say "FPKM is equivalent to RPKM in single-end RNA-seq". But I thought FPKM is based on counts of fragments (reads extended to the fragment length, which defaults to a mean of 200 bp for single-end reads). So I don't understand how FPKM could be equal to RPKM for single-end reads.

  • #2
    I think you may be misinterpreting the statement. It says FPKM is "equivalent", not "equal". "Equivalent" in this case, as far as I interpret it, refers to the way of thinking about the problem. In other words, if you come from a single-end background and are used to thinking in terms of reads per kilobase of exon per million mapped reads, then once you are in the paired-end world, where pairs can map to multiple places with multiple overall lengths and FPKM is therefore more useful, all you have to do is substitute "FPKM" wherever you used to think "RPKM".

    • #3
      Originally posted by westerman View Post
      I think you may be misinterpreting the statement. It says FPKM is "equivalent", not "equal". "Equivalent" in this case, as far as I interpret it, refers to the way of thinking about the problem. In other words, if you come from a single-end background and are used to thinking in terms of reads per kilobase of exon per million mapped reads, then once you are in the paired-end world, where pairs can map to multiple places with multiple overall lengths and FPKM is therefore more useful, all you have to do is substitute "FPKM" wherever you used to think "RPKM".
      Thanks for the answer.
      Yes, I understand that FPKM differs more from RPKM in paired-end RNA-seq, because one read does not necessarily correspond to one fragment.
      But in single-end data, a fragment is basically an extended read, so one read corresponds to exactly one fragment. So counts of fragments would tend to be larger than counts of reads, since the extended reads cover more space. Is this right?

      • #4
        Well, I wasn't going to discuss your interpretation of FPKM as applied to single-end reads ...

        reads extended by the fragment length, default to 200bp as the mean for single-end reads
        ... since I do not think that it is correct. Cufflinks, with which I am most familiar, does not calculate FPKM in that manner -- as far as I understand the program. But, heck, probably someone somewhere uses that definition. If you could provide a reference to your interpretation of FPKM then perhaps the rest of us can provide more clarification.

        • #5
          One potential issue is defining FPKM for discordantly mapped paired-end reads; in a single-end context, however, this would not be an issue.

          The way Cufflinks calculates it, multi-mapping reads are divided up amongst all the locations they map to. As a result, one cannot directly recover the number of reads mapping to a locus simply by multiplying the FPKM by the locus length and the number of mapped reads.
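
To illustrate the point about divided multi-mapping reads, here is a minimal sketch that assumes a uniform split, where a read mapping to n loci adds 1/n to each. The function name, read IDs and gene names are hypothetical, and this is a simplification for illustration rather than Cufflinks's actual model.

```python
from collections import defaultdict

def fractional_counts(alignments):
    """Give each read a total weight of 1, split evenly over its loci.

    alignments: dict mapping a read ID to the list of loci it aligns to.
    Assumption: uniform 1/n split (illustrative only).
    """
    counts = defaultdict(float)
    for read_id, loci in alignments.items():
        share = 1.0 / len(loci)  # a uniquely mapped read keeps its full weight
        for locus in loci:
            counts[locus] += share
    return dict(counts)

# Hypothetical toy data: four reads touch geneA, but only one maps uniquely.
alignments = {
    "read1": ["geneA"],
    "read2": ["geneA", "geneB"],
    "read3": ["geneA", "geneB"],
    "read4": ["geneB"],
    "read5": ["geneA", "geneB", "geneC"],
}

counts = fractional_counts(alignments)
for locus in sorted(counts):
    print(locus, round(counts[locus], 3))
```

Here geneA ends up with a fractional count even though four whole reads align to it, so back-calculating from an FPKM value gives the fractional count, not the number of reads that touch the locus.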

          • #6
            Originally posted by westerman View Post
            Well, I wasn't going to discuss your interpretation of FPKM as applied to single-end reads ...



            ... since I do not think that it is correct. Cufflinks, with which I am most familiar, does not calculate FPKM in that manner -- as far as I understand the program. But, heck, probably someone somewhere uses that definition. If you could provide a reference to your interpretation of FPKM then perhaps the rest of us can provide more clarification.
            Thanks.
            I was asking whether my understanding is correct. I got the impression that fragments are extended reads from some RNA-seq tutorial slides, but those were not about Cufflinks, so I may have mixed things up.
            I tried to find the definition in the Cufflinks paper, but it mainly discusses paired-end data. For single-end data, I honestly don't know how it gets fragments from reads.
            If you can tell me your interpretation, that would be much appreciated.

            • #7
              Originally posted by metheuse View Post
              Thanks.
              I was asking whether my understanding is correct. I got the impression that fragments are extended reads from some RNA-seq tutorial slides, but those were not about Cufflinks, so I may have mixed things up.
              I tried to find the definition in the Cufflinks paper, but it mainly discusses paired-end data. For single-end data, I honestly don't know how it gets fragments from reads.
              If you can tell me your interpretation, that would be much appreciated.
              It's simply the name. RPKM was used in the original Mortazavi paper. That calculation is relatively straightforward, whereas the Cufflinks method attempts to rescue multi-mapping reads by dividing them up amongst their mapped locations. I think part of the motivation for the new name is to distinguish this calculation from the original RPKM calculation.
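
For reference, the straightforward RPKM calculation can be sketched as below. The function and argument names are my own, and the numbers are made up for illustration.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads.

    Equivalent to: read_count / (exon_length_bp / 1e3) / (total_mapped_reads / 1e6)
    """
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)

# Hypothetical example: 500 reads on a gene with 2,000 bp of exon,
# in a library of 10 million mapped reads.
value = rpkm(500, 2000, 10_000_000)
print(value)
```

With whole, uniquely assigned read counts this formula can be inverted to get the count back, which is what stops working once reads are split between locations.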

              • #8
                Originally posted by chadn737 View Post
                One potential issue is defining FPKM for discordantly mapped paired-end reads; in a single-end context, however, this would not be an issue.

                The way Cufflinks calculates it, multi-mapping reads are divided up amongst all the locations they map to. As a result, one cannot directly recover the number of reads mapping to a locus simply by multiplying the FPKM by the locus length and the number of mapped reads.
                Thanks for the explanation.
                I still don't understand exactly how Cufflinks gets "fragments" from "reads" in the single-end case. The program has some parameters that control the estimation of fragment length, but I don't know how these are used to arrive at "fragment counts".

                • #9
                  Let me just ask one question:

                  What exactly does "fragments" mean in the single-end case? (Parts of reads? Combinations of reads? Extensions of reads? Extensions of parts of reads? Or something else?)

                  • #10
                    Originally posted by metheuse View Post
                    Let me just ask one question:

                    What exactly does "fragments" mean in the single-end case? (Parts of reads? Combinations of reads? Extensions of reads? Extensions of parts of reads? Or something else?)
                    It's a name. I would be more concerned with understanding how it is calculated than with getting hung up on the difference between a read and a fragment.

                    • #11
                      ontology is important

                      Originally posted by chadn737 View Post
                      It's a name. I would be more concerned with understanding how it is calculated than with getting hung up on the difference between a read and a fragment.
                      I understand your sentiment. However, I have come across the word "fragment" without knowing exactly what it means. If I don't know what it means, I can't completely wrap my head around whatever definition I happen to be reading at the time.

                      I guess what I'm really asking is: does "fragment" have no standard definition when it comes to NGS data? If so, that is very bad.

                      Thanks for all your help.

                      • #12
                        Originally posted by SrCardgage View Post
                        I understand your sentiment. However, I have come across the word "fragment" without knowing exactly what it means. If I don't know what it means, I can't completely wrap my head around whatever definition I happen to be reading at the time.

                        I guess what I'm really asking is: does "fragment" have no standard definition when it comes to NGS data? If so, that is very bad.

                        Thanks for all your help.
                        It does have a definition. Are you familiar with how libraries are prepared? At some step in the actual bench work, the DNA or RNA is fragmented into smaller segments and only fragments of a certain length are then used in sequencing.

                        A read is the portion of the fragment that has been sequenced. Illumina sequencers can sequence from one end of the fragment, giving a single read per fragment, or from both ends in a paired-end fashion, giving two reads per fragment. When you try to assess gene expression from paired-end data, you count the two reads from either end as one count, i.e. one fragment.

                        In single-end data, the read is synonymous with the fragment. What you have to remember, however, is that the way RPKM, FPKM, etc. are calculated differs even with single-end data, so it can become a bit confusing if you don't understand how these values are being calculated.
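
As a rough illustration of the read versus fragment distinction described above, the sketch below counts reads and fragments in toy data. It assumes the two mates of a pair share one template name; the record layout and names are hypothetical and not taken from any particular tool.

```python
def count_reads_and_fragments(read_records):
    """Return (number of reads, number of fragments).

    read_records: list of (template_name, mate) tuples, where mate is
    1 or 2 for paired-end reads and 0 for single-end reads.
    Assumption: mates from the same fragment share a template name.
    """
    n_reads = len(read_records)
    n_fragments = len({name for name, _mate in read_records})
    return n_reads, n_fragments

# Paired-end: two reads per fragment (frag3 lost its second mate).
paired = [("frag1", 1), ("frag1", 2), ("frag2", 1), ("frag2", 2), ("frag3", 1)]
# Single-end: one read per fragment.
single = [("frag1", 0), ("frag2", 0), ("frag3", 0)]

paired_counts = count_reads_and_fragments(paired)
single_counts = count_reads_and_fragments(single)
print("paired-end:", paired_counts)
print("single-end:", single_counts)
```

In the single-end case the two numbers coincide, which is the sense in which a read count and a fragment count are the same thing there.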
