  • How to identify the number of fragments produced for a given gene in RNA-seq?

    Hi all,

    I have RNA-seq data. To calculate the FPKM value manually, we need to know the number of fragments generated during RNA-seq for a given gene, am I right? Could anyone please tell me where to look to get the number of fragments produced for a given gene or any other feature, such as a CDS?

    I followed the procedure below. Please let me know if my way of counting reads is wrong.

    First I opened the BAM files in IGV, then I counted the number of fragments for a given gene. Did I do this correctly? Also, please tell me whether this procedure is right for paired-end sequencing: do I have to count the left and right reads separately, or combine them into one fragment?

  • #2
    Hi Muthukumar,

    In principle, you are right. A fragment is represented by a read (single-end) or a read pair (paired-end). Unfortunately, each read/read pair can map to several positions in your annotation, which causes some ambiguity.
    Therefore, there are many ways to count the reads/read pairs for a certain gene, transcript, or feature. You may start with the Tuxedo suite pipeline: http://www.ncbi.nlm.nih.gov/pubmed/22383036. Other methods are Salmon, featureCounts, RSEM, and many more.
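To make the "one read pair = one fragment" idea concrete, here is a minimal sketch in Python. It assumes a simplified, hypothetical `Alignment` record (read name plus half-open coordinates) instead of a real BAM parser, and it ignores strand, multi-mapping, and spliced alignments, which real counting tools do handle. Both mates of a pair share a read name, so counting distinct names counts each pair once.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Alignment(NamedTuple):
    """One aligned read (hypothetical, simplified record for illustration)."""
    read_name: str  # both mates of a pair share the same name
    start: int      # 0-based, inclusive
    end: int        # 0-based, exclusive


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def count_fragments(alignments: Iterable[Alignment],
                    exons: List[Tuple[int, int]]) -> int:
    """Count fragments with at least one mate overlapping any exon.

    A pair is counted once, even if both mates hit the gene.
    """
    names = set()
    for aln in alignments:
        if any(overlaps(aln.start, aln.end, s, e) for s, e in exons):
            names.add(aln.read_name)
    return len(names)


# Toy gene with two exons and three read pairs (made-up coordinates).
exons = [(100, 200), (400, 500)]
alignments = [
    Alignment("pair1", 120, 170),  # left mate in exon 1
    Alignment("pair1", 420, 470),  # right mate in exon 2
    Alignment("pair2", 450, 500),  # mates overlap each other in exon 2
    Alignment("pair2", 470, 520),
    Alignment("pair3", 250, 300),  # both mates intronic, not counted
    Alignment("pair3", 310, 360),
]
n_fragments = count_fragments(alignments, exons)
print(n_fragments)
```

In this toy data, `pair2` has two overlapping mates on the same exon, yet it contributes a single fragment.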

    Cheers,

    Michael

    • #3
      @Muthukumar: You don't want to do this by hand. There are software packages, featureCounts and htseq-count, that do this for one (or more) aligned BAM files. Both packages require a genomic feature definition file (GFF/GTF). If you are using a model organism, these files are easy to find. Make sure you use one that matches the genome build used for your alignments.
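Once such a tool has run, the per-gene counts can be read programmatically instead of from IGV. The sketch below assumes the tab-delimited layout that featureCounts typically writes: a leading `#` comment line, then the columns `Geneid`, `Chr`, `Start`, `End`, `Strand`, `Length`, followed by one count column per BAM file. The example table and gene names are made up; check the layout against the output of your own version.

```python
import csv
import io

# Made-up example in the assumed featureCounts-style layout.
EXAMPLE = (
    "# Program:featureCounts; Command:(elided)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample.bam\n"
    "geneA\tchr1;chr1\t100;400\t199;499\t+;+\t200\t150\n"
    "geneB\tchr2\t1000\t1999\t-\t1000\t30\n"
)


def read_counts(handle):
    """Return {gene_id: (exon_length_bp, count)} for the first sample column."""
    # Skip the leading comment line(s) before handing rows to the CSV reader.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    table = {}
    for row in reader:
        count_column = reader.fieldnames[6]  # first column after "Length"
        table[row["Geneid"]] = (int(row["Length"]), int(row[count_column]))
    return table


counts = read_counts(io.StringIO(EXAMPLE))
for gene, (length, count) in counts.items():
    print(gene, length, count)
```

The `Length` column is useful here because it gives the summed exon length that an FPKM calculation needs.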

      • #4
        Thank you for answering the question. I am already following the Nature Protocols paper you mentioned. When I ran cufflinks and cuffdiff, I got one column labeled FPKM. I want to check whether my manually calculated FPKM and the cuffdiff-generated FPKM are the same. Unfortunately, I am not getting exactly the same answer.

        Here is the procedure I followed to calculate FPKM.

        1. I counted the reads for a specific gene using IGV.
        (Here I want to clarify one doubt. I am using paired-end sequencing data, and for some genes some reads were found on both the left and the right, i.e. overlapping reads. Should I count these as 2 reads or as 1 read?)

        for instance:

        --------------> (read 1) <-------------- (read L)
        ------------------->(read R)
        ______________Exon1_______|___________________________|____Exon2___________

        For exon 2, should the 2 fragments be combined into 1 or counted separately? Please clarify. For my manual calculation I counted them as 2 fragments.

        2. I calculated FPKM using the following formula:

        FPKM = (# of fragments * 10^9) / (length of gene * total no. of reads)

        Is the above formula right?

        One more doubt: my gene of interest contains 17 exons. Not all 17 exons have read fragments, and for a given exon some of the fragments are short and some are long. Should I count the short reads as well? Please correct me.

        • #5
          Actually, the FPKM values differ because Cuffdiff applies some extra heuristics.
          As GenoMax posted, you should not count the reads manually but use an accepted tool.
          FPKM is usually computed at the transcript level, with the length of the transcript's exons as part of the denominator.
          Read length is something that should be controlled for in the alignment. Therefore, if your aligner reported that a read/read pair maps there, I would take it as a valid read/read pair. If you doubt it, you might re-align your data with length-filtered reads (e.g. using bbduk.sh from BBMap).
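For a hand check, the plain FPKM definition can be sketched in Python as below. This assumes the usual form: fragments times 10^9, divided by the summed exon length in bp times the total number of mapped fragments (equivalently, fragments per kilobase of exon per million mapped fragments). The function names and the example numbers are made up for illustration. Since Cuffdiff applies extra heuristics, its values are not expected to match this naive number exactly.

```python
from typing import List, Tuple


def merged_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths after merging overlaps (half-open coordinates)."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(exons):
        if current_end is None or start > current_end:
            # Gap before this exon: close the previous merged block.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent exon: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def fpkm(fragments: int, exon_length_bp: int,
         total_mapped_fragments: int) -> float:
    """Naive FPKM: fragments per kilobase of exon per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Made-up numbers: 150 fragments on a gene with 2,000 bp of exons,
# in a library with 20 million mapped fragments.
length = merged_exon_length([(0, 1200), (1000, 1500), (3000, 3500)])
value = fpkm(150, length, 20_000_000)
print(length, value)
```

Note that the denominator uses exon length, not the genomic span of the gene including introns, and mapped fragments (pairs counted once), not raw reads.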
