
  • Cufflinks reporting different FPKMs for the same gene

    Hello,

    I am analyzing some bacterial RNA-seq data with Cufflinks. Since splicing isn't an issue in bacterial RNA-seq, I am using a mapping program that does not take splicing into account. I just want to get FPKMs for my genes and do some differential expression, even though I am aware that Cufflinks was created for eukaryotic RNA-seq. The FAQ states that Cufflinks will work with bacterial RNA-seq provided that I map against a FASTA file of already annotated genes. I know Cufflinks assembles transcripts, but when I feed it my SAM file (generated with the mapping program PerM by mapping reads to a multifasta file of the genes in the genome), Cufflinks reports a single gene as multiple locations in separate rows, each with its own FPKM. I want just one FPKM per gene. Does anyone know a way to resolve this issue? Cheers.

  • #2
    Could it be multiple isoforms of the same gene?



    • #3
      Hey Nicolas,

      That's what I was thinking it might be. However, I looked at the read pileups in IGV, and Cufflinks is just assembling different transcripts of the same gene de novo, based on how the reads cluster. For example, for one gene that I am mapping reads to, instead of calculating a single FPKM from all reads hitting that gene, Cufflinks splits the gene into thirds according to where reads pile up, calculates a separate FPKM for each of the three regions, and reports them as different "genes" (transcripts). So, rather than different isoforms, it looks like it is just splitting up genes based on where reads fall. I am also using a mapping program that is unaware of splicing. I am trying my luck with a few other programs to compare.



      • #4
        Could you post the command you're using?
        Which Cufflinks mode are you using: de novo (default), with a reference annotation (-G), or RABT (-g)?
        Is your gene completely covered by reads? If not (and if you're using de novo mode), then Cufflinks has no information indicating that the 3 regions are actually one single gene...

        Please provide more info.



        • #5
          Hello Nicolas, my command is below:

          cufflinks -N -u seq1_380-380_r1_out.sorted.sam

          It is in default mode, I think. I aligned my reads to a multifasta file of annotated genes in the hope that this would be sufficient for Cufflinks to assign reads only to those genes, but I was wrong: Cufflinks assembled transcripts because no GTF was supplied. I have been searching for a way to obtain or generate a reference GTF file for my bacterium, but I cannot seem to find one. That would probably fix my problem. Do you know how I might generate one from an annotated reference genome in FASTA format? Thank you for your inquiries! I am still quite new to this.
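
          One possible way to build such a GTF when the reads were mapped to a multifasta of genes is sketched below. It assumes each FASTA record is exactly one gene, that the reference names in the SAM file are the first word of each FASTA header, and that a single forward-strand exon spanning the whole record is an acceptable annotation. The function names and the "multifasta" source label are hypothetical, and whether Cufflinks -G handles this well for a given data set is untested here.

          ```python
          def fasta_lengths(lines):
              """Return {record_name: sequence_length} for FASTA text given as lines.

              The record name is taken as the first whitespace-delimited word of the
              header (an assumption; the mapper may name references differently).
              """
              lengths = {}
              name = None
              for line in lines:
                  line = line.strip()
                  if not line:
                      continue
                  if line.startswith(">"):
                      name = line[1:].split()[0]
                      lengths[name] = 0
                  elif name is not None:
                      lengths[name] += len(line)
              return lengths


          def gtf_lines(lengths, source="multifasta"):
              """Build one GTF 'exon' line per record, covering positions 1..length.

              Each record gets matching gene_id and transcript_id attributes, placed
              on the '+' strand (assumed, since each record is the gene itself).
              """
              out = []
              for name, length in lengths.items():
                  attrs = f'gene_id "{name}"; transcript_id "{name}";'
                  fields = [name, source, "exon", "1", str(length), ".", "+", ".", attrs]
                  out.append("\t".join(fields))
              return out


          if __name__ == "__main__":
              example = [">geneA some description", "ACGT", "AC", ">geneB", "GGG"]
              for row in gtf_lines(fasta_lengths(example)):
                  print(row)
          ```

          To use it on a real file, pass an open file handle (for example `open("genes.fa")`, a hypothetical file name) to `fasta_lengths` and write the returned lines to a `.gtf` file.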



          • #6
            Does your multifasta file contain one entry per gene?
            If so, it should be easy to count the number of reads mapping to each entry (samtools idxstats <aln.bam>, for instance). You can then normalize by gene length and library depth to get something similar to FPKM.
            I don't think Cufflinks could do what you want, but I am also not sure you really need it!
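
            The normalization described above can be sketched as follows. This is a minimal illustration, assuming the standard tab-separated `samtools idxstats` output (reference name, sequence length, mapped reads, unmapped reads) and taking the library depth to be the total number of reads mapped to the gene entries. The function names are hypothetical. Because idxstats counts reads rather than fragments, the result is strictly an RPKM-style value; for single-end data the two coincide.

            ```python
            def parse_idxstats(text):
                """Parse `samtools idxstats` output into (name, length, mapped) tuples.

                The trailing '*' line, which holds reads with no reference, is skipped.
                """
                rows = []
                for line in text.splitlines():
                    if not line.strip():
                        continue
                    name, length, mapped = line.split("\t")[:3]
                    if name == "*":
                        continue
                    rows.append((name, int(length), int(mapped)))
                return rows


            def fpkm_from_idxstats(text):
                """Return {gene: reads per kilobase of gene per million mapped reads}."""
                rows = parse_idxstats(text)
                total_mapped = sum(mapped for _, _, mapped in rows)
                if total_mapped == 0:
                    return {name: 0.0 for name, _, _ in rows}
                return {
                    name: mapped * 1e9 / (length * total_mapped)
                    for name, length, mapped in rows
                    if length > 0
                }


            if __name__ == "__main__":
                example = "geneA\t1000\t500\t0\ngeneB\t2000\t500\t0\n*\t0\t0\t42\n"
                for gene, value in fpkm_from_idxstats(example).items():
                    print(gene, value)
            ```

            Note that `samtools idxstats` expects a coordinate-sorted, indexed BAM file, so the SAM output would first need to be converted, sorted and indexed.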



            • #7
              Thank you for the replies, Nicolas. Much appreciated. Good luck with everything.
