  • ivygreen
    Junior Member
    • Jan 2016
    • 3

    Reads Mapped but 0 FPKM for Every Gene

    Hi all,

    I'm using Tophat 2.1.0 and Cufflinks 2.2.1 to analyze mouse RNA sequencing. I used Tophat to align the reads, and this step appeared to be successful. A sample alignment summary for one of my Tophat runs is below:

    --------------------------------------

    >008_F0/align_summary.txt
    Left reads:
    Input : 16203854
    Mapped : 12585562 (77.7% of input)
    of these: 7295254 (58.0%) have multiple alignments (38724 have >20)
    Right reads:
    Input : 16203854
    Mapped : 11407353 (70.4% of input)
    of these: 6601221 (57.9%) have multiple alignments (38723 have >20)
    74.0% overall read mapping rate.

    Aligned pairs: 10988017
    of these: 6368400 (58.0%) have multiple alignments
    1480 ( 0.0%) are discordant alignments
    67.8% concordant pair alignment rate.

    --------------------------------------

    However, when I use cuffdiff to get abundance and differential expression estimates, I get 0 FPKM for everything (every value in every file created by this step is 0!).

    I suspect that the error is in the original cuffdiff step, but I'm not sure what is wrong. My reference files are from http://ftp.ensembl.org/pub/current_fasta/. This was the output for my cuffdiff run:

    --------------------------------------

    >cuffdiff -o all-diffs -b download/Mus_musculus.GRCm38.dna.toplevel.fa -p 4 -L 008_F0,008_F4,018_F0,018_F4,019_F0,019_F4 -u download/Mus_musculus.GRCm38.83.gtf 008_F0/accepted_hits.bam 008_F4/accepted_hits.bam 018_F0/accepted_hits.bam 018_F4/accepted_hits.bam 019_F0/accepted_hits.bam 019_F4/accepted_hits.bam

    You are using Cufflinks v2.2.1, which is the most recent release.
    [16:08:46] Loading reference annotation and sequence.
    Warning: No conditions are replicated, switching to 'blind' dispersion method
    [16:09:41] Inspecting maps and determining fragment length distributions.
    [16:19:10] Modeling fragment count overdispersion.
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    [16:25:17] Calculating preliminary abundance estimates
    > Processed 33596 loci. [*************************] 100%
    [16:51:51] Learning bias parameters.
    [17:00:27] Testing for differential expression and regulation in locus.
    > Processed 33596 loci. [*************************] 100%
    Performed 0 isoform-level transcription difference tests
    Performed 0 tss-level transcription difference tests
    Performed 0 gene-level transcription difference tests
    Performed 0 CDS-level transcription difference tests
    Performed 0 splicing tests
    Performed 0 promoter preference tests
    Performing 0 relative CDS output tests
    Writing isoform-level FPKM tracking
    Writing TSS group-level FPKM tracking
    Writing gene-level FPKM tracking
    Writing CDS-level FPKM tracking
    Writing isoform-level count tracking
    Writing TSS group-level count tracking
    Writing gene-level count tracking
    Writing CDS-level count tracking
    Writing isoform-level read group tracking
    Writing TSS group-level read group tracking
    Writing gene-level read group tracking
    Writing CDS-level read group tracking
    Writing read group info
    Writing run info

    --------------------------------------

    As you can see, all the map properties were 0, which I was told might not be an incorrect result, although it seemed unexpected to me. Does anyone know of anything else that I can try? Any suggestions would be greatly appreciated.

    Thanks in advance!
  • westerman
    Rick Westerman
    • Jun 2008
    • 1104

    #2
    Looks like you are doing things OK, but try running without the '-u' option.

    • ivygreen
      Junior Member
      • Jan 2016
      • 3

      #3
      I tried re-running without the -u option, but all FPKM are still 0.

      • Michael.Ante
        Senior Member
        • Oct 2011
        • 127

        #4
        Hi Ivygreen,
        I'd run cufflinks on one or two samples first; that may give you a more detailed report. Also check whether the BAM files are indexed (e.g. samtools index 008_F0/accepted_hits.bam) and whether the chromosome names are the same in the alignments and the GTF file.
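The chromosome-name check can be sketched in a few lines of Python. This is only an illustration, not part of Cufflinks: it assumes the BAM header has first been dumped to text (for example with `samtools view -H 008_F0/accepted_hits.bam > header.sam`, where `header.sam` is a made-up file name), and the function names are hypothetical.

```python
def gtf_seqnames(gtf_lines):
    """Collect the sequence names in column 1 of a GTF, skipping comments."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_seqnames(header_lines):
    """Collect the SN: values from the @SQ lines of a SAM header."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def compare_seqnames(gtf_lines, header_lines):
    """Report which sequence names are shared and which appear on one side only."""
    gtf = gtf_seqnames(gtf_lines)
    bam = sam_header_seqnames(header_lines)
    return {
        "shared": sorted(gtf & bam),
        "gtf_only": sorted(gtf - bam),
        "bam_only": sorted(bam - gtf),
    }
```

To use it, pass two open file handles, e.g. `compare_seqnames(open("download/Mus_musculus.GRCm38.83.gtf"), open("header.sam"))`. An empty "shared" list (for instance "1" in the GTF versus "chr1" in the BAM) would mean no read can be assigned to any annotated gene.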

        • ivygreen
          Junior Member
          • Jan 2016
          • 3

          #5
          Hi Michael Ante,
          I'm pretty new to these programs. I ran cufflinks on a few samples - how do I interpret the resulting reports? Also, if the files are indexed with different alignments (which I now suspect they are), is there an easy way to re-align them properly?

          • cmbetts
            Senior Member
            • Jun 2012
            • 120

            #6
            This wouldn't explain why your FPKMs are zero for every gene, but your alignment statistics are a little alarming.
            Aligned pairs: 10988017
            of these: 6368400 (58.0%) have multiple alignments
            That's a really high multiple alignment rate for mouse RNA-Seq. I've only ever seen it that high when the rRNA depletion/polyA selection failed. Even so, with 10M reads, you should be picking up >10k genes. How were these samples prepared?
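For anyone wanting to recompute these rates from the raw counts in align_summary.txt, here is a minimal Python sketch. The function names are made up for illustration, and the concordant-rate formula (aligned pairs minus discordant pairs, divided by input pairs) is an assumption that happens to reproduce the 67.8% reported above; it is not taken from the TopHat source.

```python
def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def pair_rates(input_pairs, aligned_pairs, multi_pairs, discordant_pairs):
    """Recompute pair-level rates from the counts in a TopHat alignment summary."""
    return {
        # Share of aligned pairs that map to more than one location.
        "multi_of_aligned": percent(multi_pairs, aligned_pairs),
        # Assumed definition: non-discordant aligned pairs over all input pairs.
        "concordant_of_input": percent(aligned_pairs - discordant_pairs, input_pairs),
    }


# Counts copied from the 008_F0 summary posted at the top of the thread.
rates = pair_rates(
    input_pairs=16203854,
    aligned_pairs=10988017,
    multi_pairs=6368400,
    discordant_pairs=1480,
)
for name, value in rates.items():
    print(f"{name}: {value:.1f}%")
```

With the 008_F0 counts this gives 58.0% multi-mapped pairs and a 67.8% concordant rate, matching the summary.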

            • Michael.Ante
              Senior Member
              • Oct 2011
              • 127

              #7
              Hi Ivygreen,

              Did you get the same map mass with cufflinks as with cuffdiff? If not, you can use e.g. R to analyse the genes.fpkm_tracking and isoforms.fpkm_tracking files. For instance, you can make a boxplot of the FPKM values.
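If R is not at hand, a quick look at the same files can be sketched in Python with only the standard library. This is an illustration under two assumptions: the tracking file is tab-delimited with a header row, and FPKM values sit in a column named `FPKM` (cufflinks) or in columns ending in `_FPKM` (cuffdiff, one per sample label). The function names are hypothetical.

```python
import csv
import statistics


def fpkm_columns(fieldnames):
    """Pick the columns that hold FPKM values: 'FPKM' or '<label>_FPKM'."""
    return [c for c in fieldnames if c == "FPKM" or c.endswith("_FPKM")]


def summarize_fpkm(handle):
    """Summarise each FPKM column of a tab-delimited *.fpkm_tracking file."""
    reader = csv.DictReader(handle, delimiter="\t")
    cols = fpkm_columns(reader.fieldnames or [])
    values = {c: [] for c in cols}
    for row in reader:
        for c in cols:
            values[c].append(float(row[c]))
    summary = {}
    for c, vals in values.items():
        summary[c] = {
            "genes": len(vals),                     # rows read for this column
            "nonzero": sum(v > 0 for v in vals),    # rows with FPKM above zero
            "median": statistics.median(vals) if vals else 0.0,
            "max": max(vals, default=0.0),
        }
    return summary
```

Calling `summarize_fpkm(open("genes.fpkm_tracking"))` returns one small dictionary per FPKM column; a "nonzero" count of 0 for every column would confirm that nothing was quantified at all, rather than just a handful of genes.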

              Could you please post the tophat command you were using for the alignment?

              Cheers,

              Michael
