
  • Reads Mapped but 0 FPKM for Every Gene

    Hi all,

    I'm using Tophat 2.1.0 and Cufflinks 2.2.1 to analyze mouse RNA sequencing. I used Tophat to align the reads, and this step appeared to be successful. A sample alignment summary for one of my Tophat runs is below:

    --------------------------------------

    >008_F0/align_summary.txt
    Left reads:
    Input : 16203854
    Mapped : 12585562 (77.7% of input)
    of these: 7295254 (58.0%) have multiple alignments (38724 have >20)
    Right reads:
    Input : 16203854
    Mapped : 11407353 (70.4% of input)
    of these: 6601221 (57.9%) have multiple alignments (38723 have >20)
    74.0% overall read mapping rate.

    Aligned pairs: 10988017
    of these: 6368400 (58.0%) have multiple alignments
    1480 ( 0.0%) are discordant alignments
    67.8% concordant pair alignment rate.

    --------------------------------------

    However, when I use cuffdiff to get abundance and differential expression estimates, I get 0 FPKM for everything (every value in every file created by this step is 0!).

    I suspect that the error is in the original cuffdiff step, but I'm not sure what is wrong. My reference files are from http://ftp.ensembl.org/pub/current_fasta/. This was the output for my cuffdiff run:

    --------------------------------------

    >cuffdiff -o all-diffs -b download/Mus_musculus.GRCm38.dna.toplevel.fa -p 4 -L 008_F0,008_F4,018_F0,018_F4,019_F0,019_F4 -u download/Mus_musculus.GRCm38.83.gtf 008_F0/accepted_hits.bam 008_F4/accepted_hits.bam 018_F0/accepted_hits.bam 018_F4/accepted_hits.bam 019_F0/accepted_hits.bam 019_F4/accepted_hits.bam

    You are using Cufflinks v2.2.1, which is the most recent release.
    [16:08:46] Loading reference annotation and sequence.
    Warning: No conditions are replicated, switching to 'blind' dispersion method
    [16:09:41] Inspecting maps and determining fragment length distributions.
    [16:19:10] Modeling fragment count overdispersion.
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Number of Multi-Reads: 0 (with 0 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    [16:25:17] Calculating preliminary abundance estimates
    > Processed 33596 loci. [*************************] 100%
    [16:51:51] Learning bias parameters.
    [17:00:27] Testing for differential expression and regulation in locus.
    > Processed 33596 loci. [*************************] 100%
    Performed 0 isoform-level transcription difference tests
    Performed 0 tss-level transcription difference tests
    Performed 0 gene-level transcription difference tests
    Performed 0 CDS-level transcription difference tests
    Performed 0 splicing tests
    Performed 0 promoter preference tests
    Performing 0 relative CDS output tests
    Writing isoform-level FPKM tracking
    Writing TSS group-level FPKM tracking
    Writing gene-level FPKM tracking
    Writing CDS-level FPKM tracking
    Writing isoform-level count tracking
    Writing TSS group-level count tracking
    Writing gene-level count tracking
    Writing CDS-level count tracking
    Writing isoform-level read group tracking
    Writing TSS group-level read group tracking
    Writing gene-level read group tracking
    Writing CDS-level read group tracking
    Writing read group info
    Writing run info

    --------------------------------------

    As you can see, all the map properties were 0, which I was told might not be an incorrect result, although it seemed unexpected to me. Does anyone know of anything else that I can try? Any suggestions would be greatly appreciated.

    Thanks in advance!

  • #2
    Looks like you are doing things ok. But try without the '-u' option.

    • #3
      I tried re-running without the -u option, but all FPKM are still 0.

      • #4
        Hi Ivygreen,
        I'd run cufflinks on one or two samples first; you may get a more detailed report there. Also check whether the BAM files are indexed (e.g. samtools index 008_F0/accepted_hits.bam) and whether the chromosome names are the same in the alignment and the GTF file.
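
The chromosome-name check suggested in #4 could be sketched as below. This is a minimal illustration, not a confirmed diagnosis of this thread: it assumes you have saved the BAM header as text (for example the output of `samtools view -H 008_F0/accepted_hits.bam`) and can read the GTF line by line. The function names are hypothetical helpers, and the sample header and GTF lines are made up for the demonstration.

```python
def bam_reference_names(sam_header_text):
    """Collect the SN: values from the @SQ lines of a SAM/BAM header."""
    names = set()
    for line in sam_header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_seqnames(gtf_lines):
    """Collect the first column (seqname) of every non-comment GTF line."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(bam_names, gtf_names):
    """Report which sequence names are shared and which occur on one side only."""
    return {
        "shared": sorted(bam_names & gtf_names),
        "bam_only": sorted(bam_names - gtf_names),
        "gtf_only": sorted(gtf_names - bam_names),
    }


# Made-up example: a 'chr'-prefixed header against Ensembl-style GTF names.
example_header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:900\n"
example_gtf = [
    "#!genome-build example",
    "1\tensembl\tgene\t10\t200\t.\t+\t.\tgene_id \"G1\";",
    "2\tensembl\tgene\t50\t400\t.\t-\t.\tgene_id \"G2\";",
]

report = compare_names(bam_reference_names(example_header), gtf_seqnames(example_gtf))
print(report)
```

If `shared` comes back empty, no read can overlap any annotated transcript, which would be consistent with a map mass of 0. Whether that is what happened here depends on which index Tophat was run against, which the thread does not state.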

        • #5
          Hi Michael Ante,
          I'm pretty new to these programs. I ran cufflinks on a few samples - how do I interpret the resulting reports? Also, if the files are indexed with different alignments (which I now suspect they are), is there an easy way to re-align them properly?

          • #6
            This wouldn't explain why your FPKMs are zero for every gene, but your alignment statistics are a little alarming.
            Aligned pairs: 10988017
            of these: 6368400 (58.0%) have multiple alignments
            That's a really high multiple alignment rate for mouse RNA-Seq. I've only ever seen it that high when the rRNA depletion/polyA selection failed. Even so, with 10M reads, you should be picking up >10k genes. How were these samples prepared?
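
The multiple-alignment rates quoted in #6 can be recomputed from align_summary.txt to compare samples side by side. The sketch below is one way to do it, assuming the file has the layout shown in the first post; `multimap_fractions` is a hypothetical helper name, and the sample text is copied from that post.

```python
import re


def multimap_fractions(summary_text):
    """Map each block ('left', 'right', 'pairs') to multi-mapped / mapped."""
    fractions = {}
    label, total = None, None
    for raw in summary_text.splitlines():
        line = raw.strip()
        if line.startswith("Left reads"):
            label, total = "left", None
        elif line.startswith("Right reads"):
            label, total = "right", None
        mapped = re.match(r"Mapped\s*:\s*(\d+)", line)
        pairs = re.match(r"Aligned pairs\s*:\s*(\d+)", line)
        multi = re.match(r"of these\s*:\s*(\d+).*multiple alignments", line)
        if mapped:
            total = int(mapped.group(1))
        elif pairs:
            label, total = "pairs", int(pairs.group(1))
        elif multi and label and total:
            fractions[label] = int(multi.group(1)) / total
    return fractions


# Summary text as posted for sample 008_F0.
summary = """Left reads:
Input : 16203854
Mapped : 12585562 (77.7% of input)
of these: 7295254 (58.0%) have multiple alignments (38724 have >20)
Right reads:
Input : 16203854
Mapped : 11407353 (70.4% of input)
of these: 6601221 (57.9%) have multiple alignments (38723 have >20)
74.0% overall read mapping rate.

Aligned pairs: 10988017
of these: 6368400 (58.0%) have multiple alignments
1480 ( 0.0%) are discordant alignments
67.8% concordant pair alignment rate.
"""

for block, fraction in multimap_fractions(summary).items():
    print(f"{block}: {fraction:.1%} multi-mapped")
```

Running this over every sample's summary shows whether the high rate is specific to one library or common to the whole batch.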

            • #7
              Hi Ivygreen,

              Did you get the same map mass with cufflinks as with cuffdiff? If not, you can use e.g. R to analyse the genes.fpkm_tracking and isoforms.fpkm_tracking files. For instance, you can make a boxplot of the FPKM values.

              Could you please post the tophat command you were using for the alignment?

              Cheers,

              Michael
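
As a quick alternative to the R boxplot suggested in #7, a numeric summary of a tracking file already shows whether any gene has a non-zero estimate. The sketch below uses only the Python standard library. It assumes a tab-delimited file with a header row and a column named `FPKM`, as Cufflinks writes in genes.fpkm_tracking; cuffdiff's tracking files, to my understanding, carry one FPKM column per sample label instead, so pass the appropriate column name. `summarize_fpkm` is a hypothetical helper and the sample rows are made up.

```python
import csv
import io


def summarize_fpkm(tracking_text, column="FPKM"):
    """Count rows, non-zero FPKM values, and the maximum FPKM in a tracking table."""
    reader = csv.DictReader(io.StringIO(tracking_text), delimiter="\t")
    values = [float(row[column]) for row in reader]
    nonzero = [v for v in values if v > 0]
    return {
        "rows": len(values),
        "nonzero": len(nonzero),
        "max": max(values, default=0.0),
    }


# Made-up rows with a reduced set of columns, for illustration only.
example_tracking = (
    "tracking_id\tgene_id\tlocus\tFPKM\n"
    "G1\tG1\t1:10-200\t0\n"
    "G2\tG2\t1:500-900\t12.5\n"
    "G3\tG3\t2:50-400\t0\n"
)

print(summarize_fpkm(example_tracking))
```

For a real file, read it first, e.g. `summarize_fpkm(open("genes.fpkm_tracking").read())`. A result with `nonzero` equal to 0 reproduces the symptom from the first post on a single sample.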
