  • Very low % of reads showing primary alignment to transcriptome

    Dear users,

    I have recently started analyzing RNA-Seq data for gene expression analysis, so I am quite new to the field.

    I aligned the RNA-Seq reads with STAR (hg38, Ensembl release 94), running it with --quantMode TranscriptomeSAM.

    When I checked the quality of the two BAM files (the genomic BAM and Aligned.toTranscriptome.bam) with BamQC, I got widely different basic statistics, such as the percentage of primary alignments.

    In the genomic BAM, 96.95% of reads are primary alignments, whereas the transcriptome BAM has only 29.4% primary alignments. Does this low percentage mean the data quality is too poor for analyses such as differential isoform and allele expression?

    Thanks for your inputs.
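
A note on how such a percentage can be derived: "primary" is encoded in the SAM FLAG field, so a primary-alignment rate is usually the share of records with neither the secondary (0x100) nor the supplementary (0x800) bit set. The sketch below is a minimal illustration under that assumption; BamQC's exact definition may differ, and the function names and toy records are hypothetical. It is useful because a BAM with several records per read (for example one per compatible transcript) would show a low primary percentage without the reads themselves being bad.

```python
from collections import Counter

# Standard SAM FLAG bits, as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_flag(flag: int) -> str:
    """Return the alignment category encoded in a SAM FLAG value."""
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SECONDARY:
        return "secondary"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    return "primary"


def flag_summary(sam_lines):
    """Count records per category, skipping blank and '@' header lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        counts[classify_flag(flag)] += 1
    return counts


def percent_primary(counts) -> float:
    """Primary records as a percentage of all records counted."""
    total = sum(counts.values())
    return 100.0 * counts["primary"] / total if total else 0.0


if __name__ == "__main__":
    # Toy records (made up): read r1 is reported against three transcripts,
    # so only one of its three records is flagged as primary.
    tail = "\t1\t255\t50M\t*\t0\t0\t*\t*"
    toy = [
        "@HD\tVN:1.6",
        "r1\t0\tT1" + tail,
        "r1\t256\tT2" + tail,
        "r1\t256\tT3" + tail,
        "r2\t16\tT1" + tail,
    ]
    counts = flag_summary(toy)
    print(dict(counts), round(percent_primary(counts), 1))
```

To try this on real data, the same functions can be fed the text output of `samtools view -h` for each of the two BAM files and the category counts compared side by side.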

  • #2
    Have you checked where the remaining reads (96.95% - 29.4%) are aligning, since they are not aligning to transcripts? Does your data contain rRNA? Inspecting the resulting BAM in IGV would be a great place to start.
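
As a quick complement to looking in IGV, the primary alignments in the genomic BAM can be tallied per reference sequence. The sketch below assumes SAM text as input (for example from `samtools view`); the function name and toy records are hypothetical. A disproportionate share of reads on one chromosome or contig would be a hint worth following up, not proof of rRNA contamination.

```python
from collections import Counter


def reads_per_reference(sam_lines, primary_only=True):
    """Count mapped SAM records per reference name (third SAM column)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4:
            continue  # unmapped record
        if primary_only and flag & (0x100 | 0x800):
            continue  # secondary or supplementary record
        counts[rname] += 1
    return counts


if __name__ == "__main__":
    # Toy records (made up) standing in for real `samtools view` output.
    tail = "\t100\t255\t50M\t*\t0\t0\t*\t*"
    toy = [
        "r1\t0\tchr22" + tail,
        "r2\t16\tchr22" + tail,
        "r3\t0\tchrM" + tail,
        "r3\t256\tchr1" + tail,  # secondary, ignored by default
        "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",  # unmapped, ignored
    ]
    for name, n in reads_per_reference(toy).most_common():
        print(name, n)
```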

    • #3
      Thanks for the response!

      I did try to visualize the two BAM files in IGV.

      When visualizing the genomic BAM (Aligned.out.bam), I can see that many reads fall in the exonic regions of genes, with correspondingly higher coverage. However, that coverage is missing from the transcriptome BAM for the same region.

      I would expect coverage in the transcriptome BAM at least for genes whose annotation is present in the GTF file used for mapping. (Here the visualization covers part of MYH9, the myosin 9 gene, ENSG00000100345.)

      As an additional note, when I analyzed my genomic BAM file with Picard tools, I observed that 45% of the total input bases were classified as intronic, while 50% were classified as mRNA bases.

      Is that the reason for the low percentage of reads in the transcriptome BAM?
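
The reasoning behind this question can be made concrete: a read can only be expressed in transcript coordinates if every aligned block lies inside an annotated exon of that transcript, so reads in introns have no place in a transcriptome BAM. The sketch below is a simplified illustration of that containment test, not STAR's actual projection logic (which, as far as I understand, also requires splice junctions to match the annotation). The function name and all coordinates are made up, and intervals are treated as half-open (start, end).

```python
def blocks_within_exons(blocks, exons):
    """Return True if every aligned block lies inside some exon.

    blocks: (start, end) genomic intervals covered by one read
            (a spliced read has one block per exon it touches).
    exons:  (start, end) intervals of one annotated transcript.
    """
    return all(
        any(ex_start <= b_start and b_end <= ex_end for ex_start, ex_end in exons)
        for b_start, b_end in blocks
    )


if __name__ == "__main__":
    # Hypothetical two-exon transcript with an intron from 200 to 300.
    exons = [(100, 200), (300, 400)]
    reads = {
        "exonic": [(120, 170)],
        "spliced": [(180, 200), (300, 330)],
        "intronic": [(220, 270)],
        "exon_intron_overlap": [(190, 240)],
    }
    for name, blocks in reads.items():
        print(name, blocks_within_exons(blocks, exons))
```

Under this simplified model, the intronic and exon-intron-overlapping reads appear in the genomic BAM but not in the transcriptome BAM, which would be consistent with a large intronic fraction lowering the number of reads that reach the transcriptome BAM.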
