  • Miseq % Aligned Metric

    Hello,

    I have been running a MiSeq instrument for just under a year now, and received the upgrade in October. I run a variety of amplicon and whole genome samples prepared by various collaborators of mine. I always spike 25-40% PhiX into the amplicon samples, as they tend to be low diversity. I never gave the "% aligned" metric much thought before the upgrade because it always accurately reflected the amount of PhiX added (i.e. if I added 30% PhiX to my amplicon pool, the % aligned to the PhiX genome was always ~30%).

    Recently, the "% aligned" values on my amplicon runs have ranged from 1% to 98% despite the fact that I have been adding 25-40% PhiX. I have not been able to correlate anything sample-related to this phenomenon (DNA concentration, sample source, amplicon type, lab preparing the samples, cluster density, reads PF, etc.) and am now seeking to understand this "% aligned" metric a bit better. Most concerning to me is that I recently ran a 100% PhiX run on the instrument and only 63% of the reads aligned to the PhiX genome. What could cause this?

    I am new to this forum and attempted to search for answers to this issue as best I could, but if you know of an existing thread related to this matter, please let me know!

    Any insight is greatly appreciated!

  • #2
    We've also seen fluctuations in % aligned when compared to the PhiX spike-in.
    I've always put it down to two factors. One, differences in quantification between our library and the PhiX. Two, varying competition in cluster generation efficiency between the two libraries. We tend to see a higher % aligned when sequencing larger library fragments. The PhiX library has around a 200bp insert size, so it will out-compete libraries with larger inserts, sometimes quite drastically. Because of this we are now prepping our own diversity spike-ins with longer insert sizes.
    Your 63% metric is a little worrying though. Maybe it's something to do with the on-the-fly aligner Illumina use. Have you run the FASTQ through bowtie to check what alignment rate you get? The PhiX reference FASTA should be available through iGenomes. I did this for our install run data and had pretty good alignment (>95%, from what I remember).
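
For anyone wanting to run this check, here is a minimal sketch of how the alignment rate could be computed from an aligner's SAM output. It assumes the reads have already been aligned to the PhiX reference and written as SAM text; the function name `percent_aligned` is made up for illustration, and the result is not guaranteed to match how the instrument software calculates its own "% aligned" value.

```python
def percent_aligned(sam_lines):
    """Return the percentage of reads in SAM text that are mapped.

    Assumes standard SAM records: tab-separated, FLAG in the second column.
    Secondary (0x100) and supplementary (0x800) records are skipped so that
    each read is counted once.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        # Bit 0x4 set means the read is unmapped.
        if not flag & 0x4:
            mapped += 1
    return 100.0 * mapped / total if total else 0.0


if __name__ == "__main__":
    # Tiny made-up example: two mapped reads, one unmapped, one secondary.
    demo = [
        "@HD\tVN:1.0",
        "r1\t0\tphiX\t1\t42\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
        "r3\t16\tphiX\t10\t42\t4M\t*\t0\t0\tACGT\tIIII",
        "r3\t256\tphiX\t50\t0\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    print(f"{percent_aligned(demo):.1f}% aligned")
```

For a real run, an open file handle on the SAM file can be passed in place of the demo list, since the function only iterates over lines.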

    • #3
      Thanks Tony! I initially assumed that the fluctuations were due to our library quantification but I can't seem to find a correlation between high or low PhiX % aligned and high or low DNA concentration in our pools.
      The amplicon length competition is something I hadn't considered. I know that read length can affect the % aligned, but I didn't think about how shorter fragments will out-compete longer ones. This may actually make sense with our data since we are trying to sequence 400 - 500 bp amplicons. Most of the issues we've experienced are with the longer amplicons...
      I did not try to realign the PhiX run with a different aligner, which may help, although I still think it's odd, since the instrument used this same aligner for all the other PhiX runs that had 90% aligned.
      Thanks for the input. It's greatly appreciated.

      • #4
        For your 100% PhiX run, what was the data quality like? If the quality is poor (due to over-clustering, severe under-clustering, or various optical issues) then your % aligned will drop because the reads will contain too many errors to align to the reference. If the quality was within spec., then that result is quite interesting and you might want to give your FAS (Field Application Scientist) a call.
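
One rough way to look at data quality outside the instrument software is to compute the fraction of bases at or above Q30 directly from the FASTQ. The sketch below assumes plain four-line FASTQ records and Phred+33 quality encoding; the function name `fraction_q30` and its parameters are hypothetical, and the number may differ from the instrument's own Q30 report.

```python
def fraction_q30(fastq_lines, threshold=30, offset=33):
    """Return the fraction of bases with quality >= threshold.

    Assumes four-line FASTQ records (header, sequence, '+', qualities)
    and Phred+33 encoding (quality = ord(char) - 33).
    """
    total = 0
    passing = 0
    for i, line in enumerate(fastq_lines):
        # The quality string is the fourth line of every record.
        if i % 4 != 3:
            continue
        for ch in line.rstrip("\r\n"):
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    return passing / total if total else 0.0


if __name__ == "__main__":
    # Made-up records: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    demo = [
        "@read1", "ACGT", "+", "IIII",
        "@read2", "ACGT", "+", "##II",
    ]
    print(f"{fraction_q30(demo):.2f} of bases are >= Q30")
```

A value far below what earlier good runs produced would point toward the quality explanation rather than an aligner problem.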
