  • #16
    Originally posted by greigite
    I don't have this info. I got the plot from the person who did the library prep, and unfortunately they didn't change the Bioanalyzer settings to output in bp.
    Hmmm. Well, could they tell you what type of Bioanalyzer chip they ran?

    As it is, as far as we know the two plots more or less match one another...


    --
    Phillip



    • #17
      Did you ever determine what the source of your sequencing problem was? We've just sequenced a cDNA library on the Titanium and have nearly the exact same read distribution and dinucleotide repeat issues you describe. We are currently attempting to troubleshoot.

      Thanks
      Meredith



      • #18
        Hi all

        Maybe a related issue, and I'd appreciate any suggestions. We're trying to sequence an amplicon library of a three-base repeat region (essentially, deep sequencing of a microsatellite marker from a population), and are also getting short average read lengths. Smaller numbers of repeats (15-20 copies) aren't too bad, but larger ones (in some cases, 40-50 copies or more) won't get through to the end of the repeat region - quality just drops off too much to call.

        I was wondering if the problem might be polymerase slippage, either during the emPCR or during the sequencing itself. A colleague has also suggested inhibition of emPCR (caused by localised imbalances in dNTP concentrations, since only three of the four dNTPs are being used), which might give poor quality amplification on the beads. Does anyone have any comments on either of these theories (especially how to solve them), or other ideas why repeats are a problem?
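One way to put numbers on "won't get through to the end of the repeat region" is to count, per read, the copies in the longest uninterrupted run of the repeat and check whether flanking sequence is present on both sides. The following is a minimal sketch, not a validated genotyping method. The motif `CAG`, the function names, and the `min_flank` threshold are all hypothetical placeholders; the post does not say which three-base repeat is involved.

```python
import re


def _longest_run(read, motif):
    """Return the regex match for the longest uninterrupted tandem run, or None."""
    if not motif:
        raise ValueError("motif must be non-empty")
    pattern = re.compile("(?:" + re.escape(motif.upper()) + ")+")
    matches = list(pattern.finditer(read.upper()))
    if not matches:
        return None
    return max(matches, key=lambda m: m.end() - m.start())


def longest_repeat_copies(read, motif):
    """Copy number of the longest uninterrupted tandem run of `motif` in `read`."""
    best = _longest_run(read, motif)
    return 0 if best is None else (best.end() - best.start()) // len(motif)


def spans_repeat(read, motif, min_flank=10):
    """True if the longest run has at least `min_flank` bases on both sides.

    `min_flank` is an arbitrary illustrative threshold, not a recommended value.
    """
    best = _longest_run(read, motif)
    if best is None:
        return False
    return best.start() >= min_flank and len(read) - best.end() >= min_flank


# Hypothetical example reads: one spanning the repeat, one truncated inside it.
full_read = "ACGTACGTAC" + "CAG" * 17 + "TTGACCTGAA"
cut_read = "ACGTACGTAC" + "CAG" * 17
```

Tabulating `longest_repeat_copies` against `spans_repeat` over all reads would show whether the long alleles are absent altogether or are present but truncated inside the repeat. It would not by itself separate slippage from quality drop-off.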

        Cheers

        Mark



        • #19
          Originally posted by pmiguel View Post
          For cDNAs, issues with short read lengths generally stem from polyA tracts in the library molecules.
          I have the following from an email just sent to me:

          I came across your post about 2 years ago regarding a high fraction of short reads. In your post, you said that "For cDNAs, issues with short read lengths generally stem from polyA tracts in the library molecules." Could you please elaborate a little more on this? My libraries are cDNAs and I have got a lot of short reads from multiple runs. I would really appreciate your suggestions.
          My preference is to answer questions of this sort in the forum -- that way they may help others as well.

          First, I should point out that things have changed since I wrote that. Roche has released an official method for generating cDNA libraries to run on the GS-FLX that uses random primers for cDNA synthesis, so this issue has greatly diminished.

          The story is this: sequencers generally have an "Achilles heel" or "kryptonite", if you prefer -- some weakness to which they are particularly subject. The 454's weakness is homopolymers in general and poly T in the library molecule in particular.

          The 454 is built to be fast and achieves this speed by not using reversible terminator chemistry to precisely control addition of each base. This gives you speed -- no need to deblock after scanning, plus if you have 2 or 3 bases in a row you collect signal from all of them at once.

          But in this strength lies a weakness: the lengths of longer homopolymers are difficult to tell apart. (Is that 9 A's in a row or 8?) Further, in extreme cases, a long stretch of a single base will exhaust all the dNTP being flowed without reaching the end of that stretch of bases on every nascent strand on the bead. This is bad both because the signal produced will be so high it can bleed into adjacent wells and because the next time that base cycles around you get bad "CAFIE" (carry forward and incomplete extension) effects from all the strands that were not fully extended.
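To make the flow-based readout concrete, here is a minimal sketch of an idealized flowgram: one nucleotide is flowed at a time, and the signal for that flow is taken to be the number of bases incorporated. It assumes a repeating `TACG` flow order, perfect incorporation, and no noise, dNTP exhaustion, or CAFIE, so it shows only why a homopolymer collapses into a single large value. The function name is hypothetical.

```python
def ideal_flowgram(read, flow_order="TACG"):
    """Return the number of bases incorporated at each flow for an ideal read.

    `read` is the sequence of the strand being synthesized. The flow order
    is an assumption and can be overridden.
    """
    read = read.upper()
    unknown = set(read) - set(flow_order)
    if unknown:
        # Without this guard an unexpected symbol (e.g. N) would never be consumed.
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")

    signals = []
    pos = 0
    flow = 0
    while pos < len(read):
        base = flow_order[flow % len(flow_order)]
        run = 0
        # Every consecutive copy of the flowed base is incorporated in this one flow.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append(run)
        flow += 1
    return signals


mixed = ideal_flowgram("TTACG")           # two T's, then single A, C, G
homopolymer = ideal_flowgram("A" * 9 + "C")  # nine A's read out in one flow
```

In this idealized model a run of nine A's appears as a single flow value of 9. On a real instrument that value has to be estimated from light intensity, which is where the "9 or 8?" ambiguity comes from.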

          On top of that, because the 454 relies on a chemical cascade to produce the ATP that luciferase uses to generate signal, natural dATP cannot be used. The analog used in its place is not incorporated as well as dATP would be. Nevertheless, the conditions work well enough except in extreme cases.

          Alas, one of those extremes is occasioned by the most common homopolymer in eukaryotic molecular biology: a poly A tail. cDNA production protocols frequently prime first strand synthesis from a dT oligomer. So after ligating this cDNA library to adapters, about half of the molecules may have that stretch of homopolymeric dT right next to the sequencing primer.

          These factors combine to make a perfect storm that will sink a run. All your beads key pass -- then the next X bases incorporated are all "A". So you get the blinding burst of A, plus lots of incompletely extended strands of various lengths leading to non-synchronized signal in later cycles.

          There are plenty of ways to work around this issue during library construction. But looking over the wreckage of your run, it is actually difficult to tell what has happened. No warning is generated by the Roche image processing pipeline that says "Your library sucks!" You just get poor results, which is not very diagnostic because many factors can lead to poor results. If I suspect this is the issue, I have to fire up the RunBrowser, page through the early cycles, and check whether a burst of homopolymer is to blame.
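A rough scripted analogue of that check, working on base-called reads rather than in the RunBrowser, is to measure the homopolymer run that immediately follows the key and see what fraction of reads start with a long one. This is a minimal sketch under stated assumptions: the key `TCAG`, the `min_run` threshold, and both function names are illustrative choices, and heavily affected reads may be filtered out before base calling and never appear in the output at all.

```python
def leading_run(read, key="TCAG"):
    """Return (base, length) of the homopolymer run right after the key.

    The key is stripped only if the read starts with it; `key` is an assumed
    value and should be set to whatever key the library actually uses.
    """
    seq = read.upper()
    if key and seq.startswith(key):
        seq = seq[len(key):]
    if not seq:
        return ("", 0)
    first = seq[0]
    length = 1
    while length < len(seq) and seq[length] == first:
        length += 1
    return (first, length)


def fraction_with_leading_run(reads, min_run=8, key="TCAG"):
    """Fraction of reads whose first post-key run is at least `min_run` long."""
    reads = list(reads)
    if not reads:
        return 0.0
    flagged = sum(1 for r in reads if leading_run(r, key)[1] >= min_run)
    return flagged / len(reads)


# Hypothetical reads: one opens with ten A's after the key, one does not.
example_reads = ["TCAG" + "A" * 10 + "CGT", "TCAGACGTACGT"]
example_fraction = fraction_with_leading_run(example_reads, min_run=8)
```

A high fraction of reads opening with a long A run would point toward oligo dT primed library molecules, but a low fraction does not rule the problem out, for the filtering reason given above.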

          Anyway, Roche did finally release a sanctioned cDNA library construction protocol. We don't really run into these issues using it because first strand synthesis is primed by random oligos, instead of oligo dT. So -- this is likely not your problem unless someone used a naive oligo dT primed 1st strand synthesis method to construct your library. Not that likely these days.

          --
          Phillip

