  • Weird results at the 51st nucleotide

    Hi all,

    We just got some bisulfite sequencing data from a set of patients. After running FastQC on a few samples, we observed extremely low quality scores at position #51 in two of the samples.

    Attached are the distribution of base quality scores, % of Ns, and the nucleotide composition at each position.

    My question is, what is the possible reason for this? Should we trim all the nucleotides after 50?

    Thanks in advance!

  • #2
    There was probably a bubble that floated through the flowcell. That's a pretty common cause of random dips in quality. I wouldn't bother trimming that off, since the dip in quality only affects a few bases. Any of the common aligners should still be able to accurately align the reads. Just choose one that allows you to set a minimum Phred score during methylation extraction (i.e., one that will ignore methylation calls at those crappy bases).
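
To make the minimum Phred score idea concrete, here is a minimal sketch of what that filtering amounts to. It assumes Phred+33 encoded qualities, and the `calls` list of `(read_position, is_methylated)` tuples is a hypothetical stand-in, not the output format of any particular aligner or extractor.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def filter_calls(calls, quality_string, min_phred=20):
    """Keep only methylation calls whose base quality is at least min_phred.

    `calls` is a hypothetical list of (read_position, is_methylated) tuples,
    with read_position 0-based; it is not any real tool's format.
    """
    quals = phred_scores(quality_string)
    return [(pos, meth) for pos, meth in calls if quals[pos] >= min_phred]


# Toy read: 'I' decodes to Q40, '#' to Q2 (the kind of base a bubble leaves behind).
quality = "IIII#III"
calls = [(1, True), (4, True), (6, False)]
kept = filter_calls(calls, quality)
print(kept)  # the call at position 4 is dropped, the other two are kept
```

The read itself is still aligned in full; only the call sitting on the low-quality base is ignored, which is why trimming is not needed for a dip of one or two cycles.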



    • #3
      Thank you Devon for the explanation and suggestion.



      • #4
        Were these two samples on the same lane? Were there other samples besides these two in the lane that are not showing this problem?

        You should at least ask your sequencing provider to re-run the sample(s) at no cost (unless there were other samples in that lane that do not have this problem).



        • #5
          We got the data from another group. No barcodes were used, so I assume each sample was sequenced on its own lane (right?).

          We have checked about 10 samples. The two samples mentioned above have the problem at exactly the same position (#51), and both are in the control group.

          There are also other samples with the problem at a different position. Three of the case samples we checked have low-quality base calls at position #63 (see attached).

          It seems that the position of the abnormal base calls differs from run to run.



          • #6
            You are correct that your samples must have been run in separate lanes if there are no barcodes. It is possible that one or more lanes had a bubble flow through that affected the basecalls. You can look at the per-tile quality plot from FastQC to see if you can identify bad tiles across cycles.
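
If you want the same per-tile view outside FastQC, a rough sketch follows. It assumes Illumina CASAVA 1.8+ style read names (`@instrument:run:flowcell:lane:tile:x:y ...`), four-line FASTQ records, and Phred+33 qualities; older header formats put the tile in a different field, so check your read names first. The toy records are made up for illustration.

```python
from collections import defaultdict
import io


def mean_quality_by_tile(fastq_handle, cycle, offset=33):
    """Return {tile: mean Phred at `cycle`} for a FASTQ stream.

    `cycle` is 1-based, matching the position axis in FastQC plots.
    """
    totals = defaultdict(lambda: [0, 0])  # tile -> [sum of Phred, read count]
    while True:
        header = fastq_handle.readline().rstrip()
        if not header:
            break                          # end of file
        fastq_handle.readline()            # sequence line (not needed here)
        fastq_handle.readline()            # '+' separator line
        qual = fastq_handle.readline().rstrip()
        if len(qual) < cycle:
            continue                       # read too short to cover this cycle
        # Assumed layout: field index 4 of the colon-separated read name is the tile.
        tile = header.split()[0].split(":")[4]
        totals[tile][0] += ord(qual[cycle - 1]) - offset
        totals[tile][1] += 1
    return {tile: s / n for tile, (s, n) in totals.items()}


# Made-up records: tile 1101 has a Q2 base at cycle 3, tile 1102 does not.
toy_fastq = io.StringIO(
    "@SIM:1:FCX:1:1101:1000:2000 1:N:0:1\nACGT\n+\nII#I\n"
    "@SIM:1:FCX:1:1102:1000:2000 1:N:0:1\nACGT\n+\nIIII\n"
)
by_tile = mean_quality_by_tile(toy_fastq, cycle=3)
print(by_tile)
```

A bubble typically shows up as a handful of tiles with a much lower mean at the affected cycle, while the remaining tiles look normal.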

            In general, Q-scores take a nose dive when the nucleotide diversity no longer exists (or is significantly reduced). It appears that the majority of the data in the example above consists of N's beyond cycle #63, so those cycles would likely need to be trimmed before analysis.
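
As a hedged illustration of deciding where to cut, the sketch below computes the fraction of N's at each cycle and trims only the trailing run of N-heavy cycles. The 0.5 cutoff and the function names are assumptions for this example; in practice a dedicated trimming tool would do the actual cutting.

```python
def n_fraction_per_cycle(reads):
    """Fraction of reads with an N at each cycle (index 0 = cycle 1)."""
    length = max(len(r) for r in reads)
    n_counts = [0] * length
    covered = [0] * length                 # reads long enough to reach each cycle
    for read in reads:
        for i, base in enumerate(read):
            covered[i] += 1
            if base == "N":
                n_counts[i] += 1
    return [c / n for c, n in zip(n_counts, covered)]


def keep_length(n_fractions, max_n_fraction=0.5):
    """Number of leading cycles to keep.

    Only the trailing run of N-heavy cycles is trimmed; an isolated bad
    cycle in the middle of the read is left alone.
    """
    keep = len(n_fractions)
    while keep > 0 and n_fractions[keep - 1] > max_n_fraction:
        keep -= 1
    return keep


# Toy reads: cycle 4 is an isolated all-N cycle, cycles 7-8 are mostly N.
reads = ["ACGNACNN", "ACGNTGNN", "TTGNACGN"]
fractions = n_fraction_per_cycle(reads)
keep = keep_length(fractions)
trimmed = [r[:keep] for r in reads]
print(keep, trimmed)
```

With this rule the isolated dip stays in the read, where a quality filter can handle it downstream, and only the uninformative tail is removed.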



            • #7
              Thank you GenoMax. I will trim the low quality bases as you suggested.
