  • fastqc - overrepresented sequences

    I ran FastQC and found hundreds of overrepresented sequences in my datasets. Some of these are Illumina PCR primers, and some are single-end adapters.

    Does this mean my libraries are contaminated? What can I do about it? Should I simply remove the primer and adapter sequences, or is something else needed?

    Thanks in advance.

  • #2
    If your data is microRNA, or anything else where the insert is shorter than the read length, then you might end up reading through into adapter or primer sequence. This would probably show up as over-represented k-mers at the 3' end of the reads. In this case you might want to trim the adapters before alignment.
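
    To illustrate the trimming step, here is a minimal sketch in Python. It assumes exact matching only (no mismatches or indels), and the function name and `min_overlap` parameter are made up for this example. The adapter shown is only an example value; substitute whichever sequence FastQC reported for your library. Dedicated trimming tools handle sequencing errors and quality strings as well, so treat this as a picture of the idea rather than a replacement for them.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Return the read with a 3' adapter removed (exact matching only).

    Hypothetical helper for illustration; it does not tolerate mismatches.
    """
    # Case 1: the full adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: the read ends partway through the adapter, so only a
    # prefix of the adapter is present at the 3' end of the read.
    max_len = min(len(adapter) - 1, len(read))
    for k in range(max_len, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]

    # No adapter found: return the read unchanged.
    return read


# Example values only; use the adapter reported for your own library.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
INSERT = "TGAGGTAGTAGGTTGTATAGTT"  # a 22 nt insert, shorter than the read

trimmed = trim_3prime_adapter(INSERT + ADAPTER[:14], ADAPTER)
```

    In a real run you would trim the quality string to the same length as the sequence, and probably discard reads that end up very short after trimming.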


    • #3
      If they're in the overrepresented sequences then it probably means that your library was contaminated with primer dimers. If you read through into adapter you'd get slightly different sequences so they would appear in the Kmer plot rather than the overrepresented sequences.

      There's not much you can do about the dimers for the data you've already collected. They're easy enough to filter out if you want to remove them, but they probably won't map to whichever genome you're using anyway.
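
      If you do want to filter them out, the idea can be sketched roughly as below. This assumes plain 4-line FASTQ records and exact substring matching against the sequences FastQC listed; the function names and the contaminant sequence are illustrative only, not taken from any particular tool.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines.

    Assumes strict 4-line records (no wrapped sequence lines).
    """
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if len(record) < 4:
            return
        header, seq, _plus, qual = record
        yield header, seq, qual


def drop_contaminated(records, contaminants):
    """Split records into (kept, dropped) using exact substring matches."""
    kept, dropped = [], []
    for rec in records:
        seq = rec[1]
        if any(c in seq for c in contaminants):
            dropped.append(rec)
        else:
            kept.append(rec)
    return kept, dropped


# Toy input; the contaminant below is a placeholder for whatever
# sequences appear in your own FastQC overrepresented-sequences table.
example_lines = [
    "@r1\n", "ACGTACGTACGTAC\n", "+\n", "IIIIIIIIIIIIII\n",
    "@r2\n", "TTGATCGGAAGAGCTT\n", "+\n", "IIIIIIIIIIIIIIII\n",
]
kept, dropped = drop_contaminated(read_fastq(example_lines), ["GATCGGAAGAGC"])
```

      Counting how many reads land in `dropped` also gives a quick estimate of what fraction of the run was lost to dimers.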

      You can normally spot this kind of contamination by doing a Bioanalyzer run on your library before sequencing. Others here are better qualified than me to advise on how you might avoid them in the first place.


      • #4
        Another source of over-represented sequences is reads from rRNA genes. Even after rRNA removal, you often still see a high level of rRNA reads.

        Douglas
