
  • GC bias at the 5’ end of transcriptome reads

    Hi all,
    We have just received data from our latest bacterial transcriptome analysis and got a very odd result. The first 12 bases show a high GC content that cannot be random (see attached file). Has this ever happened to you, and do you have any idea why it happens? It is not caused by contamination or an adapter problem, as all sequences (including the first 12 bases of each sequence) can be mapped to our genome.
    Any ideas are welcome.
    Thanks,
    Yaara

  • #2
    These two papers should answer your question:

    Hansen KD, Brenner SE, Dudoit S.
    Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Research. 2010:gkq224.


    Levin JZ, Yassour M, Adiconis X, et al.
    Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nature Methods. 2010;7(9):709-715



    • #3
      In addition to the hexamer bias (which everyone gets), you've also got something else contaminating that library (possibly adapter dimers?). The content plot should flatten out after about 12 bases but yours has ongoing fluctuations. Hopefully the overrepresented sequences result will pinpoint what this is.
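To check the "flattens out after about 12 bases" point numerically, a per-position GC fraction can be computed directly from the reads. The following is only a minimal sketch: the function names `fastq_sequences` and `per_position_gc` are made up for illustration, and it assumes plain 4-line FASTQ records (no wrapped sequence lines).

```python
def fastq_sequences(lines):
    """Yield the sequence line of each record, assuming 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def per_position_gc(reads):
    """Return the fraction of G/C bases at each read position.

    Only A, C, G and T are counted, so N calls do not dilute the fraction.
    Positions with no counted bases are reported as NaN.
    """
    reads = list(reads)
    max_len = max((len(r) for r in reads), default=0)
    gc = [0] * max_len
    total = [0] * max_len
    for read in reads:
        for i, base in enumerate(read.upper()):
            if base in "ACGT":
                total[i] += 1
                if base in "GC":
                    gc[i] += 1
    return [g / t if t else float("nan") for g, t in zip(gc, total)]
```

With real data one would pass `fastq_sequences(open(path))` for a hypothetical file `path` and compare the first ~12 positions with the rest of the read. Random-priming bias alone should leave the later positions roughly flat.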



      • #4
        Two more things…

        Thank you very much for your replies. There are two things I am still not sure I understand:
        1. I understand that the first bases are always problematic, but isn’t the bias I got more severe than what one usually gets? Can I use the data from this run (after correction, of course), or would you consider it a dead end?
        2. As for the second comment (simonandrews), I am not sure how to check for adapter dimer contamination. 80% of my reads mapped to the genome, so I assumed that I do not have a problem there. Could you please elaborate?

        Thank you all! It helps a lot!



        • #5
          There might be one specific sequence that is repeated over and over in your remaining 20%, and this could be what causes the wiggles in the middle and right-hand part of the reads in your plot. Some read quality assessment tools give you a list of the most frequently repeated sequences among your reads. Try this and see if you recognize your adapters in those sequences.

          Regarding tools: Martin Morgan's ShortRead Bioconductor package gives you lists of the most common reads, and Simon Andrews's FastQC lists the most common k-mers.
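If neither tool is at hand, the same idea can be sketched in a few lines of Python. This is not how ShortRead or FastQC work internally; it just counts exact duplicate reads and does a naive substring check. The function names, the `min_overlap` parameter and the adapter string in the usage note are placeholders, so substitute the adapter sequence from your own library prep kit.

```python
from collections import Counter


def most_common_reads(reads, n=10):
    """Return (sequence, count, fraction of all reads) for the n most frequent reads."""
    counts = Counter(r.upper() for r in reads)
    total = sum(counts.values())
    return [(seq, c, c / total) for seq, c in counts.most_common(n)]


def contains_adapter(seq, adapter, min_overlap=12):
    """Naive check: does the first min_overlap bases of the adapter occur in seq?

    Exact matching only, so sequencing errors inside the adapter are not tolerated.
    """
    return adapter[:min_overlap].upper() in seq.upper()
```

For example, `[s for s, c, f in most_common_reads(reads) if contains_adapter(s, "ACGTACGTACGTACGT")]` would list the top reads that contain the (placeholder) adapter. A single sequence making up a large fraction of the unmapped reads would be consistent with the adapter dimer explanation.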
