  • AlexCalderwood
    Junior Member
    • Mar 2018
    • 4

    low GC% peak in one end of paired end reads

    Hi,
    I have paired-end RNA-seq data prepared from Brassica napus using a TruSeq kit. After adapter trimming, FastQC shows a second, low-GC% peak in the per-sequence GC content of the _1.fq files. The _2 files all look OK.

    The low-GC% reads don't align to our reference transcriptome, but after BLASTing a small proportion of the unaligned reads, they don't appear to be contamination from another organism (hits are mostly to predicted Brassica genes).

    The average GC content is consistent across the length of the reads.

    Does anyone know what might be causing this, particularly in only one of each set of read pairs?

    thanks,
    Alex
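
For anyone wanting to reproduce the comparison outside FastQC, here is a rough sketch that bins per-read GC% for a FASTQ file, so the _1 and _2 distributions can be compared side by side. It is not FastQC's own algorithm; the function names and the 5% bin width are assumptions made for illustration, and it expects plain 4-line FASTQ records.

```python
from collections import Counter


def read_fastq_seqs(lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def gc_percent(seq):
    """Percent G+C among unambiguous bases; None if the read has no A/C/G/T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else None


def gc_histogram(lines, bin_width=5):
    """Count reads per GC% bin; keys are the lower edge of each bin."""
    hist = Counter()
    for seq in read_fastq_seqs(lines):
        pct = gc_percent(seq)
        if pct is not None:
            # Reads at exactly 100% GC are folded into the top bin.
            lower_edge = min(int(pct // bin_width) * bin_width, 100 - bin_width)
            hist[lower_edge] += 1
    return hist
```

Any iterable of lines works as input, so an open file handle (or `gzip.open(path, "rt")` for compressed files) can be passed directly. Running it on one _1 file and its matching _2 file and comparing the two histograms should show whether the extra low-GC peak is specific to one read of the pair.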
  • pmiguel
    Senior Member
    • Aug 2008
    • 2328

    #2
    RNAseq?
    Tell us more about the libraries.
    Are you saying the forward reads show this bimodal GC distribution but the reverse reads do not? Or do "_1" and "_2" mean something else?
    --
    Phillip

    • AlexCalderwood
      Junior Member
      • Mar 2018
      • 4

      #3
      Hi Phillip, thanks for your attention - what would you like to know about the libraries?

      Yes, exactly: the forward ("_1" file) reads are the red and orange lines in the thumbnail, and the reverse reads are the green ones. Some of the reverse-read samples have a slight shoulder in the low-GC region, but it is much smaller than in the _1 files.

      • pmiguel
        Senior Member
        • Aug 2008
        • 2328

        #4
        How were the libraries constructed? What average insert size did they have? Were the libraries stranded?

        --
        Phillip

        • AlexCalderwood
          Junior Member
          • Mar 2018
          • 4

          #5
          They were made using the "NEBNext Ultra Directional" library kit, which uses the dUTP method to retain strandedness and should give an insert size of ~200 bp.

          • pmiguel
            Senior Member
            • Aug 2008
            • 2328

            #6
            Originally posted by AlexCalderwood:
            They were made using "NEB next ultra directional library kit", which uses dUTP method to retain strandedness, and should give an insert size of ~200bp
            Okay, then my hypothesis is that the reverse read is always reading 5' in the cDNA of the forward read, so that elevated AT% is just poly(A) tail. Or, since you mention hits to "predicted genes", the elevated AT% may just be 3' or 5' untranslated region. (I'm not sure which orientation the NEB kits retain, nor whether a 5' or 3' bias is likely in your sequence.)
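
One quick way to probe the poly(A) idea is to ask what fraction of the low-GC reads contain a long A or T homopolymer. The sketch below is only an illustration: the function names are made up, the 15-base threshold is an arbitrary assumption, and both A and T runs are checked because the tail shows up as poly-T when read from the opposite strand.

```python
import re


def longest_run(seq, bases="AT"):
    """Length of the longest homopolymer run of any base listed in `bases`."""
    seq = seq.upper()
    best = 0
    for base in bases:
        for match in re.finditer(f"{base}+", seq):
            best = max(best, len(match.group()))
    return best


def fraction_with_poly_at(seqs, min_run=15):
    """Fraction of reads with an A or T run of at least `min_run` bases."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if longest_run(s) >= min_run)
    return hits / len(seqs)
```

If only a small fraction of the low-GC reads carry such runs, poly(A) tail alone is unlikely to explain the second peak, and the AT-rich UTR explanation becomes the more plausible of the two.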

            The untranslated regions of plant genes are often replete with transposable elements, which can themselves have lower GC content or, with time after insertion, become reduced in GC due to cytosine methylation. That is, deamination of unmethylated C gives U, which is easily repaired because U doesn't belong in DNA. However, 5-me-C deaminates to T, a normal DNA base, so the change is more likely to persist. So, over evolutionary time, simply methylating transposable elements has a sort of slow-motion "RIPping" effect.
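
A common rough indicator of this kind of methylation-driven decay is the observed/expected CpG ratio, (CpG count × length) / (C count × G count). The sketch below computes it for a single sequence; the function name is hypothetical, and since plant methylation also occurs in CHG and CHH contexts, CpG depletion is only a partial proxy for the effect described here.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count).

    Returns None when the sequence has no C or no G, since the ratio is undefined.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return None
    cpg_count = seq.count("CG")
    return cpg_count * len(seq) / (c_count * g_count)
```

Ratios well below 1 suggest that CpG sites have been lost over time. Short reads give noisy values, so this is probably more informative on the transcripts or genomic regions the reads hit than on individual reads.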

            Just speculation on my part, of course.

            --
            Phillip

            • nucacidhunter
              Jafar Jabbari
              • Jan 2013
              • 1250

              #7
              Could you also post the "Per base sequence content" plot from the FastQC output?

              • AlexCalderwood
                Junior Member
                • Mar 2018
                • 4

                #8
                Please see attached for the "Per base sequence content" plot for one of the reverse-read problem files, post trimming. (Sorry, in a previous post I mixed up forward and reverse reads: _1 is the reverse read, relative to the mRNA.)

                I think the gradient of the GC lines is consistent with Phillip's idea of the AT rich 3'UTR being a factor.
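
To put numbers on the gradient, GC% can be tabulated per read position. This is a minimal sketch rather than a reimplementation of FastQC's "Per base sequence content" module; the function name is an assumption, and it takes any iterable of read sequences.

```python
def per_position_gc(seqs):
    """Return GC% at each 0-based read position, ignoring non-ACGT bases.

    A position with no unambiguous bases is reported as None.
    """
    gc_counts = []
    totals = []
    for seq in seqs:
        for i, base in enumerate(seq.upper()):
            if i >= len(totals):
                # Grow the tallies as longer reads are encountered.
                gc_counts.append(0)
                totals.append(0)
            if base in "ACGT":
                totals[i] += 1
                if base in "GC":
                    gc_counts[i] += 1
    return [100.0 * g / t if t else None for g, t in zip(gc_counts, totals)]
```

A steady drift in GC% from one end of the read to the other would fit reads running into an AT-rich 3' UTR, whereas a flat profile would point elsewhere.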
