Old 11-14-2018, 01:09 AM   #3
Jafar Jabbari
Location: Melbourne

Join Date: Jan 2013
Posts: 1,234

The plots you are referring to should show a normal distribution centred on the genome's GC content when the library is random. An IP library is not expected to be random, so some bias is expected. Extreme GC at the start of reads could also come from the library construction method, where some non-template bases are added.

Stretches of G in NextSeq data indicate that there was no signal in those cycles, which could be due to short inserts and adapter dimers.
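If you want a rough idea of how many reads are affected, something like the sketch below counts reads that end in a long run of G. The function names and the `min_run=10` cutoff are my own placeholders, not part of any tool, and the reads are assumed to be plain sequence strings already pulled out of the FASTQ.

```python
def trailing_g_run(seq: str) -> int:
    """Length of the uninterrupted run of G bases at the 3' end of a read."""
    seq = seq.upper()
    return len(seq) - len(seq.rstrip("G"))


def fraction_poly_g(reads, min_run: int = 10) -> float:
    """Fraction of reads whose 3' end is a G run of at least min_run bases.

    min_run is an arbitrary illustrative threshold, not a recommended value.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    flagged = sum(1 for read in reads if trailing_g_run(read) >= min_run)
    return flagged / len(reads)
```

A high fraction would be consistent with short inserts or adapter dimers, but it is only a quick check and does not replace looking at the insert size distribution.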

From the FastQC manual:

“This module measures the GC content across the whole length of each sequence in a file and compares it to a modelled normal distribution of GC content.”

“An unusually shaped distribution could indicate a contaminated library or some other kinds of biased subset. A normal distribution which is shifted indicates some systematic bias which is independent of base position. If there is a systematic bias which creates a shifted normal distribution then this won't be flagged as an error by the module since it doesn't know what your genome's GC content should be.”
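To make the idea in those quotes concrete, here is a minimal sketch of a per-read GC histogram with a normal curve fitted to it. This is not FastQC's code: the function names are made up for illustration, and fitting the curve from the observed mean and standard deviation is a simplification I am assuming here, so the numbers will not match FastQC's output exactly.

```python
import math
from collections import Counter


def gc_percent(seq: str) -> float:
    """GC content of one read, as a percentage of its full length."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_histogram(reads) -> Counter:
    """Number of reads in each integer GC-percent bin."""
    return Counter(round(gc_percent(read)) for read in reads)


def fitted_normal(hist) -> dict:
    """Expected read count per GC bin (0..100) under a normal curve
    whose mean and standard deviation are taken from the observed data."""
    total = sum(hist.values())
    mean = sum(gc * n for gc, n in hist.items()) / total
    var = sum(n * (gc - mean) ** 2 for gc, n in hist.items()) / total
    sd = math.sqrt(var)
    if sd == 0:
        # Every read has the same GC content; all expected mass sits in one bin.
        return {gc: (float(total) if gc == round(mean) else 0.0) for gc in range(101)}
    scale = total / (sd * math.sqrt(2 * math.pi))
    return {gc: scale * math.exp(-0.5 * ((gc - mean) / sd) ** 2) for gc in range(101)}
```

Comparing `gc_histogram(reads)` with `fitted_normal(...)` bin by bin shows where the observed distribution departs from the modelled one. A curve that is simply shifted still fits well, which is why a shift alone does not get flagged.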