Hello everybody,
I am working on my first ever ChIP-seq experiment (transcription factor binding, sequenced on a HiSeq with 51 bp single-end reads), and at the moment I am looking at my libraries using FastQC.
Among several failed modules whose causes I could track down, the following two results seem odd to me (pictures are attached):
Under "Sequence content across all bases", my data seem quite AT-rich.
Then, under "Sequence Duplication Levels", the reported duplication level is above 95%.
As suggested in various posts in this forum, I read up on this following this link:
Is it likely that library prep or sequencing introduced a bias that now shows up as heavy duplication of AT-rich reads?
And if that could be the case, how could I confirm this?
I should add that my bioinformatics skills are very limited, so at the moment I rely solely on the functions in FastQC and on other tools that have a GUI.
Thanks a lot for your input!
Tobias