#1 |
Junior Member
Location: St. Louis, MO Join Date: Nov 2020
Posts: 3
|
Hello,
I'm fairly new to analyzing sequencing data like this. I've noticed this pattern appear several times, both when I look at full runs and when I look at individual samples within a single run.

Sequencer: Illumina NextSeq
Kit: 71 x 10 x 10

I've processed untrimmed reads (though I've also tried trimming, and the same overall pattern appears). At ~38 bp, I see a bias toward "A" over "C", "G", or "T". This bias continues, more or less, through the remainder of the read.

I'm not quite sure what to make of this pattern or what might be causing it. Whenever I run FastQC on my data, this metric shows up as a "WARN" or "FAIL". The material we are attempting to use is DNA (from lysate, plant material); I'm not sure whether that has an impact.

Thanks in advance for any advice you all might have!
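For anyone who wants to check the same metric outside FastQC, here is a minimal sketch of how a per-base sequence content table can be computed from raw reads. It is not FastQC's implementation. The function names `per_base_fractions` and `read_fastq_sequences` are made up for illustration, and the reader assumes an uncompressed FASTQ file with exactly four lines per record.

```python
from collections import Counter


def per_base_fractions(reads, bases="ACGTN"):
    """Return one dict per read position mapping each base to its fraction.

    Reads may differ in length; each position is normalized by the number
    of reads that are long enough to reach it.
    """
    counts = []  # counts[i] tallies the bases observed at position i
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1

    fractions = []
    for position_counts in counts:
        total = sum(position_counts.values())
        fractions.append({b: position_counts[b] / total for b in bases})
    return fractions


def read_fastq_sequences(path):
    """Yield sequence lines from an uncompressed, 4-line-per-record FASTQ.

    Assumption: no line wrapping inside records and no gzip compression.
    """
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record
                yield line.strip()
```

Calling `per_base_fractions(read_fastq_sequences("sample.fastq"))` (the file name is a placeholder) gives a list you can scan for the position where the "A" fraction starts to pull away from the other three bases.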
#2 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,091
|
It is difficult to say what is happening. It may be fine to use the data. If you have a reference then you could try aligning and see.
There was no known problem with this run, correct?
#3 |
Junior Member
Location: St. Louis, MO Join Date: Nov 2020
Posts: 3
|
No sequencing problems that I'm aware of. We've done ~5 sequencing runs, and the same pattern appears on all 5.

When I map these raw reads to an ampli-ome, only ~50% of the reads map, and the same fraction maps when I use the genome, so I don't think I'm dealing with off-target amplification. My guess right now is that a lot of dimers are soaking up reads on the Illumina flow cell. But the fact that Per Base Sequence Content consistently shows this pattern made me think I might be dealing with something else. I'm not sure, so I was curious whether anyone else had seen a pattern like this.
#4 |
Senior Member
Location: New England Join Date: Jun 2012
Posts: 200
|
I think you're right about adapter dimers. After you sequence through the adapter and hit the flow cell, you see overcalling of either A or G.
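One rough way to test the dimer idea is to look for the adapter inside the reads and see how early it starts. The sketch below does this with plain exact matching. The function names and the `max_insert` cutoff are hypothetical, and the adapter string in the usage note is only an assumption: it is the common Illumina TruSeq adapter prefix, and your library prep may use a different one.

```python
def adapter_start_positions(reads, adapter):
    """Return, for each read, the index where `adapter` first occurs.

    Reads without an exact match get None. Exact matching will miss
    adapters that contain sequencing errors, so treat the result as a
    lower bound.
    """
    positions = []
    for read in reads:
        index = read.find(adapter)
        positions.append(index if index >= 0 else None)
    return positions


def short_insert_fraction(reads, adapter, max_insert=40):
    """Fraction of reads whose adapter starts at or before `max_insert`.

    `max_insert` is an illustrative cutoff, not an established threshold.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(
        1
        for position in adapter_start_positions(reads, adapter)
        if position is not None and position <= max_insert
    )
    return hits / len(reads)
```

For example, `short_insert_fraction(reads, "AGATCGGAAGAGC")` reports how many reads reach the adapter within the first ~40 bases. If that fraction is close to the share of reads that fail to map, it would be consistent with dimers, though it does not prove it.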
#5 |
Junior Member
Location: St. Louis, MO Join Date: Nov 2020
Posts: 3
|
Ah, thank you!