SEQanswers

Old 09-24-2018, 02:25 AM   #1
JasperGeh
Junior Member
 
Location: Hanover, Germany

Join Date: Sep 2018
Posts: 6
Basequality Dropoff

Hello people,

First-time poster and also kind of a first-time sequencer.
I just started my PhD, which involves a lot of NGS on a MiSeq, and my supervisor and I ran 11 samples last week.

The initial metrics looked very nice (cluster density of 1100 and 92% of clusters passing QC), but the data after the run showed a quality dropoff in both read directions after about 100 cycles, with only 40% of reads above Q30.
Neither of us really knows what the issue could be. We think it was probably not a problem with the adapters or the clustering, since those looked fine at the beginning, so I decided to take on the troubleshooting!

You will find the screenshot attached. Any input is valued. Thanks in advance!
Attached Images
File Type: jpg image001.jpg (94.6 KB, 22 views)
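
Editor's sketch: the two symptoms described in this post (quality falling off by cycle, and a low share of calls above Q30) can be recomputed directly from FASTQ quality strings. The snippet below is a minimal illustration, assuming Phred+33 encoding (standard for current Illumina output). The function names and the toy quality strings are made up for the example, and the Q30 share here is computed per base call, which is not necessarily how the instrument software reports it.

```python
from typing import List


def phred33(qual: str) -> List[int]:
    """Decode one FASTQ quality string (Phred+33) into integer Q scores."""
    return [ord(ch) - 33 for ch in qual]


def per_cycle_mean_quality(quals: List[str]) -> List[float]:
    """Mean Q score at each cycle (read position) across all reads."""
    totals: List[int] = []
    counts: List[int] = []
    for q in quals:
        for i, score in enumerate(phred33(q)):
            if i == len(totals):  # first read that reaches this cycle
                totals.append(0)
                counts.append(0)
            totals[i] += score
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def fraction_q30(quals: List[str]) -> float:
    """Fraction of individual base calls with Q >= 30."""
    scores = [s for q in quals for s in phred33(q)]
    return sum(s >= 30 for s in scores) / len(scores) if scores else 0.0


# Toy data (not from the run discussed here): 'I' is Q40, '#' is Q2.
toy = ["IIII##", "IIII##", "III###"]
means = per_cycle_mean_quality(toy)
q30 = fraction_q30(toy)
```

In this toy set the mean quality is 40 for the first three cycles and falls to 2 by the last two, which is the kind of per-cycle profile that tools such as FastQC plot for real data.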
Old 09-24-2018, 04:20 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,815

1. Are these amplicons or sequences where there is low nucleotide diversity expected?
2. You possibly have short inserts. Once a read has run through the insert, you are sequencing into the adapter on the 3'-end. That leads to low nucleotide diversity and drop-offs in Q-scores.
3. How much phiX is spiked into this run?
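
Editor's sketch of point 2 above: when the insert is shorter than the read length, every cycle beyond the insert reads adapter sequence. The snippet below is a minimal illustration, not a replacement for a real adapter trimmer. The adapter prefix shown is the commonly cited start of Illumina TruSeq adapters; whether it applies depends on the library kit, so treat it as an assumption. The function names and the example lengths are hypothetical.

```python
# Assumption: TruSeq-style adapter. Other kits (e.g. Nextera) use a different sequence.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def adapter_bases_in_read(insert_len: int, read_len: int) -> int:
    """Number of cycles that read past the insert and into the adapter."""
    return max(0, read_len - insert_len)


def adapter_start(read: str, adapter: str = ADAPTER_PREFIX) -> int:
    """Index where the adapter begins in the read, or -1 if absent.

    Exact matching only; real trimmers also tolerate mismatches and
    partial adapters at the read end.
    """
    return read.find(adapter)


# Hypothetical numbers: a 100 bp insert sequenced with 300-cycle reads.
wasted = adapter_bases_in_read(insert_len=100, read_len=300)
pos = adapter_start("ACGT" + ADAPTER_PREFIX + "TT")
```

If many reads show the adapter starting at roughly the same position where the Q-scores drop, that supports the short-insert explanation.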
Old 09-24-2018, 05:36 AM   #3
JasperGeh
Junior Member
 
Location: Hanover, Germany

Join Date: Sep 2018
Posts: 6

Thanks for your ideas.

1. Do you mean in a population-genetics sense? And what is the mechanism linking nucleotide diversity to basecall quality? But I guess not: every sample is the complete human cytomegalovirus genome isolated from patients.

2. We ran the samples through an Agilent Bioanalyzer prior to sequencing. The insert length is not perfectly normally distributed and some samples have small peaks around 300 bp, but the bulk of the library was between 470 and 550 bp.

3. 1% phiX
Old 09-24-2018, 06:11 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,815

Low nucleotide diversity in this case means that, in a given cycle, the majority of clusters will have the same base, e.g. an "A". When that happens, the ability of the image analysis software to distinguish among clusters is hampered, which can then lead to poor Q scores.
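
Editor's sketch: per-cycle diversity in the sense described above can be checked by tabulating base composition at each read position. The snippet below is a minimal illustration with made-up reads. The function names are hypothetical, and no threshold for "too low" is implied.

```python
from collections import Counter
from typing import Dict, List


def per_cycle_base_fractions(reads: List[str]) -> List[Dict[str, float]]:
    """Fraction of A, C, G and T at each cycle across all reads.

    Reads shorter than a given cycle are skipped for that cycle. N calls
    count toward the total, so fractions may sum to less than 1.
    """
    n_cycles = max((len(r) for r in reads), default=0)
    out: List[Dict[str, float]] = []
    for i in range(n_cycles):
        counts = Counter(r[i] for r in reads if i < len(r))
        total = sum(counts.values())
        out.append({b: counts[b] / total for b in "ACGT"})
    return out


def dominant_fraction(fractions: Dict[str, float]) -> float:
    """Share of the most common base in one cycle (1.0 means no diversity)."""
    return max(fractions.values())


# Toy reads: four balanced cycles, then four cycles where every read is identical.
toy_reads = ["ACGTAGAT", "CGTAAGAT", "GTACAGAT", "TACGAGAT"]
profile = per_cycle_base_fractions(toy_reads)
dominant = [dominant_fraction(f) for f in profile]
```

In the toy reads the dominant-base share is 0.25 in the first four cycles and 1.0 in the last four, mimicking what happens when all clusters start reading the same adapter sequence.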

I assume this is a V3 reagent run (since you have 1100 k/mm^2 cluster density). While it is possible to push the limit of cluster density (with good/diverse libraries) the fall over the cliff (in terms of Q score drop) is precipitous.

Have you analyzed the data to see if your assumption in #2 above checks out? In general, smaller fragments will cluster efficiently and will outcompete larger ones every time. Since you have a reference genome available, you can use the method described by Brian in this post to actually find the real insert sizes in your data. It would be interesting to see what the results look like compared to your expectation.
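
Editor's sketch: this is not the BBMap method referred to above, only a generic illustration of reading observed insert sizes from paired-end alignments. It assumes the reads have already been aligned to the reference with some aligner and written as SAM, and it uses the template length field (TLEN, column 9) of properly paired reads. The function name and the toy records are made up.

```python
import statistics
from typing import Iterable, List


def insert_sizes_from_sam(lines: Iterable[str]) -> List[int]:
    """Collect template lengths (SAM column 9) from properly paired reads.

    Each pair is counted once by keeping only the mate with a positive TLEN.
    """
    sizes: List[int] = []
    for line in lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        if flag & 0x2 and tlen > 0:  # 0x2 = read mapped in a proper pair
            sizes.append(tlen)
    return sizes


# Toy SAM records (made up), tab-separated as in a real file.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t99\tref\t100\t60\t8M\t=\t192\t100\tACGTACGT\tIIIIIIII",
    "r1\t147\tref\t192\t60\t8M\t=\t100\t-100\tACGTACGT\tIIIIIIII",
    "r2\t99\tref\t500\t60\t8M\t=\t612\t120\tACGTACGT\tIIIIIIII",
    "r3\t65\tref\t900\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
]
sizes = insert_sizes_from_sam(toy_sam)
median_size = statistics.median(sizes)
```

For a real file the same function can be fed an open file handle. A median far below the 470 to 550 bp expected from the Bioanalyzer trace would point to short fragments dominating the clusters.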

Can you show us what the "Summary" looks like for phiX alignments in that third tab?

Last edited by GenoMax; 09-24-2018 at 06:13 AM.
Old 09-24-2018, 06:33 AM   #5
JasperGeh
Junior Member
 
Location: Hanover, Germany

Join Date: Sep 2018
Posts: 6

Ah, low diversity between the clusters in each sequencing cycle, I understand how that would be problematic.
Yes, it is a V3 run.
And I see how an overclustering of smaller inserts would lead to sequencing into the opposite adapter with low nucleotide diversity. As soon as I get a PC and access to the institute's server, I will look into the linked method of determining the true insert length.
As I see the dropoff after around 100 bases, I should also see that as the true insert length, right? And that should be pretty salient in the data?

As my supervisor will only return next week and my PC is still not set up, I won't be able to look up the phiX controls, but I remember him saying that they looked okay.
Thank you for your help so far, I will post any developments.
Old 09-24-2018, 07:00 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,815

Quote:
As I see the dropoff after around 100 bases, I should also see that as the true insert length, right?
While that is likely, let us hope it is not the case, because otherwise you would be losing a lot of data to adapters and would have short reads.

When you have had a chance to investigate let us know what the FastQC profiles and the BBMap insert sizes look like.

Is this a MiSeq you have control over/physical access to? It may be worth having Illumina tech support take a look at this run remotely. They can diagnose if there was a hardware issue that led to the Q-score drop. If you have a maintenance contract they will likely replace the reagents at no charge in that case.
Old 09-24-2018, 06:18 PM   #7
ikripp
Member
 
Location: QLD, Aus

Join Date: Jan 2018
Posts: 16

Do you have access to the BaseSpace account that this was run on? I'd be curious to see what the metrics tab looks like.
Old 09-25-2018, 11:44 AM   #8
JasperGeh
Junior Member
 
Location: Hanover, Germany

Join Date: Sep 2018
Posts: 6

It is our own MiSeq and we will have tech support take a look at it. But from a first look at the reads, it actually does seem that the short inserts outcompeted the larger ones during clustering.

Tags
illumina miseq, paired end, quality control

All times are GMT -8.