View Single Post
Old 04-22-2020, 02:29 PM   #3
invu
Junior Member
 
Location: Boston, MA, USA

Join Date: Apr 2020
Posts: 5
Default

Quote:
Originally Posted by GenoMax View Post
Yikes this is a really low diversity sample. Do you know how much phiX (if any) was added to this sample. Did you not tell the sequence provider that these were low diversity? If you did not then it would be hard to make a case for them to re-sequence this sample again for free. You may have to pay for a re-run with a significant % of phiX (10-20% or more), if you want to get improved Q-scores.

It is possible that in spite of the bad Q-scores etc your sequence may still be usable. Have you looked at that?
Thanks for your reply, GenoMax!
The sample is a custom set of sequences with well-defined regions (hence those low-diversity regions). I had declined PhiX spike-in to obtain as many valid read lines as possible w/o sacrificing any to PhiX. I hadn't told them about the diversity because I had no idea about this kind of issue before; that being said, my old results for samples similar to this (even though they did have a few degenerate bases at the beginning) didn't have this problem (at least weren't as bad as this). Will I really need PhiX if I get to repeat something like this? Which way will I lose more data -- 10-20% loss by PhiX or less well-defined loss by poor quality reads like this?

I am looking at the data, and a big portion of the lines do seem valid and usable, but again, I'd need more lines to be ideal, and more importantly, even among those lines that apparently look okay, if more base call errors were caused by this issue, then that's a separate problem, which is quite hard to tell just from looking at those other lines.

Do you happen to know if someone looks at the rawer data (e.g., imaging data? if they're preserved? sorry I'm not really familiar with the details of the seq machines..) whether they could correct or improve the base calls throughout the seq data even now? Or is everything done real time by the machine and there's nothing that can be done to improve this?
Also, do you know if this issue caused by low diversity would also cause the tile-dependent quality loss as shown in my diagram? (This is something I am having hard time in understanding, and something I'm trying to argue about..)

Last edited by invu; 04-22-2020 at 02:33 PM.
invu is offline   Reply With Quote