SEQanswers

02-03-2016, 10:05 AM   #1
theduke
Member
Location: San Antonio
Join Date: Aug 2010
Posts: 14
Uneven reads in pooled sample

Hi All,

I've been doing this a while, but have run into a new problem on a batch of RNA-Seq samples (Truseq) that I'm concerned about. We prepped a bunch of RNA samples for sequencing and have a few that are low concentration. The core facility that I have been using for years was unhappy with this, but I didn't really see where the issue was given there was plenty of material for pooling.

Nevertheless they ran one lane for me (HiSeq), but apparently had to modify their dilution protocol. The result was as follows for the lane:


Sample     Reads         % lane    % PF clusters    Qual score
Sample1    31,287,035    12.66     89.81            36.96
Sample2    70,103,756    28.36     89.49            36.96
Sample3     8,599,888     3.48     90.17            36.97
Sample4     3,309,119     1.34     89.80            37.01
Sample5    98,775,103    39.95     89.24            36.94
Sample6    28,415,646    11.49     90.10            36.94

Now, clearly the issue is that the distribution of data is very uneven, BUT the number of reads correlates almost perfectly with the starting concentration of each sample prior to pooling (R2 = 0.99)! From what I can gather, the core did not make an intermediate dilution of each sample but instead added variable volumes to a pool and used that to cluster; what happened in between I do not know. What I don't understand is why they did this. Here are the sample concentrations:

Sample     ng/ul    nM
Sample1    1.6       6.3
Sample2    2.4       9.6
Sample3    0.9       3.5
Sample4    0.6       2.4
Sample5    2.7      10.8
Sample6    1.5       6.1
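
As a sanity check on the two concentration columns: ng/ul and nM are usually related by nM = (ng/ul x 10^6) / (660 x mean fragment length in bp), where 660 g/mol per bp is the commonly used average for double-stranded DNA. The mean library size is not stated in this post, so the sketch below only back-calculates the size each row implies. The function name is made up for illustration.

```python
# Back-calculate the mean fragment length implied by each (ng/ul, nM) pair.
# Assumption: the usual dsDNA approximation of 660 g/mol per base pair.
DSDNA_G_PER_MOL_PER_BP = 660.0

# (ng/ul, nM) values copied from the concentration table in the post.
concs = {
    "Sample1": (1.6, 6.3),
    "Sample2": (2.4, 9.6),
    "Sample3": (0.9, 3.5),
    "Sample4": (0.6, 2.4),
    "Sample5": (2.7, 10.8),
    "Sample6": (1.5, 6.1),
}

def implied_length_bp(ng_per_ul, nanomolar):
    """Rearranged form of nM = ng/ul * 1e6 / (660 * length_bp)."""
    return ng_per_ul * 1e6 / (DSDNA_G_PER_MOL_PER_BP * nanomolar)

implied = {name: implied_length_bp(*pair) for name, pair in concs.items()}

for name, length in implied.items():
    print(f"{name}: ~{length:.0f} bp implied")
```

If every row implies a similar size, the ng/ul and nM columns are at least consistent with each other.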

We are almost certain that, given the correlation with the pre-pooling sample concentrations, the pooling was done incorrectly and that the variable volumes have perhaps thrown everything off balance. Does that sound reasonable?

According to all the Illumina documentation I have read, the best thing to do would be to normalize each sample to 2 nM and then combine equal volumes of each to make a 2 nM pool, which would then be the starting point for cluster generation. Does this sound reasonable? Am I missing something here?
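
The normalize-then-pool plan can be sketched as a simple C1V1 = C2V2 calculation. This is only an illustration: the 2 nM target comes from the post, while the 10 ul final volume per library and the function name are assumptions picked for the example.

```python
# Dilute each library to a common target concentration (C1*V1 = C2*V2),
# after which equal volumes of the dilutions can be combined.
TARGET_NM = 2.0   # target from the post
FINAL_UL = 10.0   # assumed final volume of each normalized library

# nM values copied from the concentration table in the post.
conc_nM = {
    "Sample1": 6.3,
    "Sample2": 9.6,
    "Sample3": 3.5,
    "Sample4": 2.4,
    "Sample5": 10.8,
    "Sample6": 6.1,
}

def dilution(stock_nM, target_nM=TARGET_NM, final_ul=FINAL_UL):
    """Return (ul of library, ul of diluent) to reach target_nM in final_ul."""
    if stock_nM < target_nM:
        raise ValueError("stock is already below the target concentration")
    library_ul = final_ul * target_nM / stock_nM
    return library_ul, final_ul - library_ul

plan = {name: dilution(c) for name, c in conc_nM.items()}

for name, (library_ul, diluent_ul) in plan.items():
    print(f"{name}: {library_ul:.2f} ul library + {diluent_ul:.2f} ul diluent")
```

Every library in the table is above 2 nM, so the dilution is possible for all six; the weakest one (2.4 nM) leaves very little headroom.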

****I should mention that samples were quantified with both PicoGreen and Bioanalyzer HS DNA. These gave very similar results (r = 0.8), so I doubt quantification is a major issue here****

Last edited by theduke; 02-03-2016 at 10:10 AM.
02-03-2016, 01:50 PM   #2
nucacidhunter
Jafar Jabbari
Location: Melbourne
Join Date: Jan 2013
Posts: 1,226

If the library sizes were similar, they have failed to do equimolar pooling. Using different volumes for pooling or equal volumetric pooling of normalised libraries should not affect the read ratios if done correctly. qPCR is the most accurate library quantification method and the ones mentioned here are not.
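
To illustrate the variable-volume route: since 1 nM equals 1 fmol/ul, pooling equal molar amounts directly from stocks just means taking a volume inversely proportional to each concentration. A minimal sketch, where the 10 fmol per library target is an arbitrary assumption and the concentrations are the nM values given in post #1:

```python
# Equimolar pooling by variable volume, with no intermediate dilution.
# 1 nM == 1 fmol/ul, so volume (ul) = target amount (fmol) / concentration (nM).
TARGET_FMOL = 10.0  # assumed amount of each library to add to the pool

# nM values from the concentration table in post #1.
stock_nM = {
    "Sample1": 6.3,
    "Sample2": 9.6,
    "Sample3": 3.5,
    "Sample4": 2.4,
    "Sample5": 10.8,
    "Sample6": 6.1,
}

# Volume of each stock that contains TARGET_FMOL of library.
pool_ul = {name: TARGET_FMOL / c for name, c in stock_nM.items()}

total_ul = sum(pool_ul.values())
# Concentration of the finished pool (all libraries combined).
pool_nM = TARGET_FMOL * len(stock_nM) / total_ul

for name, vol in pool_ul.items():
    print(f"{name}: {vol:.2f} ul")
print(f"pool: {total_ul:.2f} ul at {pool_nM:.2f} nM total")
```

Either route puts the same number of molecules of each library into the pool, provided the concentrations going into the calculation are right.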
02-03-2016, 02:27 PM   #3
theduke

Agreed that qPCR is the way to go for accurate quantification, but that's a separate issue. Here we are talking about relative concentrations and the pooling of multiple samples. It certainly looks like equimolar pooling has failed in this instance.
02-03-2016, 04:03 PM   #4
HESmith
Senior Member
Location: Bethesda MD
Join Date: Oct 2009
Posts: 503

We've consistently observed lower-than-calculated cluster numbers when working with dilute (<1ng/ul) libraries. We don't think it's an issue with adsorption b/c we use LoBind tubes, nor an issue with the quantification (using PicoGreen fluorometry and BioAnalyzer, same as OP). We suspect that it's related to library quality, particularly when prepared in parallel. If the same amount of RNA input yields substantially different amounts of library, it's reasonable to assume that something wonky happened.

Also, you may want to recalculate your correlation. For example, sample 2 (9.6 nM) is 4X the concentration of sample 4 (2.4 nM), but produces 21X as many reads (70M / 3.3M).
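
For anyone who wants to redo that check, here is a minimal Python sketch using the reads and nM values from post #1. It computes the Pearson R2, the two fold-changes mentioned above, and reads per nM for each sample. The helper function is written out by hand rather than taken from a library.

```python
import math

# Values copied from the two tables in post #1, in sample order 1..6.
conc_nM = [6.3, 9.6, 3.5, 2.4, 10.8, 6.1]
reads = [31_287_035, 70_103_756, 8_599_888, 3_309_119, 98_775_103, 28_415_646]

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r_squared = pearson_r(conc_nM, reads) ** 2

# Sample 2 versus sample 4 (list indices 1 and 3).
conc_fold = conc_nM[1] / conc_nM[3]
read_fold = reads[1] / reads[3]

# If reads were proportional to concentration, these would all be equal.
reads_per_nM = [r / c for r, c in zip(reads, conc_nM)]

print(f"R2 = {r_squared:.3f}")
print(f"sample 2 vs 4: {conc_fold:.1f}x concentration, {read_fold:.1f}x reads")
for i, value in enumerate(reads_per_nM, start=1):
    print(f"Sample{i}: {value / 1e6:.2f}M reads per nM")
```

A high R2 only says the points lie close to a straight line; it does not by itself mean reads are proportional to concentration, which is what the reads-per-nM column is there to show.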
02-03-2016, 05:42 PM   #5
theduke

What was the cutoff for the observed weirdness? <1ng/ul? I think we will redo the 5 or 6 samples that are odd ones out and start over.

Does my pooling plan sound reasonable?
02-03-2016, 07:24 PM   #6
nucacidhunter

Quote:
Originally Posted by HESmith
We've consistently observed lower-than-calculated cluster numbers when working with dilute (<1ng/ul) libraries. We don't think it's an issue with adsorption b/c we use LoBind tubes, nor an issue with the quantification (using PicoGreen fluorometry and BioAnalyzer, same as OP). We suspect that it's related to library quality, particularly when prepared in parallel. If the same amount of RNA input yields substantially different amounts of library, it's reasonable to assume that something wonky happened.

Also, you may want to recalculate your correlation. For example, sample 2 (9.6 nM) is 4X the concentration of sample 4 (2.4 nM), but produces 21X as many reads (70M / 3.3M).

Lower-than-expected yield indicates some issue with library prep, but it could be unrelated to quality, for instance loss of a good library during clean-up. If there was an issue with library quality, then the cluster density and yield should be lower than expected, as the bad library would not contribute to sequencing output. Because of the formatting issue I can't figure out the stats given for the output here. I have not observed this when pooling equimolar quantities of qPCR-quantified libraries at concentrations from below 1 ng/ul up to 10 ng/ul.
02-04-2016, 07:01 AM   #7
theduke

Quote:
Originally Posted by nucacidhunter
Because of formatting issue I can't figure out the stats given on output here.
Apologies for that. In the first table, there are five columns in the following order:

Sample name
Number of reads
% of the lane
%PF clusters
Quality score

There don't appear to be any quality issues per se.
02-05-2016, 01:22 AM   #8
nucacidhunter

The total number of reads is over 230M, which is around average for V4 chemistry on a HiSeq 2500 (if that is the platform used for sequencing), and read quality and %PF are up to spec.