#1
Junior Member
Location: Luxembourg Join Date: Nov 2014
Posts: 6
Hi SEQ-users,

We are currently following the Illumina Demonstrated Protocol for 16S sequencing on the MiSeq (24-96 samples) for stool and saliva samples. We are seeing inconsistent run metrics from run to run (e.g. %Q30 ranges from 64% to 85%, cluster density from 421 to 1328 K/mm2, etc.). For those who use the same protocol:

1. What does your 16S MiSeq sequencing run look like (in terms of %Q30, cluster density, % aligned, etc.)? Have you set specific run metrics for run acceptance? (I have attached the run summary of our recent run; let me know your thoughts.)
2. What is the minimum number of sequences you process per sample? Apart from the Illumina document, do you know of any publication that recommends a certain number of reads per sample?
3. We are using USEARCH for quality filtering before assembly. However, very few R2 reads pass. Would you recommend other quality filtering tools?

Any answers or suggestions would be greatly appreciated. Thank you in advance.

#2
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,082
Are you spiking in phiX and if so at what concentration? Are these same libraries being run repeatedly or different sample pools? What method are you using for estimation of concentration? Do you expect the reads to overlap and are you using any software to do the read merge before you quality trim?

#3
Junior Member
Location: Luxembourg Join Date: Nov 2014
Posts: 6
I am not really an expert in bioinformatics, but our bioinformatician set up our pipeline as follows:

1. First, validate the DNA sequences from the MiSeq using USEARCH by quality filtering R1 and R2 separately. Reads that pass the quality filtering go on to the next step.
2. Bacterial classification: cleaning, clustering, taxonomic assignment, and building of the abundance matrix using UPARSE.

R2 reads rarely pass the quality filtering; we usually get only about 4000 reads per sample that pass. Is this too low? Are we doing it differently from everyone else? Is there a better way?

#4
Senior Member
Location: California Join Date: Jul 2014
Posts: 198
Our 16S runs using the V4 region generally have Q30 ~ 85-95% and a density of ~1000 K/mm2 with no PhiX spike. I'll check on how much we load.

No specific run metrics for acceptance; we use the standard Illumina "does it pass spec" criteria. For 16S, you really don't need many reads per sample, as you will rarefy later in the analysis anyways. We aim for 100k reads per sample just to make sure most/all of them will have enough to be included, but the saturation curves generally plateau very quickly (even as low as 4-6k reads). Look at the HMP papers.

We use the fastq-join utility to join reads (it's a quality-score-aware joiner, so low Q score pairs will be discarded). Is it possible that you're setting the quality filter too strict?
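
As a rough illustration of the rarefaction and saturation-curve idea mentioned above, here is a minimal Python sketch. It assumes a sample is represented as a dictionary of OTU name to read count; the function names `rarefy` and `rarefaction_curve` and the example counts are hypothetical and not taken from any particular pipeline.

```python
import random


def rarefy(otu_counts, depth, seed=0):
    """Subsample `depth` reads without replacement and return counts per OTU.

    `otu_counts` maps an OTU label to its read count (hypothetical input format).
    """
    # Expand the table into one entry per read so every read is equally likely.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the total number of reads in the sample")
    rng = random.Random(seed)
    subsampled = {}
    for otu in rng.sample(pool, depth):
        subsampled[otu] = subsampled.get(otu, 0) + 1
    return subsampled


def rarefaction_curve(otu_counts, depths, seed=0):
    """Return (depth, number of observed OTUs) pairs for each requested depth."""
    return [(d, len(rarefy(otu_counts, d, seed))) for d in depths]


if __name__ == "__main__":
    sample = {"otu_a": 5000, "otu_b": 3000, "otu_c": 1500, "otu_d": 5}
    for depth, observed in rarefaction_curve(sample, [100, 1000, 4000, 9505]):
        print(depth, observed)
```

If the number of observed OTUs stops increasing as the depth grows, the curve has plateaued and additional reads add little new diversity for that sample.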

#5
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,082
That is odd. Have you run FastQC on these? Can you post Q-score plots for read 1 and 2? Do you know what Q-score cut-off your informatics people are using?

#6
Senior Member
Location: New England Join Date: Jun 2012
Posts: 200
Last edited by microgirl123; 05-11-2015 at 11:12 AM.

#7
Junior Member
Location: Luxembourg Join Date: Nov 2014
Posts: 6
• Expected error of the global read sequence < 1
• Each read nucleotide Q score > 3
• Length > 250 bp (to have an overlap > 40 bp after quality filtering)

Is this too strict or just right? According to our bioinformatician, the Q score and expected error values are the ones recommended by the UPARSE developers and in the UPARSE publication. An example of our quality filtering result is attached. I have also just run FastQC on one of the samples; the results are attached. Any thoughts? Thank you in advance.
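
To make the three criteria above concrete, here is a minimal Python sketch of how they could be evaluated on a single read. It is not the USEARCH implementation; it assumes Phred+33 quality encoding, and the function names and default thresholds are illustrative, taken only from the bullet points in this post.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer Q scores."""
    return [ord(ch) - offset for ch in quality]


def expected_errors(scores):
    """Expected number of errors: the sum of per-base error probabilities 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in scores)


def passes_filter(quality, max_ee=1.0, min_q=3, min_len=250):
    """Return True if a read meets all three criteria listed in the post.

    Thresholds are the ones quoted above: expected error < 1,
    every base Q > 3, and read length > 250.
    """
    scores = phred_scores(quality)
    if len(scores) <= min_len:
        return False
    if min(scores) <= min_q:
        return False
    return expected_errors(scores) < max_ee


if __name__ == "__main__":
    print(passes_filter("I" * 251))  # uniform Q40 read
    print(passes_filter("5" * 251))  # uniform Q20 read
```

A 251-base read at a uniform Q40 has an expected error of about 0.025 and passes, while the same length at a uniform Q20 has an expected error of about 2.5 and fails. This shows how an expected-error cutoff of 1 applied to full-length reads can reject most reads once quality drops toward the 3' end, as it often does in R2.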

#8
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,082
I think you meant to say Q30 (not 3), since your data do not seem to have any reads below Q5.
If that is indeed Q30 (and above), then it seems to be a very stringent filter. Since the reads are expected to overlap, perhaps the merging should be done first, with the Q-score used as the criterion to keep the base with the higher quality (if the merge is not perfect). Look into BBMerge (http://seqanswers.com/forums/showthread.php?t=43906) or FLASH (http://ccb.jhu.edu/software/FLASH/) as options.
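
The idea of keeping the higher-quality base in the overlap can be sketched in a few lines of Python. This is a simplified illustration, not how BBMerge or FLASH work internally: it assumes the overlap length is already known (real mergers find it by aligning the reads and also recompute the merged quality), and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def merge_pair(r1_seq, r1_qual, r2_seq, r2_qual, overlap):
    """Merge R1 with the reverse complement of R2 over a known overlap length.

    At each overlapping position the base with the higher quality character
    is kept (for Phred+33, a higher character means a higher Q score).
    Ties keep the R1 base.
    """
    if not 0 < overlap <= min(len(r1_seq), len(r2_seq)):
        raise ValueError("overlap must be between 1 and the shorter read length")
    r2_rc = reverse_complement(r2_seq)
    r2_rc_qual = r2_qual[::-1]
    split = len(r1_seq) - overlap  # first R1 position covered by R2

    merged_seq = list(r1_seq[:split])
    merged_qual = list(r1_qual[:split])
    for i in range(overlap):
        base1, qual1 = r1_seq[split + i], r1_qual[split + i]
        base2, qual2 = r2_rc[i], r2_rc_qual[i]
        if qual2 > qual1:
            merged_seq.append(base2)
            merged_qual.append(qual2)
        else:
            merged_seq.append(base1)
            merged_qual.append(qual1)
    # Append the part of R2 that extends past the end of R1.
    merged_seq.extend(r2_rc[overlap:])
    merged_qual.extend(r2_rc_qual[overlap:])
    return "".join(merged_seq), "".join(merged_qual)


if __name__ == "__main__":
    # The last base of R1 is a low-quality miscall; R2 covers it at high quality.
    print(merge_pair("ACGTAT", "IIIII#", "CCGTAC", "IIIIII", overlap=4))
```

In the usage example the low-quality final base of R1 is replaced by the high-quality base from R2, which is why merging before quality filtering can rescue read pairs whose individual tails would otherwise fail a strict filter.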

#9
Junior Member
Location: Luxembourg Join Date: Nov 2014
Posts: 6
Thank you GenoMax and thank you all for your replies.

I had a closer look at our pipeline and indeed there is something that needs to be fixed: the UPARSE workflow recommends merging paired reads before read quality filtering. So you're right, merging needs to be done first. I don't know why, but our bioinformatician set up the pipeline this way:

STEP 1. Read quality filtering of R1 and R2 separately (this is where a lot of our reads are discarded, and the bioinformatician tells me that the MiSeq data are not usable). If the sequences pass STEP 1, then STEP 2 is run.
STEP 2. Back to scratch: merging of paired reads, read quality filtering, assembly, ...

I believe starting from STEP 2 would be sufficient.

#10
Member
Location: USA Join Date: Jul 2015
Posts: 28
Hello Fanli, I like the results of your 16S V4 MiSeq run. How much did you load? And what is the size of your library?
Thanks.

#11
Senior Member
Location: California Join Date: Jul 2014
Posts: 198
We load 8.0 pM library using the 515F/806R primers detailed here:
http://www.earthmicrobiome.org/emp-s...protocols/16s/
Edit: 1.8pM was for NextSeq runs
Last edited by fanli; 07-30-2015 at 07:44 AM.

#12
Member
Location: USA Join Date: Jul 2015
Posts: 28
Fanli, thank you for the information. Two more questions: did you use the MiSeq v2 kit for this 16S V4 run? And why no PhiX spike-in (how did you decide on no PhiX; does a protocol mention it, or did you test it)? I want to change my protocol, but I would like to know why. Thank you very much.

#13
Senior Member
Location: California Join Date: Jul 2014
Posts: 198
Yes, these numbers are for v2 kits. We've found that there's little difference with a small PhiX spike on our particular MiSeq, but I don't really see the harm in doing something in the 5% range. You generally aren't going to be constrained for sequencing depth with 16S anyways.

#14
Member
Location: USA Join Date: Jul 2015
Posts: 28
Thank you, Fanli.

#15
Member
Location: Baton Rouge, Louisiana Join Date: Feb 2010
Posts: 31
Not meaning to hijack the thread, but can anyone explain why we see such low 1st cycle intensities with 16S libraries? I see this in both V3-V4 and V4-only libraries, perhaps due to low diversity? If I look at the run summary from targeted reseq or PhiX, the 1st cycle intensities are comparable and normally in the 300-400 range, but 16S runs are usually <50. Thanks for any insight.

#16
Senior Member
Location: California Join Date: Jul 2014
Posts: 198
I think low diversity would only cause jaggedness in the intensity profile, but maybe you should check w/ tech support.
Our 16S runs have 1st cycle intensities ~150.

#17
Member
Location: Baton Rouge, Louisiana Join Date: Feb 2010
Posts: 31
Hi fanli,

Hmmm, the run summary you posted on 5-11-2015 shows read 1 intensity at 17 and read 4 intensity at 53. The other run summary from the OP also shows a low 1st cycle intensity. This also fits with what I see in 16S libraries on the MiSeq. So: 3 different machines, 3 different places, 3 similarly low 1st cycle intensities...

Tech support is telling me this is the reason I am having issues with run completion. My last 3 runs terminated at random points: run 1 at cycle 385, run 2 at cycle 60, run 3 at cycle 587. Tech support has been helpful in replacing kits, but the MiSeq is still out of commission. I have arranged for the libraries to be sequenced on another MiSeq. QC all checks out, so I have little to no concern about the libraries. One comment that came out was "your 1st cycle intensities are very low...". I get a stopped run and funky z-stage errors; the z-stage was replaced but the same issue persists.

#18
Senior Member
Location: California Join Date: Jul 2014
Posts: 198
My bad - that last screenshot is Called Int. Yeah, you're right - the Cycle 1 intensities for my last run are 30 and 39 for read 1 and read 3, respectively.
We haven't had any issues with runs terminating randomly for 16S libraries though. Although now that I think about it, we did have one bacterial WGS run that died on cycle 515 or so. Something about a .NET framework error and tech support said they hadn't seen it before.

#19
Member
Location: Belgium Join Date: Apr 2009
Posts: 24
Hi all,
What minimum Q-value would you suggest for trimming/clipping a 16S read or merged amplicon? Thanks.

#20
Senior Member
Location: CT Join Date: Apr 2015
Posts: 243
I don't trim based on Q score; I use the Q scores to merge reads. I use mothur, which implements pandaseq for its read merging.
__________________
Microbial ecologist, running a sequencing core. I have lots of strong opinions on how to survey communities, pretty sure some are even correct.