Hi,
I've got some 16S sequences obtained with a MiSeq sequencer.
I have sequenced several negative control samples to identify contaminants that I need to remove from my actual samples.
I was thinking about deleting a contaminated OTU from a sample unless its read count in that sample is more than 10x its read count in the most contaminated negative control.
In short, if I have this pattern for OTU x:
Sample A: 20 reads
Sample B: 300 reads
Negative control C: 3 reads
Negative control D: 12 reads
I will delete this OTU from sample A but not from sample B, since the threshold is 10 x 12 = 120 reads, and 300 > 120 while 20 < 120.
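To make the rule concrete, here is a minimal Python sketch of what I have in mind. The function name `filter_otu` and the `fold` parameter are placeholders I made up for illustration; they do not come from any existing pipeline.

```python
def filter_otu(sample_counts, control_counts, fold=10):
    """Apply the 10x rule to one OTU.

    sample_counts:  {sample name: reads of this OTU in that sample}
    control_counts: {negative control name: reads of this OTU in that control}
    The OTU is kept in a sample only if its read count is strictly greater
    than `fold` times the highest count seen in any negative control.
    """
    # Threshold is based on the most contaminated negative control.
    threshold = fold * max(control_counts.values(), default=0)
    # Set the count to 0 where the OTU does not clear the threshold.
    return {
        sample: (reads if reads > threshold else 0)
        for sample, reads in sample_counts.items()
    }


# The OTU x example from above.
samples = {"A": 20, "B": 300}
controls = {"C": 3, "D": 12}

filtered = filter_otu(samples, controls)
print(filtered)  # {'A': 0, 'B': 300}
```

Note that the comparison uses raw read counts, which is exactly why the order of subsampling matters for my question.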
That being said, here is my question:
When should the subsampling be performed? Should it be before or after removing the contaminants?
I was first thinking about subsampling after the contaminant removal step.
But then I thought that the removal rule described above only makes sense on a subsampled dataset, because it compares raw read counts between samples and controls that were sequenced at different depths.
So should I rather subsample before removing the contaminants? The problem then is that some samples are much more contaminated than others, which means that the resulting sequencing depth will vary from one sample to another.
Or should I subsample twice, before AND after?
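To show what I mean by the depth varying when subsampling comes first, here is a small sketch with made-up counts. The helpers `subsample` and `remove_otus` are hypothetical, and for simplicity the contaminant OTU x is removed entirely rather than with the 10x rule.

```python
import random


def subsample(counts, depth, seed=0):
    """Randomly draw `depth` reads without replacement from one sample."""
    rng = random.Random(seed)
    # Expand the count table into one entry per read, then draw from it.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    picked = rng.sample(reads, depth)
    return {otu: picked.count(otu) for otu in counts}


def remove_otus(counts, contaminants):
    """Drop the given contaminant OTUs from one sample."""
    return {otu: n for otu, n in counts.items() if otu not in contaminants}


# Made-up samples: A is heavily contaminated by OTU x, B is clean.
sample_a = {"x": 950, "y": 50}
sample_b = {"x": 0, "y": 1000}

# Subsample both to the same depth first, then remove the contaminant.
sub_a = subsample(sample_a, 100)
sub_b = subsample(sample_b, 100)
depth_a = sum(remove_otus(sub_a, {"x"}).values())
depth_b = sum(remove_otus(sub_b, {"x"}).values())

# Both samples had 100 reads after subsampling, but no longer do.
print(depth_a, depth_b)
```

Here sample B keeps all 100 reads, while sample A can keep at most 50, so the two samples are no longer at a comparable depth.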
Thank you in advance for your advice,
Adrien