Oh - that's an intentional protection from overwriting files. Just delete the output file first or add the "overwrite" flag.
-
high contaminants
Thanks.
Input is being processed as unpaired
Input: 385043 reads 10781204 bases.
Contaminants: 341911 reads (88.80%) 9573508 bases (88.80%)
Result: 43132 reads (11.20%) 1207696 bases (11.20%)
What is the definition of contaminants? The number looks very high.
-
k=16 shows higher contaminants than k=26
zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bbduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_26.txt k=26 fbm
java -ea -Xmx1g -cp /home/zheng/Desktop/bbmap/current/ jgi.BBDukF -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_26.txt k=26 fbm
Executing jgi.BBDukF [-Xmx1g, in=probe48mix25fg_S7_L001_R2_001.fastq, ref=ngs13template.fasta, stats=probe48mix25fg_S7_L001_R2_001_26.txt, k=26, fbm]
No output stream specified. To write to stdout, please specify 'out=stdout.fq' or similar.
Initial:
Memory: free=237m, used=14m
Added 13 kmers; time: 0.023 seconds.
Memory: free=228m, used=23m
Input is being processed as unpaired
Input: 159642 reads 4469976 bases.
Contaminants: 130724 reads (81.89%) 3660272 bases (81.89%)
Result: 28918 reads (18.11%) 809704 bases (18.11%)
Time: 0.197 seconds.
Reads Processed: 159k 811.47k reads/sec
Bases Processed: 4469k 22.72m bases/sec
zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ ^C
zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
bduk.sh: command not found
zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bbduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
java -ea -Xmx1g -cp /home/zheng/Desktop/bbmap/current/ jgi.BBDukF -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
Executing jgi.BBDukF [-Xmx1g, in=probe48mix25fg_S7_L001_R2_001.fastq, ref=ngs13template.fasta, stats=probe48mix25fg_S7_L001_R2_001_16.txt, k=16, fbm]
No output stream specified. To write to stdout, please specify 'out=stdout.fq' or similar.
Initial:
Memory: free=237m, used=14m
Added 143 kmers; time: 0.028 seconds.
Memory: free=228m, used=23m
Input is being processed as unpaired
Input: 159642 reads 4469976 bases.
Contaminants: 151727 reads (95.04%) 4248356 bases (95.04%)
Result: 7915 reads (4.96%) 221620 bases (4.96%)
-
So... that's telling you that you are getting matches between the stuff in your input file (probe48mix25fg_S7_L001_R2_001.fastq) and your reference file (ngs13template.fasta). And a shorter kmer will always find more matches in the presence of error.
probe48mix25fg_S7_L001_R2_001_26.txt will contain a list of which reference sequences were seen, and how many times they were seen.
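To illustrate why a shorter kmer finds more matches when reads contain errors, here is a minimal sketch in Python. It is not BBDuk's implementation: the sequences are made up, the function names are hypothetical, and it ignores reverse complements and every other BBDuk option. It only shows that a single substitution destroys every 26-mer of a 26-base read but leaves some 16-mers intact.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def read_matches(read, ref_kmers, k):
    """True if at least one kmer of the read is present in the reference set."""
    return any(read[i:i + k] in ref_kmers for i in range(len(read) - k + 1))


# Made-up 26-base reference sequence (an assumption, not from ngs13template.fasta).
reference = "ACGTTGCAAGCTTAGGCATCGATCCG"

# Simulated read: the same sequence with one substitution (G -> T) at index 20.
read = reference[:20] + "T" + reference[21:]

for k in (26, 16):
    hit = read_matches(read, kmers(reference, k), k)
    print(f"k={k}: read matches reference -> {hit}")
```

With k=26 the only kmer in the read is the read itself, which contains the error, so nothing matches. With k=16 the kmer covering bases 0 to 15 is error-free, so the read still counts as a match.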
-
And a shorter kmer will always find more matches in the presence of error.
But here k=16 shows fewer matching sequences than k=26:
for k=16
Input: 159642 reads 4469976 bases.
Contaminants: 151727 reads (95.04%) 4248356 bases (95.04%)
Result: 7915 reads (4.96%) 221620 bases (4.96%)
for k=26
Input: 159642 reads 4469976 bases.
Contaminants: 130724 reads (81.89%) 3660272 bases (81.89%)
Result: 28918 reads (18.11%) 809704 bases (18.11%)
-
In this case, the output is misleading... BBDuk assumes that the ref file is a file of contaminants because that's what I originally designed it for. So "Contaminants" actually means "Things that match the reference". I may change the wording eventually.
In other words, 95.04% of the reads matched the reference for k=16 and 81.89% did for k=26.
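As a quick check on how to read those numbers, the percentages can be recomputed from the read counts in the logs above. This is just arithmetic on the posted output, treating "Contaminants" as reads that matched the reference and "Result" as reads that did not. The helper name `pct` is made up for this sketch.

```python
def pct(part, total):
    """Percentage of total, rounded to two decimals like the BBDuk log."""
    return round(100.0 * part / total, 2)


total_reads = 159642  # "Input" line, same for both runs

# (matched reads, unmatched reads) copied from the two logs
runs = {
    16: (151727, 7915),
    26: (130724, 28918),
}

for k, (matched, unmatched) in runs.items():
    # Matched and unmatched reads should add up to the input.
    assert matched + unmatched == total_reads
    print(f"k={k}: matched {pct(matched, total_reads)}%, "
          f"unmatched {pct(unmatched, total_reads)}%")
```

So the shorter kmer does match more reads (95.04% against 81.89%), which is consistent with the earlier explanation once "Contaminants" is read as "matched".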