Hi All,
I've been using the FASTX-Toolkit to preprocess Illumina HiSeq reads for alignment with SHRiMP and assembly with Velvet/Oases. We've had some problems with the artefacts filter: when we use it (normally as the first stage of the process, followed by the quality filter and the clipper), the assemblies come back with a few huge contigs (300-400 Mbp) and a small number of other fairly long ones (around 50-60 contigs of 20-100 kbp) that contain more repetitive sequence than we'd expect.
Code used: fastx_artifacts_filter -i $FQ -o $OUT_noart -v -Q 33
However, in one data set the problem went away when we omitted the artefacts filter. Does anyone know why this would be, and whether we can trust the rest of an assembly that contains these few anomalous contigs?
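In case it helps with diagnosing this, here is a minimal sketch for comparing read counts before and after the artefacts filter. It is not part of our pipeline: `count_reads` is a hypothetical helper name, and it assumes plain, uncompressed FASTQ with exactly four lines per record.

```shell
# Hypothetical helper: count records in an uncompressed FASTQ file.
# Assumes the standard layout of exactly four lines per read.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Example usage (paths are the variables from the command above):
#   count_reads "$FQ"
#   count_reads "$OUT_noart"
```

A large drop between the two counts would at least show how much data the filter is removing before assembly.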
Many thanks