I haven't found a thread answering this question yet, so apologies if one already exists somewhere.
I'm assembling some very large metagenomes in SPAdes from NextSeq data. I understand that the four FASTQ files, one from each flowcell lane, are typically concatenated into a single file. However, SPAdes is running out of memory on my server mid-assembly.
My question is this: is there any technical reason for concatenating the FASTQ files prior to analysis, rather than running four separate assemblies and merging the scaffolds later? The latter would save me memory, but I don't want to do it if it's bad form.
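For reference, the concatenation step I mean is roughly the following. This is only a minimal sketch: the helper names `find_lane_files` and `concat_lanes` are made up for illustration, and it assumes bcl2fastq-style file names such as `sample_S1_L001_R1_001.fastq.gz`. It relies on the fact that gzip files can be joined byte-for-byte into a valid multi-member gzip file.

```python
import shutil
from pathlib import Path


def find_lane_files(directory, sample, read):
    """Collect the per-lane files for one sample and read direction.

    Assumes bcl2fastq-style names (hypothetical example:
    sample_S1_L001_R1_001.fastq.gz). Sorting keeps lanes in order.
    """
    pattern = f"{sample}_*_L00?_{read}_001.fastq.gz"
    return sorted(Path(directory).glob(pattern))


def concat_lanes(lane_files, out_path):
    """Append each lane file to out_path, in the order given.

    Files are copied as raw bytes, so this works for plain FASTQ and
    for gzip-compressed FASTQ alike.
    """
    with open(out_path, "wb") as out:
        for lane in lane_files:
            with open(lane, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(out_path)
```

I would call it once for R1 and once for R2, e.g. `concat_lanes(find_lane_files("fastq", "sample", "R1"), "sample_R1.fastq.gz")`, using the same lane order for both so the read pairs stay in sync.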
Still learning this stuff so any pointers welcome...
Cheers,
Nathan