Hello everyone,
I'm new to genome assembly and am trying to assemble a genome de novo, with no reference available (expected size 20-100 Mb).
I've got about 32 files of paired-end Illumina MiSeq reads, mostly 150 bp but some 75 bp. So far I've trimmed them with cutadapt and run them through FASTQ Groomer, and I'm currently trying to get Quake working on them. Hopefully that will help stitch everything together better than my employer's existing assembly, which has over 66,000 scaffolds! For the assembly itself I'm going to try SOAPdenovo and ALLPATHS-LG.
Any tips or links to pipelines for something similar would be appreciated. Also, can anyone tell me whether I should combine my sequence files before running Quake? FastQC shows between 8% and 35% duplicates in my sequence files.