Well, there are probably at least two genomes in there, plus lots of contaminating host DNA. One genome is pretty easy because it has >95% sequence identity to an annotated genome. The other is more of a challenge because it is likely quite diverged from anything else. Both are under 500 kb in length.
jt
Also, I tried aggressive filtering: reads >50 bp, with quality >30 across 100% of the read.
That cut it down to 5 million PE reads. It didn't help. It did lower the coverage of the more abundant genome accordingly, but it also reduced the contig lengths and didn't affect the number of Ns in the contigs.
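For anyone who wants to reproduce that kind of filter, here is a minimal sketch of what it amounts to. It is not the actual tool that was run. It assumes Phred+33 quality encoding and that a pair is dropped if either mate fails; the function names and the (name, seq, qual) record layout are made up for illustration.

```python
def passes_filter(seq, qual, min_len=50, min_q=30, offset=33):
    """Return True if the read is longer than min_len bases and every
    base has a Phred quality strictly above min_q.

    Assumes Phred+33 encoded quality strings (offset=33)."""
    if len(seq) != len(qual) or len(seq) <= min_len:
        return False
    return all(ord(ch) - offset > min_q for ch in qual)


def filter_pairs(pairs, **kwargs):
    """Yield only those read pairs in which BOTH mates pass the filter.

    Each mate is a (name, seq, qual) tuple; dropping the whole pair when
    one mate fails is an assumption, chosen to keep the reads paired."""
    for mate1, mate2 in pairs:
        if (passes_filter(mate1[1], mate1[2], **kwargs)
                and passes_filter(mate2[1], mate2[2], **kwargs)):
            yield mate1, mate2


if __name__ == "__main__":
    good = ("read_a", "A" * 60, "I" * 60)           # 'I' is Q40 in Phred+33
    one_bad = ("read_b", "A" * 60, "I" * 59 + "5")  # '5' is Q20 in Phred+33
    kept = list(filter_pairs([(good, good), (good, one_bad)]))
    print(len(kept), "of 2 pairs kept")
```

Requiring every single base to clear the threshold is very strict, so a single low-quality base anywhere in either mate discards the whole pair.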
jt