I'm seeing surprisingly low pre-assembly yields for an HGAP 2.2 run on a small bacteriophage genome (~150 kb). Depending mostly on the min seed read length, yields range from 0.081 to 0.19. As I understand it, "good" yields are above 0.5. Any idea why mine are so low?
Sequencing:
1 SMRT cell
~426Mb sequence
HGAP details (best run):
min seed read length = auto
target coverage = 15x
genome size = 150000
all else is default
HGAP results:
1 contig @ ~139kb
Polymerase Read Bases 295,181,019
Length Cutoff 14,925
Seed Bases 4,512,649
Pre-Assembled bases 899,955
Pre-Assembled Yield .199
Pre-Assembled Reads 254
Pre-Assembled Reads Length 3,543
Pre-Assembled N50 5,210
Is this simply because the genome is tiny? Theoretical coverage is ~1,968x (polymerase read bases divided by genome size), so is HGAP limiting the number of subreads that participate in the pre-assembly?
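For reference, here is a rough sketch of how I'm reading these numbers. It assumes the reported yield is simply pre-assembled bases divided by seed bases (that ratio does reproduce the .199 in the report) and that fold coverage is bases divided by the genome size I gave HGAP. The variable and function names are my own, not anything from SMRT Analysis.

```python
# Figures copied from the HGAP report above.
polymerase_read_bases = 295_181_019
seed_bases = 4_512_649
pre_assembled_bases = 899_955
genome_size = 150_000  # the genome size parameter passed to HGAP


def fold_coverage(bases: int, genome_size: int) -> float:
    """Fold coverage, assuming bases are spread evenly over the genome."""
    return bases / genome_size


def pre_assembly_yield(pre_assembled_bases: int, seed_bases: int) -> float:
    """Assumed definition: fraction of seed bases that survive pre-assembly."""
    return pre_assembled_bases / seed_bases


raw_cov = fold_coverage(polymerase_read_bases, genome_size)
seed_cov = fold_coverage(seed_bases, genome_size)
pre_assembled_cov = fold_coverage(pre_assembled_bases, genome_size)
yield_fraction = pre_assembly_yield(pre_assembled_bases, seed_bases)

print(f"raw coverage:           {raw_cov:,.0f}x")
print(f"seed coverage:          {seed_cov:.1f}x")
print(f"pre-assembled coverage: {pre_assembled_cov:.1f}x")
print(f"pre-assembly yield:     {yield_fraction:.3f}")
```

By this arithmetic the seed reads amount to about 30x of the genome and the pre-assembled reads to only about 6x, which is why the yield looks so low to me despite the very deep raw coverage.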
Thanks!