Hello bioinformaticians!
First post here, sorry if it's not in the right forum section. I'm completely new to bioinformatics and have just been assigned to a genomic DNA sequencing + genome assembly project, and I would appreciate your advice on some basic questions!
We are conducting a population survey of ~90 strains of Bacillus subtilis (genome size of about 4 Mb). We would like to do whole-genome sequencing on each of these strains. Many reference genomes have already been sequenced, and we will use those as anchors.
We will purify DNA from each strain and will have 90 individual DNA samples. We want to send these DNA samples out to a university/company to be sequenced. Using that data, we want to assemble the genome of each of these strains with Velvet, SPAdes, or an alternative.
Right now, I have the responsibility of choosing where our DNA samples get sent out, what NGS platform to use, and what run specifications to use for our project.
My issue right now is that I'm not sure which next-generation sequencing platform is suitable for our project, given that genome assembly is our goal. How do I pick between Illumina (MiSeq, HiSeq, etc.) and PacBio?
I also am unsure of what depth of coverage would be acceptable for our purposes; I've been told 10X should be good enough but that seems low to me - perhaps 20X would suffice?
I'm also not sure what read lengths would be appropriate - do you know if paired-end reads of 2x150 or 2x250 would be good?
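For what it's worth, here is the back-of-the-envelope arithmetic I'm using to turn a coverage target into read counts. It's only a rough sketch: it assumes average coverage = (total sequenced bases) / (genome size), a 4 Mb genome, and perfectly even coverage with no reads lost to filtering. The function name and constants are my own, not from any tool.

```python
# Rough estimate of paired-end read pairs needed per strain.
# Assumption: average coverage = total sequenced bases / genome size.

GENOME_SIZE_BP = 4_000_000  # approximate B. subtilis genome size (assumed)
N_STRAINS = 90              # number of strains in the survey


def read_pairs_needed(coverage, read_len, genome_size=GENOME_SIZE_BP):
    """Return the number of read pairs needed for a target average depth."""
    bases_needed = coverage * genome_size
    bases_per_pair = 2 * read_len
    # Integer ceiling division so we never round down below the target.
    return -(-bases_needed // bases_per_pair)


if __name__ == "__main__":
    for coverage in (10, 20):
        for read_len in (150, 250):
            pairs = read_pairs_needed(coverage, read_len)
            total_gb = coverage * GENOME_SIZE_BP * N_STRAINS / 1e9
            print(
                f"{coverage}X, 2x{read_len}: {pairs:,} pairs per strain, "
                f"{total_gb:.1f} Gb total across {N_STRAINS} strains"
            )
```

If that arithmetic is wrong, or if real runs need a large safety margin on top of it, please let me know.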
I'm from a different field so I have a lot to learn - I'd appreciate any pointers you have. Thanks!