So I have a bit of an issue that I keep running into.
Here's what I'm trying to do.
I have sequencing data for a bacterial genome from both PacBio and Illumina, and I'm trying to finish the genome with the data I have.
I've downloaded a reference genome and mapped both data sets to it using Geneious. The reads map very well, and I then try to extract a consensus sequence.
However, there are regions of the reference that my genome doesn't have, and there are regions of poor coverage that I want spliced out.
Basically, what I'm asking is: how can I throw out the crap data from a mapping to a reference? When I extract the consensus sequence, I get a ton of unspecified bases (http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html) in the areas of poor coverage.
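To make the goal concrete, here is a rough sketch of what I mean by "splicing out" the bad regions. It is plain Python working on the exported consensus string, not anything Geneious provides. The function name `split_consensus` and the thresholds `min_run` and `min_len` are made up for illustration, and the assumption is that poorly covered regions show up as long runs of non-ACGT symbols.

```python
import re


def split_consensus(seq, min_run=10, min_len=50):
    """Split a consensus at long runs of unspecified bases.

    seq     -- consensus sequence as a string
    min_run -- shortest run of non-ACGT symbols treated as a bad region
               (hypothetical threshold, tune for your data)
    min_len -- shortest piece worth keeping (hypothetical threshold)

    Returns a list of (start, end, subsequence) tuples, 0-based, end-exclusive.
    """
    seq = seq.upper()
    # A "bad region" is min_run or more consecutive symbols that are not A, C, G or T.
    bad_region = re.compile(r"[^ACGT]{%d,}" % min_run)

    pieces = []
    start = 0
    for match in bad_region.finditer(seq):
        # Keep the stretch before this bad region if it is long enough.
        if match.start() - start >= min_len:
            pieces.append((start, match.start(), seq[start:match.start()]))
        start = match.end()

    # Keep whatever follows the last bad region.
    if len(seq) - start >= min_len:
        pieces.append((start, len(seq), seq[start:]))
    return pieces


if __name__ == "__main__":
    demo = "ACGT" * 5 + "N" * 12 + "GGCC" * 3 + "NN" + "TTAA" * 2
    for begin, end, piece in split_consensus(demo, min_run=10, min_len=8):
        print(begin, end, piece)
```

Short runs below `min_run` (for example an isolated ambiguity code) are left in place, so this only cuts at the large gaps; whether that is the right criterion, compared with filtering on actual read depth, is part of what I'm unsure about.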
Thanks
Alec