Hello SEQers,
We have a series of 10 bacterial genomes sequenced with Illumina (300-base PE reads, before read processing). We want to find SNPs/InDels responsible for a phenotype in 7 of them.
Using different mappers (BWA-mem, bowtie2, cushaw2) with reads processed to different extents (through trim, trimmomatic, BBduk), I can get maps that provide variant calls (mpileup or FreeBayes) for several established/known mutations (relative to the reference). However, some mappers give maps that show a change in certain genes and others don't (I can detail the differences in the maps if needed). One simple approach to interrogate the promising variant calls is to compare the same loci to those found in de novo assemblies from the same reads (an unbiased sequence).
Unfortunately, the genome assemblers we have used repeatedly spit out contigs from other regions, not the 2 or 3 specific locations we are interested in.
I don't need to build a complete genome and general gap-filling strategies do not guarantee a contig build covering the regions of interest.
Is it possible to "force" a contig builder (Velvet, BBmap, A5, etc.) to build from seeded region(s) adjacent to a locus of interest? Say, maybe start 500 bases to the left or right, then let the contig grow across the mutant locus.
All we need is some independent way (not reference-influenced) of looking at the consensus contig that covers that particular locus to strengthen our variant lists before we start Sanger sequencing potential hits.
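To make the "seeded" idea concrete, here is a toy sketch of what I have in mind: count k-mers from the reads (both strands), start from a seed sequence next to the locus, and extend one base at a time for as long as exactly one next base is supported. This is only an illustration under my own assumptions, not how Velvet or any other real assembler works. The function names (`count_kmers`, `extend_right`) and the parameters `k`, `max_len` and `min_count` are made up for the example.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_kmers(reads, k):
    """Count every k-mer in the reads, on both strands."""
    counts = Counter()
    for read in reads:
        for strand in (read, revcomp(read)):
            for i in range(len(strand) - k + 1):
                counts[strand[i:i + k]] += 1
    return counts


def extend_right(seed, counts, k, max_len=2000, min_count=2):
    """Greedily grow `seed` to the right (hypothetical helper).

    Stops at a dead end, at a branch (more than one supported
    next base), when a k-mer repeats, or at `max_len`.
    """
    contig = seed
    seen = {seed[-k:]}
    while len(contig) < max_len:
        suffix = contig[-(k - 1):]
        # Next bases whose k-mer is seen at least `min_count` times.
        candidates = [b for b in "ACGT" if counts[suffix + b] >= min_count]
        if len(candidates) != 1:
            break  # dead end or ambiguous branch
        kmer = suffix + candidates[0]
        if kmer in seen:
            break  # avoid looping through a repeat
        seen.add(kmer)
        contig += candidates[0]
    return contig
```

In practice the seed would be something like the 500 bases flanking the locus, taken from the reference, and the same function could be run on the reverse complement to grow leftwards. The seed itself is still reference-derived, but every base added after it comes only from the reads.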
Thanks for any help.