01-09-2012, 09:01 AM   #1
Junior Member
Location: San Francisco

Join Date: Feb 2011
Posts: 6
Targeted Genome Assembly for region poorly represented in reference genome?


I am not sure if this is the best place to put this, but any help is appreciated.

I am working on identifying candidate genes for a mutation in a subtelomeric region of zebrafish. The Zv9 assembly is not great, and is particularly bad in this region. The region contains an improperly placed clone surrounded by WGS contigs. These contigs contain genes that are known to be deleted in one of the mutant alleles but are not causative. The reference here is so poor that my best bet is to attempt to rebuild it. I am already in contact with the people at Sanger about this, but I also have quite a bit of differential RNA-seq data, as well as WGS data from a different mutant allele (one that causes the same phenotype but was generated via ENU mutagenesis, and so is likely a point mutation).

Does anyone have any ideas for the best way to assemble this region in a targeted fashion? I have access to a fairly powerful computer with 32 GB of RAM, but this is obviously not enough to assemble the large amount of WGS data I have with Velvet. Even if it were, I feel that a Velvet assembly would give me at best 5 kb contigs, which would not be much more useful than what is already available. I have considered aligning reads to the region as it currently stands with Bowtie and then assembling only those reads, but the reference is so poor that I don't think this will be effective either (multiple genes we know to be deleted are not represented in Zv9 at all, are only partially represented, or are split far apart with opposite strandedness).
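
To make the align-then-assemble idea concrete, here is roughly the read-selection step I have in mind. This is only a minimal sketch, assuming the aligner output is in SAM format; the function names, the chromosome name and the coordinates are all hypothetical placeholders, not anything from my actual data.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def aligned_end(pos, cigar):
    """Return the 1-based inclusive reference end of an alignment."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
                  if op in REF_CONSUMING)
    # A missing CIGAR ("*") is treated as covering a single base.
    return pos + max(ref_len, 1) - 1


def read_names_in_region(sam_lines, chrom, start, end):
    """Collect names of reads whose alignment overlaps chrom:start-end."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        name, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, cigar = int(fields[3]), fields[5]
        if flag & 0x4 or rname != chrom:  # unmapped, or wrong sequence
            continue
        if pos <= end and aligned_end(pos, cigar) >= start:
            names.add(name)
    return names


# Hypothetical example records, not real data.
example = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t30\t50M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr1\t500\t30\t10M\t*\t0\t0\tACGT\tIIII",
]
selected = read_names_in_region(example, "chr1", 120, 200)
```

Collecting read names rather than alignment records means both mates of a pair can then be pulled from the original FASTQ files, even when only one of them aligned to the region. The weakness is the one described above: reads from sequence that is missing from the reference never align, so they are never selected.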

Thanks in advance for any assistance or advice.
gumbos