SEQanswers > Bioinformatics > Bioinformatics

01-09-2012, 09:01 AM   #1
Junior Member
Location: San Francisco

Join Date: Feb 2011
Posts: 6
Targeted Genome Assembly for region poorly represented in reference genome?


I am not sure if this is the best place to put this, but any help is appreciated.

I am working on identifying candidate genes for a mutation in a subtelomeric region of zebrafish. The Zv9 assembly is not great, and it is particularly bad in this region. The region contains an improperly placed clone surrounded by WGS contigs, which contain genes that are known to be deleted in one of the mutant alleles but are not causative. The reference here is so poor that my best bet is to attempt to rebuild it. I am already in contact with the people at Sanger about this, but I also have quite a bit of differential RNA-seq data, as well as WGS data from a different mutant allele (one that causes the same phenotype but was generated by ENU mutagenesis, so it is likely a point mutation).

Does anyone have ideas for the best way to assemble this region in a targeted fashion? I have access to a fairly powerful computer with 32 GB of RAM, but that is obviously not enough to assemble the large amount of WGS data I have with Velvet. Even then, I feel that a Velvet assembly would give me at best 5 kb contigs, which would not be much more useful than what is already available. I have considered aligning reads to the region as it is currently known with Bowtie and then assembling only those reads, but the reference is so poor that I don't think this will be effective either (multiple genes we know to be deleted are not represented in Zv9 at all, are only partially represented, or are split far apart with opposite strandedness).

Thanks in advance for any assistance or advice.
gumbos
01-09-2012, 04:01 PM   #2
Senior Member
Location: Boston area

Join Date: Nov 2007
Posts: 747

Yuck! You certainly have an ugly situation.

I think if I were in your shoes I would try as many different approaches as possible to identify paired-end reads where at least one read of the pair can be mapped to the region (against the existing assembly, against RNA-Seq data that you think comes from the region, etc.), then try assembling those with Velvet or another assembler. Then use that assembly to identify additional paired-end reads that map to it, and repeat. Keep cycling until things don't seem to be getting better.
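To make the cycle concrete, here is a rough toy sketch of the recruit-and-repeat loop in Python. It is only an illustration under some loud assumptions: sharing an exact k-mer stands in for "can be mapped", the growing k-mer set stands in for the re-assembled region (no real mapper or assembler is called), and reverse complements are ignored. The names `kmers`, `recruit_pairs`, and the tiny default k are made up for this example.

```python
from typing import Dict, Iterable, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_pairs(
    pairs: Dict[str, Tuple[str, str]],
    seeds: Iterable[str],
    k: int = 4,
    max_rounds: int = 10,
) -> Set[str]:
    """Iteratively recruit read pairs that touch a growing target region.

    pairs: pair name -> (mate 1 sequence, mate 2 sequence)
    seeds: sequences believed to lie in the region (e.g. existing contigs)
    A pair is recruited when at least one mate shares a k-mer with the
    target set. ASSUMPTION: k-mer sharing is a stand-in for real mapping.
    """
    # Start the target from the seed sequences only.
    target: Set[str] = set()
    for seed in seeds:
        target |= kmers(seed, k)

    recruited: Set[str] = set()
    for _ in range(max_rounds):
        # Find pairs not yet recruited where either mate hits the target.
        new = {
            name
            for name, (r1, r2) in pairs.items()
            if name not in recruited
            and (kmers(r1, k) & target or kmers(r2, k) & target)
        }
        if not new:
            break  # nothing improved this round, so stop cycling
        recruited |= new
        # Stand-in for "re-assemble": add BOTH mates of each new pair, so the
        # unmapped mate can pull in reads beyond the known sequence next round.
        for name in new:
            r1, r2 = pairs[name]
            target |= kmers(r1, k) | kmers(r2, k)
    return recruited
```

The point of adding both mates is that the mate which does not hit the known sequence is exactly what extends the region into unrepresented territory. In practice each round would be a real mapping step followed by a real assembly, and you would want a cap like `max_rounds` because repeats can drag in reads from elsewhere in the genome.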

Is there a publicly available BAC or cosmid library for zebrafish? I doubt you'll get anything like what you want without some long-range sequence information, and pulling out a big clone for the region could be a bunch of work, but it is one of the more obvious ways to go about it. Alternatively, have you tried making mate-pair libraries for the whole genome?
krobison
