We have a 454 Titanium run of ~50 pooled BACs. The BACs are not bar-coded and the reads are not paired-end. Two clonal lines. The genome was previously mostly unsequenced and is undoubtedly repetitive. The BACs could overlap.
I am having trouble assembling the BACs. Newbler runs but then hangs in the 'deconvoluting' step. The TIGR EST clustering pipeline (I figured this was like an EST program, only with bigger "ESTs") is throwing most of the reads into one contig, even after masking out vector, adapters, etc. Ideally, of course, one would like to see 50 or so read clusters, one per BAC, each of which could then be assembled.
Does anyone have any papers to read or ideas on how to extract these BACs from the 350 Mbase dataset? I guess that basically I need a good clustering method. After that the assembly itself should be simple.
Thanks,
-- Rick