SEQanswers

01-21-2009, 02:20 PM   #1
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104
Assembling pooled BACs from 454 data

We have a 454 titanium run of ~50 pooled BACs. Not bar-coded. Not paired-end. Two clonal lines. Previously mostly unsequenced genome. Genome is undoubtedly repetitive. BACs could overlap.

I am having trouble assembling the BACs. Newbler runs but then hangs in the 'deconvoluting' step. The TIGR EST clustering pipeline -- hey, I figured this was like an EST problem, only with bigger "ESTs" -- is throwing most of the reads into one contig, even after masking out vector, adapters, etc. Of course, ideally one would like to see 50 or so contigs, which could then be assembled.

Does anyone have any papers to read or ideas on how to extract these BACs from the 350 Mbase dataset? I guess that basically I need a good clustering method. After that the assembly itself should be simple.

Thanks,
-- Rick
01-22-2009, 12:31 AM   #2
mgenome
Junior Member
 
Location: Korea

Join Date: Sep 2008
Posts: 4

How about dividing sequences into small groups?

After assembling each group and gathering the contigs, you can assemble all of the contigs together one more time.
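A minimal sketch of the splitting step, assuming the reads have already been exported from the .sff file to FASTA. The function names and the round-robin scheme are illustrative choices, not something from an assembler. Since the pool is not bar-coded, each group will contain reads from every BAC, just at lower coverage.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_into_groups(records, n_groups):
    """Deal records round-robin into n_groups lists (hypothetical helper)."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    groups = [[] for _ in range(n_groups)]
    for i, record in enumerate(records):
        groups[i % n_groups].append(record)
    return groups


if __name__ == "__main__":
    demo = [">r1", "ACGT", ">r2", "GG", "TT", ">r3", "A"]
    for n, group in enumerate(split_into_groups(read_fasta(demo), 2)):
        print(n, [name for name, _ in group])
```

Each group would then be written to its own file and assembled separately, and the resulting contigs pooled for the final assembly pass.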

Using more stringent criteria, such as higher homology and longer minimum overlaps, could be another approach.
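To make the two knobs concrete, here is a toy sketch of an overlap test with a minimum overlap length and a minimum identity. It is not how Newbler or the TIGR pipeline scores overlaps; the function name, the default thresholds, and the ungapped comparison are all assumptions made for illustration. Real 454 reads have homopolymer indels, so a real assembler would use gapped alignment.

```python
def best_overlap(a, b, min_overlap=40, min_identity=0.95):
    """Return (length, identity) for the longest ungapped overlap between
    the end of read a and the start of read b that meets both thresholds,
    or None if no such overlap exists. Illustrative only."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    # Try the longest candidate overlap first and shrink from there.
    for length in range(min(len(a), len(b)), min_overlap - 1, -1):
        matches = sum(x == y for x, y in zip(a[-length:], b[:length]))
        identity = matches / length
        if identity >= min_identity:
            return length, identity
    return None


if __name__ == "__main__":
    a = "AAAACGTACGT"
    b = "CGTACGTTTTT"
    print(best_overlap(a, b, min_overlap=5, min_identity=1.0))
    print(best_overlap(a, b, min_overlap=8, min_identity=1.0))
```

Raising either threshold rejects more candidate joins, which is the effect wanted when repeats are pulling unrelated reads into one contig.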
01-22-2009, 06:23 AM   #3
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Quote:
Originally Posted by mgenome
How about dividing sequences into small groups?

After assembling each group and gathering the contigs, you can assemble all of the contigs together one more time.

A good idea and one that I will try. If nothing else, I might get to the repetitive parts of the BACs.

Quote:
Using more stringent criteria, such as higher homology and longer minimum overlaps, could be another approach.

Yes. I was running several of these clustering jobs last night, only to come back to work this morning and find that I was over my disk quota and that my programs had crashed in mysterious ways. Who would have guessed that 250 GB would not be enough space?