SEQanswers

Old 12-06-2010, 11:37 AM   #1
qqsmallfrog
Junior Member
 
Location: USA

Join Date: Dec 2010
Posts: 1
Default Scaffolding suggestion?

Hello:

I'm assembling a genomic region of about 10 Mb, using data from several platforms. Here are the types of data I have:
1. Some Sanger sequences of BAC ends and target genes
2. Single end 454 reads
3. Single end 50 bp Solexa reads
4. Paired end 74 bp Solexa reads
My current plan is to assemble these data separately into 4 pools of contigs. I would then like to assemble the 4 pools of contigs together and scaffold the result (with the PE Solexa data). There are two strategies:
A. Assemble the contigs first (with CAP3 or similar), then use the Solexa PE reads to help scaffold the final long contigs.
B. Assemble the contigs together with all the Solexa PE reads in software like MIRA, so that the scaffolding is done automatically within MIRA.
Does anyone have an idea which one is better? For strategy A to work, I assume I would need to map the Solexa PE reads to the contigs (with software like BWA) and use the mapping information for scaffolding. Does anyone know of scaffolding software that can handle this?

Thanks,
Cheng-Ruei Lee
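
To make the idea behind strategy A concrete, here is a minimal Python sketch of the first step a scaffolder might take: counting read pairs whose two mates map to different contigs. It assumes the mapping output has already been parsed into (contig of mate 1, contig of mate 2) tuples; the function name, the input layout, and the `min_links` threshold are hypothetical and not taken from any particular tool. A real scaffolder would also use mate orientation and insert size, which this sketch ignores.

```python
from collections import Counter


def count_contig_links(pair_hits, min_links=3):
    """Count read pairs whose two mates map to different contigs.

    pair_hits: iterable of (contig_of_mate1, contig_of_mate2) tuples,
               assumed to be parsed from the read mapping (parsing not shown).
               None stands for an unmapped mate.
    min_links: hypothetical threshold; contig pairs supported by fewer
               read pairs are dropped as likely noise.
    """
    links = Counter()
    for c1, c2 in pair_hits:
        if c1 is None or c2 is None or c1 == c2:
            continue  # unmapped mate, or both mates inside one contig
        # Sort the names so (A, B) and (B, A) count as the same link.
        links[tuple(sorted((c1, c2)))] += 1
    return {pair: n for pair, n in links.items() if n >= min_links}


# Made-up example data, for illustration only.
example = (
    [("ctgA", "ctgB")] * 4
    + [("ctgB", "ctgA")]
    + [("ctgA", "ctgA")]
    + [("ctgB", "ctgC")]
)
print(count_contig_links(example))
```

Contig pairs that survive the threshold are candidates for being neighbours in a scaffold; deciding their order, orientation, and gap size is the harder part and is left to dedicated software.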
Old 12-06-2010, 11:23 PM   #2
flxlex
Moderator
 
Location: Oslo, Norway

Join Date: Nov 2008
Posts: 415
Default

If you have a close enough reference genome, you could run different assemblies and 'merge' them using MAIA: http://bioinformatics.oxfordjournals...6/18/i433.full. I haven't used it myself, but it looks very promising! Not what you asked for, but just another idea...
Old 12-10-2010, 11:49 PM   #3
huma Asif
Member
 
Location: Japan

Join Date: Oct 2010
Posts: 53
Question 165 scaffold

Dear All,
I have sequenced a bacterial genome using Solexa
and am currently working on the assembly.
I have assembled it using SOAPdenovo and got 164 scaffolds.
I am now unsure what to do with the scaffolds. Should I annotate the data I have, or try to improve the scaffolds using another assembler?
Please help.
Old 12-11-2010, 06:47 AM   #4
jjohnson
Member
 
Location: Washington DC Metro Area

Join Date: Aug 2009
Posts: 20
Default

You could also try running the Celera assembler, which has a built in scaffolder and supports all of the data types you mention. http://j.mp/h7uX9i

It can have a pretty steep learning curve, but I have found it produces spectacular results. There is excellent help and how-to documentation, and the team supporting it at the Venter Institute and University of Maryland CBCB is always willing to help out.
__________________
Justin H. Johnson | Twitter: @BioInfo | LinkedIn: http://bit.ly/LIJHJ | EdgeBio
Old 12-11-2010, 06:49 AM   #5
jjohnson
Member
 
Location: Washington DC Metro Area

Join Date: Aug 2009
Posts: 20
Default

Quote:
Originally Posted by huma Asif View Post
Dear All,
I have sequenced a bacterial genome using Solexa
and am currently working on the assembly.
I have assembled it using SOAPdenovo and got 164 scaffolds.
I am now unsure what to do with the scaffolds. Should I annotate the data I have, or try to improve the scaffolds using another assembler?
Please help.
Asif,

This decision depends entirely on what the project dictates. You could try another assembler, like Celera, and see whether it fills in gaps or produces a better assembly. If you want to annotate the genome, then scaffolding is not the only important metric. You should check what your average or N50 contig size is; if it is small, then producing a good de novo annotation will be hard.
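
For anyone unsure how the N50 mentioned above is derived, here is a small Python sketch. It uses the common definition (the length L such that contigs of length L or longer contain at least half of all assembled bases); the function name and the example lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of contig (or scaffold) lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Made-up contig lengths, for illustration only.
contig_lengths = [100, 80, 60, 40, 20]
print(n50(contig_lengths))  # 100 + 80 = 180 >= 150, so this prints 80
```

The same function works for scaffold lengths; just be clear about which of the two you are reporting.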
__________________
Justin H. Johnson | Twitter: @BioInfo | LinkedIn: http://bit.ly/LIJHJ | EdgeBio
Old 12-11-2010, 05:45 PM   #6
huma Asif
Member
 
Location: Japan

Join Date: Oct 2010
Posts: 53
Default N50=70234

Thank you for your reply.
The N50 of my assembly is 70234. My project only requires assembling the data that I got from Illumina and identifying plant pathogenicity genes.
If you think this N50 is not bad, please suggest an online bacterial genome annotation tool. I have tried Glimmer and got some ORFs in the output. I want to check what they are and whether they are complete.
I have expertise in chloroplast genome resequencing projects and have only recently started working on bacterial genomics and de novo assembly, so I am confused about how to generate a complete sequence from scaffolds. The bacterial genome that I assembled with SOAPdenovo shows 9942 gaps. As far as I understand, I need to fill these gaps to get a complete genome. At present I am not interested in completing the genome, so my idea is to make a rough map of this bacterium, with gaps, and see how many genes are covered and what they code for.
Please suggest annotation tools I could start with.
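
As a side note on the gap count above: if gaps are taken to be runs of N in the scaffold sequences (an assumption; the assembler may define or report them differently), they can be tallied with a few lines of Python. The function name and the `min_gap` parameter are hypothetical, and the sequences are assumed to be already loaded as plain strings.

```python
import re


def count_gaps(scaffold_seqs, min_gap=1):
    """Count runs of N/n of at least min_gap bases in scaffold sequences.

    Returns (number_of_gaps, total_gap_bases).
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_gap)
    n_gaps = 0
    gap_bases = 0
    for seq in scaffold_seqs:
        for match in pattern.finditer(seq):
            n_gaps += 1
            gap_bases += len(match.group())
    return n_gaps, gap_bases


# Made-up sequences, for illustration only.
print(count_gaps(["ACGTNNNNACGTNNAC", "ACGT", "NNNACGT"]))  # (3, 9)
```

Raising `min_gap` filters out isolated ambiguous bases, which can otherwise inflate the count.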
Old 01-18-2011, 01:11 AM   #7
Mona
Member
 
Location: Uppsala

Join Date: Feb 2010
Posts: 27
Default

Hello Huma,
Have you tried to BLAST the ORFs that you got, to check what they could be?
Old 01-18-2011, 01:47 AM   #8
huma Asif
Member
 
Location: Japan

Join Date: Oct 2010
Posts: 53
Default Yes

Yes, I have checked these ORFs.
I have now overcome many problems with the help of this excellent forum.
I assembled my genome again using the suggestions I got here.
I have annotated it, checked the evolutionary genes, and found that the species I am working on is Pseudomonas putida. As far as I know it is not a plant pathogen, but it has some virulence genes. These days I am looking for papers on Pseudomonas putida and its role in biofilm formation.
I would be obliged for any information about this organism.
Regards
Old 04-22-2012, 11:30 PM   #9
waterboy
Member
 
Location: India

Join Date: Oct 2010
Posts: 14
Default How to merge multiple scaffold files?

Hello All,
I have two scaffold sequence files obtained from assemblies of SOLiD MP and 454 PE reads. I would like to build super-scaffolds from these two files using the 454 paired-end information (20 kb). Please suggest any pipeline or software for this purpose.
Old 04-01-2013, 01:18 AM   #10
sivasubramani
Member
 
Location: India

Join Date: Apr 2011
Posts: 14
Default

Hello waterboy,

Which assembly tool did you use to assemble the SOLiD MP data?

Thanks,
Old 04-18-2013, 07:31 PM   #11
iaia
Junior Member
 
Location: Georgia

Join Date: Apr 2013
Posts: 2
Default

Dear all,
I have sequence data in 4 contigs. Could you please tell me which program to use to get one FASTA? I have no experience in this field, so please help me.
Thank you in advance
Old 04-20-2013, 08:47 AM   #12
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747
Default

Quote:
Originally Posted by iaia View Post
Dear all,
I have sequence data in 4 contigs. Could you please tell me which program to use to get one FASTA? I have no experience in this field, so please help me.
There's no magical solution; it will depend on the data you have & the genome you are studying. The obvious question is how many contigs do you expect to have when you are complete and why? What is the nature of the contigs and how did you generate them?
Old 04-29-2013, 02:09 AM   #13
Mona
Member
 
Location: Uppsala

Join Date: Feb 2010
Posts: 27
Default

Hi iaia,

Do you just want to merge them simply, or do you want proper scaffolding based on the sequence, i.e. which contig should come first and which later?
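
To show what the "simple merge" option would amount to, here is a Python sketch that concatenates contigs in a user-supplied order, separated by runs of N, and formats the result as a single FASTA record. The function names, the spacer length, and the example sequences are all hypothetical. Note that this does not work out the order or orientation of the contigs; that information has to come from elsewhere (a reference genome, paired reads, and so on).

```python
def join_contigs(contigs, order, gap_len=100):
    """Concatenate contigs in the given order, separated by runs of N.

    contigs: dict mapping contig name -> sequence string
    order:   list of contig names, in the order chosen by the user
    gap_len: arbitrary spacer length; it does not reflect real gap size
    """
    spacer = "N" * gap_len
    return spacer.join(contigs[name] for name in order)


def to_fasta(name, seq, width=60):
    """Format one sequence as a FASTA record with wrapped lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Made-up contigs, for illustration only.
contigs = {"c1": "ACGT", "c2": "GGCC"}
merged = join_contigs(contigs, ["c2", "c1"], gap_len=3)
print(to_fasta("merged", merged))
```

A record built this way is only a convenience container; the N spacers should not be read as measured gap sizes.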
Old 05-29-2013, 07:50 PM   #14
iaia
Junior Member
 
Location: Georgia

Join Date: Apr 2013
Posts: 2
Default

Thank you for your reply,
Yes, I wanted to scaffold them based on the sequence...
All times are GMT -8.