SEQanswers

Old 11-21-2010, 04:30 PM   #1
cleoho175
Junior Member
 
Location: Canada

Join Date: Nov 2010
Posts: 5
Using PHRAP to assemble 454 contigs and Sanger reads

Hi all,

I'm a beginner in sequence assembly and have a question about using the assembler, PHRAP.

I'm trying to use PHRAP to assemble 454 contigs (generated by MIRA) and Sanger reads. However, PHRAP requires its input to follow the St. Louis naming convention, and my 454 contigs clearly do not conform to this system since they are outputs from MIRA; also, they don't contain any info that allows me to convert their names to St. Louis names.

Has anyone encountered a similar problem before? If so, how did you resolve a naming issue like this?

Thanks.
Old 11-22-2010, 02:58 AM   #2
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

Hi,

- Naming schemes usually matter when you need to relate forward and reverse reads to each other (template name, insert size, etc.). This is probably not important for your contigs. They are unordered pieces of sequence of different sizes (MIRA doesn't do any scaffolding) ...

- why don't you use MIRA for a hybrid 454/sanger assembly?

- why don't you use Roche's newbler (2.5) for a hybrid 454/sanger assembly?
(OK, you need to have access to Roche software).

cheers,
Sven
Old 11-22-2010, 06:23 AM   #3
cleoho175
Junior Member
 
Location: Canada

Join Date: Nov 2010
Posts: 5

Hi Sven,

I have done a 454/Sanger hybrid assembly with the reads in MIRA. Now I'm trying to assemble the 454 contigs and Sanger reads to see how the result differs from the hybrid trial.

And since MIRA does not output a TRACEINFO XML file for my 454 contigs, I don't have the required input for another trial in MIRA; this is why I chose PHRAP, because it needs only the FASTA and FASTA quality files.

But now I'm having problems with the naming... So, as you said, if the direction isn't important for my contigs, can I input my contigs into PHRAP without conforming to the naming convention? (The PHRAP manual doesn't seem to recommend doing so?)
Old 11-22-2010, 07:21 AM   #4
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

As you have all the input data, you will probably be more successful if you use all of it in a hybrid approach.

Nevertheless, there is no need to use phrap: MIRA knows the switch "--notraceinfo" or, if you need to handle chemistries differently, "merge_xmltraceinfo=yes/no". Have a look at the docs accordingly.

cheers,
Sven
Old 11-22-2010, 06:53 PM   #5
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275

Quote:
Originally Posted by cleoho175
I'm trying to use PHRAP to assemble 454 contigs (generated by MIRA) and Sanger reads.
You may want to consider CAP3 or wgs-assembler / Arachne too.
Old 11-23-2010, 01:08 PM   #6
cleoho175
Junior Member
 
Location: Canada

Join Date: Nov 2010
Posts: 5

Quote:
Originally Posted by sklages
As you have all the input data, you will probably be more successful if you use all of it in a hybrid approach.

Nevertheless, there is no need to use phrap: MIRA knows the switch "--notraceinfo" or, if you need to handle chemistries differently, "merge_xmltraceinfo=yes/no". Have a look at the docs accordingly.

cheers,
Sven
I asked Bastien Chevreux (the developer of MIRA) about assembling my contigs in MIRA, and he advised against it because the 454 contigs may be too long for MIRA to handle. So I guess the question goes back to whether one can input contigs into PHRAP without naming them according to the St. Louis convention.

As you said before, it probably doesn't matter because my contigs are unordered (I agree with that view). But since I don't know the consequences of disobeying the convention, I'm afraid that feeding the contigs into PHRAP will just crash the program completely :S
Old 11-23-2010, 01:09 PM   #7
cleoho175
Junior Member
 
Location: Canada

Join Date: Nov 2010
Posts: 5

Quote:
Originally Posted by Torst
You may want to consider CAP3 or wgs-assembler / Arachne too.
Thanks for the suggestion!
Old 11-23-2010, 03:05 PM   #8
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275

Quote:
Originally Posted by cleoho175
So I guess the question goes back to whether one can input contigs into PHRAP without naming them according to the St. Louis convention. But since I don't know the consequences of disobeying the convention, I'm afraid that feeding the contigs into PHRAP will just crash the program completely :S
I'm pretty sure you won't crash phrap if you disobey the convention. I've fed it arbitrary .fasta files before. Surely you can just do a simple test anyway to check?

The Phrap manual "phrap.doc" describes the various St. Louis suffixes:

"s" forward direction read on single stranded (SS) template, dye primer chemistry
"f" forward read on double stranded (DS) template, dye primer chemistry
"r" DS reverse read, dye primer chemistry
"x" SS forward read, standard dye terminator chemistry
"z" DS forward read, standard dye terminator chemistry
"y" DS reverse read, standard dye terminator chemistry
"i" SS forward read, big dye terminator chemistry
"b" DS forward read, big dye terminator chemistry
"g" DS reverse read, big dye terminator chemistry
"t" for T7 (cDNAs)
"p" for SP6 (cDNAs)
"e" for T3 (cDNAs)
"d" for special
"c" consensus pieces
"a" assembly pieces
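
For illustration only, the suffix list above could be kept as a small lookup table. This is a rough sketch, not anything phrap ships: `SUFFIX_MEANING` and `describe_read` are made-up names, and it assumes a read name has the form template, then the first ".", then the suffix letter (possibly followed by more characters). Check phrap.doc for the exact rules.

```python
# Lookup of the St. Louis suffix letters listed above (hypothetical helper).
SUFFIX_MEANING = {
    "s": "SS forward read, dye primer chemistry",
    "f": "DS forward read, dye primer chemistry",
    "r": "DS reverse read, dye primer chemistry",
    "x": "SS forward read, standard dye terminator chemistry",
    "z": "DS forward read, standard dye terminator chemistry",
    "y": "DS reverse read, standard dye terminator chemistry",
    "i": "SS forward read, big dye terminator chemistry",
    "b": "DS forward read, big dye terminator chemistry",
    "g": "DS reverse read, big dye terminator chemistry",
    "t": "T7 (cDNAs)",
    "p": "SP6 (cDNAs)",
    "e": "T3 (cDNAs)",
    "d": "special",
    "c": "consensus piece",
    "a": "assembly piece",
}


def describe_read(name):
    """Split a read name at its first '.' and look up the suffix letter.

    Returns (template, letter, meaning), or None when the name has no
    '.' followed by at least one character. The 'first dot' rule is an
    assumption made for this sketch.
    """
    template, sep, tail = name.partition(".")
    if not sep or not tail:
        return None
    letter = tail[0]
    return template, letter, SUFFIX_MEANING.get(letter, "unknown suffix")
```

A name without a dot, such as a raw contig name from another assembler, simply returns None here, which is a quick way to spot IDs that do not follow the convention.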

As you want to feed it existing contigs, just use ".c" or ".a" suffixes on your read IDs. That's what I've done in the past.
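
Something like the following could do the renaming. It is only a sketch under assumptions: `add_suffix` and `rename_file` are hypothetical helpers, the file names in the usage note are placeholders, and it assumes the ID is the first whitespace-delimited word after ">" in both the FASTA file and its quality file.

```python
def add_suffix(header, suffix="c"):
    """Return a FASTA header whose ID ends in '.<suffix>'.

    Anything after the first whitespace (the description) is kept.
    IDs that already end in the suffix are left alone.
    """
    body = header[1:].rstrip("\n")
    parts = body.split(None, 1)
    if not parts:  # bare '>' with no ID: nothing to rename
        return ">"
    seq_id = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    if not seq_id.endswith("." + suffix):
        seq_id = seq_id + "." + suffix
    return ">" + seq_id + (" " + rest if rest else "")


def rename_file(src_path, dst_path, suffix="c"):
    """Copy a FASTA or quality file, adding the suffix to every header."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                line = add_suffix(line, suffix) + "\n"
            dst.write(line)
```

Run `rename_file` on both the sequence file and its quality file (for example `rename_file("contigs.fasta", "contigs_named.fasta")` and the same for the matching quality file) so the IDs stay in sync between the two.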
Old 11-24-2010, 11:45 AM   #9
cleoho175
Junior Member
 
Location: Canada

Join Date: Nov 2010
Posts: 5

I'll give it a try. Thanks so much for the help! I'll report back the results.

Tags
454 assembly, hybrid assembly, phrap, sanger, st. louis naming
