
  • Using PHRAP to assemble 454 contigs and Sanger reads

    Hi all,

    I'm a beginner in sequence assembly and have a question about using the assembler, PHRAP.

    I'm trying to use PHRAP to assemble 454 contigs (generated by MIRA) and Sanger reads. However, PHRAP requires its input to follow the St. Louis naming convention, and my 454 contigs clearly do not conform to this system since they are outputs from MIRA; also, they don't contain any info that allows me to convert their names to St. Louis names.

    Has anyone encountered a similar problem before? If so, how did you resolve the naming issue?

    Thanks.

  • #2
    Hi,

    - Naming schemes usually matter only when forward and reverse reads have to be related to each other (template name, insert size, etc.). This is probably not important for your contigs, which are unordered pieces of sequence of different sizes (MIRA doesn't do any scaffolding).

    - why don't you use MIRA for a hybrid 454/sanger assembly?

    - why don't you use Roche's newbler (2.5) for a hybrid 454/sanger assembly?
    (OK, you need to have access to Roche software).

    cheers,
    Sven



    • #3
      Hi Sven,

      I have already done a 454/Sanger hybrid assembly with the reads in MIRA. Now I'm trying to assemble the 454 contigs together with the Sanger reads to see how the result differs from the hybrid trial.

      Since MIRA does not output a TRACEINFO XML file for my 454 contigs, I don't have the required input for another trial in MIRA. That is why I chose PHRAP: it needs only the FASTA and FASTA quality files.

      But now I'm having problems with the naming. As you said, if the direction isn't important for my contigs, can I feed them into PHRAP without conforming to the naming convention? (The PHRAP manual seems to recommend against doing so.)



      • #4
        As you have all the input data, you will probably be more successful if you use all of it in a hybrid approach.

        Nevertheless, there is no need to use phrap: MIRA knows the switch "--notraceinfo" or, if you need to handle chemistries differently, "merge_xmltraceinfo=yes/no". Have a look at the docs accordingly.

        cheers,
        Sven



        • #5
          Originally posted by cleoho175
          I'm trying to use PHRAP to assemble 454 contigs (generated by MIRA) and Sanger reads.
          You may want to consider CAP3 or wgs-assembler / Arachne too.



          • #6
            Originally posted by sklages
            As you have all the input data, you will probably be more successful if you use all of it in a hybrid approach.

            Nevertheless, there is no need to use phrap: MIRA knows the switch "--notraceinfo" or, if you need to handle chemistries differently, "merge_xmltraceinfo=yes/no". Have a look at the docs accordingly.

            cheers,
            Sven
            I asked Bastien Chevreux (the developer of MIRA) about assembling my contigs in MIRA, and he advised against it because the 454 contigs may be too long for MIRA to handle. So I guess the question goes back to whether one can feed contigs into PHRAP without naming them according to the St. Louis convention.

            As you said before, it probably doesn't matter because my contigs are unordered (I agree with this). But since I don't know the consequences of ignoring the convention, I'm afraid that feeding the contigs into PHRAP will just crash the program :S



            • #7
              Originally posted by Torst
              You may want to consider CAP3 or wgs-assembler / Arachne too.
              Thanks for the suggestion!



              • #8
                Originally posted by cleoho175
                So I guess the question goes back to whether one can feed contigs into PHRAP without naming them according to the St. Louis convention. But since I don't know the consequences of ignoring the convention, I'm afraid that feeding the contigs into PHRAP will just crash the program :S
                I'm pretty sure you won't crash phrap if you disobey the convention. I've fed it arbitrary .fasta files before. Surely you can just do a simple test anyway to check?

                The Phrap manual "phrap.doc" describes the various St.Louis suffixes:

                "s" forward direction read on single stranded (SS) template, dye primer chemistry
                "f" forward read on double stranded (DS) template, dye primer chemistry
                "r" DS reverse read, dye primer chemistry
                "x" SS forward read, standard dye terminator chemistry
                "z" DS forward read, standard dye terminator chemistry
                "y" DS reverse read, standard dye terminator chemistry
                "i" SS forward read, big dye terminator chemistry
                "b" DS forward read, big dye terminator chemistry
                "g" DS reverse read, big dye terminator chemistry
                "t" for T7 (cDNAs)
                "p" for SP6 (cDNAs)
                "e" for T3 (cDNAs)
                "d" for special
                "c" consensus pieces
                "a" assembly pieces

                As you want to feed it existing contigs, just use ".c" or ".a" suffixes on your read IDs. That's what I've done in the past.
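
A minimal sketch of that renaming step, assuming the contigs are in plain FASTA with a matching quality file that uses the same ">" header lines. The function names and the example contig IDs are hypothetical, and nothing here is taken from the phrap or MIRA documentation. It appends ".c" to the first whitespace-delimited token of every header and leaves the sequence or quality lines untouched:

```python
def add_suffix(lines, suffix=".c"):
    """Yield lines, appending `suffix` to the ID on every '>' header line."""
    for line in lines:
        if not line.startswith(">"):
            yield line  # sequence or quality values: pass through unchanged
            continue
        header = line[1:].rstrip("\n")
        parts = header.split(None, 1)  # [ID] or [ID, description]
        if not parts:
            yield line  # empty header, nothing to rename
            continue
        read_id = parts[0]
        description = parts[1] if len(parts) > 1 else ""
        if not read_id.endswith(suffix):  # avoid adding the suffix twice
            read_id += suffix
        new_header = ">" + read_id
        if description:
            new_header += " " + description
        yield new_header + "\n"


def rename_file(src_path, dst_path, suffix=".c"):
    """Write a copy of a FASTA (or FASTA quality) file with renamed IDs."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(add_suffix(src, suffix))


if __name__ == "__main__":
    # Hypothetical contig names, used only to show the effect.
    demo = [">contig_1 len=500\n", "ACGT\n", ">contig_2\n", "GGCC\n"]
    print("".join(add_suffix(demo)), end="")
```

Run rename_file once on the contig FASTA and once on its quality file so the IDs in both stay identical, then combine the renamed contigs with the Sanger reads as usual.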



                • #9
                  I'll give it a try. Thanks so much for the help! I'll report back the results.
