
  • Fasta to Ace conversion

    Is there a program to convert a FASTA file to an ACE assembly file? While googling I came across references to fasta2ace.pl, but not the program itself.

    Thanks.
    Farhat Habib

  • #2
    I am looking for exactly the same tool (FASTA to ACE), but have not found one yet.
    If it can use quality values, even better.

    The ACE file can then be used by EagleView to visualize reads on the reference.
    --
    bioinfosm



    • #3
      I am looking for it for EagleView as well.

      -Farhat
      Farhat Habib



      • #4
        Originally posted by Farhat
        Is there a program to convert a FASTA file to an ACE assembly file?
        Can you be a bit more precise on what you require?
        A FASTA file is just a bunch of sequences with an ID and a description.
        What form do you want the ACE file to take?



        • #5
          Farhat,

          I don't think it is possible to do what you are asking. FASTA files only contain ID/definition line(s) followed by sequence line(s). You may also have an accompanying quality score file. An ACE file contains much more information than this. For each contig (an ACE file may include more than one contig) it will contain the gapped sequence and quality scores, the gapped sequences of the constituent reads as well as offset information indicating where each of the constituent reads is located on the contig (reference). This information does not exist in the FASTA files so it would be impossible to construct a meaningful ACE file.
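To make the gap concrete, here is a minimal sketch of what can be read out of a FASTA file. The `FastaRecord` and `parse_fasta` names are made up for this illustration. Each record yields an ID, a description and a sequence, and nothing that says where a read sits on a contig.

```python
from typing import Iterator, NamedTuple


class FastaRecord(NamedTuple):
    """Everything a FASTA entry carries (hypothetical helper type)."""
    seq_id: str
    description: str
    sequence: str


def parse_fasta(text: str) -> Iterator[FastaRecord]:
    """Yield one record per '>' header; wrapped sequence lines are joined."""
    seq_id, description, chunks = None, "", []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield FastaRecord(seq_id, description, "".join(chunks))
            header = line[1:].split(None, 1)
            seq_id = header[0] if header else ""
            description = header[1] if len(header) > 1 else ""
            chunks = []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield FastaRecord(seq_id, description, "".join(chunks))


example = ">read1 first read\nACGT\nACGT\n>read2\nTTGA\n"
records = list(parse_fasta(example))
```

The record type has three fields only. Contig membership, padded alignment and offsets would have to come from an assembler or an aligner.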



          • #6
            Thanks for the replies. Yes, I realize the FASTA file by itself doesn't have enough information to construct the ACE file. I wrote my own script that takes a FASTA file, a FASTQ quality file, and the output from the SOAP or ELAND aligner and converts them to ACE, which does work with EagleView.
            Farhat Habib
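A conversion along the lines described in #6 might look roughly like the sketch below. This is not Farhat's script, which was not posted in the thread. It assumes the aligner output has already been reduced to ungapped placements (read name, oriented sequence, 1-based start, strand), uses the reference as the consensus, and fills the consensus BQ block with a placeholder quality. `PlacedRead`, `wrap` and `to_ace` are hypothetical names. Base segment (BS) lines and the per-read DS line found in phrap-style files are left out, and whether a particular viewer accepts that is untested.

```python
from typing import List, NamedTuple


class PlacedRead(NamedTuple):
    """One aligned read (hypothetical type; ungapped alignment assumed)."""
    name: str
    sequence: str       # already oriented as it aligns to the reference
    start: int          # 1-based start position on the reference
    complemented: bool  # True if the read aligned to the reverse strand


def wrap(text: str, width: int = 60) -> List[str]:
    """Split a sequence into fixed-width lines."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def to_ace(contig_name: str, consensus: str, reads: List[PlacedRead],
           consensus_quality: int = 20) -> str:
    """Build a single-contig ACE text from a reference and placed reads."""
    lines = [f"AS 1 {len(reads)}", ""]
    # CO <name> <bases> <reads> <base segments> <U|C>; no BS lines are written.
    lines.append(f"CO {contig_name} {len(consensus)} {len(reads)} 0 U")
    lines.extend(wrap(consensus))
    lines.append("")
    # BQ holds one quality per unpadded consensus base (placeholder value here).
    lines.append("BQ")
    quals = [str(consensus_quality)] * len(consensus)
    for i in range(0, len(quals), 20):
        lines.append(" ".join(quals[i:i + 20]))
    lines.append("")
    # AF lines carry the offset information that FASTA alone does not have.
    for read in reads:
        strand = "C" if read.complemented else "U"
        lines.append(f"AF {read.name} {strand} {read.start}")
    lines.append("")
    for read in reads:
        lines.append(f"RD {read.name} {len(read.sequence)} 0 0")
        lines.extend(wrap(read.sequence))
        lines.append("")
        # QA: quality and alignment clipping, here the whole read.
        lines.append(f"QA 1 {len(read.sequence)} 1 {len(read.sequence)}")
        lines.append("")
    return "\n".join(lines)


reads = [
    PlacedRead("r1", "ACGTA", 1, False),
    PlacedRead("r2", "GTAC", 7, True),
]
ace_text = to_ace("Contig1", "ACGTACGTAC", reads)
print(ace_text)
```

Read qualities from the FASTQ file are not written anywhere in this sketch; in the phrap/consed world they usually live in separate PHD files.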



            • #7
              Originally posted by Farhat
              Thanks for the replies. Yes, I realize the FASTA file by itself doesn't have enough information to construct the ACE file. I wrote my own script that takes a FASTA file, a FASTQ quality file, and the output from the SOAP or ELAND aligner and converts them to ACE, which does work with EagleView.
              That's great!
              I started writing a script of my own, but then got on to other things.

              Farhat - is it possible for you to share the script for format conversion?
              --
              bioinfosm



              • #8
                Originally posted by bioinfosm
                That's great!
                I started writing a script of my own, but then got on to other things.

                Farhat - is it possible for you to share the script for format conversion?
                Yes, but it is not very mature and has limitations. It works fine with EagleView, but there seem to be issues making it work with pbShort. If you want it, PM me with your email.

                -F
                Farhat Habib



                • #9
                  I'm looking for exactly the same thing for EagleView too!
                  Would you mind sharing your script with me? I'll send you a message shortly. Thanks!
                  Jia


                  Originally posted by Farhat
                  Yes, but it is not very mature and has limitations. It works fine with EagleView, but there seem to be issues making it work with pbShort. If you want it, PM me with your email.

                  -F



                  • #10
                    Can I have a look at your script?
                    Thanks

                    nico l'allias



                    • #11
                      This still makes no sense.

                      ACE is an assembly output, while FASTA is just a bunch of sequences with no assembly information. Are you asking for advice on what assembler to use? This will obviously depend a lot on the type of data and whether you want a de novo or mapped assembly.

                      James

                      PS. Contrary to above, I don't believe ACE supports quality values. At least I've never seen any - instead the authors of ace preferred to store qualities in "phd" files (in possibly the most inefficient format known to man). I'd love to be wrong on this though as it'll make my life easier. :-)



                      • #12
                        Originally posted by jkbonfield
                        PS. Contrary to above, I don't believe ACE supports quality values. At least I've never seen any - instead the authors of ace preferred to store qualities in "phd" files (in possibly the most inefficient format known to man). I'd love to be wrong on this though as it'll make my life easier. :-)
                        You can store PHRED qualities for a contig in an ACE file on BQ lines. I don't think the quality scores of the reads themselves are stored, which is probably what you meant.

                        P.S. The MIRA assembly format (MAF, which is a bit like ACE) stores both, using a FASTQ-like encoding, which is much more space efficient:

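As a rough illustration of the space argument in #12, the sketch below writes the same PHRED scores once as space-separated integers (the style of ACE BQ lines) and once as one character per base. The offset of 33 is the Sanger FASTQ convention and is an assumption here; the thread only says MAF uses a "FASTQ like" encoding. The function names are made up for the example.

```python
from typing import List


def phred_to_fastq(quals: List[int], offset: int = 33) -> str:
    """Encode each PHRED score as a single ASCII character."""
    return "".join(chr(q + offset) for q in quals)


def fastq_to_phred(encoded: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ-style quality string back to integer scores."""
    return [ord(c) - offset for c in encoded]


quals = [40, 40, 38, 30, 20, 10, 2]

# Space-separated integers, as on BQ lines.
as_numbers = " ".join(str(q) for q in quals)

# One character per base, as in FASTQ.
as_chars = phred_to_fastq(quals)

print(len(as_numbers), len(as_chars))
```

For two-digit scores the integer form needs about three characters per base against one for the character form.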


                        • #13
                          Getting off-topic, sorry.

                          However, MAF looks like a nice format. The problems of random ordering of data in CAF and the complete lack of sequence quality in ACE are one reason why I produced BAF, although it never really went anywhere and I only use it locally as an interchange format.

                          Certainly it's true that ACE and CAF are very cumbersome for next-gen data, while SAM/BAM have other major issues when it comes to mixed technologies (such as not supporting older capillary style assemblies with potentially more than two sequences per template).

                          A good find. :-)



                          • #14
                            Originally posted by jkbonfield
                            Getting off-topic, sorry.

                            However, MAF looks like a nice format. The problems of random ordering of data in CAF and the complete lack of sequence quality in ACE are one reason why I produced BAF, although it never really went anywhere and I only use it locally as an interchange format.
                            I think Bastien was thinking along the same lines when he came up with MAF for internal use in MIRA.
                            Originally posted by jkbonfield
                            Certainly it's true that ACE and CAF are very cumbersome for next-gen data, while SAM/BAM have other major issues when it comes to mixed technologies (such as not supporting older capillary style assemblies with potentially more than two sequences per template).
                            I'd like the option to include the reference sequences (not just their names and lengths; and as a further option the reference quality scores) to make a SAM/BAM file self-contained. This is probably not important for people working on model organisms, but would seem useful for early stages of projects with draft assemblies, or if working on a new strain etc. It's something that ACE and other assembly formats have.
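For anyone unfamiliar with the point about names and lengths: each reference in a SAM header is described by an @SQ line with an SN (name) and LN (length) tag, and the bases themselves are not part of the file. A small sketch that pulls out that information, with a made-up header and a hypothetical `reference_info` helper:

```python
from typing import Dict


def reference_info(sam_header: str) -> Dict[str, int]:
    """Map each reference name (SN) to its length (LN) from @SQ lines."""
    refs = {}
    for line in sam_header.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs after the record type.
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        refs[tags["SN"]] = int(tags["LN"])
    return refs


# Invented example header; contig names and lengths are illustrative only.
header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:contig1\tLN:1500\n"
    "@SQ\tSN:contig2\tLN:820\n"
)
refs = reference_info(header)
print(refs)
```

To view the alignments against the actual bases, the reference FASTA still has to be supplied separately.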



                            • #15
                              Originally posted by jkbonfield
                              This still makes no sense.

                              ACE is an assembly output, while FASTA is just a bunch of sequences with no assembly information. Are you asking for advice on what assembler to use? This will obviously depend a lot on the type of data and whether you want a de novo or mapped assembly.
                              The original question was probably misleading. Farhat did later on say he was able to convert FASTQ reads into an ACE assembly by getting the missing information from the SOAP/ELAND alignment.

