  • .abi to fasta/fastq conversion script/program?

    Hi All,
    We have Sanger sequencing data we'd like to incorporate into a de novo 454 bacterial genome assembly using MIRA. However, the core facility that does our Sanger sequencing can only provide us with .abi files, not the more useful FASTA/FASTQ files. Any suggestions for a program or script that has useful batch conversion features?

  • #2
    Hi,
    in general it is always better to have the original run/chromatogram data.

    You should basecall the data rather than only extracting the sequence from the abi file.

    There are two (common) basecalling programs:

    1) phred (http://www.phrap.org/consed/consed.html#howToGet)
    2) TraceTuner (https://sourceforge.net/projects/tracetuner/)

    Both produce sequences in fasta format with qualities ...

    You should also clip vector and low-quality sequence. For Sanger data, lucy does a good job (https://sourceforge.net/projects/lucy/).

    After basecalling and vector/lowquality clipping you can go for MIRA ..

    cheers,
    Sven


    • #3
      Thanks!

      Much appreciated!


      • #4
        Currently, TraceTuner's tool ttuner does not offer FASTQ output.

        You can use ttuner to output FASTA+QUAL and then use a script to merge them into a FASTQ file. Alternatively, you can use ttuner to output PHD files and convert those into FASTQ.
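The FASTA+QUAL merge mentioned above can be sketched in plain Python. This is only a minimal illustration, not part of TraceTuner: the function names `read_records` and `merge_to_fastq` are made up for this example, and it assumes the two files list the same reads in the same order and that Sanger-style FASTQ (Phred+33) is wanted.

```python
import io
from itertools import zip_longest


def read_records(handle):
    """Yield (name, payload) pairs from FASTA or QUAL text.

    Payload lines are joined with single spaces so that QUAL scores
    split across lines stay separated.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, " ".join(chunks)
            # Keep only the identifier (first word of the header line).
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, " ".join(chunks)


def merge_to_fastq(fasta_handle, qual_handle, out_handle, offset=33):
    """Write one FASTQ record per FASTA/QUAL pair and return the record count."""
    count = 0
    pairs = zip_longest(read_records(fasta_handle), read_records(qual_handle))
    for seq_rec, qual_rec in pairs:
        if seq_rec is None or qual_rec is None:
            raise ValueError("FASTA and QUAL files have different record counts")
        (seq_name, seq), (qual_name, quals) = seq_rec, qual_rec
        seq = seq.replace(" ", "")
        scores = [int(q) for q in quals.split()]
        if seq_name != qual_name:
            raise ValueError(f"record order differs: {seq_name} vs {qual_name}")
        if len(seq) != len(scores):
            raise ValueError(f"{seq_name}: sequence and quality lengths differ")
        # Clamp scores to 0..93 so every character stays printable ASCII.
        quality = "".join(chr(min(max(q, 0), 93) + offset) for q in scores)
        out_handle.write(f"@{seq_name}\n{seq}\n+\n{quality}\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demo; real use would pass open file handles instead.
    fasta = io.StringIO(">read1\nACGT\n")
    qual = io.StringIO(">read1\n10 20 30 40\n")
    out = io.StringIO()
    merge_to_fastq(fasta, qual, out)
    print(out.getvalue(), end="")
```

For real data, open the FASTA, QUAL and output files with `open()` and pass the handles in the same way. Toolkits such as Biopython or BioPerl offer the same conversion with more validation.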

        Since it is open source (GPL v2 or later), I've written a patch to ttuner to directly support FASTQ output:


        • #5
          Patch to the patch

          Dear Maubp,

          Thanks for the patch! Very useful extension to a very useful program! Unfortunately, after applying your patch it did not work straight away (for me at least), so I made some adaptations to your patch and posted it at:


          Cheers,
          Bart


          • #6
            Originally posted by BratdaKing:
            Dear Maubp,

            Thanks for the patch! Very useful extension to a very useful program! Unfortunately, after applying your patch it did not work straight away (for me at least), so I made some adaptations to your patch and posted it at:


            Cheers,
            Bart
            Hi Bart,

            What did you need to change and why? A quick verbal summary would be great (since the diff formats we used are different, comparing them by eye is a pain).

            I should write back to the tracetuner guys about getting this merged in...

            Peter

            P.S. It would have been simpler to post the patch on the existing feature request rather than filing a new one, wouldn't it?
            Last edited by maubp; 10-18-2010, 06:08 AM. Reason: formatting


            • #7
              Hi, I know this thread is very old, but I'm trying to do the same thing that AppleInformatics wanted to do.

              I have no idea how to use TraceTuner.
              Could you give me some advice on how to run it?


              • #8
                - download the archive, extract it, and change into tracetuner_3.0.6beta/src
                - read the README
                - type
                make
                - type
                ../rel/Linux_64/ttuner -h
                ... assuming you are running a Linux system.


                • #9
                  Yes, at least I'll use the command-line version. I was wondering whether I could use the viewer (java -jar ttuner_tools.jar). In the log file I got:

                  Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: No such child: 0
                  at java.awt.Container.getComponent(Container.java:306)
                  at com.paracel.tt.util.Filespec.setSingleSelection(Filespec.java:509)
                  at com.paracel.tt.util.Filespec.<init>(Filespec.java:64)
                  at com.paracel.tt.run.TTrun.<init>(TTrun.java:67)
                  at com.paracel.tt.run.TTrun.main(TTrun.java:553)


                  • #10
                    yep, same for me. I never used the jar file .. ttuner works just fine :-)

                    A few years ago we trained our ttuner with a few tens of thousands of ABI traces for better performance.

                    You might want to give phred a try as well; it is a bit faster and is the de facto standard in Sanger sequencing.


                    • #11
                      This is a promising thread! However, I'm stuck on using the command line for ttuner. I keep getting an error saying I haven't provided an output format. Can someone give me a typical command line for ttuner? I'd like to convert multiple phd.1 files into FASTQ.

                      I've used:

                      ttuner -id /directory/*.phd.1 -fd /directory/


                      • #12
                        Little More Progress on Using ttuner - still need help

                        So, I've figured out that I had ordered the commands improperly. I am now writing it as:

                        ttuner -fd /directory/ -qd /directory/ -if /directory/*phd.1

                        I get "Can't stat: 2: No such file or directory"

                        I'm pretty sure it is the *phd.1 that is causing the problems. Can anyone suggest how to input multiple .phd.1 files? When I remove the *phd.1 the program automatically looks for .ab1 files.


                        • #13
                          ttuner is basecalling software: it reads chromatograms such as ab1 files, determines the base sequence and its qualities, and then writes another format, either chromatogram files (SCF) or a textual representation (FASTA, PHD, etc.). It does not take PHD files as input, so you cannot convert PHD to FASTQ this way.

                          You might want to have a look at 'convert_project' from the MIRA assembly package (http://mira-assembler.sourceforge.ne...mutils_convpro).
                          There may be other solutions too; BioPerl's SeqIO might be worth a look (http://www.bioperl.org/wiki/HOWTO:SeqIO).

                          hth, Sven
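For readers who only need the PHD to FASTQ step discussed above, here is a minimal Python sketch. It is an illustration under stated assumptions, not a tool from TraceTuner, phred or MIRA: the helper names `parse_phd` and `phd_to_fastq` are hypothetical, it assumes one read per .phd.1 file with the usual BEGIN_SEQUENCE / BEGIN_DNA / END_DNA layout (one "base quality position" triple per line), and it writes Phred+33 qualities.

```python
import io


def parse_phd(handle):
    """Return (name, bases, qualities) for the single read in a PHD file."""
    name, bases, quals, in_dna = None, [], [], False
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "BEGIN_SEQUENCE" and len(fields) > 1:
            name = fields[1]
        elif fields[0] == "BEGIN_DNA":
            in_dna = True
        elif fields[0] == "END_DNA":
            in_dna = False
        elif in_dna and len(fields) >= 2:
            # Each DNA line holds: base, quality value, peak position.
            bases.append(fields[0])
            quals.append(int(fields[1]))
    if name is None:
        raise ValueError("no BEGIN_SEQUENCE line found")
    return name, "".join(bases), quals


def phd_to_fastq(handle, offset=33):
    """Convert one PHD file to a single FASTQ record string."""
    name, seq, quals = parse_phd(handle)
    # Clamp scores to 0..93 so every character stays printable ASCII.
    quality = "".join(chr(min(max(q, 0), 93) + offset) for q in quals)
    return f"@{name}\n{seq}\n+\n{quality}\n"


if __name__ == "__main__":
    # Tiny in-memory demo; bases keep the case used in the PHD file.
    demo = io.StringIO(
        "BEGIN_SEQUENCE read1\n"
        "BEGIN_DNA\n"
        "a 10 5\n"
        "c 20 17\n"
        "END_DNA\n"
        "END_SEQUENCE\n"
    )
    print(phd_to_fastq(demo), end="")
```

To handle many files, loop over `glob.glob("/directory/*.phd.1")` in Python, open each path, and append the returned record to one output file. This avoids relying on the shell to expand the wildcard on a program's command line.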
