  • .abi to fasta/fastq conversion script/program?

    Hi All,
    We have Sanger sequencing data we'd like to incorporate into a de novo 454 bacterial genome assembly using MIRA. However, the core facility that does our Sanger sequencing can only provide .abi files, not the more useful FASTA/FASTQ files. Any suggestions for a program or script with useful batch conversion features?

  • #2
    Hi,
    In general, it is always better to have the original run/chromatogram data.

    You should basecall the data (not just extract the sequence from the .abi file).

    There are two (common) basecalling programs:

    1) phred (http://www.phrap.org/consed/consed.html#howToGet)
    2) TraceTuner (https://sourceforge.net/projects/tracetuner/)

    Both produce sequences in fasta format with qualities ...

    You should also get rid of vector and low-quality sequence. For Sanger data, lucy does a good job (https://sourceforge.net/projects/lucy/).

    After basecalling and vector/lowquality clipping you can go for MIRA ..

    cheers,
    Sven


    • #3
      Thanks!

      Much appreciated!


      • #4
        Currently Trace Tuner's tool ttrace does not offer FASTQ output.

        You can use ttrace to output FASTA+QUAL and then use a script to merge them into a FASTQ file. Or, you can also use ttrace to output PHD files, and convert those into FASTQ.
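
The FASTA+QUAL merge mentioned above could look roughly like the following. This is only a minimal sketch using the Python standard library, not the patch or any existing script: the function names are made up for illustration, and it assumes both files list the same reads in the same order and that Sanger-style FASTQ (Phred+33) is wanted.

```python
import io
from itertools import zip_longest


def read_records(handle):
    """Yield (name, list_of_body_lines) for each '>' record in a FASTA-like file."""
    name, body = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, body
            # Keep only the identifier, i.e. the text up to the first whitespace.
            name, body = (line[1:].split() or [""])[0], []
        elif name is not None:
            body.append(line)
    if name is not None:
        yield name, body


def fasta_qual_to_fastq(fasta_handle, qual_handle, out_handle):
    """Merge paired FASTA and QUAL records into Sanger-style FASTQ (Phred+33)."""
    count = 0
    pairs = zip_longest(read_records(fasta_handle), read_records(qual_handle))
    for fasta_rec, qual_rec in pairs:
        if fasta_rec is None or qual_rec is None:
            raise ValueError("FASTA and QUAL files have different record counts")
        (name, seq_lines), (qual_name, qual_lines) = fasta_rec, qual_rec
        if name != qual_name:
            raise ValueError(f"record order differs: {name!r} vs {qual_name!r}")
        seq = "".join(seq_lines)
        scores = [int(q) for q in " ".join(qual_lines).split()]
        if len(seq) != len(scores):
            raise ValueError(f"{name}: {len(seq)} bases but {len(scores)} scores")
        # Clamp to 0..93 so every score maps to a printable ASCII character.
        qual_string = "".join(chr(min(max(q, 0), 93) + 33) for q in scores)
        out_handle.write(f"@{name}\n{seq}\n+\n{qual_string}\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demo; real use would open the .fasta and .qual files instead.
    fasta = io.StringIO(">read1\nACGT\n")
    qual = io.StringIO(">read1\n10 20 30 40\n")
    out = io.StringIO()
    fasta_qual_to_fastq(fasta, qual, out)
    print(out.getvalue(), end="")
```

The name and length checks are there because a FASTA/QUAL pair that has drifted out of sync would otherwise silently produce a FASTQ file with the wrong qualities attached to each read.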

        Since it is open source (GPL v2 or later), I've written a patch to ttrace to directly support FASTQ output:


        • #5
          Patch to the patch

          Dear Maubp,

          Thanks for the patch! Very useful extension to a very useful program! Unfortunately, after applying your patch it did not work straight away (for me at least), so I made some adaptations to your patch and posted it at:


          Cheers,
          Bart


          • #6
            Originally posted by BratdaKing:
            Dear Maubp,

            Thanks for the patch! Very useful extension to a very useful program! Unfortunately, after applying your patch it did not work straight away (for me at least), so I made some adaptations to your patch and posted it at:

            Cheers,
            Bart

            Hi Bart,

            What did you need to change and why? A quick verbal summary would be great (since the diff formats we used are different, comparing them by eye is a pain).

            I should write back to the tracetuner guys about getting this merged in...

            Peter

            P.S. It would have been simpler to post the patch on the existing feature request rather than filing a new one, wouldn't it?
            Last edited by maubp; 10-18-2010, 06:08 AM. Reason: formatting


            • #7
              Hi, I know this thread is very old, but I'm trying to do the same thing that AppleInformatics wanted to do.

              I have no idea how to use TraceTuner.
              Could you give me some advice on how to run it?


              • #8
                - download the archive, extract it, chdir to tracetuner_3.0.6beta/src
                - read README
                - type
                make
                - type
                ../rel/Linux_64/ttuner -h
                .. assuming you are running a Linux system.


                • #9
                  Yes, I'll at least use the command-line version. I was wondering if I could use the viewer
                  (java -jar ttuner_tools.jar). In the log file I've got:

                  Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: No such child: 0
                  at java.awt.Container.getComponent(Container.java:306)
                  at com.paracel.tt.util.Filespec.setSingleSelection(Filespec.java:509)
                  at com.paracel.tt.util.Filespec.<init>(Filespec.java:64)
                  at com.paracel.tt.run.TTrun.<init>(TTrun.java:67)
                  at com.paracel.tt.run.TTrun.main(TTrun.java:553)


                  • #10
                    yep, same for me. I never used the jar file .. ttuner works just fine :-)

                    A few years ago we trained our ttuner with a few tens of thousands of ABI traces
                    for better performance.

                    You might want to give phred a try as well; it is a bit faster and the
                    de facto standard in Sanger sequencing.


                    • #11
                      This is a promising thread! However, I'm stuck on using the command line for ttuner. I keep getting an error saying I haven't provided an output format. Can someone give me a common command usage for ttuner? I'd like to convert multiple phd.1 files into FASTQ.

                      I've used:

                      ttuner -id /directory/*.phd.1 -fd /directory/


                      • #12
                        Little More Progress on Using ttuner - still need help

                        So, I've figured out that I had ordered the commands improperly. I am now writing it as:

                        ttuner -fd /directory/ -qd /directory/ -if /directory/*phd.1

                        I get "Can't stat: 2: No such file or directory"

                        I'm pretty sure it is the *phd.1 that is causing the problems. Can anyone suggest how to input multiple .phd.1 files? When I remove the *phd.1 the program automatically looks for .ab1 files.


                        • #13
                          ttuner is basecalling software: it reads chromatograms such as .ab1 files, determines the base sequence and its qualities, and finally writes another format, either chromatogram files (SCF) or a textual representation (fasta, phd, etc.). You cannot convert phd to fastq this way.

                          You might want to have a look at 'convert_project' from the MIRA assembly package (http://mira-assembler.sourceforge.ne...mutils_convpro).
                          There might be other solutions as well (Bioperl might be interesting as well, http://www.bioperl.org/wiki/HOWTO:SeqIO).
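
For a batch of .phd.1 files, a small script is another option. The following is a rough sketch in plain Python, not something taken from MIRA or Bioperl: the function names are hypothetical, it assumes one read per .phd.1 file with the usual BEGIN_SEQUENCE / BEGIN_DNA / END_DNA layout (one "base quality position" line per base), and it writes Sanger-style FASTQ (Phred+33).

```python
from pathlib import Path


def parse_phd(handle):
    """Return (name, sequence, qualities) for a single-read PHD file."""
    name, bases, quals = None, [], []
    in_dna = False
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "BEGIN_SEQUENCE":
            name = fields[1] if len(fields) > 1 else ""
        elif fields[0] == "BEGIN_DNA":
            in_dna = True
        elif fields[0] == "END_DNA":
            in_dna = False
        elif in_dna:
            # Each DNA line is "base quality peak_position"; the position is ignored.
            bases.append(fields[0].upper())
            quals.append(int(fields[1]))
    if name is None:
        raise ValueError("no BEGIN_SEQUENCE line found")
    return name, "".join(bases), quals


def phd_dir_to_fastq(phd_dir, fastq_path, pattern="*.phd.1"):
    """Convert every file matching `pattern` in `phd_dir` into one FASTQ file."""
    paths = sorted(Path(phd_dir).glob(pattern))
    with open(fastq_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                name, seq, quals = parse_phd(handle)
            # Clamp to 0..93 so every score maps to a printable ASCII character.
            qual_string = "".join(chr(min(max(q, 0), 93) + 33) for q in quals)
            out.write(f"@{name}\n{seq}\n+\n{qual_string}\n")
    return len(paths)
```

Calling phd_dir_to_fastq("/directory", "reads.fastq") would then collect all the *.phd.1 files from that directory into a single FASTQ file. The file matching is done inside Python, so no shell wildcard has to be passed on the command line.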

                          hth, Sven
