  • SAM: a generic alignment format

    For NGS data analysis, an aligner tends to be successful when it comes with utilities for comprehensive downstream analyses, such as reference-based assembly, SNP/indel calling and an alignment viewer. Eland/GAPipeline, Soap and Maq are such examples. Unfortunately, implementing all these downstream analyses is non-trivial, and re-implementing them for each aligner would waste time and human resources. What we want is to separate alignment from the analyses that follow it. To achieve this, we need a generic alignment format that works for all aligners. NovoAlign and Bowtie can output the Maq alignment format to take advantage of Maq's downstream data processing. However, the Maq format does not really suit the goal: it supports neither longer reads nor alignments with more than one indel, and it is too specific to Maq. To solve this problem, the 1000 Genomes Project committee decided to develop a generic alignment format, and the first version of the specification and implementation is now out.

    The new alignment format, SAM (Sequence Alignment/Map), is the collaborative result of several major genome centres. It eliminates the major defects of the Maq format while retaining its advantages. We also migrated and improved various downstream data processing tools implemented in Maq/Maqview, such as indexing, pileup, the viewer and the consensus caller. For more information, please check the website:



    I hope samtools will help aligner developers promote their own software: once a program can generate alignments in SAM format, Maq-like downstream analysis becomes available immediately.
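
As a rough illustration of the point above about alignments with more than one indel, here is a minimal Python sketch that splits one SAM line into its 11 mandatory tab-separated columns and counts the insertion/deletion operations in the CIGAR string. The function names and the example read are made up for illustration, and the column names follow the current SAM spec (the early spec named the mate columns MRNM, MPOS and ISIZE). It is not part of samtools.

```python
import re

# The 11 mandatory SAM columns, in order (current spec names).
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_sam_line(line):
    """Split one alignment line into a dict of the mandatory fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(SAM_FIELDS):
        raise ValueError("expected at least 11 tab-separated columns")
    rec = dict(zip(SAM_FIELDS, cols[:11]))
    for key in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        rec[key] = int(rec[key])
    rec["TAGS"] = cols[11:]  # optional TAG:TYPE:VALUE fields, left unparsed
    return rec


def parse_cigar(cigar):
    """Turn a CIGAR string such as '10M1I5M' into [(10, 'M'), (1, 'I'), (5, 'M')]."""
    if cigar == "*":  # CIGAR unavailable
        return []
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings that the pattern only partially matched.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def count_indels(cigar):
    """Number of insertion (I) and deletion (D) operations in a CIGAR string."""
    return sum(1 for _, op in parse_cigar(cigar) if op in "ID")


# A hypothetical 36 bp read aligned with one insertion and one deletion.
example = "\t".join(["read1", "0", "chr1", "100", "30", "10M1I5M2D20M",
                     "*", "0", "0", "A" * 36, "I" * 36])
record = parse_sam_line(example)
print(record["RNAME"], record["POS"], count_indels(record["CIGAR"]))
```

Because the CIGAR is just a list of (length, operation) pairs, a read can carry any number of indels, which is the limitation of the Maq format mentioned above.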

  • #2
    Thanks Heng.
    It looks like this will be very useful and will make it easy to try the various upcoming tools.

    Is it possible to have a workflow like MAQ's easyrun that walks a user through a use case for SAM/BAM?
    --
    bioinfosm



    • #3
      Hey lh3,

      Thanks for posting this here. I'm going to sticky it in the Bioinformatics forum for a while to make sure everyone sees it!



      • #4
        The documentation notes that "Only MAQ->SAM converter is implemented." However, I could not find anywhere that referenced this conversion utility. Is there software to perform this conversion?



        • #5
          To lparsons:

          After you compile samtools with "make", you will find "maq2sam-short" and "maq2sam-long" in the "misc/" directory. There is also a script "export2sam.pl" that converts Illumina's export to SAM. I have not thoroughly tested this script on all export files, though.



          • #6
            I downloaded samtools-0.1.1 but could not find the "wgsim" or "wgsim_eval.pl" programs, which are mentioned in the bwa-0.3.0 documentation.
            How can I get these programs?



            • #7
              To corthay:

              You are quick. I am planning a new bwa release, as I realized that I could improve it a little without much work (PS: the new version has now been released). Wgsim, wgsim_eval.pl and the converters for soap and bowtie are available from SVN only:

              svn co https://samtools.svn.sourceforge.net...s/dev/samtools samtools
              Last edited by lh3; 01-06-2009, 07:34 AM.



              • #8
                indelpe vs samtools indels

                Hi Heng Li.
                Could you comment on how the indel detection works in SAM pileups vs MAQ indelpe? I am seeing many more indels in my SAM pileup generated from a MAQ alignment (as compared to the output from indelpe). Is there a good filtering strategy for these?

                Thanks,

                Ryan



                • #9
                  I am planning to release samtools-0.1.2, which fixes some bugs in the old version and adds new features. For now you can check out the source code from SVN; it should be quite close to 0.1.2.

                  The new version comes with a Bayesian indel caller, although it is just a prototype at present. The strength of the samtools caller is that it makes use of reads mapped without indels. Using this information helps to reduce false negatives. In addition, the new caller gives a genotype rather than just reporting that there is an indel. You cannot easily tell from maq's indelpe whether an indel is heterozygous or homozygous. With the new caller, the filters could be: a) the indel score; b) two indels should not be too close to each other.
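
The two filters suggested above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the `IndelCall` record, the `filter_indels` function and the default thresholds (`min_score=50`, `min_gap=10`) are hypothetical, not values or names taken from samtools. Applying the score filter first and then discarding both members of a close pair is one possible reading of the advice.

```python
from collections import namedtuple

# Hypothetical record for one indel call; real pileup output has more columns.
IndelCall = namedtuple("IndelCall", ["chrom", "pos", "score"])


def filter_indels(calls, min_score=50, min_gap=10):
    """Apply filters a) and b): drop low-scoring calls, then drop clustered ones.

    min_score and min_gap are placeholder thresholds, not recommended values.
    """
    # a) keep only calls whose indel score reaches the threshold
    kept = sorted((c for c in calls if c.score >= min_score),
                  key=lambda c: (c.chrom, c.pos))

    # b) drop every call that has another kept call within min_gap bases
    result = []
    for i, call in enumerate(kept):
        near_prev = (i > 0
                     and kept[i - 1].chrom == call.chrom
                     and call.pos - kept[i - 1].pos < min_gap)
        near_next = (i + 1 < len(kept)
                     and kept[i + 1].chrom == call.chrom
                     and kept[i + 1].pos - call.pos < min_gap)
        if not (near_prev or near_next):
            result.append(call)
    return result


calls = [
    IndelCall("chr1", 100, 80),
    IndelCall("chr1", 105, 90),   # within 10 bp of the call at 100
    IndelCall("chr1", 500, 20),   # low score
    IndelCall("chr1", 900, 70),
    IndelCall("chr2", 903, 60),   # different chromosome, so not "close" to chr1:900
]
for call in filter_indels(calls):
    print(call.chrom, call.pos, call.score)
```

Whether to remove both indels of a close pair or keep the higher-scoring one is a judgment call; the sketch takes the stricter option.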



                  • #10
                    What's the difference between maq2sam-short and -long?

                    Also, maq2sam-short seems to segfault on 64-bit versions of Red Hat and Ubuntu... Am I missing something?



                    • #11
                      maq2sam-short is for the .map files generated by maq-0.6.x, while maq2sam-long is for files generated by maq-0.7.x. Sorry for the confusion; one of the aims of SAM is to avoid such confusion in future.



                      • #12
                        samtools index seg fault

                        I am using the most current version of samtools from svn.
                        I successfully ran the "samtools import" command on my .sam file from bwa.
                        When I then run "samtools index" on the .bam file, it seg faults.
                        Let me know if you need more information to determine what is causing this.
                        Last edited by webbrewer; 03-05-2009, 08:28 PM.



                        • #13
                          samtools import

                          samtools import is for making a .bam file from a .sam file. Why are you attempting to run this command on a .bam file?



                          • #14
                            Originally posted by myrna
                            samtools import is for making a .bam file from a .sam file. Why are you attempting to run this command on a .bam file?
                            Oops. I meant to say that "samtools index" seg faults.



                            • #15
                              samtools index

                              Have you tried samtools view foo.bam?

                              If you get the SAM alignments back, then all should be well. I believe you get a warning if the .bam file is unsorted, so if you haven't sorted it already, try this:

                              samtools sort foo.bam bar.sort
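
If it is unclear whether a file is already sorted, a quick check on the text output of samtools view could look like the Python sketch below. The function name is made up for illustration. It assumes that a coordinate-sorted file lists each reference sequence in one contiguous block with non-decreasing positions, and that reads with no reference name ("*") come last. It does not compare the reference order against the header, so treat it as a sanity check only.

```python
def is_coordinate_sorted(sam_lines):
    """Return True if alignment lines look sorted by reference, then position."""
    seen = set()      # reference names whose block has already started
    current = None    # reference name of the block being read
    last_pos = 0
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        cols = line.split("\t")
        rname, pos = cols[2], int(cols[3])
        if rname != current:
            # A reference reappearing later, or anything after the
            # unplaced reads ("*"), means the file is not sorted.
            if rname in seen or current == "*":
                return False
            seen.add(rname)
            current, last_pos = rname, pos
        elif pos < last_pos:
            return False
        else:
            last_pos = pos
    return True


# Hypothetical records: only the first four columns matter for this check.
sorted_lines = ["r1\t0\tchr1\t100", "r2\t0\tchr1\t250", "r3\t0\tchr2\t40"]
unsorted_lines = ["r1\t0\tchr1\t250", "r2\t0\tchr1\t100"]
print(is_coordinate_sorted(sorted_lines))
print(is_coordinate_sorted(unsorted_lines))
```

The function accepts any iterable of lines, so it can also be fed an open file or standard input.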
