  • baika
    Member
    • Apr 2012
    • 12

    How to call variants from a whole genome alignment file

    Hi All
    I have a quick question about calling SNPs/InDels from a whole-genome alignment file in FASTA format. I have aligned 45 assembled bacterial genomes (~4 Mb each) using the Mugsy tool and converted the MAF output to FASTA. One of the 45 genomes is the reference. Now I want a table (preferably in VCF format) showing the SNPs/InDels in the other 44 strains relative to my reference strain. Could you suggest how to do this?

    Thanks

    baika
  • arkal
    advancing one byte at a time!
    • Jun 2011
    • 56

    #2
    I think if you use the samtools mpileup command, with the reference genome of your choice as the input reference FASTA and the remaining 44 alignments in BAM format, you should get what you want! Correct me if I'm wrong!

    • baika
      Member
      • Apr 2012
      • 12

      #3
      Is it possible to convert an aligned FASTA file into a BAM file? I would really appreciate it if you could suggest a tool to do that.

      Thanks

      • brofallon
        Member
        • May 2011
        • 26

        #4
        I don't really think it's possible to convert a FASTA file directly into BAM. BAM files are meant to store many short reads with associated quality scores, whereas a FASTA file is just a plain listing of sequences. You may be able to use some trickery to force the FASTA into BAM (or, more likely, BAM's uncompressed version, SAM), but I'm guessing this would be more trouble than it's worth.
        If you want to convert the aligned, fasta-formatted genomes into a vcf, I bet you'll end up writing a script (bash, perl, python, etc.) to do the job. VCF format isn't super complicated, and the script would simply look at each alignment column and see which samples differed from the reference. That would be my advice...
        HTH
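For anyone landing here later, the column-by-column script described above could look roughly like the sketch below. It is only a minimal illustration, not a tested variant caller: the function names (`parse_aligned_fasta`, `call_snps`, `to_vcf`) are made up for this example, it assumes all sequences in the aligned FASTA have the same length, it treats each strain as haploid, and it reports SNPs only. Columns where the reference has a gap (insertions) are skipped, and a gap or ambiguous base in a strain is written as a missing genotype, so indels would need extra handling.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, str, List[str], List[str]]  # (pos, ref, alts, genotypes)


def parse_aligned_fasta(text: str) -> Dict[str, str]:
    """Parse aligned FASTA text into {sequence name: gapped sequence}."""
    parts: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def call_snps(aln: Dict[str, str], ref_name: str) -> Tuple[List[str], List[Record]]:
    """Walk the alignment columns and report where samples differ from the reference."""
    ref = aln[ref_name]
    if any(len(seq) != len(ref) for seq in aln.values()):
        raise ValueError("all aligned sequences must have the same length")
    samples = [n for n in aln if n != ref_name]
    records: List[Record] = []
    ref_pos = 0  # 1-based position in the ungapped reference
    for col, ref_base in enumerate(ref):
        if ref_base == "-":
            continue  # insertion relative to the reference: skipped in this sketch
        ref_pos += 1
        if ref_base not in "ACGT":
            continue  # ambiguous reference base (e.g. N)
        alts: List[str] = []
        for s in samples:
            base = aln[s][col]
            if base in "ACGT" and base != ref_base and base not in alts:
                alts.append(base)
        if not alts:
            continue
        genotypes = []
        for s in samples:
            base = aln[s][col]
            if base == ref_base:
                genotypes.append("0")
            elif base in alts:
                genotypes.append(str(alts.index(base) + 1))
            else:
                genotypes.append(".")  # gap or ambiguous base: missing call
        records.append((ref_pos, ref_base, alts, genotypes))
    return samples, records


def to_vcf(chrom: str, samples: List[str], records: List[Record]) -> str:
    """Format the records as minimal VCF text with haploid GT fields."""
    lines = [
        "##fileformat=VCFv4.2",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t" + "\t".join(samples),
    ]
    for pos, ref_base, alts, gts in records:
        fields = [chrom, str(pos), ".", ref_base, ",".join(alts), ".", ".", ".", "GT"]
        lines.append("\t".join(fields + gts))
    return "\n".join(lines) + "\n"
```

To use it, read the aligned FASTA into a string, pass it to `parse_aligned_fasta`, call `call_snps` with the name of the reference sequence, and write the output of `to_vcf` to a file. Positions are counted along the ungapped reference, which is what a VCF expects. If the whole-genome alignment is split into several blocks, each block would need its own offset into the reference.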

        • arkal
          advancing one byte at a time!
          • Jun 2011
          • 56

          #5
          Originally posted by baika
          Is it possible to convert an aligned FASTA file into a BAM file? I would really appreciate it if you could suggest a tool to do that.

          Thanks
          Sorry, my mistake. When you said you had aligned genomes, I automatically thought of FASTQ reads aligned to a reference FASTA, with the alignments stored as BAM files.
          So in my opinion, what you CAN do instead is:
          1. Identify your reference bacterial genome FASTA and store it in a separate file.
          2. Download a read simulator/generator (e.g. DWGSIM, ART, Maq, etc.).
          3. Generate PAIRED-END reads for the remaining 44 genomes. ENSURE THAT SNP AND INDEL INTRODUCTION RATES ARE SET TO ZERO.
          4. Align these reads to the reference, one genome at a time, using an aligner such as BWA, Novoalign, Stampy, etc.
          5. Run samtools mpileup to get your results in BCF format!
          6. Thank me later
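To make steps 2 and 3 concrete: with the mutation and error rates at zero, a read simulator is essentially just chopping each assembled genome into exact paired-end reads. The sketch below is a hypothetical pure-Python stand-in for that idea, not the interface of any of the tools named above. The function names, the fixed fragment tiling, and the constant quality string are all assumptions made for illustration; it also ignores circular chromosomes, so the sequence ends get less coverage.

```python
from typing import List, Tuple

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def shred_paired_reads(
    name: str,
    genome: str,
    read_len: int = 100,
    fragment_len: int = 300,
    step: int = 10,
) -> Tuple[List[str], List[str]]:
    """Tile error-free fragments along the genome and return FASTQ records.

    Read 1 is the start of each fragment on the forward strand; read 2 is
    the reverse complement of the fragment's end, as in a typical
    paired-end library. No SNPs, indels, or sequencing errors are added.
    """
    if read_len > fragment_len:
        raise ValueError("read_len must not exceed fragment_len")
    qual = "I" * read_len  # constant high quality (Phred+33), purely a placeholder
    reads1: List[str] = []
    reads2: List[str] = []
    last_start = len(genome) - fragment_len
    for i, start in enumerate(range(0, last_start + 1, step)):
        frag = genome[start:start + fragment_len]
        read_id = f"@{name}_{i}"
        reads1.append(f"{read_id}/1\n{frag[:read_len]}\n+\n{qual}\n")
        reads2.append(f"{read_id}/2\n{reverse_complement(frag[-read_len:])}\n+\n{qual}\n")
    return reads1, reads2
```

Writing `"".join(reads1)` and `"".join(reads2)` to two FASTQ files per strain would give the input for step 4. With 100 bp reads and a step of 10, each position away from the ends is covered about 20 times. A dedicated simulator is still the better choice if you want realistic insert-size distributions. Multi-contig assemblies would need this called once per contig.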

          • aniruddha.otago
            Member
            • Jan 2010
            • 21

            #6
            I have analysed my Illumina NGS data (RRBS) and have a set of fragments (chromosome, genomic position, etc.). I would like to check whether any common SNPs fall within these fragments; that is, I want to search them against a database and see how likely they are to contain known common SNPs. I presume I should use dbSNP (build 131 or 135, maybe). If anyone could detail the process or advise on a way to do this search quickly, it would be much appreciated.

            • DR.AYAH
              Junior Member
              • Jul 2019
              • 1

              #7
              I am also interested in calling variants from a whole-genome alignment file. I am working with fungi and have 6 whole genomes (~13 Mb each), one of which is the reference strain.

              I have read about the Mugsy software, but I am new to Linux and to working with terminal commands.

              Could you help me with the command to run the alignment for the 6 genomes?

              • salarshaaf
                Junior Member
                • Aug 2019
                • 1

                #8
                Have a look at TASSEL. It can convert your FASTA file into VCF without writing commands.
                Here is the download link:
