  • How to call variants from a whole genome alignment file

    Hi All
    I have a quick question about calling SNPs/indels from a whole-genome alignment file in FASTA format. I have aligned 45 assembled bacterial genomes (~4 Mb each) using the Mugsy tool and converted the MAF output to FASTA. One of the 45 genomes is the reference. Now I want a table (preferably in VCF format) showing the SNPs/indels in the other 44 strains relative to my reference strain. Could you suggest how to do this?

    Thanks

    baika

  • #2
    I think that if you run the samtools mpileup command with the reference genome of your choice as the input reference FASTA and the remaining 44 alignments in BAM format, you should get what you want. Correct me if I'm wrong!


    • #3
      Is it possible to convert an aligned FASTA file into a BAM file? I would really appreciate it if you could suggest a tool to do that.

      Thanks


      • #4
        I don't really think it's possible to convert a FASTA file directly into BAM. BAM files are meant to store many short reads with their alignment positions and associated quality scores, whereas a FASTA file is just a plain listing of sequences. You may be able to use some trickery to force the FASTA into BAM (or, more likely, BAM's uncompressed text counterpart, SAM), but I'm guessing this would be more trouble than it's worth.
        If you want to convert the aligned, FASTA-formatted genomes into a VCF, I bet you'll end up writing a script (bash, Perl, Python, etc.) to do the job. The VCF format isn't super complicated, and the script would simply look at each alignment column and record which samples differ from the reference. That would be my advice...
        HTH
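
        The column-by-column idea above can be sketched in Python roughly as follows. This is only a minimal sketch under several assumptions: the function names, the sample names, and the "chr1" contig label are made up for illustration; only SNPs are reported; gap columns in the reference (insertions) are skipped; and a gap or ambiguous base in a sample is written as a missing genotype. Proper indel handling would need extra logic.

```python
def read_aligned_fasta(handle):
    """Parse an aligned multi-FASTA into a dict of name -> sequence.

    `handle` can be an open file or any iterable of lines.
    """
    parts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def alignment_to_vcf_lines(seqs, ref_name, chrom="chr1"):
    """Return VCF lines (header + SNP records) for an alignment.

    `chrom` is a placeholder contig name (an assumption, not taken
    from the alignment itself). Indels are NOT reported.
    """
    bases = set("ACGT")
    ref = seqs[ref_name]
    samples = [n for n in seqs if n != ref_name]
    if any(len(seqs[n]) != len(ref) for n in samples):
        raise ValueError("all aligned sequences must have the same length")

    lines = [
        "##fileformat=VCFv4.2",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t"
        + "\t".join(samples),
    ]

    ref_pos = 0  # 1-based position in the ungapped reference
    for col, ref_base in enumerate(ref):
        if ref_base == "-":
            continue  # insertion relative to the reference: skipped here
        ref_pos += 1
        if ref_base not in bases:
            continue  # ambiguous reference base, nothing to compare against
        alts = []
        genotypes = []
        for n in samples:
            base = seqs[n][col]
            if base == ref_base:
                genotypes.append("0")
            elif base in bases:
                if base not in alts:
                    alts.append(base)
                genotypes.append(str(alts.index(base) + 1))
            else:
                genotypes.append(".")  # gap or ambiguity code: missing
        if alts:
            fields = [chrom, str(ref_pos), ".", ref_base, ",".join(alts),
                      ".", ".", ".", "GT"] + genotypes
            lines.append("\t".join(fields))
    return lines
```

        To use it, open the aligned FASTA, pass the handle to read_aligned_fasta, and write the lines returned by alignment_to_vcf_lines to a .vcf file, giving the name of the reference sequence as ref_name.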


        • #5
          Originally posted by baika
          Is it possible to convert an aligned FASTA file into a BAM file? I would really appreciate it if you could suggest a tool to do that.

          Thanks
          Sorry, my mistake. When you said you had aligned genomes, I automatically thought of FASTQ reads aligned to a reference FASTA and stored as BAM files.
          So in my opinion, what you CAN do instead is:
          1. Identify your reference bacterial genome FASTA and store it in a separate file.
          2. Download a read simulator/generator (e.g. DWGSIM, ART, Maq, etc.).
          3. Generate PAIRED-END reads for the remaining 44 genomes. ENSURE THAT SNP AND INDEL INTRODUCTION RATES ARE SET TO ZERO.
          4. Align the reads from each genome to the reference separately, using an aligner such as BWA, Novoalign or Stampy.
          5. Run samtools mpileup to get your results in BCF format!
          6. Thank me later
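
          To make step 3 concrete, here is a minimal Python sketch of what "simulating reads with zero SNP and indel rates" means: it simply tiles error-free read pairs along an assembled genome. It is not a replacement for DWGSIM or ART; the function names and the default read length, insert size and step are assumptions chosen for illustration, and circular chromosomes and multi-contig assemblies are not handled.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def tile_paired_reads(genome, read_len=100, insert_size=300, step=50):
    """Yield (name, read1, read2) error-free pairs tiled along `genome`.

    Each fragment of `insert_size` bases contributes a forward read from
    its left end and a reverse-complemented read from its right end.
    No mutations or sequencing errors are introduced.
    """
    if read_len > insert_size:
        raise ValueError("read_len must not exceed insert_size")
    for start in range(0, len(genome) - insert_size + 1, step):
        fragment = genome[start:start + insert_size]
        read1 = fragment[:read_len]
        read2 = reverse_complement(fragment[-read_len:])
        yield f"sim_{start}", read1, read2


def fastq_record(name, seq, qual_char="I"):
    """Format one FASTQ record with a constant (dummy) quality string."""
    return f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n"


def write_paired_fastq(genome, path1, path2, **kwargs):
    """Write read 1 and read 2 of every simulated pair to two FASTQ files."""
    with open(path1, "w") as out1, open(path2, "w") as out2:
        for name, read1, read2 in tile_paired_reads(genome, **kwargs):
            out1.write(fastq_record(name + "/1", read1))
            out2.write(fastq_record(name + "/2", read2))
```

          The two FASTQ files written for each of the 44 genomes would then go through steps 4 and 5 as described above. Coverage drops towards the ends of the sequence, so variants very close to contig ends may be missed.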


          • #6
            I have analysed my Illumina NGS data (RRBS). I have fragments (chromosome, genomic position, etc.) and would like to check whether any common SNPs fall within them. That is, I would like to search against databases to see how likely it is that these fragments contain common SNPs. I presume I should use dbSNP (build 131 or 135, perhaps). If anyone could describe the process, or advise on a way to run this search quickly, that would be much appreciated.
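
            One way such a lookup could be done quickly is sketched below in Python, assuming the common SNPs have already been exported from dbSNP as (chromosome, position, ID) tuples and that fragments are given as 1-based inclusive intervals. The function names and the rs-style IDs in any example data are hypothetical; the sketch only shows the sorted-positions-plus-binary-search idea and does not read dbSNP files itself.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def index_snps(snps):
    """Build a per-chromosome index from (chrom, pos, snp_id) tuples.

    Returns {chrom: (sorted_positions, ids_in_the_same_order)}.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, snp_id in snps:
        by_chrom[chrom].append((pos, snp_id))
    index = {}
    for chrom, entries in by_chrom.items():
        entries.sort()
        index[chrom] = ([p for p, _ in entries], [i for _, i in entries])
    return index


def snps_in_fragment(index, chrom, start, end):
    """Return IDs of SNPs with start <= pos <= end on `chrom`.

    Coordinates are assumed to be 1-based and inclusive; convert first
    if the fragments come from a 0-based format such as BED.
    """
    if chrom not in index:
        return []
    positions, ids = index[chrom]
    lo = bisect_left(positions, start)
    hi = bisect_right(positions, end)
    return ids[lo:hi]
```

            Building the index once and then calling snps_in_fragment for every fragment keeps each lookup logarithmic in the number of SNPs on that chromosome, which matters when the SNP list has millions of entries.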


            • #7
              I am interested in calling variants from a whole-genome alignment file. I am working with fungi and have 6 whole genomes (~13 Mb each), one of which is the reference strain.

              I have read about the Mugsy software, but I am new to Linux and to working with terminal commands.

              Could you help me with the command to run the alignment for the 6 genomes?


              • #8
                Have a look at TASSEL. It can convert your FASTA file into VCF without writing commands.
                Here is the download link:
