  • help (!!!!): synonymous and nonsynonymous in bacteria

    Dear all,

    I have several bacterial strains (very closely related to each other) and want to compare SNPs between any two genomes.
    My analysis has several steps:
    1. Align with MUMmer (nucmer) and identify SNPs. The result looks like this:
    [P1] [SUB] [P2] | [BUFF] [DIST] | [FRM] [TAGS]
    ===================================================================
    831 T C 831 | 831 831 | 1 1 genome1 genome2

    2. Remove indels (>1 bp).
    3. For the remaining SNPs, determine whether they fall in coding regions. For SNPs in coding regions, I BLAST the two genes containing the SNP against each other to check whether they are highly homologous. If they are,
    4. the two coding sequences are aligned with MUSCLE, and SNAP.pl then classifies the SNPs as synonymous or nonsynonymous.
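The classification in step 4 can be sketched as below. This is only an illustration of the idea, not what SNAP.pl does internally: the function name `classify_codon_differences` is hypothetical, both sequences are assumed to be aligned, in frame, and on the coding strand, and the standard genetic code is assumed (bacterial translation table 11 assigns the same amino acids to codons).

```python
from itertools import product

# Standard genetic code in the usual TCAG ordering (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(cds_a, cds_b):
    """Compare two aligned, in-frame coding sequences codon by codon.

    Returns a list of (codon_number, codon_a, codon_b, kind) tuples for
    every codon that differs. Codons containing gaps or ambiguous bases
    are reported as "unclassified".
    """
    if len(cds_a) != len(cds_b):
        raise ValueError("aligned sequences must have the same length")
    results = []
    usable_length = len(cds_a) - len(cds_a) % 3  # ignore a trailing partial codon
    for start in range(0, usable_length, 3):
        codon_a = cds_a[start:start + 3].upper()
        codon_b = cds_b[start:start + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            kind = "unclassified"
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            kind = "synonymous"
        else:
            kind = "nonsynonymous"
        results.append((start // 3 + 1, codon_a, codon_b, kind))
    return results


# GCT -> GCC keeps alanine; AAA -> AGA changes lysine to arginine.
differences = classify_codon_differences("ATGGCTAAA", "ATGGCCAGA")
```

Note that this works per codon, so a codon with two changed bases is reported once. If the reading frame or strand is wrong, almost every difference will look nonsynonymous, which is worth ruling out by checking a few genes by hand.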

    The result is quite weird because the total number of synonymous SNPs is smaller than the number of nonsynonymous SNPs.

    This seems incorrect, but I can't figure out where the problem is.
    Can anybody help me? Could someone give me a clue as to where the problem is? Many, many thanks.

    Salmon

  • #2
    Why do you think the results are weird? In other words, maybe your results are true. What a priori evidence gave you the idea that synonymous SNPs should outnumber nonsynonymous ones?



    • #3
      positive Darwinian selection?

      Provided that your observation is not due to bugs (have you checked some genes with nonsynonymous SNPs manually?), you may be seeing positive Darwinian selection acting on the genes that have more nonsynonymous than synonymous SNPs (dN/dS > 1).
      This means that changing the encoded amino acid provided an evolutionary advantage compared to a silent mutation, while most codons stay identical. I would expect that for bacterial strains because they are evolving rapidly.
      If you want to go into more detail on dN/dS, there is the PAML package, http://abacus.gene.ucl.ac.uk/software/paml.html, and a web site where you can try it out, http://www.bork.embl.de/pal2nal/
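For a rough feel of how dN/dS relates to raw SNP counts, here is a minimal sketch in the spirit of the Nei-Gojobori approach with a Jukes-Cantor correction. The function names and the numbers in the example are made up for illustration; the counts of synonymous and nonsynonymous sites are assumed to have been computed elsewhere (PAML or a similar tool does this properly).

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differences p for multiple hits."""
    if p >= 0.75:
        raise ValueError("proportion too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs, syn_diffs, nonsyn_sites, syn_sites):
    """Return (dN, dS, dN/dS) from difference counts and site counts."""
    p_n = nonsyn_diffs / nonsyn_sites  # nonsynonymous differences per nonsynonymous site
    p_s = syn_diffs / syn_sites        # synonymous differences per synonymous site
    d_n = jukes_cantor(p_n)
    d_s = jukes_cantor(p_s)
    if d_s == 0:
        raise ValueError("dS is zero, so the ratio is undefined")
    return d_n, d_s, d_n / d_s


# Hypothetical 300-codon gene: 690 nonsynonymous sites, 210 synonymous sites,
# 6 nonsynonymous and 4 synonymous differences between the two strains.
d_n, d_s, ratio = dn_ds(6, 4, 690.0, 210.0)
```

With these made-up numbers there are more nonsynonymous than synonymous differences, yet the ratio comes out at roughly 0.45, because there are many more nonsynonymous sites to mutate. A gene needs far more nonsynonymous SNPs than synonymous ones before dN/dS exceeds 1.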



      • #4
        There are 3 nt positions in a codon. In general, nt changes (SNPs) in the 1st or 2nd position of a codon result in an amino acid change (are nonsynonymous), whereas many nt changes in the third position do not (are synonymous). A quick back-of-the-envelope calculation therefore suggests that if mutations occur randomly (i.e. are equally likely in any position of the codon) and no evolutionary selective forces are acting (i.e. most mutations are selectively neutral), one would expect two thirds (about 67%) of the coding SNPs to be nonsynonymous and one third (about 33%) to be synonymous. If you do the actual calculation over all possible nt changes in all of the codons, I think the actual ratio is more like 70% nonsynonymous and 30% synonymous.

        Now, if you are using real-world data and not theoretical data, then selective forces are acting, and nonsynonymous SNPs are more likely to have consequences than synonymous ones. If we assume that bacteria are the products of eons of evolution, and that most of their enzymes are therefore fairly highly evolved, then nonsynonymous mutations are more likely to be deleterious than beneficial, meaning that amino acid changing mutations are more likely to result in a strain that is less fit and less able to compete. Over time these strains, and therefore the nonsynonymous mutations, will be selectively lost.

        However, if your strains are closely related and were isolated fairly close to each other in time (say, a clonal outbreak of a pathogenic bacterium), then you are unlikely to observe much loss of nonsynonymous mutations, because there has not been enough time for the slightly less fit strains to be competed out of the population.

        If, on the other hand, you are comparing more distantly related strains that have not shared a recent common ancestor, then the strains will likely differ by many more SNPs (because there has been more time for random mutations to occur), but there will be a lower proportion of nsSNPs because there has been time for evolutionary selective forces to remove less fit polymorphisms.
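The "actual calculation" mentioned in this post can be done by brute force. The sketch below enumerates every single-base change from each of the 61 sense codons of the standard genetic code and counts how many keep the amino acid. It assumes all codons and all substitutions are equally likely (no codon usage bias, no transition/transversion bias), which real genomes do not satisfy; the function name is made up for illustration.

```python
from itertools import product

# Standard genetic code in the usual TCAG ordering (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_single_base_changes():
    """Classify all single-base changes starting from sense codons."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # start only from sense codons
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1
                else:
                    counts["missense"] += 1
    return counts


counts = count_single_base_changes()
total = sum(counts.values())
for kind, n in counts.items():
    print(f"{kind}: {n} of {total} ({100 * n / total:.1f}%)")
```

Under these assumptions, 134 of the 549 possible changes are synonymous (about 24%), so the neutral expectation is closer to 76% nonsynonymous than to the 70% estimated above. Either way, the point of the post stands: more nonsynonymous than synonymous SNPs is the expected pattern when selection has had little time to act.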



        • #5
          Dear all,

          Many thanks for your wonderful replies. ALL are helpful! I have also calculated dN/dS, and all the results are reasonable, though they differ slightly between genomes.

          This is my first work on SNPs. I had misconceptions about SNPs before I read your comments.
          westerman taught me to be more confident in my own results. The number of nonsynonymous SNPs can be larger than the number of synonymous SNPs (sbberes gave a very clear explanation).
          epigen pointed out that if positive selection acts on a gene, dN/dS would be bigger than 1.

          In short: for bacterial genes, generally # of nonsynonymous > # of synonymous, but dN/dS < 1 when no obvious positive selection is observed.

          Great help and interesting topic.

          Thanks again.
          Salmon
          Last edited by Salmon; 09-14-2010, 08:35 PM.



          • #6
            Thank you, everybody on this thread, for the questions and the answers. This thread helped me clarify my ideas about positive selection.
