  • automating the determination of a core genome set from annotated genomes

    Dear all

    I am currently working on some large data sets where I have assembled and annotated closely related bacterial genomes. In order to identify "core" and "accessory" genes within my strain set I have been slowly working through Artemis and tagging genes as orthologues. This becomes extremely time consuming as the number of genomes increases. Does anyone know of an automated process for this? I have considered using Mauve or Mugsy, but to my knowledge these only allow extraction of blocks of sequence, not actual orthologous genes. In essence what I want is a text file listing the genes present in all the strains and the genes unique to individual strains.

    Any help would be greatly appreciated

    Alan

  • #2
    OrthoMCL does a good job. The software predicts orthology based on reciprocal BLAST. The output gives you a general picture of what you want. However, for more specific details you will want to reconstruct some trees with the putative orthologues. The software does not take gene synteny into account, because it needs only the coding sequences. In my hands, though, most of the predicted orthologues show clear synteny.
    Here is the link:
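
Once you have ortholog groups, getting the core and strain-specific lists is a small scripting job. Below is a minimal Python sketch, not part of OrthoMCL itself. It assumes a groups file with one group per line in the form `group_id: strain|gene strain|gene ...`; the helper names (`parse_groups`, `classify`) and the example strain and gene IDs are made up for illustration.

```python
from collections import defaultdict


def parse_groups(lines):
    """Parse 'group_id: strain|gene strain|gene ...' lines (assumed format)
    into {group_id: {strain: [gene, ...]}}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, _, members = line.partition(":")
        by_strain = defaultdict(list)
        for member in members.split():
            strain, _, gene = member.partition("|")
            by_strain[strain].append(gene)
        groups[name.strip()] = dict(by_strain)
    return groups


def classify(groups, strains):
    """Split groups into core (present in every strain), accessory (present
    in some but not all strains) and unique (present in exactly one strain)."""
    strains = set(strains)
    core, accessory, unique = [], [], defaultdict(list)
    for name, by_strain in groups.items():
        present = set(by_strain) & strains
        if present == strains:
            core.append(name)
        elif len(present) == 1:
            unique[next(iter(present))].append(name)
        elif present:
            accessory.append(name)
    return core, accessory, dict(unique)


# Hypothetical example data; in practice, read the lines from the groups file.
example = [
    "grp1: strainA|a1 strainB|b1 strainC|c1",
    "grp2: strainA|a2 strainB|b2",
    "grp3: strainC|c9 strainC|c10",
]
groups = parse_groups(example)
core, accessory, unique = classify(groups, ["strainA", "strainB", "strainC"])
print("core:", core)
print("accessory:", accessory)
print("unique:", unique)
```

One caveat: genes that were never clustered into any group will not appear in a groups file like this, so to get a complete list of strain-unique genes you would probably also need to compare against each strain's full gene list.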


    • #3
      Alan, if you supply GenBank files as Mauve input, so that it knows where the genes are, you will find it offers the ability to export an orthologue report. It's a menu option.


      • #4
        Many thanks, gents. I will check out both options tomorrow. It's problems like this that make me think I need to find a really good Perl course.
