  • SeqMonk - Flexible analysis of mapped reads

    http://www.bioinformatics.bbsrc.ac.uk/projects/seqmonk/

    SeqMonk is a visualisation and analysis package for mapped read data. It is designed to be easy to use and to provide a variety of tools to visualise your data against an annotated genome, quantitate it, and filter it to create lists of regions of interest. We have been using this software as our primary analysis tool for over a year now on a variety of different types of experiment, and it is still under active development.

    SeqMonk is a cross-platform tool and runs on Windows, OS X and Linux. All you need to get started is some mapped data in a text file. It can directly read BED, GFF and Eland (sorted or export) files, and also has a generic import tool for other formats.

    We have been running SeqMonk training courses for a while now, and all of our training material can also be downloaded from our site to help you get started.

    We are happy to receive feedback about SeqMonk (either positive or negative!). Bug reports or enhancement requests can be put into our Bugzilla database or sent directly to me.

    SeqMonk is free software released under the GPL.

  • #2
    SAM import

    Hey Simon,

    looks like a nice program, but it's not possible to import SAM files at the moment, as they don't give strand and stop position information. BAM/SAM import would be nice to have!

    Cheers


    • #3
      If you can give me an example of SAM output I'd be happy to add in an input filter for this in the next release.


      • #4
        The SAM format and its binary counterpart BAM are described on the samtools page at http://samtools.sourceforge.net/; see the format specification link there.
        Your competitor, the Integrative Genomics Viewer from the Broad Institute, reads BAM already.
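
        On the earlier point about strand and stop position: SAM has no explicit columns for them, but both can be derived from each record. The sketch below assumes the standard SAM layout (FLAG in column 2, 1-based POS in column 4, CIGAR in column 6), takes strand from FLAG bit 0x10, and computes the end by summing the CIGAR operations that consume reference bases. The function name is made up for illustration, and this is not how SeqMonk itself imports data.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def sam_line_to_interval(line):
    """Return (chrom, start, end, strand) for a mapped SAM record, else None.

    Coordinates are 1-based and inclusive, matching the SAM POS field.
    Hypothetical helper for illustration only.
    """
    if line.startswith("@"):  # header line, not an alignment
        return None
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":  # unmapped read or no alignment given
        return None
    ref_len = sum(
        int(count) for count, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING
    )
    strand = "-" if flag & 0x10 else "+"  # bit 0x10: read is reverse-complemented
    return chrom, pos, pos + ref_len - 1, strand
```

        Soft clips, hard clips and insertions do not advance along the reference, so they are left out of the length.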


        • #5
          Other genomes

          I agree SeqMonk seems like a very useful program. How are new genomes created? I have several bacterial genomes I would like to use, but there is no guidance in the documentation as to how to format the files.


          • #6
            Originally posted by Ryanw:
            I agree SeqMonk seems like a very useful program. How are new genomes created? I have several bacterial genomes I would like to use, but there is no guidance in the documentation as to how to format the files.
            For most users the genomes can be downloaded from within the program from the precompiled set we have available. All of the data comes straight out of Ensembl and is just slightly reformatted for use within the program.

            If you want to create your own genomes then it's actually fairly simple. The genomes are stored in EMBL format files (actually just the headers, to save the space of storing the sequence, but leaving the sequence on there won't hurt). The only change you need to make from a standard EMBL file is a particular format for the accession line so that SeqMonk can figure out the chromosome name. These files are then placed into a standard directory structure in the program's genomes folder.

            In-house we've made up genome files for E. coli by adapting the public K12 sequence, and it should be similarly easy for any other published genome. The only slight limitation is that SeqMonk currently has no concept of circular genomes, so any reads which span the join would be discarded.

            In the latest release we've included the Ensembl API script we use for generating new genomes in-house. With the Ensembl bacterial genomes project moving forward, it would probably be very easy to use this to generate a wide range of bacterial genomes.

            If you have a particular species you're interested in then contact me and I'll look at making up a genome file for you.


            • #7
              Hi! SeqMonk is very useful, thanks. Would there be any way of implementing display of interval files with thick and thin lines (BED's thickStart and thickEnd), in my case for displaying mapped splice junctions?

              Regards,
              Johannes Waage
              Uni of Copenhagen
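
              For reference, here is a rough sketch of what the thick/thin split looks like when read from a BED line. It assumes the standard tab-separated BED column order (thickStart and thickEnd in columns 7 and 8, 0-based half-open coordinates) and falls back to drawing the whole feature thick when those columns are missing. The function name is hypothetical and this says nothing about how SeqMonk would draw it.

```python
def split_thick_thin(bed_line):
    """Split one BED feature into thin and thick segments.

    Returns (chrom, thin_segments, thick_segment), where each segment is a
    (start, end) pair in 0-based half-open BED coordinates.
    Hypothetical helper for illustration only.
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if len(fields) >= 8:
        thick_start, thick_end = int(fields[6]), int(fields[7])
    else:
        # Without thickStart/thickEnd, treat the whole feature as thick.
        thick_start, thick_end = start, end
    thin = []
    if thick_start > start:
        thin.append((start, thick_start))  # thin part left of the thick region
    if thick_end < end:
        thin.append((thick_end, end))  # thin part right of the thick region
    return chrom, thin, (thick_start, thick_end)
```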


              • #8
                Originally posted by jwaage:
                Hi! SeqMonk is very useful, thanks. Would there be any way of implementing display of interval files with thick and thin lines (BED's thickStart and thickEnd), in my case for displaying mapped splice junctions?
                I've looked at the idea of having this sort of linked region, either for mRNA mapping or simply for denoting the ends of paired reads. The problem with doing that is that you are potentially storing quite a bit more information per read than we do currently. SeqMonk keeps all reads in memory so it can update its display instantly and flip between different genomic locations with next to no delay - however, this means we have to be very careful about managing memory usage. When you are storing 100 million+ reads, every extra piece of data you store can have a big impact.

                We're going to be doing more work on spliced RNA sequencing in the near future so I'll look into better ways of representing this data in future versions.
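
                To put a rough number on the memory concern above, here is a back-of-the-envelope sketch. The 8 bytes per read is purely an assumption (two 4-byte integers for a thick start and end) and not SeqMonk's actual storage layout; the function names are made up for illustration.

```python
def extra_memory_bytes(n_reads: int, bytes_per_read: int) -> int:
    """Total extra bytes needed if every read stores bytes_per_read more."""
    return n_reads * bytes_per_read


def as_megabytes(n_bytes: int) -> float:
    """Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return n_bytes / 1_000_000


if __name__ == "__main__":
    # Assumed: 100 million reads, two extra 4-byte integers per read.
    extra = extra_memory_bytes(100_000_000, 8)
    print(f"{as_megabytes(extra):.0f} MB extra")
```

                Under those assumptions the two extra fields alone come to 800 MB, before any per-object overhead the runtime might add.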
