**ShellfishGene** · 07-22-2009, 01:50 AM

SAM import

Hey Simon,

looks like a nice program, but it's not possible to import SAM files at the moment, as they don't give strand and stop position information. BAM/SAM import would be nice to have!

Cheers

**simonandrews** · 07-22-2009, 03:00 AM

If you can give me an example of SAM output I'd be happy to add in an input filter for this in the next release.

**ShellfishGene** · 07-22-2009, 04:30 AM

The SAM format and its binary counterpart BAM are described on the samtools page at http://samtools.sourceforge.net/; see the format specification link.
Your competitor, the Integrative Genomics Viewer from the Broad Institute, reads BAM already.
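
For reference, SAM has no dedicated strand or end-position column, but both can be derived from each record: the strand is bit 0x10 of the FLAG field, and the end position is the 1-based POS plus the number of reference bases consumed by the CIGAR string. The sketch below illustrates that derivation only; it is not SeqMonk's import filter, and the function name `strand_and_span` is made up for this example.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def strand_and_span(sam_line):
    """Return (chromosome, start, end, strand) for one SAM alignment line.

    Coordinates are 1-based and inclusive. Returns None for unmapped reads.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    chrom = fields[2]
    start = int(fields[3])  # POS: 1-based leftmost mapped position
    cigar = fields[5]

    # Flag bit 0x4 marks an unmapped read; "*" means no CIGAR is available.
    if flag & 0x4 or cigar == "*":
        return None

    # Sum the lengths of the operations that advance along the reference.
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    end = start + ref_len - 1

    # Flag bit 0x10 marks a read aligned to the reverse strand.
    strand = "-" if flag & 0x10 else "+"
    return chrom, start, end, strand


# Hypothetical record: reverse-strand read starting at chr1:100.
example = "read1\t16\tchr1\t100\t30\t5S20M2D10M3I5M\t*\t0\t0\t*\t*"
print(strand_and_span(example))  # ('chr1', 100, 136, '-')
```

Soft clips (S) and insertions (I) do not advance along the reference, which is why only 20 + 2 + 10 + 5 = 37 bases count towards the span in the example.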

**Ryanw** · 07-22-2009, 06:43 AM

Other genomes

I agree SeqMonk seems like a very useful program. How are new genomes created? I have several bacterial genomes I would like to use, but there is no guidance in the documentation as to how to format the files.

**simonandrews** · 07-23-2009, 05:02 AM

Originally posted by Ryanw:

> I agree SeqMonk seems like a very useful program. How are new genomes created? I have several bacterial genomes I would like to use, but there is no guidance in the documentation as to how to format the files.

For most users, the genomes can be downloaded from within the program from the precompiled set we have available. All of the data comes straight out of Ensembl and is just slightly reformatted for use within the program.

If you want to create your own genomes then it's actually fairly simple. The genomes are stored in EMBL format files (actually just the headers, to save the space of storing the sequence, but leaving the sequence in won't hurt). The only change you need to make from a standard EMBL file is a particular format for the accession line so that SeqMonk can figure out the chromosome name. These files are then placed into a standard directory structure in the program's genomes folder.

In house we've made up genome files for E. coli by adapting the public K12 sequence, and it should be similarly easy for any other published genome. The only slight limitation is that SeqMonk currently has no concept of circular genomes, so any reads which span the join would be discarded.

In the latest release we've included the EnsemblAPI script we use for generating new genomes in house. With the Ensembl bacterial genomes project moving forward it would probably be very easy to use this to generate a wide range of bacterial genomes.

If you have a particular species you're interested in then contact me and I'll look at making up a genome file for you.

**jwaage** · 07-24-2009, 03:19 AM

Hi! SeqMonk is very useful, thanks. Would there be any way of implementing display of interval files with thick and thin lines (BED's thickStart and thickEnd), in my case for displaying mapped splice junctions?

Regards,
Johannes Waage
Uni of Copenhagen

**simonandrews** · 07-24-2009, 04:12 AM

Originally posted by jwaage:

> Hi! SeqMonk is very useful, thanks. Would there be any way of implementing display of interval files with thick and thin lines (BED's thickStart and thickEnd), in my case for displaying mapped splice junctions?

I've looked at the idea of having this sort of linked region, either for mRNA mapping or simply for denoting the ends of paired reads. The problem with doing that is that you are potentially storing quite a bit more information per read than is stored currently. SeqMonk keeps all reads in memory so it can update its display instantly and flip between different genomic locations with next to no delay; however, this means we have to be very careful about managing memory usage. When you are storing 100 million+ reads, every extra piece of data you store can have a big impact.

We're going to be doing more work on spliced RNA sequencing in the near future so I'll look into better ways of representing this data in future versions.
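
To put rough numbers on the memory argument above: the calculation below is only back-of-the-envelope arithmetic, assuming each extra coordinate is stored as a 32-bit integer and ignoring any per-object overhead. It does not describe how SeqMonk actually stores reads, and `extra_memory_mb` is a made-up helper name.

```python
def extra_memory_mb(n_reads, extra_bytes_per_read):
    """Extra memory, in MB (10**6 bytes), needed to add a fixed payload to every read."""
    return n_reads * extra_bytes_per_read / 1e6


# Assumption: thickStart and thickEnd each stored as a 4-byte integer,
# so 8 extra bytes per read, across 100 million reads.
print(extra_memory_mb(100_000_000, 8))  # 800.0
```

Under those assumptions, two extra coordinates per read add about 800 MB for 100 million reads, before counting any other bookkeeping.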

SeqMonk - Flexible analysis of mapped reads
