SEQanswers

Old 06-24-2010, 07:38 AM   #1
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default SeqMonk v0.10.0 released

We have just released a major update to our SeqMonk package for the analysis of large mapped sequence datasets on normal desktop PCs (or Macs). This release features a host of new functionality including:
  • Quantitation probes are now directional, meaning you can (for example) count the number of antisense reads for every transcript in the genome
  • There is now support for either removing duplicate reads upon import or ignoring them in every quantitation method
  • You can now extend reads when importing them to make the analysis of single end ChIP-Seq data easier
  • You can import methylation call data directly from bisulphite seq files mapped with bismark
  • There is a generic text import option for importing new annotation tracks
  • The trend plot can now smooth your data interactively
  • There is a mechanism to update genome annotation sets when new annotations become available
  • Loading and saving of data is now at least twice as fast (often more)
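
To make two of the features above concrete (read extension on import and directional quantitation), here is a minimal Python sketch. It is not SeqMonk's code: the `Read` type, the function names and the 1-based inclusive coordinates are assumptions made for illustration only.

```python
from typing import NamedTuple


class Read(NamedTuple):
    start: int   # 1-based, inclusive (assumed convention)
    end: int     # 1-based, inclusive
    strand: str  # "+", "-" or "." for unknown


def extend_read(read: Read, extend_by: int) -> Read:
    """Extend a read in its 3' direction by extend_by bases."""
    if read.strand == "+":
        return read._replace(end=read.end + extend_by)
    if read.strand == "-":
        return read._replace(start=max(1, read.start - extend_by))
    return read  # unknown strand: no direction to extend in


def count_overlapping(reads, probe: Read, mode: str = "all") -> int:
    """Count reads overlapping a probe; mode is "all", "sense" or "antisense"."""
    count = 0
    for r in reads:
        if r.end < probe.start or r.start > probe.end:
            continue  # no overlap with the probe
        if mode == "sense" and r.strand != probe.strand:
            continue
        if mode == "antisense" and (r.strand == "." or r.strand == probe.strand):
            continue
        count += 1
    return count


reads = [Read(100, 135, "+"), Read(300, 335, "-"), Read(500, 535, "+")]
probe = Read(150, 250, "+")
extended = [extend_read(r, 100) for r in reads]
print(count_overlapping(reads, probe))                  # before extension
print(count_overlapping(extended, probe, "sense"))      # same strand as probe
print(count_overlapping(extended, probe, "antisense"))  # opposite strand
```

The short single-end reads miss the probe entirely until they are extended towards the presumed fragment end, which is the reason extension helps with single-end ChIP-Seq data.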

You can get the new release from our project website at:

http://www.bioinformatics.bbsrc.ac.uk/projects/seqmonk/

[If you see the old release press shift+refresh to force our web cache to update]

We're keen to hear feedback about the program (good or bad!).
Old 06-25-2010, 11:33 AM   #2
mattanswers
Member
 
Location: Boston

Join Date: Oct 2009
Posts: 65
Default

Thanks Simon for this very nice program. I use it for Chip-Seq and it is really nice to be able to load a file right from alignment without having to write Perl scripts to reformat the data or to extract only a specific portion of the alignment data. Being able to go right from the output of an alignment file to viewing anywhere in the genome is really nice.

I now have 8GB of RAM on the computer and I can easily load 7 or more lanes of data and navigate around anywhere in the genome very quickly and easily. This is very nice for Chip-Seq when checking to see what the pattern of reads is in various genes located throughout the genome, as well as to have a direct comparison of control and treatment samples at the same time, or to compare our own data with publicly available data.

However, some in the lab (who don't work with the data but who do have an important opinion) prefer that the reads are not ordered from the center moving out, but from a baseline going up. In Chip-Seq this would highlight more a 'peak' of reads. Or, also possibly separating reads on different strands so that reads on one strand were above the baseline and reads on the other strand were below the baseline. This would also be visually helpful for Chip-Seq.

Thanks again for a really helpful program.
Old 06-25-2010, 12:11 PM   #3
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by mattanswers View Post
However, some in the lab (who don't work with the data but who do have an important opinion) prefer that the reads are not ordered from the center moving out, but from a baseline going up. In Chip-Seq this would highlight more a 'peak' of reads. Or, also possibly separating reads on different strands so that reads on one strand were above the baseline and reads on the other strand were below the baseline. This would also be visually helpful for Chip-Seq.
I'm going to look at some more layout options for the next release. We're starting to find that the volume of data people are now producing means that the display can become saturated with reads when you start looking at larger regions. We may therefore put in a higher-density view where every read is slimmer - as long as the increased number of objects doesn't adversely affect the performance of the program.

I'll have a play with the suggestions you made. Using a baseline rather than a centre based layout is a trivial change - but I'm slightly wary of offering too many options. The split forward / reverse view would also be easy to implement, but leaves a problem with reads with unknown strand (which is possible within the program).
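
For anyone wondering how the layouts discussed here differ, the following is a rough Python sketch, not SeqMonk's actual drawing code. All function names are hypothetical, and the greedy row packing is just one simple way of stacking overlapping reads.

```python
def pack_rows(intervals):
    """Greedily place (start, end) intervals in the lowest row with no overlap."""
    row_ends = []  # last occupied position of each row
    placed = []    # (interval, row index)
    for start, end in sorted(intervals):
        for row, last_end in enumerate(row_ends):
            if start > last_end:
                row_ends[row] = end
                placed.append(((start, end), row))
                break
        else:
            row_ends.append(end)
            placed.append(((start, end), len(row_ends) - 1))
    return placed


def centred_level(row):
    """Rows alternate around a centre line: 0, +1, -1, +2, -2, ..."""
    half = (row + 1) // 2
    return half if row % 2 else -half


def baseline_level(row):
    """Rows stack upwards from a baseline: 0, 1, 2, ..."""
    return row


def split_layout(reads):
    """Forward reads go above the baseline, reverse reads below.

    Reads are (start, end, strand) tuples. Reads whose strand is neither
    "+" nor "-" are returned separately, since the split view has no
    obvious place for them.
    """
    forward = [(s, e) for s, e, strand in reads if strand == "+"]
    reverse = [(s, e) for s, e, strand in reads if strand == "-"]
    unknown = [(s, e) for s, e, strand in reads if strand not in ("+", "-")]
    levels = [(iv, row + 1) for iv, row in pack_rows(forward)]
    levels += [(iv, -(row + 1)) for iv, row in pack_rows(reverse)]
    return levels, unknown
```

The packing step is the same in every layout; only the mapping from row index to vertical position changes, which is why switching from a centre-based to a baseline layout is a small change. The `unknown` list returned by `split_layout` is the leftover problem mentioned above: reads with no strand do not belong on either side.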

I should point out that for a more quantitative view it's much better to use the quantitation tools. Packed read views can be difficult to interpret when you have a lot of data and the quantitation views are a more reliable way of looking at exactly how much data is there and automatically filtering out regions of interest.
Old 06-25-2010, 12:19 PM   #4
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

I've just updated our training manual for SeqMonk to include the new features added in 0.10.0. I've also added a section on the analysis of bisulphite seq data. All of our training material can be downloaded from:

http://www.bioinformatics.bbsrc.ac.u...g.html#seqmonk
Old 06-27-2010, 01:20 AM   #5
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

I had a play with the layout code in SeqMonk and it's actually proved fairly easy to add some more options. The next release will allow you to change the density with which reads are packed, and to split forward and reverse reads into separate parts of the display.

I've attached a few screenshots to show what a region of ChIP-Seq data looks like now, and under these new layouts.
Attached Images
File Type: jpg original_view.jpg (22.0 KB, 32 views)
File Type: jpg high_density.jpg (20.4 KB, 28 views)
File Type: jpg split_reads.jpg (20.3 KB, 23 views)
Old 06-28-2010, 05:06 AM   #6
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

I've just put out v0.10.1 of SeqMonk. This fixes a couple of drawing optimisation bugs which meant that the chromosome view was drawing far more objects than it needed to. The view should be much quicker now that these are fixed.

Since I had the layout changes ready I've also added these in to the latest release so you should now be able to try out the new layout options I showed in the last post.
Old 06-28-2010, 10:05 AM   #7
mattanswers
Member
 
Location: Boston

Join Date: Oct 2009
Posts: 65
Default

Thank you very much. Looks great. I will download it and give it a try.
Old 06-28-2010, 12:26 PM   #8
lparsons
Member
 
Location: NJ

Join Date: Nov 2008
Posts: 28
Question

This looks like an excellent program and seems it would be very useful for Chip-Seq data. However, I couldn't figure out how to add new genomes that aren't available on the server. Is there a way to add my own genome?
Old 06-28-2010, 11:23 PM   #9
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by lparsons View Post
This looks like an excellent program and seems it would be very useful for Chip-Seq data. However, I couldn't figure out how to add new genomes that aren't available on the server. Is there a way to add my own genome?
Included in the install is a document called CREATING_CUSTOM_GENOMES.txt which describes how to make your own custom genome. It's fairly straightforward - the program uses EMBL header files to describe the genome. The only real change is turning the accession line into a standard format so that SeqMonk can parse chromosome names and lengths from the file.

We'll be expanding the coverage of species very soon since we've just converted the scripts we had to process vertebrate species so that we can also process all genomes in Ensembl plant and bacteria. Those new genomes have started to appear already and more will be added over the next few weeks. If you have a request for a genome then let me know and I'll move it up the list.
Old 06-29-2010, 09:43 AM   #10
lparsons
Member
 
Location: NJ

Join Date: Nov 2008
Posts: 28
Default

Excellent. Thanks for pointing that file out. I had installed it on OS X, so it wasn't obvious those files were there. So far, it's looking good.
Old 07-02-2010, 12:12 AM   #11
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

I've just added around 180 new bacterial genomes to the online repository for SeqMonk. There are more to come from other phyla, but hopefully this will cover the needs of a lot of people who are looking at making custom genomes.
Old 09-13-2010, 06:59 AM   #12
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default SeqMonk v0.11.0 released

I've just released SeqMonk v0.11.0. This fixes some bugs in SAM file parsing, and also adds some new features:
  • You can now create replicate sets to group together biological replicates
  • There is a replicate statistical filter which allows a statistical comparison of groups of replicates
  • I've added an aligned probes view to look at trends across hundreds of probes simultaneously
  • The boxwhisker plot now supports viewing several probe lists for the same data store
  • The Bismark import filter now supports splitting non-CpG methylation into CHH and CHG context
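
As background for the last item, the three methylation contexts are defined by the bases that follow the cytosine on the same strand: CpG is C followed by G, CHG is C, then H (A, C or T), then G, and CHH is C followed by two H bases. The sketch below illustrates those standard definitions only; it is not how Bismark or SeqMonk classify calls internally, and the function name is made up.

```python
def cytosine_context(seq: str, pos: int) -> str:
    """Classify the cytosine at 0-based index pos of a strand sequence."""
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position does not hold a cytosine")
    following = seq[pos + 1:pos + 3]  # up to two downstream bases
    if following[:1] == "G":
        return "CpG"
    if len(following) < 2 or "N" in following:
        return "unknown"  # too close to the sequence end, or ambiguous base
    if following[1] == "G":
        return "CHG"
    return "CHH"


for seq in ("ACGT", "ACAGT", "ACATT", "ACA"):
    print(seq, cytosine_context(seq, 1))
```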

You can get the new release from our project website at:

http://www.bioinformatics.bbsrc.ac.uk/projects/seqmonk/

[If you see the old release press shift+refresh to force our web cache to update]

I've also put up some more screenshots of the program to show what some of the new displays look like.
Old 03-02-2015, 07:33 AM   #13
chris202
Junior Member
 
Location: belgium

Join Date: Nov 2014
Posts: 7
Default seqmonk: get probe value

Hi,
Not sure this is the right place to ask this but I'm using seqmonk, and I'm running a simple "read count quantitation". I'd simply like to get a text file with the basic info (chr, start, stop, and value).
I can get that by simply clicking on the name of the probe list, however, this doesn't display the value (it's always a "?").
Does anyone know how to solve this ?
Old 03-02-2015, 07:35 AM   #14
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by chris202 View Post
Hi,
Not sure this is the right place to ask this but I'm using seqmonk, and I'm running a simple "read count quantitation". I'd simply like to get a text file with the basic info (chr, start, stop, and value).
I can get that by simply clicking on the name of the probe list, however, this doesn't display the value (it's always a "?").
Does anyone know how to solve this ?
Once you've done a quantitation you should just be able to make up an annotated probe report (Reports > Annotated Probe Report) to make a report with the positions of the probes and the current quantitation for all of the data stores you have visible.
Old 03-02-2015, 08:21 AM   #15
chris202
Junior Member
 
Location: belgium

Join Date: Nov 2014
Posts: 7
Default

Okay thank you very very much !
However I just realized the quantitation for each probe is not what I was expecting. In fact I see for example that one probe houses 8 reads (no duplicates), but the displayed value is 6,383.
(and the only quantitation I ran is the "read count quantitation" which should give me a value of 8 for this probe if I understand correctly)
Old 03-02-2015, 11:27 AM   #16
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by chris202 View Post
Okay thank you very very much !
However I just realized the quantitation for each probe is not what I was expecting. In fact I see for example that one probe houses 8 reads (no duplicates), but the displayed value is 6,383.
(and the only quantitation I ran is the "read count quantitation" which should give me a value of 8 for this probe if I understand correctly)
Without seeing what options you used for the quantitation it's difficult to know why you didn't get the value you expected.

Assuming you have a normal, unspliced library and that you really want raw counts then you would indeed use the read count quantitation, but you'll need to turn off the option to correct for total count and the option to log transform the counts. You should then get the counts you expect.
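
To illustrate why those two options change the number you see, here is a small Python sketch. The exact formula SeqMonk applies is not given in this thread, so the per-million scaling, the base-2 log and the total of 2,000,000 reads are all assumptions chosen for the example.

```python
import math


def read_count_quantitation(raw_count, total_reads,
                            correct_for_total=True,
                            log_transform=True,
                            per=1_000_000):
    """Hypothetical read count quantitation with two optional corrections."""
    value = float(raw_count)
    if correct_for_total:
        # Scale to reads per `per` reads in the whole data store (assumed).
        value = value * per / total_reads
    if log_transform:
        # Assumed log base 2; zero counts have no defined log here.
        value = math.log2(value) if value > 0 else float("nan")
    return value


total = 2_000_000  # made-up library size
print(read_count_quantitation(8, total))                # both options on
print(read_count_quantitation(8, total, False, False))  # raw count
```

With both options switched off the function returns the raw count of 8; with either one on, the reported value no longer looks like a simple count.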

If you still can't make it work then can you make up a probe list description report (Reports > Probe List Description Report) from your All Probes list and email it to me? If you could also send me an image of the chromosome view for one of the features you think is incorrectly quantitated I can try to figure out what's going on in your case.

Tags
release, seqmonk, software

All times are GMT -8.