SEQanswers

Old 07-02-2013, 12:39 PM   #201
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by crazyhottommy View Post
I did not quite get it.

I think you mean to say:

10, 12, 8, 10 are values for 4 probes in the same sample ( or in my case TF1)

20,22,18,20 are values for the same 4 probes in my other sample ( the other TF2)
No, the other way around.

10,12,8,10 are the values for 1 probe across 4 different samples.

20,22,18,20 are the values for a different probe against the same samples.

Per-probe normalisation lets you compare the relative changes in each probe by removing the effect of its absolute values. In this case you'd be interested that sample 2 had the highest value and sample 3 the lowest, even though the absolute values of the two probes are quite different.
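
As a rough illustration of the idea (a minimal sketch, not SeqMonk's actual implementation), the two example probes can be centred on their own means in Python. Subtracting the per-probe mean is an assumption made here for simplicity, and the `per_probe_centre` helper is hypothetical.

```python
def per_probe_centre(values):
    """Subtract a probe's own mean from each of its per-sample values."""
    centre = sum(values) / len(values)
    return [v - centre for v in values]

probe_a = [10, 12, 8, 10]   # one probe across 4 samples
probe_b = [20, 22, 18, 20]  # a different probe, same 4 samples

print(per_probe_centre(probe_a))  # [0.0, 2.0, -2.0, 0.0]
print(per_probe_centre(probe_b))  # [0.0, 2.0, -2.0, 0.0]
```

After centring, both probes show the same pattern (sample 2 highest, sample 3 lowest), which is the relative change the normalisation is meant to expose.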
Old 08-12-2013, 08:38 AM   #202
Aspadia
Junior Member
 
Location: Europe

Join Date: Aug 2013
Posts: 4
Default

Hi Simon,

Seqmonk sounds really awesome, but I can't get it to run.
I have downloaded it, but when I try to run it, it says it cannot find java. When I type java -version in cmd it says 'java' is not recognized as an internal or external command, operable program or batch file.

In a previous reply you mentioned this:
Java isn't installed properly, or the java binary isn't in your path. Open a command prompt and type 'java -version' if you get an error saying this isn't a recognised command then this is the problem.
So I guess the second part is my problem but how can I solve this? I just installed the latest version of Java (Version 7 Update 25).

Hope you can help me, thanks in advance!
Old 08-12-2013, 09:11 AM   #203
mathew
Member
 
Location: australia

Join Date: Jan 2011
Posts: 81
Default Seqmonk and Java

Simon,

The Java and Seqmonk problem is now creating a real headache. I am in the same boat. Perhaps at some point the dependency of Seqmonk on Java needs to be thoroughly investigated. I agree it may be because of Java updates, but we want Seqmonk to work.

Thanks
Old 08-12-2013, 10:10 AM   #204
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

If java is not found on your command line, the most likely cause we've seen is that you have installed the 32 bit version of java on a 64 bit machine. Oracle only show the 32 bit version of java on the front page of java.com; to get the 64 bit version you need to click on "See all java downloads" and then get the 64 bit version.

You can install 32 and 64 bit side by side so adding the 64 bit version won't mess up your browser, and you'll need the 64 bit version to be able to use more than 2GB memory.
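
For anyone who wants to check this without guessing, here is a minimal Python sketch that looks for java on the PATH and inspects the `java -version` banner. It assumes the JVM prints "64-Bit" in its banner, as Oracle's HotSpot builds do; other JVMs may word this differently, and the helper names are made up for this example.

```python
import shutil
import subprocess

def java_on_path():
    """Return the full path of the java binary found on PATH, or None."""
    return shutil.which("java")

def looks_64bit(version_text):
    """Guess bitness from the banner text printed by `java -version`."""
    return "64-Bit" in version_text

def describe_java():
    path = java_on_path()
    if path is None:
        return "java not found on PATH"
    # `java -version` writes its banner to stderr rather than stdout.
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    banner = result.stderr + result.stdout
    bits = "64-bit" if looks_64bit(banner) else "probably 32-bit"
    return f"{path}: {bits}"

if __name__ == "__main__":
    print(describe_java())
```

If this reports that java is not found, or that it is probably 32-bit, on a 64 bit machine, installing the 64 bit version as described above is the likely fix.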

Hope this helps
Old 08-26-2013, 08:12 AM   #205
markf
Junior Member
 
Location: Belgium

Join Date: Sep 2009
Posts: 3
Default Data import

Dear Simon,

I have a question about seqmonk: Is it possible to create my own probesets & quantifications outside of seqmonk, and then load them? For example from GFF, and assign the GFF score as the quantification score for that probe.

I have my own set of peak calls (from a 4C experiment, with my own preprocessing, and using R3Cseq) - and I was hoping to use seqmonk for some sanity checking, visualization and report generation.

Thanks.
Mark

PS - keep up the good work - seqmonk is great!
Old 08-26-2013, 08:31 AM   #206
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by markf View Post
Dear Simon,

I have a question about seqmonk: Is it possible to create my own probesets & quantifications outside of seqmonk, and then load them? For example from GFF, and assign the GFF score as the quantification score for that probe.
Yes, and no. It's pretty easy to import external probe positions, for example from a peak caller, and we do this all the time. The best way to do this (and soon the only way to do it) is to import the peak positions as an annotation track using whichever type of annotation importer makes sense (text, GFF etc), and then use the feature probe generator to place probes over the features in that track.

What you would then do is to import the data for that ChIP and then use the standard quantitation tools to assign quantitated values to the peaks you imported. What you can't do is to import pre-quantitated data - we have thought about this a lot since we've had a fair few requests for it, but it causes so many problems in other parts of the data model that we've decided against this. There are other programs around which deal solely with pre-quantitated data (IGV etc) and the point about SeqMonk is that it has access to the raw data so you can do things which those programs can't do.
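
If your peak calls live in R or a text table, one way to get them into a form the GFF annotation importer can read is to write them out as tab-separated GFF lines. This is only a sketch: the peak values, the `source` and `feature_type` labels, and the `write_peaks_as_gff` helper are all made up for illustration, and the chromosome names would need to match the genome loaded in SeqMonk.

```python
import csv

def write_peaks_as_gff(peaks, path, source="my_peak_caller", feature_type="peak"):
    """Write (chrom, start, end, score) tuples as tab-separated GFF lines.

    start and end are 1-based and inclusive, as GFF expects.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for i, (chrom, start, end, score) in enumerate(peaks, start=1):
            # Columns: seqid, source, type, start, end, score, strand, phase, attributes
            writer.writerow([chrom, source, feature_type, start, end,
                             score, ".", ".", f"ID=peak_{i}"])

peaks = [("1", 10500, 11200, 35.2), ("1", 48000, 48900, 12.7)]  # made-up example values
write_peaks_as_gff(peaks, "peaks.gff")
```

The score column is written only for completeness. As explained above, the positions can be imported as an annotation track, but the scores will not be taken over as quantitated values.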


Quote:
Originally Posted by markf View Post
PS - keep up the good work - seqmonk is great!
Cool, thanks! There should be a new release coming out very soon now which has loads of new stuff which you'll hopefully find useful.
Old 09-03-2013, 06:56 AM   #207
Aspadia
Junior Member
 
Location: Europe

Join Date: Aug 2013
Posts: 4
Default Relative quantitation

Hi Simon,

Thanks for your previous quick reply, I got seqmonk running now

This brings me to the next problem. I want to use the 'relative quantitation' option. I create probes in an IP and an input sample from ChIP by applying the running window probe generator, and would then like to use the relative quantitation. However, after the probe generation it does not show me the option for relative quantitation, nor when I go to Data --> quantitate existing probes....
Can you tell me how I can get to this relative quantitation?

Thanks a lot!
Old 09-03-2013, 08:22 AM   #208
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by Aspadia View Post
I want to use the 'relative quantitation' option, I create probes in an IP and input sample from ChIP then apply the running window probe generator and would then like to use the relative quantitation. However, after the probe generation it does not show me the option for relative quantitation nor when I go to Data --> quantitate existing probes....
Can you tell me how I can get to this relative quantitation?
Relative quantitation is a normalisation method for an existing quantitation, so you'd start by doing a normal read count or base pair count quantitation. If you then go back to the quantitation options you'll see a new set of methods appear in red which can be used to alter the existing quantitation. You could then use the relative quantitation to select a reference and either subtract, divide or log divide the other samples by the values in this dataset.
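
To make the three options concrete, here is a small Python sketch of what subtracting, dividing or log dividing by a reference could look like per probe. It is not SeqMonk's code: the function name, the example values and the choice of log base 2 are assumptions for illustration.

```python
import math

def relative_quantitation(sample, reference, method="log_divide"):
    """Adjust each probe value in `sample` relative to the same probe in `reference`."""
    if method == "subtract":
        return [s - r for s, r in zip(sample, reference)]
    if method == "divide":
        return [s / r for s, r in zip(sample, reference)]
    if method == "log_divide":
        # Assumes log base 2 and strictly positive values.
        return [math.log2(s / r) for s, r in zip(sample, reference)]
    raise ValueError(f"unknown method: {method}")

ip_sample = [8.0, 8.0, 1.0]     # made-up quantitated values for 3 probes
input_sample = [4.0, 8.0, 2.0]  # the reference dataset

print(relative_quantitation(ip_sample, input_sample, "subtract"))    # [4.0, 0.0, -1.0]
print(relative_quantitation(ip_sample, input_sample, "divide"))      # [2.0, 1.0, 0.5]
print(relative_quantitation(ip_sample, input_sample, "log_divide"))  # [1.0, 0.0, -1.0]
```

Note that in this sketch the divide and log divide options fail on probes where the reference value is zero, so real data would need some handling for those.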
Old 09-08-2013, 10:18 PM   #209
mathew
Member
 
Location: australia

Join Date: Jan 2011
Posts: 81
Default

I am curious if Seqmonk can give me coverage of specific genes/ transcripts from DNA seq data as is given by coverageBed in bedtools http://bedtools.readthedocs.org/en/latest/
I want to calculate coverage per gene/ chromosomes
Thanks
Old 09-08-2013, 11:48 PM   #210
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by mathew View Post
I am curious if Seqmonk can give me coverage of specific genes/ transcripts from DNA seq data as is given by coverageBed in bedtools http://bedtools.readthedocs.org/en/latest/
I want to calculate coverage per gene/ chromosomes
Thanks
Yes - depending on what you want there are a couple of ways you can do this. For standard RNA-Seq type quantitation you can use the RNA-Seq pipeline. This can quantitate at the gene (combined exons) or transcript level and can correct for transcript length. It works on simple overlaps so doesn't do re-partitioning between ambiguous splice variants.

For other non-spliced features you can use the normal mechanism of placing probes using the feature probe generator and then counting reads using the read count quantitation (or whichever other quantitation you want). You can place probes in all sorts of other ways too if you want to count whole chromosomes or other regions.
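
For comparison with what you would get from a command line tool, the "simple overlap" counting idea can be sketched in a few lines of Python. This is an illustration only, with made-up coordinates on a single chromosome and hypothetical helper names; it is not how SeqMonk is implemented.

```python
def overlaps(read, feature):
    """True if two (start, end) inclusive intervals share at least one base."""
    return read[0] <= feature[1] and feature[0] <= read[1]

def count_reads(features, reads):
    """Count reads overlapping each named feature (simple overlap, no re-partitioning)."""
    return {name: sum(overlaps(r, span) for r in reads)
            for name, span in features.items()}

def per_kb(counts, features):
    """Correct raw counts for feature length (reads per kb of feature)."""
    return {name: counts[name] * 1000 / (features[name][1] - features[name][0] + 1)
            for name in counts}

features = {"geneA": (1000, 2999), "geneB": (5000, 5499)}  # made-up coordinates
reads = [(950, 1049), (1500, 1599), (2950, 3049), (5400, 5499), (7000, 7099)]

counts = count_reads(features, reads)
print(counts)                    # {'geneA': 3, 'geneB': 1}
print(per_kb(counts, features))  # {'geneA': 1.5, 'geneB': 2.0}
```

A read that overlaps two features is counted for both here, which is the same kind of simplification as not re-partitioning reads between ambiguous features.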
Old 09-09-2013, 01:54 AM   #211
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

I've just pushed out a new release of SeqMonk (v0.25.0). This has been nearly ready to go for ages now and has loads of new stuff in it. You can see the full list of additions in the release notes, but some of the big changes are:
  • Added a quantitation trend plot to look at how any quantitated data changes around a set of features
  • Added a multi-sample chi-square for applications such as allele specific expression
  • Allowed multiple samples in the aligned probes plot and added custom sorting
  • Added the ability to filter raw reads against features when re-importing
  • Added a domainogram plot to look at quantitations over different window sizes
  • Added ways to find sets of features from a list of names
  • Added a nice report to completely describe how you came to a set of filtered probes
  • Improved normalisation options

We've also done some profiling of the seqmonk code so it should (hopefully) be noticeably quicker than the last version.

We've also had to make a change to the file format for seqmonk (to allow for comments to be added to probe lists), so projects saved with this version will not be able to be opened in older versions. This version will open older projects just fine though.

Please have a play with the new version and report any problems in our bugzilla, or by email to me or directly to this thread.
Old 09-09-2013, 07:05 AM   #212
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by Aspadia View Post
Seqmonk sounds really awesome, but I can't get it to run.
I have downloaded it, but when I try to run it, it says it cannot find java. When I type java -version in cmd it says 'java' is not recognized as an internal or external command, operable program or batch file.
Since this ended up being a fairly common problem I've written up a blog post which describes why this happens and how to fix it. I'm going to look at other ways we might be able to get SeqMonk to use the installed 32-bit version of java which is normally there - but I'm somewhat reluctant to do this since SeqMonk really benefits from using the correct 64-bit version.
Old 09-09-2013, 09:49 AM   #213
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140
Default

Hi Simon,

Sorry to bug you again. I am wondering what clustering algorithm is used for the aligned probe plot?
I wanted to reproduce the figure generated by Seqmonk myself using homer + R. I got the count matrix for a ChIP-seq dataset from Homer, imported it into R, log2 transformed it and then plotted it with heatmap.2. I can use either hierarchical or k-means clustering to cluster the data.

The thing is that I can see a more obvious peak in the figure generated by Seqmonk (one can adjust the contrast by sliding the bar on the right). The one I generated in R is not as obvious. Could you please share any tricks for plotting this kind of data?

Many Thanks!
Old 09-09-2013, 11:16 AM   #214
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140
Default

Hi Simon,

I just installed the newest version of seqmonk, and it has many improved features! Thanks. I noticed that many plots (probe trend, box whisker etc.) allow you to specify multiple probe lists. I am wondering how you can keep several probe sets at the same time? Each time I create a new probe list, the old one is wiped away.

Thank you again.

Tommy
Old 09-09-2013, 12:13 PM   #215
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by crazyhottommy View Post
Hi Simon,

Sorry to bug you again. I am wondering what clustering algorithm is used for the aligned probe plot?
The aligned probes plot is simply ordered by the number of reads in the probe so the highest coverage goes at the top. In the new version you are now able to view multiple plots at the same time and you can choose to order them either independently, or to pick one and then order the rest by the coverage in that reference dataset.

In terms of the strength of the effects shown, there's nothing too clever about what SeqMonk is doing. Its default scaling is linear, and you'll see quite different effects on a log scale. From my own experience it's worth playing around with the amount of context you put around your regions of interest, since keeping the regions too tight may not give you enough context to be able to judge the strength of the enrichment. Also, being able to play with the contrast manually to get it set just right for what you want to show can be a big plus.
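
If you want to mimic this outside SeqMonk, the ordering and scaling can be sketched in plain Python before handing the matrix to whatever heatmap function you use. The count matrix, the pseudocount of 1 and the helper names below are assumptions for illustration only.

```python
import math

def order_by_coverage(rows):
    """Sort per-probe count rows so the highest total coverage comes first."""
    return sorted(rows, key=sum, reverse=True)

def log2_scale(rows, pseudocount=1):
    """Log2 transform counts, adding a pseudocount so zeros stay defined."""
    return [[math.log2(v + pseudocount) for v in row] for row in rows]

# Each row is one probe; each column is a position bin around the feature.
rows = [[0, 3, 1], [7, 31, 7], [1, 15, 3]]

ordered = order_by_coverage(rows)
print(ordered)              # [[7, 31, 7], [1, 15, 3], [0, 3, 1]]
print(log2_scale(ordered))  # [[3.0, 5.0, 3.0], [1.0, 4.0, 2.0], [0.0, 2.0, 1.0]]
```

Plotting the ordered matrix on a linear scale with no clustering is closer to the behaviour described above than a log2 transformed, clustered heatmap, which may explain part of the difference in how obvious the peak looks.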
Old 09-09-2013, 12:17 PM   #216
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by crazyhottommy View Post
Hi Simon,

I just installed the newest version of seqmonk, and it has many improved features! Thanks. I noticed that many plots (probe trend, box whisker etc.) allow you to specify multiple probe lists. I am wondering how you can keep several probe sets at the same time? Each time I create a new probe list, the old one is wiped away.
The plots can show many probe lists, not many probe sets. That is to say that if you've filtered your full probe set several different ways then you can plot these subsets together, but they're all part of the same original probe set.

There isn't a way to have more than one probe set active at once. Lots of things about the way SeqMonk expects to be able to work don't scale to having more than one probe set so this isn't something we're likely to add.

Although you can't keep a previous probe set around if you choose to create a new one, you do have the option of turning any probe list into an annotation track. This won't preserve the quantitated values, but it will preserve the positions which can often be useful. You can do this by selecting File > Import Annotation > Active Probe List.
Old 09-09-2013, 07:19 PM   #217
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140
Default

Quote:
Originally Posted by simonandrews View Post
The aligned probes plot is simply ordered by the number of reads in the probe so the highest coverage goes at the top. In the new version you are now able to view multiple plots at the same time and you can choose to order them either independently, or to pick one and then order the rest by the coverage in that reference dataset.

In terms of the strength of the effects shown, there's nothing too clever about what SeqMonk is doing. Its default scaling is linear, and you'll see quite different effects on a log scale. From my own experience it's worth playing around with the amount of context you put around your regions of interest, since keeping the regions too tight may not give you enough context to be able to judge the strength of the enrichment. Also, being able to play with the contrast manually to get it set just right for what you want to show can be a big plus.

So if I want to compare ChIP-seq enrichment between two sets of probes, when I adjust the contrast, I need to apply the adjustment at the same time for both heatmaps. It is something like a Western blot (a wet lab technique): you should expose for the same time for your treatment and control. For the context, I've seen people using -3kb to 3kb, and I've also seen people using -8kb to 8kb. I'm not sure what the consensus is, though...
Old 09-09-2013, 07:20 PM   #218
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140
Default

Quote:
Originally Posted by simonandrews View Post
The plots can show many probe lists, not many probe sets. That is to say that if you've filtered your full probe set several different ways then you can plot these subsets together, but they're all part of the same original probe set.

There isn't a way to have more than one probe set active at once. Lots of things about the way SeqMonk expects to be able to work don't scale to having more than one probe set so this isn't something we're likely to add.

Although you can't keep a previous probe set around if you choose to create a new one, you do have the option of turning any probe list into an annotation track. This won't preserve the quantitated values, but it will preserve the positions which can often be useful. You can do this by selecting File > Import Annotation > Active Probe List.
Thanks for your clarification!
Old 09-10-2013, 12:24 AM   #219
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by crazyhottommy View Post
So if I want to compare ChIP-seq enrichment between two sets of probes, when I adjust the contrast, I need to apply the adjustment at the same time for both heatmaps. It is something like a Western blot (a wet lab technique): you should expose for the same time for your treatment and control. For the context, I've seen people using -3kb to 3kb, and I've also seen people using -8kb to 8kb. I'm not sure what the consensus is, though...
There shouldn't really be a consensus as the size you use will depend on the nature of the enrichment you're looking at and the insert size of your library among other factors.

When using multiple probe lists (not sets) in SeqMonk you now draw all of the plots in a single window and the slider adjusts all of them simultaneously so they're directly comparable. I'm never really sure how valuable it is to compare the strength of enrichment in these plots since this can be affected by technical artefacts, but it's a really good way to show differences in the patterning or extent (proportion of probes) of the enrichment.
Old 09-10-2013, 03:22 AM   #220
VincentC
Junior Member
 
Location: France

Join Date: Sep 2013
Posts: 2
Default

Hi everyone,

We performed bisulfite treatment on 2 conditions x 3 genomes followed by deep sequencing (paired-ends, 2x100bp, Illumina HiSeq 2000). We used Bismark for read alignment and methylation calling.

I am now struggling to visualize my data with seqmonk and make it agree with the Methylkit data that has been generated by a collaborator. We pooled the 3 genomes for each condition, and are therefore comparing two data sets, namely A and B.

Here is the procedure I follow, according to the seqmonk guide, videos and other resources:
- I generate probes using contig probe generator: I select both datasets A and B, min contig size = 1 and by default for the remaining options.
- After that I quantify using the bisulfite pipeline over features: I select existing probes as features, and leave all other options as default.
- I then filter my data on values (individual probes), must be between 0 and rest by default.

First, is this procedure correct, or should I proceed differently given my data sets? Also, what is the best way to statistically filter my data? Thanks a lot for the advice, I'm learning the hard way!!
Tags
analysis, desktop, seqmonk, visualization
