SEQanswers

Old 05-12-2016, 02:05 AM   #301
rmf
Junior Member
 
Location: Sweden

Join Date: May 2016
Posts: 8
Custom Pseudo-chromosomes

I am creating a new custom genome. I have 25 chr, 1 mt and a whole lot of scaffolds. I can only see automatic pseudo-chromosome creation and it doesn't do exactly what I want. I would like to group the scaffolds into pseudo-chromosomes in a custom manner. Also I would like to keep mt as a separate chromosome.
Is it possible to select some regions and convert them to a pseudo-chromosome?

Last edited by rmf; 05-16-2016 at 08:59 AM. Reason: added title
Old 05-12-2016, 03:26 AM   #302
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

There's no built-in support for this kind of customisation, but you could build this yourself if you like.

If you have a look in the automated genome you will quickly see how to play around with the way the pseudo chromosomes are made. There are two files which matter here:

chr_list is a text file giving the names and total lengths of the chromosomes. In a normal build only the pseudo chromosomes would appear in here, but you could add in some individual scaffolds on their own if you like.

aliases.txt is the file which says how the individual sequence files you have map into the chromosomes (or pseudo chromosomes in this case). For each sequence it says which chromosome it maps to and where in that chromosome it starts. If the number is negative then the sequence is assumed to be reverse complemented and inserted at that position.

By editing these two files manually you should be able to group your sequences however you like in the newly built genome.

Let me know how you get on.
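
The two files described above could be generated rather than edited by hand. The following is only a minimal sketch, not part of SeqMonk: it assumes tab-separated columns, zero-based start offsets (as in the automatically created files shown later in the thread), and a chr_list length equal to the total span of the grouped sequences. The function names, the `spacer` parameter and the toy sequence names are hypothetical.

```python
def build_layout(groups, lengths, spacer=0, reverse=()):
    """Lay out sequences end to end inside each pseudo-chromosome.

    groups  : dict mapping pseudo-chromosome name -> ordered list of sequence names
    lengths : dict mapping sequence name -> length in bp
    spacer  : gap (bp) left between consecutive sequences (assumption, default 0)
    reverse : names of sequences to mark as reverse complemented
    Returns (aliases_rows, chr_list_rows) as lists of tuples.
    """
    aliases, chr_list = [], []
    for pseudo, members in groups.items():
        offset = 0
        for i, name in enumerate(members):
            if i > 0:
                offset += spacer
            # A negative start marks a reverse-complemented sequence.
            # Note: a sequence starting at 0 cannot be marked this way.
            start = -offset if name in reverse else offset
            aliases.append((name, pseudo, start))
            offset += lengths[name]
        chr_list.append((pseudo, offset))
    return aliases, chr_list


def to_text(rows, sep="\t"):
    """Render rows as delimited text (tab separator is an assumption)."""
    return "".join(sep.join(str(field) for field in row) + "\n" for row in rows)


# Toy example with made-up sequence names and lengths.
aliases, chr_list = build_layout(
    {"pseudoA": ["s1", "s2"], "pseudoB": ["s3"]},
    {"s1": 100, "s2": 50, "s3": 10},
    spacer=10,
    reverse={"s2"},
)
```

Writing `to_text(aliases)` to aliases.txt and `to_text(chr_list)` to chr_list would then give files in the same shape as the examples later in the thread.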
Old 05-16-2016, 09:00 AM   #303
rmf
Junior Member
 
Location: Sweden

Join Date: May 2016
Posts: 8
Custom Pseudo-chromosomes

I have tried to modify aliases.txt and chr_list as shown below. I renamed the pseudo-chromosomes in aliases.txt and moved the chr lengths around in chr_list to match. But when I reopen and create a new project and load the custom genome, it still looks like the original build.

old aliases.txt (automatically created)
1 pseudo1 0
10 pseudo2 0
11 pseudo3 0
12 pseudo4 0
13 pseudo5 0
14 pseudo6 0
15 pseudo7 0
16 pseudo8 0
17 pseudo9 0
18 pseudo10 0
19 pseudo11 0
2 pseudo12 0
20 pseudo13 0
21 pseudo14 0
22 pseudo15 0
23 pseudo16 0
24 pseudo17 0
25 pseudo18 0
3 pseudo19 0
4 pseudo20 0
5 pseudo21 0
6 pseudo22 0
7 pseudo23 0
8 pseudo24 0
9 pseudo25 0
MT pseudo26 0
KN149696.1 pseudo26 16696
KN149690.1 pseudo26 385433
...<lot more scaffolds>

new aliases.txt (manually corrected)
1 pseudo1 0
10 pseudo10 0
11 pseudo11 0
12 pseudo12 0
13 pseudo13 0
14 pseudo14 0
15 pseudo15 0
16 pseudo16 0
17 pseudo17 0
18 pseudo18 0
19 pseudo19 0
2 pseudo2 0
20 pseudo20 0
21 pseudo21 0
22 pseudo22 0
23 pseudo23 0
24 pseudo24 0
25 pseudo25 0
3 pseudo3 0
4 pseudo4 0
5 pseudo5 0
6 pseudo6 0
7 pseudo7 0
8 pseudo8 0
9 pseudo9 0
MT pseudo26 0
KN149696.1 pseudo26 16696
KN149690.1 pseudo26 385433
...<lot more scaffolds>

old chr_list (automatically created)
pseudo1 58871917
pseudo2 45574255
pseudo3 45107271
pseudo4 49229541
pseudo5 51780250
pseudo6 51944548
pseudo7 47771147
pseudo8 55381981
pseudo9 53345113
pseudo10 51008593
pseudo11 48790377
pseudo12 59543403
pseudo13 55370968
pseudo14 45895719
pseudo15 39226288
pseudo16 46272358
pseudo17 42251103
pseudo18 36898761
pseudo19 62385949
pseudo20 76625712
pseudo21 71715914
pseudo22 60272633
pseudo23 74082188
pseudo24 54191831
pseudo25 56892771
pseudo26 31392292

new chr_list (manually corrected)
pseudo1 58871917
pseudo2 59543403
pseudo3 62385949
pseudo4 76625712
pseudo5 71715914
pseudo6 60272633
pseudo7 74082188
pseudo8 54191831
pseudo9 56892771
pseudo10 45574255
pseudo11 45107271
pseudo12 49229541
pseudo13 51780250
pseudo14 51944548
pseudo15 47771147
pseudo16 55381981
pseudo17 53345113
pseudo18 51008593
pseudo19 48790377
pseudo20 55370968
pseudo21 45895719
pseudo22 39226288
pseudo23 46272358
pseudo24 42251103
pseudo25 36898761
pseudo26 31392292
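
The manual renaming above can also be scripted so that the two files stay consistent. This is a rough sketch under stated assumptions: rows have already been parsed into tuples, pseudo-chromosome names follow the `pseudo<number>` pattern, and each numbered chromosome sits at offset 0 of its own pseudo-chromosome. The function name is hypothetical.

```python
import re


def renumber_pseudo_chromosomes(aliases, chr_list):
    """Rename pseudo-chromosomes so that chromosome N lives in pseudoN.

    aliases  : list of (sequence_name, pseudo_name, start)
    chr_list : list of (pseudo_name, length)
    Returns (new_aliases, new_chr_list) with lengths carried along.
    """
    rename = {}
    for seq, pseudo, start in aliases:
        # Only numbered chromosomes that start a pseudo-chromosome drive the renaming.
        if seq.isdigit() and start == 0:
            rename[pseudo] = "pseudo" + seq

    new_aliases = [(seq, rename.get(p, p), start) for seq, p, start in aliases]
    new_chr_list = [(rename.get(p, p), length) for p, length in chr_list]

    names = [name for name, _ in new_chr_list]
    if len(set(names)) != len(names):
        raise ValueError("renaming produced duplicate pseudo-chromosome names")

    def number(row):
        # Sort by the trailing number; names without one go last.
        match = re.search(r"(\d+)$", row[0])
        return int(match.group(1)) if match else float("inf")

    new_chr_list.sort(key=number)
    return new_aliases, new_chr_list


# Toy data in the same shape as the files above (lengths are made up).
old_aliases = [("1", "pseudo1", 0), ("10", "pseudo2", 0), ("2", "pseudo3", 0),
               ("MT", "pseudo4", 0), ("KN1", "pseudo4", 200)]
old_chr_list = [("pseudo1", 500), ("pseudo2", 300), ("pseudo3", 400), ("pseudo4", 250)]
new_aliases, new_chr_list = renumber_pseudo_chromosomes(old_aliases, old_chr_list)
```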

Last edited by rmf; 05-16-2016 at 09:02 AM. Reason: added text
Old 05-17-2016, 06:50 AM   #304
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

Can you try deleting the cache folder in your assembly folder? SeqMonk might not have recognised that those files have changed and may be using an older version.
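
If the genome files are edited repeatedly, the cache could be cleared with a small helper like the one below. This is only a sketch: the folder name `cache` is taken from the description above, while the location of the assembly folder depends on your installation and has to be supplied by you. The function name is hypothetical.

```python
from pathlib import Path
import shutil


def clear_genome_cache(assembly_dir, cache_name="cache"):
    """Delete the cache folder inside an assembly folder, if it exists.

    assembly_dir : path to the assembly folder of the custom genome (user supplied)
    cache_name   : name of the cache subfolder (assumed to be "cache")
    Returns True if a folder was removed, False if there was nothing to remove.
    """
    cache = Path(assembly_dir) / cache_name
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False

# Usage (the path is a placeholder, not a real location):
# clear_genome_cache("/path/to/genomes/MySpecies/MyAssembly")
```

The cache should then be rebuilt from the edited files the next time the genome is loaded, which is the behaviour the reply above relies on.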

If it's still not working for you then drop me an email and I'll set up an FTP site where you can push the files to me and I can take a look.
Old 05-17-2016, 12:59 PM   #305
rmf
Junior Member
 
Location: Sweden

Join Date: May 2016
Posts: 8

Yes! It works now. Thanks a lot.
Old 06-08-2016, 09:27 AM   #306
rmf
Junior Member
 
Location: Sweden

Join Date: May 2016
Posts: 8
Expand annotations

Hi Simon,
The annotations overlap a lot and it's hard to read.



Is there an option to expand annotations like that in IGV?



Thanks,
Roy

Last edited by rmf; 06-08-2016 at 09:28 AM. Reason: Typo
Old 06-09-2016, 12:52 AM   #307
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

Quote:
Originally Posted by rmf View Post
Hi Simon,
The annotations overlap a lot and it's hard to read.
There's always a trade-off to make in these kinds of display. Internally we've tried a few different ways to adjust the layout to show more on screen clearly, but have kept the current layout. In the next release we're actually down-weighting the amount of space given to the annotation tracks so we can give more to the data, since the trend seems to be for more data, and data tracks get unusable fairly quickly as they compress too much.

Whilst the view is somewhat minimal, our aim is to make it more usable through the interactive features (putting your mouse over a feature highlights it and tells you what it is), as this scales much better. Obviously for publications this doesn't help though - but what we do (and would recommend others do) is to use the option to add all labels (Control+L) then export out the SVG. You can then re-organise the layout of the features to make better use of the space you have and to highlight the information which is important for that figure. I should probably make up a video showing this process...
Old 06-09-2016, 11:39 PM   #308
rmf
Junior Member
 
Location: Sweden

Join Date: May 2016
Posts: 8
Expand annotations

It's strange that such a feature is down-weighted. I would think that it is important to see exactly what features your reads are overlapping with. Perhaps it is not that important when doing a whole-transcriptome DGE analysis, but it matters when a user is interested in certain genes or certain regions. As in my example figure, 4-5 layers of overlapping transcripts in one location is extremely hard to access properly even with the hover effect.
Old 06-10-2016, 02:36 AM   #309
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 869

Quote:
Originally Posted by rmf View Post
It's strange that such a feature is down-weighted. I would think that it is important to see exactly what features your reads are overlapping with. Perhaps it is not that important when doing a whole-transcriptome DGE analysis, but it matters when a user is interested in certain genes or certain regions. As in my example figure, 4-5 layers of overlapping transcripts in one location is extremely hard to access properly even with the hover effect.
The down-weighting here is simply the proportional space offered to each of the tracks. The point is that, for an annotation track with the same layout, once you have around 50px of vertical space, giving it more than that doesn't make it any clearer. Providing more vertical space without a better layout is just wasting space.

I absolutely agree that there is an issue seeing exactly what's going on in regions where lots of features overlap, and that hovering, although it helps, isn't perfect. The problem is that making a completely non-overlapping feature set takes a huge amount of vertical space in the general case, since there are places in the genome where many tens of features overlap and these would take lots of space to show clearly.

I'm very happy to hear suggestions for ways in which we could improve the layout we have whilst still keeping the overall vertical space in check.

One other little tip which can be useful: where I've wanted to look at lots of annotation for a region I'm looking at in SeqMonk, it's possible to link up SeqMonk with a browser view of the same genome in either UCSC or Ensembl. If you are looking at a chromosome view in SeqMonk then selecting Edit > Copy current position (control/command + c) will copy the genomic location into your clipboard. You can then paste this into the UCSC or Ensembl search box to be taken directly to the equivalent region. This is especially useful for tracks which SeqMonk doesn't have or can't calculate.
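
As an alternative to pasting into the search box, a position can be turned into direct browser links. This is a hedged sketch, not a SeqMonk feature: it assumes you already have the chromosome, start and end as separate values (the exact clipboard format is not described here), and the UCSC database name and Ensembl species name are parameters you must supply for your own genome. The function name is hypothetical.

```python
def browser_links(chrom, start, end, ucsc_db, ensembl_species):
    """Build UCSC and Ensembl region URLs for one genomic interval.

    chrom           : chromosome name, with or without a "chr" prefix
    ucsc_db         : UCSC assembly identifier (user supplied, e.g. "danRer10")
    ensembl_species : Ensembl species path (user supplied, e.g. "Danio_rerio")
    Note: mitochondrial naming differs between the two sites and is not handled.
    """
    bare = chrom[3:] if chrom.startswith("chr") else chrom
    ucsc = (f"https://genome.ucsc.edu/cgi-bin/hgTracks"
            f"?db={ucsc_db}&position=chr{bare}:{start}-{end}")
    ensembl = (f"https://www.ensembl.org/{ensembl_species}"
               f"/Location/View?r={bare}:{start}-{end}")
    return ucsc, ensembl


ucsc_url, ensembl_url = browser_links("1", 1000, 2000, "danRer10", "Danio_rerio")
```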
Old 06-10-2016, 05:29 AM   #310
rmf
Junior Member
 
Location: Sweden

Join Date: May 2016
Posts: 8

That's a neat little trick. Thanks for that.
Old 09-26-2016, 01:02 AM   #311
xhuister
Member
 
Location: China

Join Date: Apr 2010
Posts: 41

How can I save a dataset to a file (.txt, .bed, etc.)?
For example, I grouped two datasets into one group and imported it as an _import dataset through File>Import data>visible data source. Then I'd like to save this dataset to a file, but I couldn't find any menu to do this.

I can export probe data through Reports>Annotated probe report, but I can't export the dataset. A workaround is to make each position in the dataset a probe and then export the probes, but this is slow for large datasets. Is there an easier way to do this?

This seems a really simple question; I apologize if it has been covered in the tutorial or in this thread. I couldn't find the solution.
Old 10-06-2016, 02:05 AM   #312
Niranjanks
Member
 
Location: India

Join Date: Aug 2015
Posts: 11

Is there any way to plot this in SeqMonk?

On the x-axis, the TSS is in the centre at 0, flanked by a fixed number of bp decided by the user, for example -2000 to +2000 bp,

while the y-axis shows the average binding signal.
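
To make the requested plot concrete, the values behind it can be sketched outside SeqMonk as follows. This is not SeqMonk's implementation, only an illustration of the calculation: it assumes per-base signal stored in nested dictionaries and strand-aware TSS positions, and all names are hypothetical.

```python
def average_tss_profile(tss_sites, signal, flank=2000):
    """Average signal at each offset from -flank to +flank around TSSs.

    tss_sites : list of (chrom, position, strand) with strand "+" or "-"
    signal    : dict chrom -> dict position -> signal value (missing = 0)
    Returns a list of 2*flank + 1 averages; index flank corresponds to the TSS.
    """
    totals = [0.0] * (2 * flank + 1)
    for chrom, pos, strand in tss_sites:
        track = signal.get(chrom, {})
        for offset in range(-flank, flank + 1):
            # On the minus strand, downstream means decreasing coordinates.
            genomic = pos + offset if strand == "+" else pos - offset
            totals[offset + flank] += track.get(genomic, 0.0)
    count = len(tss_sites)
    return [total / count for total in totals] if count else totals


# Toy example: two TSSs on opposite strands, flank of 2 bp.
profile = average_tss_profile(
    [("1", 100, "+"), ("1", 200, "-")],
    {"1": {101: 2.0, 199: 4.0}},
    flank=2,
)
```

Plotting `profile` against offsets from -flank to +flank gives the kind of curve described above.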
Old 08-24-2017, 04:32 AM   #313
pander
Junior Member
 
Location: Czech Republic

Join Date: Aug 2017
Posts: 1

Hello,

I have a problem: after importing a mouse GTF annotation downloaded from Ensembl, SeqMonk does not recognize gene/transcript names. I would like to filter on probe names and see the names of the genes in my plots. I also do not want to use the default annotation, as I noticed it's probably an older version and the names of some genes and transcript annotations have changed. Could you please help? Thank you.

Old 08-02-2018, 01:01 PM   #314
rmf
Junior Member
 
Location: Sweden

Join Date: May 2016
Posts: 8
Import bedGraph

Is it possible to import bedGraph files? Or, more generally, can I plot any quantitative data as a track with the following info:

chr start end value

with value being some continuous variable?
Thanks,
Roy
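
For reference, the four-column layout in question can be read with a few lines of Python. This is a minimal sketch that says nothing about what SeqMonk itself can import: it assumes whitespace-separated columns and skips `track`, `browser` and comment lines. The function name is hypothetical.

```python
def parse_bedgraph(lines):
    """Parse bedGraph-style lines into (chrom, start, end, value) tuples.

    Coordinates are kept as given; in the bedGraph format they are
    zero-based and half-open.
    """
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, header lines and comments.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        records.append((chrom, int(start), int(end), float(value)))
    return records


# Toy input with made-up values.
example = [
    "track type=bedGraph name=example",
    "1\t0\t100\t0.5",
    "1\t100\t250\t1.25",
    "",
]
records = parse_bedgraph(example)
```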
Tags
analysis, desktop, seqmonk, visualization
