SEQanswers

Old 03-16-2011, 08:41 AM   #1
slny
Member
 
Location: FL

Join Date: Mar 2011
Posts: 54
Default how to import my own genome sequence into SeqMonk?

I'm trying to start a new project and import my own genome sequence database into SeqMonk. However, I can't browse my computer to select it. In SeqMonk, I can only download and import genomes from its own genome list.

Is there any way I can import my own genome sequence database into SeqMonk?
Old 03-16-2011, 10:28 AM   #2
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 625
Default

Yes, when you download SeqMonk you will find a text file called CREATING_CUSTOM_GENOMES.txt explaining the procedure in great detail.

www.bioinformatics.bbsrc.ac.uk/projects/seqmonk/

Hope this helps.
Old 03-16-2011, 10:57 AM   #3
slny
Member
 
Location: FL

Join Date: Mar 2011
Posts: 54
Default

I didn't find such a file. I got only one file from the download. This file is basically the software; when I click it, the program starts.

Could you please post the procedure to create custom genomes if possible?

Thanks.
Old 03-16-2011, 11:09 AM   #4
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 625
Default

You can find the custom genome creation file if you download the Windows/Linux zip file. For your convenience I'll attach it here as well.
Attached Files
File Type: txt CREATING_CUSTOM_GENOMES.txt (8.4 KB, 133 views)
Old 03-16-2011, 11:39 AM   #5
slny
Member
 
Location: FL

Join Date: Mar 2011
Posts: 54
Default

Got the file. Thanks a lot.
Old 12-07-2012, 05:25 AM   #6
mathew
Member
 
Location: australia

Join Date: Jan 2011
Posts: 81
Default custom genome

Slny,

I am also trying to use SeqMonk with my custom genome but keep getting an error. Were you able to use SeqMonk with a custom genome?
Thanks
Old 02-22-2013, 06:30 AM   #7
giampe
Member
 
Location: Bari, Italy

Join Date: Aug 2009
Posts: 22
Default

Hi,
Thanks for developing the SeqMonk software; it is a great tool for managing my sequencing data.
I'm trying to create a folder with my custom genome, but I haven't been able to do it yet.
Could you help me?
My steps were:
1) I downloaded the citrus genome from http://citrus.hzau.edu.cn/cgi-bin/gb2/gbrowse/orange/ in GenBank format
2) I converted the GenBank file into EMBL format with this script

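#!/usr/bin/env perl
# gb2embl.pl: convert a GenBank file to EMBL format using BioPerl.
use strict;
use warnings;
use Bio::SeqIO;

if (@ARGV != 2) { die "USAGE: gb2embl.pl <input.gb> <output.embl>\n"; }

# Read GenBank records from the first argument and write EMBL to the second.
my $seqio  = Bio::SeqIO->new(-format => 'genbank', -file => $ARGV[0]);
my $seqout = Bio::SeqIO->new(-format => 'embl',    -file => ">$ARGV[1]");
while (my $seq = $seqio->next_seq) {
    $seqout->write_seq($seq);
}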
3) I changed the AC line

At this point the program returns the message:
"no data was present in the imported genome"
I don't understand which lines I should modify.

Thank you all for your help!
Old 02-22-2013, 06:38 AM   #8
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by giampe
Hi,
Thanks for developing the SeqMonk software; it is a great tool for managing my sequencing data.
I'm trying to create a folder with my custom genome, but I haven't been able to do it yet.
Could you help me?
My steps were:
3) I changed the AC line

Without being able to see one of the files you've created it's difficult to know what's gone wrong.

Can you put your genome files somewhere I can see them? If I can have a look at the files I can figure out why SeqMonk isn't recognising them.
Old 02-22-2013, 06:47 AM   #9
giampe
Member
 
Location: Bari, Italy

Join Date: Aug 2009
Posts: 22
Default

Hi Simon,
Thanks for your quick reply.
Here is the head of the .embl file for chr1:


ID unknown; SV 1; linear; unassigned DNA; STD; UNC; 28800734 BP.
XX
AC unknown;
XX
DT 22-Feb-2013
XX
XX
XX
FH Key Location/Qualifiers
FH
FT scaffold 1..196955
FT /name="scaffold_0255"
FT scaffold 196978..818715
FT /name="scaffold_0155"
FT scaffold 818738..1870313
FT /name="scaffold_0091"
FT scaffold complement(1870336..8756891)
FT /name="scaffold_0002"
FT scaffold 8756914..10191576
FT /name="scaffold_0067"
FT scaffold 10191599..12196287
FT /name="scaffold_0044"
FT scaffold 12196310..12455131
FT /name="scaffold_0224"
FT scaffold 12455154..12524877
FT /name="scaffold_0342"
FT scaffold 12524900..13254358
FT /name="scaffold_0131"
FT scaffold complement(13254381..13838699)
FT /name="scaffold_0162"
FT scaffold 13838722..14955534
FT /name="scaffold_0083"
FT scaffold complement(14955557..17624236)
FT /name="scaffold_0029"
FT scaffold 17624259..18164428
FT /name="scaffold_0166"
FT scaffold 18164451..19274573
FT /name="scaffold_0085"
FT scaffold complement(19274596..22480739)
FT /name="scaffold_0019"
FT scaffold 22480762..25121265
FT /name="scaffold_0030"
FT scaffold complement(25121288..26274302)
FT /name="scaffold_0081"
FT scaffold complement(26274325..28800734)
FT /name="scaffold_0033"
XX
SQ Sequence 28800734 BP; 8998530 A; 4599939 C; 4612033 G; 8991187 T; 1599045 other;
ctaaacccta aaccctaaac cctaaaccct aaaaacccta taccctaaat accctatacc 60
ctatacccta taccctatac cctaaaccct ataccctata aaccctatac cctaaaccct 120
ataccctata aaccctatac cccataccct ataccccata ccctataccc tataccccat 180
accctatacc ccatacccta aaccctataa accctaaacc ctataaaccc taaaccctat 240
aaaccccaaa ccataaaccc taaaacccaa aaccctaaaa ccctaaaccc ctaaacccta 300
aaccctaaac cctaaaaccc taaaccccta aaaccctaaa acgcaaaaac actaaaccct 360
aaaaccggaa aaccctaaac cctaaaccct aaaaccctaa accctaaacc ctaaacccta 420


This is the file as it was before my attempts to edit it.
Old 02-22-2013, 08:20 AM   #10
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

OK. The only problem is that you need to adjust the AC line to the format described in the CREATING_CUSTOM_GENOMES.txt file. The one I used to test with was:

AC chromosome:Test:1:1:28800734:1

...but change it to whatever assembly and genome name you actually want to use.
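
The AC edit described above can be scripted rather than done by hand. The following is only a minimal sketch: it assumes the fields read as chromosome:assembly:name:start:end:strand (which is how the test line looks, but CREATING_CUSTOM_GENOMES.txt remains the authoritative description), and it takes the end coordinate from the length on the ID line. The function name `set_ac_line` is made up for this illustration.

```python
import re

def set_ac_line(embl_text, assembly, chromosome):
    """Return embl_text with its first AC line rewritten in the
    'chromosome:ASSEMBLY:NAME:1:LENGTH:1' style (assumed field order)."""
    # Take the sequence length from the ID line, e.g. "...; 28800734 BP."
    match = re.search(r"^ID\s.*?(\d+)\s+BP\.", embl_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no ID line with a 'BP.' length found")
    value = f"chromosome:{assembly}:{chromosome}:1:{match.group(1)}:1"
    # Replace only the first AC line, keeping whatever spacing follows "AC".
    new_text, count = re.subn(
        r"^(AC[ \t]+).*$",
        lambda m: m.group(1) + value,
        embl_text,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no AC line found")
    return new_text

header = (
    "ID unknown; SV 1; linear; unassigned DNA; STD; UNC; 28800734 BP.\n"
    "XX\n"
    "AC unknown;\n"
    "XX\n"
)
print(set_ac_line(header, "Test", "1"))
```

Reading the file, passing its text through the function and writing it back would apply the change to a real chromosome file; the assembly and chromosome names are whatever you want the genome to show.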
Old 02-25-2013, 03:16 AM   #11
giampe
Member
 
Location: Bari, Italy

Join Date: Aug 2009
Posts: 22
Default

Yes, I've got it!
I have my custom genome! Thanks very much again!
Old 11-14-2014, 07:33 AM   #12
chris202
Junior Member
 
Location: belgium

Join Date: Nov 2014
Posts: 7
Default

Hello everyone,

I have the same problem. Here is the beginning of my ".dat" file (for the first chromosome of the bug I'm interested in):

ID AM040264; SV 1; circular; genomic DNA; STD; PRO; 2121359 BP.
XX
AC chromosome:2308:genome:1:2121359:1
XX
PR Project:PRJNA16203;
XX
DT 22-NOV-2005 (Rel. 85, Created)
DT 15-JUN-2010 (Rel. 105, Last updated, Version 3)
XX
DE Brucella melitensis biovar Abortus 2308 chromosome I, complete sequence,
DE strain 2308
XX
KW complete genome.
XX
OS Brucella abortus 2308
OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales; Brucellaceae;
OC Brucella.
XX
RN [1]
RP 1-2121359
RG Microbial Genomics Group, Lawrence Livermore National Laboratory, and the
RG Genome Analysis Group, Oak Ridge National Laboratory
RA Larimer F.;
RT ;
RL Submitted (21-JUN-2006) to the INSDC.
RL Larimer F., Oak Ridge National Laboratory, 1 Bethel Valley Road, Bldg 5700
RL A201 Oak Ridge, TN 37831, USA;
XX
RN [2]
RP 1-2121359
RX DOI; 10.1128/IAI.73.12.8353-8361.2005.
RX PUBMED; 16299333.
RG Microbial Genomics Group, Lawrence Livermore National Laboratory, and the
RG Genome Analysis Group, Oak Ridge National Laboratory
RA Chain P., Comerci D.J., Tolmasky M.E., Larimer F.W., Malfatti S.,
RA Vergez L.M., Aguero F., Land M.L., Ugalde R.A., Garcia E.;
RT "Whole-genome analyses of speciation events in pathogenic Brucellae";
RL Infect Immun 73(12):8353-8361(2005).
XX
DR MD5; a898c1e51a44dc700fa4f7a9333c982c.
DR EnsemblGenomes-Gn; BAB1_0014.
DR EnsemblGenomes-Gn; BAB1_0020.
DR EnsemblGenomes-Gn; BAB1_0021.
DR EnsemblGenomes-Gn; BAB1_0039.
...

Any idea why it keeps telling me "no data present in the imported genome"?
Thanks a lot!
Old 11-14-2014, 08:00 AM   #13
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by chris202
Hello everyone,

I have the same problem. Here is the beginning of my ".dat" file (for the first chromosome of the bug I'm interested in):

ID AM040264; SV 1; circular; genomic DNA; STD; PRO; 2121359 BP.
XX
AC chromosome:2308:genome:1:2121359:1

Hi Chris,

The information earlier in this thread is now out of date. You no longer need to make custom genomes manually; there's a nice graphical way to do it, as long as you have FASTA files, GTF/GFF files, or preferably both.

Simply go to File > New Project and then select "Build custom genome". You can then load in your fasta and annotation files and it will create all of the genome files you need for you. It also has the option to create pseudochromosomes if you have an assembly which is scaffold or contig based and you don't want to end up with tons of chromosomes listed.

Let me know if you have any problems with this, but hopefully it will prove to be a much simpler solution.

Cheers

Simon.
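
Before loading files into the custom genome builder it can help to check that the annotation refers to sequences that are actually in the FASTA file, and to see how many sequences there are when deciding about pseudochromosomes. This is just a rough sketch; it assumes annotation is matched to sequences by name, which this thread does not state, and the sample data and function names are invented for illustration.

```python
def fasta_names(lines):
    """Sequence names: the first whitespace-delimited token after '>'."""
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names

def annotation_seqids(lines):
    """Distinct values of column 1 in tab-delimited GFF/GTF lines."""
    seqids = set()
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        seqids.add(line.split("\t", 1)[0])
    return seqids

# Made-up sample data standing in for the contents of real files.
fasta = [">scaffold_0255 length=196955", "ctaaacccta", ">scaffold_0155", "aaccctaaac"]
gff = [
    "##gff-version 3",
    "scaffold_0255\tsrc\tgene\t100\t900\t.\t+\t.\tID=g1",
    "scaffold_9999\tsrc\tgene\t5\t50\t.\t-\t.\tID=g2",
]

names = fasta_names(fasta)
unknown = sorted(annotation_seqids(gff) - set(names))
print(len(names), "sequences; annotation names missing from FASTA:", unknown)
```

For real data, pass open file handles instead of the sample lists; both functions only iterate over lines.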
Old 11-17-2014, 01:44 AM   #14
chris202
Junior Member
 
Location: belgium

Join Date: Nov 2014
Posts: 7
Default

OK, it worked! Thank you very much!
Actually, I have another downstream question (not sure this is the best place to ask...). I've uploaded my custom genome and I'm trying to import a small test dataset which looks exactly like the one shown in this video (at 2:05):
http://www.youtube.com/watch?v=ul4IZ...ABFFC4&index=2

After assigning the different columns as needed, I try to import, but for each read it says:
"Location XXX-YYY was not an integer", no matter the size of the interval.
I'm a bit lost. Do you have any advice?

Thanks again
Old 11-17-2014, 01:47 AM   #15
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by chris202
OK, it worked! Thank you very much!
Actually, I have another downstream question (not sure this is the best place to ask...). I've uploaded my custom genome and I'm trying to import a small test dataset which looks exactly like the one shown in this video (at 2:05):
http://www.youtube.com/watch?v=ul4IZ...ABFFC4&index=2

After assigning the different columns as needed, I try to import, but for each read it says:
"Location XXX-YYY was not an integer", no matter the size of the interval.
I'm a bit lost. Do you have any advice?

Thanks again

Are you setting the count column when you import the file? This is a new option which won't be shown in the video and is only for datasets where there is an extra column saying how many times a particular position was seen. There was a bug in the last release which gave the wrong error message if the count value was incorrect, which made it hard to track down the problem. If you are setting the count column, could you try setting it to nothing (leave that selector blank) and see if that fixes it?
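
One way to narrow this kind of error down outside the program is to scan the text file and list every value in the position or count columns that is not a plain integer. The sketch below does not reproduce SeqMonk's own parsing; the column numbers and sample rows are hypothetical, and the check is deliberately stricter than Python's `int()` so that stray whitespace shows up too.

```python
import re

def non_integer_rows(lines, columns, delimiter="\t"):
    """Yield (line_number, column_index, value) for every selected
    column whose value is not a plain integer such as 42 or -7."""
    for line_number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split(delimiter)
        for column in columns:
            # A missing column is reported as an empty value.
            value = fields[column] if column < len(fields) else ""
            if re.fullmatch(r"-?\d+", value) is None:
                yield line_number, column, value

rows = [
    "chr1\t100\t150\t+\t3",
    "chr1\t200-250\t\t-\t1",   # start and end fused into one field
    "chr1\t300\t350 \t+\tx",   # trailing space in end; non-numeric count
]
# Columns 1, 2 and 4 stand for start, end and count in this made-up layout.
for problem in non_integer_rows(rows, columns=(1, 2, 4)):
    print(problem)
```

If the scan reports nothing for the columns you assigned, the file itself is probably fine and the problem is more likely in the import settings or the program.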
Old 11-17-2014, 02:11 AM   #16
chris202
Junior Member
 
Location: belgium

Join Date: Nov 2014
Posts: 7
Default

No, the error is still the same.
Actually, I didn't use that option before (I always left it blank)...
Old 11-17-2014, 02:40 AM   #17
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by chris202
No, the error is still the same.
Actually, I didn't use that option before (I always left it blank)...

OK, maybe there's something else broken in there. I'll have a go myself and get back to you.
Old 11-17-2014, 08:48 AM   #18
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by chris202
No, the error is still the same.
Actually, I didn't use that option before (I always left it blank)...

I've had a play with the options here to see if I can reproduce this and I've not managed to break it. I've added some better error reporting to the code, so could you try this development snapshot to see if the problem still occurs and, if so, what error you get?

If it still fails with the same nonsensical errors then could you send me a bit of the file which is failing to import and let me know which assembly you're using, so I can reproduce it here and get to the bottom of what's failing.
Old 11-19-2014, 12:27 AM   #19
chris202
Junior Member
 
Location: belgium

Join Date: Nov 2014
Posts: 7
Default

Quote:
Originally Posted by simonandrews
I've had a play with the options here to see if I can reproduce this and I've not managed to break it. I've added some better error reporting to the code, so could you try this development snapshot to see if the problem still occurs and, if so, what error you get?

If it still fails with the same nonsensical errors then could you send me a bit of the file which is failing to import and let me know which assembly you're using, so I can reproduce it here and get to the bottom of what's failing.

I've used the new version but the problem is still the same.
I can send you a piece of my dataset; however, I use a custom-built genome, so I can send you the GFF file as well. Do you have a more private address than this forum?
Old 11-19-2014, 12:43 AM   #20
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by chris202
I've used the new version but the problem is still the same.
I can send you a piece of my dataset; however, I use a custom-built genome, so I can send you the GFF file as well. Do you have a more private address than this forum?

Thanks Chris. Drop me an email at simon.andrews@babraham.ac.uk and we can work out the details.