SEQanswers

Old 01-28-2011, 12:45 AM   #21
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

Hi Corthay,

No problem, good that it is clear now.

1)
Hmmm, that should never be the case. Are you looking at the summary file to conclude that the total number of bases in the scaffolds has increased? The value reported there (sum (bp)) is the total number of bases WITH N's. The number of bases without N's should be the same as or less than the original total number of bases, since SSPACE tries to merge adjacent contigs if they overlap by at least -n bases.

If you want, I can send you a script which calculates the number of N's in the scaffolds.
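
For readers who want to check this themselves, here is a minimal Python sketch of such a count. It is not the script boetsie offers; the function name `count_bases` and the sample records are made up for illustration, and it assumes plain FASTA input where header lines start with ">".

```python
from typing import Dict, Iterable


def count_bases(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Count total bases and N bases over all records of a FASTA stream."""
    total = 0
    n_count = 0
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and header lines
        total += len(line)
        n_count += line.upper().count("N")  # count both 'N' and 'n'
    return {"total": total, "n": n_count, "without_n": total - n_count}


# Illustrative input; in practice pass an open file handle of the scaffold file.
sample = [">scaffold1", "ACGTNNNN", "acgn", ">scaffold2", "TT"]
print(count_bases(sample))
```

Comparing `without_n` for the input contigs and for the final scaffolds shows whether the number of real bases actually went up.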

2)
The gap size is not ignored; I estimate the size of the gap using the reads.

Kind regards, and no problem about the questions,
Boetsie

Quote:
Originally Posted by corthay View Post
Hi boetsie,

Thanks for your quick reply. I understood how uniqueness is guaranteed.
Then, I have two more questions please.

Firstly, I am wondering why the total number of bases in the scaffolds without N's has increased, even though I set the "-x" option to 0.

Secondly, how do you calculate the distance between reads within a given contig pair?
Do you estimate the size of the gap using the reads, or is the gap size just ignored?

Sorry for asking so many questions.

Thanks
Corthay.
Old 01-28-2011, 05:04 AM   #22
gstitan
Junior Member
 
Location: Essonne

Join Date: Oct 2009
Posts: 7
Thumbs up congratulation

I am using SSPACE and I find this tool very useful and user-friendly (unlike Bambus!).

Thanks!
Old 02-01-2011, 01:43 AM   #23
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

Thank you for this compliment
Old 02-07-2011, 05:08 PM   #24
e-summer-3
Junior Member
 
Location: japan

Join Date: Oct 2010
Posts: 3
Default

Hi, boetsie.
SSPACE is a very good tool for scaffolding. Thank you for your good work.

By the way, how is SSPACE pronounced? "espeis"?
Old 02-10-2011, 12:39 PM   #25
themwg
Junior Member
 
Location: Madison, WI

Join Date: Jan 2011
Posts: 6
Default

Hi, I'm excited to get SSPACE up and running. Unfortunately I'm getting a permission denial when it makes the directories (line 141). SSPACE is installed on a server in a directory where I don't have write permissions, which I suspect is the problem. Is there a way to direct where the results folders end up? Or is my issue much simpler (and dumber)?
Old 02-10-2011, 01:32 PM   #26
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

Hi themwg,

Good that it is working! Unfortunately, you can't specify where the folders end up. The folder structure is generated in your current working directory. Maybe you can turn the problem around: go to the directory where you would like the files and folders to end up and run the program from there. Then specify the full path to the contigs, and also use full paths in the library file for your paired sequences.

If this doesn't work, I can make a customised script for you. You can mail me any time.

Boetsie
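
As an illustration of the "full paths in the library file" advice, here is a minimal Python sketch that rewrites the two read-file columns of a library line as absolute paths. The function name `absolutize_library_line` and the directory `/data/project` are hypothetical, and the sketch assumes the six-column library format shown elsewhere in this thread.

```python
import os


def absolutize_library_line(line: str, base_dir: str) -> str:
    """Return a library line with columns 2 and 3 turned into absolute paths."""
    fields = line.split()
    if len(fields) != 6:  # assumption: six whitespace-separated columns
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
    for i in (1, 2):  # columns 2 and 3 hold the two read files
        if not os.path.isabs(fields[i]):
            fields[i] = os.path.normpath(os.path.join(base_dir, fields[i]))
    return " ".join(fields)


# Hypothetical example: the read files live in /data/project.
print(absolutize_library_line("lib1 reads_1.fastq reads_2.fastq 500 0.25 0", "/data/project"))
```

With a library file rewritten this way, the program can be started from any writable directory, and the output folders are created there.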

Old 02-11-2011, 09:50 AM   #27
themwg
Junior Member
 
Location: Madison, WI

Join Date: Jan 2011
Posts: 6
Default the next problem

Thanks Boetsie for the quick reply.
Sure enough, I get further along if I just point to SSPACE.pl from my own directory. However, I hit a second problem during the "Reading, filtering and converting input sequences" step: it can't write to the single file. The output is below.

=>Fri Feb 11 11:55:38 2011: Reading, filtering and converting input sequences of library '/home/carroll/Desktop/data_carroll/SSPACEtests/leo95130_I' initiated
Can't write to single file -- fatal

=>Fri Feb 11 11:55:38 2011: Storing contigs to format for scaffolding

LIBRARY /home/carroll/Desktop/data_carroll/SSPACEtests/leo95130_I
------------------------------------------------------------

=>Fri Feb 11 11:55:44 2011: Building Bowtie index for contigs (tmp.standard_output/subset_contigs.fasta)

Bowtie-build error; -1 at /opt/SSPACE-1.1_linux-x86_64/bin/mapWithBowtie.pl line 37.
WARNING: No scaffolding, because no reads found on contigs

I imagine the Bowtie-build error is related to the first one. Any thoughts on why it can't write to the single file (is it merging the two sequence files)? Those files are in FASTQ format from Illumina. They are also both quite large (>10GB). My machine has a meager 44GB of RAM, if any of that is at all relevant here.

Thanks!
Old 02-11-2011, 01:53 PM   #28
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

Hi again,

I think I know what the problem is. You have a library called "/home/carroll/Desktop/data_carroll/SSPACEtests/leo95130_I". This is a very strange name for a library. Name it something like "leo95130_I" or "lib1" (without the quotes though). With your current library name, the script will try to create a file containing this library name in the folder 'reads'. The result will be something like:

reads/home/carroll/Desktop/data_carroll/SSPACEtests/leo95130_I.filtered.reads

This will surely cause problems (as you noticed). The other error you get is probably caused by the same problem, namely your library name.

Your library line should be something like:

library1 /path-to-file/filename_1.fastq /path-to-file/filename_2.fastq 500 0.25 0

If you are unable to generate the library file, you can mail me your current library file and I can help you.
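
To make the failure mode concrete, here is a small Python sketch. The file name pattern is taken from the example path above; the helper names `filtered_reads_path` and `is_safe_library_name` are hypothetical and not part of SSPACE.

```python
def filtered_reads_path(library_name: str) -> str:
    """Build the output file name from the library name.

    Assumption: mirrors the 'reads/<name>.filtered.reads' pattern
    described above; the real script may differ in detail.
    """
    return "reads/" + library_name + ".filtered.reads"


def is_safe_library_name(library_name: str) -> bool:
    """Treat a name as safe if it is non-empty and has no path separators or whitespace."""
    return bool(library_name) and not any(ch in library_name for ch in "/\\ \t")


for name in ("lib1", "/home/carroll/Desktop/data_carroll/SSPACEtests/leo95130_I"):
    print(name, is_safe_library_name(name), filtered_reads_path(name))
```

A plain name such as lib1 yields a file directly inside 'reads', whereas a name containing slashes points into nested directories that were never created, so the write fails.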

Kind regards,
Boetsie
Old 02-13-2011, 06:20 AM   #29
goldenflaw
Junior Member
 
Location: UK

Join Date: Apr 2010
Posts: 3
Default

Hello, I am running into some problems while using SSPACE. I believe it has to do with tmp.alboxf_scaffolds_no_extension/subset_contigs.fasta not being built properly, so my question is how is subset_contigs.fasta built?

Thanks!
Old 02-13-2011, 01:24 PM   #30
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

Hi goldenflaw,

What kind of problems are you running into?

The file you mention is generated by taking a short subset of the contigs. How this is done is explained below (and in the README of SSPACE).

Quote:
Before mapping, contigs are shortened to reduce the search space for Bowtie. Only the edges of the contigs are considered for mapping. The length of these edges is determined by the maximal allowed distance, derived from the values entered by the user in the library file (insert size and insert standard deviation). The maximal distance is insert_size + (insert_size * insert_stdev). For example, with an insert size of 500 and a deviation of 0.5, the maximal distance is 500 + (500 * 0.5) = 750. The first 750 bases and the last 750 bases are extracted from the contig sequence, in this case:

------------------------------------------

------------|-----------------|

-------------------------------------------
750bp------------------------750bp
Please do not look at the white stripes in the example. I couldn't get the spacings between the two dashed lines right
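
The edge extraction described in the README excerpt can be sketched in Python as follows. This is only an illustration of the formula, not SSPACE's own code; the function names are made up, and the handling of contigs shorter than twice the maximal distance (kept whole here) is an assumption, since the excerpt does not cover that case.

```python
from typing import List


def max_distance(insert_size: int, insert_stdev: float) -> int:
    """Maximal allowed distance: insert_size + (insert_size * insert_stdev)."""
    return int(round(insert_size + insert_size * insert_stdev))


def contig_edges(sequence: str, insert_size: int, insert_stdev: float) -> List[str]:
    """Return the first and last `max_distance` bases of a contig."""
    limit = max_distance(insert_size, insert_stdev)
    if len(sequence) <= 2 * limit:
        return [sequence]  # assumption: short contigs are kept whole
    return [sequence[:limit], sequence[-limit:]]


contig = "A" * 1000 + "C" * 1000 + "G" * 1000  # toy 3000 bp contig
edges = contig_edges(contig, 500, 0.5)
print(max_distance(500, 0.5), [len(edge) for edge in edges])
```

For the toy contig, only the two 750 bp ends would be handed to Bowtie, and the 1500 bp in the middle are left out of the index.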

Kind regards,
Boetsie

Old 02-13-2011, 01:39 PM   #31
goldenflaw
Junior Member
 
Location: UK

Join Date: Apr 2010
Posts: 3
Default

Thanks for your prompt reply.

I get the following error:

=>Sun Feb 13 22:37:39 2011: Building Bowtie index for contigs (tmp.alboxf_scaffolds_no_extension/subset_contigs.fasta)

Bowtie-build error; -1 at /scratch/yang/tools/SSPACE-1.1_linux-x86_64/bin/mapWithBowtie.pl line 38.
WARNING: No scaffolding, because no reads found on contigs

I believe it might have something to do with bowtie, but I am unsure.

Thanks again!
Old 02-13-2011, 07:37 PM   #32
e-summer-3
Junior Member
 
Location: japan

Join Date: Oct 2010
Posts: 3
Default What should I call?

SSPACE is a very nice tool for us. Thank you for your good work.

By the way, how should I pronounce SSPACE?

es es pace?
es pace?
es space?

Regards.
Old 02-13-2011, 11:07 PM   #33
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

Yes, that's a common problem. Which version of SSPACE do you have?

In most cases the problem was solved by going to the directory where the main SSPACE script (SSPACE_v1-x.pl) and its folders are stored. Then run one of the following:

chmod a+x bowtie/*

or

chmod 777 *

in your command line.

If this doesn't work, you can try downloading the newest Bowtie version at http://sourceforge.net/projects/bowt...bowtie/0.12.7/

Then replace the files in the bowtie folder with the ones you've downloaded.
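
For anyone who prefers to script the permission fix, here is a rough Python equivalent of `chmod a+x bowtie/*`, restricted to regular files. The function name `add_execute_bits` is made up for this sketch, and the demo runs on a temporary directory with a dummy file rather than on a real SSPACE installation.

```python
import os
import stat
import tempfile
from typing import List


def add_execute_bits(directory: str) -> List[str]:
    """Add execute permission for user, group and others to every regular file."""
    changed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # leave sub-directories untouched
        mode = os.stat(path).st_mode
        new_mode = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
        if new_mode != mode:
            os.chmod(path, new_mode)
            changed.append(name)
    return changed


# Demo on a throwaway directory holding one non-executable dummy file.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = os.path.join(tmp, "bowtie-build")
    with open(demo_file, "w") as handle:
        handle.write("")
    os.chmod(demo_file, 0o644)
    demo_changed = add_execute_bits(tmp)
    demo_mode = stat.S_IMODE(os.stat(demo_file).st_mode)
print(demo_changed, oct(demo_mode))
```

Unlike `chmod 777 *`, this only adds the execute bits and leaves write permissions as they were.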

Kind regards,
Boetsie
Old 02-14-2011, 09:47 AM   #34
goldenflaw
Junior Member
 
Location: UK

Join Date: Apr 2010
Posts: 3
Default

Downloading the newest version of bowtie worked (I am using SSPACE-1.1_linux-x86_64). Also, I had extra annotation in my reference (assembled file) and that screwed up bowtie as well (if anyone else runs into the same problem).

Thanks again!
Old 02-15-2011, 01:19 PM   #35
rsw3284
Junior Member
 
Location: Texas

Join Date: Feb 2011
Posts: 2
Default Error with '-a' and insert stdev values

I'm getting the following error when running the SSPACE perl script using: -a = 0.70 (default) and insert stdev of 0.50:

Code:
ERROR: -a must be a number between 0.00 and 1.00. Your inserted -a is .70 ...Exiting.
ERROR: Insert stdev must be a number between 0.00 and 1.00. Your library lib1 has insert size of 0.50. Exiting.
Here are the contents of library.txt:
Code:
lib1 s_6_1_sequence.txt s_6_2_sequence.txt 250 0.50 0
and the command that was run:
Code:
perl SSPACE_v1-1.pl -l libraries.txt -s sk2_originalreads_contigs.fa -x 0 -m 32 -o 20 -t 0 -k 5 -n 15 -p 1 -v 0 -b sk2_origreads_no_extension

This was run on a 64-bit OSX server w/ 32gb RAM.


Edit: I believe this issue was fixed by correcting the permissions on the files involved. However, I'm having the same issue as the user above: WARNING: No scaffolding, because no reads found on contigs
Edit #2: Never mind, changing the permissions to 777 on the directories took care of this issue.

Thanks,


Rsw3284

Last edited by rsw3284; 02-16-2011 at 08:51 AM.
Old 02-16-2011, 09:26 AM   #36
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

Hi rsw3284,

Is it fixed now? To be honest, we did not test SSPACE on a 64-bit Mac OS X server, only on a 32-bit server. However, the above problems look more like a Perl problem than an SSPACE problem.

Boetsie
Old 02-16-2011, 02:12 PM   #37
rsw3284
Junior Member
 
Location: Texas

Join Date: Feb 2011
Posts: 2
Default

Yes, it's working just fine now. Thanks!



- Rsw3284
Old 02-17-2011, 03:29 PM   #38
hliang
Junior Member
 
Location: US

Join Date: Oct 2010
Posts: 3
Default

Hi boetsie,
Thank you for SSPACE. I have a question that came up while reading the MANUAL file that comes with SSPACE:

The libraries.txt file contains information about each library. For each library, columns 2 and 3 are FASTA or FASTQ files for the two ends. Should these FASTA/FASTQ files be different files? I ask because I found this example in the MANUAL file:

Lib1 file1.fasta file2.fasta 400 0.5 1
Lib1 file2.fasta file2.fasta 400 0.5 1
Lib2 file3.fastq file3.fastq 4000 0.75 0

I'm a bit confused. In what kind of cases can file2.fasta or file3.fastq be placed in both column 2 and column 3?
Old 02-17-2011, 11:37 PM   #39
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

Hi Hliang,

Thank you for your question. I see there are some mistakes in the MANUAL.

About your question:

Columns 2 and 3 should always have the same file format within one line. For example, if the file with the first reads is in fastA format, then the file with the second reads should also be in fastA format.

However, if you have multiple library files, you might also have paired reads in fastQ format, which can be used as well.

So, these libraries are OK:

lib1 file1.1.fastA file1.2.fastA 400 0.5 0
lib1 file2.1.fastQ file2.2.fastQ 400 0.5 0

While these are not correct:
lib1 file1.1.fastA file1.2.fastQ 400 0.5 0
lib1 file2.1.fastQ file2.2.fastA 400 0.5 0

Is this what you mean?
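
The rule can be checked before a run with a few lines of Python. This is a sketch under the usual convention that FASTA records start with ">" and FASTQ records start with "@"; the function names are illustrative and this is not how SSPACE itself validates its input.

```python
def sniff_format(first_line: str) -> str:
    """Guess the sequence format from the first line of a read file."""
    if first_line.startswith(">"):
        return "fasta"
    if first_line.startswith("@"):
        return "fastq"
    raise ValueError(f"unrecognised first line: {first_line!r}")


def same_format(first_line_a: str, first_line_b: str) -> bool:
    """True if both read files of a library line appear to share one format."""
    return sniff_format(first_line_a) == sniff_format(first_line_b)


# Illustrative first lines of the two files named on one library line.
print(same_format(">read1/1", ">read1/2"))
print(same_format(">read1/1", "@read1/2"))
```

In practice the two arguments would be the first lines read from the files in columns 2 and 3 of each library line.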

Kind regards,
Boetsie


Old 02-18-2011, 07:31 AM   #40
hliang
Junior Member
 
Location: US

Join Date: Oct 2010
Posts: 3
Default

Thanks for the info.

So the files in column 2 and column 3 should be PAIRED and have the same file format?

Can I concatenate file1.1.fastA and file1.2.fastA into one single file, file_combo.fastA (separating the paired-end sequences by ":"), and use the following line?
lib1 file_combo.fastA file_combo.fastA 400 0.5 0

One more question: is SSPACE suitable for scaffolding with 454 paired-end data? 454 paired-end reads are longer than Illumina/Solexa reads and have a mix of lengths (200-500 bp).

