SEQanswers

Old 11-03-2016, 03:53 AM   #21
Heena_2002
Junior Member
 
Location: India

Join Date: Nov 2016
Posts: 4
Default

Hello Jiri,

Thanks for your prompt reply. I tried running it on two sets of sampled data, one with 1 million and another with 1.2 million sequences. To sample the large dataset, I used the Sequence sampling tool available on Galaxy, and then ran the RepeatExplorer clustering program. I checked the results once the files turned green in the history panel. However, the generated HTML and cluster files were empty. According to the log, there seems to be an error during the rpsblast step of the pipeline (as far as I can understand). For your reference, I am providing the portion of the log file where I noticed the error. I hope this will help you figure out the problem and guide me through it. If you need any more information, I will provide it.

Thank you!

******************************************************
10
Clustering step 1 - renaming read names using integers
2016-11-02 20:32
Converting hitsort to binary format
2016-11-02 20:35
Clustering
2016-11-02 20:36
creating membership and cls files
2016-11-02 20:37
Wed Nov 2 20:37:52 CET 2016
Evaluating connection between clusters
2016-11-02 20:37
Running rpsblast
2016-11-02 20:40
USAGE
rpsblast [-h] [-help] [-import_search_strategy filename]
[-export_search_strategy filename] [-db database_name]
[-dbsize num_letters] [-gilist filename] [-seqidlist filename]
[-negative_gilist filename] [-entrez_query entrez_query]
[-query input_file] [-out output_file] [-evalue evalue]
[-word_size int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]
[-xdrop_gap_final float_value] [-searchsp int_value]
[-max_hsps_per_subject int_value] [-seg SEG_options]
[-soft_masking soft_masking] [-culling_limit int_value]
[-best_hit_overhang float_value] [-best_hit_score_edge float_value]
[-window_size int_value] [-lcase_masking] [-query_loc range]
[-parse_deflines] [-outfmt format] [-show_gis]
[-num_descriptions int_value] [-num_alignments int_value] [-html]
[-max_target_seqs num_sequences] [-num_threads int_value] [-remote]
[-comp_based_stats compo] [-use_sw_tback] [-version]

DESCRIPTION
Reverse Position Specific BLAST 2.2.28+

Use '-help' to print detailed descriptions of command line arguments
USAGE
rpsblast [-h] [-help] [-import_search_strategy filename]
[-e

========================================================================

USAGEUSUnkno: Unknown argument: "d"
================================================================



nknown argument: "d"
Error: : Unknown argument: "d"
nknown argument: "d"
Error: Unknown argument: "d"
Process 3:
Process 1:
Process 6:
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
self.run()
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 11, in fun
self._target(*self._args, **self._kwargs)
self.run()
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 11, in fun
self.run()
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 11, in fun
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 11, in fun
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 11, in fun
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 11, in fun
pipe.send(f(x))
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 97, in command_star
return(command(*args))
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 110, in shell_command
raise Error
NameError: global name 'Error' is not defined
pipe.send(f(x))
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 97, in command_star
return(command(*args))
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 110, in shell_command
raise Error
NameError: global name 'Error' is not defined
pipe.send(f(x))
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 97, in command_star
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 97, in command_star
return(command(*args))
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 110, in shell_command
return(command(*args))
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 110, in shell_command
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 110, in shell_command
raise Error
raise Error
raise Error
raise Error
NameError: global name 'Error' is not defined
NameError: global name 'Error' is not defined
NameError: global name 'Error' is not defined
NameError: global name 'Error' is not defined
NameError: global name 'Error' is not defined
USAGE
rpsblast [-h] [-help] [-import_search_strategy filename]
[-export_search_strategy filename] [-db database_name]
[-dbsize num_letters] [-gilist filename] [-seqidlist filename]
[-negative_gilist filename] [-entrez_query entrez_query]
[-query input_file] [-out output_file] [-evalue evalue]
[-word_size int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]
[-xdrop_gap_final float_value] [-searchsp int_value]
[-max_hsps_per_subject int_value] [-seg SEG_options]
[-soft_masking soft_masking] [-culling_limit int_value]
[-best_hit_overhang float_value] [-best_hit_score_edge float_value]
[-window_size int_value] [-lcase_masking] [-query_loc range]
[-parse_deflines] [-outfmt format] [-show_gis]
[-num_descriptions int_value] [-num_alignments int_value] [-html]
[-max_target_seqs num_sequences] [-num_threads int_value] [-remote]
[-comp_based_stats compo] [-use_sw_tback] [-version]

DESCRIPTION
Reverse Position Specific BLAST 2.2.28+

Use '-help' to print detailed descriptions of command line arguments
========================================================================

Error: Unknown argument: "d"
USAGE
rpsblast [-h] [-help] [-import_search_strategy filename]
[-export_search_strategy filename] [-db database_name]
[-dbsize num_letters] [-gilist filename] [-seqidlist filename]
[-negative_gilist filename] [-entrez_query entrez_query]
[-query input_file] [-out output_file] [-evalue evalue]
[-word_size int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]
[-xdrop_gap_final float_value] [-searchsp int_value]
[-max_hsps_per_subject int_value] [-seg SEG_options]
[-soft_masking soft_masking] [-culling_limit int_value]
[-best_hit_overhang float_value] [-best_hit_score_edge float_value]
[-window_size int_value] [-lcase_masking] [-query_loc range]
[-parse_deflines] [-outfmt format] [-show_gis]
[-num_descriptions int_value] [-num_alignments int_value] [-html]
[-max_target_seqs num_sequences] [-num_threads int_value] [-remote]
[-comp_based_stats compo] [-use_sw_tback] [-version]

DESCRIPTION
Reverse Position Specific BLAST 2.2.28+

Use '-help' to print detailed descriptions of command line arguments
========================================================================

Error: Unknown argument: "d"
USAGE
rpsblast [-h] [-help] [-import_search_strategy filename]
[-export_search_strategy filename] [-db database_name]
[-dbsize num_letters] [-gilist filename] [-seqidlist filename]
[-negative_gilist filename] [-entrez_query entrez_query]
[-query input_file] [-out output_file] [-evalue evalue]
[-word_size int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]
[-xdrop_gap_final float_value] [-searchsp int_value]
[-max_hsps_per_subject int_value] [-seg SEG_options]
[-soft_masking soft_masking] [-culling_limit int_value]
[-best_hit_overhang float_value] [-best_hit_score_edge float_value]
[-window_size int_value] [-lcase_masking] [-query_loc range]
[-parse_deflines] [-outfmt format] [-show_gis]
[-num_descriptions int_value] [-num_alignments int_value] [-html]
[-max_target_seqs num_sequences] [-num_threads int_value] [-remote]
[-comp_based_stats compo] [-use_sw_tback] [-version]

DESCRIPTION
Reverse Position Specific BLAST 2.2.28+

Use '-help' to print detailed descriptions of command line arguments
========================================================================

Process 8:
Error: Unknown argument: "d"
Traceback (most recent call last):
File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "/mnt/raid_galaxy/home/galaxy/galaxy-dist/tools/umbr_programs/seqclust/programs/parallel.py", line 11, in fun
USAGE
rpsblast [-h] [-help] [-import_search_strategy filename]
[-export_search_strategy filename] [-db database_name]
[-dbsize num_letters] [-gilist filename] [-seqidlist filename]
[-negative_gilist filename] [-entrez_query entrez_query]
[-query input_file] [-out output_file] [-evalue evalue]
[-word_size int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]
[-xdrop_gap_final float_value] [-searchsp int_value]
[-max_hsps_per_subject int_value] [-seg SEG_options]
[-soft_masking soft_masking] [-culling_limit int_value]
[-best_hit_overhang float_value] [-best_hit_score_edge float_value]
[-window_size int_value] [-lcase_masking] [-query_loc range]
[-parse_deflines] [-outfmt format] [-show_gis]
[-num_descriptions int_value] [-num_alignments int_value] [-html]
[-max_target_seqs num_sequences] [-num_threads int_value] [-remote]
[-comp_based_stats compo] [-use_sw_tback] [-version]

DESCRIPTION
Reverse Position Specific BLAST 2.2.28+

Use '-help' to print detailed descriptions of command line arguments
========================================================================
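A side note on the tracebacks in the log above: the repeated `NameError: global name 'Error' is not defined` comes from the `raise Error` statement at line 110 of parallel.py, which suggests that the name `Error` is not defined there, so the real failure (rpsblast exiting with "Unknown argument") is hidden behind a second, unrelated error. Below is a minimal sketch of how a helper of this kind could report a failed command with a defined exception. It is written for Python 3 (the log shows Python 2.6), and `CommandError` and this version of `shell_command` are hypothetical names for illustration, not the actual RepeatExplorer code.

```python
import subprocess
import sys


class CommandError(Exception):
    """Hypothetical exception carrying details of a failed external command."""

    def __init__(self, command, returncode, stderr):
        super().__init__(
            "command %r exited with status %d: %s" % (command, returncode, stderr)
        )
        self.command = command
        self.returncode = returncode
        self.stderr = stderr


def shell_command(command):
    """Run a command given as an argument list; raise CommandError on failure."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    if proc.returncode != 0:
        # Pass the tool's own error text along instead of masking it.
        raise CommandError(
            command, proc.returncode, err.decode(errors="replace").strip()
        )
    return out.decode(errors="replace")


if __name__ == "__main__":
    # Demonstration that uses the Python interpreter as a stand-in external tool.
    failing = [sys.executable, "-c", "import sys; sys.exit(2)"]
    try:
        shell_command(failing)
    except CommandError as exc:
        print("failed with status", exc.returncode)
```

With an exception like this, each worker process would report the exit status and error text of the tool that failed.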
Old 11-03-2016, 04:09 AM   #22
jimacas
Junior Member
 
Location: Czech Republic

Join Date: Mar 2014
Posts: 9
Default

Hi, this problem is related to running RPS-BLAST; I suppose you checked the option "Search conserved domain database" in the input form. Please try to run your analysis without selecting this option.
(Since the RPS-BLAST analysis takes a very long time and does not provide much additional information, we are in the process of removing this step from the pipeline. The option remained in this old version of the input form but will be removed during the next upgrade.)
Please let me know if there is any other problem.
Jiri
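
For anyone running the command-line version who sees the same message: the usage text printed in the log lists `-db` and `-query`, and the error is `Unknown argument: "d"`. That would be consistent with rpsblast 2.2.28+ being called with a `-d` option it does not accept, but this is only an inference from the log and is not confirmed anywhere in this thread. The sketch below builds an rpsblast argument list using only option names that appear in that usage text. The function name, the file paths, and the default e-value are placeholders chosen for illustration.

```python
def build_rpsblast_command(query, database, output, evalue=0.01):
    """Return an rpsblast argument list that uses BLAST+ option names.

    Only options listed in the USAGE text from the log are used here.
    The default e-value is an arbitrary placeholder, not a recommendation.
    """
    return [
        "rpsblast",
        "-query", query,
        "-db", database,
        "-out", output,
        "-evalue", str(evalue),
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute the files of your own installation.
    command = build_rpsblast_command("reads.fasta", "/path/to/cdd", "hits.txt")
    print(" ".join(command))
```

Passing the arguments as a list keeps paths that contain spaces intact if the command is later handed to the `subprocess` module.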
Old 11-03-2016, 04:35 AM   #23
jimacas
Junior Member
 
Location: Czech Republic

Join Date: Mar 2014
Posts: 9
Default

You can also try running your analysis on our new server at https://galaxy-elixir.cerit-sc.cz/ , where a new version of Galaxy and of our scripts is available.

Jiri
Old 12-23-2016, 08:23 AM   #24
zhou_hye
Junior Member
 
Location: United States

Join Date: Oct 2014
Posts: 1
Default Calls: %dopar% -> <Anonymous>

Hi everyone,
When I run RepeatExplorer on my short reads (100 bp), I get the error message below.

graph of cluster CL209 with 107 nodes, 246edges created
layout for cluster CL209 - calculation start at - 2016-12-20 16:01:53
ncol file for cluster CL213 loaded - 2016-12-20 16:01:53
ncol file converted to graph for cluster CL213 2016-12-20 16:01:53
graph of cluster CL213 with 105 nodes, 1419edges created
layout for cluster CL213 - calculation start at - 2016-12-20 16:01:53
Error in { : task 1 failed - "unused argument (verbose = FALSE)"
Calls: %dopar% -> <Anonymous>
Execution halted
exit status:1

Does anyone know what is going on here?
My input file is valid FASTA, and it contains no characters like '#'. Thank you.
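
A quick way to confirm that no sequence name contains '#' is to scan the header lines of the FASTA file. The sketch below assumes a plain-text FASTA file in which header lines start with '>'. The function name is made up for this example, and the commented-out file name is simply the one from the command quoted later in this thread.

```python
def find_headers_with_char(lines, char="#"):
    """Return (line_number, header) pairs for FASTA headers that contain char."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Only header lines are checked; sequence lines are ignored.
        if line.startswith(">") and char in line:
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Made-up records for illustration.
    example = [">read1\n", "ACGT\n", ">read#2\n", "TTGA\n"]
    print(find_headers_with_char(example))
    # To scan a real file, pass the open file handle instead:
    # with open("A_S_clean.fa") as handle:
    #     print(find_headers_with_char(handle))
```

An empty list means that no header contains the character.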
Old 12-23-2016, 08:31 AM   #25
zhou77
Junior Member
 
Location: US

Join Date: Dec 2016
Posts: 2
Default repeat explorer bug

Hello,
When I run RepeatExplorer on my 100 Mb of short-read data (100 bp reads), I get the error message below:
graph of cluster CL209 with 107 nodes, 246edges created
layout for cluster CL209 - calculation start at - 2016-12-20 16:01:53
ncol file for cluster CL213 loaded - 2016-12-20 16:01:53
ncol file converted to graph for cluster CL213 2016-12-20 16:01:53
graph of cluster CL213 with 105 nodes, 1419edges created
layout for cluster CL213 - calculation start at - 2016-12-20 16:01:53
Error in { : task 1 failed - "unused argument (verbose = FALSE)"
Calls: %dopar% -> <Anonymous>
Execution halted
exit status:1

Does anyone know what is going on here? My FASTA input file is fine; it does not contain any characters such as '#' in the sequence names.

Thank you all.
Old 12-23-2016, 08:44 AM   #26
jimacas
Junior Member
 
Location: Czech Republic

Join Date: Mar 2014
Posts: 9
Default

Hi, could you please specify whether you ran your analysis with the command-line scripts or on a public Galaxy/RepeatExplorer web server? In the latter case, please submit a bug report by clicking the bug symbol in the history item that finished with an error. This way we will be able to inspect the parameters of your runs and find the problem.

Thanks, Jiri Macas

Old 12-23-2016, 08:54 AM   #27
zhou77
Junior Member
 
Location: US

Join Date: Dec 2016
Posts: 2
Default

Hi Jiri,
Thank you for your quick reply. I used the command line, and my command is:

python /usr/local/apps/repeatexplore/052015/seqclust_cmd.py -s A_S_clean.fa -f 4 -v ./TEST_1_1 -c 10

I just used the default parameters. The input reads are all 100 bp.

Below is seqclust.log:
input parameters:
WD=/usr/local/apps/repeatexplore/052015/umbr_programs/seqclust/programs
DATA=/lustre1/hz09961/angiosperms/TEST_REPEATEXPLORER/A_S_clean.fa
STARTDIR=/lustre1/hz09961/angiosperms/TEST_REPEATEXPLORER/TEST_1
PROC=48
OVL=40
CAPARGS='-o 40 -p 80'
MINCL=105
BLASTGR=1000
MINRD=5
CODE=seqClust
CONFIGFILE=/usr/local/apps/repeatexplore/052015/config.sh
BASEDIR=/lustre1/hz09961/angiosperms/TEST_REPEATEXPLORER/TEST_1/seqClust
PAIREDREADS=false
MGBLAST_OVERLAP=55

CONFIGFILE content:
GALAXY_DIR=/mnt/raid/users/petr/galaxy-dist # this variable is not neccessary for command line version
# USE ABSOLUTE PATHS

# directory with RepeatMAsker installation
export REPEAT_MASKER=/usr/local/apps/repeatmasker/4.0.5_perl_threaded/RepeatMasker # set according your local installation, if RepeatMasker is in path you comment the line out
# Conserved domain database files :location:
export RPSBLAST_DATABASE=/db/ncbiblast/cdd/latest/cdd # set according your local installation
export RPSBLAST_DATABASE_ANNOTATION=/db/ncbiblast/cdd/latest/cddid_all.tbl # set according your local installation

export PATH=/usr/local/apps/parallel/20150622/src:$PATH
#PATH=${ROOT}/parallel/src:$PATH
#export PATH
export TGICL=/usr/local/apps/repeatexplore/052015/tgicl_linux
# directory with louvain clustering exacutables:
export PROG_COMMUNITY=${ROOT}/louvain # make sure that you have compiled source using make!
export OGDF=${ROOT}/OGDF/runOGDFlayout
export JSLIB=$ROOT/umbr_programs/interactive_graph/js # DO NOT MODIFY
export DOMAINDATABASE=$ROOT/tool-data/domains/TE_domains_newest.fasta # DO NOT MODIFY!
export DOMAIN_TYPES=$ROOT/tool-data/domains/classification_newest.csv # DO NOT MODIFY!
export DATABASE_PBS=${ROOT}/tool-data/tRNA/tRNAs_arabidopsis_unique # DO NOT MODIFY!
export MIN_MINCL=20 #limit for minimal size of the MINCL, DO NOT MODIFY
export MAXEDGES=350000000 # which can be processed with 16G RAM, depends on computer memory and is used to determine maximal amount of data which could be process DO NOT MODIFY
export MAXEDGES_FOR_LAYOUT=25000000 # adjusted down for fmmm layout
export MAXNODES_FOR_LAYOUT=50000
export PROC_LAYOUT=4
# recommendations 16G RAM then MAXEDGES~341576829

TGICL=/usr/local/apps/repeatexplore/052015/tgicl_linux
PROG_COMMUNITY=/usr/local/apps/repeatexplore/052015/louvain
DOMAINDATABASE=/usr/local/apps/repeatexplore/052015/tool-data/domains/TE_domains_newest.fasta
DOMAIN_TYPES=/usr/local/apps/repeatexplore/052015/tool-data/domains/classification_newest.csv
REPEAT_MASKER=/usr/local/apps/repeatmasker/4.0.5_perl_threaded/RepeatMasker
**************************************************
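
When comparing a log like this against the command that produced it, it can help to load the recorded parameters into a dictionary. The sketch below parses the simple KEY=VALUE lines from the "input parameters" section. The function name is invented for this example, and the parser deliberately skips the `export ...` lines of the config file and does not handle trailing comments.

```python
def parse_parameters(lines):
    """Collect simple KEY=VALUE lines into a dict.

    Blank lines, comment lines, 'export' lines, and lines without '='
    are skipped. Surrounding single quotes are stripped from values.
    """
    params = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("export "):
            continue
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip().strip("'")
    return params


if __name__ == "__main__":
    # A few lines copied from the log above.
    log_lines = [
        "input parameters:",
        "PROC=48",
        "OVL=40",
        "CAPARGS='-o 40 -p 80'",
        "PAIREDREADS=false",
    ]
    for key, value in sorted(parse_parameters(log_lines).items()):
        print(key, "=", value)
```

All values are kept as strings, so numeric settings such as PROC need an explicit `int()` conversion before they are compared.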
Old 01-08-2017, 04:34 AM   #28
Heena_2002
Junior Member
 
Location: India

Join Date: Nov 2016
Posts: 4
Default

Hi Jiri,
I have a question regarding the RepeatExplorer pipeline on Galaxy. I am analyzing the data from my run and finding it very difficult to summarize the repeat composition owing to the large amount of data. Could you please tell me where I can find compiled information on the repeats? I have read, and also seen for myself, that the automatic classification system is not available in the public RE.

Thanks in advance!
Old 01-09-2017, 06:42 AM   #29
jimacas
Junior Member
 
Location: Czech Republic

Join Date: Mar 2014
Posts: 9
Default

Hi, we have a testing version of the automated repeat classification running on our new RepeatExplorer server at https://repeatexplorer-elixir.cerit-sc.cz/ , so you can give it a try. We suggest that users move to this server anyway, as it offers more computational and storage resources (and the old one will be discontinued in the future).
For repeat quantification with the older version of the pipeline, you should annotate repeats in the top clusters (the largest clusters listed in the summary HTML file) and then calculate the proportions of the different repeat types by summing read counts over the clusters annotated as the same type of repeat. These read counts can be found in the file seqClust/clustering/ncolInfo.txt in the analysis output archive ("Archive with clustering results.." in the Galaxy history).
Please note that we organize a practical workshop on repeat identification and annotation using RE (usually in May); the link is here: http://w3lamc.umbr.cas.cz/repeatexplorer/?page_id=14 . Some talks from the course are available here: http://w3lamc.umbr.cas.cz/repeatexplorer/?page_id=125

Best, Jiri
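
The quantification described above can be sketched in a few lines once the per-cluster read counts and your own annotations are in hand. Everything in this example is an assumption made for illustration: the function name, the cluster names, the counts, the repeat types, and the choice of the total number of analyzed reads as the denominator. The layout of ncolInfo.txt is not described in this thread, so reading that file is left out.

```python
from collections import defaultdict


def repeat_proportions(read_counts, annotations, total_reads):
    """Sum read counts per repeat type and divide by the total read number.

    read_counts: dict mapping cluster name -> number of reads in the cluster
    annotations: dict mapping cluster name -> repeat type assigned by the user
    total_reads: number of reads used in the clustering (assumed denominator)
    Clusters without an annotation are ignored.
    """
    totals = defaultdict(int)
    for cluster, count in read_counts.items():
        repeat_type = annotations.get(cluster)
        if repeat_type is not None:
            totals[repeat_type] += count
    return {rtype: count / total_reads for rtype, count in totals.items()}


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    read_counts = {"CL1": 500, "CL2": 300, "CL3": 200, "CL4": 50}
    annotations = {"CL1": "Ty3/gypsy", "CL2": "satellite", "CL3": "Ty3/gypsy"}
    proportions = repeat_proportions(read_counts, annotations, total_reads=10000)
    for repeat_type, fraction in sorted(proportions.items()):
        print("%s: %.2f%%" % (repeat_type, 100 * fraction))
```

Unannotated clusters (CL4 in the example) are left out of every category, so the proportions do not have to sum to the total repeat content.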


Old 01-22-2017, 07:33 AM   #30
Heena_2002
Junior Member
 
Location: India

Join Date: Nov 2016
Posts: 4
Default

Hi Jiri,
As you suggested, I tried to run my analysis on the new RepeatExplorer server at https://repeatexplorer-elixir.cerit-sc.cz/ . I uploaded a 9 Gb FASTQ file to the site through the FTP server. However, the file is taking too long to move from the FTP directory to my history pane (the job has been running for 2 days and has still not finished!). I tried to contact the server administrator, but that is not working either. Is this because the server is down, or is there some other technical problem? As I remember, the last time I ran RepeatExplorer on Galaxy it worked fine.

Please help me regarding this.

Thanks!

Best,
Heena
Old 01-23-2017, 04:15 AM   #31
jimacas
Junior Member
 
Location: Czech Republic

Join Date: Mar 2014
Posts: 9
Default

That was a temporary problem with file upload. Next time, please use this email address to send a problem report directly to the server administrator: [email protected]
Please also note that uploads may be blocked if you exceed your storage quota (50 Gb by default).

Jiri


Last edited by jimacas; 01-23-2017 at 04:21 AM.

Tags
clusters, diagnostic applications, miseq output, repeatexplorer
