I am running this on a CDH3 cluster currently.
Crossbow prerequisites mention the following among others:
1. A bowtie v0.12.8 executable must exist at the same path on all cluster nodes (including the master).
2. A Crossbow-customized version of soapsnp v1.02 must be installed
I cannot seem to find these specific versions. Would newer versions of these components work, namely Bowtie-1.0.0 and SOAPsnp-v1.03?
Also, I am running this command:
./cb_hadoop --preprocess --input=hdfs:///crossbow/example/e_coli/small.manifest --output=hdfs:///crossbow/example/e_coli/output_small --reference=hdfs:///crossbow-refs/e_coli.jar --all-haploids --streaming-jar=/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.jar
and I consistently get this error:
at SRR014475.fastq_1.out.gz | /usr/bin/md5sum | cut -d' ' -f 1
Error: No command named `-version' was found. Perhaps you meant `hadoop version'
hadoop could not be found in HADOOP_HOME or PATH; please specify --hadoop
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:390)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
As per Michael S's suggestion, I have created a symlink from the Hadoop streaming jar to hadoop-streaming-0.20.jar, and HADOOP_HOME is set to /usr/lib/hadoop-0.20.
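For reference, the lookup that the error message describes can be reproduced by hand on a node. This is only a rough sketch: it assumes the launcher is expected at $HADOOP_HOME/bin/hadoop (the message itself only says "HADOOP_HOME or PATH"), and find_hadoop is a made-up helper name, not part of Crossbow or Hadoop.

```shell
# Hypothetical helper (not part of Crossbow): print the hadoop launcher
# that would be found via HADOOP_HOME first, then via PATH.
# Assumption: the launcher lives at $HADOOP_HOME/bin/hadoop.
find_hadoop() {
    if [ -n "${HADOOP_HOME:-}" ] && [ -x "$HADOOP_HOME/bin/hadoop" ]; then
        echo "$HADOOP_HOME/bin/hadoop"
        return 0
    fi
    if command -v hadoop >/dev/null 2>&1; then
        command -v hadoop
        return 0
    fi
    return 1
}

# Report the result for the current shell environment.
find_hadoop || echo "hadoop not found via HADOOP_HOME or PATH"
```

Note that the failing step runs inside a streaming map task, so the environment that matters there may differ from the one in my login shell, where HADOOP_HOME is set.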
Any help is appreciated! I have previously tried, without success, to run this on CDH4 and IDH clusters, and I am now attempting to get to a running baseline with CDH3, as it was only tested on this version.
Thanks.