Often Google acts like magic for me: type in my error, and out pops the solution. Not so for a Giraph error I recently hit. Hopefully this post lets Google work like magic for someone else :)
After installing Giraph on a BigTop 0.7 VM, I could run the PageRank benchmark, which takes no input and writes no output, but nothing more complicated.
This works:
hadoop jar /usr/share/doc/giraph-1.0.0.5/giraph-examples-1.0.0-for-hadoop-2.0.6-alpha-jar-with-dependencies.jar org.apache.giraph.benchmark.PageRankBenchmark -Dgiraph.zkList=127.0.0.1:2181 -libjars /usr/lib/giraph/giraph-1.0.0-for-hadoop-2.0.6-alpha-jar-with-dependencies.jar -e 1 -s 3 -v -V 50 -w 1
But this:
hadoop jar /usr/share/doc/giraph-1.0.0.5/giraph-examples-1.0.0-for-hadoop-2.0.6-alpha-jar-with-dependencies.jar org.apache.giraph.GiraphRunner -Dgiraph.zkList=127.0.0.1:2181 -libjars /usr/lib/giraph/giraph.jar org.apache.giraph.examples.SimpleShortestPathsVertex -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/acoleman/giraphtest/tiny_graph.txt -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/acoleman/giraphtest/shortestpathsC2 -ca SimpleShortestPathsVertex.source=2 -w 1
does not.
Looking at the latest container logs with:
cat $(ls -1rtd $(ls -1rtd /var/log/hadoop-yarn/containers/application_* | tail -1)/container_* | tail -1)/*
I find:
Error: Could not find or load main class org.apache.giraph.yarn.GiraphApplicationMaster
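The log lookup above packs two nested ls calls into one line. A more readable sketch of the same idea is below; the function name latest_container_dir is my own invention, and it assumes container logs live under /var/log/hadoop-yarn/containers as they did on my VM.

```shell
# Print the newest container directory of the newest YARN application.
# latest_container_dir is a hypothetical helper name, not part of Hadoop.
# The default log root is the path used in this post; pass another as $1.
latest_container_dir() {
    log_root="${1:-/var/log/hadoop-yarn/containers}"
    # Newest application directory by modification time
    latest_app=$(ls -1td "$log_root"/application_* 2>/dev/null | head -n 1)
    [ -n "$latest_app" ] || return 1
    # Newest container directory inside that application
    latest_container=$(ls -1td "$latest_app"/container_* 2>/dev/null | head -n 1)
    [ -n "$latest_container" ] || return 1
    printf '%s\n' "$latest_container"
}

# Usage: dump every log file from that container
# cat "$(latest_container_dir)"/*
```

The function returns non-zero when no application or container directory exists, so it can be used in a script without printing a confusing ls error.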
I beat my head against the wall: adding jars to -libjars and to -yj, and copying jars into every directory I could find.
I stumbled across
http://mail-archives.apache.org/mod_mbox/giraph-user/201312.mbox/%3C198091226.KO6f1kuK42@chronos7%3E
which gives the answer. If https://issues.apache.org/jira/browse/GIRAPH-814 hasn't been applied, then mapreduce.application.classpath has to be set explicitly or Giraph simply won't work.
vi /etc/hadoop/conf.pseudo/mapred-site.xml
<property>
<name>mapreduce.application.classpath</name>
<value>/usr/lib/hadoop-mapreduce/*,/usr/lib/hadoop-mapreduce/lib/*,/usr/lib/giraph/giraph-1.0.0-for-hadoop-2.0.6-alpha-jar-with-dependencies.jar</value>
</property>
I did not need to restart yarn-resourcemanager or yarn-nodemanager for this to get picked up.
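To double-check that the edit to mapred-site.xml actually landed, a quick grep is enough. This is only a rough sketch: classpath_mentions_giraph is a made-up helper name, it assumes the value sits within two lines of the name element as in the snippet above, and it relies on grep's -A option (available in GNU and BSD grep).

```shell
# Succeed if the config file sets mapreduce.application.classpath
# to a value that mentions giraph.
# classpath_mentions_giraph is a hypothetical helper, not a Hadoop tool.
# The default path is the one edited in this post; pass another as $1.
classpath_mentions_giraph() {
    conf_file="${1:-/etc/hadoop/conf.pseudo/mapred-site.xml}"
    # Take the name line plus the two lines after it, then look for giraph
    grep -A 2 '<name>mapreduce.application.classpath</name>' "$conf_file" \
        | grep -q 'giraph'
}

# Usage:
# classpath_mentions_giraph && echo "classpath looks right"
```

It is a text match rather than a real XML parse, so it only suits a hand-edited file laid out like the one above.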