Monday, November 14, 2011

How to debug a Hadoop program?

Hadoop can run in three modes:
  • Standalone (local)
  • Pseudo-distributed
  • Fully distributed
When developing a Hadoop program, standalone mode gives you a quick and easy way to debug it. How do you use it? This is Hadoop's default configuration: it runs in non-distributed mode, as a single Java process. The example below copies the conf directory's XML files to use as input, runs the bundled grep job over them, and prints the matches:

$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
$ cat output/*
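
Because standalone mode runs the whole job, map and reduce tasks included, inside one JVM, you can attach a Java debugger to it. Here is a minimal sketch of one way to do that, assuming the bin/hadoop launcher script picks up the HADOOP_OPTS environment variable and that port 8000 is free (the port number and the DEBUG_PORT variable name are my own choices, not anything Hadoop requires):

```shell
# Assumption: port 8000 is unused; any free port works.
DEBUG_PORT=8000

# Standard JDWP agent options: open a debug socket and pause the JVM
# until a debugger attaches (suspend=y).
export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=${DEBUG_PORT}"
echo "HADOOP_OPTS=${HADOOP_OPTS}"

# Run the same example job as above, but only if the launcher script
# exists in the current directory. Note that the job will not start if
# the output directory is left over from an earlier run.
if [ -x bin/hadoop ]; then
  bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
fi
```

With suspend=y the job waits at startup until you connect a remote debugger (for example, a "Remote Java Application" configuration in your IDE) to localhost on that port, so breakpoints in your own mapper or reducer code can be set before anything runs. Unset HADOOP_OPTS afterwards to go back to normal runs.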
