How to Run a Hadoop MapReduce Program in Ubuntu?

Many beginners have asked me this basic question: “How do I run a Hadoop MapReduce program in Ubuntu?”

I can understand the difficulty beginners face using Hadoop in Ubuntu, so I am going to answer it with a brief explanation.
After installation, we generally want to test whether Hadoop was installed successfully.
The first thing to do is check whether the expected Hadoop processes are running, using the “jps” tool (it ships with the JDK), run here from the hadoop directory.

hduser@ubuntu:/hadoop$ jps
2287 TaskTracker
2149 JobTracker
1938 DataNode
2085 SecondaryNameNode
2349 Jps
1788 NameNode
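
A quick way to read that listing is to compare it against the five daemons shown above. The sketch below is a hypothetical helper (the name missing_daemons is mine, not part of Hadoop or the JDK). It assumes a single-node Hadoop 1.x setup like this one, reads jps-style output on standard input, and prints any daemon that is absent.

```shell
# Hypothetical helper: reads `jps` output on stdin and prints every
# expected Hadoop 1.x daemon that does not appear in it.
missing_daemons() (
    # Column 2 of each jps line is the process name.
    names=$(awk '{ print $2 }')
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # -x matches the whole line, so NameNode never matches SecondaryNameNode.
        if ! printf '%s\n' "$names" | grep -qx "$daemon"; then
            printf '%s\n' "$daemon"
        fi
    done
)
```

Running it as `jps | missing_daemons` should print nothing when all five daemons are up.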

Sometimes running this tool raises an error saying that jps is not present and that another JDK version needs to be installed. In that case, use the netstat command to check whether Hadoop is listening on the configured ports.

hduser@ubuntu:~$ sudo netstat -plten | grep java
tcp   0  0 0.0.0.0:50070   0.0.0.0:*  LISTEN  1001  9236  2471/java
tcp   0  0 0.0.0.0:50010   0.0.0.0:*  LISTEN  1001  9998  2628/java
tcp   0  0 0.0.0.0:48159   0.0.0.0:*  LISTEN  1001  8496  2628/java
tcp   0  0 0.0.0.0:53121   0.0.0.0:*  LISTEN  1001  9228  2857/java
tcp   0  0 127.0.0.1:54310 0.0.0.0:*  LISTEN  1001  8143  2471/java
tcp   0  0 127.0.0.1:54311 0.0.0.0:*  LISTEN  1001  9230  2857/java
tcp   0  0 0.0.0.0:59305   0.0.0.0:*  LISTEN  1001  8141  2471/java
tcp   0  0 0.0.0.0:50060   0.0.0.0:*  LISTEN  1001  9857  3005/java
tcp   0  0 0.0.0.0:49900   0.0.0.0:*  LISTEN  1001  9037  2785/java
tcp   0  0 0.0.0.0:50030   0.0.0.0:*  LISTEN  1001  9773  2857/java
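
To compare that output against the ports in your own configuration, something like the following may help. listening_ports is a hypothetical helper, and it assumes the netstat -plten column layout shown above (local address in column 4, state in column 6).

```shell
# Hypothetical helper: reads `netstat -plten` lines on stdin and prints
# the distinct listening port numbers, sorted numerically.
listening_ports() {
    awk '$6 == "LISTEN" {
        # The local address looks like ip:port; keep the part after the last colon.
        n = split($4, parts, ":")
        print parts[n]
    }' | sort -nu
}
```

For example, `sudo netstat -plten | grep java | listening_ports` gives one port per line, which is easier to check against your Hadoop configuration files than the raw listing.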

Yes, they are listening. The next step is to run the sample program:

hduser@ubuntu:/home/hemanth/hadoop$ bin/hadoop jar hadoop-examples-1.0.4.jar pi 4 100

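Here 4 is the number of maps and 100 is the number of samples per map. When the job finishes, it prints the estimated value of pi.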
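
If you are curious what the pi job computes, here is a rough local sketch of the idea. It is not the Hadoop implementation: the bundled example uses a quasi-Monte Carlo method spread across map tasks, while this hypothetical estimate_pi function uses plain pseudo-random sampling in a single awk process. The two arguments only mirror the maps and samples-per-map arguments of the command above.

```shell
# Hypothetical local sketch: estimate pi by sampling random points in the
# unit square and counting how many fall inside the quarter circle.
estimate_pi() {
    # $1 = number of "maps", $2 = samples per map (mirrors `pi 4 100`)
    awk -v maps="$1" -v samples="$2" 'BEGIN {
        srand(42)                      # fixed seed so repeated runs agree
        total = maps * samples
        inside = 0
        for (i = 0; i < total; i++) {
            x = rand(); y = rand()
            if (x * x + y * y <= 1) inside++
        }
        printf "%.4f\n", 4 * inside / total
    }'
}
```

Calling `estimate_pi 4 100` uses 400 samples in total, so expect only a rough approximation. More samples give a closer estimate, which is also true of the Hadoop job.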

Another scenario is running a MapReduce program that we have written ourselves.

1. Save the MapReduce program under a file name that matches its class, such as WordCount.java, in a folder named wordcount_classes inside the hadoop folder.
2. Next, execute the following commands in order (don't forget to create the input directory in HDFS first; the output directory must not exist yet, because the job creates it).

$ javac -classpath /home/hemanth/hadoop/hadoop-core-1.0.4.jar -d /home/hemanth/hadoop/wordcount_classes /home/hemanth/hadoop/wordcount_classes/WordCount.java

root@ubuntu:/home/hemanth/hadoop# jar -cvf wordcount.jar -C wordcount_classes/ .

hduser@ubuntu:/home/hemanth/hadoop$ bin/hadoop jar /home/hemanth/hadoop/wordcount.jar WordCount -r 2 /user/hduser/gutenberg /user/hduser/gutenberg-output2
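
The compile, package, and submit steps can be sketched as one small helper. wordcount_commands is a hypothetical function that only prints the three commands, so you can review the sequence before running anything. It assumes the same layout as this post: hadoop-core-1.0.4.jar and a wordcount_classes folder inside the hadoop directory.

```shell
# Hypothetical helper: prints (does not run) the compile, package, and
# submit commands for the WordCount job.
wordcount_commands() (
    hadoop_dir=$1   # local Hadoop directory, e.g. /home/hemanth/hadoop
    input=$2        # HDFS input directory (must already exist)
    output=$3       # HDFS output directory (must not exist yet)
    echo "javac -classpath $hadoop_dir/hadoop-core-1.0.4.jar -d $hadoop_dir/wordcount_classes $hadoop_dir/wordcount_classes/WordCount.java"
    echo "jar -cvf $hadoop_dir/wordcount.jar -C $hadoop_dir/wordcount_classes/ ."
    echo "$hadoop_dir/bin/hadoop jar $hadoop_dir/wordcount.jar WordCount -r 2 $input $output"
)
```

For example, `wordcount_commands /home/hemanth/hadoop /user/hduser/gutenberg /user/hduser/gutenberg-output2` prints the three commands with the paths used in this post.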

Now check the output. That's all.
Feel free to ask me any questions about this. Thanks 🙂

BIG DATA + ORACLE + HADOOP

This is my idea and explanation for a colleague who asked me how to use Oracle technology with big data.

As you can see below, we can combine the open source technology Hadoop with Oracle technologies. I hope you can easily understand it 🙂

[Figure: bigdata]

MapReduce Program Error in Ubuntu

Sometimes running a MapReduce program results in the error below. The reason is that a directory the job needs to create already exists in HDFS (here, the temporary directory of the pi example), which usually means an earlier run left it behind.

hduser@ubuntu:/home/hemanth/hadoop$ bin/hadoop jar hadoop-examples-1.0.4.jar pi 10 100
Warning: $HADOOP_HOME is deprecated.
Number of Maps = 10
Samples per Map = 100
java.io.IOException: Tmp directory hdfs://localhost:54310/user/hduser/PiEstimator_TMP_3_141592654 already exists. Please remove it first.
    at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:270)
    at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I resolved the issue like this:

hduser@ubuntu:/home/hemanth/hadoop$ cd bin
hduser@ubuntu:/home/hemanth/hadoop/bin$ ./hadoop fs -rmr hdfs://localhost:54310/user/hduser/PiEstimator_TMP_3_141592654
Warning: $HADOOP_HOME is deprecated.
Deleted hdfs://localhost:54310/user/hduser/PiEstimator_TMP_3_141592654

Now run the program again and you will get the output.
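
If you hit this error often, picking the directory out of the error text by hand gets tedious. The sketch below is a hypothetical helper (hdfs_path_from_error is my own name) that assumes the error message contains the directory as an hdfs:// URI, as it does in the message above, and prints the first such URI found on standard input.

```shell
# Hypothetical helper: reads a job's error output on stdin and prints the
# first hdfs:// URI it contains (the directory that already exists).
hdfs_path_from_error() {
    # -o prints only the matching part; the URI ends at the first space.
    grep -o 'hdfs://[^ ]*' | head -n 1
}
```

Feed it the saved output of the failed job, confirm that the printed path really is the leftover directory, and then pass that path to the hadoop fs -rmr command shown earlier.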