How to Run a Hadoop MapReduce Program on Ubuntu?

Many beginners have asked me this basic question: “How do I run a Hadoop MapReduce program on Ubuntu?”

I understand the difficulty beginners face when using Hadoop on Ubuntu, so here is a brief explanation.
After installation, we usually want to test whether Hadoop was installed successfully.
The first thing to do is check whether the expected Hadoop processes are running, using the “jps” tool. Note that jps ships with the JDK rather than with Hadoop itself; here it is run from the hadoop directory.

hduser@ubuntu:/hadoop$ jps
2287 TaskTracker
2149 JobTracker
1938 DataNode
2085 SecondaryNameNode
2349 Jps
1788 NameNode
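
If you want to automate this check, here is a minimal sketch. It assumes a Hadoop 1.x single-node setup with the five daemons shown above; the function name missing_daemons is only something I made up for illustration.

```shell
# Report which of the expected Hadoop 1.x daemons are missing from jps-style output.
# Reads "PID Name" lines on stdin and prints each missing daemon name on its own line.
missing_daemons() {
    local listing daemon
    listing="$(cat)"
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # Compare against the whole second field, so that "NameNode"
        # does not accidentally match "SecondaryNameNode".
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage (assumes jps is on the PATH):
#   jps | missing_daemons
```

If nothing is printed, all five daemons showed up in the jps listing.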

Sometimes running this tool raises an error saying that jps is not present and that another JDK package needs to be installed. In that case, use the netstat command to check whether Hadoop is listening on the configured ports.

hduser@ubuntu:~$ sudo netstat -plten | grep java
tcp   0  0 0.0.0.0:50070   0.0.0.0:*  LISTEN  1001  9236  2471/java
tcp   0  0 0.0.0.0:50010   0.0.0.0:*  LISTEN  1001  9998  2628/java
tcp   0  0 0.0.0.0:48159   0.0.0.0:*  LISTEN  1001  8496  2628/java
tcp   0  0 0.0.0.0:53121   0.0.0.0:*  LISTEN  1001  9228  2857/java
tcp   0  0 127.0.0.1:54310 0.0.0.0:*  LISTEN  1001  8143  2471/java
tcp   0  0 127.0.0.1:54311 0.0.0.0:*  LISTEN  1001  9230  2857/java
tcp   0  0 0.0.0.0:59305   0.0.0.0:*  LISTEN  1001  8141  2471/java
tcp   0  0 0.0.0.0:50060   0.0.0.0:*  LISTEN  1001  9857  3005/java
tcp   0  0 0.0.0.0:49900   0.0.0.0:*  LISTEN  1001  9037  2785/java
tcp   0  0 0.0.0.0:50030   0.0.0.0:*  LISTEN  1001  9773  2857/java
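
To make this listing easier to read, the following sketch labels the ports that I recognise. It assumes the Hadoop 1.x default ports have not been changed in your configuration; the function name label_hadoop_ports is hypothetical.

```shell
# Label well-known Hadoop 1.x default ports found in netstat output.
# Reads netstat lines on stdin and prints "port description" for each known port.
label_hadoop_ports() {
    awk '
        BEGIN {
            # Assumed defaults; your conf/*.xml files may override them.
            name["50070"] = "NameNode web UI"
            name["50030"] = "JobTracker web UI"
            name["50060"] = "TaskTracker web UI"
            name["50010"] = "DataNode data transfer"
        }
        {
            # Field 4 is the local address, e.g. 0.0.0.0:50070
            n = split($4, parts, ":")
            port = parts[n]
            if (port in name) print port, name[port]
        }
    '
}

# Usage:
#   sudo netstat -plten | grep java | label_hadoop_ports
```

The other ports in the listing are either the ones set in your own configuration (54310 and 54311 here) or randomly assigned ones.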

Yes, they are listening. The next step is to run the sample program:

hduser@ubuntu:/home/hemanth/hadoop$ bin/hadoop jar hadoop-examples-1.0.4.jar pi 4 100

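Here pi is the name of the example job, 4 is the number of map tasks and 100 is the number of samples per map. You will see the job progress, followed by the estimated value of Pi.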

The other scenario is running a MapReduce program that we have written ourselves.

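1. Save the MapReduce program under a name like WordCount.java, in a folder named wordcount_classes inside the hadoop folder.
2. Next, execute the following commands in order (don't forget to create the input directory in HDFS first; the output directory must not exist yet, because Hadoop creates it and the job fails if it is already there).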

$ javac -classpath /home/hemanth/hadoop/hadoop-core-1.0.4.jar -d /home/hemanth/hadoop/wordcount_classes /home/hemanth/hadoop/wordcount_classes/WordCount.java

root@ubuntu:/home/hemanth/hadoop# jar -cvf wordcount.jar -C wordcount_classes/ .

hduser@ubuntu:/home/hemanth/hadoop$ bin/hadoop jar /home/hemanth/hadoop/wordcount.jar WordCount -r 2 /user/hduser/gutenberg /user/hduser/gutenberg-output2
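
For the HDFS side of this, here is a rough sketch of how the input can be copied in before the job and how the result can be read afterwards. The local folder /tmp/gutenberg is only an assumption (put your own text files wherever you like), and the function names are made up; the HDFS paths are the ones from the command above.

```shell
# Sketch: put the input into HDFS before the job, and read the output afterwards.
# HADOOP defaults to bin/hadoop, so run this from the Hadoop folder.
# LOCAL_INPUT is a hypothetical local folder holding the input text files.
HADOOP="${HADOOP:-bin/hadoop}"
LOCAL_INPUT="${LOCAL_INPUT:-/tmp/gutenberg}"

prepare_input() {
    # Copy the local folder into HDFS as the job's input directory.
    $HADOOP dfs -copyFromLocal "$LOCAL_INPUT" /user/hduser/gutenberg
}

show_output() {
    # List the result files, then print the reducer output (the part-* files).
    # The glob is quoted so that Hadoop, not the local shell, expands it.
    $HADOOP dfs -ls /user/hduser/gutenberg-output2
    $HADOOP dfs -cat '/user/hduser/gutenberg-output2/part-*'
}

# Usage:
#   prepare_input     # before running the job
#   show_output       # after the job has finished
```

Since the job was started with -r 2, expect two part files in the output directory, one per reducer.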

Now check the output. That's all.
Feel free to ask me any questions about this. Thanks 🙂

BIG QUERY: Analytics the Goooooooooogle way

I’ve been wondering how I forgot to write an article on BigQuery. A year ago, when I first heard the name “BigQuery” from Google, I felt these guys were planning to conquer the big data world as well. That makes sense, because Google showed the world the GFS (Google File System) and MapReduce concepts, which gave birth to Hadoop. Hail Google for your innovation!
Coming back to Google BigQuery: it is a full-fledged big data tool hosted in the cloud. Google offers it online, so you can analyze your big data for a per-use fee, similar to other cloud offerings.

Want to see the advantage of BigQuery in practice?
You can try a demo based on two datasets: Wikipedia and data from weather stations.
Try the demo here:

https://demobigquery.appspot.com
