Hadoop Hello World - Using HDFS, HCatalog and Pig

In the previous article, How to setup Hortonworks Sandbox, we explained how to set up the sandbox.

Now let's write and run our first Hello World program. In this example we will use three Hadoop components: HDFS, HCatalog and Pig.

Prerequisite

You need to set up the Hortonworks Sandbox first. This should not take more than 5 minutes.

What are we going to do?

We have a CSV file with two columns, "cities" and "temperature". We will use this data to find the cities with the maximum and minimum temperatures.
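
To make the goal concrete, here is a minimal plain-Python sketch of the same computation on a tiny in-memory CSV. The sample rows, the column names `city` and `temperature`, and the function name `temperature_extremes` are made up for illustration and are not the contents of the downloadable file. On Hadoop, the equivalent result comes from the Pig script in Step 3.

```python
import csv
import io

# Hypothetical sample data; the real file's rows and header may differ.
SAMPLE_CSV = """city,temperature
CityA,41
CityB,-3
CityC,38
"""

def temperature_extremes(csv_text):
    """Return ((hottest_city, temp), (coldest_city, temp)) from CSV text."""
    rows = [
        (row["city"], float(row["temperature"]))
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    hottest = max(rows, key=lambda r: r[1])  # row with the highest temperature
    coldest = min(rows, key=lambda r: r[1])  # row with the lowest temperature
    return hottest, coldest

hottest, coldest = temperature_extremes(SAMPLE_CSV)
print("Maximum:", hottest)
print("Minimum:", coldest)
```

This works for a file that fits in memory. The point of the Hadoop version is that the same question can be asked of data far too large for a single machine.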

We are going to do it in 3 simple steps: load the data into HDFS, create a table for it with HCatalog, and process it with Pig.

Step 1: Load Data into HDFS

Since HDFS is the file system where Hadoop stores data, let us first load the file into HDFS.

Download the example data from here.

Open the Hortonworks Sandbox and click on File Browser.

Click on Upload Files and select the CSV file you downloaded in the previous step.

You will notice that the file has been uploaded into HDFS.

Step 2: Creating tables for the Data stored in HDFS using HCatalog

After loading the data into HDFS, we need to create a table with HCatalog to make the data available to processing languages such as Pig and Hive.

Click on the HCatalog icon.

Create a new table from a file. Give it a meaningful name (for example "City_Temperature_List").

Click on "Choose a File" and select the City_With_Temperature.csv file which you just uploaded to HDFS.

You will notice that the CSV file content now appears in tabular form. Here we can change the metadata, for example column names and column types. Let us skip this part and click on the "Create Table" button.

You will notice that the table has been created successfully and appears in HCatalog.

Step 3: Process the Data using Pig

Pig is a tool used to analyze large amounts of data. With the Pig Latin scripting language, data analysis and processing can be done easily.

Click on the Pig icon.

We will use the HCatLoader() function from the Pig helper. HCatLoader() is used to read data from an HCatalog table. Choose PIG helper -> HCatalog -> LOAD and select City_Temperature_List.

You will see the load statement in the Pig script.

In the Query Editor, write the query below and execute it.

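  -- Load the table created in Step 2 through HCatalog.
  -- (Newer Hive/HCatalog releases use org.apache.hive.hcatalog.pig.HCatLoader.)
  A = LOAD 'default.city_temperature_list'
   USING org.apache.hcatalog.pig.HCatLoader();
  -- Sort by temperature, highest first, and keep only the top row.
  B = ORDER A BY temperature DESC;
  C = LIMIT B 1;
  -- Project the columns (names must match the HCatalog table) and print them.
  D = FOREACH C GENERATE city, temperature;
  DUMP D;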

You need to give the Pig script a name (for example "PigScript01") before executing it.

Once the script finishes, the output shows the city with the maximum temperature along with its temperature.

Similarly, you can find the minimum temperature by sorting in ascending order (ASC) instead of DESC. To learn more about the Pig language, check the Pig Function Library.
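
As a rough illustration of what the ORDER, LIMIT and FOREACH steps do, and of how switching the sort direction yields the minimum, here is a small Python sketch that mirrors the script's relations A to D on an in-memory list. The sample rows and the helper name `top_row` are hypothetical, and this is not how Pig executes the query (Pig compiles the script into distributed jobs).

```python
# Hypothetical rows standing in for relation A (the loaded table).
A = [
    {"city": "CityA", "temperature": 41},
    {"city": "CityB", "temperature": -3},
    {"city": "CityC", "temperature": 38},
]

def top_row(rows, descending=True):
    """Mirror ORDER ... BY temperature, LIMIT 1 and FOREACH ... GENERATE."""
    # ORDER: sort by temperature (DESC when descending=True, ASC otherwise).
    B = sorted(rows, key=lambda r: r["temperature"], reverse=descending)
    # LIMIT 1: keep only the first row of the sorted relation.
    C = B[:1]
    # FOREACH ... GENERATE: project the two columns to display.
    D = [(r["city"], r["temperature"]) for r in C]
    return D

print(top_row(A, descending=True))   # row with the maximum temperature
print(top_row(A, descending=False))  # row with the minimum temperature
```

In the Pig script itself, the only change needed for the minimum is the sort direction in the ORDER statement.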

© 2017 saphanatutorial.com. All rights reserved.