Apache Hive is a data warehouse system built on top of Hadoop that provides data summarization, querying, and analysis.
Hive provides a mechanism to query the data using a SQL-like language called HiveQL.
Now let's play with this data file using Hive.
- Click the Beeswax tool, which gives you an interactive interface to Hive.
- Because we have already registered our tables in HCatalog, Hive has access to them. To check this, click the Tables tab, where our table is listed.
- Hive inherits the schema and location information from HCatalog.
- Now execute a query to find the city with the maximum temperature.
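
The query in the last step could look something like the sketch below. The table name `temperature_data` and the columns `city` and `temperature` are assumptions made for illustration; replace them with the names shown for your table in the Tables tab. The sketch also assumes the temperature column has a numeric type.

```sql
-- Hypothetical table and column names; use the ones registered in HCatalog.
-- Sort the rows from hottest to coldest and keep only the first one.
SELECT city, temperature
FROM temperature_data
ORDER BY temperature DESC
LIMIT 1
```

If several cities share the highest temperature, this form returns only one of them. If the column was registered as a string, it would likely need to be converted first, for example with `CAST(temperature AS DOUBLE)`, so that the sort is numeric rather than alphabetical.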
Congratulations! You have just completed your very first Hadoop example: you loaded the data into HDFS, registered it with HCatalog, and finally ran a Hive query to get results from the data.