Categories
Configuration Hadoop Hive Scalability

Give Your MySQL Account Access to Hive

One area of the Apache Hive documentation that’s not entirely explicit is which database privileges its metastore needs[1]. Developers often become accustomed to creating a database account that has all privileges granted. But in the Real World, end users of Hive must configure it to point to a metastore RDBMS account […]

Categories
Hadoop Hive

Using Hive with Existing Files on S3

One feature that Hive gets for free by virtue of being layered atop Hadoop is the S3 file system implementation. The upshot is that all the raw, textual data you have stored in S3 is just a few hoops away from being queried using Hive’s SQL-esque language. Imagine you have an S3 bucket unoriginally named […]

Categories
Configuration Hive Java Scalability

How To Try Out Hive on Your Local Machine — And Not Upset Your Ops Team

According to the Hive web site: Hive is a data warehouse infrastructure built on top of Hadoop that provides tools to enable easy data summarization, ad hoc querying and analysis of large datasets stored in Hadoop files. Hive is built on top of various technologies, the most notable being Hadoop and HDFS. As a result, […]