Query HIVE on Hadoop on Azure (HOA) using C#

Hive is an open-source data warehousing solution built on top of Hadoop. It supports queries written in HiveQL, a SQL-like declarative language; these queries are compiled into MapReduce jobs that run on Hadoop. In addition, HiveQL lets users plug custom MapReduce scripts into queries.

Similar to traditional databases, Hive stores data in tables, where each table consists of a number of rows, and each row consists of a specified number of columns. Each column has a type, which is either a primitive type (such as INT or STRING) or a complex type (such as ARRAY, MAP, or STRUCT).

Now let us see how to call Hive from C#.

Prerequisites

  1. Make sure that you have installed the correct version of the Hive ODBC connector from HadoopOnAzure.
  2. Make sure that you have opened the ODBC server port on HOA.

Set up a DSN for Hive:

  1. The first step is to create a new data source for Hive.
  2. Open the ODBC Data Source Administrator window and click the Add… button to create a new data source.
  3. Select HIVE from the list and then click the Finish button.
  4. Enter an appropriate Data Source Name and a Description, enter the host URL in Host, and keep the default port number.
  5. Select Username/Password as the Authentication option.
  6. In Username, enter the user name that you chose when creating the cluster on HOA, and then click the OK button.
  7. Once the entries are saved, a new entry for Hive appears in the ODBC Data Source Administrator window.
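
The Data Source Name entered in the dialog is what C# code refers to through the Dsn key of an ODBC connection string. As a minimal sketch of how such a string could be assembled, the helper below uses OdbcConnectionStringBuilder from System.Data.Odbc, which is meant to take care of quoting values that contain characters such as semicolons. The names HiveConnectionStrings, HiveDSN, hoa_user and the password are placeholders for illustration, not values from this article. System.Data.Odbc is part of the .NET Framework; on newer .NET versions it is assumed to be installed as a separate NuGet package.

```csharp
using System;
using System.Data.Odbc;

static class HiveConnectionStrings
{
    // Builds an ODBC connection string for a DSN created in the
    // ODBC Data Source Administrator. All three arguments are
    // supplied by the caller; nothing here contacts the cluster.
    public static string Build(string dsn, string user, string password)
    {
        var builder = new OdbcConnectionStringBuilder { Dsn = dsn };
        builder["Uid"] = user;      // user name chosen when the cluster was created
        builder["Pwd"] = password;  // matching password
        return builder.ConnectionString;
    }
}

static class ConnectionStringDemo
{
    static void Main()
    {
        // Hypothetical values; replace them with your own DSN and credentials.
        string connectionString = HiveConnectionStrings.Build("HiveDSN", "hoa_user", "secret");

        // Parse the string back to confirm which DSN it points at,
        // without printing the password.
        var parsed = new OdbcConnectionStringBuilder(connectionString);
        Console.WriteLine("Connecting through DSN: " + parsed.Dsn);
    }
}
```

The resulting string can be passed to the OdbcConnection constructor in place of a hand-written one.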

Call Hive Query from C#

Calling a Hive query from C# takes four simple steps:

  1. Set up the connection string by setting the user name, password and DSN.
  2. Create a new OdbcConnection by passing the connection string, and then open the connection.
  3. Create a Hive query and execute it.
  4. Traverse the result set. That's it!

Code Snippet:
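using System;
using System.Data.Odbc;

class Program
{
    private static void QueryHive()
    {
        // Step 1: replace USER_NAME, PASSWORD and DSN_NAME with your own values.
        string connectionString = "Uid=USER_NAME;pwd=PASSWORD;Dsn=DSN_NAME";

        // Step 2: 'using' closes the connection even if a later call throws.
        using (OdbcConnection dbConnection = new OdbcConnection(connectionString))
        {
            dbConnection.Open();

            // Step 3: create the Hive query and execute it.
            using (OdbcCommand com = new OdbcCommand("select Col1,Col2 from TABLE_NAME limit 10", dbConnection))
            using (OdbcDataReader reader = com.ExecuteReader())
            {
                // Step 4: traverse the result set row by row.
                while (reader.Read())
                {
                    Console.WriteLine("Col1 : {0} , Col2 : {1}", reader.GetString(0), reader.GetString(1));
                }
            }
        }
    }

    static void Main()
    {
        QueryHive();
    }
}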
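
The snippet above reads both columns with reader.GetString, which assumes that every value is a non-null string; with most ADO.NET providers a NULL or a non-string column typically raises an exception at that call. One possible way to make the loop more forgiving is sketched below. The helper name HiveRowFormatter is hypothetical, and an in-memory DataTable stands in for a Hive result set so that the example runs without a cluster. Because the helper accepts IDataRecord, an OdbcDataReader could be passed to it in the same way.

```csharp
using System;
using System.Data;
using System.Globalization;

static class HiveRowFormatter
{
    // Returns the column value as text, or "NULL" for a database null.
    // Works with any IDataRecord, which includes OdbcDataReader.
    public static string FormatValue(IDataRecord record, int ordinal)
    {
        if (record.IsDBNull(ordinal))
        {
            return "NULL";
        }
        // GetValue returns the value as object, so no cast to string is needed.
        return Convert.ToString(record.GetValue(ordinal), CultureInfo.InvariantCulture);
    }
}

static class FormatterDemo
{
    static void Main()
    {
        // Stand-in data (an assumption for illustration): two columns, one NULL.
        var table = new DataTable();
        table.Columns.Add("Col1", typeof(string));
        table.Columns.Add("Col2", typeof(int));
        table.Rows.Add("alpha", 42);
        table.Rows.Add("beta", DBNull.Value);

        using (DataTableReader reader = table.CreateDataReader())
        {
            while (reader.Read())
            {
                Console.WriteLine("Col1 : {0} , Col2 : {1}",
                    HiveRowFormatter.FormatValue(reader, 0),
                    HiveRowFormatter.FormatValue(reader, 1));
            }
        }
    }
}
```

In the Hive version, the two reader.GetString calls inside the while loop would be replaced by HiveRowFormatter.FormatValue(reader, 0) and HiveRowFormatter.FormatValue(reader, 1).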

References:
1. http://hive.apache.org/
