├── .gitignore ├── README.md ├── images ├── nexus2.png ├── nexus3.png ├── nexus4.png └── nexus5.png ├── pom.xml ├── run-with-maven.sh ├── run.sh └── src └── main ├── java └── com │ └── cloudera │ └── example │ └── ClouderaImpalaJdbcExample.java └── resources ├── ClouderaImpalaJdbcExample.conf └── log4j.properties /.gitignore: -------------------------------------------------------------------------------- 1 | target/ 2 | .DS_Store 3 | ._.DS_Store 4 | .AppleDouble 5 | .LSOverride 6 | Icon 7 | *.jar 8 | 9 | #Thumbnails 10 | ._* 11 | 12 | .Trashes 13 | 14 | # Eclipse stuff 15 | .classpath 16 | .project 17 | .settings 18 | 19 | # IntelliJ 20 | .idea 21 | 22 | # Vim 23 | *.swp 24 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ###Cloudera Impala JDBC Example 2 | 3 | [Apache Impala (Incubating)](http://www.cloudera.com/products/apache-hadoop/impala.html) is an open source, analytic MPP database for Apache Hadoop. 4 | 5 | This example shows how to build and run a Maven-based project to execute SQL queries on Impala using JDBC 6 | 7 | This example was tested using Impala 2.3 included with [CDH 5.5.2](http://www.cloudera.com/downloads/cdh/5-5-2.html) and the [Impala JDBC Driver v2.5.30](http://www.cloudera.com/downloads/connectors/impala/jdbc/2-5-30.html) 8 | 9 | When you download the Impala JDBC Driver from the link above, it is packaged as a zip file with separate distributions for JDBC3, JDBC4 and JDBC4.1. This example uses the distribution for JDBC4.1 on RHEL6 x86_64. 
The downloaded zip file contains the following eleven jar files: 10 | 11 | (1) ImpalaJDBC41.jar 12 | (2) TCLIServiceClient.jar 13 | (3) hive_metastore.jar 14 | (4) hive_service.jar 15 | (5) ql.jar 16 | (6) libfb303-0.9.0.jar 17 | (7) libthrift-0.9.0.jar 18 | (8) log4j-1.2.14.jar 19 | (9) slf4j-api-1.5.11.jar 20 | (10) slf4j-log4j12-1.5.11.jar 21 | (11) zookeeper-3.4.6.jar 22 | 23 | The JDBC driver's installation instructions say only that "...you must set the class path to include all the JAR files from the ZIP archive containing the driver that you are using..." 24 | 25 | While this works fine for one-off projects, it's a little loose for shops that would rather manage their dependencies using Maven or other build systems. 26 | 27 | Part of the challenge in building a project using those jars with Maven is that some of the jars are not available in public repos and some of them do not have obvious version numbers. My approach in this example will be to use a local Maven repo to manage the first five jars in the list above and to rely on publicly available Maven repos for jars 6 - 11 (as they have version numbers in their names). 28 | I will use the community version of the [Nexus Repository Manager OSS](http://www.sonatype.org/nexus/go/) as a local Maven repo. 29 | 30 | I downloaded Nexus Repository Manager OSS v2.12 from the link [here](http://www.sonatype.org/nexus/go/) and followed the installation instructions [here](http://books.sonatype.com/nexus-book/reference/installing.html). 31 | 32 | Here is the view of my local Nexus repo available after launching it for the first time. Note that there is already a repo named "3rd party", which I will use to manage the first five JDBC driver jars: 33 | 34 | ![nexus2](images/nexus2.png) 35 | 36 | To add jars to the repo, log in to the local Nexus repo, go to the 3rd party repo's "upload artifacts" tab and select the desired jar to upload. 
I specified a group of "com.cloudera.impala.jdbc" and a version number of "2.5.30" for each of the five jars I uploaded, like this: 37 | 38 | ![nexus3](images/nexus3.png) 39 | 40 | Click on the 3rd party repo's URL link and you can browse the uploaded artifacts: 41 | 42 | ![nexus4](images/nexus4.png) 43 | 44 | Drill into any of the links and you can see the version number has been appended to each jar: 45 | 46 | ![nexus5](images/nexus5.png) 47 | 48 | Now that we have a local repo available hosting the JDBC jars, all we need to do is add that repo to our pom with an entry like this: 49 | 50 | <repository> 51 | <id>YOUR.LOCAL.REPO.ID</id> 52 | <url></url> 53 | <name>YOUR.LOCAL.REPO.NAME</name> 54 | <snapshots> 55 | <enabled>false</enabled> 56 | </snapshots> 57 | </repository> 58 | 59 | For example, in my case my local repo entry looks like this: 60 | 61 | <repository> 62 | <id>nexus.local</id> 63 | <url>http://10.10.10.7:8081/nexus/content/repositories/thirdparty</url> 64 | <name>Nexus Local</name> 65 | <snapshots> 66 | <enabled>false</enabled> 67 | </snapshots> 68 | </repository> 69 | 70 | And you can refer to the JDBC artifacts with entries like this: 71 | 72 | <dependency> 73 | <groupId>com.cloudera.impala.jdbc</groupId> 74 | <artifactId>ImpalaJDBC41</artifactId> 75 | <version>2.5.30</version> 76 | </dependency> 77 | 78 | Jars 6 - 11 will be retrieved from the Cloudera and Maven Central repos and will have traditional dependency elements like this: 79 | 80 | <dependency> 81 | <groupId>org.apache.thrift</groupId> 82 | <artifactId>libfb303</artifactId> 83 | <version>0.9.0</version> 84 | </dependency> 85 | 86 | See the pom.xml for details. 87 | 88 | 89 | 90 | #### Dependencies 91 | To build the project you must have Maven 2.x or higher installed. Maven info is [here](http://maven.apache.org). 92 | 93 | To run the project you must have access to a Hadoop cluster running Impala with at least one populated table defined in the Hive Metastore. 94 | 95 | 96 | #### Configuring the example 97 | 98 | Make sure to set your local repo in pom.xml as described above. 99 | 100 | Edit the file src/main/resources/ClouderaImpalaJdbcExample.conf and set an Impala daemon's host and port in the connection.url (Impala's default JDBC port is 21050) and set the appropriate JDBC driver class. 
I am using JDBC4.1 so my conf file looks like this: 101 | 102 | # ClouderaImpalaJdbcExample.conf 103 | connection.url = jdbc:impala://chicago.onefoursix.com:21050 104 | jdbc.driver.class.name = com.cloudera.impala.jdbc41.Driver 105 | 106 | See the JDBC driver's docs for more details. 107 | 108 | 109 | #### Building the example 110 | 111 | Build the project like this: 112 | 113 | $ mvn clean package 114 | 115 | If this is the first time you are building the project you should see messages like this showing that Maven is retrieving the JDBC jars from your local repo: 116 | 117 | Downloading: http://10.10.10.7:8081/nexus/content/repositories/thirdparty/com/cloudera/impala/jdbc/hive_metastore/2.5.30/hive_metastore-2.5.30.jar 118 | Downloading: http://10.10.10.7:8081/nexus/content/repositories/thirdparty/com/cloudera/impala/jdbc/hive_service/2.5.30/hive_service-2.5.30.jar 119 | Downloading: http://10.10.10.7:8081/nexus/content/repositories/thirdparty/com/cloudera/impala/jdbc/ImpalaJDBC41/2.5.30/ImpalaJDBC41-2.5.30.jar 120 | Downloading: http://10.10.10.7:8081/nexus/content/repositories/thirdparty/com/cloudera/impala/jdbc/ql/2.5.30/ql-2.5.30.jar 121 | Downloading: http://10.10.10.7:8081/nexus/content/repositories/thirdparty/com/cloudera/impala/jdbc/TCLIServiceClient/2.5.30/TCLIServiceClient-2.5.30.jar 122 | 123 | Whereas the other jars (and their dependencies) are downloaded from the public repos: 124 | 125 | Downloading: https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/thrift/libfb303/0.9.0/libfb303-0.9.0.jar 126 | Downloading: https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/thrift/libthrift/0.9.0/libthrift-0.9.0.jar 127 | ... 
128 | 129 | If your build is successful you should see messages like this: 130 | 131 | [INFO] Building jar: /home/mark/a/Cloudera-Impala-JDBC-Example-impala-cdh-5.5.2/cloudera-impala-jdbc-example-1.0.jar 132 | [INFO] 133 | [INFO] --- maven-shade-plugin:2.2:shade (default) @ cloudera-impala-jdbc-example --- 134 | [INFO] Including com.cloudera.impala.jdbc:hive_metastore:jar:2.5.30 in the shaded jar. 135 | [INFO] Including com.cloudera.impala.jdbc:hive_service:jar:2.5.30 in the shaded jar. 136 | ... 137 | [INFO] ------------------------------------------------------------------------ 138 | [INFO] BUILD SUCCESS 139 | [INFO] ------------------------------------------------------------------------ 140 | [INFO] Total time: 3.108 s 141 | [INFO] Finished at: 2016-02-21T11:24:56-08:00 142 | [INFO] Final Memory: 32M/476M 143 | [INFO] ------------------------------------------------------------------------ 144 | 145 | Note that pom.xml is configured to have Maven build an "uber jar" with all dependencies packaged in a single jar and with the main class set. 146 | 147 | The uber jar will be located at target/cloudera-impala-jdbc-example-uber.jar. 148 | 149 | 150 | #### Running the example using the uber jar 151 | 152 | One can run the example from the uber jar with a "java -jar" command, passing a SQL statement as an argument, like this: 153 | 154 | $ java -jar target/cloudera-impala-jdbc-example-uber.jar "SELECT description FROM sample_07 limit 10" 155 | 156 | ============================================= 157 | Cloudera Impala JDBC Example 158 | Using Connection URL: jdbc:impala://chicago.onefoursix.com:21050 159 | Running Query: SELECT description FROM sample_07 limit 10 160 | 161 | == Begin Query Results ====================== 162 | All Occupations 163 | Management occupations 164 | Chief executives 165 | General and operations managers 166 | Legislators 167 | Advertising and promotions managers 168 | Marketing managers 169 | Sales managers 170 | Public relations managers 171 | 
Administrative services managers 172 | == End Query Results ======================= 173 | 174 | There is a "run.sh" script provided with that command 175 | 176 | #### Running the example using Maven 177 | 178 | One can also run the example using Maven using the run-with-maven.sh script which by default passes a SQL statement as an argument: 179 | 180 | mvn exec:java -Dexec.mainClass=com.cloudera.example.ClouderaImpalaJdbcExample -Dexec.arguments="SELECT description FROM sample_07 limit 10" 181 | 182 | Your output should look like this: 183 | 184 | $ ./run-with-maven.sh 185 | [INFO] Scanning for projects... 186 | ... 187 | [INFO] ------------------------------------------------------------------------ 188 | [INFO] Building cloudera-impala-jdbc-example 1.0 189 | [INFO] ------------------------------------------------------------------------ 190 | [INFO] 191 | [INFO] >>> exec-maven-plugin:1.2.1:java (default-cli) > validate @ cloudera-impala-jdbc-example >>> 192 | [INFO] 193 | [INFO] <<< exec-maven-plugin:1.2.1:java (default-cli) < validate @ cloudera-impala-jdbc-example <<< 194 | [INFO] 195 | [INFO] --- exec-maven-plugin:1.2.1:java (default-cli) @ cloudera-impala-jdbc-example --- 196 | 197 | Cloudera Impala JDBC Example 198 | Using Connection URL: jdbc:impala://chicago.onefoursix.com:21050 199 | Running Query: SELECT description FROM sample_07 limit 10 200 | 201 | == Begin Query Results ====================== 202 | All Occupations 203 | Management occupations 204 | Chief executives 205 | General and operations managers 206 | Legislators 207 | Advertising and promotions managers 208 | Marketing managers 209 | Sales managers 210 | Public relations managers 211 | Administrative services managers 212 | == End Query Results ======================= 213 | 214 | [INFO] ------------------------------------------------------------------------ 215 | [INFO] BUILD SUCCESS 216 | [INFO] ------------------------------------------------------------------------ 217 | 
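Both run methods above depend on the two settings described under "Configuring the example". If either connection.url or jdbc.driver.class.name is missing from the conf file, the program fails before any query is sent. Here is a minimal sketch of an up-front check, assuming a hypothetical helper class named `ConfCheck` that is not part of this project. It reads the same two property names and fails fast with a readable message:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of this project.
public class ConfCheck {

    // Same property names as used in ClouderaImpalaJdbcExample.conf
    static final String URL_KEY = "connection.url";
    static final String DRIVER_KEY = "jdbc.driver.class.name";

    // Loads the settings and throws if a required one is missing or blank.
    static Properties loadAndValidate(Reader source) throws IOException {
        Properties prop = new Properties();
        prop.load(source);
        for (String key : new String[] {URL_KEY, DRIVER_KEY}) {
            String value = prop.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException("Missing required setting: " + key);
            }
        }
        return prop;
    }

    public static void main(String[] args) throws IOException {
        // Inline copy of the conf file contents, so the sketch runs without a cluster
        String conf = "connection.url = jdbc:impala://IMPALAD_HOST:21050\n"
                    + "jdbc.driver.class.name = com.cloudera.impala.jdbc41.Driver\n";
        Properties prop = loadAndValidate(new StringReader(conf));
        System.out.println("URL:    " + prop.getProperty(URL_KEY));
        System.out.println("Driver: " + prop.getProperty(DRIVER_KEY));
    }
}
```

In the real program the `Reader` would wrap the classpath resource instead of an inline string. The check only confirms that the settings are present, not that the host is reachable or that the driver class is on the class path.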
-------------------------------------------------------------------------------- /images/nexus2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/onefoursix/Cloudera-Impala-JDBC-Example/6399d6de25b75ff732f655158b8b2f4b375a5b3c/images/nexus2.png -------------------------------------------------------------------------------- /images/nexus3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/onefoursix/Cloudera-Impala-JDBC-Example/6399d6de25b75ff732f655158b8b2f4b375a5b3c/images/nexus3.png -------------------------------------------------------------------------------- /images/nexus4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/onefoursix/Cloudera-Impala-JDBC-Example/6399d6de25b75ff732f655158b8b2f4b375a5b3c/images/nexus4.png -------------------------------------------------------------------------------- /images/nexus5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/onefoursix/Cloudera-Impala-JDBC-Example/6399d6de25b75ff732f655158b8b2f4b375a5b3c/images/nexus5.png -------------------------------------------------------------------------------- /pom.xml: -------------------------------------------------------------------------------- 1 | 4 | 5 | 4.0.0 6 | com.cloudera.example 7 | cloudera-impala-jdbc-example 8 | 1.0 9 | jar 10 | Cloudera Impala JDBC Example for CDH 5.5.2 11 | 12 | 13 | UTF-8 14 | 2.5.30 15 | cloudera-impala-jdbc-example-uber.jar 16 | com.cloudera.example.ClouderaImpalaJdbcExample 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | com.cloudera.impala.jdbc 25 | hive_metastore 26 | ${impala.jdbc.version} 27 | 28 | 29 | com.cloudera.impala.jdbc 30 | hive_service 31 | ${impala.jdbc.version} 32 | 33 | 34 | com.cloudera.impala.jdbc 35 | ImpalaJDBC41 36 | ${impala.jdbc.version} 37 | 38 | 39 | 
com.cloudera.impala.jdbc 40 | ql 41 | ${impala.jdbc.version} 42 | 43 | 44 | com.cloudera.impala.jdbc 45 | TCLIServiceClient 46 | ${impala.jdbc.version} 47 | 48 | 49 | 50 | 51 | 52 | org.apache.thrift 53 | libfb303 54 | 0.9.0 55 | 56 | 57 | org.apache.thrift 58 | libthrift 59 | 0.9.0 60 | 61 | 62 | log4j 63 | log4j 64 | 1.2.14 65 | 66 | 67 | org.slf4j 68 | slf4j-api 69 | 1.5.11 70 | 71 | 72 | org.slf4j 73 | slf4j-log4j12 74 | 1.5.11 75 | 76 | 77 | org.apache.zookeeper 78 | zookeeper 79 | 3.4.6 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | org.codehaus.mojo 90 | exec-maven-plugin 91 | 1.2.1 92 | 93 | 94 | 95 | 96 | 97 | 98 | org.apache.maven.plugins 99 | maven-compiler-plugin 100 | 2.3.2 101 | 102 | 1.6 103 | 1.6 104 | 105 | 106 | 107 | org.apache.maven.plugins 108 | maven-jar-plugin 109 | 2.4 110 | 111 | ${basedir} 112 | 113 | 114 | 115 | maven-clean-plugin 116 | 2.6.1 117 | 118 | 119 | 120 | . 121 | 122 | *.jar 123 | 124 | false 125 | 126 | 127 | 128 | 129 | 130 | 131 | org.apache.maven.plugins 132 | maven-shade-plugin 133 | 2.2 134 | 135 | false 136 | target/${uber.jar.name} 137 | 138 | 139 | *:* 140 | 141 | 142 | 143 | 144 | *:* 145 | 146 | META-INF/*.SF 147 | META-INF/*.DSA 148 | META-INF/*.RSA 149 | 150 | 151 | 152 | 153 | 154 | 155 | package 156 | 157 | shade 158 | 159 | 160 | 161 | 162 | 163 | reference.conf 164 | 165 | 166 | ${uber.jar.main.class} 167 | 168 | 169 | 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | 178 | 179 | 180 | 181 | YOUR.LOCAL.REPO.ID 182 | 183 | YOUR.LOCAL.REPO.NAME 184 | 185 | false 186 | 187 | 188 | 189 | 190 | cdh.repo 191 | https://repository.cloudera.com/artifactory/cloudera-repos 192 | Cloudera Repositories 193 | 194 | false 195 | 196 | 197 | 198 | 199 | central 200 | http://repo1.maven.org/maven2/ 201 | 202 | true 203 | 204 | 205 | false 206 | 207 | 208 | 209 | 210 | 211 | 212 | -------------------------------------------------------------------------------- /run-with-maven.sh: 
-------------------------------------------------------------------------------- 1 | mvn exec:java -Dexec.mainClass=com.cloudera.example.ClouderaImpalaJdbcExample -Dexec.arguments="SELECT description FROM sample_07 limit 10" 2 | -------------------------------------------------------------------------------- /run.sh: -------------------------------------------------------------------------------- 1 | java -jar target/cloudera-impala-jdbc-example-uber.jar "SELECT description FROM sample_07 limit 10" -------------------------------------------------------------------------------- /src/main/java/com/cloudera/example/ClouderaImpalaJdbcExample.java: -------------------------------------------------------------------------------- 1 | package com.cloudera.example; 2 | 3 | import java.io.IOException; 4 | import java.io.InputStream; 5 | import java.sql.Connection; 6 | import java.sql.DriverManager; 7 | import java.sql.ResultSet; 8 | import java.sql.SQLException; 9 | import java.sql.Statement; 10 | import java.util.Properties; 11 | 12 | public class ClouderaImpalaJdbcExample { 13 | 14 | private static final String CONNECTION_URL_PROPERTY = "connection.url"; 15 | private static final String JDBC_DRIVER_NAME_PROPERTY = "jdbc.driver.class.name"; 16 | 17 | private static String connectionUrl; 18 | private static String jdbcDriverName; 19 | 20 | private static void loadConfiguration() throws IOException { 21 | InputStream input = null; 22 | try { 23 | String filename = ClouderaImpalaJdbcExample.class.getSimpleName() + ".conf"; 24 | input = ClouderaImpalaJdbcExample.class.getClassLoader().getResourceAsStream(filename); 25 | Properties prop = new Properties(); 26 | prop.load(input); 27 | 28 | connectionUrl = prop.getProperty(CONNECTION_URL_PROPERTY); 29 | jdbcDriverName = prop.getProperty(JDBC_DRIVER_NAME_PROPERTY); 30 | } finally { 31 | try { 32 | if (input != null) 33 | input.close(); 34 | } catch (IOException e) { 35 | // nothing to do 36 | } 37 | } 38 | } 39 | 40 | public 
static void main(String[] args) throws IOException { 41 | 42 | if (args.length != 1) { 43 | System.out.println("Syntax: ClouderaImpalaJdbcExample \"\""); 44 | System.exit(1); 45 | } 46 | String sqlStatement = args[0]; 47 | 48 | loadConfiguration(); 49 | 50 | System.out.println("\n============================================="); 51 | System.out.println("Cloudera Impala JDBC Example"); 52 | System.out.println("Using Connection URL: " + connectionUrl); 53 | System.out.println("Running Query: " + sqlStatement); 54 | 55 | Connection con = null; 56 | 57 | try { 58 | 59 | Class.forName(jdbcDriverName); 60 | 61 | con = DriverManager.getConnection(connectionUrl); 62 | 63 | Statement stmt = con.createStatement(); 64 | 65 | ResultSet rs = stmt.executeQuery(sqlStatement); 66 | 67 | System.out.println("\n== Begin Query Results ======================"); 68 | 69 | // print the results to the console 70 | while (rs.next()) { 71 | // the example query returns one String column 72 | System.out.println(rs.getString(1)); 73 | } 74 | 75 | System.out.println("== End Query Results =======================\n\n"); 76 | 77 | } catch (SQLException e) { 78 | e.printStackTrace(); 79 | } catch (Exception e) { 80 | e.printStackTrace(); 81 | } finally { 82 | try { 83 | if (con != null) con.close(); // con stays null if the driver failed to load or connect 84 | } catch (Exception e) { 85 | // swallow 86 | } 87 | } 88 | } 89 | } 90 | -------------------------------------------------------------------------------- /src/main/resources/ClouderaImpalaJdbcExample.conf: -------------------------------------------------------------------------------- 1 | connection.url = jdbc:impala://IMPALAD_HOST:21050 2 | jdbc.driver.class.name = com.cloudera.impala.jdbc41.Driver 3 | -------------------------------------------------------------------------------- /src/main/resources/log4j.properties: -------------------------------------------------------------------------------- 1 | # Root logger option 2 | log4j.rootLogger=INFO, stdout 3 | 4 | # Direct log messages to stdout 5 | 
log4j.appender.stdout=org.apache.log4j.ConsoleAppender 6 | log4j.appender.stdout.Target=System.out 7 | log4j.appender.stdout.layout=org.apache.log4j.PatternLayout 8 | log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n --------------------------------------------------------------------------------