├── Oracle_to_Cassandra_OSchema.png
├── README.md
└── Red_Hat_Oracle_VM_Notes.md

/Oracle_to_Cassandra_OSchema.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/simonambridge/Oracle_to_Cassandra/bfc8d7ad9d87d2272f4c84485d9dd504988d3e9e/Oracle_to_Cassandra_OSchema.png
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Migrating Relational Data From Oracle To DSE/Cassandra - using Spark DataFrames, SparkSQL and the spark-cassandra connector

The objective of this exercise is to demonstrate how to migrate data from Oracle to DataStax Cassandra. I'll be using the DataFrame capability introduced in Apache Spark 1.3 to load data from tables in an Oracle database (12c) via Oracle's JDBC thin driver, to generate a result set, joining tables where necessary.
The data will then be saved to DSE/Cassandra.

This demo is not intended as an exhaustive or comprehensive tutorial, but it should provide enough material to gain an understanding of the processes involved in migrating data from Oracle to Cassandra.
Create a yum repo definition for DataStax Enterprise:
```
# vi /etc/yum.repos.d/datastax.repo
```
Add the repository details (using your own DataStax credentials in the baseurl):
```
[datastax]
name = DataStax Repo for DataStax Enterprise
baseurl=https://datastaxrepo_gmail.com:utJVKEg4lKeaWTX@rpm.datastax.com/enterprise
enabled=1
gpgcheck=0
```
Import the DataStax repository key:
```
rpm --import http://rpm.datastax.com/rpm/repo_key
```
Install DSE (here, version 5.0.1):
```
# yum install dse-full-5.0.1-1
```
Install OpsCenter:
```
# yum install opscenter
```
Install the DataStax agent:
```
# yum install datastax-agent
```
Stop DSE before changing its configuration:
```
# service dse stop
Stopping DSE daemon : dse                                  [  OK  ]
```
Enable Search and Analytics modes in /etc/default/dse:
```
# vi /etc/default/dse
```
e.g.:
```
# Start the node in DSE Search mode
SOLR_ENABLED=1
# Start the node in Spark mode
SPARK_ENABLED=1
```
Clear out any old data directories:
```
# rm -rf /var/lib/cassandra/data/*
# rm -rf /var/lib/cassandra/saved_caches/*
# rm -rf /var/lib/cassandra/commitlog/*
# rm -rf /var/lib/cassandra/hints/*
```
Remove the old system.log if it exists:
```
# rm -rf /var/log/cassandra/system.log
```
Now restart DSE:
```
$ sudo service dse restart
```
Check that the node is up:
```
$ nodetool status
Datacenter: SearchAnalytics
===========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Owns    Host ID                               Token                Rack
UN  127.0.0.1  346.89 KB  ?       8e6fa3db-9018-47f0-96df-8c78067fddaa  6840808785095143619  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
```
You should also check that you can log into cqlsh:
```
$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.0.7.1159 | DSE 5.0.1 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cqlsh>
```
Type exit in cqlsh to return to the shell prompt.
Confirm that the Spark master is running:
```
$ dse client-tool spark master-address
spark://127.0.0.1:7077
```
OK, let's go look at our source data in the Oracle database.

> If you have your own data you can substitute your schema for the HR schema shown.
Log in as SYSDBA:
```
$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Sep 14 02:57:42 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.
...
SQL>
```
In a 12c database, allow the sample schema script to run without the multitenant user-naming restrictions:
```
SQL> alter session set "_ORACLE_SCRIPT"=true;
```
Run the HR schema installation script:
```
SQL> @?/demo/schema/human_resources/hr_main.sql
```
Respond with these parameters:
- hr
- users
- temp
- [your sys password]
- $ORACLE_HOME/demo/schema/log/

You should see:
```
PL/SQL procedure successfully completed.
```
Connect as the HR user and list its tables:
```
$ sqlplus hr/hr
SQL> set lines 180

SQL> select table_name from user_tables;

TABLE_NAME
------------------------
REGIONS
COUNTRIES
LOCATIONS
DEPARTMENTS
JOBS
EMPLOYEES
JOB_HISTORY

7 rows selected.
```
There are seven tables in the HR schema. I'm going to look at five of them.
```
SQL> desc employees

Name                    Null?    Type
----------------------- -------- ----------------
EMPLOYEE_ID             NOT NULL NUMBER(6)
FIRST_NAME                       VARCHAR2(20)
LAST_NAME               NOT NULL VARCHAR2(25)
EMAIL                   NOT NULL VARCHAR2(25)
PHONE_NUMBER                     VARCHAR2(20)
HIRE_DATE               NOT NULL DATE
JOB_ID                  NOT NULL VARCHAR2(10)
SALARY                           NUMBER(8,2)
COMMISSION_PCT                   NUMBER(2,2)
MANAGER_ID                       NUMBER(6)
DEPARTMENT_ID                    NUMBER(4)
```
```
SQL> desc jobs

Name                    Null?    Type
----------------------- -------- ----------------
JOB_ID                  NOT NULL VARCHAR2(10)
JOB_TITLE               NOT NULL VARCHAR2(35)
MIN_SALARY                       NUMBER(6)
MAX_SALARY                       NUMBER(6)
```
```
SQL> desc departments

Name                    Null?    Type
----------------------- -------- ----------------
DEPARTMENT_ID           NOT NULL NUMBER(4)
DEPARTMENT_NAME         NOT NULL VARCHAR2(30)
MANAGER_ID                       NUMBER(6)
LOCATION_ID                      NUMBER(4)
```
```
SQL> desc locations

Name                    Null?    Type
----------------------- -------- ----------------
LOCATION_ID             NOT NULL NUMBER(4)
STREET_ADDRESS                   VARCHAR2(40)
POSTAL_CODE                      VARCHAR2(12)
CITY                    NOT NULL VARCHAR2(30)
STATE_PROVINCE                   VARCHAR2(25)
COUNTRY_ID                       CHAR(2)
```
```
SQL> desc countries

Name                    Null?    Type
----------------------- -------- ----------------
COUNTRY_ID              NOT NULL CHAR(2)
COUNTRY_NAME                     VARCHAR2(40)
REGION_ID                        NUMBER
```
Let's look at the employee data itself:
```
SQL> select * from employees;

EMPLOYEE_ID FIRST_NAME  LAST_NAME    EMAIL     PHONE_NUMBER  HIRE_DATE JOB_ID     SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
----------- ----------- ------------ --------- ------------- --------- --------- ------- -------------- ---------- -------------
        120 Matthew     Weiss        MWEISS    650.123.1234  18-JUL-04 ST_MAN       8000                       100            50
        121 Adam        Fripp        AFRIPP    650.123.2234  10-APR-05 ST_MAN       8200                       100            50
        122 Payam       Kaufling     PKAUFLIN  650.123.3234  01-MAY-03 ST_MAN       7900                       100            50
        123 Shanta      Vollman      SVOLLMAN  650.123.4234  10-OCT-05 ST_MAN       6500                       100            50
        124 Kevin       Mourgos      KMOURGOS  650.123.5234  16-NOV-07 ST_MAN       5800                       100            50
        125 Julia       Nayer        JNAYER    650.124.1214  16-JUL-05 ST_CLERK     3200                       120            50
        126 Irene       Mikkilineni  IMIKKILI  650.124.1224  28-SEP-06 ST_CLERK     2700                       120            50
```
Let's walk through the schema to get familiar with the data that we're going to migrate.
For a moment let's just focus on employees reporting to a manager with ID=121:
```
SQL> select * from employees where manager_id=121;

EMPLOYEE_ID FIRST_NAME  LAST_NAME    EMAIL     PHONE_NUMBER  HIRE_DATE JOB_ID     SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
----------- ----------- ------------ --------- ------------- --------- --------- ------- -------------- ---------- -------------
        129 Laura       Bissot       LBISSOT   650.124.5234  20-AUG-05 ST_CLERK     3300                       121            50
        130 Mozhe       Atkinson     MATKINSO  650.124.6234  30-OCT-05 ST_CLERK     2800                       121            50
        131 James       Marlow       JAMRLOW   650.124.7234  16-FEB-05 ST_CLERK     2500                       121            50
        132 TJ          Olson        TJOLSON   650.124.8234  10-APR-07 ST_CLERK     2100                       121            50
        184 Nandita     Sarchand     NSARCHAN  650.509.1876  27-JAN-04 SH_CLERK     4200                       121            50
        185 Alexis      Bull         ABULL     650.509.2876  20-FEB-05 SH_CLERK     4100                       121            50
        186 Julia       Dellinger    JDELLING  650.509.3876  24-JUN-06 SH_CLERK     3400                       121            50
        187 Anthony     Cabrio       ACABRIO   650.509.4876  07-FEB-07 SH_CLERK     3000                       121            50
```
A little relational reminder. Who is that manager with ID=121?
```
SQL> select * from employees where employee_id=121;

EMPLOYEE_ID FIRST_NAME  LAST_NAME    EMAIL     PHONE_NUMBER  HIRE_DATE JOB_ID     SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
----------- ----------- ------------ --------- ------------- --------- --------- ------- -------------- ---------- -------------
        121 Adam        Fripp        AFRIPP    650.123.2234  10-APR-05 ST_MAN       8200                       100            50
```
...and what is HIS job?
```
SQL> select * from jobs where job_id='ST_MAN';

JOB_ID     JOB_TITLE                           MIN_SALARY MAX_SALARY
---------- ----------------------------------- ---------- ----------
ST_MAN     Stock Manager                             5500       8500
```
...and who is HIS boss?
```
SQL> select * from employees where employee_id=100;

EMPLOYEE_ID FIRST_NAME  LAST_NAME    EMAIL     PHONE_NUMBER  HIRE_DATE JOB_ID     SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
----------- ----------- ------------ --------- ------------- --------- --------- ------- -------------- ---------- -------------
        100 Steven      King         SKING     515.123.4567  17-JUN-03 AD_PRES     24000                                      90
```
They work in Department=50 - what is that?
```
SQL> select * from departments where department_id=50;

DEPARTMENT_ID DEPARTMENT_NAME                MANAGER_ID LOCATION_ID
------------- ------------------------------ ---------- -----------
           50 Shipping                              121        1500
```
It's in Location 1500. Where is location=1500?
```
SQL> select * from locations where location_id=1500;

LOCATION_ID STREET_ADDRESS                           POSTAL_CODE  CITY                           STATE_PROVINCE            CO
----------- ---------------------------------------- ------------ ------------------------------ ------------------------- --
       1500 2011 Interiors Blvd                      99236        South San Francisco            California                US
```
And column CO value is "US" - I wonder what country that is in?
```
SQL> select * from countries where country_id='US';

CO COUNTRY_NAME                             REGION_ID
-- ---------------------------------------- ----------
US United States of America                          2

1 row selected.
```
And the US is in the Americas:
```
SQL> select * from regions where region_id=2;

 REGION_ID REGION_NAME
---------- -------------------------
         2 Americas
```
Next, tell Spark where to find the Oracle JDBC driver. Edit the Spark defaults file:
```
# vi /etc/dse/spark/spark-defaults.conf
```
Add the following lines pointing to the location of your ojdbc7.jar file (/app/oracle/downloads is my location, yours may be different):
```
spark.driver.extraClassPath = /app/oracle/downloads/ojdbc7.jar
spark.executor.extraClassPath = /app/oracle/downloads/ojdbc7.jar
```
Restart DSE to pick up the change:
```
$ sudo service dse stop
$ sudo service dse start
```
Now start the Spark REPL, passing the driver jar on the driver classpath:
```
$ dse spark --driver-class-path /app/oracle/downloads/ojdbc7.jar -deprecation
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.2
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77)
Type in expressions to have them evaluated.
Type :help for more information.
Initializing SparkContext with MASTER: spark://127.0.0.1:7077
Created spark context..
Spark context available as sc.
Hive context available as sqlContext. Will be initialized on first use.
```
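Optionally, a quick sanity check (my own habit, not part of the original steps) that the Oracle driver really is visible on the REPL classpath - if the jar path was wrong this throws a ClassNotFoundException:
```
scala> Class.forName("oracle.jdbc.OracleDriver")
res0: Class[_] = class oracle.jdbc.OracleDriver
```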
Import the connector and Spark classes we'll use:
```
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import java.io._
```
Now load the Oracle employees table into a Spark DataFrame over JDBC.

For Spark versions below 1.4:
```
scala> val employees = sqlContext.load("jdbc", Map("url" -> "jdbc:oracle:thin:hr/hr@localhost:1521/orcl", "dbtable" -> "employees"))
```
For Spark 1.4 onwards:
```
scala> val employees = sqlContext.read.format("jdbc").option("url", "jdbc:oracle:thin:hr/hr@localhost:1521/orcl").option("driver", "oracle.jdbc.OracleDriver").option("dbtable", "employees").load()
```
The Spark REPL responds successfully with:
```
employees: org.apache.spark.sql.DataFrame = [EMPLOYEE_ID: decimal(6,0), FIRST_NAME: string, LAST_NAME: string, EMAIL: string, PHONE_NUMBER: string, HIRE_DATE: timestamp, JOB_ID: string, SALARY: decimal(8,2), COMMISSION_PCT: decimal(2,2), MANAGER_ID: decimal(6,0), DEPARTMENT_ID: decimal(4,0)]
```
All good so far.
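As a hedged variant (the props name is mine, not from the original), you can keep the credentials out of the JDBC URL by passing a java.util.Properties object to the read.jdbc() form instead:
```
scala> import java.util.Properties

scala> val props = new Properties()
scala> props.setProperty("user", "hr")
scala> props.setProperty("password", "hr")
scala> props.setProperty("driver", "oracle.jdbc.OracleDriver")

// same result as the option() chain above, with credentials held separately
scala> val employees = sqlContext.read.jdbc("jdbc:oracle:thin:@localhost:1521/orcl", "employees", props)
```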
Let's look at the schema Spark derived from the Oracle table:
```
scala> employees.printSchema()
root
 |-- EMPLOYEE_ID: decimal(6,0) (nullable = false)
 |-- FIRST_NAME: string (nullable = true)
 |-- LAST_NAME: string (nullable = false)
 |-- EMAIL: string (nullable = false)
 |-- PHONE_NUMBER: string (nullable = true)
 |-- HIRE_DATE: timestamp (nullable = false)
 |-- JOB_ID: string (nullable = false)
 |-- SALARY: decimal(8,2) (nullable = true)
 |-- COMMISSION_PCT: decimal(2,2) (nullable = true)
 |-- MANAGER_ID: decimal(6,0) (nullable = true)
 |-- DEPARTMENT_ID: decimal(4,0) (nullable = true)
```
We can use the dataframe .show(n) method to display the first n rows in the table:
```
scala> employees.show(5)
+-----------+-----------+----------+--------+------------+--------------------+----------+--------+--------------+----------+-------------+
|EMPLOYEE_ID| FIRST_NAME| LAST_NAME|   EMAIL|PHONE_NUMBER|           HIRE_DATE|    JOB_ID|  SALARY|COMMISSION_PCT|MANAGER_ID|DEPARTMENT_ID|
+-----------+-----------+----------+--------+------------+--------------------+----------+--------+--------------+----------+-------------+
|        100|     Steven|      King|   SKING|515.123.4567|2003-06-16 23:00:...|   AD_PRES|24000.00|          null|      null|           90|
|        101|      Neena|   Kochhar|NKOCHHAR|515.123.4568|2005-09-20 23:00:...|     AD_VP|17000.00|          null|       100|           90|
|        102|        Lex|   De Haan| LDEHAAN|515.123.4569|2001-01-13 00:00:...|     AD_VP|17000.00|          null|       100|           90|
|        103|  Alexander|    Hunold| AHUNOLD|590.423.4567|2006-01-03 00:00:...|   IT_PROG| 9000.00|          null|       102|           60|
|        104|      Bruce|     Ernst|  BERNST|590.423.4568|2007-05-20 23:00:...|   IT_PROG| 6000.00|          null|       103|           60|
+-----------+-----------+----------+--------+------------+--------------------+----------+--------+--------------+----------+-------------+
only showing top 5 rows
```
We have Employees. Now let's load some data from the Oracle table Departments.

For Spark versions below 1.4:
```
scala> val departments = sqlContext.load("jdbc", Map("url" -> "jdbc:oracle:thin:hr/hr@localhost:1521/orcl", "dbtable" -> "departments"))
```
For Spark 1.4 onwards:
```
scala> val departments = sqlContext.read.format("jdbc").option("url", "jdbc:oracle:thin:hr/hr@localhost:1521/orcl").option("driver", "oracle.jdbc.OracleDriver").option("dbtable", "departments").load()
```
REPL responds:
```
departments: org.apache.spark.sql.DataFrame = [DEPARTMENT_ID: decimal(4,0), DEPARTMENT_NAME: string, MANAGER_ID: decimal(6,0), LOCATION_ID: decimal(4,0)]
```
549 | scala> val departments = sqlContext.read.format("jdbc")
550 | .option("url", "jdbc:oracle:thin:hr/hr@localhost:1521/orcl")
551 | .option("driver", "oracle.jdbc.OracleDriver")
552 | .option("dbtable", "departments")
553 | .option("partitionColumn", "DEPARTMENT_ID")
554 | .option("lowerBound", "1")
555 | .option("upperBound", "100000000")
556 | .option("numPartitions", "4")
557 | .option("fetchsize","1000")
558 | .load()
559 |
560 |
> Don’t create too many partitions in parallel on a large cluster, otherwise Spark might crash the external database.

You can experiment with these options when you load your data on your infrastructure.
Read more about this here: http://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases
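Another way to control partitioning - a hedged sketch of my own, not from the original walkthrough - is the four-argument jdbc() reader, where each element of a predicates array becomes the WHERE clause of one partition (this reuses the props object from the earlier Properties sketch):
```
scala> val preds = Array("DEPARTMENT_ID < 100", "DEPARTMENT_ID >= 100")

scala> val departments2 = sqlContext.read.jdbc(
     |   "jdbc:oracle:thin:@localhost:1521/orcl",  // url
     |   "departments",                            // table
     |   preds,                                    // one partition per predicate
     |   props)                                    // user/password/driver as before
```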
Check the schema of the departments dataframe:
```
scala> departments.printSchema()
root
 |-- DEPARTMENT_ID: decimal(4,0) (nullable = false)
 |-- DEPARTMENT_NAME: string (nullable = false)
 |-- MANAGER_ID: decimal(6,0) (nullable = true)
 |-- LOCATION_ID: decimal(4,0) (nullable = true)
```
> At this point the JDBC statement has been validated but Spark hasn't yet touched the physical data (this is called lazy evaluation) - for example, if you provide an invalid partitioning column you won't get an error message until you actually try to read the data, that is, perform an action.
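To see that laziness for yourself, here's a small illustrative sketch (the lazyDf name and the deliberately bogus column are mine): load() succeeds because only a schema probe runs against Oracle, and the bad partitioning column isn't noticed until an action triggers the partition queries:
```
scala> val lazyDf = sqlContext.read.format("jdbc").
     |   option("url", "jdbc:oracle:thin:hr/hr@localhost:1521/orcl").
     |   option("driver", "oracle.jdbc.OracleDriver").
     |   option("dbtable", "departments").
     |   option("partitionColumn", "NO_SUCH_COLUMN").  // invalid on purpose
     |   option("lowerBound", "1").
     |   option("upperBound", "100").
     |   option("numPartitions", "2").
     |   load()                // no error yet - only the schema has been fetched

scala> lazyDf.count()          // the action fails here: Oracle rejects NO_SUCH_COLUMN
```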
Now read records from the Departments DataFrame:
```
scala> departments.show(5)
+-------------+--------------------+----------+-----------+
|DEPARTMENT_ID|     DEPARTMENT_NAME|MANAGER_ID|LOCATION_ID|
+-------------+--------------------+----------+-----------+
|           10|      Administration|       200|       1700|
|           20|           Marketing|       201|       1800|
|           30|          Purchasing|       114|       1700|
|           40|     Human Resources|       203|       2400|
|           50|            Shipping|       121|       1500|
+-------------+--------------------+----------+-----------+
only showing top 5 rows
```
What's the total number of records in the departments dataframe?
```
scala> departments.count()
res1: Long = 27
```
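A hedged aside while we're here: simple filters on a JDBC-backed dataframe are generally pushed down into the Oracle query, so a filtered count shouldn't need to pull the whole table over the wire. A sketch (location id 1700 comes from the departments listing above):
```
// Count departments at location 1700; the equality predicate should be
// pushed into the generated Oracle query - inspect the physical plan
// with .explain() to confirm what actually gets pushed on your version.
scala> departments.filter("LOCATION_ID = 1700").count()
```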
Now switch to cqlsh and create the target keyspace for the migrated data:
```
CREATE KEYSPACE IF NOT EXISTS HR WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 };
USE HR;
```
Our first query requirement is to retrieve an employee by employee id, so create an employees table partitioned on employee_id:
```
DROP TABLE IF EXISTS employees;

CREATE TABLE employees (
  employee_id bigint,
  first_name text,
  last_name text,
  email text,
  phone_number text,
  hire_date text,
  salary decimal,
  commission_pct decimal,
  PRIMARY KEY (employee_id));
```
This table satisfies query 1: we can now do an extremely fast retrieval using a partition key to access our data:
```
cqlsh:hr> select * from employees where employee_id=188;

 employee_id | commission_pct | email  | first_name | hire_date                | last_name | phone_number | salary
-------------+----------------+--------+------------+--------------------------+-----------+--------------+---------
         188 |           null | KCHUNG |      Kelly | 2005-06-14 00:00:00-0400 |     Chung | 650.505.1876 | 3800.00
```
The column names in our dataframe are upper case, and it carries columns the target table doesn't have, so we need to modify our dataframe schema.
Here's a reminder of our raw employees dataframe schema:
```
scala> employees.printSchema()
root
 |-- EMPLOYEE_ID: decimal(6,0) (nullable = false)
 |-- FIRST_NAME: string (nullable = true)
 |-- LAST_NAME: string (nullable = false)
 |-- EMAIL: string (nullable = false)
 |-- PHONE_NUMBER: string (nullable = true)
 |-- HIRE_DATE: timestamp (nullable = false)
 |-- JOB_ID: string (nullable = false)
 |-- SALARY: decimal(8,2) (nullable = true)
 |-- COMMISSION_PCT: decimal(2,2) (nullable = true)
 |-- MANAGER_ID: decimal(6,0) (nullable = true)
 |-- DEPARTMENT_ID: decimal(4,0) (nullable = true)
```
We want to change those column names to lower case, so we create a list of lower-case column names matching the dataframe order, creating a new dataframe "emps_lc" in the process.

We rename the columns in the dataframe like this:
1. Create a new collection of names:
```
scala> val newNames = Seq("employee_id", "first_name", "last_name", "email","phone_number","hire_date","job_Id","salary","commission_pct","manager_id","department_id")

newNames: Seq[String] = List(employee_id, first_name, last_name, email, phone_number, hire_date, job_Id, salary, commission_pct, manager_id, department_id)
```
2. Create a new dataframe from employees using the new column names:
```
scala> val emps_lc = employees.toDF(newNames: _*)

emps_lc: org.apache.spark.sql.DataFrame = [employee_id: decimal(6,0), first_name: string, last_name: string, email: string, phone_number: string, hire_date: timestamp, job_Id: string, salary: decimal(8,2), commission_pct: decimal(2,2), manager_id: decimal(6,0), department_id: decimal(4,0)]
```
The schema in our new dataframe is in lower case - Yay!
```
scala> emps_lc.printSchema()
root
 |-- employee_id: decimal(6,0) (nullable = false)
 |-- first_name: string (nullable = true)
 |-- last_name: string (nullable = false)
 |-- email: string (nullable = false)
 |-- phone_number: string (nullable = true)
 |-- hire_date: timestamp (nullable = false)
 |-- job_Id: string (nullable = false)
 |-- salary: decimal(8,2) (nullable = true)
 |-- commission_pct: decimal(2,2) (nullable = true)
 |-- manager_id: decimal(6,0) (nullable = true)
 |-- department_id: decimal(4,0) (nullable = true)
```
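An equivalent one-liner (my own shortcut, not from the original walkthrough) derives the lower-case names from the dataframe itself, so nothing can be mistyped - note it would also produce job_id rather than the job_Id typed above:
```
scala> val emps_lc = employees.toDF(employees.columns.map(_.toLowerCase): _*)
```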
There are some columns in the dataframe above that we don't need for this exercise. For simplicity we'll create a new dataframe from this one, containing just the columns that we do want to use.
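As a hedged aside before the two methods below, the same trimming can be done with the DataFrame drop() method (my own sketch; in Spark 1.6 drop() takes a single column name per call):
```
// drop the three columns the Cassandra employees table doesn't have
scala> val emps_lc_subset = emps_lc.drop("job_Id").drop("manager_id").drop("department_id")
```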
774 | scala> employees.registerTempTable("empTable")
775 |
776 | scala> departments.registerTempTable("deptTable")
777 |
778 |
779 | Now we can run a query using those 'tables':
780 |
781 | scala> val emps_lc_subset = sqlContext.sql("SELECT employee_id, first_name, last_name, email, phone_number, hire_date, salary, commission_pct FROM empTable")
782 |
783 | emps_lc_subset: org.apache.spark.sql.DataFrame = [employee_id: decimal(6,0), first_name: string, last_name: string, email: string, phone_number: string, hire_date: timestamp, salary: decimal(8,2), commission_pct: decimal(2,2)]
784 |
785 |
And we now have the schema we're looking for to match the Cassandra target table for query 1:
```
scala> emps_lc_subset.printSchema()
root
 |-- employee_id: decimal(6,0) (nullable = false)
 |-- first_name: string (nullable = true)
 |-- last_name: string (nullable = false)
 |-- email: string (nullable = false)
 |-- phone_number: string (nullable = true)
 |-- hire_date: timestamp (nullable = false)
 |-- salary: decimal(8,2) (nullable = true)
 |-- commission_pct: decimal(2,2) (nullable = true)
```
The second method projects the columns directly with the dataframe select() method:
```
scala> val emps_lc_subset = employees.select("employee_id","first_name","last_name","email","phone_number","hire_date","salary","commission_pct")

emps_lc_subset: org.apache.spark.sql.DataFrame = [employee_id: decimal(6,0), first_name: string, last_name: string, email: string, phone_number: string, hire_date: timestamp, salary: decimal(8,2), commission_pct: decimal(2,2)]
```
And to demonstrate that the results are the same:
```
scala> emps_lc_subset.printSchema()
root
 |-- employee_id: decimal(6,0) (nullable = false)
 |-- first_name: string (nullable = true)
 |-- last_name: string (nullable = false)
 |-- email: string (nullable = false)
 |-- phone_number: string (nullable = true)
 |-- hire_date: timestamp (nullable = false)
 |-- salary: decimal(8,2) (nullable = true)
 |-- commission_pct: decimal(2,2) (nullable = true)

scala> emps_lc_subset.show(5)
+-----------+----------+---------+--------+------------+--------------------+--------+--------------+
|employee_id|first_name|last_name|   email|phone_number|           hire_date|  salary|commission_pct|
+-----------+----------+---------+--------+------------+--------------------+--------+--------------+
|        100|    Steven|     King|   SKING|515.123.4567|2003-06-16 23:00:...|24000.00|          null|
|        101|     Neena|  Kochhar|NKOCHHAR|515.123.4568|2005-09-20 23:00:...|17000.00|          null|
|        102|       Lex|  De Haan| LDEHAAN|515.123.4569|2001-01-13 00:00:...|17000.00|          null|
|        103| Alexander|   Hunold| AHUNOLD|590.423.4567|2006-01-03 00:00:...| 9000.00|          null|
|        104|     Bruce|    Ernst|  BERNST|590.423.4568|2007-05-20 23:00:...| 6000.00|          null|
+-----------+----------+---------+--------+------------+--------------------+--------+--------------+
only showing top 5 rows
```
Now save the dataframe to the Cassandra employees table using the spark-cassandra connector:
```
scala> emps_lc_subset.write.format("org.apache.spark.sql.cassandra").options(Map( "table" -> "employees", "keyspace" -> "hr")).save()
```
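A hedged aside: if you re-run that save, the default SaveMode (ErrorIfExists) can make the write abort because the target table already exists, so it may be safer to state the mode explicitly - a sketch:
```
scala> import org.apache.spark.sql.SaveMode

scala> emps_lc_subset.write.
     |   format("org.apache.spark.sql.cassandra").
     |   mode(SaveMode.Append).   // append rows instead of failing on an existing table
     |   options(Map("table" -> "employees", "keyspace" -> "hr")).
     |   save()
```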
And if we hop over to cqlsh we can see the records are there:
```
cqlsh:hr> select * from employees;

 employee_id | commission_pct | email    | first_name  | hire_date                | last_name   | phone_number       | salary
-------------+----------------+----------+-------------+--------------------------+-------------+--------------------+----------
         111 |           null | ISCIARRA |      Ismael | 2005-09-30 00:00:00-0400 |     Sciarra |       515.124.4369 |  7700.00
         163 |           0.15 |  DGREENE |    Danielle | 2007-03-19 00:00:00-0400 |      Greene | 011.44.1346.229268 |  9500.00
         148 |           0.30 | GCAMBRAU |      Gerald | 2007-10-15 00:00:00-0400 |   Cambrault | 011.44.1344.619268 | 11000.00
         197 |           null |  KFEENEY |       Kevin | 2006-05-23 00:00:00-0400 |      Feeney |       650.507.9822 |  3000.00
         102 |           null |  LDEHAAN |         Lex | 2001-01-13 00:00:00-0500 |     De Haan |       515.123.4569 | 17000.00
         153 |           0.20 |   COLSEN | Christopher | 2006-03-30 00:00:00-0500 |       Olsen | 011.44.1344.498718 |  8000.00
         129 |           null |  LBISSOT |       Laura | 2005-08-20 00:00:00-0400 |      Bissot |       650.124.5234 |  3300.00
         160 |           0.30 |   LDORAN |      Louise | 2005-12-15 00:00:00-0500 |       Doran | 011.44.1345.629268 |  7500.00
         107 |           null | DLORENTZ |       Diana | 2007-02-07 00:00:00-0500 |     Lorentz |       590.423.5567 |  4200.00
         136 |           null | HPHILTAN |       Hazel | 2008-02-06 00:00:00-0500 |  Philtanker |       650.127.1634 |  2200.00
         188 |           null |   KCHUNG |       Kelly | 2005-06-14 00:00:00-0400 |       Chung |       650.505.1876 |  3800.00
         134 |           null |  MROGERS |     Michael | 2006-08-26 00:00:00-0400 |      Rogers |       650.127.1834 |  2900.00
         181 |           null |  JFLEAUR |        Jean | 2006-02-23 00:00:00-0500 |      Fleaur |       650.507.9877 |  3100.00
```
Note: The integrated DSE release of Cassandra and Spark also gives us the opportunity to read tables in Cassandra directly.

For example, read some columns from the employees table:
```
scala> val emp_data = sqlContext.sql("select employee_id, first_name, last_name from hr.employees")
emp_data: org.apache.spark.sql.DataFrame = [employee_id: bigint, first_name: string, last_name: string]

scala> emp_data.printSchema()
root
 |-- employee_id: long (nullable = true)
 |-- first_name: string (nullable = true)
 |-- last_name: string (nullable = true)

scala> emp_data.count()
res20: Long = 107

scala> emp_data.show(5)
+-----------+----------+---------+
|employee_id|first_name|last_name|
+-----------+----------+---------+
|        111|    Ismael|  Sciarra|
|        163|  Danielle|   Greene|
|        148|    Gerald|Cambrault|
|        197|     Kevin|   Feeney|
|        102|       Lex|  De Haan|
+-----------+----------+---------+
only showing top 5 rows
```
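Outside DSE's hr.employees SQL shorthand, the same read can be spelled out through the connector's data source API - a hedged equivalent sketch:
```
scala> val emp_data = sqlContext.read.
     |   format("org.apache.spark.sql.cassandra").
     |   options(Map("table" -> "employees", "keyspace" -> "hr")).
     |   load().
     |   select("employee_id", "first_name", "last_name")
```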
Our second query requirement is to retrieve employees by department, so in cqlsh create a table partitioned by department:
```
DROP TABLE IF EXISTS EMPLOYEES_BY_DEPT;
CREATE TABLE employees_by_dept (
  DEPARTMENT_ID bigint,
  DEPARTMENT_NAME text static,
  EMPLOYEE_ID bigint,
  FIRST_NAME text,
  LAST_NAME text,
  PRIMARY KEY (DEPARTMENT_ID, EMPLOYEE_ID));
```

> For each department we'll store the department id and the name of the department (as a partitioning key and a static column for that partitioning key), and then within each department partition there will be successive clustered rows of employees in that department.
We create a dataframe containing employees-by-department by joining our SparkSQL tables employees and departments on DEPARTMENT_ID.

In the Spark REPL, register the dataframes as temporary tables if not done already (this allows us to use Spark SQL on them):
```
scala> employees.registerTempTable("empTable")

scala> departments.registerTempTable("deptTable")
```
Right, let's do the query joining the tables (this is so easy using SparkSQL :) ):
```
scala> val emp_by_dept = sqlContext.sql("SELECT d.department_id, d.department_name, e.employee_id, e.first_name, e.last_name FROM empTable e, deptTable d where e.department_id=d.department_id")
```
Response:
```
emp_by_dept: org.apache.spark.sql.DataFrame = [department_id: decimal(4,0), department_name: string, employee_id: decimal(6,0), first_name: string, last_name: string]
```
Here's the schema for the resulting dataframe:
```
scala> emp_by_dept.printSchema()
root
 |-- department_id: decimal(4,0) (nullable = false)
 |-- department_name: string (nullable = false)
 |-- employee_id: decimal(6,0) (nullable = false)
 |-- first_name: string (nullable = true)
 |-- last_name: string (nullable = false)
```
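The same join can also be written with the DataFrame API instead of SQL - a hedged sketch (the emp_by_dept_df name is mine): join on the shared DEPARTMENT_ID column, project the columns we want, then rename them to lower case for Cassandra:
```
scala> val emp_by_dept_df = employees.join(departments, "DEPARTMENT_ID").
     |   select("DEPARTMENT_ID", "DEPARTMENT_NAME", "EMPLOYEE_ID", "FIRST_NAME", "LAST_NAME").
     |   toDF("department_id", "department_name", "employee_id", "first_name", "last_name")
```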
The data is in the format we want, matching the Cassandra target table:
```
scala> emp_by_dept.show(5)
+-------------+---------------+-----------+----------+-----------+
|department_id|department_name|employee_id|first_name| last_name|
+-------------+---------------+-----------+----------+-----------+
|           40|Human Resources|        203|     Susan|     Mavris|
|           50|       Shipping|        120|   Matthew|      Weiss|
|           50|       Shipping|        121|      Adam|      Fripp|
|           50|       Shipping|        122|     Payam|   Kaufling|
|           50|       Shipping|        123|    Shanta|    Vollman|
+-------------+---------------+-----------+----------+-----------+
only showing top 5 rows
```
Save it to the Cassandra employees_by_dept table:
```
scala> emp_by_dept.write.format("org.apache.spark.sql.cassandra").options(Map( "table" -> "employees_by_dept", "keyspace" -> "hr")).save()
```
And the records are there in Cassandra:
```
cqlsh:hr> select * from employees_by_dept;

 department_id | employee_id | department_name | first_name | last_name
---------------+-------------+-----------------+------------+------------
            30 |         114 |      Purchasing |        Den |   Raphaely
            30 |         115 |      Purchasing |  Alexander |       Khoo
            30 |         116 |      Purchasing |     Shelli |      Baida
            30 |         117 |      Purchasing |      Sigal |     Tobias
            30 |         118 |      Purchasing |        Guy |     Himuro
            30 |         119 |      Purchasing |      Karen | Colmenares
            20 |         201 |       Marketing |    Michael |  Hartstein
            20 |         202 |       Marketing |        Pat |        Fay
            80 |         145 |           Sales |       John |    Russell
            80 |         146 |           Sales |      Karen |   Partners
```
Now we're able to retrieve data about departments:
```
cqlsh:hr> select distinct department_id, department_name from employees_by_dept;

 department_id | department_name
---------------+------------------
            30 |       Purchasing
            20 |        Marketing
            80 |            Sales
            60 |               IT
           110 |       Accounting
            50 |         Shipping
            10 |   Administration
           100 |          Finance
            40 |  Human Resources
            70 | Public Relations
            90 |        Executive
```
And the second part of the query requirement was to be able to return employees by department:
```
cqlsh:hr> select department_name, first_name, last_name from employees_by_dept where department_id=50;

 department_name | first_name | last_name
-----------------+------------+-------------
        Shipping |    Matthew |       Weiss
        Shipping |       Adam |       Fripp
        Shipping |      Payam |    Kaufling
        Shipping |     Shanta |     Vollman
        Shipping |      Kevin |     Mourgos
        Shipping |      Julia |       Nayer
        Shipping |      Irene | Mikkilineni
        Shipping |      James |      Landry
        Shipping |     Steven |      Markle
        Shipping |      Laura |      Bissot
        Shipping |      Mozhe |    Atkinson
        Shipping |      James |      Marlow
```
--------------------------------------------------------------------------------

/Red_Hat_Oracle_VM_Notes.md:
--------------------------------------------------------------------------------
Check the current NIC configuration with ifconfig:
```
eth8      Link encap:Ethernet  HWaddr 08:00:27:B7:44:21
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
```
The default NIC is on eth8 - we need to tweak the config in ```/etc/sysconfig/network-scripts``` by creating a file for eth0.

Start off by getting rid of eth0..eth8 as defined in /etc/udev/rules.d/70-persistent-net.rules.

Delete this file, then reboot:
```
# rm /etc/udev/rules.d/70-persistent-net.rules
```
Go to the network scripts directory:
```
# cd /etc/sysconfig/network-scripts/
```
List the existing interface configs:
```
# ls ifcfg*
ifcfg-Auto_eth8  ifcfg-lo
```
Rename the eth8 config file for eth0:
```
# mv ifcfg-Auto_eth8 ifcfg-eth0
```
Edit the file - the original contents look like this:
```
# vi ifcfg-eth0
TYPE=Ethernet
BOOTPROTO=dhcp
DEFROUTE=no
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="Auto eth8"
UUID=260d6a27-75c8-4773-b12a-b9c8f18505d7
ONBOOT=yes
HWADDR=08:00:27:B7:44:21
PEERDNS=yes
PEERROUTES=yes
LAST_CONNECT=1470844728
```
Change it to this:
```
DEVICE=eth0
NM_CONTROLLED=no
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
HWADDR=08:00:27:B7:44:21
NETMASK=255.255.255.0
GATEWAY=10.0.2.2
PEERDNS=yes
PEERROUTES=yes
```
Check that the firewall is not running:
```
# service iptables status
iptables: Firewall is not running.
```
Change /etc/sysconfig/network from this:
```
# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=demo.us.oracle.com
GATEWAY=10.128.0.1
# oracle-rdbms-server-11gR2-preinstall : Add NOZEROCONF=yes
NOZEROCONF=yes
```
To this:
```
NETWORKING=yes
HOSTNAME=sampleapp
GATEWAY=
# oracle-rdbms-server-11gR2-preinstall : Add NOZEROCONF=yes
NOZEROCONF=yes
```
The hostname still shows the old name:
```
# hostname
demo.us.oracle.com
```
Set the new hostname:
```
# hostname sampleapp
```
Check it:
```
# hostname
sampleapp
```
Update the HOSTNAME environment variable too:
```
# echo $HOSTNAME
demo.us.oracle.com
# HOSTNAME=sampleapp; export HOSTNAME
```
Update /etc/hosts with the new name:
```
# vi /etc/hosts

127.0.0.1    localhost.localdomain localhost
::1          localhost localhost.localdomain localhost6 localhost6.localdomain6
#10.128.8.33 demo.us.oracle.com demo
10.0.2.15    sampleapp
```
Check that the hostname resolves:
```
# ping `echo $HOSTNAME`
PING sampleapp (10.0.2.15) 56(84) bytes of data.
64 bytes from sampleapp (10.0.2.15): icmp_seq=1 ttl=64 time=0.020 ms
64 bytes from sampleapp (10.0.2.15): icmp_seq=2 ttl=64 time=0.037 ms
```
Check DNS resolution:
```
# cat /etc/resolv.conf
; generated by /sbin/dhclient-script
search localdomain
nameserver 8.8.8.8
nameserver 8.8.4.4
```
Restart networking:
```
# service network restart
```
eth0 should now be up, with the routing table pointing at it:
```
eth0      Link encap:Ethernet  HWaddr 08:00:27:B7:44:21
          inet addr:171.28.136.150  Bcast:171.28.136.255  Mask:255.255.255.0

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         171.28.136.254  0.0.0.0         UG    0      0        0 eth0
171.28.136.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
```
Check the OS release:
```
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.8 (Santiago)
```
Edit the yum configuration as needed:
```
# vi /etc/yum.conf
```
RHEL 6 ships with Python 2.6, so build and install Python 2.7 (used by cqlsh):
```
# yum install gcc
# cd /usr/src
# wget https://www.python.org/ftp/python/2.7.12/Python-2.7.12.tgz
# tar xzf Python-2.7.12.tgz
# cd Python-2.7.12
# ./configure
# make altinstall
```
Verify the new interpreter:
```
# python2.7 --version
Python 2.7.12
```
("python --version" still brings up 2.6.6)

cqlsh now runs, but warns about a missing pytz module:
```
# cqlsh
Warning: Timezone defined and 'pytz' module for timezone conversion not installed. Timestamps will be displayed in UTC timezone.

Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.0.7.1159 | DSE 5.0.1 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cqlsh>
```
We will fix the pytz error next.
Install wget if it's missing:
```
# yum install wget
```
Install the EPEL repository:
```
# rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

Retrieving http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
warning: /var/tmp/rpm-tmp.X7MF42: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
Preparing...                ########################################### [100%]
   1:epel-release           ########################################### [100%]
```
Confirm it's installed:
```
# rpm -qa | grep epel
epel-release-6-8.noarch
```
The install creates two files:
- /etc/yum.repos.d/epel.repo
- /etc/yum.repos.d/epel-testing.repo

Install pip:
```
# yum install -y python-pip
```
Install easy_install for Python 2.7:
```
# wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py
# /usr/local/bin/python2.7 ez_setup.py
# /usr/local/bin/easy_install-2.7 pip
```
Check pip:
```
# which pip
/usr/local/bin/pip
# pip --version
pip 8.1.2 from /usr/local/lib/python2.7/site-packages/pip-8.1.2-py2.7.egg (python 2.7)
```
Check easy_install:
```
# which easy_install
/usr/local/bin/easy_install
# easy_install --version
setuptools 27.1.2 from /usr/local/lib/python2.7/site-packages/setuptools-27.1.2-py2.7.egg (Python 2.7)
```
Use easy_install to install the pytz package (to fix the warning from cqlsh):
```
# /usr/local/bin/easy_install-2.7 pytz
```
cqlsh now starts without the warning:
```
# cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.0.7.1159 | DSE 5.0.1 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cqlsh>
```
The DSE install creates a cassandra user:
```
# cat /etc/passwd | grep cassandra
cassandra:x:476:472::/var/lib/cassandra:/bin/bash
```
Resource limits for the cassandra user are set in /etc/security/limits.d/cassandra.conf:
```
# cat /etc/security/limits.d/cassandra.conf

cassandra - memlock unlimited
cassandra - nofile 100000
cassandra - nproc 32768
cassandra - as unlimited
```
If you want to log in as the Cassandra user you will either need to su - from the root account, or use the root account to set a password for the cassandra account to allow interactive logins.

## 12. Check Oracle Database Name
```
$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Mon Sep 5 05:25:42 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select database_name from v$database;

DATABASE_NAME
----------------------
ORCL.US.ORACLE.COM

1 row selected.
```

## 13. Set Up Oracle TNS
The environment variable TNS_ADMIN should point to the location of your TNS configuration files:
```
$ echo $TNS_ADMIN
/app/oracle/db/product/12c/dbhome_1/network/admin
$ ls $TNS_ADMIN
listener.ora  listener.ora~  samples  shrept.lst  sqlnet.ora  tnsnames.ora  tnsnames.ora~
```

### Configure listener.ora
#### Set Hostname
Change the hostname from demo.us.oracle.com to localhost.
The config file is:
```
/app/oracle/db/product/12c/dbhome_1/network/admin/listener.ora
```
```
# listener.ora Network Configuration File: /app/oracle/db/product/12c/dbhome_1/network/admin/listener.ora
# Generated by Oracle configuration tools.

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
    )
  )
```

### Add A Service To listener.ora
We need to add a service listening for connections to database orcl.
Add this section to the end of listener.ora:
```
SID_LIST_LISTENER=
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=orcl)
      (ORACLE_HOME=/app/oracle/db/product/12c/dbhome_1)
      (SID_NAME=orcl)))
```

### Configure Host Names In tnsnames.ora
Change all occurrences of the hostname from demo.us.oracle.com to localhost in:
```
/app/oracle/db/product/12c/dbhome_1/network/admin/tnsnames.ora
```
```
# tnsnames.ora Network Configuration File: /app/oracle/db/product/12c/dbhome_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

LISTENER_ORCL =
  (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))

ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
    )
  )

pdborcl =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = pdborcl)
    )
  )
```

### Start the listener
```
$ lsnrctl start

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 13-SEP-2016 17:18:26

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Starting /app/oracle/db/product/12c/dbhome_1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 12.1.0.2.0 - Production
System parameter file is /app/oracle/db/product/12c/dbhome_1/network/admin/listener.ora
Log messages written to /app/oracle/db/diag/tnslsnr/sampleapp/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                13-SEP-2016 17:18:26
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /app/oracle/db/product/12c/dbhome_1/network/admin/listener.ora
Listener Log File         /app/oracle/db/diag/tnslsnr/sampleapp/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
Services Summary...
Service "orcl" has 1 instance(s).
  Instance "orcl", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
```

### Check tnsping
```
$ tnsping orcl

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 13-SEP-2016 17:25:55

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/app/oracle/db/product/12c/dbhome_1/network/admin/sqlnet.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = orcl)))
OK (0 msec)
```

### Check connection with sqlplus
```
$ sqlplus hr/hr@orcl

SQL*Plus: Release 12.1.0.2.0 Production on Tue Sep 13 17:32:06 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Tue Sep 13 2016 17:18:54 -05:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL>
```

## 14. Fix VBox Shared Clipboard
My clipboard didn't seem to work, but this fixes it.

In VirtualBox, mount the Guest Additions CD and run it when prompted for the root password.

Restart the machine.

Then:
```
# killall VBoxClient
# VBoxClient-all
```
I find that to enable the bidirectional clipboard I have to run these last two commands every time I start the machine.