├── LICENSE.txt
├── README.md
├── bootstrap-cm.sh
└── cmxDeploy.py


/LICENSE.txt:
--------------------------------------------------------------------------------

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
----
``` bash
usage: cmxDeploy.py -u "root" -p "cloudera" -m "CM-SERVER-IP" -w "ip1,ip2,ip3,..."
```
----

#### pre-requisite

You have a running Cloudera Manager Server. You can use the preferred option as documented in ["Installation Path B - Manual Installation Using Cloudera Manager Packages"] [1](http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Installation-Guide/cm5ig_install_path_B.html?scroll=cmig_topic_6_6).
You also need the Cloudera Manager API Python client (cm-api) installed [2](http://cloudera.github.io/cm_api/docs/python-client/).

#### TL;DR: copy and paste the example below on RHEL/CentOS
``` bash
curl -L https://raw.githubusercontent.com/gdgt/cmapi/master/bootstrap-cm.sh | bash
yum install -y git python-pip python-setuptools
git clone https://github.com/cloudera/cm_api.git -b cm5-5.16.2 $HOME/cm_api && pip install $HOME/cm_api/python
curl -L https://raw.githubusercontent.com/gdgt/cmapi/master/cmxDeploy.py -o $HOME/cmxDeploy.py && chmod +x $HOME/cmxDeploy.py
python $HOME/cmxDeploy.py -u "root" -p "cloudera" -m "$(hostname -f)" -w "$(hostname -f)"
# OR python $HOME/cmxDeploy.py -u "root" -p "cloudera" -m "cm-ip" -w "ip1,ip2,ip3,..."
```
- SSH credentials: these allow CM to configure the rest of the hosts. Pass -u/--ssh-root-user and -p/--ssh-root-password **OR** -k/--ssh-private-key.
- Create the Hive database in the embedded PostgreSQL database; a bash script for this is included in [bootstrap-cm.sh](https://github.com/gdgt/cmapi/blob/master/bootstrap-cm.sh#L13-L21).

#### Usage
``` bash
Usage: cmxDeploy.py [options]

Options:
  -h, --help            show this help message and exit
  -d TEARDOWN, --teardown=TEARDOWN
                        Teardown Cloudera Manager Cluster. Required arguments
                        "keep_cluster" or "remove_cluster".
  -i CDH_VERSION, --cdh-version=CDH_VERSION
                        Install CDH version. Default "latest"
  -k SSH_PRIVATE_KEY, --ssh-private-key=SSH_PRIVATE_KEY
                        The private key to authenticate with the hosts.
                        Specify either this or a password.
  -l LICENSE_FILE, --license-file=LICENSE_FILE
                        Cloudera Manager License file name
  -m CM_SERVER, --cm-server=CM_SERVER
                        *Set Cloudera Manager Server Host. Note: This is the
                        host where the Cloudera Management Services get
                        installed.
  -n CLUSTER_NAME, --cluster-name=CLUSTER_NAME
                        Set Cloudera Manager Cluster name enclosed in double
                        quotes. Default "Cluster 1"
  -p SSH_ROOT_PASSWORD, --ssh-root-password=SSH_ROOT_PASSWORD
                        *Set target node(s) ssh password..
  -u SSH_ROOT_USER, --ssh-root-user=SSH_ROOT_USER
                        Set target node(s) ssh username. Default root
  -w HOST_NAMES, --host-names=HOST_NAMES
                        *Set target node(s) list, separate with comma eg: -w
                        host1,host2,...,host(n). Note: - enclose in double
                        quote. -
                        CM_SERVER excluded in this list, if you want install
                        CDH Services in CM_SERVER add the host to this list.
```
## References
- https://github.com/cloudera/cm_api/tree/master/python/examples
- https://raw.githubusercontent.com/justinhayes/cm_api/master/python/examples/auto-deploy/deploycloudera.py
- https://github.com/eBay/hadrian

----

--------------------------------------------------------------------------------
/bootstrap-cm.sh:
--------------------------------------------------------------------------------
#!/usr/bin/env bash
set -e
echo "Set cloudera-manager.repo to CM v5"
yum clean all
RELEASEVER=$(rpm -q --qf "%{VERSION}" $(rpm -q --whatprovides redhat-release))
rpm --import http://archive.cloudera.com/cdh5/redhat/$RELEASEVER/x86_64/cdh/RPM-GPG-KEY-cloudera
wget http://archive.cloudera.com/cm5/redhat/$RELEASEVER/x86_64/cm/cloudera-manager.repo -O /etc/yum.repos.d/cloudera-manager.repo
yum install -y oracle-j2sdk* cloudera-manager-{daemons,server,server-db*}

echo "start cloudera-scm-server-db and cloudera-scm-server services"
service cloudera-scm-server-db start
service cloudera-scm-server start

export PGPASSWORD=$(head -1 /var/lib/cloudera-scm-server-db/data/generated_password.txt)
SCHEMA=("hive" "sentry")
for DB in "${SCHEMA[@]}"; do
    echo "Create $DB Database in Cloudera embedded PostgreSQL"
    SQLCMD=("CREATE ROLE $DB LOGIN PASSWORD 'cloudera';" "CREATE DATABASE $DB OWNER $DB ENCODING 'UTF8';" "ALTER DATABASE $DB SET standard_conforming_strings = off;")
    for SQL in "${SQLCMD[@]}"; do
        psql -A -t -d scm -U cloudera-scm -h localhost -p 7432 -c """${SQL}"""
    done
done
while ! (exec 6<>/dev/tcp/$(hostname)/7180) 2> /dev/null ; do echo 'Waiting for Cloudera Manager to start accepting connections...'; sleep 10; done

--------------------------------------------------------------------------------
/cmxDeploy.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
#
__author__ = 'Michalis'
__version__ = '0.13.2702'

import socket
import re
import urllib2
from optparse import OptionParser
import hashlib
import json
import os
import sys
import random
import time

from cm_api.api_client import ApiResource, ApiException, API_CURRENT_VERSION
from cm_api.http_client import HttpClient, RestException
from cm_api.endpoints.hosts import *
from cm_api.endpoints.services import ApiServiceSetupInfo, ApiService


def init_cluster():
    """
    Initialise Cluster
    :return:
    """
    print "> Initialise Cluster"
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    # Update Cloudera Manager configuration
    cm = api.get_cloudera_manager()
    # Enable cgroup
    cm.update_all_hosts_config({"rm_enabled": "true"})

    def manifest_to_dict(manifest_json):
        # Read the first parcel name from manifest.json and split it into product and version
        if manifest_json:
            dir_list = json.load(
                urllib2.urlopen(manifest_json))['parcels'][0]['parcelName']
            parcel_part = re.match(r"^(.*?)-(.*)-(.*?)$", dir_list).groups()
            print "{'product': %s, 'version': %s}" % (str(parcel_part[0]).upper(), str(parcel_part[1]).lower())
            return {'product': str(parcel_part[0]).upper(), 'version': str(parcel_part[1]).lower()}
        else:
            raise Exception("Invalid manifest.json")

    # Install CDH5 latest version
    repo_url = ["%s/cdh5/parcels/%s" % (cmx.archive_url, cmx.cdh_version)]
    print "CDH5 Parcel URL: %s" % repo_url[0]
    cmx.parcel.append(manifest_to_dict(repo_url[0] + "/manifest.json"))

    # Install GPLEXTRAS5 to match CDH5 version
    repo_url.append('%s/gplextras5/parcels/%s' %
                    (cmx.archive_url, cmx.parcel[0]['version'].split('-')[0]))
    print "GPL Extras parcel URL: %s" % repo_url[1]
    cmx.parcel.append(manifest_to_dict(repo_url[1] + "/manifest.json"))

    cm.update_config({"REMOTE_PARCEL_REPO_URLS": "http://archive.cloudera.com/impala/parcels/latest/,"
                                                 "http://archive.cloudera.com/search/parcels/latest/,"
                                                 "http://archive.cloudera.com/spark/parcels/latest/,"
                                                 "http://archive.cloudera.com/sqoop-connectors/parcels/latest/,"
                                                 "http://archive.cloudera.com/accumulo-c5/parcels/latest,"
                                                 "%s" % ",".join([url for url in repo_url if url]),
                      "PHONE_HOME": False, "PARCEL_DISTRIBUTE_RATE_LIMIT_KBS_PER_SECOND": "102400",
                      "SESSION_TIMEOUT": "99999999999999"})

    if cmx.cluster_name in [x.name for x in api.get_all_clusters()]:
        print "Cluster name: '%s' already exists" % cmx.cluster_name
    else:
        print "Creating cluster name '%s'" % cmx.cluster_name
        api.create_cluster(name=cmx.cluster_name, version=cmx.cluster_version)


def add_hosts_to_cluster():
    """
    Add hosts to cluster
    :return:
    """
    print "> Add hosts to Cluster: %s" % cmx.cluster_name
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    cm = api.get_cloudera_manager()

    # deploy agents into host_list
    host_list = list(set([socket.getfqdn(x) for x in cmx.host_names] + [socket.getfqdn(cmx.cm_server)]) -
                     set([x.hostname for x in api.get_all_hosts()]))
    if host_list:
        cmd = cm.host_install(user_name=cmx.ssh_root_user, host_names=host_list,
                              password=cmx.ssh_root_password, private_key=cmx.ssh_private_key, unlimited_jce=True)

        # TODO: Temporary fix to flag for unlimited strength JCE policy files installation (If unset, defaults to false)
        # host_install_args = {"userName": cmx.ssh_root_user, "hostNames": host_list, "password": cmx.ssh_root_password,
        #                      "privateKey": cmx.ssh_private_key, "unlimitedJCE": True}
        # cmd = cm._cmd('hostInstall', data=host_install_args)
        print "Installing host(s) to cluster '%s' - [ http://%s:7180/cmf/command/%s/details ]" % \
              (socket.getfqdn(cmx.cm_server), cmx.cm_server, cmd.id)
        check.status_for_command("Hosts: %s " % host_list, cmd)

    hosts = []
    for host in api.get_all_hosts():
        if host.hostId not in [x.hostId for x in cluster.list_hosts()]:
            print "Adding {'ip': '%s', 'hostname': '%s', 'hostId': '%s'}" % (host.ipAddress, host.hostname, host.hostId)
            hosts.append(host.hostId)

    if hosts:
        print "Adding hostId(s) to '%s'" % cmx.cluster_name
        print "%s" % hosts
        cluster.add_hosts(hosts)


def host_rack():
    """
    Add host to rack
    :return:
    """
    # TODO: Add host to rack
    print "> Add host to rack"
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    hosts = []
    for h in api.get_all_hosts():
        # host = api.create_host(h.hostId, h.hostname,
        # socket.gethostbyname(h.hostname),
        # "/default_rack")
        h.set_rack_id("/default_rack")
        hosts.append(h)

    cluster.add_hosts(hosts)


def _check_parcel_stage(parcel_item, expected_stage, action_description):
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)

    # Poll once per second until the parcel reaches one of the expected stages
    while True:
        cdh_parcel = cluster.get_parcel(product=parcel_item['product'], version=parcel_item['version'])
        if cdh_parcel.stage in expected_stage:
            break
        if cdh_parcel.state.errors:
            raise Exception(str(cdh_parcel.state.errors))

        msg = " [%s: %s / %s]" % (cdh_parcel.stage, cdh_parcel.state.progress, cdh_parcel.state.totalProgress)
        sys.stdout.write(msg + " " * (78 - len(msg)) + "\r")
        sys.stdout.flush()
        time.sleep(1)


def parcel_action(parcel_item, function, expected_stage, action_description):
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    print "%s [%s-%s]" % (action_description, parcel_item['product'], parcel_item['version'])
    cdh_parcel = cluster.get_parcel(product=parcel_item['product'], version=parcel_item['version'])

    cmd = getattr(cdh_parcel, function)()
    if not cmd.success:
        print "ERROR: %s failed!" % action_description
        # Exit with a non-zero status so callers can detect the failure
        sys.exit(1)
    return _check_parcel_stage(parcel_item, expected_stage, action_description)


def setup_zookeeper():
    """
    Zookeeper
    > Waiting for ZooKeeper Service to initialize
    Starting ZooKeeper Service
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "ZOOKEEPER"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "zookeeper"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()
        service.update_config({"zookeeper_datadir_autocreate": False})

        # Role Config Group equivalent to Service Default Group
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "SERVER":
                rcg.update_config({"maxClientCnxns": "1024", "zookeeper_server_java_heapsize": "492830720"})
                # Pick 3 hosts and deploy Zookeeper Server role
                for host in random.sample(hosts, 3 if len(hosts) >= 3 else 1):
                    cdh.create_service_role(service, rcg.roleType, host)

        # init_zookeeper not required as the API performs this when adding Zookeeper
        # check.status_for_command("Waiting for ZooKeeper Service to initialize", service.init_zookeeper())
        check.status_for_command("Starting ZooKeeper Service", service.start())


def setup_hdfs():
    """
    HDFS
    > Checking if the name directories of the NameNode are empty. Formatting HDFS only if empty.
    Starting HDFS Service
    > Creating HDFS /tmp directory
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "HDFS"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "hdfs"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service_config = cdh.dependencies_for(service)
        service_config.update({"dfs_replication": "3",
                               "dfs_block_local_path_access_user": "impala,hbase,mapred,spark"})
        service.update_config(service_config)

        # Role Config Group equivalent to Service Default Group
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "NAMENODE":
                # hdfs-NAMENODE - Default Group
                rcg.update_config({"dfs_name_dir_list": "/data/dfs/nn",
                                   "namenode_java_heapsize": "1073741824",
                                   "dfs_namenode_handler_count": "30",
                                   "dfs_namenode_service_handler_count": "30",
                                   "dfs_namenode_servicerpc_address": "8022"})
                cdh.create_service_role(service, rcg.roleType, [x for x in hosts if x.id == 0][0])
            if rcg.roleType == "SECONDARYNAMENODE":
                # hdfs-SECONDARYNAMENODE - Default Group
                rcg.update_config({"fs_checkpoint_dir_list": "/data/dfs/snn",
                                   "secondary_namenode_java_heapsize": "1073741824"})
                # choose a host that is not the NameNode; this makes it easier to enable HDFS-HA later
                secondary_nn = random.choice([host for host in hosts if host.hostId not in
                                              [x.hostRef.hostId for x in service.get_roles_by_type("NAMENODE")]]) \
                    if len(hosts) > 1 else random.choice(hosts)

                cdh.create_service_role(service, rcg.roleType, secondary_nn)

            if rcg.roleType == "DATANODE":
                # hdfs-DATANODE - Default Group
                rcg.update_config({"datanode_java_heapsize": "127926272",
                                   "dfs_data_dir_list": "/data/dfs/dn",
                                   "dfs_datanode_data_dir_perm": "755",
                                   "dfs_datanode_du_reserved": "3218866585",
                                   "dfs_datanode_failed_volumes_tolerated": "0",
                                   "dfs_datanode_max_locked_memory": "316669952", })
            if rcg.roleType == "BALANCER":
                # hdfs-BALANCER - Default Group
                rcg.update_config({"balancer_java_heapsize": "492830720"})
            if rcg.roleType == "GATEWAY":
                # hdfs-GATEWAY - Default Group
                rcg.update_config({"dfs_client_use_trash": True})

        for role_type in ['DATANODE', 'GATEWAY']:
            for host in manager.get_hosts(include_cm_host=(role_type == 'GATEWAY')):
                cdh.create_service_role(service, role_type, host)

        # Example of deploy_client_config. Recommended to Deploy Cluster wide client config.
        # cdh.deploy_client_config_for(service)

        nn_role_type = service.get_roles_by_type("NAMENODE")[0]
        commands = service.format_hdfs(nn_role_type.name)
        for cmd in commands:
            check.status_for_command("Format NameNode", cmd)

        check.status_for_command("Starting HDFS.", service.start())
        check.status_for_command("Creating HDFS /tmp directory", service.create_hdfs_tmp())


def setup_hbase():
    """
    HBase
    > Creating HBase root directory
    Starting HBase Service
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "HBASE"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "hbase"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service_config = {"hbase_enable_indexing": True, "hbase_enable_replication": True,
                          "zookeeper_session_timeout": "30000"}
        service_config.update(cdh.dependencies_for(service))
        service.update_config(service_config)

        # Role Config Group equivalent to Service Default Group
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "MASTER":
                rcg.update_config({"hbase_master_java_heapsize": "492830720"})
            if rcg.roleType == "REGIONSERVER":
                rcg.update_config({"hbase_regionserver_java_heapsize": "365953024",
                                   "hbase_regionserver_java_opts": "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC "
                                                                   "-XX:-CMSConcurrentMTEnabled "
                                                                   "-XX:CMSInitiatingOccupancyFraction=70 "
                                                                   "-XX:+CMSParallelRemarkEnabled -verbose:gc "
                                                                   "-XX:+PrintGCDetails -XX:+PrintGCDateStamps"})

        for role_type in ['MASTER', 'HBASETHRIFTSERVER', 'HBASERESTSERVER']:
            cdh.create_service_role(service, role_type, random.choice(hosts))

        for role_type in ['GATEWAY', 'REGIONSERVER']:
            for host in manager.get_hosts(include_cm_host=(role_type == 'GATEWAY')):
                cdh.create_service_role(service, role_type, host)

        # Example of deploy_client_config. Recommended to Deploy Cluster wide client config.
        # cdh.deploy_client_config_for(service)

        check.status_for_command("Creating HBase root directory", service.create_hbase_root())
        # This service is started later on
        # check.status_for_command("Starting HBase Service", service.start())


def setup_solr():
    """
    Solr
    > Initializing Solr in ZooKeeper
    > Creating HDFS home directory for Solr
    Starting Solr Service
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "SOLR"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "solr"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        # Role Config Group equivalent to Service Default Group
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "SOLR_SERVER":
                cdh.create_service_role(service, rcg.roleType, [x for x in hosts if x.id == 0][0])
            if rcg.roleType == "GATEWAY":
                for host in manager.get_hosts(include_cm_host=True):
                    cdh.create_service_role(service, rcg.roleType, host)

        # Example of deploy_client_config. Recommended to Deploy Cluster wide client config.
348 | # cdh.deploy_client_config_for(service) 349 | 350 | # check.status_for_command("Initializing Solr in ZooKeeper", service._cmd('initSolr')) 351 | # check.status_for_command("Creating HDFS home directory for Solr", service._cmd('createSolrHdfsHomeDir')) 352 | check.status_for_command("Initializing Solr in ZooKeeper", service.init_solr()) 353 | check.status_for_command("Creating HDFS home directory for Solr", 354 | service.create_solr_hdfs_home_dir()) 355 | # This service is started later on 356 | # check.status_for_command("Starting Solr Service", service.start()) 357 | 358 | 359 | def setup_ks_indexer(): 360 | """ 361 | KS_INDEXER 362 | :return: 363 | """ 364 | api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version) 365 | cluster = api.get_cluster(cmx.cluster_name) 366 | service_type = "KS_INDEXER" 367 | if cdh.get_service_type(service_type) is None: 368 | print "> %s" % service_type 369 | service_name = "ks_indexer" 370 | print "Create %s service" % service_name 371 | cluster.create_service(service_name, service_type) 372 | service = cluster.get_service(service_name) 373 | hosts = manager.get_hosts() 374 | 375 | # Service-Wide 376 | service.update_config(cdh.dependencies_for(service)) 377 | 378 | # Pick 1 host to deploy Lily HBase Indexer Default Group 379 | cdh.create_service_role(service, "HBASE_INDEXER", random.choice(hosts)) 380 | 381 | # HBase Service-Wide configuration 382 | hbase = cdh.get_service_type('HBASE') 383 | hbase.stop() 384 | hbase.update_config({"hbase_enable_indexing": True, "hbase_enable_replication": True}) 385 | hbase.start() 386 | 387 | # This service is started later on 388 | # check.status_for_command("Starting Lily HBase Indexer Service", service.start()) 389 | 390 | 391 | def setup_spark(): 392 | """ 393 | Spark 394 | > Execute command CreateSparkUserDirCommand on service Spark 395 | > Execute command CreateSparkHistoryDirCommand on service Spark 396 | > Execute command 
    SparkUploadJarServiceCommand on service Spark
    Starting Spark Service
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "SPARK"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "spark"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        cdh.create_service_role(service, "SPARK_MASTER", [x for x in hosts if x.id == 0][0])
        cdh.create_service_role(service, "SPARK_HISTORY_SERVER", random.choice(hosts))

        for role_type in ['GATEWAY', 'SPARK_WORKER']:
            for host in manager.get_hosts(include_cm_host=(role_type == 'GATEWAY')):
                cdh.create_service_role(service, role_type, host)

        # Example of deploy_client_config. Recommended to Deploy Cluster wide client config.
        # cdh.deploy_client_config_for(service)

        check.status_for_command("Execute command CreateSparkUserDirCommand on service Spark",
                                 service.service_command_by_name('CreateSparkUserDirCommand'))
        check.status_for_command("Execute command CreateSparkHistoryDirCommand on service Spark",
                                 service.service_command_by_name('CreateSparkHistoryDirCommand'))
        check.status_for_command("Execute command SparkUploadJarServiceCommand on service Spark",
                                 service.service_command_by_name('SparkUploadJarServiceCommand'))

        # This service is started later on
        # check.status_for_command("Starting Spark Service", service.start())


def setup_spark_on_yarn():
    """
    Spark (on YARN)
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "SPARK_ON_YARN"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "spark_on_yarn"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "SPARK_YARN_HISTORY_SERVER":
                rcg.update_config({"history_server_max_heapsize": "153092096"})

        cdh.create_service_role(service, "SPARK_YARN_HISTORY_SERVER", random.choice(hosts))

        for host in manager.get_hosts(include_cm_host=True):
            cdh.create_service_role(service, "GATEWAY", host)

        # Example of deploy_client_config. Recommended to Deploy Cluster wide client config.
        # cdh.deploy_client_config_for(service)

        check.status_for_command("Execute command CreateSparkUserDirCommand on service Spark",
                                 service.service_command_by_name('CreateSparkUserDirCommand'))
        check.status_for_command("Execute command CreateSparkHistoryDirCommand on service Spark",
                                 service.service_command_by_name('CreateSparkHistoryDirCommand'))
        check.status_for_command("Execute command SparkUploadJarServiceCommand on service Spark",
                                 service.service_command_by_name('SparkUploadJarServiceCommand'))

        # This service is started later on
        # check.status_for_command("Starting Spark Service", service.start())


def setup_yarn():
    """
    Yarn
    > Creating MR2 job history directory
    > Creating NodeManager remote application log directory
    Starting YARN (MR2 Included) Service
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "YARN"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "yarn"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "RESOURCEMANAGER":
                # yarn-RESOURCEMANAGER - Default Group
                rcg.update_config({"resource_manager_java_heapsize": "492830720",
                                   "yarn_scheduler_maximum_allocation_mb": "2568",
                                   "yarn_scheduler_maximum_allocation_vcores": "2"})
                cdh.create_service_role(service, rcg.roleType, [x for x in hosts if x.id == 0][0])
            if rcg.roleType == "JOBHISTORY":
                # yarn-JOBHISTORY - Default Group
                rcg.update_config({"mr2_jobhistory_java_heapsize": "492830720"})
                cdh.create_service_role(service, rcg.roleType, random.choice(hosts))
            if rcg.roleType == "NODEMANAGER":
                # yarn-NODEMANAGER - Default Group
                rcg.update_config({"yarn_nodemanager_heartbeat_interval_ms": "100",
                                   "yarn_nodemanager_local_dirs": "/data/yarn/nm",
                                   "yarn_nodemanager_resource_cpu_vcores": "2",
                                   "yarn_nodemanager_resource_memory_mb": "2568",
                                   "node_manager_java_heapsize": "127926272"})
                for host in hosts:
                    cdh.create_service_role(service, rcg.roleType, host)
            if rcg.roleType == "GATEWAY":
                # yarn-GATEWAY - Default Group
                rcg.update_config({"mapred_submit_replication": "1",
                                   "mapred_reduce_tasks": "3"})
                for host in manager.get_hosts(include_cm_host=True):
                    cdh.create_service_role(service, rcg.roleType, host)

        # Example of deploy_client_config. Recommended to Deploy Cluster wide client config.
        # cdh.deploy_client_config_for(service)

        check.status_for_command("Creating MR2 job history directory", service.create_yarn_job_history_dir())
        check.status_for_command("Creating NodeManager remote application log directory",
                                 service.create_yarn_node_manager_remote_app_log_dir())
        # This service is started later on
        # check.status_for_command("Starting YARN (MR2 Included) Service", service.start())


def setup_mapreduce():
    """
    MapReduce
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "MAPREDUCE"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "mapreduce"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "JOBTRACKER":
                # mapreduce-JOBTRACKER - Default Group
                rcg.update_config({"jobtracker_mapred_local_dir_list": "/data/mapred/jt",
                                   "jobtracker_java_heapsize": "492830720",
                                   "mapred_job_tracker_handler_count": "22"})
                cdh.create_service_role(service, rcg.roleType, [x for x in hosts if x.id == 0][0])
            if rcg.roleType == "TASKTRACKER":
                # mapreduce-TASKTRACKER - Default Group
                rcg.update_config({"tasktracker_mapred_local_dir_list": "/data/mapred/local",
                                   "mapred_tasktracker_map_tasks_maximum": "1",
                                   "mapred_tasktracker_reduce_tasks_maximum": "1",
                                   "task_tracker_java_heapsize": "127926272"})
            if rcg.roleType == "GATEWAY":
                # mapreduce-GATEWAY - Default Group
                rcg.update_config({"mapred_reduce_tasks": "1",
"mapred_submit_replication": "1"}) 570 | 571 | for role_type in ['GATEWAY', 'TASKTRACKER']: 572 | for host in manager.get_hosts(include_cm_host=(role_type == 'GATEWAY')): 573 | cdh.create_service_role(service, role_type, host) 574 | 575 | # Example of deploy_client_config. Recommended to Deploy Cluster wide client config. 576 | # cdh.deploy_client_config_for(service) 577 | 578 | # This service is started later on 579 | # check.status_for_command("Starting MapReduce Service", service.start()) 580 | 581 | 582 | def setup_hive(): 583 | """ 584 | Hive 585 | > Creating Hive Metastore Database 586 | > Creating Hive Metastore Database Tables 587 | > Creating Hive user directory 588 | > Creating Hive warehouse directory 589 | Starting Hive Service 590 | :return: 591 | """ 592 | api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version) 593 | cluster = api.get_cluster(cmx.cluster_name) 594 | service_type = "HIVE" 595 | if cdh.get_service_type(service_type) is None: 596 | print "> %s" % service_type 597 | service_name = "hive" 598 | print "Create %s service" % service_name 599 | cluster.create_service(service_name, service_type) 600 | service = cluster.get_service(service_name) 601 | hosts = manager.get_hosts() 602 | 603 | # Service-Wide 604 | # hive_metastore_database_host: Assuming embedded DB is running from where embedded-db is located. 
        service_config = {"hive_metastore_database_host": socket.getfqdn(cmx.cm_server),
                          "hive_metastore_database_user": "hive",
                          "hive_metastore_database_name": "hive",
                          "hive_metastore_database_password": "cloudera",
                          "hive_metastore_database_port": "7432",
                          "hive_metastore_database_type": "postgresql"}
        service_config.update(cdh.dependencies_for(service))
        service.update_config(service_config)

        # Role Config Group equivalent to Service Default Group
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "HIVEMETASTORE":
                rcg.update_config({"hive_metastore_java_heapsize": "492830720"})
            if rcg.roleType == "HIVESERVER2":
                rcg.update_config({"hiveserver2_java_heapsize": "144703488"})

        for role_type in ['HIVEMETASTORE', 'HIVESERVER2']:
            cdh.create_service_role(service, role_type, random.choice(hosts))

        for host in manager.get_hosts(include_cm_host=True):
            cdh.create_service_role(service, "GATEWAY", host)

        # Example of deploy_client_config. Recommended to Deploy Cluster wide client config.
        # cdh.deploy_client_config_for(service)

        check.status_for_command("Creating Hive Metastore Database Tables", service.create_hive_metastore_tables())
        check.status_for_command("Creating Hive user directory", service.create_hive_userdir())
        check.status_for_command("Creating Hive warehouse directory", service.create_hive_warehouse())
        # This service is started later on
        # check.status_for_command("Starting Hive Service", service.start())


def setup_sqoop():
    """
    Sqoop 2
    > Creating Sqoop 2 user directory
    > Creating Sqoop 2 Database
    Starting Sqoop 2 Service
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "SQOOP"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "sqoop"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        # Role Config Group equivalent to Service Default Group
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "SQOOP_SERVER":
                rcg.update_config({"sqoop_java_heapsize": "492830720"})

        cdh.create_service_role(service, "SQOOP_SERVER", [x for x in hosts if x.id == 0][0])

        check.status_for_command("Creating Sqoop 2 user directory", service.create_sqoop_user_dir())
        # Create the Sqoop 2 database only when the CDH version is 5.3.0 or later
        vc = lambda v: tuple(map(int, (v.split("."))))
        if vc(cmx.parcel[0]['version'].split('-')[0]) >= vc("5.3.0"):
            check.status_for_command("Creating Sqoop 2 Database", service._cmd('SqoopCreateDatabase'))
        # This service is started later on
        # check.status_for_command("Starting Sqoop 2 Service", service.start())


def setup_sqoop_client():
    """
    Sqoop Client
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "SQOOP_CLIENT"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "sqoop_client"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        # hosts = get_cluster_hosts()

        # Service-Wide
        service.update_config({})

        for host in manager.get_hosts(include_cm_host=True):
            cdh.create_service_role(service, "GATEWAY", host)

        # Example of deploy_client_config. Recommended to Deploy Cluster wide client config.
        # cdh.deploy_client_config_for(service)


def setup_impala(enable_llama=False):
    """
    Impala
    > Creating Impala user directory
    Starting Impala Service
    :param enable_llama:
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "IMPALA"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "impala"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        # Role Config Group equivalent to Service Default Group
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "IMPALAD":
                rcg.update_config({"impalad_memory_limit": "618659840",
                                   "enable_audit_event_log": True,
                                   "scratch_dirs": "/data/impala/impalad"})

        for role_type in ['CATALOGSERVER', 'STATESTORE']:
            cdh.create_service_role(service, role_type, random.choice(hosts))

        # Install ImpalaD
        for host in hosts:
            cdh.create_service_role(service, "IMPALAD", host)

        check.status_for_command("Creating Impala user directory", service.create_impala_user_dir())
        # Impala will be started/stopped when we enable_llama_rm
        # This service is started later on
        # check.status_for_command("Starting Impala Service", service.start())

        # Enable YARN and Impala Integrated Resource Management
        # http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/admin_llama.html
        yarn = cdh.get_service_type('YARN')
        if yarn is not None and enable_llama is True:
            # enable cgroup-based resource management for all hosts with NodeManager roles.
            cm = api.get_cloudera_manager()
            cm.update_all_hosts_config({"rm_enabled": True})
            yarn.update_config({"yarn_service_cgroups": True, "yarn_service_lce_always": True})
            role_group = yarn.get_role_config_group("%s-RESOURCEMANAGER-BASE" % yarn.name)
            role_group.update_config({"yarn_scheduler_minimum_allocation_mb": 0,
                                      "yarn_scheduler_minimum_allocation_vcores": 0})
            check.status_for_command("Enable YARN and Impala Integrated Resource Management",
                                     service.enable_llama_rm(random.choice(hosts).hostId))


def setup_oozie():
    """
    Oozie
    > Creating Oozie database
    > Installing Oozie ShareLib in HDFS
    Starting Oozie Service
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "OOZIE"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "oozie"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        # Role Config Group equivalent to Service Default Group
        for rcg in service.get_all_role_config_groups():
            if rcg.roleType == "OOZIE_SERVER":
                rcg.update_config({"oozie_java_heapsize": "492830720"})
                cdh.create_service_role(service, rcg.roleType, [x for x in hosts if x.id == 0][0])

        check.status_for_command("Creating Oozie database", service.create_oozie_db())
        check.status_for_command("Installing Oozie ShareLib in HDFS", service.install_oozie_sharelib())
        # This service is started later on
        # check.status_for_command("Starting Oozie Service", service.start())


def
setup_oozie_ha(load_balancer_host_port):
    """
    Setup oozie-ha
    :param load_balancer_host_port: value for the oozie_load_balancer service setting
    :return:
    """
    # TODO: test setup_oozie_ha
    print "> Setup OOZIE-HA"
    service = cdh.get_service_type('OOZIE')
    # pre-requisites
    service.update_config({"oozie_load_balancer": load_balancer_host_port})
    rcg = service.get_role_config_group("{0}-OOZIE_SERVER-BASE".format(service.name))
    # CM5.4/OPSAPS-25778
    rcg.update_config({"oozie_plugins_list": "org.apache.oozie.service.ZKLocksService,"
                                             "org.apache.oozie.service.ZKXLogStreamingService,"
                                             "org.apache.oozie.service.ZKJobsConcurrencyService,"
                                             "org.apache.oozie.service.ZKUUIDService"})

    if len(service.get_roles_by_type("OOZIE_SERVER")) != 2:
        # Choose random node for the second Oozie Server
        hosts = manager.get_hosts()
        oozie_host_id = service.get_roles_by_type("OOZIE_SERVER")[0].hostRef.hostId
        # Host IDs are strings, so compare by value (!=) rather than identity (is not)
        rnd_host = random.choice([x.hostId for x in hosts if x.hostId != oozie_host_id])

        # rnd_host is already a hostId; enable_oozie_ha takes a list of host IDs
        cmd = service.enable_oozie_ha([rnd_host])
        check.status_for_command("Enable OOZIE-HA - [ http://%s:7180/cmf/command/%s/details ]" %
                                 (socket.getfqdn(cmx.cm_server), cmd.id), cmd)


def setup_hue():
    """
    Hue
    Starting Hue Service
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "HUE"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "hue"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        # Role Config Group equivalent to Service Default Group
        for rcg in
        service.get_all_role_config_groups():
            if rcg.roleType == "HUE_SERVER":
                rcg.update_config({})
                cdh.create_service_role(service, "HUE_SERVER", [x for x in hosts if x.id == 0][0])
        # This service is started later on
        # check.status_for_command("Starting Hue Service", service.start())


def setup_flume():
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "FLUME"
    if cdh.get_service_type(service_type) is None:
        service_name = "flume"
        cluster.create_service(service_name.lower(), service_type)
        service = cluster.get_service(service_name)

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))
        hosts = manager.get_hosts()
        cdh.create_service_role(service, "AGENT", [x for x in hosts if x.id == 0][0])
        # This service is started later on
        # check.status_for_command("Starting Flume Agent", service.start())


def setup_accumulo():
    """
    Accumulo 1.6
    > Deploy Client Configuration
    > Create Accumulo Home Dir on service Accumulo 1.6
    > Create Accumulo User Dir on service Accumulo 1.6
    > Initialize Accumulo on service Accumulo 1.6
    Start Accumulo 1.6
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "ACCUMULO16"
    if cdh.get_service_type(service_type) is None:
        print "> %s" % service_type
        service_name = "accumulo16"
        print "Create %s service" % service_name
        cluster.create_service(service_name, service_type)
        service = cluster.get_service(service_name)
        hosts = manager.get_hosts()

        # Deploy ACCUMULO16 Parcel
        parcel = [x for x in cluster.get_all_parcels() if
                  x.product == 'ACCUMULO' and
                  'cdh5' in x.version][0]

        accumulo_parcel = {'product': str(parcel.product.upper()), 'version': str(parcel.version).lower()}
        print "> Parcel action for parcel: [ %s-%s ]" % (parcel.product, parcel.version)
        cluster_parcel = cluster.get_parcel(product=parcel.product, version=parcel.version)
        if "ACTIVATED" not in cluster_parcel.stage:
            parcel_action(parcel_item=accumulo_parcel, function="start_removal_of_distribution",
                          expected_stage=['DOWNLOADED', 'AVAILABLE_REMOTELY', 'ACTIVATING'],
                          action_description="Un-Distribute Parcel")
            parcel_action(parcel_item=accumulo_parcel, function="start_download",
                          expected_stage=['DOWNLOADED'], action_description="Download Parcel")
            parcel_action(parcel_item=accumulo_parcel, function="start_distribution", expected_stage=['DISTRIBUTED'],
                          action_description="Distribute Parcel")
            parcel_action(parcel_item=accumulo_parcel, function="activate", expected_stage=['ACTIVATED'],
                          action_description="Activate Parcel")

        # Service-Wide
        service.update_config(cdh.dependencies_for(service))

        # Create Accumulo roles
        for role_type in ['ACCUMULO16_MASTER', 'ACCUMULO16_TRACER', 'ACCUMULO16_GC',
                          'ACCUMULO16_TSERVER', 'ACCUMULO16_MONITOR']:
            cdh.create_service_role(service, role_type, random.choice(hosts))

        # Create Accumulo gateway roles
        for host in manager.get_hosts(include_cm_host=True):
            cdh.create_service_role(service, 'GATEWAY', host)

        print "Deploy Client Configuration"
        cluster.deploy_client_config()
        check.status_for_command("Execute command Create Accumulo Home Dir on service Accumulo 1.6",
                                 service.service_command_by_name('CreateHdfsDirCommand'))
        check.status_for_command("Execute command Create Accumulo User Dir on service Accumulo 1.6",
                                 service.service_command_by_name('CreateAccumuloUserDirCommand'))
        check.status_for_command("Execute command Initialize Accumulo on service Accumulo 1.6",
                                 service.service_command_by_name('AccumuloInitServiceCommand'))
        # check.status_for_command("Starting Accumulo Service", service.start())


def setup_hdfs_ha():
    """
    Setup hdfs-ha
    :return:
    """
    try:
        print "> Setup HDFS-HA"
        hdfs = cdh.get_service_type('HDFS')
        zookeeper = cdh.get_service_type('ZOOKEEPER')
        # Requirement Hive/Hue
        hive = cdh.get_service_type('HIVE')
        hue = cdh.get_service_type('HUE')
        hosts = manager.get_hosts()

        if len(hdfs.get_roles_by_type("NAMENODE")) != 2:
            # QJM requires 3 nodes; jn is a list of hostId strings
            jn = random.sample([x.hostRef.hostId for x in hdfs.get_roles_by_type("DATANODE")], 3)
            # get NAMENODE and SECONDARYNAMENODE hostId
            nn_host_id = hdfs.get_roles_by_type("NAMENODE")[0].hostRef.hostId
            sndnn_host_id = hdfs.get_roles_by_type("SECONDARYNAMENODE")[0].hostRef.hostId

            # Occasionally SECONDARYNAMENODE is also installed on the NAMENODE
            if nn_host_id == sndnn_host_id:
                # Pick a JournalNode host that is not the NameNode host
                standby_host_id = random.choice([x for x in jn if x not in [nn_host_id, sndnn_host_id]])
            else:
                standby_host_id = sndnn_host_id

            # hdfs-JOURNALNODE - Default Group
            role_group = hdfs.get_role_config_group("%s-JOURNALNODE-BASE" % hdfs.name)
            role_group.update_config({"dfs_journalnode_edits_dir": "/data/dfs/jn"})

            cmd = hdfs.enable_nn_ha(hdfs.get_roles_by_type("NAMENODE")[0].name, standby_host_id,
                                    "nameservice1", [dict(jnHostId=jn[0]), dict(jnHostId=jn[1]), dict(jnHostId=jn[2])],
                                    zk_service_name=zookeeper.name)
            check.status_for_command("Enable HDFS-HA - [ http://%s:7180/cmf/command/%s/details ]" %
                                     (socket.getfqdn(cmx.cm_server), cmd.id), cmd)

            # hdfs-HTTPFS
            cdh.create_service_role(hdfs, "HTTPFS", [x for x in hosts if x.id == 0][0])
            # Configure HUE service dependencies
            cdh(*['HDFS', 'HIVE', 'HUE', 'ZOOKEEPER']).stop()
            if hue is not None:
                hue.update_config(cdh.dependencies_for(hue))
            if hive is not None:
                check.status_for_command("Update Hive Metastore NameNodes", hive.update_metastore_namenodes())
            cdh(*['ZOOKEEPER', 'HDFS', 'HIVE', 'HUE']).start()

    except ApiException as err:
        print " ERROR: %s" % err.message


def setup_yarn_ha():
    """
    Setup yarn-ha
    :return:
    """
    print "> Setup YARN-HA"
    yarn = cdh.get_service_type('YARN')
    zookeeper = cdh.get_service_type('ZOOKEEPER')
    # hosts = api.get_all_hosts()
    if len(yarn.get_roles_by_type("RESOURCEMANAGER")) != 2:
        # Choose random node for standby RM
        rm = random.choice([nm for nm in yarn.get_roles_by_type("NODEMANAGER")
                            if nm.hostRef.hostId != yarn.get_roles_by_type("RESOURCEMANAGER")[0].hostRef.hostId])
        cmd = yarn.enable_rm_ha(rm.hostRef.hostId, zookeeper.name)
        check.status_for_command("Enable YARN-HA - [ http://%s:7180/cmf/command/%s/details ]" %
                                 (socket.getfqdn(cmx.cm_server), cmd.id), cmd)


def enable_kerberos():
    """
    Enable Kerberos
    > Import KDC Account Manager Credentials
    > Generate Credentials
    > Stop cluster
    > Stop Cloudera Management Services
    > Configure all services to use Kerberos
    > Wait for credentials to be generated
    > Deploy client configuration
    > Start Cloudera Management Services
    > Start cluster
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cm = api.get_cloudera_manager()
    cluster = api.get_cluster(cmx.cluster_name)
    print "> Setup Kerberos"
    cm.update_config({"KDC_HOST":
                      cmx.kerberos['kdc_host'],
                      "SECURITY_REALM": cmx.kerberos['security_realm']})

    if cmx.api_version >= 11:
        check.status_for_command("Configure Kerberos for Cluster",
                                 cluster.configure_for_kerberos(datanode_transceiver_port=1004,
                                                                datanode_web_port=1006))
        check.status_for_command("Stop Cloudera Management Services", cm.get_service().stop())
        # check.status_for_command("Wait for credentials to be generated", cm.generate_credentials())
        check.status_for_command("Start Cloudera Management Services", cm.get_service().start())
    else:
        hdfs = cdh.get_service_type('HDFS')
        zookeeper = cdh.get_service_type('ZOOKEEPER')
        hue = cdh.get_service_type('HUE')
        hosts = manager.get_hosts()

        check.status_for_command("Import Admin Credentials",
                                 cm.import_admin_credentials(username=str(cmx.kerberos['kdc_user']),
                                                             password=str(cmx.kerberos['kdc_password'])))
        check.status_for_command("Wait for credentials to be generated", cm.generate_credentials())
        time.sleep(10)
        check.status_for_command("Stop cluster: %s" % cmx.cluster_name, cluster.stop())
        check.status_for_command("Stop Cloudera Management Services", cm.get_service().stop())

        # Configure all services to use MIT Kerberos
        # HDFS Service-Wide
        hdfs.update_config({"hadoop_security_authentication": "kerberos", "hadoop_security_authorization": True})

        # hdfs-DATANODE-BASE - Default Group
        role_group = hdfs.get_role_config_group("%s-DATANODE-BASE" % hdfs.name)
        role_group.update_config({"dfs_datanode_http_port": "1006", "dfs_datanode_port": "1004",
                                  "dfs_datanode_data_dir_perm": "700"})

        # Zookeeper Service-Wide
        zookeeper.update_config({"enableSecurity": True})
        cdh.create_service_role(hue, "KT_RENEWER", [x for x in hosts if x.id == 0][0])

        # Example deploying cluster wide Client Config
        check.status_for_command("Deploy client config for %s" % cmx.cluster_name, cluster.deploy_client_config())
        check.status_for_command("Start Cloudera Management Services", cm.get_service().start())
        # check.status_for_command("Start cluster: %s" % cmx.cluster_name, cluster.start())


def disable_kerberos():
    """
    Disable Kerberos
    > Stop cluster
    > Stop Cloudera Management Services
    > Configure all services to not use Kerberos
    > Deploy client configuration
    > Start Cloudera Management Services
    > Start cluster
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cm = api.get_cloudera_manager()
    cluster = api.get_cluster(cmx.cluster_name)
    print "> Disable Kerberos"
    cm.update_config({"KDC_HOST": None, "SECURITY_REALM": None})
    hdfs = cdh.get_service_type('HDFS')
    zookeeper = cdh.get_service_type('ZOOKEEPER')
    hue = cdh.get_service_type('HUE')

    check.status_for_command("Stop cluster: %s" % cmx.cluster_name, cluster.stop())
    check.status_for_command("Stop Cloudera Management Services", cm.get_service().stop())

    # Configure all services to use simple authentication
    # HDFS Service-Wide
    hdfs.update_config({"hadoop_security_authentication": "simple", "hadoop_security_authorization": False})

    # hdfs-DATANODE-BASE - Default Group
    role_group = hdfs.get_role_config_group("%s-DATANODE-BASE" % hdfs.name)
    role_group.update_config({"dfs_datanode_http_port": "50075", "dfs_datanode_port": "50010",
                              "dfs_datanode_data_dir_perm": "700"})

    # Zookeeper Service-Wide
    zookeeper.update_config({"enableSecurity": False})
    # Look up the KT_RENEWER role created by enable_kerberos(), not the HUE_SERVER role
    kt_renewer_role = hue.get_roles_by_type("KT_RENEWER")[0].name
    check.status_for_command("Delete KT_RENEWER role: %s" % kt_renewer_role,
                             hue.delete_role(kt_renewer_role))

    # Example deploying cluster wide Client Config
    check.status_for_command("Deploy client config for %s" % cmx.cluster_name, cluster.deploy_client_config())
    check.status_for_command("Start Cloudera Management Services", cm.get_service().start())
    check.status_for_command("Start cluster: %s" % cmx.cluster_name, cluster.start())


def setup_sentry():
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    service_type = "SENTRY"
    if cdh.get_service_type(service_type) is None:
        service_name = "sentry"
        cluster.create_service(service_name.lower(), service_type)
        service = cluster.get_service(service_name)

        # Service-Wide
        # sentry_server_database_host: Assuming embedded DB is running from where embedded-db is located.
        service_config = {"sentry_server_database_host": socket.getfqdn(cmx.cm_server),
                          "sentry_server_database_user": "sentry",
                          "sentry_server_database_name": "sentry",
                          "sentry_server_database_password": "cloudera",
                          "sentry_server_database_port": "7432",
                          "sentry_server_database_type": "postgresql"}

        service_config.update(cdh.dependencies_for(service))
        service.update_config(service_config)
        hosts = manager.get_hosts()

        cdh.create_service_role(service, "SENTRY_SERVER", random.choice(hosts))
        check.status_for_command("Creating Sentry Database Tables", service.create_sentry_database_tables())

        # Update configuration for Hive service
        hive = cdh.get_service_type('HIVE')
        hive.update_config(cdh.dependencies_for(hive))

        # Disable HiveServer2 Impersonation - hive-HIVESERVER2-BASE - Default Group
        role_group = hive.get_role_config_group("%s-HIVESERVER2-BASE" % hive.name)
        role_group.update_config({"hiveserver2_enable_impersonation": False})

        # This service is started later on
        # check.status_for_command("Starting Sentry Server", service.start())


def setup_easy():
    """
    An example using auto_assign_roles() and auto_configure()
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    print "> Easy setup for cluster: %s" % cmx.cluster_name
    # Do not install these services
    do_not_install = ['KEYTRUSTEE', 'KMS', 'KS_INDEXER', 'ISILON', 'FLUME', 'MAPREDUCE', 'ACCUMULO',
                      'ACCUMULO16', 'SPARK_ON_YARN', 'SPARK', 'SOLR', 'SENTRY']
    service_types = list(set(cluster.get_service_types()) - set(do_not_install))
    for service_type in service_types:
        cluster.create_service(name=service_type.lower(), service_type=service_type.upper())

    cluster.auto_assign_roles()
    cluster.auto_configure()

    # Hive Metastore DB and dependencies ['YARN', 'ZOOKEEPER']
    service = cdh.get_service_type('HIVE')
    service_config = {"hive_metastore_database_host": socket.getfqdn(cmx.cm_server),
                      "hive_metastore_database_user": "hive",
                      "hive_metastore_database_name": "hive",
                      "hive_metastore_database_password": "hive",
                      "hive_metastore_database_port": "7432",
                      "hive_metastore_database_type": "postgresql"}
    service_config.update(cdh.dependencies_for(service))
    service.update_config(service_config)
    check.status_for_command("Executing first run command. This might take a while.", cluster.first_run())


def teardown(keep_cluster=True):
    """
    Teardown the Cluster
    :param keep_cluster: when False, the cluster itself is deleted after its services are removed
    :return:
    """
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    try:
        cluster = api.get_cluster(cmx.cluster_name)
        service_list = cluster.get_all_services()
        print "> Teardown Cluster: %s Services and keep_cluster: %s" % (cmx.cluster_name, keep_cluster)
        check.status_for_command("Stop %s" % cmx.cluster_name, cluster.stop())

        # Walk the services in reverse order
        for service in service_list[::-1]:
            try:
                check.status_for_command("Stop Service %s" % service.name, service.stop())
            except ApiException as err:
                print " ERROR: %s" % err.message

            # Unset service dependencies and configuration settings
            # NOTE: service_config is only collected here; it is never applied with update_config()
            service_config = {}
            for k, v in service.get_config()[0].items():
                service_config[k] = None

        for service in service_list[::-1]:
            # Remove service roles
            print "Processing service %s" % service.name
            for role in service.get_all_roles():
                print " Delete role %s" % role.name
                service.delete_role(role.name)

            cluster.delete_service(service.name)

        print "Deactivate and Un-Distribute CDH Parcel and GPL Extras Parcel"
        for cdh_parcel in cluster.get_all_parcels():
            if cdh_parcel.stage == 'ACTIVATED':
                print "> Parcel action for parcel: [ %s-%s ]" % (cdh_parcel.product, cdh_parcel.version)
                parcel_action(parcel_item={"product": cdh_parcel.product, "version": cdh_parcel.version},
                              function="deactivate", expected_stage=['DISTRIBUTED'],
                              action_description="Deactivate Parcel")
                parcel_action(parcel_item={"product": cdh_parcel.product, "version": cdh_parcel.version},
                              function="start_removal_of_distribution", expected_stage=['DOWNLOADED'],
                              action_description="Un-Distribute Parcel")

    except ApiException as err:
        print err.message
        exit(1)

    # Delete Management Services
    try:
        mgmt = api.get_cloudera_manager()
        check.status_for_command("Stop Management services", mgmt.get_service().stop().wait())
        mgmt.delete_mgmt_service()
    except ApiException as err:
        print " ERROR: %s" % err.message

    # cluster.remove_all_hosts()
    if not keep_cluster:
        print "Deleting cluster: %s" % cmx.cluster_name
        api.delete_cluster(cmx.cluster_name)


class ManagementActions:
    """
    Example stopping 'ACTIVITYMONITOR', 'REPORTSMANAGER' Management Role
    :param role_list:
    :param action:
    :return:
    """

    def __init__(self, *role_list):
        self._role_list = role_list
        self._api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                                version=cmx.api_version)
        self._cm = self._api.get_cloudera_manager()
        try:
            self._service = self._cm.get_service()
        except ApiException:
            self._service = self._cm.create_mgmt_service(ApiServiceSetupInfo())
        self._role_types = [x.type for x in self._service.get_all_roles()]

    def stop(self):
        self._role_action('stop_roles')

    def start(self):
        self._role_action('start_roles')

    def restart(self):
        self._role_action('restart_roles')

    def _role_action(self, action):
        # Role states in which each action is applicable
        state = {'start_roles': ['STOPPED'], 'stop_roles': ['STARTED'], 'restart_roles': ['STARTED', 'STOPPED']}
        for mgmt_role in [x for x in self._role_list if x in self._role_types]:
            for role in [x for x in self._service.get_roles_by_type(mgmt_role) if x.roleState in state[action]]:
                for cmd in getattr(self._service, action)(role.name):
                    check.status_for_command("%s role %s" % (action.split("_")[0].upper(), mgmt_role), cmd)

    def setup(self):
        """
        Setup Management Roles
        'ACTIVITYMONITOR', 'ALERTPUBLISHER', 'EVENTSERVER', 'HOSTMONITOR', 'SERVICEMONITOR'
        Requires License: 'NAVIGATOR', 'NAVIGATORMETASERVER', 'REPORTSMANAGER'
        :return:
        """
        print "> Setup Management Services"
        self._cm.update_config({"TSQUERY_STREAMS_LIMIT": 1000})
        hosts = manager.get_hosts(include_cm_host=True)
        # Pick the host whose ipAddress matches cm_server.
        # If no host matches, fall back to the first host given with -w.
        try:
            mgmt_host = [x for x in hosts if x.ipAddress == socket.gethostbyname(cmx.cm_server)][0]
        except IndexError:
            mgmt_host = [x for x in hosts if x.id == 0][0]

        for role_type in [x for x in self._service.get_role_types() if x in self._role_list]:
            try:
                if not [x for x in self._service.get_all_roles() if x.type == role_type]:
                    print "Creating Management Role %s " % role_type
                    role_name = "mgmt-%s-%s" % (role_type, mgmt_host.md5host)
                    for cmd in self._service.create_role(role_name, role_type, mgmt_host.hostId).get_commands():
                        check.status_for_command("Creating %s" % role_name, cmd)
            except ApiException as err:
                print "ERROR: %s " % err.message

        # now configure each role
        for group in [x for x in self._service.get_all_role_config_groups() if x.roleType in self._role_list]:
            if group.roleType == "ACTIVITYMONITOR":
                group.update_config({"firehose_database_host": "%s:7432" % socket.getfqdn(cmx.cm_server),
                                     "firehose_database_user": "amon",
                                     "firehose_database_password": cmx.amon_password,
                                     "firehose_database_type": "postgresql",
                                     "firehose_database_name": "amon",
                                     "firehose_heapsize": "615514112"})
            elif group.roleType == "ALERTPUBLISHER":
                group.update_config({})
            elif group.roleType == "EVENTSERVER":
                group.update_config({"event_server_heapsize": "492830720"})
            elif group.roleType == "HOSTMONITOR":
                group.update_config({"firehose_non_java_memory_bytes": "1610612736",
                                     "firehose_heapsize": "268435456"})
            elif group.roleType == "SERVICEMONITOR":
                group.update_config({"firehose_non_java_memory_bytes": "1610612736",
                                     "firehose_heapsize": "268435456"})
            elif group.roleType == "NAVIGATOR" and manager.licensed():
                group.update_config({"navigator_heapsize": "492830720"})
            elif group.roleType == "NAVIGATORMETASERVER" and manager.licensed():
                group.update_config({"navigator_heapsize": "1232076800"})
            elif group.roleType == "NAVIGATORMETADATASERVER" and manager.licensed():
                group.update_config({})
            elif group.roleType == "REPORTSMANAGER" and manager.licensed():
                group.update_config({"headlamp_database_host": "%s:7432" % socket.getfqdn(cmx.cm_server),
                                     "headlamp_database_name": "rman",
                                     "headlamp_database_password": cmx.rman_password,
                                     "headlamp_database_type": "postgresql",
                                     "headlamp_database_user": "rman",
                                     "headlamp_heapsize": "492830720"})

    @classmethod
    def licensed(cls):
        """
        Check if Cluster is licensed
        :return:
        """
        api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                          version=cmx.api_version)
        cm = api.get_cloudera_manager()
        try:
            return bool(cm.get_license().uuid)
        except ApiException as err:
            return "Express" not in err.message

    @classmethod
    def upload_license(cls):
        """
        Upload License file
        :return:
        """
        api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                          version=cmx.api_version)
        cm = api.get_cloudera_manager()
        if cmx.license_file and not manager.licensed():
            print "Upload license"
            with open(cmx.license_file, 'r') as f:
                license_contents = f.read()
                print "Upload CM License: \n %s " % license_contents
                cm.update_license(license_contents)
                # REPORTSMANAGER required after applying license
                manager("REPORTSMANAGER").setup()
                manager("REPORTSMANAGER").start()

    @classmethod
    def begin_trial(cls):
        """
        Begin Trial
        :return:
        """
        api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                          version=cmx.api_version)
        print "> Begin trial"
        if not manager.licensed():
            try:
                api.post("/cm/trial/begin")
                # REPORTSMANAGER required after applying license
                manager("REPORTSMANAGER").setup()
                # manager("REPORTSMANAGER").start()
            except ApiException as err:
                print err.message

    @classmethod
    def get_mgmt_password(cls, role_type):
        """
        Get password for 'ACTIVITYMONITOR', 'REPORTSMANAGER', 'NAVIGATOR'
        :param role_type:
        :return: the password, or False if db.mgmt.properties or the entry for role_type is missing
        """
        contents = []
        mgmt_password = False

        if os.path.isfile('/etc/cloudera-scm-server/db.mgmt.properties'):
            try:
                print "> Reading %s password from /etc/cloudera-scm-server/db.mgmt.properties" % role_type
                with open(os.path.join('/etc/cloudera-scm-server', 'db.mgmt.properties')) as f:
                    contents = f.readlines()

                # role_type is expected to be one of
                # 'ACTIVITYMONITOR', 'REPORTSMANAGER', 'NAVIGATOR'
                if role_type in ['ACTIVITYMONITOR', 'REPORTSMANAGER', 'NAVIGATOR']:
                    idx = "com.cloudera.cmf.%s.db.password=" % role_type
                    matches = [s.rstrip('\n') for s in contents if idx in s]
                    # Keep the default (False) when the file has no entry for this role
                    if matches:
                        mgmt_password = matches[0][matches[0].index(idx) + len(idx):]

            except IOError:
                print "Unable to open file: /etc/cloudera-scm-server/db.mgmt.properties"

        return mgmt_password

    @classmethod
    def get_hosts(cls, include_cm_host=False):
        """
        Return the hosts given with -w, in -w order.
        api.get_all_hosts() returns every host as an ApiHost: hostId hostname ipAddress
        cluster.list_hosts() returns the cluster hosts only as ApiHostRef: hostId
        The setup functions need the cluster hosts with the ApiHost fields plus:
          id: position of the host in the -w list
          md5host: hashlib.md5(host.hostname).hexdigest()
        attributes = {'id': None, 'hostId': None, 'hostname': None, 'md5host': None, 'ipAddress': None, }
        :param include_cm_host: also include the CM server host when it is not part of -w
        :return: a list of hosts
        """
        api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                          version=cmx.api_version)

        w_hosts = set(enumerate(cmx.host_names))
        if include_cm_host and socket.gethostbyname(cmx.cm_server) \
                not in [socket.gethostbyname(x) for x in cmx.host_names]:
            w_hosts.add((len(w_hosts), cmx.cm_server))

        # Fetch the host list once instead of once per -w host
        all_hosts = api.get_all_hosts()
        hosts = []
        # sorted() orders the (index, host) pairs by index, which keeps the -w order
        for idx, host in sorted(w_hosts):
            _host = [x for x in all_hosts if x.ipAddress == socket.gethostbyname(host)][0]
            hosts.append({
                'id': idx,
                'hostId': _host.hostId,
                'hostname': _host.hostname,
                'md5host': hashlib.md5(_host.hostname).hexdigest(),
                'ipAddress': _host.ipAddress,
            })

        return [type('', (), x) for x in hosts]

    @classmethod
    def restart_management(cls):
        """
        Restart Management Services
        :return:
        """
        api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                          version=cmx.api_version)
        mgmt = api.get_cloudera_manager().get_service()

        check.status_for_command("Stop Management services", mgmt.stop())
        check.status_for_command("Start Management services", mgmt.start())


class ServiceActions:
    """
    Example stopping/starting services ['HBASE', 'IMPALA', 'SPARK', 'SOLR']
    :param service_list:
    :param action:
    :return:
    """

    def __init__(self, *service_list):
        self._service_list = service_list
        self._api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                                version=cmx.api_version)
        self._cluster = self._api.get_cluster(cmx.cluster_name)

    def stop(self):
        self._action('stop')

    def start(self):
        self._action('start')

    def restart(self):
        self._action('restart')

    def _action(self, action):
        state = {'start': ['STOPPED'], 'stop': ['STARTED'], 'restart': ['STARTED', 'STOPPED']}
        for services in [x for x in self._cluster.get_all_services()
                         if x.type in self._service_list and x.serviceState in state[action]]:
            check.status_for_command("%s service %s" % (action.upper(), services.type),
                                     getattr(self._cluster.get_service(services.name), action)())

    @classmethod
    def get_service_type(cls, name):
        """
        Returns service based on service type name
        :param name:
        :return:
        """
        api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                          version=cmx.api_version)
        cluster = api.get_cluster(cmx.cluster_name)
        try:
            service = [x for x in cluster.get_all_services() if x.type == name][0]
        except IndexError:
            service = None

        return service

    @classmethod
    def deploy_client_config_for(cls, obj):
        """
        Example deploying GATEWAY Client Config on each host
        Note: only recommended if you need to deploy on a specific hostId.
        Use the cluster.deploy_client_config() for normal use.
        example usage:
        # hostId
        for host in manager.get_hosts(include_cm_host=True):
            cdh.deploy_client_config_for(host.hostId)

        # cdh service
        for service in cluster.get_all_services():
            cdh.deploy_client_config_for(service)

        :param obj: a host.hostId string, or an ApiService
        :return:
        """
        api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                          version=cmx.api_version)
        cluster = api.get_cluster(cmx.cluster_name)
        if isinstance(obj, basestring):
            host = api.get_host(obj)
            for role_ref in [x for x in host.roleRefs if 'GATEWAY' in x.roleName]:
                # GATEWAY is a role type, not a service type, so resolve the owning
                # service from the role reference instead of cdh.get_service_type('GATEWAY')
                service = cluster.get_service(role_ref.serviceName)
                print "Deploying client config for service: %s - host: [%s]" % \
                      (service.type, host.hostname)
                check.status_for_command("Deploy client config for role %s" % role_ref.roleName,
                                         service.deploy_client_config(role_ref.roleName))
        elif isinstance(obj, ApiService):
            for role in obj.get_roles_by_type("GATEWAY"):
                check.status_for_command("Deploy client config for role %s" %
                                         role.name, obj.deploy_client_config(role.name))

    @classmethod
    def create_service_role(cls, service, role_type, host):
        """
        Helper function to create a role
        :return:
        """
        # Shorten the service name for long role types; role_name is cut to 64 characters below
        service_name = service.name[:4] + hashlib.md5(service.name).hexdigest()[:8] \
            if len(role_type) > 24 else service.name

        role_name = "-".join([service_name, role_type, host.md5host])[:64]
        print "Creating role: %s on host: [%s]" % (role_name, host.hostname)
        if not [role for role in service.get_all_roles() if role_name in role.name]:
            for cmd in service.create_role(role_name, role_type, host.hostId).get_commands():
                check.status_for_command("Creating role: %s on host: [%s]" % (role_name, host.hostname), cmd)

    @classmethod
    def restart_cluster(cls):
        """
        Restart Cluster and Cluster wide deploy client config
        :return:
        """
        api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                          version=cmx.api_version)
        cluster = api.get_cluster(cmx.cluster_name)
        print "Restart cluster: %s" % cmx.cluster_name
        check.status_for_command("Stop %s" % cmx.cluster_name, cluster.stop())
        check.status_for_command("Start %s" % cmx.cluster_name, cluster.start())
        # Example deploying cluster wide Client Config
        check.status_for_command("Deploy client config for %s" % cmx.cluster_name, cluster.deploy_client_config())

    @classmethod
    def dependencies_for(cls, service):
        """
        Utility function returns dict of service dependencies
        :return:
        """
        service_config = {}
        config_types = {"hue_webhdfs": ['NAMENODE', 'HTTPFS'], "hdfs_service": "HDFS", "sentry_service": "SENTRY",
                        "zookeeper_service": "ZOOKEEPER", "hbase_service": "HBASE",
                        "hue_hbase_thrift": "HBASETHRIFTSERVER", "solr_service": "SOLR",
                        "hive_service": "HIVE", "sqoop_service": "SQOOP",
                        "impala_service": "IMPALA", "oozie_service": "OOZIE",
                        "mapreduce_yarn_service": ['MAPREDUCE', 'YARN'], "yarn_service": "YARN"}

        dependency_list = []
        # get required service config
        for k, v in service.get_config(view="full")[0].items():
            if v.required:
                dependency_list.append(k)

        # Extend the dependency list with the optional dependencies as well
        if service.type == 'HUE':
            dependency_list.extend(['hbase_service', 'solr_service', 'sqoop_service',
                                    'impala_service', 'hue_hbase_thrift'])
        if service.type in ['HIVE', 'HDFS', 'HUE', 'OOZIE', 'MAPREDUCE', 'YARN', 'ACCUMULO16']:
            dependency_list.append('zookeeper_service')
        if service.type in ['HIVE']:
            dependency_list.append('sentry_service')
        if service.type == 'OOZIE':
            dependency_list.append('hive_service')
        if service.type in ['FLUME', 'IMPALA']:
            dependency_list.append('hbase_service')
        if service.type in ['FLUME', 'SPARK', 'SENTRY', 'ACCUMULO16']:
            dependency_list.append('hdfs_service')
        if service.type == 'FLUME':
            dependency_list.append('solr_service')

        for key in dependency_list:
            if key == "hue_webhdfs":
                hdfs = cdh.get_service_type('HDFS')
                if hdfs is not None:
                    service_config[key] = [x.name for x in hdfs.get_roles_by_type('NAMENODE')][0]
                    # prefer HTTPFS over NAMENODE
                    if [x.name for x in hdfs.get_roles_by_type('HTTPFS')]:
                        service_config[key] = [x.name for x in hdfs.get_roles_by_type('HTTPFS')][0]
            elif key == "mapreduce_yarn_service":
                for _type in config_types[key]:
                    if cdh.get_service_type(_type) is not None:
                        service_config[key] = cdh.get_service_type(_type).name
                    # prefer YARN over MAPREDUCE
                    if cdh.get_service_type(_type) is not None and _type == 'YARN':
                        service_config[key] = cdh.get_service_type(_type).name
            elif key == "hue_hbase_thrift":
                hbase = cdh.get_service_type('HBASE')
                if hbase is not None:
                    service_config[key] = [x.name for x in hbase.get_roles_by_type(config_types[key])][0]
            else:
                if cdh.get_service_type(config_types[key]) is not None:
                    service_config[key] = cdh.get_service_type(config_types[key]).name

        return service_config


class ActiveCommands:
    def __init__(self):
        self._api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password,
                                version=cmx.api_version)

    def status_for_command(self, message, command):
        """
        Helper to check active command status
        :param message:
        :param command:
        :return:
        """
        _state = 0
        _bar = ['[|]', '[/]', '[-]', '[\\]']
        while True:
            # Poll the command once per iteration and reuse the response below
            cmd_status = self._api.get("/commands/%s" % command.id)
            if cmd_status['active']:
                sys.stdout.write(_bar[_state % 4] + ' ' + message + ' ' + ('\b' * (len(message) + 5)))
                sys.stdout.flush()
                _state += 1
                time.sleep(0.5)
            else:
                print "\n [%s] %s" % (command.id, cmd_status['resultMessage'])
                self._child_cmd(cmd_status['children']['items'])
                break

    def _child_cmd(self, cmd):
        """
        Helper: print the results of child commands, recursively
        :param cmd:
        :return:
        """
        if len(cmd) != 0:
            print " Sub tasks result(s):"
            for resMsg in cmd:
                if resMsg.get('resultMessage'):
                    print " [%s] %s" % (resMsg['id'], resMsg['resultMessage']) if not resMsg.get('roleRef') \
                        else " [%s] %s - %s" % (resMsg['id'], resMsg['resultMessage'], resMsg['roleRef']['roleName'])
                self._child_cmd(self._api.get("/commands/%s" % resMsg['id'])['children']['items'])


def parse_options():
    global cmx
    global check, cdh, manager

    cmx_config_options = {'ssh_root_password': None, 'ssh_root_user': 'root', 'ssh_private_key': None,
                          'cluster_name': 'Cluster 1', 'cluster_version': 'CDH5',
                          'username': 'admin', 'password': 'admin', 'cm_server': None,
                          'host_names': None, 'license_file': None,
                          'parcel': [], 'archive_url': 'http://archive.cloudera.com'}

    cmx_config_options.update({'kerberos': {'kdc_host': None, 'security_realm': None,
                                            'kdc_user': None, 'kdc_password': None}})

    def cmx_args(option, opt_str, value, *args, **kwargs):
        if option.dest == 'host_names':
            print "switch %s value check: %s" % (opt_str, value)
            for host in value.split(','):
                if not hostname_resolves(host.strip()):
                    exit(1)
            else:
                cmx_config_options[option.dest] = [socket.gethostbyname(x.strip()) for x in value.split(',')]
        elif option.dest == 'cm_server':
            print "switch %s value check: %s" % (opt_str, value.strip())
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            cmx_config_options[option.dest] = socket.gethostbyname(value.strip()) if \
                hostname_resolves(value.strip()) else exit(1)

            if s.connect_ex((socket.gethostbyname(value), 7180)) != 0:
                print "Cloudera Manager Server is not started on %s " % value
                s.close()
                exit(1)
            # The port check is done; close the socket on the success path too
            s.close()

            # Determine the CM API version
            api_version = get_cm_api_version(cmx_config_options[option.dest],
                                             cmx_config_options['username'],
                                             cmx_config_options['password'])
            print "CM API version: %s" % api_version
            cmx_config_options.update({'api_version': api_version})

            # From CM 5.4+ (API v10) the 'latest' CDH version is specified with {latest_supported}
            if int(cmx_config_options['api_version'].strip("v")) >= 10:
                cmx_config_options.update({'cdh_version': '5'})
            else:
                cmx_config_options.update({'cdh_version': 'latest'})

        elif option.dest == 'ssh_private_key':
            with open(value, 'r') as f:
                cmx_config_options[option.dest] = f.read()
        elif option.dest == 'cdh_version':
            print "switch %s value check: %s" % (opt_str, value)
            _cdh_repo = urllib2.urlopen("%s/cdh5/parcels/" % cmx_config_options["archive_url"]).read()
            # Link texts of the parcel directory listing; the <a> pattern is an assumed reconstruction
            _cdh_ver = [link.replace('/', '') for link in re.findall(r"<a.*?>(.*?)</a>", _cdh_repo)
                        if link not in ['Name', 'Last modified', 'Size', 'Description', 'Parent Directory']]
            cmx_config_options[option.dest] = value
            if value not in _cdh_ver:
                print "Invalid CDH version: %s" % value
                exit(1)
        else:
            cmx_config_options[option.dest] = value

    def hostname_resolves(hostname):
        """
        Check if hostname resolves
        :param hostname:
        :return:
        """
        try:
            if socket.gethostbyname(hostname) == '0.0.0.0':
                print "Error [{'host': '%s', 'fqdn': '%s'}]" % \
                      (socket.gethostbyname(hostname), socket.getfqdn(hostname))
                return False
            else:
                print "Success [{'host': '%s', 'fqdn': '%s'}]" % \
                      (socket.gethostbyname(hostname), socket.getfqdn(hostname))
                return True
        except socket.error:
            print "Error 'host': '%s'" % hostname
            return False

    def get_cm_api_version(cm_server, username, password):
        """
        Get supported API version from CM
        :param cm_server:
        :param username:
        :param password:
        :return version:
        """
        base_url = "%s://%s:%s/api" % ("http", cm_server, 7180)
        client = HttpClient(base_url, exc_class=ApiException)
        client.set_basic_auth(username, password, "Cloudera Manager")
        client.set_headers({"Content-Type": "application/json"})
        return client.execute("GET", "/version").read().strip('v')

    parser = OptionParser()
    parser.add_option('-d', '--teardown', dest='teardown', action="store", type="string",
                      help='Teardown Cloudera Manager Cluster. Required arguments "keep_cluster" or "remove_cluster".')
    parser.add_option('-i', '--cdh-version', dest='cdh_version', type="string", action='callback',
                      callback=cmx_args, default='latest', help='Install CDH version. Default "latest"')
    parser.add_option('-k', '--ssh-private-key', dest='ssh_private_key', type="string", action='callback',
                      callback=cmx_args, help='The private key to authenticate with the hosts. '
                                              'Specify either this or a password.')
    parser.add_option('-l', '--license-file', dest='license_file', type="string", action='callback',
                      callback=cmx_args, help='Cloudera Manager License file name')
    parser.add_option('-m', '--cm-server', dest='cm_server', type="string", action='callback', callback=cmx_args,
                      help='*Set Cloudera Manager Server Host. '
                           'Note: This is the host where the Cloudera Management Services get installed.')
    parser.add_option('-n', '--cluster-name', dest='cluster_name', type="string", action='callback',
                      callback=cmx_args, default='Cluster 1',
                      help='Set Cloudera Manager Cluster name enclosed in double quotes. Default "Cluster 1"')
    parser.add_option('-p', '--ssh-root-password', dest='ssh_root_password', type="string", action='callback',
                      callback=cmx_args, help='*Set target node(s) ssh password.')
    parser.add_option('-u', '--ssh-root-user', dest='ssh_root_user', type="string", action='callback',
                      callback=cmx_args, default='root', help='Set target node(s) ssh username. Default root')
    parser.add_option('-w', '--host-names', dest='host_names', type="string", action='callback',
                      callback=cmx_args,
                      help='*Set target node(s) list, separated by commas, eg: -w host1,host2,...,host(n). '
                           'Note:'
                           ' - enclose the list in double quotes.'
                           ' - CM_SERVER is excluded from this list; to install CDH Services on CM_SERVER,'
                           ' add that host to this list.')

    (options, args) = parser.parse_args()

    msg_req_args = "Please specify the required arguments: "
    if cmx_config_options['cm_server'] is None:
        parser.error(msg_req_args + "-m/--cm-server")
    else:
        if not (cmx_config_options['ssh_private_key'] or cmx_config_options['ssh_root_password']):
            parser.error(msg_req_args + "-p/--ssh-root-password or -k/--ssh-private-key")
        elif cmx_config_options['host_names'] is None:
            parser.error(msg_req_args + "-w/--host-names")
        elif cmx_config_options['ssh_private_key'] and cmx_config_options['ssh_root_password']:
            parser.error(msg_req_args + "-p/--ssh-root-password _OR_ -k/--ssh-private-key")

    # Management services passwords.
    # They are required when adding Management services
    manager = ManagementActions
    if not (bool(manager.get_mgmt_password("ACTIVITYMONITOR"))
            and bool(manager.get_mgmt_password("REPORTSMANAGER"))):
        cmx_config_options['amon_password'] = bool(manager.get_mgmt_password("ACTIVITYMONITOR"))
        cmx_config_options['rman_password'] = bool(manager.get_mgmt_password("REPORTSMANAGER"))
    else:
        cmx_config_options['amon_password'] = manager.get_mgmt_password("ACTIVITYMONITOR")
        cmx_config_options['rman_password'] = manager.get_mgmt_password("REPORTSMANAGER")

    cmx = type('', (), cmx_config_options)
    check = ActiveCommands()
    cdh = ServiceActions
    if cmx_config_options['cm_server'] and options.teardown:
        if options.teardown.lower() in ['remove_cluster', 'keep_cluster']:
            teardown(keep_cluster=(options.teardown.lower() == 'keep_cluster'))
            print "Bye!"
            exit(0)
        else:
            print 'Teardown Cloudera Manager Cluster. Required arguments "keep_cluster" or "remove_cluster".'
            exit(1)

    # Uncomment here to see cmx configuration options
    # print cmx_config_options
    return options


def main():
    # Parse user options
    parse_options()

    # Prepare Cloudera Manager Server:
    # 1. Initialise Cluster and set Cluster name: 'Cluster 1'
    # 2. Add hosts into: 'Cluster 1'
    # 3. Deploy latest parcels into: 'Cluster 1'
    init_cluster()
    add_hosts_to_cluster()

    # Deploy the CDH Parcel and GPL Extras Parcel; skip parcels that are already ACTIVATED
    api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, version=cmx.api_version)
    cluster = api.get_cluster(cmx.cluster_name)
    for cdh_parcel in cmx.parcel:
        print "> Parcel action for parcel: [ %s-%s ]" % (cdh_parcel['product'], cdh_parcel['version'])
        parcel = cluster.get_parcel(product=cdh_parcel['product'], version=cdh_parcel['version'])
        if "ACTIVATED" not in parcel.stage:
            parcel_action(parcel_item=cdh_parcel, function="start_removal_of_distribution",
                          expected_stage=['DOWNLOADED', 'AVAILABLE_REMOTELY', 'ACTIVATING'],
                          action_description="Un-Distribute Parcel")
            parcel_action(parcel_item=cdh_parcel, function="start_download",
                          expected_stage=['DOWNLOADED'], action_description="Download Parcel")
            parcel_action(parcel_item=cdh_parcel, function="start_distribution", expected_stage=['DISTRIBUTED'],
                          action_description="Distribute Parcel")
            parcel_action(parcel_item=cdh_parcel, function="activate", expected_stage=['ACTIVATED'],
                          action_description="Activate Parcel")

    # Skip MGMT role installation if amon_password and rman_password are False
    mgmt_roles = ['SERVICEMONITOR', 'ALERTPUBLISHER', 'EVENTSERVER', 'HOSTMONITOR']
    if cmx.rman_password:
        if manager.licensed():
            mgmt_roles.append('REPORTSMANAGER')
        manager(*mgmt_roles).setup()
        # "STOP" Management roles
        # manager(*mgmt_roles).stop()
        # "START" Management roles
        manager(*mgmt_roles).start()

    # Upload license
    if cmx.license_file:
        manager.upload_license()

    # Begin Trial
    # manager.begin_trial()

    # Step-Through - Setup services in order of service dependencies
    # Zookeeper, hdfs, HBase, Solr, Spark, Yarn,
    # Hive, Sqoop, Sqoop Client, Impala, Oozie, Hue
    setup_zookeeper()
    setup_hdfs()
    setup_hbase()
    # setup_accumulo()
    # setup_solr()
    # setup_ks_indexer()
    setup_yarn()
    # setup_mapreduce()
    # setup_spark()
    setup_flume()
    setup_spark_on_yarn()
    setup_hive()
    # setup_sentry()
    setup_sqoop()
    setup_sqoop_client()
    setup_impala()
    setup_oozie()
    setup_hue()

    # Note: setup_easy() is an alternative to the Step-Through above.
    # It provides an example of another way of
    # using the CM API to set up CDH services.
    # setup_easy()

    # Example setting hdfs-HA and yarn-HA
    # You can uncomment below after you've set up the CDH services.
    # setup_hdfs_ha()
    # setup_yarn_ha()

    # Example enable Kerberos
    # cmx.kerberos = {'kdc_host': 'mko.vpc.cloudera.com',
    #                 'security_realm': 'HADOOP.EXAMPLE.COM',
    #                 'kdc_user': 'mko/admin@HADOOP.EXAMPLE.COM',
    #                 'kdc_password': 'Had00p'}
    # enable_kerberos()
    # OR
    # disable_kerberos()

    # Restart Cluster and Deploy Cluster wide client config
    cdh.restart_cluster()

    # Other examples of CM API
    # eg: "STOP" Services or "START"
    cdh('HBASE', 'IMPALA', 'SPARK', 'SOLR', 'FLUME').stop()

    print "Enjoy!"
1918 | 1919 | 1920 | if __name__ == "__main__": 1921 | print "%s" % '- ' * 20 1922 | print "Version: %s" % __version__ 1923 | print "%s" % '- ' * 20 1924 | main() 1925 | 1926 | # def setup_template(): 1927 | # api = ApiResource(server_host=cmx.cm_server, username=cmx.username, password=cmx.password, 1928 | # version=cmx.api_version) 1929 | # cluster = api.get_cluster(cmx.cluster_name) 1930 | # service_type = "" 1931 | # if cdh.get_service_type(service_type) is None: 1932 | # service_name = "" 1933 | # cluster.create_service(service_name.lower(), service_type) 1934 | # service = cluster.get_service(service_name) 1935 | # 1936 | # # Service-Wide 1937 | # service.update_config(cdh.dependencies_for(service)) 1938 | # 1939 | # hosts = sorted([x for x in api.get_all_hosts()], key=lambda x: x.ipAddress, reverse=False) 1940 | # 1941 | # # - Default Group 1942 | # role_group = service.get_role_config_group("%s-x-BASE" % service.name) 1943 | # role_group.update_config({}) 1944 | # cdh.create_service_role(service, "X", [x for x in hosts if x.id == 0][0]) 1945 | # 1946 | # check.status_for_command("Starting x Service", service.start()) 1947 | --------------------------------------------------------------------------------