├── .gitignore
├── README.md
├── __init__.py
├── eval_tools
│   ├── README_ambari_bp_tool.md
│   ├── README_ambari_cfg_diff.md
│   ├── README_hdp_eval.md
│   ├── __init__.py
│   ├── ambari.py
│   ├── ambari_bp_tool.py
│   ├── ambari_cfg_diff.py
│   ├── common.py
│   ├── dict_diff.py
│   ├── hdp_eval.py
│   ├── hdp_support
│   │   ├── bp_cfg.json
│   │   ├── control.json
│   │   ├── host_group_control.json
│   │   ├── ref_3_1_cluster.json
│   │   └── sub_hosts_default.json
│   └── suite.sh
├── hive-sre
│   └── README.md
└── migration
    ├── HiveOnTEZ.puml
    ├── LLAP_Migration.puml
    └── LLAP_Migration_dot.puml

/.gitignore:
--------------------------------------------------------------------------------
1 | # IntelliJ
2 | /.idea/
3 | *.iml
4 | 
5 | # Java
6 | *.class
7 | target
8 | 
9 | # Python
10 | *.pyc
11 | 
12 | # Special Files
13 | test.yaml
14 | 
15 | cdp_automation/clusters
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Cloudera Upgrade Utils
2 | 
3 | Various tools to help plan HDP and CDH upgrades to CDP.
4 | 
5 | ## Python Eval Tools
6 | 
7 | ### HDP / Ambari
8 | #### [Ambari Blueprint Diff Report](./eval_tools/README_ambari_cfg_diff.md)
9 | 
10 | #### [HDP Evaluation Tool](./eval_tools/README_hdp_eval.md)
11 | 
12 | ## SRE Tooling
13 | 
14 | [SRE and Upgrade Tooling](hive-sre/README.md)
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dstreev/cloudera_upgrade_utils/c3fab30c3964a068e9327d11e2e82dd0a22459d5/__init__.py
--------------------------------------------------------------------------------
/eval_tools/README_ambari_bp_tool.md:
--------------------------------------------------------------------------------
1 | # Ambari Blueprint Tool
2 | 
3 | This tool assists with building Blueprints that support the Ambari to Cloudera Manager conversion process available from Cloudera. This is an important step in converting Ambari-managed (HDP) clusters to CDP-DC.
4 | 
5 | Another function of this tool is to build a standard 'Cluster Creation Template' by combining a Blueprint and a Layout from an existing Ambari cluster.
6 | 
7 | ## Artifact Naming Conventions
8 | 
9 | The tools in this suite follow file naming conventions for 'blueprints', 'layouts', and 'Cluster Creation Templates'. Where both a 'blueprint' and a 'layout' (or 'Cluster Creation Template') are required, you can omit the 'layout' or 'cct' option if the blueprint file ends with '-blueprint.json' and the layout or cct ends with '-layout.json' or '-cct.json', with a matching prefix. For example, the blueprint filename mytest-cluster-prod-blueprint.json will automatically be paired with a layout named mytest-cluster-prod-layout.json or a cct named mytest-cluster-prod-cct.json. If the names don't match, you will need to specify both the blueprint and the layout or cct options.
10 | 
11 | ## Getting Artifacts from Ambari
12 | 
13 | Run these in a browser that is logged in to Ambari. The results are JSON files; save them for processing.
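If you prefer to script the download instead of using a browser, the two endpoints shown in the next sections can be pulled with a few lines of Python 2 (matching the rest of this repo). This is only a sketch: the host, cluster name, credentials, and output filenames below are placeholders, not values shipped with the tool.

```python
import base64
import json
import urllib2

AMBARI = 'http://ambari.example.com:8080'   # placeholder host:port
CLUSTER = 'MYCLUSTER'                       # placeholder cluster name
AUTH = base64.b64encode('admin:admin')      # placeholder credentials


def fetch(suffix, out_file):
    # Authenticated GET against the Ambari REST API, saved as pretty-printed JSON.
    req = urllib2.Request(AMBARI + '/api/v1/clusters/' + CLUSTER + suffix)
    req.add_header('Authorization', 'Basic ' + AUTH)
    data = json.loads(urllib2.urlopen(req).read())
    with open(out_file, 'w') as out:
        out.write(json.dumps(data, indent=2))


# Names follow the '-blueprint.json' / '-layout.json' convention described above.
fetch('?format=blueprint', 'mycluster-blueprint.json')
fetch('/hosts?fields=Hosts/host_name,host_components,Hosts/ip,Hosts/total_mem,'
      'Hosts/os_arch,Hosts/os_type,Hosts/rack_info,Hosts/cpu_count,'
      'Hosts/disk_info,metrics/disk,Hosts/ph_cpu_count',
      'mycluster-layout.json')
```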
14 | 
15 | ### Get a Blueprint
16 | http://${AMBARI_HOST_PORT}/api/v1/clusters/${CLUSTER_NAME}?format=blueprint
17 | 
18 | ### Get a Layout
19 | http://${AMBARI_HOST_PORT}/api/v1/clusters/${CLUSTER_NAME}/hosts?fields=Hosts/host_name,host_components,Hosts/ip,Hosts/total_mem,Hosts/os_arch,Hosts/os_type,Hosts/rack_info,Hosts/cpu_count,Hosts/disk_info,metrics/disk,Hosts/ph_cpu_count
20 | 
21 | ## Usage
22 | 
23 | ```
24 | Usage: ambari_bp_tool.py [options]
25 | 
26 | Options:
27 |   -h, --help            show this help message and exit
28 |   -l AMBARI_LAYOUT, --ambari-layout=AMBARI_LAYOUT
29 |                         Ambari Layout File
30 |   -c AMBARI_CREATION_TEMPLATE, --ambari-creation-template=AMBARI_CREATION_TEMPLATE
31 |                         Ambari Cluster Creation Template
32 |   -b AMBARI_BLUEPRINT, --ambari-blueprint=AMBARI_BLUEPRINT
33 |                         Ambari Blueprint File
34 |   -2 AMBARI_BLUEPRINT_V2, --ambari-blueprint-v2=AMBARI_BLUEPRINT_V2
35 |                         Ambari Blueprint V2 File
36 |   -r, --v2-reduction    WIP: Remove and consolidate HostGroups for CM
37 |                         Conversion
38 |   -w WORKER_SCALE, --worker-scale=WORKER_SCALE
39 |                         Reduce Cardinality of Worker Host Groups to this
40 |                         Cardinality
41 |   -s SUB_HOSTS, --sub-hosts-file=SUB_HOSTS
42 |                         Substitute Hosts in Blueprint with host in a file.
43 |   -o OUTPUT_DIR, --output-dir=OUTPUT_DIR
44 |                         Output Directory
45 | ```
46 | 
47 | To create a 'Cluster Creation Template', use options `-l` and `-b`. This will create a `*-cct.json` file. If a layout is specified and a 'CCT' is not, one will be created.
48 | 
49 | The tool will create an 'Ambari v2 Blueprint' regardless. Use the `-2` option to control the output file.
50 | 
51 | To build a cluster from the Blueprint V2 with fewer hosts, use the `-w` (worker-scale) option to control how many hosts the worker host groups are reduced to.
52 | 
53 | The `-s` option allows you to replace the hosts in the output Blueprint V2 file with those in a hosts file. See [host file](./hdp_support/sub_hosts_default.json) for the format.
54 | 
55 | ## Noteworthy Observations
56 | 
57 | ### Building v2 output from v1 Blueprint and Layout
58 | 
59 | When Ambari-managed configuration groups are used (especially for worker nodes), an accurate mapping of hosts to host_groups can't be guaranteed: there isn't enough information to associate a host with its intended host_group. If a "Cluster Creation Template" is provided, this mapping is already implied. Regardless, when a 'layout' is used we 'strip' the host_group configurations out, reduce each host_group's services to those supported by the Cloudera Manager conversion, and then consolidate the remaining host_groups to remove the duplicates created by this reduction. In the end, a smaller number of unique host_groups remain, and only the main (cluster-level) configurations are translated to the output Blueprint V2 file.
60 | 
61 | ## Example Uses
62 | 
63 | - Create a V2 Blueprint from a V1 Blueprint and a Cluster Creation Template.
64 | 
65 | `ambari_bp_tool.py -b my-test-blueprint.json -c my-test-cct.json`
66 | 
67 | - Create a V2 Blueprint from a V1 Blueprint when a Layout OR Cluster Creation Template exists in the same directory.
68 | 
69 | `ambari_bp_tool.py -b my-test-blueprint.json`
70 | 
71 | > Expects to find `my-test-layout.json` in the same directory.
72 | > If not found, it will look for `my-test-cct.json` in the same directory.
73 | 
74 | - Create a V2 Blueprint from a V1 Blueprint with a large cluster Layout. The goal is to produce a small cluster for testing, based on the larger cluster's configuration, and to replace the host names with new host FQDNs that match your test environment.
75 | 
76 | `ambari_bp_tool.py -b my-big-cluster-blueprint.json -l my-big-cluster-layout.json -2 my-small-test-cluster-blueprint_v2.json -w 3 -s replacement_hosts.json`
77 | 
78 | - Reduce a v2 Blueprint file to convertible services, reduce the worker node count, and substitute new host names.
79 | 
80 | `ambari_bp_tool.py -2 my-blueprint-v2.json -r -w 3 -s my_new-host.json`
--------------------------------------------------------------------------------
/eval_tools/README_ambari_cfg_diff.md:
--------------------------------------------------------------------------------
1 | # Ambari Diff Tool
2 | 
3 | Compares an Ambari Blueprint against a 'reference' cluster blueprint OR against another cluster's blueprint.
4 | 
5 | ## Usage
6 | 
7 | ```
8 | Usage: ambari_cfg_diff.py [options]
9 | 
10 | Options:
11 |   -h, --help            show this help message and exit
12 |   -r REFERENCE, --reference-file=REFERENCE
13 |                         The standard (reference-file) file to compare against.
14 |   -c CHECK, --check-file=CHECK
15 |                         The file (check-file) that you want to compare.
16 |   -o OUTPUT, --output=OUTPUT
17 |                         The output report file will be in 'markdown'.
18 | ```
19 | 
20 | ## Uses for this tool
21 | - Compare the configuration between two clusters and identify items that may not be configured the same in each, e.g. DEV vs. PROD.
22 | - Compare a cluster against a 'reference' configuration to identify anomalies.
23 | 
24 | 
--------------------------------------------------------------------------------
/eval_tools/README_hdp_eval.md:
--------------------------------------------------------------------------------
1 | # HDP Evaluation Tool
2 | 
3 | Use this tool to help increase the visibility of cluster configurations.
4 | 
5 | It provides a layout of the cluster's main components, counts for each component type, information about drive layouts, counts of certain component group types, and a starting view of the memory allocations for each host.
6 | 
7 | The input to this is a 'layout' file and an 'Ambari blueprint' of the cluster.
8 | 
9 | 
10 | ## Artifact Naming Conventions
11 | 
12 | The tools in this suite follow file naming conventions for 'blueprints' and 'layouts'. Where both a 'blueprint' and a 'layout' are required, you can omit the 'layout' option if the blueprint file ends with '-blueprint.json' and the layout ends with '-layout.json', with a matching prefix. For example, the blueprint filename mytest-cluster-prod-blueprint.json will automatically be paired with a layout named mytest-cluster-prod-layout.json. If the names don't match, you will need to specify both the blueprint and layout options.
13 | 
14 | ## Getting Artifacts from Ambari
15 | 
16 | Run these in a browser that is logged in to Ambari. The results are JSON files; save them and use them as input to this process.
17 | 
18 | ### Get a Blueprint:
19 | ```
20 | http://${AMBARI_HOST_PORT}/api/v1/clusters/${CLUSTER_NAME}?format=blueprint
21 | ```
22 | 
23 | ### Get a Layout:
24 | ```
25 | http://${AMBARI_HOST_PORT}/api/v1/clusters/${CLUSTER_NAME}/hosts?fields=Hosts/host_name,host_components,Hosts/ip,Hosts/total_mem,Hosts/os_arch,Hosts/os_type,Hosts/rack_info,Hosts/cpu_count,Hosts/disk_info,metrics/disk,Hosts/ph_cpu_count
26 | ```
27 | 
28 | ## Usage: hdp_eval.py [options]
29 | 
30 | ```
31 | Options:
32 |   -h, --help            show this help message and exit
33 |   -l AMBARI_LAYOUT, --ambari-layout=AMBARI_LAYOUT
34 |                         .
35 | -b AMBARI_BLUEPRINT, --ambari-blueprint=AMBARI_BLUEPRINT 36 | -o OUTPUT_DIR, --output_dir 37 | ``` 38 | -------------------------------------------------------------------------------- /eval_tools/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dstreev/cloudera_upgrade_utils/c3fab30c3964a068e9327d11e2e82dd0a22459d5/eval_tools/__init__.py -------------------------------------------------------------------------------- /eval_tools/ambari.py: -------------------------------------------------------------------------------- 1 | import copy 2 | import collections 3 | import re 4 | 5 | # List of services to translate 6 | supported_services = ['NAMENODE', 'HBASE_REGIONSERVER', 'KAFKA_BROKER', 'HISTORYSERVER', 'DATANODE', 7 | 'ZOOKEEPER_SERVER', 'HIVE_SERVER', 'RESOURCEMANAGER', 'HBASE_MASTER', 8 | 'HIVE_METASTORE', 'ZKFC', 'SPARK2_JOBHISTORYSERVER', 'JOURNALNODE', 9 | 'OOZIE_SERVER', 'NODEMANAGER', 'TEZ_CLIENT', 'SPARK2_CLIENT'] 10 | 11 | master_services = ['NAMENODE', 'HBASE_REGIONSERVER', 'HISTORYSERVER', 12 | 'ZOOKEEPER_SERVER', 'HIVE_SERVER', 'RESOURCEMANAGER', 'HBASE_MASTER', 13 | 'HIVE_METASTORE', 'ZKFC', 'SPARK2_JOBHISTORYSERVER', 'JOURNALNODE', 14 | 'OOZIE_SERVER'] 15 | 16 | def get_host_group_mask(item, componentDict): 17 | location = 0 18 | components = item["host_components"] 19 | for component in components: 20 | for ckey, cvalue in component.items(): 21 | if ckey == "HostRoles": 22 | for hkey, hvalue in cvalue.items(): 23 | if hkey == "component_name": 24 | location = location | componentDict[hvalue] 25 | return location 26 | 27 | 28 | # Mismatches happen here. Use the bp version 29 | # def get_component_dictionary(layout): 30 | # component_dict = {} 31 | # items = layout['items'] 32 | # for item in items: 33 | # components = item["host_components"] 34 | # for component in components: 35 | # for ckey, cvalue in component.items(): 36 | # if ckey == "HostRoles": 37 | # for hkey, hvalue in cvalue.items(): 38 | # if hkey == "component_name": 39 | # if hvalue not in component_dict.keys(): 40 | # dl = len(component_dict) 41 | # if dl == 0: 42 | # component_dict[hvalue] = 1 43 | # elif dl == 1: 44 | # component_dict[hvalue] = 2 45 | # else: 46 | # component_dict[hvalue] = 2 ** dl 47 | # return component_dict 48 | 49 | 50 | def get_component_dictionary_from_bp(blueprint): 51 | component_dict = {} 52 | host_groups = blueprint['host_groups'] 53 | for host_group in host_groups: 54 | components = host_group["components"] 55 | for component in components: 56 | for ckey, cvalue in component.items(): 57 | if ckey == "name": 58 | if cvalue not in component_dict.keys(): 59 | dl = len(component_dict) 60 | if dl == 0: 61 | component_dict[cvalue] = 1 62 | elif dl == 1: 63 | component_dict[cvalue] = 2 64 | else: 65 | component_dict[cvalue] = 2 ** dl 66 | return component_dict 67 | 68 | 69 | def calc_host_group_bit_masks(hostgroups, componentDict): 70 | host_groups_bitmask = {} 71 | # working_host_groups = copy.deepcopy(hostgroups) 72 | for hostgroup in hostgroups: 73 | hgbitmask = 0 74 | for component in hostgroup['components']: 75 | try: 76 | hgbitmask = hgbitmask | componentDict[component['name']] 77 | except: 78 | check = 'Component in Host that is not in the Layouts: ' + component['name'] 79 | host_groups_bitmask[hostgroup['name']] = hgbitmask 80 | return host_groups_bitmask 81 | 82 | 83 | def calc_host_bit_masks(layout_hosts, componentDict): 84 | host_bitmask = {} 85 | # working_host_groups = 
copy.deepcopy(hostgroups) 86 | for item in layout_hosts: 87 | hgbitmask = 0 88 | host_detail = {} 89 | for component in item['host_components']: 90 | try: 91 | hgbitmask = hgbitmask | componentDict[component['HostRoles']['component_name']] 92 | except: 93 | check = 'Component in Host that is not in the Layouts: ' + component['HostRoles']['component_name'] 94 | host_detail['host_name'] = item['Hosts']['host_name'] 95 | host_detail['bit_mask'] = hgbitmask 96 | if 'rack_info' in item['Hosts'].keys(): 97 | host_detail['rack_info'] = item['Hosts']['rack_info'] 98 | host_bitmask[item['Hosts']['host_name']] = host_detail 99 | return host_bitmask 100 | 101 | 102 | def build_creation_template_from_layout(blueprint, layout): 103 | print (" The Blueprint_v2 was built via a Layout, there isn't enough information\n" + 104 | " to associate a Host to a HostGroup when Host Groups contain the same\n" + 105 | " services, but have different configurations (managed groups in ambari).\n" + 106 | " So the reduction process will consolidate and strip all host group\n" + 107 | " configurations.") 108 | 109 | reduce_to_supported_services(blueprint) 110 | consolidate_blueprint_host_groups(blueprint, False) 111 | 112 | cluster_creation_template = {} 113 | # Generate Counts for Blueprint Host Groups. 114 | # Go through the Merged Blueprint and count the hosts in each host_group. 115 | host_groups = blueprint['host_groups'] 116 | component_dict = get_component_dictionary_from_bp(blueprint) 117 | # componentDict = get_component_dictionary(layout) 118 | 119 | hostgroupsbitmask = calc_host_group_bit_masks(host_groups, component_dict) 120 | 121 | hostbitmask = calc_host_bit_masks(layout['items'], component_dict) 122 | 123 | cluster_creation_template['blueprint'] = 'need-to-set-me' 124 | 125 | # Stubout Credentials 126 | credential = {'alias': 'kdc.admin.credentials', 'key': 'NEED_TO_SET', 'principal': 'NEED_TO_SET', 127 | 'type': 'TEMPORARY'} 128 | credentials = [credential] 129 | cluster_creation_template['credential'] = credentials 130 | 131 | cct_host_groups = [] 132 | for host_group in host_groups: 133 | cct_host_group = {} 134 | cct_host_group['name'] = host_group['name'] 135 | cct_hosts = [] 136 | for layout_host in hostbitmask.keys(): 137 | if hostbitmask[layout_host]['bit_mask'] == hostgroupsbitmask[host_group['name']]: 138 | cct_host = {} 139 | cct_host['fqdn'] = hostbitmask[layout_host]['host_name'] 140 | if 'rack_info' in hostbitmask[layout_host]: 141 | cct_host['rack_info'] = hostbitmask[layout_host]['rack_info'] 142 | cct_hosts.append(cct_host) 143 | cct_host_group['hosts'] = cct_hosts 144 | cct_host_groups.append(cct_host_group) 145 | 146 | cluster_creation_template['host_groups'] = cct_host_groups 147 | # Stub out more cct items. 
148 | cluster_creation_template['provision_action'] = 'INSTALL_ONLY' 149 | cluster_creation_template['repository_version'] = 'NOT_TO_SET' 150 | kerb = {'type': 'KERBEROS'} 151 | cluster_creation_template['security'] = kerb 152 | 153 | return cluster_creation_template 154 | 155 | 156 | def build_ambari_blueprint_v2(blueprint, creationTemplate): 157 | # def mergeConfigsWithHostMatrix(blueprint, hostMatrix, control): 158 | blueprintV2 = copy.deepcopy(blueprint) 159 | # configurations = blueprintV2['configurations'] 160 | # stack = blueprint['Blueprints']['stack_name'] + ' ' + blueprint['Blueprints']['stack_version'] 161 | ct_hostgroups = creationTemplate['host_groups'] 162 | bp_hostgroups = blueprintV2['host_groups'] 163 | # calcHostGroupBitMasks(hostgroups) 164 | for bp_host_group in bp_hostgroups: 165 | # Loop thru ct host_groups and collect hosts for each group 166 | for ct_hostgroup in ct_hostgroups: 167 | if bp_host_group['name'] == ct_hostgroup['name']: 168 | hosts = [] 169 | for ct_host in ct_hostgroup['hosts']: 170 | lclHost = {} 171 | lclHost['hostname'] = ct_host['fqdn'] 172 | if 'rack_info' in ct_host.keys(): 173 | lclHost['rack_info'] = ct_host['rack_info'] 174 | hosts.append(lclHost) 175 | bp_host_group['hosts'] = hosts 176 | # bp_host_group['cardinality'] = str(len(hosts)) 177 | # if consolidate: 178 | # consolidate_blueprint_host_groups(blueprintV2) 179 | remove_empty_host_groups(blueprintV2) 180 | return blueprintV2 181 | 182 | 183 | def reduce_to_supported_services(blueprint): 184 | host_groups = blueprint['host_groups'] 185 | 186 | # Filter out unsupported components 187 | for host_group in host_groups: 188 | unsupported = [] 189 | for index, component in enumerate(host_group['components']): 190 | if component['name'] not in supported_services: 191 | unsupported.append(index) 192 | for index in reversed(unsupported): 193 | del host_group['components'][index] 194 | for index in range(len(host_group['configurations'])-1, -1, -1): 195 | del host_group['configurations'][index] 196 | # remove host_groups that have no components 197 | empty_host_groups = [] 198 | for index, host_group in enumerate(host_groups): 199 | if len(host_group['components']) == 0: 200 | empty_host_groups.append(index) 201 | print('Host group: ' + host_group['name'] + ' has no supported components left. Will remove it.') 202 | for index in reversed(empty_host_groups): 203 | del host_groups[index] 204 | 205 | 206 | def consolidate_blueprint_host_groups(blueprint, transfer_hosts): 207 | host_groups = blueprint['host_groups'] 208 | 209 | # Consolidate Host Groups that have the same components. 210 | component_dict = get_component_dictionary_from_bp(blueprint) 211 | hostgroupsbitmask = calc_host_group_bit_masks(host_groups, component_dict) 212 | bitmasks = [] 213 | # Let's swap the keys and values. In the process, we'll naturally condense to a set of 214 | # host groups that are unique by throwing out hostgroups with duplicate bitmasks. 215 | res = dict((v, k) for k, v in hostgroupsbitmask.iteritems()) 216 | final_host_groups = [] 217 | # Loop though and get final host groups. 
218 | for key in res.keys(): 219 | final_host_groups.append(res[key]) 220 | 221 | del_hg_indexes = {} 222 | move_hg_hosts = {} 223 | for index, host_group in enumerate(host_groups): 224 | if host_group['name'] not in final_host_groups: 225 | del_hg_indexes[index] = hostgroupsbitmask[host_group['name']] 226 | move_hg_hosts[host_group['name']] = res[hostgroupsbitmask[host_group['name']]] 227 | # Migrate this host group's host to the remaining host group. 228 | 229 | # Transfer Hosts from the consolidated host groups. 230 | if transfer_hosts: 231 | for m_hg_name in move_hg_hosts: 232 | t_hg_name = move_hg_hosts[m_hg_name] 233 | target_host_group = None 234 | for host_group in host_groups: 235 | if host_group['name'] == t_hg_name: 236 | target_host_group = host_group 237 | for host_group in host_groups: 238 | if host_group['name'] == m_hg_name: 239 | if host_group['hosts'] is not None: 240 | for host in host_group['hosts']: 241 | target_host_group['hosts'].append(host) 242 | target_host_group['cardinality'] = len(target_host_group['hosts']) 243 | 244 | # Need to iterate over move_hg_groups and reset properties with a reference 245 | # to a host_group that will be removed. 246 | configs = blueprint['configurations'] 247 | for move_key in move_hg_hosts.keys(): 248 | re_move_key = move_key.replace('_','\_') 249 | # re.sub(re_move_key, '\\_', '\\_') 250 | print "Move host_group: " + move_key + " to " + move_hg_hosts[move_key] 251 | print " ---> Reconciling placeholders in configuration properties..." 252 | for config in configs: 253 | for config_key in config: 254 | # print 'Config Key: ' + config_key 255 | for property in config[config_key]['properties']: 256 | # print 'Config Property: ' + property 257 | check_value = config[config_key]['properties'][property] 258 | status, new_value = compare_exists_replace(check_value, '%HOSTGROUP::'+move_hg_hosts[move_key]+'%', '%HOSTGROUP::'+move_key+'%') 259 | if status: 260 | config[config_key]['properties'][property] = new_value 261 | 262 | # Remove duplicate Host Groups 263 | del_hg_indexes_sorted = collections.OrderedDict(sorted(del_hg_indexes.items())) 264 | for key in reversed(del_hg_indexes_sorted.keys()): 265 | del host_groups[key] 266 | 267 | 268 | def repair_host_references(blueprint_v2, replaced_hosts): 269 | configs = blueprint_v2['configurations'] 270 | for replaced_host in replaced_hosts.keys(): 271 | re_move_key = replaced_host.replace('_','\_') 272 | # re.sub(re_move_key, '\\_', '\\_') 273 | print "Replacing Host: " + replaced_host + " with " + replaced_hosts[replaced_host] 274 | print " ---> Reconciling placeholders in configuration properties..." 275 | for config in configs: 276 | for config_key in config: 277 | # print 'Config Key: ' + config_key 278 | for property in config[config_key]['properties']: 279 | # print 'Config Property: ' + property 280 | check_value = config[config_key]['properties'][property] 281 | status, new_value = compare_exists_replace(check_value, replaced_hosts[replaced_host], replaced_host) 282 | if status: 283 | config[config_key]['properties'][property] = new_value 284 | 285 | 286 | def compare_exists_replace(value, compare, replace): 287 | if compare in value: 288 | # Look for 'replace' in value and if it exists, remove it. 289 | if replace in value: 290 | value = value.replace(replace, '') 291 | # Remove empty item. 
292 | value = value.replace(',,', ',') 293 | return True, value 294 | else: 295 | return False, None 296 | else: 297 | if replace in value: 298 | value = value.replace(replace, compare) 299 | return True, value 300 | else: 301 | return False, None 302 | 303 | 304 | def cct_from_blueprint_v2(blueprint_v2): 305 | cct = {} 306 | # Generate Counts for Blueprint Host Groups. 307 | # Go through the Merged Blueprint and count the hosts in each host_group. 308 | host_groups = blueprint_v2['host_groups'] 309 | 310 | cct['blueprint'] = 'need-to-set-me' 311 | 312 | # Stubout Credentials 313 | credential = {'alias': 'kdc.admin.credentials', 'key': 'NEED_TO_SET', 'principal': 'NEED_TO_SET', 314 | 'type': 'TEMPORARY'} 315 | credentials = [credential] 316 | cct['credential'] = credentials 317 | 318 | cct_host_groups = [] 319 | for host_group in host_groups: 320 | cct_host_group = {'name': host_group['name']} 321 | cct_hosts = [] 322 | for host in host_group['hosts']: 323 | cct_host = {} 324 | cct_host['fqdn'] = host['hostname'] 325 | if host['rack_info'] is not None: 326 | cct_host['rack_info'] = host['rack_info'] 327 | cct_hosts.append(cct_host) 328 | cct_host_group['hosts'] = cct_hosts 329 | cct_host_groups.append(cct_host_group) 330 | 331 | cct['host_groups'] = cct_host_groups 332 | 333 | # Stub out more cct items. 334 | cct['provision_action'] = 'INSTALL_ONLY' 335 | cct['repository_version'] = 'NOT_TO_SET' 336 | kerb = {'type': 'KERBEROS'} 337 | cct['security'] = kerb 338 | 339 | return cct 340 | 341 | 342 | def reduce_worker_scale(blueprint_v2, scale): 343 | print ("Applying worker Scale Reduction: " + str(scale)) 344 | host_groups = blueprint_v2['host_groups'] 345 | component_dict = get_component_dictionary_from_bp(blueprint_v2) 346 | hostgroupsbitmask = calc_host_group_bit_masks(host_groups, component_dict) 347 | 348 | # Generate Master Services BitMask 349 | masterbitmask = 0 350 | for service in master_services: 351 | try: 352 | masterbitmask = masterbitmask | component_dict[service] 353 | except KeyError: 354 | check = 'Not in lists for this blueprint. This is ok.' 355 | removed_hosts = [] 356 | for hostgroupname in hostgroupsbitmask.keys(): 357 | check = hostgroupsbitmask[hostgroupname] & masterbitmask == hostgroupsbitmask[hostgroupname] 358 | if not check: # Not a master 359 | # Check Cardinality 360 | for host_group in host_groups: 361 | if host_group['name'] == hostgroupname: 362 | if len(host_group['hosts']) > scale: 363 | # Need to scale back hosts in host group. 364 | for i in range(len(host_group['hosts']) - 1, scale-1, -1): 365 | # print("del: " + str(i)) 366 | removed_hosts.append(host_group['hosts'][i]['hostname']) 367 | del host_group['hosts'][i] 368 | host_group['cardinality'] = len(host_group['hosts']) 369 | return removed_hosts 370 | 371 | 372 | def substitute_hosts(blueprint_v2, hosts): 373 | index = 0 374 | replaced_hosts = {} 375 | host_groups = blueprint_v2['host_groups'] 376 | for host_group in host_groups: 377 | for bp_host in host_group['hosts']: 378 | sub_host = hosts[index] 379 | replaced_hosts[bp_host['hostname']] = sub_host['host'] 380 | bp_host['hostname'] = sub_host['host'] 381 | if sub_host['rack_info'] is not None: 382 | bp_host['rack_info'] = sub_host['rack_info'] 383 | index += 1 384 | if index >= len(hosts): 385 | print ('') 386 | print ('') 387 | print ('!!!! ******************************** ') 388 | print ('WARNING: NOT enough hosts in list to replace blueprint hosts.') 389 | print (' Replaced hosts until we exhausted list. 
Need to add more host to sub list and try again!!!') 390 | print ('!!!! ******************************** ') 391 | print ('') 392 | print ('') 393 | return replaced_hosts 394 | return replaced_hosts 395 | 396 | 397 | def remove_empty_host_groups(blueprintV2): 398 | print ("Removing Empty Host Groups") 399 | bp_hostgroups = blueprintV2['host_groups'] 400 | # Remove empty host-groups 401 | empty_idx = [] 402 | for index, bp_host_group in enumerate(bp_hostgroups): 403 | # print (index) 404 | try: 405 | if 'hosts' in bp_host_group.keys() and len(bp_host_group['hosts']) == 0: 406 | print("Empty Host group: " + bp_host_group['name'] + ". Will be removed.") 407 | empty_idx.append(index) 408 | elif 'hosts' not in bp_host_group.keys(): 409 | print("Empty Host group: " + bp_host_group['name'] + ". Will be removed.") 410 | empty_idx.append(index) 411 | 412 | if 'hosts' in bp_host_group.keys() and len(bp_host_group['hosts']) != int( 413 | bp_host_group['cardinality']): 414 | # print("Mismatch Cardinality for: " + bp_host_group['name'] + ". " + str( 415 | # len(bp_host_group['hosts'])) + ":" 416 | # + str(bp_host_group['cardinality'])) 417 | bp_host_group['cardinality'] = str(len(bp_host_group['hosts'])) 418 | except ValueError: 419 | # print("Mismatch Cardinality for: " + bp_host_group['name'] + ". " + str( 420 | # len(bp_host_group['hosts'])) + ":" 421 | # + str(bp_host_group['cardinality'])) 422 | bp_host_group['cardinality'] = str(len(bp_host_group['hosts'])) 423 | 424 | for index in empty_idx: 425 | del bp_hostgroups[index] 426 | return blueprintV2 427 | 428 | 429 | def merge_configs_with_host_matrix(blueprint, hostMatrix, componentDict, control): 430 | mergedBlueprint = copy.deepcopy(blueprint) 431 | configurations = mergedBlueprint['configurations'] 432 | # stack = blueprint['Blueprints']['stack_name'] + ' ' + blueprint['Blueprints']['stack_version'] 433 | hostgroups = mergedBlueprint['host_groups'] 434 | hostgroupbitmask = calc_host_group_bit_masks(hostgroups, componentDict) 435 | 436 | # Loop through Hosts 437 | for hostKey in hostMatrix: 438 | # Retrieve Host 439 | host = hostMatrix[hostKey] 440 | # print host 441 | for hostgroup in mergedBlueprint['host_groups']: 442 | if host['HostGroupMask'] == hostgroupbitmask[hostgroup['name']]: 443 | host['host_group'] = str(hostgroup['name']) 444 | hosts = [] 445 | if 'hosts' in hostgroup.keys(): 446 | hosts = hostgroup['hosts'] 447 | lclHost = {} 448 | lclHost['hostname'] = host['Hostname'] 449 | # lclHost['rackId'] = host['Rack'] 450 | # lclHost['ip'] = host['ip'] 451 | hosts.append(lclHost) 452 | else: 453 | lclHost = {} 454 | lclHost['hostname'] = host['Hostname'] 455 | # lclHost['rackId'] = host['Rack'] 456 | # lclHost['ip'] = host['ip'] 457 | hosts.append(lclHost) 458 | hostgroup['hosts'] = hosts 459 | # Loop Host Components 460 | for cGroup in host['components']: 461 | # Proceed if component has a setting. 
462 | if len(host['components'][cGroup]) > 0: 463 | # print "Group: " + cGroup 464 | # Cycle through the Hosts Group Components 465 | for component in host['components'][cGroup]: 466 | # print "Component: " + component 467 | # Get the component config from the CONTROL File 468 | for componentSection in ['config', 'environment']: 469 | # print "Section: " + componentSection 470 | config = control[cGroup][component][componentSection] 471 | # Cycle through the configs in the Control File 472 | cfgSection = config['section'] 473 | # bpProperties = {} 474 | hostgroup = [] 475 | for shg in hostgroups: 476 | if 'host_group' in host.keys(): 477 | if shg['name'] == host['host_group']: 478 | hostgroup = shg 479 | else: 480 | print "no host_group identified" 481 | for bpSections in configurations: 482 | if bpSections.keys()[0] == cfgSection: 483 | # print "Config Section: " + cfgSection 484 | bpProperties = bpSections.get(cfgSection)['properties'] 485 | # Lookup Configuration in BP 486 | for localProperty in config['configs']: 487 | # Get the Target BP Property to Lookup 488 | targetProperty = config['configs'][localProperty] 489 | # Find property in BP 490 | # print 'Local Prop: ' + localProperty + '\tTarget Prop: ' + targetProperty 491 | try: 492 | pValue = bpProperties[targetProperty] 493 | # print 'BP Property Value: ' + pValue 494 | try: 495 | pValue = int(pValue) 496 | except: 497 | # It could have a trailing char for type 498 | if localProperty in ['heap', 'off.heap']: 499 | pValue = int(pValue[:-1]) 500 | if isinstance(pValue, int): 501 | # Account for some mem settings in Kb 502 | if localProperty in ['heap', 'off.heap'] and pValue > 1000000: 503 | host['components'][cGroup][component][localProperty] = pValue / 1024 504 | else: 505 | host['components'][cGroup][component][localProperty] = pValue 506 | else: 507 | host['components'][cGroup][component][localProperty] = pValue 508 | except: 509 | missing = "Missing from Blueprint: " + component + ":" + cfgSection + ":" + targetProperty 510 | # print pValue 511 | break 512 | # go through the overrides 513 | if len(hostgroup) > 0: 514 | hostgroupCfg = hostgroup['configurations'] 515 | for bpSections in hostgroupCfg: 516 | if bpSections.keys()[0] == cfgSection: 517 | # print "Config Section: " + cfgSection 518 | bpProperties = bpSections.get(cfgSection) 519 | # Lookup Configuration in BP 520 | for localProperty in config['configs']: 521 | # Get the Target BP Property to Lookup 522 | targetProperty = config['configs'][localProperty] 523 | # Find property in BP 524 | # print 'Local Prop: ' + localProperty + '\tTarget Prop: ' + targetProperty 525 | try: 526 | pValue = bpProperties[targetProperty] 527 | # print 'BP Property Value: ' + pValue 528 | try: 529 | pValue = int(pValue) 530 | except: 531 | # It could have a trailing char for type 532 | if localProperty in ['heap', 'off.heap']: 533 | pValue = int(pValue[:-1]) 534 | if isinstance(pValue, int): 535 | # Account for some mem settings in Kb 536 | if localProperty in ['heap', 'off.heap'] and pValue > 1000000: 537 | host['components'][cGroup][component][localProperty] = pValue / 1024 538 | else: 539 | host['components'][cGroup][component][localProperty] = pValue 540 | else: 541 | host['components'][cGroup][component][localProperty] = pValue 542 | except: 543 | override = "No override for: " + component + ":" + cfgSection + ":" + targetProperty 544 | # print pValue 545 | break 546 | return remove_empty_host_groups(mergedBlueprint) 547 | 
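The host-to-host_group mapping in this module rests on a bit-mask trick: `get_component_dictionary_from_bp` assigns each component name a unique power-of-two bit, `calc_host_group_bit_masks` ORs those bits per host group, and `calc_host_bit_masks` does the same per layout host, so a host lands in the host group whose mask it matches exactly. A stripped-down, standalone illustration of that idea (the component, group, and host names below are made up for the example, not taken from this repo):

```python
# Standalone sketch of the bit-mask matching used by get_component_dictionary_from_bp,
# calc_host_group_bit_masks, and calc_host_bit_masks above.
components = ['NAMENODE', 'ZOOKEEPER_SERVER', 'DATANODE']
bits = dict((name, 1 << i) for i, name in enumerate(components))

group_masks = {
    'master_group': bits['NAMENODE'] | bits['ZOOKEEPER_SERVER'],
    'worker_group': bits['DATANODE'],
}

layout_hosts = {
    'master01.example.com': ['NAMENODE', 'ZOOKEEPER_SERVER'],
    'worker01.example.com': ['DATANODE'],
}

for host, comps in layout_hosts.items():
    mask = 0
    for comp in comps:
        mask |= bits[comp]
    # A host is placed in the host group whose component set it matches exactly.
    matches = [name for name, m in group_masks.items() if m == mask]
    print(host + ' -> ' + str(matches))
```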
-------------------------------------------------------------------------------- /eval_tools/ambari_bp_tool.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import copy 4 | import optparse 5 | import logging 6 | import sys 7 | import json 8 | import os 9 | from common import pprinttable, pprinttable2, pprinthtmltable, writehtmltable 10 | from datetime import date 11 | from ambari import * 12 | from os import path 13 | 14 | VERSION = "0.1.4" 15 | 16 | logger = logging.getLogger('ambari_bp_tool') 17 | 18 | def main(): 19 | 20 | parser = optparse.OptionParser(usage="usage: %prog [options]") 21 | 22 | parser.add_option("-l", "--ambari-layout", dest="ambari_layout", help="Ambari Layout File") 23 | parser.add_option("-c", "--ambari-creation-template", dest="ambari_creation_template", 24 | help="Ambari Cluster Creation Template") 25 | parser.add_option("-b", "--ambari-blueprint", dest="ambari_blueprint", help="Ambari Blueprint File") 26 | parser.add_option("-2", "--ambari-blueprint-v2", dest="ambari_blueprint_v2", help="Ambari Blueprint V2 File") 27 | parser.add_option("-r", "--v2-reduction", dest="v2_reduction", action="store_true", help="WIP: Remove and consolidate HostGroups for CM Conversion", ) 28 | parser.add_option("-w", "--worker-scale", dest="worker_scale", help="Reduce Cardinality of Worker Host Groups to this Cardinality") 29 | parser.add_option("-s", "--sub-hosts-file", dest="sub_hosts", help="Substitute Hosts in Blueprint with host in a file.") 30 | 31 | parser.add_option("-o", "--output-dir", dest="output_dir", help="Output Directory") 32 | 33 | (options, args) = parser.parse_args() 34 | 35 | logger.setLevel(logging.INFO) 36 | formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s') 37 | stdout_handler = logging.StreamHandler(sys.stdout) 38 | stdout_handler.setLevel(logging.INFO) 39 | stdout_handler.setFormatter(formatter) 40 | logger.addHandler(stdout_handler) 41 | 42 | if options.output_dir: 43 | output_dir = options.output_dir 44 | else: 45 | # run_date = str(date.today()) 46 | output_dir = '.' 47 | 48 | # Reduce V2 Blueprint 49 | if options.ambari_blueprint_v2 and not options.ambari_blueprint: 50 | bp_v2_file = options.ambari_blueprint_v2 51 | bp_v2 = json.loads(open(bp_v2_file).read()) 52 | if not options.v2_reduction and not options.worker_scale: 53 | print ("Need to specify -r (reduction) and/or -w and/or -s options") 54 | exit(-1) 55 | if options.v2_reduction: 56 | print("\n-->> Reducing to CM Supported convertible services.") 57 | reduce_to_supported_services(bp_v2) 58 | print("\n-->> Consolidate Host Groups") 59 | consolidate_blueprint_host_groups(bp_v2, True) 60 | if options.worker_scale: 61 | print("\n-->> Scaling down worker nodes: " + options.worker_scale) 62 | removed_hosts = reduce_worker_scale(bp_v2, int(options.worker_scale)) 63 | # TODO: Cleanup Host references in properties. 64 | 65 | if options.sub_hosts: 66 | sub_hosts_file = options.sub_hosts 67 | if not path.exists(sub_hosts_file): 68 | sub_hosts_file = os.path.dirname(os.path.realpath(__file__)) + "/hdp_support/sub_hosts_default.json" 69 | print ("WARNING: Input 'sub_hosts' file not found. Using default.") 70 | sub_hosts = json.loads(open(sub_hosts_file).read()) 71 | print("\n-->> Substituting Host fqdn's") 72 | replaced_hosts = substitute_hosts(bp_v2, sub_hosts['hosts']) 73 | # Replace host references in configuration properties. 
74 | repair_host_references(bp_v2, replaced_hosts) 75 | 76 | reduced_bp_v2_file = output_dir + '/' + options.ambari_blueprint_v2[:-5] + '-reduced.json' 77 | 78 | print "\n--> Reduced Blueprint V2 output file: " + reduced_bp_v2_file 79 | 80 | bp_v2_output = open(reduced_bp_v2_file, 'w') 81 | bp_v2_output.write(json.dumps(bp_v2, indent=2, sort_keys=False)) 82 | bp_v2_output.close() 83 | 84 | exit(0) 85 | 86 | 87 | 88 | if options.ambari_blueprint: 89 | bp_file = options.ambari_blueprint 90 | blueprint = json.loads(open(bp_file).read()) 91 | else: 92 | print("Need to specify a Blueprint") 93 | exit(-1) 94 | 95 | # if options.host_group_reduction: 96 | # consolidate_blueprint_host_groups(blueprint) 97 | 98 | layout = None 99 | cct = None 100 | write_cct = False 101 | 102 | if options.ambari_layout: 103 | layout_file = options.ambari_layout 104 | layout = json.loads(open(layout_file).read()) 105 | elif not options.ambari_creation_template: 106 | # Making assumption on layout file based on BP filename 107 | layout_file = options.ambari_blueprint[:-14] + 'layout.json' 108 | if path.exists(layout_file): 109 | print ("+++ Using Ambari Layout File: " + layout_file) 110 | layout = json.loads(open(layout_file).read()) 111 | # cct = build_creation_template_from_layout(blueprint, layout) 112 | # else: 113 | # print("Can't locate layout file (based on blueprint filename: " + layout_file) 114 | # exit(-1) 115 | 116 | if options.ambari_creation_template: 117 | cct_file = options.ambari_creation_template 118 | cct = json.loads(open(cct_file).read()) 119 | elif layout is None: 120 | # Didn't load a layout and didn't specify 121 | cct_file = options.ambari_blueprint[:-14] + 'cct.json' 122 | if path.exists(cct_file): 123 | print ("+++ Using Cluster Creation Template file: " + cct_file) 124 | cct = json.loads(open(cct_file).read()) 125 | 126 | if cct is None and layout is not None: 127 | print("\n-->> Using Ambari Layout to build Cluster Creation Template.") 128 | cct = build_creation_template_from_layout(blueprint, layout) 129 | write_cct = True 130 | 131 | if cct is None and layout is None: 132 | print ("You must provide either a 'layout' (-l) or a 'CCT' (-c)") 133 | exit(-1) 134 | 135 | if options.ambari_blueprint_v2: 136 | bp_v2_file = options.ambari_blueprint_v2 137 | else: 138 | bp_v2_file = output_dir + '/' + options.ambari_blueprint[:-5] + '-v2-generated.json' 139 | 140 | print "\n--> Blueprint V2 output file: " + bp_v2_file 141 | 142 | bp_v2_output = open(bp_v2_file, 'w') 143 | 144 | bp_v2 = build_ambari_blueprint_v2(blueprint, cct) 145 | 146 | if options.worker_scale: 147 | print("\n-->> Scaling down worker nodes: " + options.worker_scale) 148 | removed_hosts = reduce_worker_scale(bp_v2, int(options.worker_scale)) 149 | # TODO: Cleanup Host references in properties. 150 | 151 | if options.sub_hosts: 152 | sub_hosts_file = options.sub_hosts 153 | if not path.exists(sub_hosts_file): 154 | sub_hosts_file = os.path.dirname(os.path.realpath(__file__)) + "/hdp_support/sub_hosts_default.json" 155 | print ("WARNING: Input 'sub_hosts' file not found. Using default.") 156 | sub_hosts = json.loads(open(sub_hosts_file).read()) 157 | print("\n-->> Substituting Host fqdn's") 158 | replaced_hosts = substitute_hosts(bp_v2, sub_hosts['hosts']) 159 | # Replace host references in configuration properties. 
160 | repair_host_references(bp_v2, replaced_hosts) 161 | 162 | if write_cct: 163 | cct = cct_from_blueprint_v2(bp_v2) 164 | cct_file = options.ambari_blueprint[:-14] + 'cct-generated.json' 165 | cct_output = open(cct_file, 'w') 166 | cct_output.write(json.dumps(cct, indent=2, sort_keys=False)) 167 | cct_output.close() 168 | print "\n--> Generated CCT output file: " + cct_file 169 | 170 | bp_v2_output.write(json.dumps(bp_v2, indent=2, sort_keys=False)) 171 | bp_v2_output.close() 172 | 173 | main() -------------------------------------------------------------------------------- /eval_tools/ambari_cfg_diff.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import optparse 4 | from optparse import OptionGroup 5 | import logging 6 | import sys 7 | import os 8 | import json 9 | from dict_diff import dict_compare 10 | from datetime import date 11 | 12 | VERSION = "0.1.5" 13 | 14 | logger = logging.getLogger('Ambari_cfg_diff') 15 | 16 | appdir = os.path.dirname(os.path.realpath(__file__)) 17 | ref_cluster_filename = appdir + '/hdp_support/ref_3_1_cluster.json' 18 | output_filename = 'default_compare.txt' 19 | 20 | eval_file = open(appdir + '/hdp_support/bp_cfg.json', 'r') 21 | eval_cfg = json.load(eval_file) 22 | eval_sections = sorted(eval_cfg['evaluate_cfgs']) 23 | 24 | section_width = 100 25 | 26 | part_sep = '***\n' 27 | 28 | 29 | # part_sep = '{message:{fill}{align}{width}}\n'.format( 30 | # message='', 31 | # fill='-', 32 | # align='^', 33 | # width=3) 34 | 35 | # part_title = ">>> %s <<<\n" 36 | 37 | 38 | def fix(text): 39 | return str(text).replace('|', '\|
').replace(',', ',
').replace('_', '\_').replace('\n', '
').replace('*', 40 | '\*').replace( 41 | ';', ';
') 42 | 43 | 44 | def write(key, added, removed, modified, env_dep, same, output): 45 | output.write("\n\n") 46 | # output.write('{message:{fill}{align}{width}}\n'.format( 47 | # message='', 48 | # fill='=', 49 | # align='^', 50 | # width=section_width, 51 | # ))## For Configuration Sections 52 | output.write('## [' + key + '](#for-configuration-sections)\n') 53 | # output.write('## {message:{fill}{align}{width}}\n'.format( 54 | # message=key, 55 | # fill=' ', 56 | # align='<', 57 | # width=section_width, 58 | # )) 59 | # output.write('{message:{fill}{align}{width}}\n'.format( 60 | # message='', 61 | # fill='=', 62 | # align='^', 63 | # width=section_width, 64 | # )) 65 | 66 | output.write(part_sep) 67 | if len(added) > 0: 68 | output.write('{message:{fill}{align}{width}}\n'.format( 69 | message='##### [ADDITIONAL ' + key + '](#for-configuration-sections)', 70 | fill=' ', 71 | align='<', 72 | width=section_width, 73 | )) 74 | # output.write(part_title % ("Extras",)) 75 | # output.write(part_sep) 76 | 77 | output.write("| Property | Current Value |\n|:---|:---|\n") 78 | for akey in sorted(added): 79 | if len(akey) <= (section_width / 10 * 6): 80 | output.write('| ' + akey + ' | ' + fix(added.get(akey)) + ' |\n') 81 | else: 82 | print('*** ' + akey + ' ***') 83 | output.write("| %s | %s |\n" % (fix(akey), fix(added.get(akey)))) 84 | 85 | # output.write(part_sep) 86 | if len(removed) > 0: 87 | output.write('{message:{fill}{align}{width}}\n'.format( 88 | message='##### [MISSING ' + key + '](#for-configuration-sections)', 89 | fill=' ', 90 | align='<', 91 | width=section_width, 92 | )) 93 | # output.write(part_sep) 94 | 95 | output.write("| Property | Missing Value |\n|:---|:---|\n") 96 | 97 | for rkey in sorted(removed): 98 | if len(rkey) <= (section_width / 10 * 6): 99 | output.write('| ' + rkey + ' | ' + fix(removed.get(rkey)) + ' |\n') 100 | else: 101 | output.write("| %s | %s |\n" % (rkey, fix(removed.get(rkey)))) 102 | 103 | # output.write(part_sep) 104 | if len(modified) > 0: 105 | output.write('{message:{fill}{align}{width}}\n'.format( 106 | message='##### [DIFF ' + key + '](#for-configuration-sections)', 107 | fill=' ', 108 | align='<', 109 | width=section_width, 110 | )) 111 | # output.write(part_sep) 112 | output.write("") 113 | # output.write("| Property | Reference Value | Check Value |\n|:---|:---|:---|\n") 114 | 115 | for mkey in sorted(modified): 116 | output.write('\n\n\n') 119 | # output.write('| ' + mkey + ' | ' + modified[mkey][0].replace('\n', '
').replace('|', '|') + ' | ' + 120 | # modified[mkey][1].replace('\n', '
').replace('|', '|') + ' |\n') 121 | output.write("
PropertyReference ValueCheck Value
' + mkey + '' + 117 | fix(modified[mkey][0]) + '' + 118 | fix(modified[mkey][1]) + '
\n") 122 | 123 | # output.write(part_sep) 124 | if len(env_dep) > 0: 125 | output.write('{message:{fill}{align}{width}}\n'.format( 126 | message='##### [ENV. DIFF ' + key + '](#for-configuration-sections)', 127 | fill=' ', 128 | align='<', 129 | width=section_width, 130 | )) 131 | # output.write(part_sep) 132 | output.write("| Property | Reference Value | Check Value |\n|:---|:---|:---|\n") 133 | 134 | for ekey in sorted(env_dep): 135 | output.write('| ' + ekey + ' | ' + fix(env_dep[ekey][0]) + ' | ' + 136 | fix(env_dep[ekey][1]) + ' |\n') 137 | # output.write("| %s | %s | %s |\n" % (ekey, env_dep[ekey][0], env_dep[ekey][1])) 138 | 139 | # output.write(part_sep) 140 | if len(same) > 0: 141 | output.write('{message:{fill}{align}{width}}\n'.format( 142 | message='##### [SAME ' + key + '](#for-configuration-sections)', 143 | fill=' ', 144 | align='<', 145 | width=section_width, 146 | )) 147 | # output.write(part_sep) 148 | output.write("| Property | Value |\n|:---|:---|\n") 149 | 150 | # Show Key and Value 151 | for ekey in sorted(same): 152 | if (len(same[ekey]) > 0): 153 | output.write('| ' + ekey + ' | ' + fix(same[ekey]) + ' |\n') 154 | else: 155 | output.write("| %s | |\n" % (ekey,)) 156 | 157 | # for item in sorted(same): 158 | # output.write("| %s | %s |\n" % (item, fix(env_dep[ekey][0]))) 159 | 160 | output.write("\n") 161 | 162 | 163 | def compare(referencebp, checkbp, output): 164 | output.write("# Ambari Configuration Diff Tool\n") 165 | output.write("## Configurations\n") 166 | output.write("\n") 167 | output.write("Reference Cluster (-r): " + referencebp + "\n") 168 | output.write("Check Cluster (-c) : " + checkbp + "\n") 169 | output.write("\n") 170 | output.write("### Summary\n") 171 | output.write("> This is a comparison of the 'reference' blueprint and the 'check' blueprint.\n") 172 | output.write(" The output will include 5 parts for each section examined:\n") 173 | output.write(" - ADDITIONAL : Keys present in the 'check' blueprint and not in the 'reference' blueprint.\n") 174 | output.write(" - MISSING : Keys missing from the 'check' blueprint, compared to the 'reference' blueprint.\n") 175 | output.write(" - DIFFER : Keys the differ between the 'check' and 'reference' blueprint.\n") 176 | output.write(" - ENV. 
DIFF : Keys that are in both blueprints, but differ mostly due to environment.\n") 177 | output.write(" - SAME : Keys that match between the two blueprints.\n") 178 | output.write("\n") 179 | 180 | referencebpfile = open(referencebp, 'r') 181 | checkbpfile = open(checkbp, 'r') 182 | 183 | referencedict = json.load(referencebpfile) 184 | checkdict = json.load(checkbpfile) 185 | 186 | referencedictCfg = sorted(referencedict['configurations']) 187 | checkdictCfg = checkdict['configurations'] 188 | 189 | eSections = [] 190 | 191 | for eSection in eval_sections: 192 | for eKey in eSection: 193 | eSections.append(eKey) 194 | output.write("## For Configuration Sections\n") 195 | output.write('| Section | Added | Missing | Diff | Env Diff | Same |\n') 196 | output.write('|:---|---|---|---|---|---|\n') 197 | cfg_list = map(lambda section: '| [' + section + '](#' + section.replace(' ', '-').lower() + ')' + 198 | ' | [link](#additional-' + section.replace(' ', 199 | '-').lower() + ')' + ' | [link](#missing-' + section.replace( 200 | ' ', '-').lower() + ')' + 201 | ' | [link](#diff-' + section.replace(' ', 202 | '-').lower() + ')' + ' | [link](#env-diff-' + section.replace( 203 | ' ', '-').lower() + ')' + 204 | ' | [link](#same-' + section.replace(' ', '-').lower() + ')' + ' |', eSections) 205 | # output.write("\n- ".join('[' + eSections + '](' + eSections.replace(' ', '-').lower() + ')')) 206 | output.writelines(["%s\n" % item for item in cfg_list]) 207 | # output.write(*cfg_list, sep='\n') 208 | output.write('\n| | |\n') 209 | output.write('|---:|:---|\n') 210 | output.write('| Date | ' + str(date.today()) + ' |\n') 211 | # today = date.today() 212 | # tdy = '{message:{fill}{align}{width}}'.format( 213 | # message='Date', 214 | # fill=' ', 215 | # align='>', 216 | # width=section_width/5, 217 | # ) + " : " + str(today) + "\n" 218 | # 219 | # output.write(tdy) 220 | 221 | # ref = '{message:{fill}{align}{width}}'.format( 222 | # message='Reference Blueprint', 223 | # fill=' ', 224 | # align='>', 225 | # width=section_width/5, 226 | # ) + " : " + referencebp 227 | output.write('| * | |\n') 228 | output.write('| Reference Blueprint | ' + referencebp + ' |\n') 229 | output.write('| Reference Blueprint Stack | ' + referencedict['Blueprints']['stack_name'] + " " + 230 | referencedict['Blueprints']['stack_version'] + ' |\n') 231 | print('\n\nReference Blueprint : ' + referencebp) 232 | # print (ref) 233 | 234 | # output.write("\n") 235 | # check = '{message:{fill}{align}{width}}'.format( 236 | # message='Check Blueprint', 237 | # fill=' ', 238 | # align='>', 239 | # width=section_width/5, 240 | # ) + " : " + checkbp 241 | output.write('| * | |\n') 242 | output.write('| Check Blueprint | ' + checkbp + ' |\n') 243 | output.write('| Check Blueprint Stack | ' + checkdict['Blueprints']['stack_name'] + " " + 244 | checkdict['Blueprints']['stack_version'] + ' |\n') 245 | print('Check Blueprint : ' + checkbp) 246 | # print (check) 247 | # output.write(check) 248 | 249 | # otpt = '{message:{fill}{align}{width}}'.format( 250 | # message='Output Filename', 251 | # fill=' ', 252 | # align='>', 253 | # width=section_width/5, 254 | # ) + " : " + output_filename 255 | output.write('| * | |\n') 256 | output.write('| Output Filename | ' + output_filename + ' |\n') 257 | print('Output Filename : ' + output_filename) 258 | 259 | # print(otpt) 260 | 261 | # output.write("\n") 262 | # 263 | # output.write('{message:{fill}{align}{width}}'.format( 264 | # message='Tool Version', 265 | # fill=' ', 266 | # align='>', 267 | # 
width=section_width/5, 268 | # ) + " : " + VERSION) 269 | output.write('| Tool Version | ' + VERSION + ' |\n') 270 | output.write("\n") 271 | 272 | # iterate over the reference cfgs: 273 | for referenceCfg in referencedictCfg: 274 | for key in referenceCfg: 275 | # Determine if the current key is one of the eval_sections 276 | if any(key in section for section in eval_sections): 277 | spec = {} 278 | # Locate the Eval Section for Special Processing Instructions 279 | for section in eval_sections: 280 | if key in section: 281 | spec = section[key] 282 | # print ("other") 283 | # Get the value of the reference cfg. 284 | value = referenceCfg[key] 285 | # >>> ["foo", "bar", "baz"].index("bar") 286 | # Iterate over the check cfgs 287 | for checkCfg in checkdictCfg: 288 | for ckey in checkCfg: 289 | # match to the current reference key 290 | if ckey == key: 291 | # get the check value 292 | cvalue = checkCfg[ckey] 293 | 294 | # get the 'properties' key from each of the values. 295 | ref = value.get('properties') 296 | check = cvalue.get('properties') 297 | added, removed, modified, env_dep, same = dict_compare(ref, check, spec) 298 | write(key, added, removed, modified, env_dep, same, output) 299 | 300 | 301 | def main(): 302 | parser = optparse.OptionParser(usage="usage: %prog [options]") 303 | 304 | parser.add_option("-r", "--reference-file", dest="reference", 305 | help="The standard (reference-file) file to compare against.") 306 | parser.add_option("-c", "--check-file", dest="check", help="The file (check-file) that you want to compare.") 307 | parser.add_option("-o", "--output", dest="output", help="The output report file will be in 'markdown'.") 308 | 309 | (options, args) = parser.parse_args() 310 | 311 | global ref_cluster_filename 312 | global output_filename 313 | 314 | if not options.check: 315 | print("Required: -c ") 316 | exit - 1 317 | else: 318 | check_filename = options.check 319 | output_filename = os.path.splitext(check_filename)[0] + "_diff.md" 320 | # output_filename = 321 | 322 | if options.reference: 323 | ref_cluster_filename = options.reference 324 | 325 | if options.output: 326 | output_filename = options.output 327 | 328 | output = open(output_filename, 'w') 329 | 330 | compare(ref_cluster_filename, check_filename, output) 331 | output.close() 332 | 333 | 334 | main() 335 | -------------------------------------------------------------------------------- /eval_tools/common.py: -------------------------------------------------------------------------------- 1 | def left(field, length): 2 | diff = length - len(str(field)) 3 | return str(field) + " " * diff 4 | 5 | 6 | def center(field, length): 7 | if isinstance(field, list): 8 | diff = length - len(str(field[0])) 9 | return " " * (diff / 2) + str(field[0]) + " " * (length - len(str(field[0])) - (diff / 2)) 10 | else: 11 | diff = length - len(str(field)) 12 | return " " * (diff / 2) + str(field) + " " * (length - len(str(field)) - (diff / 2)) 13 | 14 | 15 | def right(field, length): 16 | diff = length - len(str(field)) 17 | return " " * diff + str(field) 18 | 19 | 20 | def pprinttable(rows, fields): 21 | output = buildtable(rows, fields) 22 | for line in output: 23 | print line 24 | return output 25 | 26 | def buildtable(rows, fields): 27 | str_list = [] 28 | 29 | if len(rows) > 0: 30 | # headers = HEADER._fields 31 | # headers = HEADER 32 | lens = [] 33 | for field in fields: 34 | lens.append(len(field[1])) 35 | 36 | for row in rows: 37 | inc = 0 38 | for field in fields: 39 | if isinstance(row[field[0]], (int, float, 
long)): 40 | if lens[inc] < 4: 41 | lens[inc] = 4 42 | if lens[inc] < len(str(row[field[0]])): 43 | lens[inc] = len(str(row[field[0]])) 44 | # if lens[inc] < 16: 45 | # lens[inc] = 16 46 | elif isinstance(row[field[0]], (list, tuple)): 47 | size = 2 48 | for i in range(len(row[field[0]])): 49 | size += len(row[field[0]][i]) + 3 50 | if size > lens[inc]: 51 | lens[inc] = size 52 | elif isinstance(row[field[0]], (dict)): 53 | size = 2 54 | for i in range(len(row[field[0]])): 55 | size += len(row[field[0]]) + 3 56 | if size > lens[inc]: 57 | lens[inc] = size 58 | else: 59 | if row[field[0]] is not None and (len(row[field[0]]) > lens[inc]): 60 | lens[inc] = len(row[field[0]]) 61 | inc += 1 62 | 63 | headerRowSeparator = "" 64 | headerRow = "" 65 | for loc in range(len(fields)): 66 | headerRowSeparator = headerRowSeparator + "|" + "=" * (lens[loc]+1) 67 | headerRow = headerRow + "| " + center([fields[loc][1]], lens[loc]) 68 | 69 | headerRowSeparator = headerRowSeparator + "|" 70 | headerRow = headerRow + "|" 71 | 72 | str_list.append(headerRowSeparator) 73 | # print headerRowSeparator 74 | str_list.append(headerRow) 75 | # print headerRow 76 | str_list.append(headerRowSeparator) 77 | # print headerRowSeparator 78 | 79 | for row in rows: 80 | inc = 0 81 | recordRow = "" 82 | offset = 0 83 | for field in fields: 84 | if isinstance(row[field[0]], int) or isinstance(row[field[0]], float) or isinstance(row[field[0]], long): 85 | recordRow = recordRow + "| " + right(row[field[0]], lens[inc]) 86 | # elif isinstance(row[field[0]], bool): 87 | # if row[field[0]]: 88 | # recordRow = recordRow + "| " + right('X', lens[inc]) 89 | # else: 90 | # recordRow = recordRow + "| " + right('', lens[inc]) 91 | 92 | elif isinstance(row[field[0]], (dict)): 93 | # recordRow = recordRow + "| " 94 | offset = len(recordRow) 95 | it = 0 96 | for item in row[field[0]]: 97 | dictItem = str(row[field[0]][item]) 98 | if it == 0: 99 | recordRow = recordRow + '|' + left(dictItem, lens[inc] + 1) + '|\n|' 100 | elif it == len(row[field[0]]) - 1: 101 | recordRow = recordRow + ' '.rjust(offset-1) + '|' + left(dictItem, lens[inc] + 1) 102 | else: 103 | recordRow = recordRow + ' '.rjust(offset-1) + '|' + left(dictItem, lens[inc] + 1) + '|\n|' 104 | it += 1 105 | else: 106 | recordRow = recordRow + "| " + left(row[field[0]], lens[inc]) 107 | inc += 1 108 | recordRow = recordRow + "|" 109 | 110 | str_list.append(recordRow) 111 | # print recordRow 112 | 113 | str_list.append(headerRowSeparator) 114 | # print headerRowSeparator 115 | return str_list 116 | 117 | 118 | def pprinttable2(rows, fields): 119 | output = buildtable2(rows, fields) 120 | for line in output: 121 | print line 122 | 123 | 124 | def buildtable2(rows, fields): 125 | str_list = [] 126 | 127 | if len(rows) > 0: 128 | # headers = HEADER._fields 129 | # headers = HEADER 130 | lens = [] 131 | for field in fields: 132 | lens.append(len(field)) 133 | 134 | for row in rows: 135 | inc = 0 136 | for field in fields: 137 | try: 138 | value = row[field] 139 | if isinstance(row[field], (int, float, long)): 140 | if lens[inc] < 4: 141 | lens[inc] = 4 142 | if lens[inc] < len(str(row[field])): 143 | lens[inc] = len(str(row[field])) 144 | # if lens[inc] < 16: 145 | # lens[inc] = 16 146 | elif isinstance(row[field], (list, tuple)): 147 | size = 2 148 | for i in range(len(row[field])): 149 | size += len(row[field][i]) + 3 150 | if size > lens[inc]: 151 | lens[inc] = size 152 | elif isinstance(row[field], (dict)): 153 | size = 2 154 | for i in range(len(row[field])): 155 | size += 
len(row[field]) + 3 156 | if size > lens[inc]: 157 | lens[inc] = size 158 | else: 159 | if row[field] is not None and (len(row[field]) > lens[inc]): 160 | lens[inc] = len(row[field]) 161 | except: 162 | pass 163 | inc += 1 164 | 165 | headerRowSeparator = "" 166 | headerRow = "" 167 | loc = 0 168 | for field in fields: 169 | # for loc in range(len(fields)): 170 | headerRowSeparator = headerRowSeparator + "|" + "=" * (lens[loc]+1) 171 | headerRow = headerRow + "| " + center(field, lens[loc]) 172 | loc += 1 173 | 174 | headerRowSeparator = headerRowSeparator + "|" 175 | headerRow = headerRow + "|" 176 | 177 | str_list.append(headerRowSeparator) 178 | # print headerRowSeparator 179 | str_list.append(headerRow) 180 | # print headerRow 181 | str_list.append(headerRowSeparator) 182 | # print headerRowSeparator 183 | 184 | for row in rows: 185 | inc = 0 186 | recordRow = "" 187 | offset = 0 188 | for field in fields: 189 | try: 190 | value = row[field] 191 | if isinstance(row[field], int) or isinstance(row[field], float) or isinstance(row[field], long): 192 | recordRow = recordRow + "| " + right(row[field], lens[inc]) 193 | # elif isinstance(row[field[0]], bool): 194 | # if row[field[0]]: 195 | # recordRow = recordRow + "| " + right('X', lens[inc]) 196 | # else: 197 | # recordRow = recordRow + "| " + right('', lens[inc]) 198 | 199 | elif isinstance(row[field], (dict)): 200 | # recordRow = recordRow + "| " 201 | offset = len(recordRow) 202 | it = 0 203 | for item in row[field]: 204 | dictItem = str(item) + ':' + str(row[field][item]) 205 | if it == 0: 206 | recordRow = recordRow + '|' + left(dictItem, lens[inc] + 1) + '|\n|' 207 | elif it == len(row[field]) - 1: 208 | recordRow = recordRow + ' '.rjust(offset-1) + '|' + left(dictItem, lens[inc] + 1) 209 | else: 210 | recordRow = recordRow + ' '.rjust(offset-1) + '|' + left(dictItem, lens[inc] + 1) + '|\n|' 211 | it += 1 212 | else: 213 | recordRow = recordRow + "| " + left(row[field], lens[inc]) 214 | except: 215 | recordRow = recordRow + "| " + left(' ', lens[inc]) 216 | 217 | inc += 1 218 | 219 | recordRow = recordRow + "|" 220 | 221 | str_list.append(recordRow) 222 | # print recordRow 223 | 224 | str_list.append(headerRowSeparator) 225 | # print headerRowSeparator 226 | return str_list 227 | 228 | 229 | def writehtmltable(rows, fields, outputFile): 230 | output = buildhtmltable(rows, fields) 231 | for line in output: 232 | outputFile.write(line) 233 | 234 | 235 | def pprinthtmltable(rows, fields): 236 | output = buildhtmltable(rows, fields) 237 | for line in output: 238 | print(line) 239 | 240 | 241 | def buildhtmltable(rows, fields): 242 | str_list = [] 243 | 244 | if len(rows) > 0: 245 | # headers = HEADER._fields 246 | # headers = HEADER 247 | # lens = [] 248 | # for field in fields: 249 | # lens.append(len(field)) 250 | # 251 | # for row in rows: 252 | # inc = 0 253 | # for field in fields: 254 | # try: 255 | # value = row[field] 256 | # if isinstance(row[field], (int, float, long)): 257 | # if lens[inc] < 4: 258 | # lens[inc] = 4 259 | # if lens[inc] < len(str(row[field])): 260 | # lens[inc] = len(str(row[field])) 261 | # # if lens[inc] < 16: 262 | # # lens[inc] = 16 263 | # elif isinstance(row[field], (list, tuple)): 264 | # size = 2 265 | # for i in range(len(row[field])): 266 | # size += len(row[field][i]) + 3 267 | # if size > lens[inc]: 268 | # lens[inc] = size 269 | # elif isinstance(row[field], (dict)): 270 | # size = 2 271 | # for i in range(len(row[field])): 272 | # size += len(row[field]) + 3 273 | # if size > lens[inc]: 274 | # 
lens[inc] = size 275 | # else: 276 | # if row[field] is not None and (len(row[field]) > lens[inc]): 277 | # lens[inc] = len(row[field]) 278 | # except: 279 | # pass 280 | # inc += 1 281 | str_list.append('') 282 | headerRowSeparator = '|' 283 | headerRow = '' 284 | loc = 0 285 | for field in fields: 286 | headerRow = headerRow + '' 298 | 299 | headerRow = headerRow + '' 300 | 301 | str_list.append(headerRow) 302 | # str_list.append(headerRowSeparator) 303 | 304 | for row in rows: 305 | inc = 0 306 | recordRow = ' ' 307 | offset = 0 308 | for field in fields: 309 | recordRow = recordRow + '' 338 | 339 | inc += 1 340 | recordRow = recordRow + '' 341 | # recordRow = recordRow + ' | ' 342 | 343 | str_list.append(recordRow) 344 | # print recordRow 345 | str_list.append('
' 287 | # try: 288 | # # for loc in range(len(fields)): 289 | # if isinstance(rows[0][field], int) or isinstance(rows[0][field], float) or isinstance(rows[0][field], long): 290 | # headerRowSeparator = headerRowSeparator + '---:|' 291 | # else: 292 | # headerRowSeparator = headerRowSeparator + ':---|' 293 | # except: 294 | # headerRowSeparator = headerRowSeparator + ':---|' 295 | # loc += 1 296 | headerRow = headerRow + field 297 | headerRow = headerRow + '
' 310 | try: 311 | value = str(row[field]) 312 | # if isinstance(row[field], int) or isinstance(row[field], float) or isinstance(row[field], long): 313 | # recordRow = recordRow + "| " + right(row[field], lens[inc]) 314 | if isinstance(row[field], (dict)): 315 | # recordRow = recordRow + '| ' 316 | # offset = len(recordRow) 317 | it = 0 318 | if len(row[field]) > 0: 319 | recordRow = recordRow + '' 320 | for item in row[field]: 321 | recordRow = recordRow + '' 329 | recordRow = recordRow + '
' 322 | dictItem = str(item) + ':' + str(row[field][item]).replace(',', '
') 323 | if it == len(row[field]) - 1: 324 | recordRow = recordRow + dictItem + '
' 325 | else: 326 | recordRow = recordRow + dictItem + '
' 327 | it += 1 328 | recordRow = recordRow + '
' 330 | # else: 331 | # recordRow = recordRow + ' | ' 332 | else: 333 | recordRow = recordRow + value 334 | except: 335 | pass 336 | 337 | recordRow = recordRow + '
') 346 | # str_list.append(headerRowSeparator) 347 | # print headerRowSeparator 348 | return str_list 349 | -------------------------------------------------------------------------------- /eval_tools/dict_diff.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | import json 4 | import operator 5 | 6 | 7 | def dict_compare(base_dict, check_dict, cfg_spec): 8 | skip = [] 9 | if 'skip' in cfg_spec: 10 | skip = cfg_spec['skip'] 11 | skip_keys = set(skip) 12 | 13 | base_keys = set(base_dict.keys()) - skip_keys 14 | check_keys = set(check_dict.keys()) - skip_keys 15 | intersect_keys = base_keys.intersection(check_keys) 16 | added_keys = check_keys - base_keys 17 | added = {} 18 | for added_key in added_keys: 19 | added[added_key] = check_dict[added_key] 20 | # print "hello" 21 | 22 | removed_keys = base_keys - check_keys 23 | removed = {} 24 | for removed_key in removed_keys: 25 | removed[removed_key] = base_dict[removed_key] 26 | 27 | modified = {o : (base_dict[o], check_dict[o]) for o in intersect_keys if base_dict[o] != check_dict[o]} 28 | env_dep = {} 29 | env_dep_check = [] 30 | if 'environment_dependent' in cfg_spec: 31 | env_dep_check = cfg_spec['environment_dependent'] 32 | for ed in env_dep_check: 33 | if ed in modified.keys(): 34 | env_dep[ed] = modified[ed] 35 | del modified[ed] 36 | 37 | # skip_keys = set(skip) 38 | 39 | # Check if 40 | # del modified[key] 41 | 42 | # same = set(o for o in intersect_keys if base_dict[o] == check_dict[o]) 43 | same = {o: (base_dict[o]) for o in intersect_keys if base_dict[o] == check_dict[o]} 44 | return added, removed, modified, env_dep, same 45 | 46 | -------------------------------------------------------------------------------- /eval_tools/hdp_eval.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | # use this to parse the Ambari Layout Report that's generated with: 4 | # http://${AMBARI_HOST_PORT}/api/v1/clusters/${CLUSTER_NAME}/hosts?fields=Hosts/host_name,host_components,Hosts/ip,Hosts/total_mem,Hosts/os_arch,Hosts/os_type,Hosts/rack_info,Hosts/cpu_count,Hosts/disk_info,metrics/disk,Hosts/ph_cpu_count 5 | 6 | import os 7 | import optparse 8 | import logging 9 | import sys 10 | import json 11 | from common import pprinttable, pprinttable2, pprinthtmltable, writehtmltable 12 | from datetime import date 13 | from os import path 14 | from ambari import * 15 | 16 | VERSION = "0.1.5" 17 | 18 | logger = logging.getLogger('hdp_eval') 19 | 20 | # HOSTS = {} 21 | # SERVICES = {} 22 | # CONTROL = {} 23 | # glayout = {} 24 | # blueprint = {} 25 | cluster_creation_template = {} 26 | 27 | layout_file = '' 28 | bp_file = '' 29 | run_date = '' 30 | stack = '' 31 | 32 | 33 | # A bitmask to associate to a hostgroup 34 | # componentDict = {} 35 | 36 | 37 | def build_field_path_from_abbr(control, abbrs): 38 | fields = [] 39 | paths = {} 40 | for abbr in abbrs: 41 | for group in control: 42 | for component in control[group]: 43 | # print group + ":" + component 44 | if abbr == control[group][component]['abbr']: 45 | # print 'found' 46 | path = [group, component] 47 | paths[abbr] = path 48 | fields.append(abbr) 49 | 50 | return paths, fields 51 | 52 | 53 | def get_hostname(item): 54 | host_info = item["Hosts"] 55 | return host_info["host_name"] 56 | 57 | 58 | def aggregate_components(layout): 59 | services = {} 60 | 61 | items = layout['items'] 62 | 63 | for item in items: 64 | components = item['host_components'] 65 | for component 
in components: 66 | for key, value in component.items(): 67 | if key == 'HostRoles': 68 | for rkey, rvalue in value.items(): 69 | if rkey == 'component_name': 70 | if rvalue in services: 71 | services[rvalue] += 1 72 | else: 73 | services[rvalue] = 1 74 | return services 75 | 76 | 77 | def is_component(item, componentName, componentDict): 78 | # addToComponentDictionary(componentName) 79 | components = item["host_components"] 80 | for component in components: 81 | for ckey, cvalue in component.items(): 82 | if ckey == "HostRoles": 83 | for hkey, hvalue in cvalue.items(): 84 | if hkey == "component_name": 85 | if hvalue == componentName: 86 | return True, componentDict[hvalue] 87 | return False, 0 88 | 89 | 90 | def is_component_x(item, componentName): 91 | found, location = is_component(item, componentName) 92 | if found: 93 | return 'X' 94 | else: 95 | return '' 96 | 97 | 98 | def get_control(controlFile): 99 | control = {} 100 | control = json.loads(open(controlFile).read()) 101 | return control 102 | 103 | 104 | def get_info(layoutFile): 105 | layout = json.loads(open(layoutFile).read()) 106 | items = layout['items'] 107 | 108 | hosttable, compute_count, other_count = gen_hosttable(items) 109 | 110 | return hosttable, compute_count, other_count 111 | 112 | 113 | def append_css(output): 114 | output.write( 115 | '') 122 | 123 | 124 | def report(blueprint, hostMatrix, layout, control, componentDict, output_dir): 125 | index_filename = output_dir + '/index.html' 126 | index_output = open(index_filename, 'w') 127 | append_css(index_output) 128 | writeHeader(index_output) 129 | index_output.write('
') 130 | index_output.write('') 131 | index_output.write('') 132 | index_output.write('') 135 | index_output.write('') 136 | index_output.write('') 139 | index_output.write('') 140 | index_output.write('') 143 | index_output.write('') 144 | index_output.write('') 147 | index_output.write('') 148 | index_output.write('') 151 | index_output.write('') 152 | index_output.write('') 155 | index_output.write('
') 133 | index_output.write('Services') 134 | index_output.write('
') 137 | index_output.write('Count Types') 138 | index_output.write('
') 141 | index_output.write('Host Table') 142 | index_output.write('
') 145 | index_output.write('Host Storage') 146 | index_output.write('
') 149 | index_output.write('Host Memory Allocation') 150 | index_output.write('
') 153 | index_output.write('Hosts json') 154 | index_output.write('
') 156 | 157 | # index_output.write('
    ') 158 | # index_output.write('
  1. Services
  2. ') 159 | # index_output.write('
  3. Count Types
  4. ') 160 | # index_output.write('
  5. Host Table
  6. ') 161 | # index_output.write('
  7. Host Storage
  8. ') 162 | # index_output.write('
  9. Host Memory Allocation
  10. ') 163 | # index_output.write('
  11. Hosts json
  12. ') 164 | # index_output.write('
') 165 | 166 | index_output.close() 167 | 168 | services_filename = output_dir + '/services.html' 169 | services_output = open(services_filename, 'w') 170 | append_css(services_output) 171 | writeHeader(services_output) 172 | rpt_services(layout, services_output) 173 | services_output.close() 174 | 175 | count_types = {} 176 | count_types['Storage'] = ['DATANODE'] 177 | count_types['Compute'] = ['NODEMANAGER'] 178 | count_types['Master'] = ['NAMENODE', 'RESOURCEMANAGER', 'OOZIE_SERVER', 'HIVE_SERVER', 179 | 'HIVE_METASTORE'] 180 | count_types['Kafka'] = ['KAFKA_BROKER'] 181 | count_types['LLAP'] = ['HIVE_SERVER_INTERACTIVE'] 182 | 183 | count_types_filename = output_dir + '/count_types.html' 184 | count_types_output = open(count_types_filename, 'w') 185 | append_css(count_types_output) 186 | writeHeader(count_types_output) 187 | rpt_count_type(blueprint, layout, count_types, count_types_output, componentDict) 188 | count_types_output.close() 189 | 190 | mem_alloc_filename = output_dir + '/mem_alloc.html' 191 | mem_alloc_output = open(mem_alloc_filename, 'w') 192 | append_css(mem_alloc_output) 193 | writeHeader(mem_alloc_output) 194 | rpt_mem_allocations(hostMatrix, control, mem_alloc_output) 195 | mem_alloc_output.close() 196 | 197 | # rpt_mem_allocations() 198 | 199 | hosttable_filename = output_dir + '/hosttable.html' 200 | hosttable_output = open(hosttable_filename, 'w') 201 | append_css(hosttable_output) 202 | writeHeader(hosttable_output) 203 | rpt_hosttable(hostMatrix, control, hosttable_output) 204 | hosttable_output.close() 205 | 206 | # rpt_hosttable() 207 | 208 | # print '' 209 | # print '=======================================' 210 | # print ' Location Details' 211 | # print '---------------------------------------' 212 | # print json.dumps(CONFIGS, indent=4, sort_keys=True) 213 | # print '=======================================' 214 | # print '' 215 | 216 | # TODO: Get Memory Settings and use to find over allocated Hosts. 
217 | hoststorage_filename = output_dir + '/hoststorage.html' 218 | hoststorage_output = open(hoststorage_filename, 'w') 219 | append_css(hoststorage_output) 220 | writeHeader(hoststorage_output) 221 | rpt_hoststorage(hostMatrix, control, hoststorage_output) 222 | hoststorage_output.close() 223 | 224 | # rpt_hoststorage() 225 | 226 | # print '' 227 | # print '=======================================' 228 | # print ' Host Details' 229 | # print '---------------------------------------' 230 | 231 | hostdump_filename = output_dir + '/hosts.json' 232 | hostdump_output = open(hostdump_filename, 'w') 233 | # appendCSS(hostdump_output) 234 | hostdump_output.write(json.dumps(hostMatrix, indent=2, sort_keys=True)) 235 | hostdump_output.close() 236 | 237 | # print json.dumps(HOSTS, indent=2, sort_keys=True) 238 | # print '=======================================' 239 | # print '' 240 | 241 | 242 | def gen_hosttable(items): 243 | records = [] 244 | compute_count = {} 245 | other_count = {} 246 | 247 | for item in items: 248 | record = [] 249 | 250 | hostItem = item["Hosts"] 251 | 252 | record.append(hostItem["host_name"]) 253 | record.append(hostItem["os_type"]) 254 | record.append(hostItem["cpu_count"]) 255 | record.append(hostItem["total_mem"] / (1024 * 1024)) 256 | record.append(hostItem["rack_info"]) 257 | 258 | record.append(is_component_x(item, 'KNOX_GATEWAY')) 259 | record.append(is_component_x(item, 'NAMENODE')) 260 | record.append(is_component_x(item, 'JOURNALNODE')) 261 | record.append(is_component_x(item, "ZKFC")) 262 | record.append(is_component_x(item, "DATANODE")) 263 | record.append(is_component_x(item, "RESOURCEMANAGER")) 264 | record.append(is_component_x(item, "NODEMANAGER")) 265 | record.append(is_component_x(item, "ZOOKEEPER_SERVER")) 266 | record.append(is_component_x(item, "HIVE_METASTORE")) 267 | record.append(is_component_x(item, "HIVE_SERVER")) 268 | record.append(is_component_x(item, "HIVE_SERVER_INTERACTIVE")) 269 | record.append(is_component_x(item, "OOZIE_SERVER")) 270 | record.append(is_component_x(item, "HBASE_MASTER")) 271 | record.append(is_component_x(item, "HBASE_REGIONSERVER")) 272 | record.append(is_component_x(item, "KAFKA_BROKER")) 273 | record.append(is_component_x(item, "NIFI_MASTER")) 274 | 275 | record.append(is_component_x(item, "LIVY2_SERVER")) 276 | record.append(is_component_x(item, "SPARK2_JOBHISTORY")) 277 | 278 | record.append(is_component_x(item, "DRUID_ROUTER")) 279 | record.append(is_component_x(item, "DRUID_OVERLOAD")) 280 | record.append(is_component_x(item, "DRUID_BROKER")) 281 | record.append(is_component_x(item, "DRUID_MIDDLEMANAGER")) 282 | record.append(is_component_x(item, "DRUID_HISTORICAL")) 283 | record.append(is_component_x(item, "DRUID_COORDINATOR")) 284 | 285 | try: 286 | disks = {} 287 | try: 288 | for disk in hostItem['disk_info']: 289 | # diskCount += 1 290 | diskCap = int(disk['size']) / (1024 * 1024) 291 | diskMount = disk['mountpoint'] 292 | diskFormat = disk['type'] 293 | if diskCap in disks: 294 | disks[diskCap]['count'] += 1 295 | if diskFormat not in disks[diskCap]['format']: 296 | disks[diskCap]['format'].append(diskFormat.encode("utf-8")) 297 | disks[diskCap]['mount'].append(diskMount.encode("utf-8")) 298 | else: 299 | disks[diskCap] = {'count': 1, 'size': diskCap, 300 | 'mount': [diskMount.encode("utf-8")], 301 | 'format': [diskFormat.encode("utf-8")]} 302 | record.append(disks) 303 | hostRecord = {'os_type': hostItem["os_type"], 'cpu_count': hostItem["cpu_count"], 304 | 'total_mem': hostItem["total_mem"] / (1024 * 1024), 
305 | 'rack': hostItem["rack_info"], 'components': components, 'disks': disks} 306 | 307 | except: 308 | host_detail = " No host detail information supplied" 309 | except: 310 | hello = "No host information supplied" 311 | 312 | records.append(record) 313 | 314 | compute = is_component(item, "NODEMANAGER") 315 | key = str(compute) + str(record[3]) + str(record[1]) 316 | memory = record[3] 317 | cores = record[1] 318 | if compute and key not in compute_count: 319 | compute_count[key] = {'count': 1, 'memory': memory, 'cores': cores, } 320 | elif compute: 321 | compute_count[key]['count'] += 1 322 | elif not compute and key not in other_count: 323 | other_count[key] = {'count': 1, 'memory': memory, 'cores': cores, } 324 | elif not compute: 325 | other_count[key]['count'] += 1 326 | 327 | # print key + str(memory) + str(cores) 328 | 329 | return records, compute_count, other_count 330 | 331 | 332 | def host_matrix_from_layout(layout, control, componentDict): 333 | # layout = json.loads(open(layoutFile).read()) 334 | items = layout['items'] 335 | 336 | hostMatrix = {} 337 | 338 | services = {} 339 | 340 | for item in items: 341 | # Build total component counts for cluster while examining each item. 342 | # services = aggregateComponents(item) 343 | # add_to_component_dictionary(item, componentDict) 344 | 345 | hostItem = item["Hosts"] 346 | 347 | host = {} 348 | 349 | host['Hostname'] = hostItem['host_name'] 350 | host['OS'] = hostItem['os_type'] 351 | host['vC'] = hostItem['cpu_count'] 352 | host['Gb'] = hostItem['total_mem'] / (1024 * 1024) 353 | host['Rack'] = hostItem['rack_info'] 354 | host['ip'] = hostItem['ip'] 355 | 356 | components = {} 357 | hostGroup = 0 358 | for componentGroup in control: 359 | components[componentGroup] = {} 360 | for cKey in control[componentGroup]: 361 | # print cKey 362 | # print CONTROL[componentGroup][cKey] 363 | found, location = is_component(item, cKey, componentDict) 364 | if location > 0: 365 | hostGroup = hostGroup | location 366 | if found: 367 | cValue = control[componentGroup][cKey] 368 | components[componentGroup].update({cKey: {'abbr': cValue['abbr']}}) 369 | host['components'] = components 370 | host['HostGroupMask'] = get_host_group_mask(item, componentDict) 371 | 372 | disks = {} 373 | # Loop through the disks 374 | try: 375 | for disk in hostItem['disk_info']: 376 | diskCap = int(disk['size']) / (1024 * 1024) 377 | diskMount = disk['mountpoint'] 378 | diskFormat = disk['type'] 379 | if diskCap in disks: 380 | disks[diskCap]['count'] += 1 381 | if diskFormat not in disks[diskCap]['format']: 382 | disks[diskCap]['format'].append(diskFormat.encode("utf-8")) 383 | disks[diskCap]['mount'].append(diskMount.encode("utf-8")) 384 | else: 385 | disks[diskCap] = {'count': 1, 'size': diskCap, 386 | 'mount': [diskMount.encode("utf-8")], 387 | 'format': [diskFormat.encode("utf-8")]} 388 | host['Disks'] = disks 389 | 390 | except: 391 | host_detail = " No host detail information supplied" 392 | 393 | hostMatrix[hostItem['host_name']] = host 394 | return hostMatrix 395 | 396 | 397 | # def calcHostGroupBitMasks(hostgroups): 398 | # for hostgroup in hostgroups: 399 | # hgbitmask = 0 400 | # for component in hostgroup['components']: 401 | # try: 402 | # hgbitmask = hgbitmask | componentDict[component['name']] 403 | # except: 404 | # check = 'Component in Host that is not in the Layouts: ' + component['name'] 405 | # hostgroup['HostGroupMask'] = hgbitmask 406 | 407 | 408 | def rpt_mem_allocations(hostMatrix, control, output): 409 | output.write('\n

Host Memory Allocations

\n') 410 | fields = ['Hostname', 'Gb', 'Allocated', 'Components'] 411 | mem_recs = [] 412 | cluster_total_mem = 0 413 | for hostKey in hostMatrix: 414 | mem_rec = {} 415 | host = hostMatrix[hostKey] 416 | mem_rec['Hostname'] = host['Hostname'] 417 | mem_rec['Gb'] = host['Gb'] 418 | mem_rec_component_heaps = {} 419 | mem_rec['Components'] = {} 420 | for controlKey in control: 421 | for component in control[controlKey]: 422 | for hostGroupKey in host['components']: 423 | if hostGroupKey == controlKey: 424 | for hostComponentKey in host['components'][hostGroupKey]: 425 | mem = {} 426 | try: 427 | mem['heap'] = host['components'][hostGroupKey][hostComponentKey]['heap'] 428 | try: 429 | mem['off.heap'] = host['components'][hostGroupKey][hostComponentKey]['off.heap'] 430 | cluster_total_mem += mem['off.heap'] 431 | except: 432 | # No off.heap information 433 | pass 434 | except: 435 | no_heap = 'No HEAP Information->' + host[ 436 | 'Hostname'] + ':' + component + ':' + hostGroupKey + ':' + hostComponentKey 437 | # print hostComponentKey 438 | if len(mem) > 0: 439 | mem_rec['Components'][hostComponentKey] = mem 440 | # print 'host' 441 | total_mem = 0 442 | for mem_alloc_key in mem_rec['Components']: 443 | mem_type = mem_rec['Components'][mem_alloc_key] 444 | for type in mem_type: 445 | mem_raw = mem_type[type] 446 | try: 447 | mem = int(mem_raw) 448 | except: 449 | mem = int(mem_raw[:-1]) 450 | total_mem += mem 451 | mem_rec['Allocated'] = total_mem / 1024 452 | mem_recs.append(mem_rec) 453 | writehtmltable(mem_recs, fields, output) 454 | output.write("
") 455 | output.write("
") 456 | output.write("

Total Memory Footprint: " + str(cluster_total_mem) + " GB

") 457 | 458 | 459 | def rpt_services(layout, output): 460 | output.write('\n

Service Counts

\n') 461 | lcl_services = [] 462 | fields = ['Service', 'Count'] 463 | services = aggregate_components(layout) 464 | for service in services: 465 | lcl_service = {} 466 | lcl_service['Service'] = service 467 | lcl_service['Count'] = services[service] 468 | lcl_services.append(lcl_service) 469 | writehtmltable(lcl_services, fields, output) 470 | 471 | 472 | def get_hostbase(host, fields): 473 | hostRec = {} 474 | for field in fields: 475 | hostRec[field] = host[field] 476 | return hostRec 477 | 478 | 479 | def populate_components(paths, hostComponents, hostRec): 480 | for pabbr in paths: 481 | path = paths[pabbr] 482 | value = hostComponents[path[0]] 483 | try: 484 | value = hostComponents[path[0]][path[1]] 485 | hostRec[pabbr] = 'X' 486 | except: 487 | pass 488 | 489 | 490 | def rpt_hosttable(hostMatrix, control, output): 491 | output.write('\n

Host Table

\n') 492 | # master = datanode & compute 493 | fields_base = ['Hostname', 'OS', 'vC', 'Gb', 'Rack'] 494 | 495 | paths, bfields = build_field_path_from_abbr(control, ['KX', 'NN', 'JN', 'ZKFC', 'DN', 'RM', 'NM', 496 | 'ZK', 'HMS', 'HS2', 'HS2i', 'OZ', 'HM', 'RS', 497 | 'KB', 'NF', 'LV2', 'S2H', 'DR', 'DO', 'DB', 498 | 'DM', 'DH', 'DH']) 499 | 500 | fields = fields_base + bfields 501 | 502 | hosttable = [] 503 | for hostKey in hostMatrix: 504 | host = hostMatrix[hostKey] 505 | hostRec = get_hostbase(host, fields_base) 506 | populate_components(paths, host['components'], hostRec) 507 | 508 | hosttable.append(hostRec) 509 | 510 | writehtmltable(hosttable, fields, output) 511 | 512 | 513 | def rpt_hoststorage(hostMatrix, control, output): 514 | output.write('\n

Host Storage

\n') 515 | fields_base = ['Hostname', 'vC', 'Gb', 'Rack'] 516 | 517 | paths, bfields = build_field_path_from_abbr(control, ['NN', 'JN', 'DN', 'ZK', 'NM', 'KB', 'NF']) 518 | 519 | fields = fields_base + bfields 520 | fields.append('DataDirs') 521 | fields.append('LogsDirs') 522 | fields.append('Disks') 523 | 524 | hosttable = [] 525 | for hostKey in hostMatrix: 526 | host = hostMatrix[hostKey] 527 | hostRec = get_hostbase(host, fields_base) 528 | populate_components(paths, host['components'], hostRec) 529 | hostRec['DataDirs'] = getDataDirs(host['components']) 530 | hostRec['LogsDirs'] = getLogsDirs(host['components']) 531 | hostRec['Disks'] = host['Disks'] 532 | 533 | hosttable.append(hostRec) 534 | 535 | writehtmltable(hosttable, fields, output) 536 | 537 | 538 | def getDataDirs(components): 539 | dataDirs = {} 540 | for componentKey in components: 541 | for partKey in components[componentKey]: 542 | part = components[componentKey][partKey] 543 | if 'data.dir' in part.keys(): 544 | dataDirs[partKey] = part['data.dir'] 545 | return dataDirs 546 | 547 | 548 | def getLogsDirs(components): 549 | logsDirs = {} 550 | for componentKey in components: 551 | for partKey in components[componentKey]: 552 | part = components[componentKey][partKey] 553 | if 'logs.dir' in part.keys(): 554 | logsDirs[partKey] = part['logs.dir'] 555 | return logsDirs 556 | 557 | 558 | def rpt_count_type(blueprint, layout, types, output, componentDict): 559 | cluster_creation_template = {} 560 | 561 | output.write('\n

Count Types

\n') 562 | # layout = json.loads(open(layoutFile).read()) 563 | items = layout['items'] 564 | 565 | # master = datanode & compute 566 | # type = { category: ['DN','NN']} 567 | table = [] 568 | fields = ['Category', 'Types', 'Count', 'Min Cores', 'Max Cores', 'Min Gb', 'Max Gb'] 569 | for category in types: 570 | type_rec = {} 571 | type_rec['Category'] = category 572 | type_rec['Count'] = 0 573 | type_rec['Types'] = types[category] 574 | type_rec['Min Cores'] = 10000 575 | type_rec['Max Cores'] = 0 576 | type_rec['Min Gb'] = 10000 577 | type_rec['Max Gb'] = 0 578 | table.append(type_rec) 579 | 580 | for item in items: 581 | found = 0 582 | for comp in types[category]: 583 | componentFound, hgbitmask = is_component(item, comp, componentDict) 584 | if componentFound: 585 | found += 1 586 | host = item['Hosts'] 587 | mem = host['total_mem'] / (1024 * 1024) 588 | # CPU Min 589 | if host['cpu_count'] < type_rec['Min Cores']: 590 | type_rec['Min Cores'] = host['cpu_count'] 591 | # CPU Max 592 | if host['cpu_count'] > type_rec['Max Cores']: 593 | type_rec['Max Cores'] = host['cpu_count'] 594 | # Mem Min 595 | if mem < type_rec['Min Gb']: 596 | type_rec['Min Gb'] = mem 597 | # Mem Max 598 | if mem > type_rec['Max Gb']: 599 | type_rec['Max Gb'] = mem 600 | if found == 1: 601 | found += 1; 602 | type_rec['Count'] += 1 603 | 604 | writehtmltable(table, fields, output) 605 | 606 | daemon_count = 0 607 | for cat in table: 608 | if cat['Category'] == 'LLAP' and cat['Count'] > 0: 609 | for config in blueprint['configurations']: 610 | if 'hive-interactive-env' in config.keys(): 611 | daemon_count = int(config['hive-interactive-env']['properties']['num_llap_nodes_for_llap_daemons']) 612 | if config['hive-interactive-env']['properties']['num_llap_nodes'] > daemon_count: 613 | daemon_count = int(config['hive-interactive-env']['properties']['num_llap_nodes']) 614 | break 615 | 616 | if daemon_count > 0: 617 | output.write('\n

LLAP Daemon Count: ' + str(daemon_count) + '

\n') 618 | 619 | output.write('\n

Unique Host Count: ' + str(len(layout['items'])) + '

\n') 620 | 621 | 622 | # Generate Counts for Blueprint Host Groups. 623 | # Go through the Merged Blueprint and count the hosts in each host_group. 624 | hg_table = [] 625 | hg_fields = ['Host Group', 'Count', 'Components', 'Hosts'] 626 | cluster_creation_template['blueprint'] = 'need-to-set-me' 627 | host_groups = blueprint['host_groups'] 628 | cct_host_groups = [] 629 | for host_group in host_groups: 630 | cct_host_group = {} 631 | hgrec = {} 632 | hgrec['Host Group'] = host_group['name'] 633 | cct_host_group['name'] = host_group['name'] 634 | cct_hosts = [] 635 | 636 | hgrec['Count'] = len(host_group['hosts']) 637 | hgrec_components = [] 638 | for comps in host_group['components']: 639 | hgrec_components.append(comps['name']) 640 | hgrec['Components'] = hgrec_components 641 | hgrec_hosts = [] 642 | for hst in host_group['hosts']: 643 | hgrec_hosts.append(hst['hostname']) 644 | cct_host = {} 645 | cct_host['fqdn'] = hst['hostname'] 646 | cct_hosts.append(cct_host) 647 | cct_host_group['hosts'] = cct_hosts 648 | cct_host_groups.append(cct_host_group) 649 | 650 | hgrec['Hosts'] = hgrec_hosts 651 | hg_table.append(hgrec) 652 | cluster_creation_template['host_groups'] = cct_host_groups 653 | 654 | output.write('\n

Ambari Host Group Info

\n') 655 | 656 | writehtmltable(hg_table, hg_fields, output) 657 | 658 | return cluster_creation_template 659 | 660 | 661 | def rpt_totals(hosttable, output): 662 | output.write('\n

Totals

\n') 663 | totalFields = [[0, "Type"], [1, "Count"], [2, "OS"], [3, "CPU-Min"], [4, "CPU-Max"], [5, "Mem-Min"], [6, "Mem-Max"]] 664 | totalType = [] 665 | 666 | datanodes = ["Data Nodes", 0, [], 10000, 0, 100000, 0] 667 | for record in hosttable: 668 | if record[9] == 'X': 669 | datanodes[1] += 1 670 | if (record[1].encode('utf-8') not in datanodes[2]): 671 | datanodes[2].append(record[1].encode('utf-8')) 672 | # CPU Min 673 | if record[2] < datanodes[3]: 674 | datanodes[3] = record[2] 675 | # CPU Max 676 | if record[2] > datanodes[4]: 677 | datanodes[4] = record[2] 678 | # Mem Min 679 | if record[3] < datanodes[5]: 680 | datanodes[5] = record[3] 681 | # Mem Max 682 | if record[3] > datanodes[6]: 683 | datanodes[6] = record[3] 684 | 685 | totalType.append(datanodes) 686 | 687 | computeNodes = ["Compute Nodes", 0, [], 10000, 0, 100000, 0] 688 | for record in hosttable: 689 | if record[11] == 'X': 690 | computeNodes[1] += 1 691 | if (record[1].encode('utf-8') not in computeNodes[2]): 692 | computeNodes[2].append(record[1].encode('utf-8')) 693 | # CPU Min 694 | if record[2] < computeNodes[3]: 695 | computeNodes[3] = record[2] 696 | # CPU Max 697 | if record[2] > computeNodes[4]: 698 | computeNodes[4] = record[2] 699 | # Mem Min 700 | if record[3] < computeNodes[5]: 701 | computeNodes[5] = record[3] 702 | # Mem Max 703 | if record[3] > computeNodes[6]: 704 | computeNodes[6] = record[3] 705 | 706 | totalType.append(computeNodes) 707 | 708 | writehtmltable(totalType, totalFields, output) 709 | 710 | 711 | def writeHeader(output): 712 | global stack 713 | 714 | output.write('') 715 | output.write('') 716 | output.write('') 717 | output.write('') 718 | output.write('') 719 | output.write('') 720 | output.write('') 721 | output.write('') 722 | output.write('') 723 | output.write('') 724 | output.write('') 725 | output.write('') 726 | output.write('') 727 | output.write('') 728 | output.write('') 729 | output.write('
Date' + run_date + '
Stack Version' + stack + '
Blueprint' + bp_file + '
Layout' + layout_file + '
') 730 | 731 | 732 | def main(): 733 | global cluster 734 | global glayout 735 | global layout_file 736 | global bp_file 737 | global run_date 738 | global stack 739 | 740 | parser = optparse.OptionParser(usage="usage: %prog [options]") 741 | 742 | parser.add_option("-l", "--ambari-layout", dest="ambari_layout", help="Ambari Layout File") 743 | parser.add_option("-b", "--ambari-blueprint", dest="ambari_blueprint", help="Ambari Blueprint File") 744 | parser.add_option("-o", "--output-dir", dest="output_dir", help="Output Directory") 745 | 746 | (options, args) = parser.parse_args() 747 | 748 | logger.setLevel(logging.INFO) 749 | formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s') 750 | stdout_handler = logging.StreamHandler(sys.stdout) 751 | stdout_handler.setLevel(logging.INFO) 752 | stdout_handler.setFormatter(formatter) 753 | logger.addHandler(stdout_handler) 754 | 755 | control = get_control(os.path.dirname(os.path.realpath(__file__)) + "/hdp_support/control.json") 756 | 757 | hostMatrix = {} 758 | layout = {} 759 | componentDict = {} 760 | # mergedBlueprint = {} 761 | 762 | # if not (options.ambari_layout): 763 | # print ("Need to specify a 'layout'") 764 | # exit(-1) 765 | 766 | if options.ambari_layout: 767 | layout_file = options.ambari_layout 768 | layout = json.loads(open(options.ambari_layout).read()) 769 | else: 770 | layout_file = options.ambari_blueprint[:-14] + 'layout.json' 771 | if path.exists(layout_file): 772 | layout = json.loads(open(layout_file).read()) 773 | else: 774 | print("Can't locate layout file (based on blueprint filename: " + layout_file) 775 | exit(-1) 776 | 777 | 778 | if options.ambari_blueprint: 779 | bp_file = options.ambari_blueprint 780 | blueprint = json.loads(open(bp_file).read()) 781 | 782 | consolidate_blueprint_host_groups(blueprint, False) 783 | 784 | if layout is not None: 785 | componentDict = get_component_dictionary_from_bp(blueprint) 786 | hostMatrix = host_matrix_from_layout(layout, control, componentDict) 787 | else: 788 | print ("Couldn't find layout file: " + layout_file) 789 | exit(-1) 790 | 791 | # hostgroups = blueprint['host_groups'] 792 | # calc_host_group_bit_masks(hostgroups, componentDict) 793 | 794 | stack = blueprint['Blueprints']['stack_name'] + ' ' + \ 795 | blueprint['Blueprints']['stack_version'] 796 | 797 | mergedBlueprint = merge_configs_with_host_matrix(blueprint, hostMatrix, componentDict, control) 798 | 799 | run_date = str(date.today()) 800 | 801 | output_dir = '' 802 | if options.output_dir: 803 | output_dir = options.output_dir 804 | else: 805 | output_dir = './' + options.ambari_blueprint[:-5] + '_eval' 806 | # run_date + '_' + 807 | 808 | try: 809 | os.stat(output_dir) 810 | except: 811 | os.mkdir(output_dir) 812 | 813 | # if options.ambari_blueprint: 814 | # newblueprint = mergeConfigsWithHostMatrix(options.ambari_blueprint) 815 | # new_bp_filename = options.ambari_blueprint[:-14] + '_bp_host_matrix.json' 816 | # new_bp_output = open(new_bp_filename, 'w') 817 | # new_bp_output.write(json.dumps(mergedBlueprint, indent=2, sort_keys=False)) 818 | # new_bp_output.close() 819 | 820 | report(mergedBlueprint, hostMatrix, layout, control, componentDict, output_dir) 821 | else: 822 | print ("Missing input") 823 | 824 | 825 | main() 826 | -------------------------------------------------------------------------------- /eval_tools/hdp_support/bp_cfg.json: -------------------------------------------------------------------------------- 1 | { 2 | "evaluate_cfgs": [ 3 | { 4 | "ams-hbase-site": { 5 | 
"environment_dependent": [ 6 | "hbase.rootdir", 7 | "hbase.tmp.dir" 8 | ], 9 | "skip": [] 10 | } 11 | }, 12 | { 13 | "hadoop-env": { 14 | "environment_dependent": [ 15 | "hadoop_heapsize", 16 | "dtnode_heapsize", 17 | "namenode_backup_dir", 18 | "hadoop_pid_dir_prefix", 19 | "namenode_opt_newsize", 20 | "namenode_heapsize", 21 | "namenode_opt_maxpermsize", 22 | "namenode_opt_maxnewsize", 23 | "namenode_opt_permsize" 24 | ], 25 | "skip": [] 26 | } 27 | }, 28 | { 29 | "capacity-scheduler": { 30 | "environment_dependent": [ 31 | ], 32 | "skip": [] 33 | } 34 | }, 35 | { 36 | "core-site": { 37 | "environment_dependent": [ 38 | "hadoop.security.key.provider.path", 39 | "ha.zookeeper.quorum", 40 | "fs.protected.directories", 41 | "fs.defaultFS" 42 | ], 43 | "skip": ["hadoop.http.authentication.public.key.pem"] 44 | } 45 | }, 46 | { 47 | "dbks-site": { 48 | "environment_dependent": [ 49 | "ranger.ks.jpa.jdbc.url", 50 | "ranger.ks.kerberos.principal" 51 | ], 52 | "skip": [] 53 | } 54 | }, 55 | { 56 | "hbase-env": { 57 | "environment_dependent": [], 58 | "skip": [] 59 | } 60 | }, 61 | { 62 | "hbase-site": { 63 | "environment_dependent": [ 64 | "hbase.rootdir", 65 | "hbase.master.kerberos.principal", 66 | "hbase.regionserver.kerberos.principal", 67 | "hbase.security.authentication.spnego.kerberos.principal", 68 | "hbase.tmp.dir", 69 | "hbase.zookeeper.quorum", 70 | "phoenix.queryserver.kerberos.principal" 71 | ], 72 | "skip": [] 73 | } 74 | }, 75 | { 76 | "kerberos-env": { 77 | "environment_dependent": [], 78 | "skip": [] 79 | } 80 | }, 81 | { 82 | "hdfs-site": { 83 | "environment_dependent": [ 84 | "dfs.secondary.namenode.kerberos.principal", 85 | "dfs.secondary.namenode.kerberos.internal.spnego.principal", 86 | "dfs.internal.nameservices", 87 | "dfs.encryption.key.provider.uri", 88 | "dfs.datanode.kerberos.principal", 89 | "dfs.datanode.data.dir", 90 | "nfs.kerberos.principal", 91 | "dfs.web.authentication.kerberos.principal", 92 | "dfs.nameservices", 93 | "dfs.namenode.shared.edits.dir", 94 | "dfs.namenode.name.dir", 95 | "dfs.namenode.kerberos.principal", 96 | "dfs.namenode.kerberos.internal.spnego.principal", 97 | "dfs.namenode.checkpoint.dir", 98 | "dfs.journalnode.kerberos.principal", 99 | "dfs.journalnode.kerberos.internal.spnego.principal", 100 | "dfs.journalnode.edits.dir", 101 | "dfs.internal.nameservices", 102 | "dfs.encryption.key.provider.uri", 103 | "dfs.datanode.kerberos.principal", 104 | "dfs.datanode.data.dir" 105 | ], 106 | "skip": ["hadoop.http.authentication.public.key.pem"] 107 | } 108 | }, { 109 | "hive-env": { 110 | "environment_dependent": [ 111 | "hive_log_dir", 112 | "hive.client.heapsize", 113 | "heapsize", 114 | "hive.metastore.heapsize", 115 | "hive_ambari_database", 116 | "hive.heapsize" 117 | ], 118 | "skip": [ 119 | "hive_database_type", 120 | "hive_database", 121 | "hive_database_name", 122 | "hcat_log_dir" 123 | ] 124 | } 125 | }, 126 | 127 | { 128 | "hive-interactive-site": { 129 | "environment_dependent": [ 130 | "hive.server2.zookeeper.namespace", 131 | "hive.server2.tez.default.queues", 132 | "hive.metastore.uris", 133 | "hive.llap.zk.sm.connectionString", 134 | "hive.llap.task.principal", 135 | "hive.llap.io.memory.size", 136 | "hive.llap.daemon.yarn.container.mb", 137 | "hive.llap.daemon.service.principal" 138 | ], 139 | "skip": [] 140 | } 141 | }, 142 | { 143 | "hive-interactive-env": { 144 | "environment_dependent": [ 145 | "hive_heapsize", 146 | "llap_headroom_space", 147 | "llap_heap_size", 148 | "num_llap_nodes", 149 | "num_llap_nodes_for_llap_daemons" 
150 | ], 151 | "skip": [] 152 | } 153 | }, 154 | { 155 | "hiveserver2-site": { 156 | "environment_dependent": [], 157 | "skip": [] 158 | } 159 | }, 160 | { 161 | "hive-site": { 162 | "environment_dependent": [ 163 | "hive.zookeeper.quorum", 164 | "hive.server2.zookeeper.namespace", 165 | "hive.server2.tez.default.queues", 166 | "hive.server2.authentication.spnego.principal", 167 | "hive.server2.authentication.kerberos.principal", 168 | "hive.metastore.uris", 169 | "hive.metastore.kerberos.principal", 170 | "hive.cluster.delegation.token.store.zookeeper.connectString", 171 | "atlas.rest.address" 172 | ], 173 | "skip": [ 174 | "ambari.hive.db.schema.name" 175 | ] 176 | } 177 | }, 178 | { 179 | "kafka-broker": { 180 | "environment_dependent": [ 181 | "sasl.kerberos.principal.to.local.rules", 182 | "zookeeper.connect", 183 | "log.dirs" 184 | ], 185 | "skip": [] 186 | } 187 | }, 188 | { 189 | "kafka-env": { 190 | "environment_dependent": [], 191 | "skip": [] 192 | } 193 | }, 194 | { 195 | "kms-site": { 196 | "environment_dependent": [], 197 | "skip": [] 198 | } 199 | }, 200 | { 201 | "livy2-conf": { 202 | "environment_dependent": [ 203 | "livy.server.launch.kerberos.principal", 204 | "livy.superusers", 205 | "livy.server.auth.kerberos.principal" 206 | ], 207 | "skip": [] 208 | } 209 | }, 210 | { 211 | "mapred-site": { 212 | "environment_dependent": [ 213 | "mapreduce.jobhistory.principal", 214 | "mapreduce.jobhistory.webapp.address", 215 | "mapreduce.jobhistory.webapp.spnego-principal", 216 | "mapreduce.jobhistory.address" 217 | ], 218 | "skip": ["hadoop.http.authentication.public.key.pem"] 219 | } 220 | }, 221 | { 222 | "oozie-site": { 223 | "environment_dependent": [ 224 | "oozie.base.url", 225 | "local.realm" 226 | ], 227 | "skip": ["oozie.authentication.public.key.pem"] 228 | } 229 | }, 230 | { 231 | "ranger-admin-site": { 232 | "environment_dependent": [ 233 | "xasecure.audit.jaas.Client.option.principal", 234 | "ranger.unixauth.service.hostname", 235 | "ranger.truststore.file", 236 | "ranger.ldap.url", 237 | "ranger.ldap.user.dnpattern", 238 | "ranger.sso.providerurl", 239 | "ranger.ldap.base.dn", 240 | "ranger.ldap.bind.dn", 241 | "ranger.ldap.group.searchbase", 242 | "ranger.ldap.group.searchfilter", 243 | "ranger.jpa.jdbc.url", 244 | "ranger.audit.solr.zookeepers", 245 | "ranger.audit.solr.urls" 246 | ], 247 | "skip": [ 248 | "ranger.sso.publicKey" 249 | ] 250 | } 251 | }, 252 | { 253 | "ranger-hive-audit": { 254 | "environment_dependent": [ 255 | "xasecure.audit.jaas.Client.option.principal", 256 | "xasecure.audit.destination.solr.zookeepers", 257 | "xasecure.audit.destination.solr.urls", 258 | "xasecure.audit.destination.hdfs.dir" 259 | ], 260 | "skip": [] 261 | } 262 | }, 263 | { 264 | "ranger-kms-audit": { 265 | "environment_dependent": [ 266 | "xasecure.audit.jaas.Client.option.principal", 267 | "xasecure.audit.destination.solr.zookeepers", 268 | "xasecure.audit.destination.solr.urls", 269 | "xasecure.audit.destination.hdfs.dir" 270 | ], 271 | "skip": [] 272 | } 273 | }, 274 | { 275 | "ranger-tagsync-site": { 276 | "environment_dependent": [ 277 | "ranger.tagsync.source.atlasrest.endpoint" 278 | ], 279 | "skip": [] 280 | } 281 | }, 282 | { 283 | "spark2-defaults": { 284 | "environment_dependent": [ 285 | "spark.sql.hive.hiveserver2.jdbc.url.principal", 286 | "spark.sql.hive.hiveserver2.jdbc.url", 287 | "spark.hadoop.hive.zookeeper.quorum" 288 | ], 289 | "skip": [] 290 | } 291 | }, 292 | { 293 | "spark2-hive-site-override": { 294 | "environment_dependent": [], 295 | "skip": [] 296 
| } 297 | }, 298 | { 299 | "sqoop-site": { 300 | "environment_dependent": [], 301 | "skip": [] 302 | } 303 | }, 304 | { 305 | "tez-interactive-site": { 306 | "environment_dependent": [], 307 | "skip": [] 308 | } 309 | }, 310 | { 311 | "tez-site": { 312 | "environment_dependent": [], 313 | "skip": [] 314 | } 315 | }, 316 | { 317 | "yarn-site": { 318 | "environment_dependent": [ 319 | "yarn.timeline-service.webapp.https.address", 320 | "yarn.timeline-service.webapp.address", 321 | "yarn.timeline-service.principal", 322 | "yarn.timeline-service.leveldb-timeline-store.path", 323 | "yarn.timeline-service.http-authentication.kerberos.principal", 324 | "yarn.timeline-service.address", 325 | "yarn.resourcemanager.zk-address", 326 | "yarn.resourcemanager.webapp.spnego-principal", 327 | "yarn.resourcemanager.webapp.https.address.rm2", 328 | "yarn.resourcemanager.webapp.https.address.rm1", 329 | "yarn.resourcemanager.webapp.address.rm2", 330 | "yarn.resourcemanager.webapp.address.rm1", 331 | "yarn.resourcemanager.webapp.address", 332 | "yarn.resourcemanager.scheduler.address", 333 | "yarn.resourcemanager.resource-tracker.address", 334 | "yarn.resourcemanager.principal", 335 | "yarn.resourcemanager.hostname.rm2", 336 | "yarn.resourcemanager.hostname.rm1", 337 | "yarn.resourcemanager.hostname", 338 | "yarn.resourcemanager.admin.address", 339 | "yarn.nodemanager.webapp.spnego-principal", 340 | "yarn.resourcemanager.address", 341 | "yarn.nodemanager.resource.memory-mb", 342 | "yarn.nodemanager.resource.cpu-vcores", 343 | "yarn.nodemanager.recovery.dir", 344 | "yarn.nodemanager.local-dirs", 345 | "yarn.nodemanager.log-dirs", 346 | "yarn.nodemanager.principal", 347 | "yarn.log.server.web-service.url", 348 | "yarn.log.server.url" 349 | ], 350 | "skip": ["hadoop.http.authentication.public.key.pem"] 351 | } 352 | }, 353 | { 354 | "zoo.cfg": { 355 | "environment_dependent": [ 356 | "dataDir" 357 | ], 358 | "skip": [ 359 | "hello", 360 | "test" 361 | ] 362 | } 363 | } 364 | ], 365 | "global": {} 366 | } -------------------------------------------------------------------------------- /eval_tools/hdp_support/control.json: -------------------------------------------------------------------------------- 1 | { 2 | "GOVERNANCE": { 3 | "KNOX_GATEWAY": { 4 | "name": "Knox", 5 | "abbr": "KX", 6 | "translate-to-cm": false, 7 | "config": { 8 | "section": "", 9 | "configs": { 10 | } 11 | }, 12 | "environment": { 13 | "section": "", 14 | "configs": { 15 | } 16 | } 17 | }, 18 | "INFRA_SOLR": { 19 | "name": "SOLR Infrastructure", 20 | "abbr": "iSOLR", 21 | "translate-to-cm": false, 22 | "config": { 23 | "section": "infra-solr-xml", 24 | "configs": { 25 | } 26 | }, 27 | "environment": { 28 | "section": "infra-solr-env", 29 | "configs": { 30 | "heap": "infra_solr_maxmem", 31 | "user_nofile_limit": "infra_solr_user_nofile_limit", 32 | "user_nproc_limit": "infra_solr_user_nproc_limit", 33 | "log.dir": "infra_solr_log_dir", 34 | "data.dir": "infra_solr_datadir" 35 | } 36 | } 37 | }, 38 | "ATLAS_SERVER": { 39 | "name": "Atlas", 40 | "abbr": "AS", 41 | "translate-to-cm": false, 42 | "config": { 43 | "section": "ranger-atlas-security", 44 | "configs": { 45 | } 46 | }, 47 | "environment": { 48 | "section": "atlas-env", 49 | "configs": { 50 | "heap": "atlas_server_xmx", 51 | "data.dir": "metadata_data_dir", 52 | "log.dir": "metadata_log_dir" 53 | } 54 | } 55 | } 56 | }, 57 | "HADOOP": { 58 | "NAMENODE": { 59 | "name": "Namenode", 60 | "abbr": "NN", 61 | "translate-to-cm": true, 62 | "config": { 63 | "section": "hdfs-site", 64 | 
"configs": { 65 | "data.dir": "dfs.namenode.name.dir", 66 | "dfs.namenode.checkpoint.dir": "dfs.namenode.checkpoint.dir", 67 | "dfs.namenode.handler.count": "dfs.namenode.handler.count", 68 | "dfs.webhdfs.enabled": "dfs.webhdfs.enabled" 69 | } 70 | }, 71 | "environment": { 72 | "section": "hadoop-env", 73 | "configs": { 74 | "heap": "namenode_heapsize", 75 | "backup_dir": "namenode_backup_dir", 76 | "opt_max_size": "namenode_opt_maxnewsize", 77 | "opt_perm_size": "namenode_opt_permsize", 78 | "opt_max_perm_size": "namenode_opt_maxpermsize", 79 | "tmp_dir": "hdfs_tmp_dir", 80 | "user_proc_ulimit": "hdfs_user_nproc_limit", 81 | "user_nofile_ulimit": "hdfs_user_nofile_limit", 82 | "log_dir_prefix": "hdfs_log_dir_prefix" 83 | } 84 | } 85 | }, 86 | "DATANODE": { 87 | "name": "Datanode", 88 | "abbr": "DN", 89 | "translate-to-cm": true, 90 | "config": { 91 | "section": "hdfs-site", 92 | "configs": { 93 | "data.dir": "dfs.datanode.data.dir", 94 | "dfs.datanode.max.transfer.threads": "dfs.datanode.max.transfer.threads", 95 | "dfs.datanode.failed.volumes.tolerated": "dfs.datanode.failed.volumes.tolerated" 96 | } 97 | }, 98 | "environment": { 99 | "section": "hadoop-env", 100 | "configs": { 101 | "heap": "dtnode_heapsize" 102 | } 103 | } 104 | }, 105 | "JOURNALNODE": { 106 | "name": "JournalNode", 107 | "abbr": "JN", 108 | "translate-to-cm": true, 109 | "config": { 110 | "section": "hdfs-site", 111 | "configs": { 112 | "data.dir": "dfs.journalnode.edits.dir" 113 | } 114 | }, 115 | "environment": { 116 | "section": "hadoop-env", 117 | "configs": { 118 | "memory": "hadoop_heapsize" 119 | } 120 | } 121 | }, 122 | "ZKFC": { 123 | "name": "ZK Failover Controller", 124 | "abbr": "ZKFC", 125 | "translate-to-cm": true, 126 | "config": { 127 | "section": "hdfs-site", 128 | "configs": { 129 | } 130 | }, 131 | "environment": { 132 | "section": "", 133 | "configs": { 134 | } 135 | } 136 | }, 137 | "RESOURCEMANAGER": { 138 | "name": "Resource Manager", 139 | "abbr": "RM", 140 | "translate-to-cm": true, 141 | "config": { 142 | "section": "hdfs-site", 143 | "configs": { 144 | } 145 | }, 146 | "environment": { 147 | "section": "", 148 | "configs": { 149 | "heap": "resourcemanager_heapsize" 150 | } 151 | } 152 | }, 153 | "NODEMANAGER": { 154 | "name": "Node Manager", 155 | "abbr": "NM", 156 | "translate-to-cm": true, 157 | "config": { 158 | "section": "yarn-site", 159 | "configs": { 160 | "yarn.nodemanager.local-dirs": "yarn.nodemanager.local-dirs", 161 | "yarn.nodemanager.remote-app-log-dir": "yarn.nodemanager.remote-app-log-dir", 162 | "yarn.nodemanager.log-dirs": "yarn.nodemanager.log-dirs", 163 | "yarn.nodemanager.recovery.dir": "yarn.nodemanager.recovery.dir", 164 | "data.dir": "yarn.nodemanager.local-dirs", 165 | "logs.dir": "yarn.nodemanager.log-dirs", 166 | "off.heap": "yarn.nodemanager.resource.memory-mb" 167 | } 168 | }, 169 | "environment": { 170 | "section": "yarn-env", 171 | "configs": { 172 | "heap": "nodemanager_heapsize", 173 | "log_prefix": "yarn_log_dir_prefix" 174 | } 175 | } 176 | }, 177 | "ZOOKEEPER_SERVER": { 178 | "name": "ZooKeeper Server", 179 | "abbr": "ZK", 180 | "translate-to-cm": true, 181 | "config": { 182 | "section": "zoo.cfg", 183 | "configs": { 184 | "data.dir": "dataDir", 185 | "autopurge.purgeInterval": "autopurge.purgeInterval" 186 | } 187 | }, 188 | "environment": { 189 | "section": "zoo.cfg", 190 | "configs": { 191 | } 192 | } 193 | }, 194 | "HISTORYSERVER": { 195 | "name": "Job History Server", 196 | "abbr": "JHS", 197 | "translate-to-cm": true, 198 | "config": { 199 | 
"section": "hdfs-site", 200 | "configs": { 201 | } 202 | }, 203 | "environment": { 204 | "section": "", 205 | "configs": { 206 | } 207 | } 208 | } 209 | }, 210 | "HIVE": { 211 | "HIVE_METASTORE": { 212 | "name": "Hive Metastore", 213 | "abbr": "HMS", 214 | "translate-to-cm": true, 215 | "config": { 216 | "section": "hive-site", 217 | "configs": { 218 | "hive.metastore.warehouse.dir": "hive.metastore.warehouse.dir" 219 | } 220 | }, 221 | "environment": { 222 | "section": "hive-env", 223 | "configs": { 224 | "heap": "hive.metastore.heapsize", 225 | "log.dir": "hive_log_dir" 226 | } 227 | } 228 | }, 229 | "HIVE_SERVER": { 230 | "name": "Hive Server2", 231 | "abbr": "HS2", 232 | "translate-to-cm": true, 233 | "config": { 234 | "section": "hive-site", 235 | "configs": { 236 | } 237 | }, 238 | "environment": { 239 | "section": "hive-env", 240 | "configs": { 241 | "heap": "hive.heapsize" 242 | } 243 | } 244 | }, 245 | "HIVE_SERVER_INTERACTIVE": { 246 | "name": "Hive Server2 Interactive", 247 | "abbr": "HS2i", 248 | "translate-to-cm": false, 249 | "config": { 250 | "section": "hive-interactive-site", 251 | "configs": { 252 | } 253 | }, 254 | "environment": { 255 | "section": "hive-interactive-env", 256 | "configs": { 257 | "heap": "hive.heapsize" 258 | } 259 | } 260 | } 261 | }, 262 | "PIPELINE": { 263 | "OOZIE_SERVER": { 264 | "name": "Oozie", 265 | "abbr": "OZ", 266 | "translate-to-cm": true, 267 | "config": { 268 | "section": "oozie-site", 269 | "configs": { 270 | } 271 | }, 272 | "environment": { 273 | "section": "oozie-env", 274 | "configs": { 275 | "heap": "oozie_heapsize", 276 | "data.dir": "oozie_data_dir", 277 | "tmp.dir": "oozie_tmp_dir" 278 | } 279 | } 280 | }, 281 | "NIFI_MASTER": { 282 | "name": "NiFi", 283 | "abbr": "NF", 284 | "translate-to-cm": false, 285 | "config": { 286 | "section": "nifi-properties", 287 | "configs": { 288 | } 289 | }, 290 | "environment": { 291 | "section": "nifi-env", 292 | "configs": { 293 | "log.dir": "nifi_node_log_dir" 294 | } 295 | } 296 | }, 297 | "KAFKA_BROKER": { 298 | "name": "Kafka Broker", 299 | "abbr": "KB", 300 | "translate-to-cm": true, 301 | "config": { 302 | "section": "kafka-broker", 303 | "configs": { 304 | "data.dir": "log.dirs" 305 | } 306 | }, 307 | "environment": { 308 | "section": "kafka-env", 309 | "configs": { 310 | "log.dir": "kafka_log_dir", 311 | "user_nofile_limit": "kafka_user_nofile_limit", 312 | "user_nproc_limit": "kafka_user_nproc_limit" 313 | } 314 | } 315 | } 316 | }, 317 | "HBASE": { 318 | "HBASE_MASTER": { 319 | "name": "HBase Master", 320 | "abbr": "HM", 321 | "translate-to-cm": true, 322 | "config": { 323 | "section": "hbase-site", 324 | "configs": { 325 | "data.dir": "hbase.rootdir" 326 | } 327 | }, 328 | "environment": { 329 | "section": "hbase-env", 330 | "configs": { 331 | "heap": "hbase_master_heapsize", 332 | "log.dir": "hbase_log_dir" 333 | } 334 | } 335 | }, 336 | "HBASE_REGIONSERVER": { 337 | "name": "Region Server", 338 | "abbr": "RS", 339 | "translate-to-cm": true, 340 | "config": { 341 | "section": "hbase-site", 342 | "configs": { 343 | "local.dir": "hbase.local.dir", 344 | "tmp.dir": "hbase.tmp.dir", 345 | "handler.count": "hbase.regionserver.handler.count" 346 | } 347 | }, 348 | "environment": { 349 | "section": "hbase-env", 350 | "configs": { 351 | "heap": "hbase_regionserver_heapsize", 352 | "off.heap": "hbase_max_direct_memory_size", 353 | "user_nproc_limit": "hbase_user_nproc_limit", 354 | "user_nofile_limit": "hbase_user_nofile_limit" 355 | } 356 | } 357 | } 358 | }, 359 | "SPARK": { 360 | 
"LIVY2_SERVER": { 361 | "name": "Livy2 Server", 362 | "abbr": "L2S", 363 | "translate-to-cm": true, 364 | "config": { 365 | "section": "hdfs-site", 366 | "configs": { 367 | } 368 | }, 369 | "environment": { 370 | "section": "", 371 | "configs": { 372 | } 373 | } 374 | }, 375 | "SPARK2_JOBHISTORYSERVER": { 376 | "name": "Spark2 JobHistory Server", 377 | "abbr": "S2JHS", 378 | "translate-to-cm": true, 379 | "config": { 380 | "section": "hdfs-site", 381 | "configs": { 382 | } 383 | }, 384 | "environment": { 385 | "section": "", 386 | "configs": { 387 | } 388 | } 389 | }, 390 | "SPARK_JOBHISTORYSERVER": { 391 | "name": "Spark JobHistory Server", 392 | "abbr": "SJHS", 393 | "translate-to-cm": false, 394 | "config": { 395 | "section": "hdfs-site", 396 | "configs": { 397 | } 398 | }, 399 | "environment": { 400 | "section": "", 401 | "configs": { 402 | } 403 | } 404 | } 405 | }, 406 | "DRUID": { 407 | "DRUID_ROUTER": { 408 | "name": "Druid Router", 409 | "abbr": "DR", 410 | "translate-to-cm": false, 411 | "config": { 412 | "section": "druid-router", 413 | "configs": { 414 | } 415 | }, 416 | "environment": { 417 | "section": "druid-env", 418 | "configs": { 419 | "heap": "druid.router.jvm.heap.memory", 420 | "off.heap": "druid.router.jvm.direct.memory" 421 | } 422 | } 423 | }, 424 | "DRUID_OVERLOAD": { 425 | "name": "Druid Overload", 426 | "abbr": "DO", 427 | "translate-to-cm": false, 428 | "config": { 429 | "section": "druid-overload", 430 | "configs": { 431 | "data_dir": "" 432 | } 433 | }, 434 | "environment": { 435 | "section": "druid-env", 436 | "configs": { 437 | "heap": "druid.overlord.jvm.heap.memory", 438 | "off.heap": "druid.overlord.jvm.direct.memory" 439 | } 440 | } 441 | }, 442 | "DRUID_BROKER": { 443 | "name": "Druid Broker", 444 | "abbr": "DB", 445 | "translate-to-cm": false, 446 | "config": { 447 | "section": "hdfs-site", 448 | "configs": { 449 | "data_dir": "" 450 | } 451 | }, 452 | "environment": { 453 | "section": "druid-env", 454 | "configs": { 455 | "heap": "druid.broker.jvm.heap.memory", 456 | "off.heap": "druid.broker.jvm.direct.memory" 457 | } 458 | } 459 | }, 460 | "DRUID_MIDDLEMANAGER": { 461 | "name": "Druid Middle Manager", 462 | "abbr": "DM", 463 | "translate-to-cm": false, 464 | "config": { 465 | "section": "hdfs-site", 466 | "configs": { 467 | } 468 | }, 469 | "environment": { 470 | "section": "druid-env", 471 | "configs": { 472 | "heap": "druid.middlemanager.jvm.heap.memory", 473 | "off.heap": "druid.middlemanager.jvm.direct.memory" 474 | } 475 | } 476 | }, 477 | "DRUID_HISTORICAL": { 478 | "name": "Druid Historical", 479 | "abbr": "DH", 480 | "translate-to-cm": false, 481 | "config": { 482 | "section": "hdfs-site", 483 | "configs": { 484 | } 485 | }, 486 | "environment": { 487 | "section": "druid-env", 488 | "configs": { 489 | "heap": "druid.historical.jvm.heap.memory", 490 | "off.heap": "druid.historical.jvm.direct.memory" 491 | } 492 | } 493 | }, 494 | "DRUID_COORDINATOR": { 495 | "name": "Druid Coordinator", 496 | "abbr": "DC", 497 | "translate-to-cm": false, 498 | "config": { 499 | "section": "hdfs-site", 500 | "configs": { 501 | } 502 | }, 503 | "environment": { 504 | "section": "druid-env", 505 | "configs": { 506 | "heap": "druid.coordinator.jvm.heap.memory", 507 | "off.heap": "druid.coordinator.jvm.direct.memory" 508 | } 509 | } 510 | } 511 | } 512 | } -------------------------------------------------------------------------------- /eval_tools/hdp_support/host_group_control.json: -------------------------------------------------------------------------------- 
1 | { 2 | "scale": { 3 | "small": { 4 | "masters": { 5 | "master_1": { 6 | "cardinality": 1, 7 | "services": [ 8 | "NAMENODE", 9 | "RESOURCEMANAGER", 10 | "ZOOKEEPER", 11 | "HISTORYSERVER", 12 | "HIVE_METASTORE", 13 | "HIVE_SERVER", 14 | "HBASE_MASTER", 15 | "OOZIE_SERVER" 16 | ] 17 | } 18 | }, 19 | "workers": { 20 | "worker_1": { 21 | "cardinality": 3, 22 | "services": [ 23 | "DATANODE", 24 | "NODEMANAGER", 25 | "HBASE_REGIONSERVER" 26 | ] 27 | } 28 | } 29 | }, 30 | "medium-ha": { 31 | "ha-services": [ 32 | "NAMENODE", 33 | "RESOURCEMANAGER" 34 | ], 35 | "masters": { 36 | "master_1": { 37 | "cardinality": 2, 38 | "services": [ 39 | "NAMENODE", 40 | "ZKFC", 41 | "RESOURCEMANAGER", 42 | "HBASE_MASTER", 43 | "ZOOKEEPER" 44 | ] 45 | }, 46 | "master_2": { 47 | "cardinality": 1, 48 | "services": [ 49 | "ZOOKEEPER", 50 | "HISTORYSERVER", 51 | "HIVE_METASTORE", 52 | "HIVE_SERVER", 53 | "HBASE_MASTER", 54 | "OOZIE_SERVER" 55 | ] 56 | } 57 | }, 58 | "workers": { 59 | "worker_1": { 60 | "cardinality": 3, 61 | "services": [ 62 | "DATANODE", 63 | "NODEMANAGER", 64 | "HBASE_REGIONSERVER" 65 | ] 66 | } 67 | } 68 | } 69 | } 70 | } -------------------------------------------------------------------------------- /eval_tools/hdp_support/sub_hosts_default.json: -------------------------------------------------------------------------------- 1 | { 2 | "hosts": [ 3 | { 4 | "host": "os02.streever.local", 5 | "rack_info": "/default" 6 | }, 7 | { 8 | "host": "os03.streever.local", 9 | "rack_info": "/default" 10 | }, 11 | { 12 | "host": "os04.streever.local", 13 | "rack_info": "/default" 14 | }, 15 | { 16 | "host": "os05.streever.local", 17 | "rack_info": "/default" 18 | }, 19 | { 20 | "host": "os06.streever.local", 21 | "rack_info": "/default" 22 | }, 23 | { 24 | "host": "os07.streever.local", 25 | "rack_info": "/default" 26 | }, 27 | { 28 | "host": "os10.streever.local", 29 | "rack_info": "/default" 30 | }, 31 | { 32 | "host": "os11.streever.local", 33 | "rack_info": "/default" 34 | }, 35 | { 36 | "host": "os12.streever.local", 37 | "rack_info": "/default" 38 | }, 39 | { 40 | "host": "os13.streever.local", 41 | "rack_info": "/default" 42 | }, 43 | { 44 | "host": "os14.streever.local", 45 | "rack_info": "/default" 46 | }, 47 | { 48 | "host": "os15.streever.local", 49 | "rack_info": "/default" 50 | }, 51 | { 52 | "host": "os16.streever.local", 53 | "rack_info": "/default" 54 | }, 55 | { 56 | "host": "os17.streever.local", 57 | "rack_info": "/default" 58 | }, 59 | { 60 | "host": "os18.streever.local", 61 | "rack_info": "/default" 62 | }, 63 | { 64 | "host": "os19.streever.local", 65 | "rack_info": "/default" 66 | }, 67 | { 68 | "host": "os20.streever.local", 69 | "rack_info": "/default" 70 | }, 71 | { 72 | "host": "os21.streever.local", 73 | "rack_info": "/default" 74 | }, 75 | { 76 | "host": "os22.streever.local", 77 | "rack_info": "/default" 78 | }, 79 | { 80 | "host": "os23.streever.local", 81 | "rack_info": "/default" 82 | }, 83 | { 84 | "host": "os24.streever.local", 85 | "rack_info": "/default" 86 | }, 87 | { 88 | "host": "os25.streever.local", 89 | "rack_info": "/default" 90 | }, 91 | { 92 | "host": "os26.streever.local", 93 | "rack_info": "/default" 94 | }, 95 | { 96 | "host": "os27.streever.local", 97 | "rack_info": "/default" 98 | }, 99 | { 100 | "host": "os28.streever.local", 101 | "rack_info": "/default" 102 | }, 103 | { 104 | "host": "os29.streever.local", 105 | "rack_info": "/default" 106 | }, 107 | { 108 | "host": "os30.streever.local", 109 | "rack_info": "/default" 110 | }, 111 | { 112 | "host": 
"os31.streever.local", 113 | "rack_info": "/default" 114 | }, 115 | { 116 | "host": "os32.streever.local", 117 | "rack_info": "/default" 118 | }, 119 | { 120 | "host": "os33.streever.local", 121 | "rack_info": "/default" 122 | }, 123 | { 124 | "host": "os34.streever.local", 125 | "rack_info": "/default" 126 | }, 127 | { 128 | "host": "os35.streever.local", 129 | "rack_info": "/default" 130 | }, 131 | { 132 | "host": "os36.streever.local", 133 | "rack_info": "/default" 134 | }, 135 | { 136 | "host": "os37.streever.local", 137 | "rack_info": "/default" 138 | }, 139 | { 140 | "host": "os38.streever.local", 141 | "rack_info": "/default" 142 | }, 143 | { 144 | "host": "os39.streever.local", 145 | "rack_info": "/default" 146 | }, 147 | { 148 | "host": "os40.streever.local", 149 | "rack_info": "/default" 150 | }, 151 | { 152 | "host": "os41.streever.local", 153 | "rack_info": "/default" 154 | } 155 | ] 156 | } -------------------------------------------------------------------------------- /eval_tools/suite.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | # This tool will run the suite of evaluation tools, as long as 4 | # the file naming conventions are followed. See the individual 5 | # tool readme's for details. 6 | 7 | EXT_BP="-blueprint" 8 | 9 | for BP_FILE in `ls *${EXT_BP}.json`; do 10 | echo ${BP_FILE} 11 | 12 | echo "Running Ambari Diff Tool" 13 | ambari_cfg_diff.py -c "${BP_FILE}" 14 | 15 | echo "Running Ambari Blueprint v2 Tool" 16 | ambari_bp_tool.py -b "${BP_FILE}" 17 | 18 | echo "Running Cluster Eval Report" 19 | hdp_eval.py -b "${BP_FILE}" 20 | 21 | # Parse the '-blueprint.json' from the filename 22 | BP=${BP_FILE:0:${#BP_FILE}-5} 23 | # Build the Ambari v2 Blueprint filename. 24 | BP_V2="${BP}-v2-generated.json" 25 | CM_FILE="${BP:0:${#BP}-9}cm_template.json" 26 | 27 | echo "Converting Ambari Blueprint v2 to CM Environment Template" 28 | am2cm.sh --blueprint-file "${BP_V2}" --deployment-template-file "${CM_FILE}" 29 | 30 | done 31 | -------------------------------------------------------------------------------- /hive-sre/README.md: -------------------------------------------------------------------------------- 1 | # Hive SRE Tooling 2 | 3 | This project has been migrated / promoted to [Cloudera Labs](https://github.com/cloudera-labs/hive-sre). Please follow link for updates and distributions. -------------------------------------------------------------------------------- /migration/HiveOnTEZ.puml: -------------------------------------------------------------------------------- 1 | @startuml 2 | 3 | [*] --> UsingLLAP 4 | 5 | Workload : Type 6 | UsingLLAP --> PVC : YES 7 | 8 | PVC: Is PVC Available? 
--------------------------------------------------------------------------------
/hive-sre/README.md:
--------------------------------------------------------------------------------
1 | # Hive SRE Tooling
2 | 
3 | This project has been migrated / promoted to [Cloudera Labs](https://github.com/cloudera-labs/hive-sre). Please follow the link for updates and distributions.
--------------------------------------------------------------------------------
/migration/HiveOnTEZ.puml:
--------------------------------------------------------------------------------
1 | @startuml
2 | 
3 | [*] --> UsingLLAP
4 | 
5 | Workload : Type
6 | UsingLLAP --> PVC : YES
7 | 
8 | PVC: Is PVC Available?
9 | PVC --> BASE: NO
10 | PVC --> CDW: YES
11 | PVC --> BASE: Eventually
12 | 
13 | BASE --> CDW: Once Configured
14 | 
15 | BASE: CDP Base Cluster
16 | CDW: CDP PVC
17 | CDW: Running LLAP in Containers
18 | 
19 | BASE --> HiveBase
20 | HiveBase : Running On YARN
21 | 
22 | 
23 | @enduml
24 | 
--------------------------------------------------------------------------------
/migration/LLAP_Migration.puml:
--------------------------------------------------------------------------------
1 | @startuml
2 | 'https://plantuml.com/activity-diagram-beta
3 | 
4 | 
5 | fork
6 | :CDW is Available;
7 | if (Consumer) then (hive)
8 | if (Job is Scheduled and Predictable) then (yes)
9 | :Consider running on CDP Base / Datahub;
10 | end
11 | else (no)
12 | split
13 | :ETL Workloads;
14 | :Run CDW in Isolation Mode;
15 | end
16 | split again
17 | :Interactive Workloads;
18 | :Determine CDW Instance sizing (other jobs using same instance?);
19 | :Understand Elastic Needs and Max PEAK compute needs to size CDW instance;
20 | :Run Hive on TEZ in CDW (LLAP);
21 | end
22 | end split
23 | endif
24 | else (spark)
25 | if (filesystem access to data) then (no)
26 | :Use HWC to Submit HiveSQL;
27 | if (RESULTSET is) then (small)
28 | :Use HWC JDBC_CLIENT mode;
29 | end
30 | else (large)
31 | :Use HWC STAGING_OUTPUT mode;
32 | end
33 | endif
34 | else (yes)
35 | if (sparkSql compatible) then (yes)
36 | :Use Native SparkSQL;
37 | end
38 | else (no)
39 | :Use HWC;
40 | :Grant Filesystem Access;
41 | if (DATASET is) then (small to large)
42 | :Use SparkSQL with HWC DIRECT_READER mode;
43 | end
44 | else (x-large)
45 | :Use HiveSQL with HWC STAGING_OUTPUT mode;
46 | end
47 | endif
48 | endif
49 | endif
50 | endif
51 | fork again
52 | :Only CDP Base is Available;
53 | :Isolate an HS2 Instance
54 | - Tune for pre-warmed Containers
55 | - Associate to dedicated YARN Queue
56 | - Ensure preemption configured to allow
57 | Queue to reclaim resources;
58 | :Run Hive on TEZ on CDP Base / Datahub;
59 | end
60 | end fork
61 | 
62 | @enduml
63 | 
--------------------------------------------------------------------------------
/migration/LLAP_Migration_dot.puml:
--------------------------------------------------------------------------------
1 | @startuml
2 | 
3 | digraph llap {
4 | LLAP [label="HDP LLAP"]
5 | Spark [label="Spark Workload"]
6 | Hive [label="Hive Workload"]
7 | 
8 | LLAP ->
9 | 
10 | subgraph cluster_cdpbase {
11 | label="CDP Base";
12 | HOT [label="Hive on TEZ"];
13 | YARN [label="YARN"];
14 | 
15 | HOT -> YARN [label="Runs on"];
16 | }
17 | 
18 | subgraph cluster_cdppvc {
19 | label="CDP Private Cloud";
20 | 
21 | CDW_Hive [label="Hive LLAP"]
22 | 
23 | }
24 | 
25 | }
26 | 
27 | @enduml
28 | 
--------------------------------------------------------------------------------
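The activity diagram above routes Spark consumers that lack direct filesystem access to the data through the Hive Warehouse Connector (HWC). As a hedged illustration only (not from this repo; it assumes the HWC jar and the `pyspark_llap` package are on the Spark classpath, that `spark.sql.hive.hiveserver2.jdbc.url` points at your HiveServer2, and the table name is made up), submitting HiveSQL from PySpark through HWC looks roughly like:

```
from pyspark.sql import SparkSession
from pyspark_llap import HiveWarehouseSession

spark = SparkSession.builder.appName("hwc-example").getOrCreate()

# Build an HWC session; the HiveSQL executes in HiveServer2/LLAP and the
# result comes back as a Spark DataFrame.
hive = HiveWarehouseSession.session(spark).build()
df = hive.executeQuery("SELECT * FROM sales.transactions LIMIT 100")
df.show()
```

Which of the modes named in the diagram (JDBC_CLIENT, DIRECT_READER, STAGING_OUTPUT) actually moves the data is governed by HWC's Spark configuration and differs between CDP releases, so check the HWC documentation for the property names that apply to your version.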