├── .gitignore ├── LICENSE ├── README.md ├── Vagrantfile ├── conf └── elasticsearch.yml.erb ├── lib ├── elasticsearch-module.rb ├── elasticsearch-script.rb └── upgrade.sh └── scripts ├── node-attach ├── node-restart ├── node-start ├── node-status └── node-stop /.gitignore: -------------------------------------------------------------------------------- 1 | .vagrant/ 2 | conf/* 3 | logs/ 4 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2017 Yannick Pereira-Reis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | vagrant-elasticsearch-cluster 2 | ============================= 3 | 4 | **[NOT MAINTAINED]** 5 | 6 | Create an ElasticSearch cluster with a single bash command: 7 | 8 | ``` 9 | vagrant up 10 | ``` 11 | 12 | **Programs, plugins, libs and versions information** 13 | 14 | | Program, plugin, lib | Version | How to use it | 15 | | --------------------------------- | ----------- | ----------------------------------------- | 16 | | ElasticSearch | 1.4.3 | [http://www.elasticsearch.org/guide/](http://www.elasticsearch.org/guide/) | 17 | | Java (openjdk-7-jre) | 1.7.0_25 | | 18 | | elasticsearch-image | 1.2.0 | [https://github.com/kzwang/elasticsearch-image](https://github.com/kzwang/elasticsearch-image) | 19 | | elasticsearch-mapper-attachments | 2.4.2 | [https://github.com/elasticsearch/elasticsearch-mapper-attachments](https://github.com/elasticsearch/elasticsearch-mapper-attachments) | 20 | | rssriver (david pilato) | 1.3.0 | [http://www.pilato.fr/rssriver/](http://www.pilato.fr/rssriver/) | 21 | | elasticsearch-river-jdbc | 1.4.0.9 | [https://github.com/jprante/elasticsearch-river-jdbc](https://github.com/jprante/elasticsearch-river-jdbc) | 22 | | elasticsearch-river-rabbitmq | 2.4.1 | [https://github.com/elasticsearch/elasticsearch-river-rabbitmq](https://github.com/elasticsearch/elasticsearch-river-rabbitmq) | 23 | | elasticsearch-river-twitter | 2.4.2 | [https://github.com/elasticsearch/elasticsearch-river-twitter](https://github.com/elasticsearch/elasticsearch-river-twitter) | 24 | | elasticsearch-river-wikipedia | 2.4.1 | [https://github.com/elasticsearch/elasticsearch-river-wikipedia](https://github.com/elasticsearch/elasticsearch-river-wikipedia) | 25 | 26 | These plugins are only installed through the `bin/plugin -i` command. You must configure everything else yourself.
27 | 28 | **Cluster default configuration** 29 | 30 | | Configuration | Value(s) | 31 | | -------------------------- | ---------------------------------------------------- | 32 | | Cluster name | My amazing ES cluster | 33 | | Node names | thor, zeus, isis, shifu, baal | 34 | | VM names | vm1, vm2, vm3, vm4, vm5 | 35 | | Default cluster network IP | 10.0.0.0 | 36 | 37 | 38 | 1.Installation and requirements 39 | -- 40 | 41 | **Must have on your local machine** 42 | 43 | * VirtualBox (latest version) 44 | * Vagrant (>=1.5) 45 | * cURL (or another REST client to talk to ES) 46 | 47 | **Clone this repository** 48 | 49 | git clone git@github.com:ypereirareis/vagrant-elasticsearch-cluster.git 50 | 51 | **WARNING** 52 | 53 | You'll need enough RAM to run VMs in your cluster. 54 | Each new VM launched within your cluster will have 512M of RAM allocated. 55 | You can change this configuration in the Vagrantfile once cloned. 56 | 57 | 2.How to run a new ElasticSearch cluster 58 | -- 59 | 60 | **Important** 61 | 62 | The maximum number of VMs running in the cluster is 5. 63 | It is possible to run more than 5, but a test environment cluster rarely needs it, 64 | and much more RAM would be required. 65 | If you still want to use more than 5 VMs, 66 | you will have to add/edit your own configuration files in the [conf](conf) directory. 67 | 68 | **Run the cluster** 69 | 70 | Simply go into the cloned directory (vagrant-elasticsearch-cluster by default). 71 | Execute this command: 72 | 73 | ``` 74 | vagrant up 75 | ``` 76 | 77 | By default, this command boots 5 VMs with the cluster name `My amazing ES cluster`, `512M` of RAM for each node, and the network IP address `10.0.0.0`.
78 | 79 | You can change the cluster size with the `CLUSTER_COUNT` variable: 80 | 81 | ``` 82 | CLUSTER_COUNT=3 vagrant up 83 | ``` 84 | 85 | You can change the cluster name with the `CLUSTER_NAME` variable: 86 | 87 | ``` 88 | CLUSTER_NAME='My awesome cluster' vagrant up 89 | ``` 90 | 91 | You can change the cluster RAM used for each node with the `CLUSTER_RAM` variable: 92 | 93 | ``` 94 | CLUSTER_RAM=1024 vagrant up 95 | ``` 96 | 97 | You can change the cluster network IP address with the `CLUSTER_IP_PATTERN` variable: 98 | 99 | ``` 100 | CLUSTER_IP_PATTERN='172.16.15.%d' vagrant up 101 | ``` 102 | 103 | Providing the `CLUSTER_NAME`, `CLUSTER_COUNT`, `CLUSTER_RAM`, `CLUSTER_IP_PATTERN` variables is only required when you first start the cluster. 104 | Vagrant will save/cache these values so you can run other commands without repeating yourself. 105 | 106 | Of course you can use all these variables at the same time : 107 | 108 | ``` 109 | $ CLUSTER_NAME='My awesome search engine' CLUSTER_IP_PATTERN='172.16.25.%d' CLUSTER_COUNT=3 CLUSTER_RAM=512 vagrant status 110 | ---------------------------------------------------------- 111 | Your ES cluster configurations 112 | ---------------------------------------------------------- 113 | Cluster Name: My awesome search engine 114 | Cluster size: 3 115 | Cluster network IP: 172.16.25.0 116 | Cluster RAM (for each node): 512 117 | ---------------------------------------------------------- 118 | ---------------------------------------------------------- 119 | Current machine states: 120 | 121 | vm1 not created (virtualbox) 122 | vm2 not created (virtualbox) 123 | vm3 not created (virtualbox) 124 | 125 | ... 126 | ``` 127 | 128 | The names of the VMs will follow the following pattern: `vm[0-9]+`. 129 | The trailing number represents the index of the VM, starting at 1. 130 | 131 | ElasticSearch instance is started during provisioning of the VM. 132 | The command is launched into a new screen as root user inside the vagrant. 
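As a rough illustration of the behaviour described above, the following Ruby sketch mirrors how a setting might be resolved (environment variable first, then the value cached under `.vagrant/`, then the default) and how VM names and IPs follow from the VM index. `resolve_setting` and `vm_layout` are hypothetical helper names used only for this example; the cache file names and defaults are copied from `lib/elasticsearch-module.rb`, and the real code there may differ in detail.

```ruby
# Environment variable => [cache file name under .vagrant/, default value].
# Values copied from lib/elasticsearch-module.rb; defaults kept as strings here.
SETTINGS = {
  'CLUSTER_NAME'       => ['cluster_name', 'My amazing ES cluster'],
  'CLUSTER_IP_PATTERN' => ['cluster_ip',   '10.0.0.%d'],
  'CLUSTER_COUNT'      => ['cluster_size', '5'],
  'CLUSTER_RAM'        => ['cluster_ram',  '512']
}.freeze

# Hypothetical helper: resolve one setting in the order
# ENV -> cached file -> built-in default.
def resolve_setting(env_key, env: ENV, cache_dir: '.vagrant')
  cache_name, default = SETTINGS.fetch(env_key)
  return env[env_key] if env[env_key]

  cache_file = File.join(cache_dir, cache_name)
  return File.read(cache_file).strip if File.exist?(cache_file)

  default
end

# Hypothetical helper: VM names are vm1, vm2, ... and each IP is the
# pattern filled with (10 + index), so vm1 gets the .11 address.
def vm_layout(count, ip_pattern)
  (1..count).map { |index| ["vm#{index}", format(ip_pattern, 10 + index)] }
end
```

For example, `vm_layout(2, '172.16.10.%d')` returns `[["vm1", "172.16.10.11"], ["vm2", "172.16.10.12"]]`.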
133 | 134 | Once the cluster is launched (please wait a few seconds), go to: [http://10.0.0.11:9200](http://10.0.0.11:9200) 135 | 136 | Plugin URLs (replace the IP if you changed it with the `CLUSTER_IP_PATTERN` var): 137 | 138 | * [http://10.0.0.11:9200/_plugin/marvel](http://10.0.0.11:9200/_plugin/marvel) 139 | * [http://10.0.0.11:9200/_plugin/paramedic/](http://10.0.0.11:9200/_plugin/paramedic/) 140 | * [http://10.0.0.11:9200/_plugin/head/](http://10.0.0.11:9200/_plugin/head/) 141 | * [http://10.0.0.11:9200/_plugin/bigdesk](http://10.0.0.11:9200/_plugin/bigdesk) 142 | * [http://10.0.0.11:9200/_plugin/HQ/](http://10.0.0.11:9200/_plugin/HQ/) 143 | 144 | The default configuration (HTTP enabled for all nodes) allows you to use any of your VM IPs. 145 | If one (or more) of your nodes fails, try another IP to see what happened. 146 | 147 | By default the cluster nodes have an IP following the pattern "10.0.0.%d", as you can see in [lib/elasticsearch-module.rb](lib/elasticsearch-module.rb). 148 | 149 | But you can change it using an ENV var: 150 | 151 | ``` 152 | CLUSTER_COUNT=2 CLUSTER_IP_PATTERN='172.16.10.%d' vagrant up 153 | ``` 154 | 155 | * This command will start 2 ES instances with IPs like: 172.16.10.11, 172.16.10.12. 156 | * :warning: Before that, you must verify that config files (conf/vm*) do not exist or delete them. 157 | * These files need to be rewritten with the new IPs. 158 | 159 | You will see output like this: 160 | 161 | ``` 162 | $ CLUSTER_COUNT=2 CLUSTER_IP_PATTERN='172.16.10.%d' vagrant up 163 | Cluster size: 2 164 | Cluster IP: 172.16.10.0 165 | Bringing machine 'vm1' up with 'virtualbox' provider... 166 | Bringing machine 'vm2' up with 'virtualbox' provider... 167 | 168 | ``` 169 | 170 | You can now access the nodes like this: [http://172.16.10.11:9200](http://172.16.10.11:9200) 171 | 172 | **Stop the cluster** 173 | 174 | ``` 175 | vagrant halt 176 | ``` 177 | 178 | This will stop the whole cluster.
If you want to only stop one VM, you can use: 179 | 180 | ``` 181 | vagrant halt vm2 182 | ``` 183 | 184 | This will stop the `vm2` instance. 185 | 186 | **Destroy the cluster** 187 | 188 | ``` 189 | vagrant destroy 190 | ``` 191 | 192 | This will destroy the whole cluster. If you want to destroy only one VM, you can use: 193 | 194 | ``` 195 | vagrant destroy vm2 196 | ``` 197 | 198 | **Remove the base box** 199 | 200 | ``` 201 | vagrant box remove ypereirareis/debian-elasticsearch-amd64 202 | ``` 203 | 204 | This will remove your local copy of the vagrant base-box. 205 | 206 | :warning: If you destroy a VM, I suggest destroying the whole cluster so that all of your nodes run the same ES version. 207 | 208 | **Managing ElasticSearch instances** 209 | 210 | Each VM has its own ElasticSearch instance running in a `screen` session named `elastic`. 211 | Once connected to the VM, you can manage this instance with the following commands: 212 | 213 | * `(sudo) node-start`: starts the ES instance 214 | * `(sudo) node-stop`: stops the ES instance 215 | * `(sudo) node-restart`: restarts the ES instance 216 | * `(sudo) node-status`: displays the ES instance's status 217 | * `(sudo) node-attach`: brings you to the screen session hosting the ES instance. Use `^Ad` to detach. 218 | 219 | After `node-attach`, you should be in the screen session hosting ElasticSearch and see its log. 220 | 221 | The first launch of the ES instance is done as root by Vagrant provisioning, 222 | so you should prepend `sudo` to each command above. 223 | You can also start an ES instance as the 'vagrant' user from the VM. 224 | 225 | ``` 226 | vagrant ssh vmX 227 | sudo node-stop 228 | node-start 229 | ``` 230 | 231 | This chain of commands will log you into a chosen VM, 232 | stop the ES 'root-user' instance and start a 'vagrant-user' ES instance.
233 | 234 | 3.Configure your cluster 235 | -- 236 | 237 | If you need or want to change the default working configuration of your cluster, 238 | you can do so by adding/editing elasticsearch.yml files in conf/vmX/elasticsearch.yml. 239 | Each node's configuration is shared with its VM through this "conf" directory. 240 | 241 | By default, these configuration files are **auto-generated** by Vagrant when running the cluster for the first time. 242 | In this case, the default values listed at the top of this page are used. 243 | 244 | 245 | 4.ElasticSearch plugins inside the base box 246 | -- 247 | 248 | * elasticsearch-head - [https://github.com/mobz/elasticsearch-head](https://github.com/mobz/elasticsearch-head) 249 | * elasticsearch-paramedic - [https://github.com/karmi/elasticsearch-paramedic](https://github.com/karmi/elasticsearch-paramedic) 250 | * BigDesk - [https://github.com/lukas-vlcek/bigdesk](https://github.com/lukas-vlcek/bigdesk) 251 | * Marvel - [http://www.elasticsearch.org/overview/marvel/](http://www.elasticsearch.org/overview/marvel/) 252 | * ElasticsearchHQ - [http://www.elastichq.org/](http://www.elastichq.org/) 253 | 254 | 255 | 5.Working with your cluster 256 | -- 257 | 258 | **Create a "subscriptions" index with 5 shards and 2 replicas** 259 | 260 | ``` 261 | curl -XPUT 'http://10.0.0.11:9200/subscriptions/' -d '{ 262 | "settings" : { 263 | "number_of_shards" : 5, 264 | "number_of_replicas" : 2 265 | } 266 | }' 267 | ``` 268 | 269 | **Index a "subscription" document inside the "subscriptions" index** 270 | 271 | ``` 272 | curl -XPUT 'http://10.0.0.11:9200/subscriptions/subscription/1' -d '{ 273 | "user" : "ypereirareis", 274 | "post_date" : "2014-03-26T14:12:12", 275 | "message" : "Trying out vagrant elasticsearch cluster" 276 | }' 277 | ``` 278 | 279 | You can now perform any action/request supported by the Elasticsearch API (index, get, delete, bulk, ...).
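The two `curl` calls above can also be issued from Ruby with only the standard library. The sketch below is one possible way to do it, not part of this project: `build_put`, `create_index_request`, `index_document_request` and `send_request` are hypothetical helper names, and `ES_URL` assumes the default IP pattern (first node at `10.0.0.11`).

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumption: default CLUSTER_IP_PATTERN, so vm1 listens on 10.0.0.11:9200.
ES_URL = 'http://10.0.0.11:9200'

# Build (but do not send) a PUT request carrying a JSON body.
def build_put(path, payload)
  request = Net::HTTP::Put.new(path, 'Content-Type' => 'application/json')
  request.body = JSON.generate(payload)
  request
end

# Same payload as the first curl example: index settings only.
def create_index_request(index, shards:, replicas:)
  settings = { 'number_of_shards' => shards, 'number_of_replicas' => replicas }
  build_put("/#{index}/", { 'settings' => settings })
end

# Same URL layout as the second curl example: /index/type/id.
def index_document_request(index, type, id, document)
  build_put("/#{index}/#{type}/#{id}", document)
end

# Send a prepared request to one node and return the Net::HTTPResponse.
def send_request(request, base_url = ES_URL)
  uri = URI(base_url)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end
```

With the cluster running, something like `send_request(create_index_request('subscriptions', shards: 5, replicas: 2))` should be equivalent to the first `curl` call. Building the request separately from sending it keeps the payloads easy to inspect without a live cluster.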
280 | 281 | 6.Vagrant 282 | -- 283 | 284 | You can use every vagrant command to manage your cluster and VMs. 285 | This project is simply made to launch a working ES cluster with a single command, using vagrant/virtualbox virtual machines. 286 | 287 | Use it to test every configuration/queries you want (split brain, unicast, recovery, indexing, sharding) 288 | 289 | 7.Important 290 | -- 291 | 292 | Do forks, PR, and MRs !!!! 293 | 294 | 8.TODO 295 | -- 296 | 297 | * Add extra plugins or applications in the base box (redis, logstash, kibana, ...) 298 | * Add some configurations to illustrate split brain, unicast discovery, load balancing, snapshots, recovery... 299 | * Add possibility to configure cluster name, RAM per node AND hostnames through the shell (ENV vars) 300 | 301 | LICENSE 302 | -- 303 | 304 | The MIT License (MIT) 305 | 306 | Copyright (c) 2017 Yannick Pereira-Reis 307 | 308 | Permission is hereby granted, free of charge, to any person obtaining a copy 309 | of this software and associated documentation files (the "Software"), to deal 310 | in the Software without restriction, including without limitation the rights 311 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 312 | copies of the Software, and to permit persons to whom the Software is 313 | furnished to do so, subject to the following conditions: 314 | 315 | The above copyright notice and this permission notice shall be included in all 316 | copies or substantial portions of the Software. 317 | 318 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 319 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 320 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 321 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 322 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 323 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 324 | SOFTWARE. 325 | -------------------------------------------------------------------------------- /Vagrantfile: -------------------------------------------------------------------------------- 1 | # -*- mode: ruby -*- 2 | # vi: set ft=ruby : 3 | require 'erb' 4 | require_relative 'lib/elasticsearch-module.rb' 5 | require_relative 'lib/elasticsearch-script.rb' 6 | 7 | utils = Vagrant::ElastiSearchCluster::Util.new 8 | 9 | Vagrant.configure("2") do |config| 10 | 11 | utils.manage_and_print_config 12 | 13 | nodes_number = utils.get_cluster_info 'cluster_count' 14 | nodes_number = nodes_number.to_i 15 | 16 | cluster_ram = utils.get_cluster_info 'cluster_ram' 17 | cluster_ram = cluster_ram.to_i 18 | 19 | config.vm.box = 'ypereirareis/debian-elasticsearch-amd64' 20 | config.vm.synced_folder ".", "/vagrant", :id => "vagrant-root", :mount_options => ['dmode=777', 'fmode=777'] 21 | 22 | config.vm.provider 'virtualbox' do |vbox| 23 | vbox.customize ['modifyvm', :id, '--memory', cluster_ram] 24 | vbox.customize ['modifyvm', :id, '--cpus', 1] 25 | end 26 | 27 | (1..nodes_number).each do |index| 28 | name = utils.get_vm_name index 29 | node_name = utils.get_node_name index 30 | ip = utils.get_vm_ip index 31 | primary = (index.eql? 
1) 32 | 33 | utils.build_config index 34 | 35 | config.vm.define :"#{name}", primary: primary do |node| 36 | node.vm.hostname = "#{node_name}.es.dev" 37 | node.vm.network 'private_network', ip: ip, auto_config: true 38 | node.vm.provision 'shell', inline: @node_start_inline_script % [name, node_name, ip] 39 | end 40 | end 41 | utils.logger.info "----------------------------------------------------------" 42 | end 43 | -------------------------------------------------------------------------------- /conf/elasticsearch.yml.erb: -------------------------------------------------------------------------------- 1 | ##################### Elasticsearch Configuration Example ##################### 2 | 3 | # This file contains an overview of various configuration settings, 4 | # targeted at operations staff. Application developers should 5 | # consult the guide at . 6 | # 7 | # The installation procedure is covered at 8 | # . 9 | # 10 | # Elasticsearch comes with reasonable defaults for most settings, 11 | # so you can try it out without bothering with configuration. 12 | # 13 | # Most of the time, these defaults are just fine for running a production 14 | # cluster. If you're fine-tuning your cluster, or wondering about the 15 | # effect of certain configuration option, please _do ask_ on the 16 | # mailing list or IRC channel [http://elasticsearch.org/community]. 17 | 18 | # Any element in the configuration can be replaced with environment variables 19 | # by placing them in ${...} notation. For example: 20 | # 21 | # node.rack: ${RACK_ENV_VAR} 22 | 23 | # For information on supported formats and syntax for the config file, see 24 | # 25 | 26 | 27 | ################################### Cluster ################################### 28 | 29 | # Cluster name identifies your cluster for auto-discovery. If you're running 30 | # multiple clusters on the same network, make sure you're using unique names. 
31 | # 32 | cluster.name: "<%= @cluster_name.strip %>" 33 | 34 | 35 | #################################### Node ##################################### 36 | 37 | # Node names are generated dynamically on startup, so you're relieved 38 | # from configuring them manually. You can tie this node to a specific name: 39 | # 40 | node.name: "<%= @node_name.strip %>" 41 | 42 | # Every node can be configured to allow or deny being eligible as the master, 43 | # and to allow or deny to store the data. 44 | # 45 | # Allow this node to be eligible as a master node (enabled by default): 46 | # 47 | node.master: true 48 | # 49 | # Allow this node to store data (enabled by default): 50 | # 51 | node.data: true 52 | 53 | # You can exploit these settings to design advanced cluster topologies. 54 | # 55 | # 1. You want this node to never become a master node, only to hold data. 56 | # This will be the "workhorse" of your cluster. 57 | # 58 | # node.master: false 59 | # node.data: true 60 | # 61 | # 2. You want this node to only serve as a master: to not store any data and 62 | # to have free resources. This will be the "coordinator" of your cluster. 63 | # 64 | # node.master: true 65 | # node.data: false 66 | # 67 | # 3. You want this node to be neither master nor data node, but 68 | # to act as a "search load balancer" (fetching data from nodes, 69 | # aggregating results, etc.) 70 | # 71 | # node.master: false 72 | # node.data: false 73 | 74 | # Use the Cluster Health API [http://localhost:9200/_cluster/health], the 75 | # Node Info API [http://localhost:9200/_nodes] or GUI tools 76 | # such as , 77 | # , 78 | # and 79 | # to inspect the cluster state. 80 | 81 | # A node can have generic attributes associated with it, which can later be used 82 | # for customized shard allocation filtering, or allocation awareness. 
An attribute 83 | # is a simple key value pair, similar to node.key: value, here is an example: 84 | # 85 | # node.rack: rack314 86 | 87 | # By default, multiple nodes are allowed to start from the same installation location 88 | # to disable it, set the following: 89 | # node.max_local_storage_nodes: 1 90 | 91 | 92 | #################################### Index #################################### 93 | 94 | # You can set a number of options (such as shard/replica options, mapping 95 | # or analyzer definitions, translog settings, ...) for indices globally, 96 | # in this file. 97 | # 98 | # Note, that it makes more sense to configure index settings specifically for 99 | # a certain index, either when creating it or by using the index templates API. 100 | # 101 | # See and 102 | # 103 | # for more information. 104 | 105 | # Set the number of shards (splits) of an index (5 by default): 106 | # 107 | index.number_of_shards: 5 108 | 109 | # Set the number of replicas (additional copies) of an index (1 by default): 110 | # 111 | index.number_of_replicas: 2 112 | 113 | # Note, that for development on a local machine, with small indices, it usually 114 | # makes sense to "disable" the distributed features: 115 | # 116 | # index.number_of_shards: 1 117 | # index.number_of_replicas: 0 118 | 119 | # These settings directly affect the performance of index and search operations 120 | # in your cluster. Assuming you have enough machines to hold shards and 121 | # replicas, the rule of thumb is: 122 | # 123 | # 1. Having more *shards* enhances the _indexing_ performance and allows to 124 | # _distribute_ a big index across machines. 125 | # 2. Having more *replicas* enhances the _search_ performance and improves the 126 | # cluster _availability_. 127 | # 128 | # The "number_of_shards" is a one-time setting for an index. 129 | # 130 | # The "number_of_replicas" can be increased or decreased anytime, 131 | # by using the Index Update Settings API. 
132 | # 133 | # Elasticsearch takes care about load balancing, relocating, gathering the 134 | # results from nodes, etc. Experiment with different settings to fine-tune 135 | # your setup. 136 | 137 | # Use the Index Status API () to inspect 138 | # the index status. 139 | 140 | 141 | #################################### Paths #################################### 142 | 143 | # Path to directory containing configuration (this file and logging.yml): 144 | # 145 | # path.conf: /vagrant 146 | 147 | # Path to directory where to store index data allocated for this node. 148 | # 149 | # path.data: /path/to/data 150 | # 151 | # Can optionally include more than one location, causing data to be striped across 152 | # the locations (a la RAID 0) on a file level, favouring locations with most free 153 | # space on creation. For example: 154 | # 155 | # path.data: /path/to/data1,/path/to/data2 156 | 157 | # Path to temporary files: 158 | # 159 | # path.work: /path/to/work 160 | 161 | # Path to log files: 162 | # 163 | # path.logs: /path/to/logs 164 | 165 | # Path to where plugins are installed: 166 | # 167 | # path.plugins: /path/to/plugins 168 | 169 | 170 | #################################### Plugin ################################### 171 | 172 | # If a plugin listed here is not installed for current node, the node will not start. 173 | # 174 | # plugin.mandatory: mapper-attachments,lang-groovy 175 | 176 | 177 | ################################### Memory #################################### 178 | 179 | # Elasticsearch performs poorly when JVM starts swapping: you should ensure that 180 | # it _never_ swaps. 181 | # 182 | # Set this property to true to lock the memory: 183 | # 184 | # bootstrap.mlockall: true 185 | 186 | # Make sure that the ES_MIN_MEM and ES_MAX_MEM environment variables are set 187 | # to the same value, and that the machine has enough memory to allocate 188 | # for Elasticsearch, leaving enough memory for the operating system itself. 
189 | # 190 | # You should also make sure that the Elasticsearch process is allowed to lock 191 | # the memory, eg. by using `ulimit -l unlimited`. 192 | 193 | 194 | ############################## Network And HTTP ############################### 195 | 196 | # Elasticsearch, by default, binds itself to the 0.0.0.0 address, and listens 197 | # on port [9200-9300] for HTTP traffic and on port [9300-9400] for node-to-node 198 | # communication. (the range means that if the port is busy, it will automatically 199 | # try the next port). 200 | 201 | # Set the bind address specifically (IPv4 or IPv6): 202 | # 203 | # network.bind_host: 192.168.0.1 204 | 205 | # Set the address other nodes will use to communicate with this node. If not 206 | # set, it is automatically derived. It must point to an actual IP address. 207 | # 208 | # network.publish_host: 192.168.0.1 209 | 210 | # Set both 'bind_host' and 'publish_host': 211 | # 212 | network.host: <%= @node_ip.strip %> 213 | 214 | # Set a custom port for the node to node communication (9300 by default): 215 | # 216 | transport.tcp.port: 9300 217 | 218 | # Enable compression for all communication between nodes (disabled by default): 219 | # 220 | # transport.tcp.compress: true 221 | 222 | # Set a custom port to listen for HTTP traffic: 223 | # 224 | http.port: 9200 225 | 226 | # Set a custom allowed content length: 227 | # 228 | # http.max_content_length: 100mb 229 | 230 | # Disable HTTP completely: 231 | # 232 | # http.enabled: false 233 | 234 | 235 | ################################### Gateway ################################### 236 | 237 | # The gateway allows for persisting the cluster state between full cluster 238 | # restarts. Every change to the state (such as adding an index) will be stored 239 | # in the gateway, and when the cluster starts up for the first time, 240 | # it will read its state from the gateway. 241 | 242 | # There are several types of gateway implementations. For more information, see 243 | # . 
244 | 245 | # The default gateway type is the "local" gateway (recommended): 246 | # 247 | # gateway.type: local 248 | 249 | # Settings below control how and when to start the initial recovery process on 250 | # a full cluster restart (to reuse as much local data as possible when using shared 251 | # gateway). 252 | 253 | # Allow recovery process after N nodes in a cluster are up: 254 | # 255 | # gateway.recover_after_nodes: 1 256 | 257 | # Set the timeout to initiate the recovery process, once the N nodes 258 | # from previous setting are up (accepts time value): 259 | # 260 | # gateway.recover_after_time: 5m 261 | 262 | # Set how many nodes are expected in this cluster. Once these N nodes 263 | # are up (and recover_after_nodes is met), begin recovery process immediately 264 | # (without waiting for recover_after_time to expire): 265 | # 266 | # gateway.expected_nodes: 2 267 | 268 | 269 | ############################# Recovery Throttling ############################# 270 | 271 | # These settings allow to control the process of shards allocation between 272 | # nodes during initial recovery, replica allocation, rebalancing, 273 | # or when adding and removing nodes. 274 | 275 | # Set the number of concurrent recoveries happening on a node: 276 | # 277 | # 1. During the initial recovery 278 | # 279 | # cluster.routing.allocation.node_initial_primaries_recoveries: 4 280 | # 281 | # 2. During adding/removing nodes, rebalancing, etc 282 | # 283 | # cluster.routing.allocation.node_concurrent_recoveries: 2 284 | 285 | # Set to throttle throughput when recovering (eg. 
100mb, by default 20mb): 286 | # 287 | # indices.recovery.max_bytes_per_sec: 20mb 288 | 289 | # Set to limit the number of open concurrent streams when 290 | # recovering a shard from a peer: 291 | # 292 | # indices.recovery.concurrent_streams: 5 293 | 294 | 295 | ################################## Discovery ################################## 296 | 297 | # Discovery infrastructure ensures nodes can be found within a cluster 298 | # and master node is elected. Multicast discovery is the default. 299 | 300 | # Set to ensure a node sees N other master eligible nodes to be considered 301 | # operational within the cluster. Its recommended to set it to a higher value 302 | # than 1 when running more than 2 nodes in the cluster. 303 | # 304 | # discovery.zen.minimum_master_nodes: 1 305 | 306 | # Set the time to wait for ping responses from other nodes when discovering. 307 | # Set this option to a higher value on a slow or congested network 308 | # to minimize discovery failures: 309 | # 310 | # discovery.zen.ping.timeout: 3s 311 | 312 | # For more information, see 313 | # 314 | 315 | # Unicast discovery allows to explicitly control which nodes will be used 316 | # to discover the cluster. It can be used when multicast is not present, 317 | # or to restrict the cluster communication-wise. 318 | # 319 | # 1. Disable multicast discovery (enabled by default): 320 | # 321 | # discovery.zen.ping.multicast.enabled: false 322 | # 323 | # 2. Configure an initial list of master nodes in the cluster 324 | # to perform discovery when new nodes (master or data) are started: 325 | # 326 | # discovery.zen.ping.unicast.hosts: ["host1", "host2:port"] 327 | 328 | # EC2 discovery allows to use AWS EC2 API in order to perform discovery. 329 | # 330 | # You have to install the cloud-aws plugin for enabling the EC2 discovery. 331 | # 332 | # For more information, see 333 | # 334 | # 335 | # See 336 | # for a step-by-step tutorial. 
337 | 338 | # GCE discovery allows to use Google Compute Engine API in order to perform discovery. 339 | # 340 | # You have to install the cloud-gce plugin for enabling the GCE discovery. 341 | # 342 | # For more information, see . 343 | 344 | # Azure discovery allows to use Azure API in order to perform discovery. 345 | # 346 | # You have to install the cloud-azure plugin for enabling the Azure discovery. 347 | # 348 | # For more information, see . 349 | 350 | ################################## Slow Log ################################## 351 | 352 | # Shard level query and fetch threshold logging. 353 | 354 | #index.search.slowlog.threshold.query.warn: 10s 355 | #index.search.slowlog.threshold.query.info: 5s 356 | #index.search.slowlog.threshold.query.debug: 2s 357 | #index.search.slowlog.threshold.query.trace: 500ms 358 | 359 | #index.search.slowlog.threshold.fetch.warn: 1s 360 | #index.search.slowlog.threshold.fetch.info: 800ms 361 | #index.search.slowlog.threshold.fetch.debug: 500ms 362 | #index.search.slowlog.threshold.fetch.trace: 200ms 363 | 364 | #index.indexing.slowlog.threshold.index.warn: 10s 365 | #index.indexing.slowlog.threshold.index.info: 5s 366 | #index.indexing.slowlog.threshold.index.debug: 2s 367 | #index.indexing.slowlog.threshold.index.trace: 500ms 368 | 369 | ################################## GC Logging ################################ 370 | 371 | #monitor.jvm.gc.young.warn: 1000ms 372 | #monitor.jvm.gc.young.info: 700ms 373 | #monitor.jvm.gc.young.debug: 400ms 374 | 375 | #monitor.jvm.gc.old.warn: 10s 376 | #monitor.jvm.gc.old.info: 5s 377 | #monitor.jvm.gc.old.debug: 2s 378 | 379 | marvel.agent.enabled: <%= @node_marvel_enabled %> 380 | marvel.agent.exporter.es.hosts: ["<%= @cluster_ip.strip % 11 %>:9200"] 381 | 382 | -------------------------------------------------------------------------------- /lib/elasticsearch-module.rb: -------------------------------------------------------------------------------- 1 | module Vagrant 2 | module 
ElastiSearchCluster 3 | class Util 4 | attr_accessor :logger 5 | 6 | def initialize 7 | @params = [ 8 | 'cluster_name' => ['CLUSTER_NAME', 'cluster_name', 'My amazing ES cluster'], 9 | 'cluster_ip' => ['CLUSTER_IP_PATTERN', 'cluster_ip', '10.0.0.%d'], 10 | 'cluster_count' => ['CLUSTER_COUNT', 'cluster_size', 5], 11 | 'cluster_ram' => ['CLUSTER_RAM', 'cluster_ram', 512], 12 | ] 13 | 14 | @names = %w(thor zeus isis shifu baal) 15 | @logger = Vagrant::UI::Colored.new 16 | @logger.opts[:color] = :white 17 | end 18 | 19 | def get_vm_name(index) 20 | "vm#{index}" 21 | end 22 | 23 | def get_vm_ip(index) 24 | ip = get_cluster_info 'cluster_ip' 25 | ip % (10 + index) 26 | end 27 | 28 | def get_node_name(index) 29 | @names[index - 1] 30 | end 31 | 32 | def get_cluster_info(index) 33 | return ENV[@params[0][index][0]] if ENV[@params[0][index][0]] 34 | return (File.read ".vagrant/#{@params[0][index][1]}") if File.exist? ".vagrant/#{@params[0][index][1]}" 35 | "#{@params[0][index][2]}" 36 | end 37 | 38 | def save_cluster_info(index, value) 39 | Dir.mkdir('.vagrant') unless Dir.exist?('.vagrant') 40 | File.open(".vagrant/#{@params[0][index][1]}", 'w') do |file| 41 | file.puts value.to_s 42 | end 43 | end 44 | 45 | def get_config_template 46 | config_file = File.open('conf/elasticsearch.yml.erb', 'r') 47 | ERB.new(config_file.read) 48 | end 49 | 50 | def build_config(index) 51 | vm = get_vm_name index 52 | conf_file_format = "conf/elasticsearch-#{vm}.yml" 53 | 54 | File.open(conf_file_format, 'w') do |file| 55 | @node_ip = get_vm_ip index 56 | @node_name = get_node_name index 57 | @node_marvel_enabled = (index == 1) 58 | @cluster_ip = get_cluster_info 'cluster_ip' 59 | @cluster_name = get_cluster_info 'cluster_name' 60 | 61 | @logger.info "Building configuration for #{vm}" 62 | file.puts self.get_config_template.result(binding) 63 | end unless File.exist? 
conf_file_format 64 | end 65 | 66 | def manage_and_print_config 67 | self.logger.info "----------------------------------------------------------" 68 | self.logger.info " Your ES cluster configurations" 69 | self.logger.info "----------------------------------------------------------" 70 | 71 | # Building and showing CLUSTER NAME information 72 | index = 'cluster_name' 73 | cluster_name = self.get_cluster_info index 74 | self.logger.info "Cluster Name: #{cluster_name.strip}" 75 | self.save_cluster_info index, cluster_name 76 | 77 | # Building and showing CLUSTER COUNT information 78 | index = 'cluster_count' 79 | nodes_number = self.get_cluster_info index 80 | self.logger.info "Cluster size: #{nodes_number.strip}" 81 | self.save_cluster_info index, nodes_number 82 | 83 | # Building and showing CLUSTER IP PATTERN information 84 | index = 'cluster_ip' 85 | cluster_network_ip = self.get_cluster_info index 86 | self.logger.info "Cluster network IP: #{cluster_network_ip.strip % 0}" 87 | self.save_cluster_info index, cluster_network_ip 88 | 89 | # Building and showing CLUSTER RAM information 90 | index = 'cluster_ram' 91 | cluster_ram = self.get_cluster_info index 92 | self.logger.info "Cluster RAM (for each node): #{cluster_ram.strip}" 93 | self.save_cluster_info index, cluster_ram 94 | 95 | self.logger.info "----------------------------------------------------------" 96 | end 97 | end 98 | end 99 | end 100 | 101 | -------------------------------------------------------------------------------- /lib/elasticsearch-script.rb: -------------------------------------------------------------------------------- 1 | @node_start_inline_script = <