├── .gitignore
├── LICENSE
├── README.md
├── Vagrantfile
├── conf
    └── elasticsearch.yml.erb
├── lib
    ├── elasticsearch-module.rb
    ├── elasticsearch-script.rb
    └── upgrade.sh
└── scripts
    ├── node-attach
    ├── node-restart
    ├── node-start
    ├── node-status
    └── node-stop


/.gitignore:
--------------------------------------------------------------------------------
1 | .vagrant/
2 | conf/*
3 | logs/
4 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | The MIT License (MIT)
 2 | 
 3 | Copyright (c) 2017 Yannick Pereira-Reis
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | vagrant-elasticsearch-cluster
  2 | =============================
  3 | 
  4 | **[NOT MAINTAINED]**
  5 | 
  6 | Create an ElasticSearch cluster with a single bash command :
  7 | 
  8 | ```
  9 | vagrant up
 10 | ```
 11 | 
 12 | **Programs, plugins, libs and versions information**
 13 | 
 14 | | Program, plugin, lib              | Version     | How to use it                             |
 15 | | --------------------------------- | ----------- | ----------------------------------------- |
 16 | | ElasticSearch                     | 1.4.3       | [http://www.elasticsearch.org/guide/](http://www.elasticsearch.org/guide/) |
 17 | | Java (openjdk-7-jre)              | 1.7.0_25    |                                           |
 18 | | elasticsearch-image               | 1.2.0       | [https://github.com/kzwang/elasticsearch-image](https://github.com/kzwang/elasticsearch-image) |
 19 | | elasticsearch-mapper-attachments  | 2.4.2       | [https://github.com/elasticsearch/elasticsearch-mapper-attachments](https://github.com/elasticsearch/elasticsearch-mapper-attachments) |
 20 | | rssriver (david pilato)           | 1.3.0       | [http://www.pilato.fr/rssriver/](http://www.pilato.fr/rssriver/) |
 21 | | elasticsearch-river-jdbc          | 1.4.0.9     | [https://github.com/jprante/elasticsearch-river-jdbc](https://github.com/jprante/elasticsearch-river-jdbc) |
 22 | | elasticsearch-river-rabbitmq      | 2.4.1       | [https://github.com/elasticsearch/elasticsearch-river-rabbitmq](https://github.com/elasticsearch/elasticsearch-river-rabbitmq) |
 23 | | elasticsearch-river-twitter       | 2.4.2       | [https://github.com/elasticsearch/elasticsearch-river-twitter](https://github.com/elasticsearch/elasticsearch-river-twitter) |
 24 | | elasticsearch-river-wikipedia     | 2.4.1       | [https://github.com/elasticsearch/elasticsearch-river-wikipedia](https://github.com/elasticsearch/elasticsearch-river-wikipedia) |
 25 | 
 26 | These plugins are only installed, through the `bin/plugin -i` command. You must configure everything else yourself.
 27 | 
 28 | **Cluster default configuration**
 29 | 
 30 | | Configuration              |  Value(s)                                            |
 31 | | -------------------------- | ---------------------------------------------------- |
 32 | | Cluster name               | My amazing ES cluster                                |
 33 | | Nodes names                | thor, zeus, isis, baal, shifu                        |
 34 | | VM names                   | vm1, vm2, vm3, vm4, vm5                              |
 35 | | Default cluster network IP | 10.0.0.0                                             |
 36 | 
 37 | 
 38 | 1.Installation and requirements
 39 | --
 40 | 
 41 | **Must have on your local machine**
 42 | 
 43 | * VirtualBox (latest version)
 44 | * Vagrant (>=1.5)
 45 | * cURL (or another REST client to talk to ES)
 46 | 
 47 | **Clone this repository**
 48 | 
 49 | git clone git@github.com:ypereirareis/vagrant-elasticsearch-cluster.git
 50 | 
 51 | **WARNING**
 52 | 
 53 | You'll need enough RAM to run VMs in your cluster.
 54 | Each new VM launched within your cluster will have 512M of RAM allocated.
 55 | You can change this configuration in the Vagrantfile once cloned.
 56 | 
 57 | 2.How to run a new ElasticSearch cluster
 58 | --
 59 | 
 60 | **Important**
 61 | 
 62 | By default, the cluster runs at most 5 VMs.
 63 | You can run more than 5, but a test cluster rarely needs it,
 64 | and the RAM requirement grows with every extra VM.
 65 | If you still want to use more than 5 VMs,
 66 | you will have to add/edit your own configuration files in the [conf](conf) directory.
 67 | 
 68 | **Run the cluster**
 69 | 
 70 | Go to the cloned directory (vagrant-elasticsearch-cluster by default)
 71 | and execute this command:
 72 | 
 73 | ```
 74 | vagrant up
 75 | ```
 76 | 
 77 | By default, this command boots 5 VMs in a cluster named `My amazing ES cluster`, with `512M` of RAM for each node and the network IP address `10.0.0.0`.
 78 | 
 79 | You can change the cluster size with the `CLUSTER_COUNT` variable:
 80 | 
 81 | ```
 82 | CLUSTER_COUNT=3 vagrant up
 83 | ```
 84 | 
 85 | You can change the cluster name with the `CLUSTER_NAME` variable:
 86 | 
 87 | ```
 88 | CLUSTER_NAME='My awesome cluster' vagrant up
 89 | ```
 90 | 
 91 | You can change the RAM allocated to each node (in MB) with the `CLUSTER_RAM` variable:
 92 | 
 93 | ```
 94 | CLUSTER_RAM=1024 vagrant up
 95 | ```
 96 | 
 97 | You can change the cluster network IP address with the `CLUSTER_IP_PATTERN` variable:
 98 | 
 99 | ```
100 | CLUSTER_IP_PATTERN='172.16.15.%d' vagrant up
101 | ```
102 | 
103 | Providing the `CLUSTER_NAME`, `CLUSTER_COUNT`, `CLUSTER_RAM`, `CLUSTER_IP_PATTERN` variables is only required when you first start the cluster.
104 | Vagrant will save/cache these values so you can run other commands without repeating yourself.
105 | 
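The caching behaviour described above can be pictured with a minimal Ruby sketch. It is an assumption about how such a lookup could work, not the code in `lib/`: `resolve_setting` and the cache file location are hypothetical, and the lookup order (environment variable, then cached value, then default) is inferred from the paragraph above.

```ruby
require 'tmpdir'

# Hypothetical resolver: environment variable first, then the value cached
# by an earlier run, then the built-in default.
def resolve_setting(env_name, cache_file, default, env: ENV)
  value = env[env_name]
  if value.nil? || value.strip.empty?
    value = File.exist?(cache_file) ? File.read(cache_file).strip : default
  end
  File.write(cache_file, value) # remember the value for later commands
  value
end

Dir.mktmpdir do |dir|
  cache = File.join(dir, 'cluster_count') # hypothetical cache location
  puts resolve_setting('CLUSTER_COUNT', cache, '5', env: {})                          # 5
  puts resolve_setting('CLUSTER_COUNT', cache, '5', env: { 'CLUSTER_COUNT' => '3' })  # 3
  puts resolve_setting('CLUSTER_COUNT', cache, '5', env: {})                          # 3
end
```

The third call shows why later commands such as `vagrant status` would not need the variable again: the value from the previous run is read back from the cache.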
106 | Of course you can use all these variables at the same time :
107 | 
108 | ```
109 | $ CLUSTER_NAME='My awesome search engine' CLUSTER_IP_PATTERN='172.16.25.%d' CLUSTER_COUNT=3 CLUSTER_RAM=512 vagrant status
110 | ----------------------------------------------------------
111 |           Your ES cluster configurations
112 | ----------------------------------------------------------
113 | Cluster Name: My awesome search engine
114 | Cluster size: 3
115 | Cluster network IP: 172.16.25.0
116 | Cluster RAM (for each node): 512
117 | ----------------------------------------------------------
118 | ----------------------------------------------------------
119 | Current machine states:
120 | 
121 | vm1                       not created (virtualbox)
122 | vm2                       not created (virtualbox)
123 | vm3                       not created (virtualbox)
124 | 
125 | ...
126 | ```
127 | 
128 | The names of the VMs follow the pattern `vm[0-9]+`.
129 | The trailing number represents the index of the VM, starting at 1.
130 | 
131 | An ElasticSearch instance is started during the provisioning of each VM.
132 | The command is launched in a new `screen` session as the root user inside the VM.
133 | 
134 | Once the cluster is launched (please wait a few seconds) go to : [http://10.0.0.11:9200](http://10.0.0.11:9200)
135 | 
136 | Plugins URLs (replace IP if you changed it with `CLUSTER_IP_PATTERN` var) :
137 | 
138 | * [http://10.0.0.11:9200/_plugin/marvel](http://10.0.0.11:9200/_plugin/marvel)
139 | * [http://10.0.0.11:9200/_plugin/paramedic/](http://10.0.0.11:9200/_plugin/paramedic/)
140 | * [http://10.0.0.11:9200/_plugin/head/](http://10.0.0.11:9200/_plugin/head/)
141 | * [http://10.0.0.11:9200/_plugin/bigdesk](http://10.0.0.11:9200/_plugin/bigdesk)
142 | * [http://10.0.0.11:9200/_plugin/HQ/](http://10.0.0.11:9200/_plugin/HQ/)
143 | 
144 | The default configuration (HTTP enabled for all nodes) allows you to use any of your VM IPs.
145 | If one (or more) of your nodes fails, try with another IP to see what happened.
146 | 
147 | By default the cluster nodes have an IP following the pattern "10.0.0.%d" as you can see in [Vagrantfile](Vagrantfile).
148 | 
149 | But you can change it using an ENV var :
150 | 
151 | ```
152 | CLUSTER_COUNT=2 CLUSTER_IP_PATTERN='172.16.10.%d' vagrant up
153 | ```
154 | 
155 | * This command will start 2 ES instances with the IPs 172.16.10.11 and 172.16.10.12.
156 | * :warning: Before running it, make sure that the config files (conf/vm*) do not exist, or delete them.
157 | * These files must be regenerated for the new IP addresses.
158 | 
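To make the pattern concrete, here is a minimal Ruby sketch of how a `CLUSTER_IP_PATTERN` value could expand into node addresses. The offset of 10 is an assumption inferred from the examples in this README (vm1 at `.11`, vm2 at `.12`), and `node_ip` and `network_ip` are hypothetical helpers rather than functions from this repository.

```ruby
# Hypothetical helpers; the offset of 10 is inferred from the README examples.
IP_OFFSET = 10

# Address of the VM at the given 1-based index.
def node_ip(pattern, index)
  format(pattern, IP_OFFSET + index)
end

# Network address as printed in the configuration banner (last octet 0).
def network_ip(pattern)
  format(pattern, 0)
end

pattern = '172.16.10.%d'
ips = (1..2).map { |index| node_ip(pattern, index) }
puts network_ip(pattern) # 172.16.10.0
puts ips.join(', ')      # 172.16.10.11, 172.16.10.12
```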
159 | You will see output like this:
160 | 
161 | ```
162 | $ CLUSTER_COUNT=2 CLUSTER_IP_PATTERN='172.16.10.%d' vagrant up
163 | Cluster size: 2
164 | Cluster IP: 172.16.10.0
165 | Bringing machine 'vm1' up with 'virtualbox' provider...
166 | Bringing machine 'vm2' up with 'virtualbox' provider...
167 | 
168 | ```
169 | 
170 | You can now access the nodes at addresses like [http://172.16.10.11:9200](http://172.16.10.11:9200)
171 | 
172 | **Stop the cluster**
173 | 
174 | ```
175 | vagrant halt
176 | ```
177 | 
178 | This will stop the whole cluster. If you want to only stop one VM, you can use:
179 | 
180 | ```
181 | vagrant halt vm2
182 | ```
183 | 
184 | This will stop the `vm2` instance.
185 | 
186 | **Destroy the cluster**
187 | 
188 | ```
189 | vagrant destroy
190 | ```
191 | 
192 | This will destroy the whole cluster. If you want to destroy only one VM, you can use:
193 | 
194 | ```
195 | vagrant destroy vm2
196 | ```
197 | 
198 | **Remove the cluster**
199 | 
200 | ```
201 | vagrant box remove ypereirareis/debian-elasticsearch-amd64
202 | ```
203 | 
204 | This will remove your local copy of the vagrant base-box.
205 | 
206 | :warning: If you destroy a VM, I suggest destroying the whole cluster so that all of your nodes run the same ES version.
207 | 
208 | **Managing ElasticSearch instances**
209 | 
210 | Each VM has its own ElasticSearch instance running in a `screen` session named `elastic`.
211 | Once connected to the VM, you can manage this instance with the following commands:
212 | 
213 | * `(sudo) node-start`: starts the ES instance
214 | * `(sudo) node-stop`: stops the ES instance
215 | * `(sudo) node-restart`: restarts the ES instance
216 | * `(sudo) node-status`: displays ES instance's status
217 | * `(sudo) node-attach`: brings you to the screen session hosting the ES instance. Use `^Ad` to detach.
218 | 
219 | You should be brought to the screen session hosting ElasticSearch and see its log.
220 | 
221 | The first launch of the ES instance is done as root by Vagrant provisioning,
222 | so you should prepend `sudo` to each command above.
223 | You can also start an ES instance as the 'vagrant' user from the VM:
224 | 
225 | ```
226 | vagrant ssh vmX
227 | sudo node-stop
228 | node-start
229 | ```
230 | 
231 | These commands log you into the chosen VM,
232 | stop the 'root-user' ES instance and start a 'vagrant-user' ES instance.
233 | 
234 | 3.Configure your cluster
235 | --
236 | 
237 | If you need or want to change the default working configuration of your cluster,
238 | you can do so by adding/editing the elasticsearch.yml file in conf/vmX/elasticsearch.yml.
239 | Each node's configuration is shared with its VM through this "conf" directory.
240 | 
241 | By default, these configuration files are **auto-generated** by Vagrant when running the cluster for the first time.
242 | In this case, the default values listed at the top of this page are used.
243 | 
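As an illustration of the auto-generation step, the sketch below renders a trimmed-down copy of the template with Ruby's standard `ERB` class. The three template lines are taken from `conf/elasticsearch.yml.erb`; the `NodeConfig` class is hypothetical, the pairing of node name `thor` with `10.0.0.11` is only an example, and the repository's own generator may be organised differently.

```ruby
require 'erb'

# Trimmed-down template; the full one is conf/elasticsearch.yml.erb.
TEMPLATE = <<~YAML
  cluster.name: "<%= @cluster_name.strip %>"
  node.name: "<%= @node_name.strip %>"
  network.host: <%= @node_ip.strip %>
YAML

# Hypothetical holder for the instance variables that the template reads.
class NodeConfig
  def initialize(cluster_name, node_name, node_ip)
    @cluster_name = cluster_name
    @node_name = node_name
    @node_ip = node_ip
  end

  # Evaluate the template against this object's instance variables.
  def render(template)
    ERB.new(template).result(binding)
  end
end

puts NodeConfig.new('My amazing ES cluster', 'thor', '10.0.0.11').render(TEMPLATE)
```

Writing the rendered text to `conf/vmX/elasticsearch.yml` would then give each VM its own file, which you can edit by hand afterwards.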
244 | 
245 | 4.ElasticSearch plugins inside the base box
246 | --
247 | 
248 | * elasticsearch-head - [https://github.com/mobz/elasticsearch-head](https://github.com/mobz/elasticsearch-head)
249 | * elasticsearch-paramedic - [https://github.com/karmi/elasticsearch-paramedic](https://github.com/karmi/elasticsearch-paramedic)
250 | * BigDesk - [https://github.com/lukas-vlcek/bigdesk](https://github.com/lukas-vlcek/bigdesk)
251 | * Marvel - [http://www.elasticsearch.org/overview/marvel/](http://www.elasticsearch.org/overview/marvel/)
252 | * ElasticsearchHQ - [http://www.elastichq.org/](http://www.elastichq.org/)
253 | 
254 | 
255 | 5.Working with your cluster
256 | --
257 | 
258 | **Create a "subscriptions" index with 5 shards and 2 replicas**
259 | 
260 | ```
261 | curl -XPUT 'http://10.0.0.11:9200/subscriptions/' -d '{
262 |     "settings" : {
263 |         "number_of_shards" : 5,
264 |         "number_of_replicas" : 2
265 |     }
266 | }'
267 | ```
268 | 
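The arithmetic behind these settings is worth spelling out, because it interacts with `CLUSTER_COUNT`. The Ruby sketch below uses two hypothetical helper functions. It relies on the fact that Elasticsearch does not place two copies of the same shard on one node, so an index with 2 replicas needs at least 3 nodes before every replica can be assigned.

```ruby
# Total shard copies = primaries * (1 + replicas).
def shard_copies(primaries, replicas)
  primaries * (1 + replicas)
end

# Two copies of the same shard never share a node, so every replica
# can be assigned only if nodes >= replicas + 1.
def all_replicas_assignable?(nodes, replicas)
  nodes >= replicas + 1
end

puts shard_copies(5, 2)             # 15
puts all_replicas_assignable?(5, 2) # true
puts all_replicas_assignable?(2, 2) # false
```

With the default 5 VMs, the 15 shard copies can be spread evenly at 3 per node. With `CLUSTER_COUNT=2`, one replica of each shard stays unassigned and the cluster health would be reported as yellow.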
269 | **Index a "subscription" document inside the "subscriptions" index**
270 | 
271 | ```
272 | curl -XPUT 'http://10.0.0.11:9200/subscriptions/subscription/1' -d '{
273 |     "user" : "ypereirareis",
274 |     "post_date" : "2014-03-26T14:12:12",
275 |     "message" : "Trying out vagrant elasticsearch cluster"
276 | }'
277 | ```
278 | 
279 | You can now perform any action/request authorized by elasticsearch API (index, get, delete, bulk,...)
280 | 
281 | 6.Vagrant
282 | --
283 | 
284 | You can use every vagrant command to manage your cluster and VMs.
285 | This project is simply made to launch a working ES cluster with a single command, using vagrant/virtualbox virtual machines.
286 | 
287 | Use it to test any configuration or query you want (split brain, unicast, recovery, indexing, sharding).
288 | 
289 | 7.Important
290 | --
291 | 
292 | Do forks, PR, and MRs !!!!
293 | 
294 | 8.TODO
295 | --
296 | 
297 | * Add extra plugins or applications in the base box (redis, logstash, kibana, ...)
298 | * Add some configurations to illustrate split brain, unicast discovery, load balancing, snapshots, recovery...
299 | * Add possibility to configure cluster name, RAM per node AND hostnames through the shell (ENV vars)
300 | 
301 | LICENSE
302 | --
303 | 
304 | The MIT License (MIT)
305 | 
306 | Copyright (c) 2017 Yannick Pereira-Reis
307 | 
308 | Permission is hereby granted, free of charge, to any person obtaining a copy
309 | of this software and associated documentation files (the "Software"), to deal
310 | in the Software without restriction, including without limitation the rights
311 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
312 | copies of the Software, and to permit persons to whom the Software is
313 | furnished to do so, subject to the following conditions:
314 | 
315 | The above copyright notice and this permission notice shall be included in all
316 | copies or substantial portions of the Software.
317 | 
318 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
319 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
320 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
321 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
322 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
323 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
324 | SOFTWARE.
325 | 


--------------------------------------------------------------------------------
/Vagrantfile:
--------------------------------------------------------------------------------
 1 | # -*- mode: ruby -*-
 2 | # vi: set ft=ruby :
 3 | require 'erb'
 4 | require_relative 'lib/elasticsearch-module.rb'
 5 | require_relative 'lib/elasticsearch-script.rb'
 6 | 
 7 | utils = Vagrant::ElastiSearchCluster::Util.new
 8 | 
 9 | Vagrant.configure("2") do |config|
10 | 
11 |   utils.manage_and_print_config
12 | 
13 |   nodes_number = utils.get_cluster_info 'cluster_count'
14 |   nodes_number = nodes_number.to_i
15 | 
16 |   cluster_ram = utils.get_cluster_info 'cluster_ram'
17 |   cluster_ram = cluster_ram.to_i
18 | 
19 |   config.vm.box = 'ypereirareis/debian-elasticsearch-amd64'
20 |   config.vm.synced_folder ".", "/vagrant", :id => "vagrant-root", :mount_options => ['dmode=777', 'fmode=777']
21 | 
22 |   config.vm.provider 'virtualbox' do |vbox|
23 |     vbox.customize ['modifyvm', :id, '--memory', cluster_ram]
24 |     vbox.customize ['modifyvm', :id, '--cpus', 1]
25 |   end
26 | 
27 |   (1..nodes_number).each do |index|
28 |       name = utils.get_vm_name index
29 |       node_name = utils.get_node_name index
30 |       ip = utils.get_vm_ip index
31 |       primary = (index.eql? 1)
32 | 
33 |       utils.build_config index
34 | 
35 |       config.vm.define :"#{name}", primary: primary do |node|
36 |           node.vm.hostname = "#{node_name}.es.dev"
37 |           node.vm.network 'private_network', ip: ip, auto_config: true
38 |           node.vm.provision 'shell', inline: @node_start_inline_script % [name, node_name, ip]
39 |       end
40 |   end
41 |   utils.logger.info "----------------------------------------------------------"
42 | end
43 | 


--------------------------------------------------------------------------------
/conf/elasticsearch.yml.erb:
--------------------------------------------------------------------------------
  1 | ##################### Elasticsearch Configuration Example #####################
  2 | 
  3 | # This file contains an overview of various configuration settings,
  4 | # targeted at operations staff. Application developers should
  5 | # consult the guide at <http://elasticsearch.org/guide>.
  6 | #
  7 | # The installation procedure is covered at
  8 | # <http://elasticsearch.org/guide/en/elasticsearch/reference/current/setup.html>.
  9 | #
 10 | # Elasticsearch comes with reasonable defaults for most settings,
 11 | # so you can try it out without bothering with configuration.
 12 | #
 13 | # Most of the time, these defaults are just fine for running a production
 14 | # cluster. If you're fine-tuning your cluster, or wondering about the
 15 | # effect of certain configuration option, please _do ask_ on the
 16 | # mailing list or IRC channel [http://elasticsearch.org/community].
 17 | 
 18 | # Any element in the configuration can be replaced with environment variables
 19 | # by placing them in ${...} notation. For example:
 20 | #
 21 | # node.rack: ${RACK_ENV_VAR}
 22 | 
 23 | # For information on supported formats and syntax for the config file, see
 24 | # <http://elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html>
 25 | 
 26 | 
 27 | ################################### Cluster ###################################
 28 | 
 29 | # Cluster name identifies your cluster for auto-discovery. If you're running
 30 | # multiple clusters on the same network, make sure you're using unique names.
 31 | #
 32 | cluster.name: "<%= @cluster_name.strip %>"
 33 | 
 34 | 
 35 | #################################### Node #####################################
 36 | 
 37 | # Node names are generated dynamically on startup, so you're relieved
 38 | # from configuring them manually. You can tie this node to a specific name:
 39 | #
 40 | node.name: "<%= @node_name.strip %>"
 41 | 
 42 | # Every node can be configured to allow or deny being eligible as the master,
 43 | # and to allow or deny to store the data.
 44 | #
 45 | # Allow this node to be eligible as a master node (enabled by default):
 46 | #
 47 | node.master: true
 48 | #
 49 | # Allow this node to store data (enabled by default):
 50 | #
 51 | node.data: true
 52 | 
 53 | # You can exploit these settings to design advanced cluster topologies.
 54 | #
 55 | # 1. You want this node to never become a master node, only to hold data.
 56 | #    This will be the "workhorse" of your cluster.
 57 | #
 58 | # node.master: false
 59 | # node.data: true
 60 | #
 61 | # 2. You want this node to only serve as a master: to not store any data and
 62 | #    to have free resources. This will be the "coordinator" of your cluster.
 63 | #
 64 | # node.master: true
 65 | # node.data: false
 66 | #
 67 | # 3. You want this node to be neither master nor data node, but
 68 | #    to act as a "search load balancer" (fetching data from nodes,
 69 | #    aggregating results, etc.)
 70 | #
 71 | # node.master: false
 72 | # node.data: false
 73 | 
 74 | # Use the Cluster Health API [http://localhost:9200/_cluster/health], the
 75 | # Node Info API [http://localhost:9200/_nodes] or GUI tools
 76 | # such as <http://www.elasticsearch.org/overview/marvel/>,
 77 | # <http://github.com/karmi/elasticsearch-paramedic>,
 78 | # <http://github.com/lukas-vlcek/bigdesk> and
 79 | # <http://mobz.github.com/elasticsearch-head> to inspect the cluster state.
 80 | 
 81 | # A node can have generic attributes associated with it, which can later be used
 82 | # for customized shard allocation filtering, or allocation awareness. An attribute
 83 | # is a simple key value pair, similar to node.key: value, here is an example:
 84 | #
 85 | # node.rack: rack314
 86 | 
 87 | # By default, multiple nodes are allowed to start from the same installation location
 88 | # to disable it, set the following:
 89 | # node.max_local_storage_nodes: 1
 90 | 
 91 | 
 92 | #################################### Index ####################################
 93 | 
 94 | # You can set a number of options (such as shard/replica options, mapping
 95 | # or analyzer definitions, translog settings, ...) for indices globally,
 96 | # in this file.
 97 | #
 98 | # Note, that it makes more sense to configure index settings specifically for
 99 | # a certain index, either when creating it or by using the index templates API.
100 | #
101 | # See <http://elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules.html> and
102 | # <http://elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html>
103 | # for more information.
104 | 
105 | # Set the number of shards (splits) of an index (5 by default):
106 | #
107 | index.number_of_shards: 5
108 | 
109 | # Set the number of replicas (additional copies) of an index (1 by default):
110 | #
111 | index.number_of_replicas: 2
112 | 
113 | # Note, that for development on a local machine, with small indices, it usually
114 | # makes sense to "disable" the distributed features:
115 | #
116 | # index.number_of_shards: 1
117 | # index.number_of_replicas: 0
118 | 
119 | # These settings directly affect the performance of index and search operations
120 | # in your cluster. Assuming you have enough machines to hold shards and
121 | # replicas, the rule of thumb is:
122 | #
123 | # 1. Having more *shards* enhances the _indexing_ performance and allows to
124 | #    _distribute_ a big index across machines.
125 | # 2. Having more *replicas* enhances the _search_ performance and improves the
126 | #    cluster _availability_.
127 | #
128 | # The "number_of_shards" is a one-time setting for an index.
129 | #
130 | # The "number_of_replicas" can be increased or decreased anytime,
131 | # by using the Index Update Settings API.
132 | #
133 | # Elasticsearch takes care about load balancing, relocating, gathering the
134 | # results from nodes, etc. Experiment with different settings to fine-tune
135 | # your setup.
136 | 
137 | # Use the Index Status API (<http://localhost:9200/A/_status>) to inspect
138 | # the index status.
139 | 
140 | 
141 | #################################### Paths ####################################
142 | 
143 | # Path to directory containing configuration (this file and logging.yml):
144 | #
145 | # path.conf: /vagrant
146 | 
147 | # Path to directory where to store index data allocated for this node.
148 | #
149 | # path.data: /path/to/data
150 | #
151 | # Can optionally include more than one location, causing data to be striped across
152 | # the locations (a la RAID 0) on a file level, favouring locations with most free
153 | # space on creation. For example:
154 | #
155 | # path.data: /path/to/data1,/path/to/data2
156 | 
157 | # Path to temporary files:
158 | #
159 | # path.work: /path/to/work
160 | 
161 | # Path to log files:
162 | #
163 | # path.logs: /path/to/logs
164 | 
165 | # Path to where plugins are installed:
166 | #
167 | # path.plugins: /path/to/plugins
168 | 
169 | 
170 | #################################### Plugin ###################################
171 | 
172 | # If a plugin listed here is not installed for current node, the node will not start.
173 | #
174 | # plugin.mandatory: mapper-attachments,lang-groovy
175 | 
176 | 
177 | ################################### Memory ####################################
178 | 
179 | # Elasticsearch performs poorly when JVM starts swapping: you should ensure that
180 | # it _never_ swaps.
181 | #
182 | # Set this property to true to lock the memory:
183 | #
184 | # bootstrap.mlockall: true
185 | 
186 | # Make sure that the ES_MIN_MEM and ES_MAX_MEM environment variables are set
187 | # to the same value, and that the machine has enough memory to allocate
188 | # for Elasticsearch, leaving enough memory for the operating system itself.
189 | #
190 | # You should also make sure that the Elasticsearch process is allowed to lock
191 | # the memory, eg. by using `ulimit -l unlimited`.
192 | 
193 | 
194 | ############################## Network And HTTP ###############################
195 | 
196 | # Elasticsearch, by default, binds itself to the 0.0.0.0 address, and listens
197 | # on port [9200-9300] for HTTP traffic and on port [9300-9400] for node-to-node
198 | # communication. (the range means that if the port is busy, it will automatically
199 | # try the next port).
200 | 
201 | # Set the bind address specifically (IPv4 or IPv6):
202 | #
203 | # network.bind_host: 192.168.0.1
204 | 
205 | # Set the address other nodes will use to communicate with this node. If not
206 | # set, it is automatically derived. It must point to an actual IP address.
207 | #
208 | # network.publish_host: 192.168.0.1
209 | 
210 | # Set both 'bind_host' and 'publish_host':
211 | #
212 | network.host: <%= @node_ip.strip %>
213 | 
214 | # Set a custom port for the node to node communication (9300 by default):
215 | #
216 | transport.tcp.port: 9300
217 | 
218 | # Enable compression for all communication between nodes (disabled by default):
219 | #
220 | # transport.tcp.compress: true
221 | 
222 | # Set a custom port to listen for HTTP traffic:
223 | #
224 | http.port: 9200
225 | 
226 | # Set a custom allowed content length:
227 | #
228 | # http.max_content_length: 100mb
229 | 
230 | # Disable HTTP completely:
231 | #
232 | # http.enabled: false
233 | 
234 | 
235 | ################################### Gateway ###################################
236 | 
237 | # The gateway allows for persisting the cluster state between full cluster
238 | # restarts. Every change to the state (such as adding an index) will be stored
239 | # in the gateway, and when the cluster starts up for the first time,
240 | # it will read its state from the gateway.
241 | 
242 | # There are several types of gateway implementations. For more information, see
243 | # <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html>.
244 | 
245 | # The default gateway type is the "local" gateway (recommended):
246 | #
247 | # gateway.type: local
248 | 
249 | # Settings below control how and when to start the initial recovery process on
250 | # a full cluster restart (to reuse as much local data as possible when using shared
251 | # gateway).
252 | 
253 | # Allow recovery process after N nodes in a cluster are up:
254 | #
255 | # gateway.recover_after_nodes: 1
256 | 
257 | # Set the timeout to initiate the recovery process, once the N nodes
258 | # from previous setting are up (accepts time value):
259 | #
260 | # gateway.recover_after_time: 5m
261 | 
262 | # Set how many nodes are expected in this cluster. Once these N nodes
263 | # are up (and recover_after_nodes is met), begin recovery process immediately
264 | # (without waiting for recover_after_time to expire):
265 | #
266 | # gateway.expected_nodes: 2
267 | 
268 | 
269 | ############################# Recovery Throttling #############################
270 | 
271 | # These settings allow to control the process of shards allocation between
272 | # nodes during initial recovery, replica allocation, rebalancing,
273 | # or when adding and removing nodes.
274 | 
275 | # Set the number of concurrent recoveries happening on a node:
276 | #
277 | # 1. During the initial recovery
278 | #
279 | # cluster.routing.allocation.node_initial_primaries_recoveries: 4
280 | #
281 | # 2. During adding/removing nodes, rebalancing, etc
282 | #
283 | # cluster.routing.allocation.node_concurrent_recoveries: 2
284 | 
285 | # Set to throttle throughput when recovering (eg. 100mb, by default 20mb):
286 | #
287 | # indices.recovery.max_bytes_per_sec: 20mb
288 | 
289 | # Set to limit the number of open concurrent streams when
290 | # recovering a shard from a peer:
291 | #
292 | # indices.recovery.concurrent_streams: 5
293 | 
294 | 
295 | ################################## Discovery ##################################
296 | 
297 | # Discovery infrastructure ensures nodes can be found within a cluster
298 | # and master node is elected. Multicast discovery is the default.
299 | 
300 | # Set to ensure a node sees N other master eligible nodes to be considered
301 | # operational within the cluster. Its recommended to set it to a higher value
302 | # than 1 when running more than 2 nodes in the cluster.
303 | #
304 | # discovery.zen.minimum_master_nodes: 1
305 | 
306 | # Set the time to wait for ping responses from other nodes when discovering.
307 | # Set this option to a higher value on a slow or congested network
308 | # to minimize discovery failures:
309 | #
310 | # discovery.zen.ping.timeout: 3s
311 | 
312 | # For more information, see
313 | # <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html>
314 | 
315 | # Unicast discovery allows to explicitly control which nodes will be used
316 | # to discover the cluster. It can be used when multicast is not present,
317 | # or to restrict the cluster communication-wise.
318 | #
319 | # 1. Disable multicast discovery (enabled by default):
320 | #
321 | # discovery.zen.ping.multicast.enabled: false
322 | #
323 | # 2. Configure an initial list of master nodes in the cluster
324 | #    to perform discovery when new nodes (master or data) are started:
325 | #
326 | # discovery.zen.ping.unicast.hosts: ["host1", "host2:port"]
327 | 
328 | # EC2 discovery allows to use AWS EC2 API in order to perform discovery.
329 | #
330 | # You have to install the cloud-aws plugin for enabling the EC2 discovery.
331 | #
332 | # For more information, see
333 | # <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-ec2.html>
334 | #
335 | # See <http://elasticsearch.org/tutorials/elasticsearch-on-ec2/>
336 | # for a step-by-step tutorial.
337 | 
338 | # GCE discovery allows to use Google Compute Engine API in order to perform discovery.
339 | #
340 | # You have to install the cloud-gce plugin for enabling the GCE discovery.
341 | #
342 | # For more information, see <https://github.com/elasticsearch/elasticsearch-cloud-gce>.
343 | 
344 | # Azure discovery lets you use the Azure API to perform discovery.
345 | #
346 | # You have to install the cloud-azure plugin to enable Azure discovery.
347 | #
348 | # For more information, see <https://github.com/elasticsearch/elasticsearch-cloud-azure>.
349 | 
350 | ################################## Slow Log ##################################
351 | 
352 | # Shard level query and fetch threshold logging.
353 | 
354 | #index.search.slowlog.threshold.query.warn: 10s
355 | #index.search.slowlog.threshold.query.info: 5s
356 | #index.search.slowlog.threshold.query.debug: 2s
357 | #index.search.slowlog.threshold.query.trace: 500ms
358 | 
359 | #index.search.slowlog.threshold.fetch.warn: 1s
360 | #index.search.slowlog.threshold.fetch.info: 800ms
361 | #index.search.slowlog.threshold.fetch.debug: 500ms
362 | #index.search.slowlog.threshold.fetch.trace: 200ms
363 | 
364 | #index.indexing.slowlog.threshold.index.warn: 10s
365 | #index.indexing.slowlog.threshold.index.info: 5s
366 | #index.indexing.slowlog.threshold.index.debug: 2s
367 | #index.indexing.slowlog.threshold.index.trace: 500ms
368 | 
369 | ################################## GC Logging ################################
370 | 
371 | #monitor.jvm.gc.young.warn: 1000ms
372 | #monitor.jvm.gc.young.info: 700ms
373 | #monitor.jvm.gc.young.debug: 400ms
374 | 
375 | #monitor.jvm.gc.old.warn: 10s
376 | #monitor.jvm.gc.old.info: 5s
377 | #monitor.jvm.gc.old.debug: 2s
378 | 
379 | marvel.agent.enabled: <%= @node_marvel_enabled %>
380 | marvel.agent.exporter.es.hosts: ["<%= @cluster_ip.strip % 11 %>:9200"]
381 | 
382 | 


--------------------------------------------------------------------------------
/lib/elasticsearch-module.rb:
--------------------------------------------------------------------------------
  1 | module Vagrant
  2 |     module ElastiSearchCluster
  3 |         class Util
  4 |             attr_accessor :logger
  5 | 
  6 |             def initialize
  7 |                 @params = [
  8 |                     'cluster_name' => ['CLUSTER_NAME', 'cluster_name', 'My amazing ES cluster'],
  9 |                     'cluster_ip' => ['CLUSTER_IP_PATTERN', 'cluster_ip', '10.0.0.%d'],
 10 |                     'cluster_count' => ['CLUSTER_COUNT', 'cluster_size', 5],
 11 |                     'cluster_ram' => ['CLUSTER_RAM', 'cluster_ram', 512],
 12 |                 ]
 13 | 
 14 |                 @names = %w(thor zeus isis shifu baal)
 15 |                 @logger = Vagrant::UI::Colored.new
 16 |                 @logger.opts[:color] = :white
 17 |             end
 18 | 
 19 |             def get_vm_name(index)
 20 |                 "vm#{index}"
 21 |             end
 22 | 
 23 |             def get_vm_ip(index)
 24 |                 ip = get_cluster_info 'cluster_ip'
 25 |                 ip % (10 + index)
 26 |             end
 27 | 
 28 |             def get_node_name(index)
 29 |                 @names[index - 1]
 30 |             end
 31 | 
 32 |             def get_cluster_info(index)
 33 |                 return ENV[@params[0][index][0]] if ENV[@params[0][index][0]]
 34 |                 return File.read(".vagrant/#{@params[0][index][1]}").strip if File.exist? ".vagrant/#{@params[0][index][1]}"
 35 |                 "#{@params[0][index][2]}"
 36 |             end
 37 | 
 38 |             def save_cluster_info(index, value)
 39 |                 Dir.mkdir('.vagrant') unless Dir.exist?('.vagrant')
 40 |                 File.open(".vagrant/#{@params[0][index][1]}", 'w') do |file|
 41 |                     file.puts value.to_s
 42 |                 end
 43 |             end
 44 | 
 45 |             def get_config_template
 46 |                 # File.read closes the handle, unlike a bare File.open
 47 |                 ERB.new(File.read('conf/elasticsearch.yml.erb'))
 48 |             end
 49 | 
 50 |             def build_config(index)
 51 |                 vm = get_vm_name index
 52 |                 conf_file_format = "conf/elasticsearch-#{vm}.yml"
 53 | 
 54 |                 File.open(conf_file_format, 'w') do |file|
 55 |                     @node_ip = get_vm_ip index
 56 |                     @node_name = get_node_name index
 57 |                     @node_marvel_enabled = (index == 1)
 58 |                     @cluster_ip = get_cluster_info 'cluster_ip'
 59 |                     @cluster_name = get_cluster_info 'cluster_name'
 60 | 
 61 |                     @logger.info "Building configuration for #{vm}"
 62 |                     file.puts self.get_config_template.result(binding)
 63 |                 end unless File.exist? conf_file_format
 64 |             end
 65 | 
 66 |             def manage_and_print_config
 67 |                 self.logger.info "----------------------------------------------------------"
 68 |                 self.logger.info "          Your ES cluster configurations"
 69 |                 self.logger.info "----------------------------------------------------------"
 70 | 
 71 |                 # Building and showing CLUSTER NAME information
 72 |                 index = 'cluster_name'
 73 |                 cluster_name = self.get_cluster_info index
 74 |                 self.logger.info "Cluster Name: #{cluster_name.strip}"
 75 |                 self.save_cluster_info index, cluster_name
 76 | 
 77 |                 # Building and showing CLUSTER COUNT information
 78 |                 index = 'cluster_count'
 79 |                 nodes_number = self.get_cluster_info index
 80 |                 self.logger.info "Cluster size: #{nodes_number.strip}"
 81 |                 self.save_cluster_info index, nodes_number
 82 | 
 83 |                 # Building and showing CLUSTER IP PATTERN information
 84 |                 index = 'cluster_ip'
 85 |                 cluster_network_ip = self.get_cluster_info index
 86 |                 self.logger.info "Cluster network IP: #{cluster_network_ip.strip % 0}"
 87 |                 self.save_cluster_info index, cluster_network_ip
 88 | 
 89 |                 # Building and showing CLUSTER RAM information
 90 |                 index = 'cluster_ram'
 91 |                 cluster_ram = self.get_cluster_info index
 92 |                 self.logger.info "Cluster RAM (for each node): #{cluster_ram.strip}"
 93 |                 self.save_cluster_info index, cluster_ram
 94 | 
 95 |                 self.logger.info "----------------------------------------------------------"
 96 |             end
 97 |         end
 98 |     end
 99 | end
100 | 
101 | 


--------------------------------------------------------------------------------
/lib/elasticsearch-script.rb:
--------------------------------------------------------------------------------
 1 | @node_start_inline_script = <<SCRIPT
 2 | if [ ! -f /etc/profile.d/vagrant-elasticsearch-cluster.sh ]
 3 | then
 4 |     cat <<EOT >> /etc/profile.d/vagrant-elasticsearch-cluster.sh
 5 | 
 6 | export VM_NAME=%s
 7 | export VM_NODE_NAME=%s
 8 | export VM_NODE_IP=%s
 9 | export PATH=/vagrant/scripts:/home/vagrant/elasticsearch/bin:\\$PATH
10 | EOT
11 | 
12 |     sed 's#^.*secure_path="\\(.*\\)"$#Defaults secure_path="\\1:/vagrant/scripts:/home/vagrant/elasticsearch/bin"#' -i /etc/sudoers
13 |     echo 'Defaults env_keep = "VM_NAME"' >> /etc/sudoers
14 | 
15 |     source /etc/profile
16 | fi
17 | 
18 | screen -li | grep -q elastic || node-start
19 | SCRIPT
20 | 


--------------------------------------------------------------------------------
/lib/upgrade.sh:
--------------------------------------------------------------------------------
 1 | # Setting ES version to install
 2 | ES_VERSION="elasticsearch-1.4.3"
 3 | ES_PLUGIN_INSTALL_CMD="elasticsearch/bin/plugin -install"
 4 | 
 5 | # Removing any previously installed versions
 6 | rm -rf elasticsearch
 7 | rm -rf elasticsearch-*
 8 | 
 9 | # Downloading the version to install
10 | wget https://download.elasticsearch.org/elasticsearch/elasticsearch/$ES_VERSION.tar.gz
11 | tar -xvf $ES_VERSION.tar.gz
12 | rm -rf $ES_VERSION.tar.gz
13 | 
14 | # Renaming extracted folder to a generic name to avoid changing ES commands (elasticsearch/bin/...)
15 | mv $ES_VERSION elasticsearch
16 | 
17 | # Internal ES plugins
18 | ${ES_PLUGIN_INSTALL_CMD} com.github.kzwang/elasticsearch-image/1.2.0
19 | ${ES_PLUGIN_INSTALL_CMD} elasticsearch/elasticsearch-mapper-attachments/2.4.2
20 | ${ES_PLUGIN_INSTALL_CMD} fr.pilato.elasticsearch.river/rssriver/1.3.0
21 | ${ES_PLUGIN_INSTALL_CMD} jdbc --url http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-river-jdbc/1.4.0.9/elasticsearch-river-jdbc-1.4.0.9-plugin.zip
22 | ${ES_PLUGIN_INSTALL_CMD} elasticsearch/elasticsearch-river-rabbitmq/2.4.1
23 | ${ES_PLUGIN_INSTALL_CMD} elasticsearch/elasticsearch-river-twitter/2.4.2
24 | ${ES_PLUGIN_INSTALL_CMD} elasticsearch/elasticsearch-river-wikipedia/2.4.1
25 | 
26 | # Supervision/Dashboards ES Plugins
27 | ${ES_PLUGIN_INSTALL_CMD} mobz/elasticsearch-head
28 | ${ES_PLUGIN_INSTALL_CMD} karmi/elasticsearch-paramedic
29 | ${ES_PLUGIN_INSTALL_CMD} lukas-vlcek/bigdesk/2.5.0
30 | ${ES_PLUGIN_INSTALL_CMD} elasticsearch/marvel/latest
31 | ${ES_PLUGIN_INSTALL_CMD} royrusso/elasticsearch-HQ
32 | 


--------------------------------------------------------------------------------
/scripts/node-attach:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | screen -rx elastic
4 | 


--------------------------------------------------------------------------------
/scripts/node-restart:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | node-stop
4 | node-start
5 | 


--------------------------------------------------------------------------------
/scripts/node-start:
--------------------------------------------------------------------------------
 1 | #!/bin/bash
 2 | 
 3 | INSTANCE=$(screen -li | grep elastic | sed 's/\s/ /g')
 4 | 
 5 | if [ -n "$INSTANCE" ]
 6 | then
 7 |     echo "Already running: $INSTANCE"
 8 | else
 9 |     mkdir -p /vagrant/logs
10 | 
11 |     screen -S elastic -d -m bash -l -c "elasticsearch -Des.config=/vagrant/conf/elasticsearch-$VM_NAME.yml > /vagrant/logs/elasticsearch-$VM_NAME.log 2>&1"
12 |     echo "-----------------------------------------------------------------------------------------------------------"
13 |     echo " => Started $VM_NAME - $VM_NODE_NAME - $VM_NODE_IP: $(screen -li | grep elastic | sed 's/\s/ /g')"
14 |     echo "-----------------------------------------------------------------------------------------------------------"
15 | fi
16 | 


--------------------------------------------------------------------------------
/scripts/node-status:
--------------------------------------------------------------------------------
 1 | #!/bin/bash
 2 | 
 3 | INSTANCE=$(screen -li | grep elastic | sed 's/\s/ /g')
 4 | 
 5 | if [ -z "$INSTANCE" ]
 6 | then
 7 |     echo "No running instance"
 8 | else
 9 |     echo $INSTANCE
10 | fi
11 | 


--------------------------------------------------------------------------------
/scripts/node-stop:
--------------------------------------------------------------------------------
 1 | #!/bin/bash
 2 | 
 3 | INSTANCE=$(screen -li | grep elastic | sed 's/\s/ /g')
 4 | 
 5 | if [ -n "$INSTANCE" ]
 6 | then
 7 |     screen -X -S elastic quit
 8 |     echo "Killed $INSTANCE"
 9 | else
10 |     echo "No running instance"
11 | fi
12 | 


--------------------------------------------------------------------------------