├── .gitignore ├── .gitmodules ├── prepare.bat ├── prepare.sh ├── examples │   ├── advanced │   │   └── setup-twitter-river.md │   ├── first-example-first-tweet.md │   ├── second-example-your-tweets.md │   └── third-example-real-tweets.md ├── README.md ├── dashboard.json ├── LICENCE └── elasticsearch.yml /.gitignore: -------------------------------------------------------------------------------- 1 | elasticsearch-* 2 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "sense"] 2 | path = sense 3 | url = https://github.com/bleskes/sense/ 4 | [submodule "kibana"] 5 | path = kibana 6 | url = https://github.com/elasticsearch/kibana 7 | -------------------------------------------------------------------------------- /prepare.bat: -------------------------------------------------------------------------------- 1 | curl -O -k https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.3.zip 2 | unzip elasticsearch-0.90.3.zip 3 | del elasticsearch-0.90.3.zip 4 | xcopy elasticsearch.yml .\elasticsearch-0.90.3\config\ 5 | 6 | .\elasticsearch-0.90.3\bin\plugin -install royrusso/elasticsearch-HQ -------------------------------------------------------------------------------- /prepare.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | CURRENT="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" 4 | 5 | echo "Downloading elasticsearch version 0.90.3" 6 | curl -O -k https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.3.zip 7 | wait $! 8 | unzip -d "${CURRENT}" elasticsearch-0.90.3.zip 9 | wait $!
10 | rm elasticsearch-0.90.3.zip 11 | 12 | cp elasticsearch.yml elasticsearch-0.90.3/config/ 13 | 14 | elasticsearch-0.90.3/bin/plugin -install royrusso/elasticsearch-HQ 15 | 16 | elasticsearch-0.90.3/bin/plugin -install elasticsearch/elasticsearch-river-twitter/1.4.0 -------------------------------------------------------------------------------- /examples/second-example-your-tweets.md: -------------------------------------------------------------------------------- 1 | Prerequisites 2 | -------------- 3 | * Running node of elasticsearch *on our uiosearch cluster* 4 | * Open elasticsearch-uio-demo/sense/index.html 5 | 6 | Second example, your tweets! 7 | --------------- 8 | In the following, we will use [Sense] to query our index *uiocomperio*. 9 | 10 | Query for all tweets that contain the word "oslo" 11 | ```json 12 | { 13 | "query": { 14 | "match": { 15 | "text": { 16 | "query":"oslo" 17 | } 18 | } 19 | } 20 | } 21 | ``` 22 | Query for all tweets that contain the hashtag "comperiouio" 23 | ```json 24 | { 25 | "query": { 26 | "match": { 27 | "hashtag.text": { 28 | "query":"comperiouio" 29 | } 30 | } 31 | } 32 | } 33 | ``` 34 | Query for all tweets from the user "comperio_dev" 35 | ```json 36 | { 37 | "query": { 38 | "match": { 39 | "user.screen_name": { 40 | "query":"comperio_dev" 41 | } 42 | } 43 | } 44 | } 45 | ``` 46 | 47 | [Sense]: https://github.com/bleskes/sense/ 48 | -------------------------------------------------------------------------------- /examples/first-example-first-tweet.md: -------------------------------------------------------------------------------- 1 | Prerequisites 2 | -------------- 3 | * Running instance of elasticsearch (again, simply do: bin/elasticsearch) 4 | * cURL 5 | 6 | In this example, we will manually (1) create an index, (2) index some 7 | tweets and (3) write a couple of queries. 8 | 9 | 10 | The following curl requests will create an index named *comperio* containing a type 11 | named *tweet*.
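A quick aside before the curl commands: if you would rather generate request bodies programmatically (say, to index many tweets in a loop), any language with a JSON library will do. Here is a minimal sketch in Python (not part of this repo), building the same fake tweet that the first curl request below sends:

```python
import json

# The same fake tweet as in the first curl request below, built as a dict.
doc = {
    "tweet": "Comperio is awesome!",
    "posted": "2013-09-19",
    "user": {
        "name": "Murhaf",
        "email": "murhaf.fares@comperio.no",
    },
    "tags": ["comperio", "awesomeness"],
}

# Serialize to the JSON body; this is what curl sends with -d.
body = json.dumps(doc, indent=2)
print(body)
```

You could then pipe the output straight into curl, e.g. `python build_tweet.py | curl -XPUT localhost:9200/comperio/tweet/1 -d @-` (the filename `build_tweet.py` is just an example).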
12 | 13 | Add a fake tweet 14 | 15 | curl -XPUT localhost:9200/comperio/tweet/1 -d ' 16 | { 17 | "tweet" : "Comperio is awesome!", 18 | "posted" : "2013-09-19", 19 | "user" : { 20 | "name" : "Murhaf", 21 | "email" : "murhaf.fares@comperio.no" 22 | }, 23 | "tags" : ["comperio", "awesomeness"] 24 | }' 25 | 26 | Add another fake tweet 27 | 28 | curl -XPUT localhost:9200/comperio/tweet/2 -d ' 29 | { 30 | "tweet" : "IFI is awesome!", 31 | "posted" : "2003-09-19", 32 | "user" : { 33 | "name" : "Niels Henrik", 34 | "email" : "nhhagen@comperiosearch.com" 35 | }, 36 | "tags" : ["IFI", "awesomeness"] 37 | }' 38 | 39 | Now that we have indexed two "documents", we can write some queries: 40 | 41 | Query for "comperio" 42 | 43 | curl -XGET 'http://localhost:9200/comperio/tweet/_search?pretty=true' -d ' 44 | { 45 | "query" : { 46 | "match" : { 47 | "tweet" : { 48 | "query" : "comperio" 49 | } 50 | } 51 | } 52 | }' 53 | 54 | Query for "awesome" 55 | 56 | curl -XGET 'http://localhost:9200/comperio/tweet/_search?pretty=true' -d ' 57 | { 58 | "query" : { 59 | "match" : { 60 | "tweet" : { 61 | "query" : "awesome" 62 | } 63 | } 64 | } 65 | }' 66 | -------------------------------------------------------------------------------- /examples/advanced/setup-twitter-river.md: -------------------------------------------------------------------------------- 1 | Prerequisites 2 | -------------- 3 | * Running node of elasticsearch (again, simply do: bin/elasticsearch) 4 | * The twitter river plugin: ``bin/plugin -install elasticsearch/elasticsearch-river-twitter/1.4.0`` 5 | 6 | The following is copied from [Twitter River Plugin for ElasticSearch](https://github.com/elasticsearch/elasticsearch-river-twitter): 7 | 8 | You need to get an OAuth token in order to use Twitter river.
9 | Please follow [Twitter documentation](https://dev.twitter.com/docs/auth/tokens-devtwittercom), basically: 10 | 11 | * Login to: https://dev.twitter.com/apps/ 12 | * Create a new Twitter application (let's say elasticsearch): https://dev.twitter.com/apps/new 13 | You don't need a callback URL. 14 | * When done, click on `Create my access token`. 15 | * Open `OAuth tool` tab and note `Consumer key`, `Consumer secret`, `Access token` and `Access token secret`. 16 | 17 | 18 | Get started 19 | ------------------- 20 | 21 | The following curl request will create: 22 | 1. An index named *my_twitter_river* 23 | 2. A Twitter river that indexes all tweets containing any of the words specified in "tracks", e.g. linux, norge, etc. 24 | 25 | ```sh 26 | curl -XPUT localhost:9200/_river/my_twitter_river/_meta -d ' 27 | { 28 | "type" : "twitter", 29 | "twitter" : { 30 | "oauth" : { 31 | "consumer_key" : "*** YOUR Consumer key HERE ***", 32 | "consumer_secret" : "*** YOUR Consumer secret HERE ***", 33 | "access_token" : "*** YOUR Access token HERE ***", 34 | "access_token_secret" : "*** YOUR Access token secret HERE ***" 35 | }, 36 | "type":"filter", 37 | "filter": { 38 | "tracks" : ["norge","oslo", "norway", "snowden", "linux"] 39 | } 40 | }, 41 | 42 | 43 | "index" : { 44 | "index" : "my_twitter_river", 45 | "type" : "status", 46 | "bulk_size" : 100 47 | } 48 | } 49 | ' 50 | ``` 51 | 52 | In order to stop the river you need to delete it, as follows: 53 | ```sh 54 | curl -XDELETE localhost:9200/_river/my_twitter_river/ 55 | ``` 56 | 57 | Don't worry, this will only delete the river, not the index. 58 | 59 | 60 | 61 | Sounds cool? 
Why not try the Wikipedia river: 62 | https://github.com/elasticsearch/elasticsearch-river-wikipedia 63 | 64 | Read more on: 65 | * https://github.com/elasticsearch/elasticsearch-river-twitter 66 | * http://www.elasticsearch.org/guide/reference/river/ 67 | -------------------------------------------------------------------------------- /examples/third-example-real-tweets.md: -------------------------------------------------------------------------------- 1 | Prerequisites 2 | -------------- 3 | * Running node of elasticsearch on our uiosearch cluster 4 | * Open elasticsearch-uio-demo/sense/index.html 5 | 6 | 7 | Third example, real tweets with real queries 8 | -------------- 9 | 10 | In this example, we are going to use an index of tweets from all over the world. The index is named *twitter*. 11 | 12 | Match with the "and" operator (both terms must appear, though not necessarily adjacent) 13 | ```json 14 | { 15 | "query": { 16 | "match": { 17 | "text" : { 18 | "query": "New York", 19 | "operator": "and" 20 | } 21 | } 22 | } 23 | } 24 | ``` 25 | Boolean 26 | ```json 27 | { 28 | "query": { 29 | "bool": { 30 | "must": [ 31 | { 32 | "match": { 33 | "hashtag.text": "oslo" 34 | } 35 | } 36 | ], 37 | "must_not": [ 38 | { 39 | "match": { 40 | "source": "Instagram" 41 | } 42 | } 43 | ] 44 | } 45 | } 46 | } 47 | ``` 48 | 49 | Filters 50 | ----------- 51 | 52 | Exists 53 | ```json 54 | { 55 | "query": { 56 | "constant_score": { 57 | "filter": { 58 | "exists": { 59 | "field": "place.country" 60 | } 61 | } 62 | } 63 | } 64 | } 65 | ``` 66 | Boolean 67 | ```json 68 | { 69 | "query": { 70 | "constant_score": { 71 | "filter": { 72 | "and": { 73 | "filters": [ 74 | { 75 | "exists": { 76 | "field": "place.country" 77 | } 78 | }, 79 | { 80 | "term": { 81 | "hashtag.text": "oslo" 82 | } 83 | } 84 | ] 85 | } 86 | } 87 | } 88 | } 89 | } 90 | ``` 91 | 92 | * Filters: only exact matching 93 | * Queries: full-text search 94 | * Filters: binary match or no match 95 | * Queries: relevance scoring 96 | * Filters: fast 97 | * Queries: heavier 98 | * Filters: cacheable 99 | * Queries: 
not cacheable 100 | 101 | Facets 102 | ---------- 103 | 104 | So, who is spamming Twitter with tweets about Justin Bieber? 105 | 106 | ```json 107 | { 108 | "query": { 109 | "match": { 110 | "text": "justin bieber" 111 | } 112 | }, 113 | "facets": { 114 | "users": { 115 | "terms": { 116 | "field": "user.screen_name", 117 | "size": 20, 118 | "order": "count" 119 | } 120 | } 121 | } 122 | } 123 | ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | elasticsearch-uio-demo 2 | ====================== 3 | 4 | This demo is based on [hammertime](https://github.com/s1monw/hammertime). 5 | 6 | This repo is basically a collection of different software and tools that help you get started with elasticsearch. 7 | We include two submodules ([Kibana] and [Sense]), two configuration 8 | files ([elasticsearch.yml](elasticsearch.yml), [dashboard.json](dashboard.json)) and some [examples](examples/). 9 | 10 | [Kibana] is an open source (Apache Licensed), browser-based analytics 11 | and search interface for data sets stored in 12 | elasticsearch [[source]](https://github.com/elasticsearch/kibana). 13 | Said differently, Kibana helps you make sense of your data. 14 | 15 | [Sense] is a JSON-aware tool that allows you to communicate with 16 | elasticsearch from your browser, and it provides many features such as 17 | autocomplete and syntax checking. 18 | 19 | Prerequisites 20 | -------------- 21 | * Java (for [elasticsearch]) 22 | * Python (for [Kibana] 3) 23 | 24 | 25 | Installation 26 | -------------- 27 | 28 | 1. ``git clone --recursive https://github.com/comperiosearch/elasticsearch-uio-demo.git`` 29 | 2. Change directory to ``elasticsearch-uio-demo`` 30 | 31 | Now you can either run the script ``prepare.sh`` (Windows: ``prepare.bat``) or manually do the following: 32 | 33 | 3. 
Download [https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.3.zip](https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.3.zip) 34 | 4. Extract elasticsearch-0.90.3.zip in the same directory where you cloned this repo 35 | 5. Replace the configuration file *elasticsearch.yml* in *elasticsearch-0.90.3/config/* with the file provided in this repo 36 | 6. Install the Elastic HQ plugin, as follows: 37 | ``elasticsearch-0.90.3/bin/plugin -install royrusso/elasticsearch-HQ`` 38 | 39 | Now you are ready to go! 40 | 41 | 42 | Getting started with elasticsearch 43 | -------------- 44 | 45 | To run elasticsearch, simply do: 46 | ``elasticsearch-0.90.3/bin/elasticsearch`` 47 | 48 | Linux users: if you want to keep the elasticsearch process in the 49 | foreground, do: 50 | ``elasticsearch-0.90.3/bin/elasticsearch -f`` 51 | 52 | Windows users: ``.\elasticsearch-0.90.3\bin\elasticsearch`` 53 | 54 | In your favorite browser, open: 55 | localhost:9200 56 | 57 | Did you see something like: 58 | 59 | { 60 | "ok" : true, 61 | "status" : 200, 62 | "name" : "Styx and Stone", 63 | "version" : { 64 | "number" : "0.90.3", 65 | "build_hash" : "5c38d6076448b899d758f29443329571e2522410", 66 | "build_timestamp" : "2013-08-06T13:18:31Z", 67 | "build_snapshot" : false, 68 | "lucene_version" : "4.4" 69 | }, 70 | "tagline" : "You Know, for Search" 71 | } 72 | 73 | It works! 74 | 75 | Now you can try the queries in [examples](examples/). 76 | 77 | 78 | How to use Sense 79 | ----------------- 80 | Open sense/index.html in your browser, and you're done. 81 | 82 | Getting started with Kibana 83 | -------------- 84 | 1. Change directory to ``kibana`` 85 | 2. ``python -m SimpleHTTPServer`` 86 | 3. Open localhost:8000 87 | 4. 
Load the dashboard provided in this repo 88 | 89 | 90 | 91 | elasticsearch plugins 92 | -------------- 93 | * [HQ](https://github.com/royrusso/elasticsearch-HQ) ``bin/plugin -install royrusso/elasticsearch-HQ``, open: http://localhost:9200/_plugin/HQ/ 94 | * [inquisitor](https://github.com/polyfractal/elasticsearch-inquisitor) ``bin/plugin -install polyfractal/elasticsearch-inquisitor``, open: http://localhost:9200/_plugin/inquisitor 95 | * [head](https://github.com/mobz/elasticsearch-head) ``bin/plugin -install mobz/elasticsearch-head``, open: http://localhost:9200/_plugin/head 96 | 97 | [elasticsearch]: http://www.elasticsearch.org/ 98 | [Kibana]: http://www.elasticsearch.org/overview/kibana/ 99 | [Sense]: https://github.com/bleskes/sense/ 100 | -------------------------------------------------------------------------------- /dashboard.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Tweet Dashboard", 3 | "rows": [ 4 | { 5 | "title": "Query", 6 | "height": "50px", 7 | "editable": true, 8 | "collapse": false, 9 | "panels": [ 10 | { 11 | "error": false, 12 | "span": 4, 13 | "editable": true, 14 | "type": "query", 15 | "label": "Search", 16 | "query": "*", 17 | "pinned": true, 18 | "history": [ 19 | "*" 20 | ], 21 | "remember": 10, 22 | "title": "Fulltext Query" 23 | }, 24 | { 25 | "error": "", 26 | "span": 5, 27 | "editable": true, 28 | "type": "timepicker", 29 | "status": "Stable", 30 | "mode": "relative", 31 | "time_options": [ 32 | "5m", 33 | "15m", 34 | "1h", 35 | "6h", 36 | "12h", 37 | "24h", 38 | "2d", 39 | "7d", 40 | "30d" 41 | ], 42 | "timespan": "6h", 43 | "timefield": "created_at", 44 | "timeformat": "", 45 | "refresh": { 46 | "enable": true, 47 | "interval": 30, 48 | "min": 3 49 | }, 50 | "title": "Tweet Creation Date", 51 | "filter_id": 0 52 | } 53 | ], 54 | "collapsable": true 55 | }, 56 | { 57 | "title": "Analytics", 58 | "height": "250px", 59 | "editable": true, 60 | "collapse": false, 61 | 
"panels": [ 62 | { 63 | "error": false, 64 | "span": 3, 65 | "editable": true, 66 | "type": "terms", 67 | "loadingEditor": false, 68 | "queries": { 69 | "mode": "all", 70 | "ids": [ 71 | 0 72 | ] 73 | }, 74 | "field": "user.location", 75 | "exclude": [], 76 | "missing": false, 77 | "other": true, 78 | "size": 10, 79 | "order": "count", 80 | "style": { 81 | "font-size": "10pt" 82 | }, 83 | "donut": false, 84 | "tilt": false, 85 | "labels": true, 86 | "arrangement": "horizontal", 87 | "chart": "bar", 88 | "counter_pos": "below", 89 | "spyable": true, 90 | "title": "location" 91 | }, 92 | { 93 | "span": 5, 94 | "editable": true, 95 | "group": [ 96 | "default" 97 | ], 98 | "type": "histogram", 99 | "query": [ 100 | { 101 | "query": "place.country_code:en", 102 | "label": "place.country_code:en" 103 | } 104 | ], 105 | "interval": "5m", 106 | "show": [ 107 | "bars", 108 | "y-axis", 109 | "x-axis", 110 | "legend" 111 | ], 112 | "fill": 3, 113 | "timezone": "browser", 114 | "index": [ 115 | "twitter" 116 | ], 117 | "loading": false, 118 | "mode": "count", 119 | "time_field": "created_at", 120 | "queries": { 121 | "mode": "all", 122 | "ids": [ 123 | 0 124 | ] 125 | }, 126 | "value_field": null, 127 | "auto_int": true, 128 | "resolution": 100, 129 | "linewidth": 3, 130 | "spyable": true, 131 | "zoomlinks": true, 132 | "bars": true, 133 | "stack": true, 134 | "points": false, 135 | "lines": false, 136 | "legend": true, 137 | "x-axis": true, 138 | "y-axis": true, 139 | "percentage": false, 140 | "interactive": true, 141 | "tooltip": { 142 | "value_type": "cumulative", 143 | "query_as_alias": false 144 | } 145 | }, 146 | { 147 | "span": 4, 148 | "editable": true, 149 | "group": [ 150 | "default" 151 | ], 152 | "type": "map", 153 | "query": "place.country_code:en", 154 | "map": "world", 155 | "colors": [ 156 | "#E5FCC2", 157 | "#9DE0AD", 158 | "#45ADA8" 159 | ], 160 | "size": 100, 161 | "exclude": [], 162 | "index": [ 163 | "twitter" 164 | ], 165 | "field": "place.country_code", 
166 | "loading": false, 167 | "queries": { 168 | "mode": "all", 169 | "ids": [ 170 | 0 171 | ] 172 | }, 173 | "spyable": true, 174 | "index_limit": 0 175 | }, 176 | { 177 | "error": false, 178 | "span": 3, 179 | "editable": true, 180 | "type": "terms", 181 | "loadingEditor": false, 182 | "queries": { 183 | "mode": "all", 184 | "ids": [ 185 | 0 186 | ] 187 | }, 188 | "field": "source", 189 | "exclude": [], 190 | "missing": true, 191 | "other": true, 192 | "size": 10, 193 | "order": "count", 194 | "style": { 195 | "font-size": "10pt" 196 | }, 197 | "donut": false, 198 | "tilt": false, 199 | "labels": true, 200 | "arrangement": "horizontal", 201 | "chart": "bar", 202 | "counter_pos": "below", 203 | "spyable": true, 204 | "title": "source" 205 | } 206 | ], 207 | "collapsable": true 208 | }, 209 | { 210 | "title": "Table", 211 | "height": "350px", 212 | "editable": true, 213 | "collapse": false, 214 | "panels": [ 215 | { 216 | "span": 10, 217 | "editable": true, 218 | "group": [ 219 | "default" 220 | ], 221 | "type": "table", 222 | "title": "Tweets", 223 | "query": "place.country_code:en", 224 | "size": 100, 225 | "offset": 0, 226 | "sort": [ 227 | "created_at", 228 | "desc" 229 | ], 230 | "style": {}, 231 | "fields": [ 232 | "created_at", 233 | "user.name", 234 | "text" 235 | ], 236 | "index": [ 237 | "twitter" 238 | ], 239 | "error": false, 240 | "loading": false, 241 | "status": "Stable", 242 | "queries": { 243 | "mode": "all", 244 | "ids": [ 245 | 0 246 | ] 247 | }, 248 | "pages": 5, 249 | "overflow": "min-height", 250 | "highlight": [], 251 | "sortable": true, 252 | "header": true, 253 | "paging": true, 254 | "field_list": true, 255 | "trimFactor": 300, 256 | "normTimes": true, 257 | "spyable": true 258 | } 259 | ], 260 | "collapsable": true 261 | } 262 | ], 263 | "editable": true, 264 | "style": "dark", 265 | "failover": false, 266 | "services": { 267 | "query": { 268 | "idQueue": [], 269 | "list": { 270 | "0": { 271 | "id": 0, 272 | "color": "#7EB26D", 273 | 
"query": "*", 274 | "alias": "", 275 | "pin": false, 276 | "type": "lucene" 277 | } 278 | }, 279 | "ids": [ 280 | 0 281 | ] 282 | }, 283 | "filter": { 284 | "idQueue": [], 285 | "list": { 286 | "0": { 287 | "from": "2013-09-18T09:06:31.750Z", 288 | "to": "2013-09-18T15:06:31.751Z", 289 | "field": "created_at", 290 | "type": "time", 291 | "mandate": "must", 292 | "active": true, 293 | "alias": "", 294 | "id": 0 295 | } 296 | }, 297 | "ids": [ 298 | 0 299 | ] 300 | } 301 | }, 302 | "loader": { 303 | "save_gist": false, 304 | "save_elasticsearch": true, 305 | "save_local": true, 306 | "save_default": true, 307 | "save_temp": true, 308 | "save_temp_ttl_enable": true, 309 | "save_temp_ttl": "30d", 310 | "load_gist": true, 311 | "load_elasticsearch": true, 312 | "load_elasticsearch_size": 20, 313 | "load_local": true, 314 | "hide": false 315 | }, 316 | "index": { 317 | "interval": "none", 318 | "pattern": "_all", 319 | "default": "twitter" 320 | } 321 | } -------------------------------------------------------------------------------- /LICENCE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, and 10 | distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by the copyright 13 | owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all other entities 16 | that control, are controlled by, or are under common control with that entity. 
17 | For the purposes of this definition, "control" means (i) the power, direct or 18 | indirect, to cause the direction or management of such entity, whether by 19 | contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the 20 | outstanding shares, or (iii) beneficial ownership of such entity. 21 | 22 | "You" (or "Your") shall mean an individual or Legal Entity exercising 23 | permissions granted by this License. 24 | 25 | "Source" form shall mean the preferred form for making modifications, including 26 | but not limited to software source code, documentation source, and configuration 27 | files. 28 | 29 | "Object" form shall mean any form resulting from mechanical transformation or 30 | translation of a Source form, including but not limited to compiled object code, 31 | generated documentation, and conversions to other media types. 32 | 33 | "Work" shall mean the work of authorship, whether in Source or Object form, made 34 | available under the License, as indicated by a copyright notice that is included 35 | in or attached to the work (an example is provided in the Appendix below). 36 | 37 | "Derivative Works" shall mean any work, whether in Source or Object form, that 38 | is based on (or derived from) the Work and for which the editorial revisions, 39 | annotations, elaborations, or other modifications represent, as a whole, an 40 | original work of authorship. For the purposes of this License, Derivative Works 41 | shall not include works that remain separable from, or merely link (or bind by 42 | name) to the interfaces of, the Work and Derivative Works thereof. 
43 | 44 | "Contribution" shall mean any work of authorship, including the original version 45 | of the Work and any modifications or additions to that Work or Derivative Works 46 | thereof, that is intentionally submitted to Licensor for inclusion in the Work 47 | by the copyright owner or by an individual or Legal Entity authorized to submit 48 | on behalf of the copyright owner. For the purposes of this definition, 49 | "submitted" means any form of electronic, verbal, or written communication sent 50 | to the Licensor or its representatives, including but not limited to 51 | communication on electronic mailing lists, source code control systems, and 52 | issue tracking systems that are managed by, or on behalf of, the Licensor for 53 | the purpose of discussing and improving the Work, but excluding communication 54 | that is conspicuously marked or otherwise designated in writing by the copyright 55 | owner as "Not a Contribution." 56 | 57 | "Contributor" shall mean Licensor and any individual or Legal Entity on behalf 58 | of whom a Contribution has been received by Licensor and subsequently 59 | incorporated within the Work. 60 | 61 | 2. Grant of Copyright License. 62 | 63 | Subject to the terms and conditions of this License, each Contributor hereby 64 | grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, 65 | irrevocable copyright license to reproduce, prepare Derivative Works of, 66 | publicly display, publicly perform, sublicense, and distribute the Work and such 67 | Derivative Works in Source or Object form. 68 | 69 | 3. Grant of Patent License. 
70 | 71 | Subject to the terms and conditions of this License, each Contributor hereby 72 | grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, 73 | irrevocable (except as stated in this section) patent license to make, have 74 | made, use, offer to sell, sell, import, and otherwise transfer the Work, where 75 | such license applies only to those patent claims licensable by such Contributor 76 | that are necessarily infringed by their Contribution(s) alone or by combination 77 | of their Contribution(s) with the Work to which such Contribution(s) was 78 | submitted. If You institute patent litigation against any entity (including a 79 | cross-claim or counterclaim in a lawsuit) alleging that the Work or a 80 | Contribution incorporated within the Work constitutes direct or contributory 81 | patent infringement, then any patent licenses granted to You under this License 82 | for that Work shall terminate as of the date such litigation is filed. 83 | 84 | 4. Redistribution. 
85 | 86 | You may reproduce and distribute copies of the Work or Derivative Works thereof 87 | in any medium, with or without modifications, and in Source or Object form, 88 | provided that You meet the following conditions: 89 | 90 | You must give any other recipients of the Work or Derivative Works a copy of 91 | this License; and 92 | You must cause any modified files to carry prominent notices stating that You 93 | changed the files; and 94 | You must retain, in the Source form of any Derivative Works that You distribute, 95 | all copyright, patent, trademark, and attribution notices from the Source form 96 | of the Work, excluding those notices that do not pertain to any part of the 97 | Derivative Works; and 98 | If the Work includes a "NOTICE" text file as part of its distribution, then any 99 | Derivative Works that You distribute must include a readable copy of the 100 | attribution notices contained within such NOTICE file, excluding those notices 101 | that do not pertain to any part of the Derivative Works, in at least one of the 102 | following places: within a NOTICE text file distributed as part of the 103 | Derivative Works; within the Source form or documentation, if provided along 104 | with the Derivative Works; or, within a display generated by the Derivative 105 | Works, if and wherever such third-party notices normally appear. The contents of 106 | the NOTICE file are for informational purposes only and do not modify the 107 | License. You may add Your own attribution notices within Derivative Works that 108 | You distribute, alongside or as an addendum to the NOTICE text from the Work, 109 | provided that such additional attribution notices cannot be construed as 110 | modifying the License. 
111 | You may add Your own copyright statement to Your modifications and may provide 112 | additional or different license terms and conditions for use, reproduction, or 113 | distribution of Your modifications, or for any such Derivative Works as a whole, 114 | provided Your use, reproduction, and distribution of the Work otherwise complies 115 | with the conditions stated in this License. 116 | 117 | 5. Submission of Contributions. 118 | 119 | Unless You explicitly state otherwise, any Contribution intentionally submitted 120 | for inclusion in the Work by You to the Licensor shall be under the terms and 121 | conditions of this License, without any additional terms or conditions. 122 | Notwithstanding the above, nothing herein shall supersede or modify the terms of 123 | any separate license agreement you may have executed with Licensor regarding 124 | such Contributions. 125 | 126 | 6. Trademarks. 127 | 128 | This License does not grant permission to use the trade names, trademarks, 129 | service marks, or product names of the Licensor, except as required for 130 | reasonable and customary use in describing the origin of the Work and 131 | reproducing the content of the NOTICE file. 132 | 133 | 7. Disclaimer of Warranty. 134 | 135 | Unless required by applicable law or agreed to in writing, Licensor provides the 136 | Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, 137 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, 138 | including, without limitation, any warranties or conditions of TITLE, 139 | NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are 140 | solely responsible for determining the appropriateness of using or 141 | redistributing the Work and assume any risks associated with Your exercise of 142 | permissions under this License. 143 | 144 | 8. Limitation of Liability. 
145 | 146 | In no event and under no legal theory, whether in tort (including negligence), 147 | contract, or otherwise, unless required by applicable law (such as deliberate 148 | and grossly negligent acts) or agreed to in writing, shall any Contributor be 149 | liable to You for damages, including any direct, indirect, special, incidental, 150 | or consequential damages of any character arising as a result of this License or 151 | out of the use or inability to use the Work (including but not limited to 152 | damages for loss of goodwill, work stoppage, computer failure or malfunction, or 153 | any and all other commercial damages or losses), even if such Contributor has 154 | been advised of the possibility of such damages. 155 | 156 | 9. Accepting Warranty or Additional Liability. 157 | 158 | While redistributing the Work or Derivative Works thereof, You may choose to 159 | offer, and charge a fee for, acceptance of support, warranty, indemnity, or 160 | other liability obligations and/or rights consistent with this License. However, 161 | in accepting such obligations, You may act only on Your own behalf and on Your 162 | sole responsibility, not on behalf of any other Contributor, and only if You 163 | agree to indemnify, defend, and hold each Contributor harmless for any liability 164 | incurred by, or claims asserted against, such Contributor by reason of your 165 | accepting any such warranty or additional liability. 166 | 167 | END OF TERMS AND CONDITIONS 168 | 169 | APPENDIX: How to apply the Apache License to your work 170 | 171 | To apply the Apache License to your work, attach the following boilerplate 172 | notice, with the fields enclosed by brackets "[]" replaced with your own 173 | identifying information. (Don't include the brackets!) The text should be 174 | enclosed in the appropriate comment syntax for the file format. 
We also 175 | recommend that a file or class name and description of purpose be included on 176 | the same "printed page" as the copyright notice for easier identification within 177 | third-party archives. 178 | 179 | Copyright [yyyy] [name of copyright owner] 180 | 181 | Licensed under the Apache License, Version 2.0 (the "License"); 182 | you may not use this file except in compliance with the License. 183 | You may obtain a copy of the License at 184 | 185 | http://www.apache.org/licenses/LICENSE-2.0 186 | 187 | Unless required by applicable law or agreed to in writing, software 188 | distributed under the License is distributed on an "AS IS" BASIS, 189 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 190 | See the License for the specific language governing permissions and 191 | limitations under the License. 192 | -------------------------------------------------------------------------------- /elasticsearch.yml: -------------------------------------------------------------------------------- 1 | ##################### ElasticSearch Configuration Example ##################### 2 | 3 | # This file contains an overview of various configuration settings, 4 | # targeted at operations staff. Application developers should 5 | # consult the guide at . 6 | # 7 | # The installation procedure is covered at 8 | # . 9 | # 10 | # ElasticSearch comes with reasonable defaults for most settings, 11 | # so you can try it out without bothering with configuration. 12 | # 13 | # Most of the time, these defaults are just fine for running a production 14 | # cluster. If you're fine-tuning your cluster, or wondering about the 15 | # effect of certain configuration option, please _do ask_ on the 16 | # mailing list or IRC channel [http://elasticsearch.org/community]. 17 | 18 | # Any element in the configuration can be replaced with environment variables 19 | # by placing them in ${...} notation. 
For example: 20 | # 21 | # node.rack: ${RACK_ENV_VAR} 22 | 23 | # See the online configuration documentation 24 | # for information on supported formats and syntax for the configuration file. 25 | 26 | 27 | ################################### Cluster ################################### 28 | 29 | # The cluster name identifies your cluster for auto-discovery. If you're running 30 | # multiple clusters on the same network, make sure you're using unique names. 31 | # 32 | cluster.name: uiosearch 33 | 34 | 35 | #################################### Node ##################################### 36 | 37 | # Node names are generated dynamically on startup, so you're relieved 38 | # from configuring them manually. You can tie this node to a specific name: 39 | # 40 | # node.name: "Franz Kafka" 41 | 42 | # Every node can be configured to allow or deny being eligible as the master, 43 | # and to allow or deny storing data. 44 | # 45 | # Allow this node to be eligible as a master node (enabled by default): 46 | # 47 | # node.master: false 48 | # 49 | # Allow this node to store data (enabled by default): 50 | # 51 | # node.data: true 52 | 53 | # You can exploit these settings to design advanced cluster topologies. 54 | # 55 | # 1. You want this node never to become a master node, only to hold data. 56 | # This will be the "workhorse" of your cluster. 57 | # 58 | # node.master: false 59 | # node.data: true 60 | # 61 | # 2. You want this node to serve only as a master: to not store any data and 62 | # to have free resources. This will be the "coordinator" of your cluster. 63 | # 64 | # node.master: true 65 | # node.data: false 66 | # 67 | # 3. You want this node to be neither a master nor a data node, but 68 | # to act as a "search load balancer" (fetching data from nodes, 69 | # aggregating results, etc.)
70 | # 71 | # node.master: false 72 | # node.data: false 73 | 74 | # Use the Cluster Health API [http://localhost:9200/_cluster/health], the 75 | # Node Info API [http://localhost:9200/_cluster/nodes] or GUI tools 76 | # from the community 77 | # to inspect the cluster state. 78 | 79 | # A node can have generic attributes associated with it, which can later be used 80 | # for customized shard allocation filtering, or allocation awareness. An attribute 81 | # is a simple key-value pair, similar to node.key: value; here is an example: 82 | # 83 | # node.rack: rack314 84 | 85 | # By default, multiple nodes are allowed to start from the same installation location; 86 | # to disable it, set the following: 87 | # node.max_local_storage_nodes: 1 88 | 89 | 90 | #################################### Index #################################### 91 | 92 | # You can set a number of options (such as shard/replica options, mapping 93 | # or analyzer definitions, translog settings, ...) for indices globally, 94 | # in this file. 95 | # 96 | # Note that it makes more sense to configure index settings specifically for 97 | # a certain index, either when creating it or by using the index templates API. 98 | # 99 | # See the online documentation on index settings 100 | # and index templates 101 | # for more information. 102 | 103 | # Set the number of shards (splits) of an index (5 by default): 104 | # 105 | # index.number_of_shards: 5 106 | 107 | # Set the number of replicas (additional copies) of an index (1 by default): 108 | # 109 | # index.number_of_replicas: 1 110 | 111 | # Note that for development on a local machine, with small indices, it usually 112 | # makes sense to "disable" the distributed features: 113 | # 114 | # index.number_of_shards: 1 115 | # index.number_of_replicas: 0 116 | 117 | # These settings directly affect the performance of index and search operations 118 | # in your cluster. Assuming you have enough machines to hold shards and 119 | # replicas, the rule of thumb is: 120 | # 121 | # 1.
Having more *shards* enhances the _indexing_ performance and allows you to 122 | # _distribute_ a big index across machines. 123 | # 2. Having more *replicas* enhances the _search_ performance and improves the 124 | # cluster _availability_. 125 | # 126 | # The "number_of_shards" is a one-time setting for an index. 127 | # 128 | # The "number_of_replicas" can be increased or decreased anytime, 129 | # by using the Index Update Settings API. 130 | # 131 | # ElasticSearch takes care of load balancing, relocating, gathering the 132 | # results from nodes, etc. Experiment with different settings to fine-tune 133 | # your setup. 134 | 135 | # Use the Index Status API to inspect 136 | # the index status. 137 | 138 | 139 | #################################### Paths #################################### 140 | 141 | # Path to directory containing configuration (this file and logging.yml): 142 | # 143 | # path.conf: /path/to/conf 144 | 145 | # Path to the directory where index data allocated for this node is stored. 146 | # 147 | # path.data: /path/to/data 148 | # 149 | # Can optionally include more than one location, causing data to be striped across 150 | # the locations (a la RAID 0) on a file level, favouring locations with the most free 151 | # space on creation. For example: 152 | # 153 | # path.data: /path/to/data1,/path/to/data2 154 | 155 | # Path to temporary files: 156 | # 157 | # path.work: /path/to/work 158 | 159 | # Path to log files: 160 | # 161 | # path.logs: /path/to/logs 162 | 163 | # Path to where plugins are installed: 164 | # 165 | # path.plugins: /path/to/plugins 166 | 167 | 168 | #################################### Plugin ################################### 169 | 170 | # If a plugin listed here is not installed for the current node, the node will not start.
171 | # 172 | # plugin.mandatory: mapper-attachments,lang-groovy 173 | 174 | 175 | ################################### Memory #################################### 176 | 177 | # ElasticSearch performs poorly when the JVM starts swapping: you should ensure that 178 | # it _never_ swaps. 179 | # 180 | # Set this property to true to lock the memory: 181 | # 182 | # bootstrap.mlockall: true 183 | 184 | # Make sure that the ES_MIN_MEM and ES_MAX_MEM environment variables are set 185 | # to the same value, and that the machine has enough memory to allocate 186 | # for ElasticSearch, leaving enough memory for the operating system itself. 187 | # 188 | # You should also make sure that the ElasticSearch process is allowed to lock 189 | # the memory, e.g. by using `ulimit -l unlimited`. 190 | 191 | 192 | ############################## Network And HTTP ############################### 193 | 194 | # ElasticSearch, by default, binds itself to the 0.0.0.0 address, and listens 195 | # on port [9200-9300] for HTTP traffic and on port [9300-9400] for node-to-node 196 | # communication. (The range means that if a port is busy, it will automatically 197 | # try the next port.) 198 | 199 | # Set the bind address specifically (IPv4 or IPv6): 200 | # 201 | # network.bind_host: 192.168.0.1 202 | 203 | # Set the address other nodes will use to communicate with this node. If not 204 | # set, it is automatically derived. It must point to an actual IP address.
205 | # 206 | # network.publish_host: 192.168.0.1 207 | 208 | # Set both 'bind_host' and 'publish_host': 209 | # 210 | # network.host: 192.168.0.1 211 | 212 | # Set a custom port for node-to-node communication (9300 by default): 213 | # 214 | # transport.tcp.port: 9300 215 | 216 | # Enable compression for all communication between nodes (disabled by default): 217 | # 218 | # transport.tcp.compress: true 219 | 220 | # Set a custom port to listen for HTTP traffic: 221 | # 222 | # http.port: 9200 223 | 224 | # Set a custom allowed content length: 225 | # 226 | # http.max_content_length: 100mb 227 | 228 | # Disable HTTP completely: 229 | # 230 | # http.enabled: false 231 | 232 | 233 | ################################### Gateway ################################### 234 | 235 | # The gateway allows for persisting the cluster state between full cluster 236 | # restarts. Every change to the state (such as adding an index) will be stored 237 | # in the gateway, and when the cluster starts up for the first time, 238 | # it will read its state from the gateway. 239 | 240 | # There are several types of gateway implementations. For more information, 241 | # see the online gateway documentation. 242 | 243 | # The default gateway type is the "local" gateway (recommended): 244 | # 245 | # gateway.type: local 246 | 247 | # The settings below control how and when to start the initial recovery process on 248 | # a full cluster restart (to reuse as much local data as possible when using a shared 249 | # gateway). 250 | 251 | # Allow the recovery process after N nodes in a cluster are up: 252 | # 253 | # gateway.recover_after_nodes: 1 254 | 255 | # Set the timeout to initiate the recovery process, once the N nodes 256 | # from the previous setting are up (accepts a time value): 257 | # 258 | # gateway.recover_after_time: 5m 259 | 260 | # Set how many nodes are expected in this cluster.
Once these N nodes 261 | # are up (and recover_after_nodes is met), begin the recovery process immediately 262 | # (without waiting for recover_after_time to expire): 263 | # 264 | # gateway.expected_nodes: 2 265 | 266 | 267 | ############################# Recovery Throttling ############################# 268 | 269 | # These settings allow you to control the process of shard allocation between 270 | # nodes during initial recovery, replica allocation, rebalancing, 271 | # or when adding and removing nodes. 272 | 273 | # Set the number of concurrent recoveries happening on a node: 274 | # 275 | # 1. During the initial recovery 276 | # 277 | # cluster.routing.allocation.node_initial_primaries_recoveries: 4 278 | # 279 | # 2. During adding/removing nodes, rebalancing, etc. 280 | # 281 | # cluster.routing.allocation.node_concurrent_recoveries: 2 282 | 283 | # Set to throttle throughput when recovering (e.g. 100mb, by default 20mb): 284 | # 285 | # indices.recovery.max_bytes_per_sec: 20mb 286 | 287 | # Set to limit the number of open concurrent streams when 288 | # recovering a shard from a peer: 289 | # 290 | # indices.recovery.concurrent_streams: 5 291 | 292 | 293 | ################################## Discovery ################################## 294 | 295 | # The discovery infrastructure ensures nodes can be found within a cluster 296 | # and a master node is elected. Multicast discovery is the default. 297 | 298 | # Set to ensure a node sees N other master-eligible nodes to be considered 299 | # operational within the cluster. Set this option to a higher value (2-4) 300 | # for large clusters (>3 nodes): 301 | # 302 | # discovery.zen.minimum_master_nodes: 2 303 | 304 | # Set the time to wait for ping responses from other nodes when discovering. 305 | # Set this option to a higher value on a slow or congested network 306 | # to minimize discovery failures: 307 | # 308 | # discovery.zen.ping.timeout: 3s 309 | 310 | # See the online discovery documentation 311 | # for more information.
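# As a general rule of thumb (standard Elasticsearch guidance, not specific
# to this example cluster), minimum_master_nodes should be set to a quorum of
# master-eligible nodes, (master_eligible_nodes / 2) + 1, so that a network
# partition cannot elect two masters at once ("split-brain").
# For example, a hypothetical cluster with three master-eligible nodes:
#
# discovery.zen.minimum_master_nodes: 2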
312 | 313 | # Unicast discovery allows you to explicitly control which nodes will be used 314 | # to discover the cluster. It can be used when multicast is not present, 315 | # or to restrict the cluster communication-wise. 316 | # 317 | # 1. Disable multicast discovery (enabled by default): 318 | # 319 | # discovery.zen.ping.multicast.enabled: false 320 | # 321 | # 2. Configure an initial list of master nodes in the cluster 322 | # to perform discovery when new nodes (master or data) are started: 323 | # 324 | # discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"] 325 | 326 | # EC2 discovery allows you to use the AWS EC2 API to perform discovery. 327 | # 328 | # You have to install the cloud-aws plugin to enable EC2 discovery. 329 | # 330 | # See the cloud-aws plugin documentation 331 | # for more information. 332 | # 333 | # See the online tutorials 334 | # for a step-by-step guide. 335 | 336 | 337 | ################################## Slow Log ################################## 338 | 339 | # Shard-level query and fetch threshold logging.
340 | 341 | #index.search.slowlog.threshold.query.warn: 10s 342 | #index.search.slowlog.threshold.query.info: 5s 343 | #index.search.slowlog.threshold.query.debug: 2s 344 | #index.search.slowlog.threshold.query.trace: 500ms 345 | 346 | #index.search.slowlog.threshold.fetch.warn: 1s 347 | #index.search.slowlog.threshold.fetch.info: 800ms 348 | #index.search.slowlog.threshold.fetch.debug: 500ms 349 | #index.search.slowlog.threshold.fetch.trace: 200ms 350 | 351 | #index.indexing.slowlog.threshold.index.warn: 10s 352 | #index.indexing.slowlog.threshold.index.info: 5s 353 | #index.indexing.slowlog.threshold.index.debug: 2s 354 | #index.indexing.slowlog.threshold.index.trace: 500ms 355 | 356 | ################################## GC Logging ################################ 357 | 358 | #monitor.jvm.gc.ParNew.warn: 1000ms 359 | #monitor.jvm.gc.ParNew.info: 700ms 360 | #monitor.jvm.gc.ParNew.debug: 400ms 361 | 362 | #monitor.jvm.gc.ConcurrentMarkSweep.warn: 10s 363 | #monitor.jvm.gc.ConcurrentMarkSweep.info: 5s 364 | #monitor.jvm.gc.ConcurrentMarkSweep.debug: 2s 365 | --------------------------------------------------------------------------------