├── .gitignore ├── README.md ├── SUMMARY.md ├── aggregations.md ├── aggregations ├── bucket-aggregations.md ├── metrics-aggregations.md └── structuring-aggregations.md ├── assets ├── Cover_1600_2400.jpg ├── Cover_400_600.jpg ├── Cover_800_1200.jpg └── qrcode_for_gh_26893aa0a4ea_258.jpg ├── client.md ├── client ├── transport-client.md └── xpack-transport-client.md ├── dependency.md ├── document-apis.md ├── document-apis ├── bulk-api.md ├── delete-api.md ├── delete-by-query-api.md ├── get-api.md ├── index-api.md ├── multi-get-api.md ├── update-api.md └── using-bulk-processor.md ├── indexed-scripts-api.md ├── indexed-scripts-api └── script-language.md ├── java-api-administration.md ├── java-api-administration ├── cluster-administration.md └── indices-administration.md ├── query-dsl.md ├── query-dsl ├── compound-queries.md ├── full-text-queries.md ├── geo-queries.md ├── joining-queries.md ├── match-all-query.md ├── span-queries.md ├── specialized-queries.md └── term-level-queries.md ├── search-api.md └── search-api ├── multisearch-api.md ├── search-template.md ├── terminate-after.md ├── using-aggregations.md └── using-scrolls-in-java.md /.gitignore: -------------------------------------------------------------------------------- 1 | # Node rules: 2 | ## Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files) 3 | .grunt 4 | 5 | ## Dependency directory 6 | ## Commenting this out is preferred by some people, see 7 | ## https://docs.npmjs.com/misc/faq#should-i-check-my-node_modules-folder-into-git 8 | node_modules 9 | 10 | # Book build output 11 | _book 12 | 13 | # eBook build output 14 | *.epub 15 | *.mobi 16 | *.pdf -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Elasticsearch Java API 手册 2 | 3 | ![](/assets/Cover_400_600.jpg) 4 | 5 | 本手册由 [全科](http://woquanke.com) 翻译,并且整理成电子书,支持PDF,ePub,Mobi格式,方便大家下载阅读。 6 | 7 | 8 
| 阅读地址:[http://woquanke.com/books/esjava/](http://woquanke.com/books/esjava/) 9 | 10 | 下载地址:[https://www.gitbook.com/book/quanke/elasticsearch-java](https://www.gitbook.com/book/quanke/elasticsearch-java) 11 | 12 | GitHub 地址:[https://github.com/quanke/elasticsearch-java](https://github.com/quanke/elasticsearch-java) 13 | 14 | Gitee 地址:[https://gitee.com/quanke/elasticsearch-java](https://gitee.com/quanke/elasticsearch-java) 15 | 16 | 配套示例代码:[https://gitee.com/quanke/elasticsearch-java-study](https://gitee.com/quanke/elasticsearch-java-study) 17 | 18 | 19 | 编辑:[http://woquanke.com](http://woquanke.com) 20 | 21 | 编辑整理辛苦,还望大神们点一下 star,抚平我虚荣的心 22 | 23 | > 不只是官方文档的翻译,还包含使用实例和我们踩过的坑 24 | 25 | ## 推荐阅读 26 | 27 | [Elasticsearch Java Rest 手册](https://www.gitbook.com/book/quanke/elasticsearch-java-rest/) 已经完成大部分 28 | 29 | 30 | 更多请关注我的微信公众号: 31 | 32 | ![](/assets/qrcode_for_gh_26893aa0a4ea_258.jpg) 33 | 34 | 35 | 下面几个章节应用得相对少,所以会延后更新,计划先把配套实例 [elasticsearch-java-study](https://gitee.com/quanke/elasticsearch-java-study) 项目写完。 36 | 37 | * [Indexed Scripts API](indexed-scripts-api.md) 38 | * [Script Language](indexed-scripts-api/script-language.md) 39 | * [Java API Administration](java-api-administration.md) 40 | * [Indices Administration](java-api-administration/indices-administration.md) 41 | * [Cluster Administration](java-api-administration/cluster-administration.md) 42 | 43 | ## 参考 44 | 45 | - [elasticsearch java API 官方文档](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html) 46 | - [elasticsearch性能调优](http://www.cnblogs.com/hseagle/p/6015245.html) 47 | - [ElasticSearch 5.0.1 java API操作](http://blog.csdn.net/gaoqiao1988/article/details/53842728) 48 | - [fendo Elasticsearch 类目](http://blog.csdn.net/u011781521/article/category/7096008) 49 | - [Java API 之 滚动搜索(Scroll API)](http://blog.csdn.net/sunnyyoona/article/details/52810397) 50 | - [Elastic Elasticsearch - ApacheCN(Apache中文网)](http://cwiki.apachecn.org/display/Elasticsearch/) 51 | - 
[aggregation 详解2(metrics aggregations)](http://www.cnblogs.com/licongyu/p/5515786.html) 52 | - [aggregation 详解3(bucket aggregation)](http://www.cnblogs.com/licongyu/p/5503094.html) 53 | - [Percentile Ranks Aggregation](http://www.cnblogs.com/benjiming/p/7099638.html) 54 | - [Java API之TermQuery](http://blog.csdn.net/sunnyyoona/article/details/52852483) -------------------------------------------------------------------------------- /SUMMARY.md: -------------------------------------------------------------------------------- 1 | # Summary 2 | 3 | * [Introduction](README.md) 4 | * [Dependency](dependency.md) 5 | * [Client](client.md) 6 | * [Transport Client](client/transport-client.md) 7 | * [XPack Transport Client](client/xpack-transport-client.md) 8 | * [Document APIs](document-apis.md) 9 | * [Index API](document-apis/index-api.md) 10 | * [Get API](document-apis/get-api.md) 11 | * [Delete API](document-apis/delete-api.md) 12 | * [Delete By Query API](document-apis/delete-by-query-api.md) 13 | * [Update API](document-apis/update-api.md) 14 | * [Multi Get API](document-apis/multi-get-api.md) 15 | * [Bulk API](document-apis/bulk-api.md) 16 | * [Using Bulk Processor](document-apis/using-bulk-processor.md) 17 | * [Search API](search-api.md) 18 | * [Using scrolls in Java](search-api/using-scrolls-in-java.md) 19 | * [MultiSearch API](search-api/multisearch-api.md) 20 | * [Using Aggregations](search-api/using-aggregations.md) 21 | * [Terminate After](search-api/terminate-after.md) 22 | * [Search Template](search-api/search-template.md) 23 | * [Aggregations](aggregations.md) 24 | * [Structuring aggregations](aggregations/structuring-aggregations.md) 25 | * [Metrics aggregations](aggregations/metrics-aggregations.md) 26 | * [Bucket aggregations](aggregations/bucket-aggregations.md) 27 | * [Query DSL](query-dsl.md) 28 | * [Match All Query](query-dsl/match-all-query.md) 29 | * [Full text queries](query-dsl/full-text-queries.md) 30 | * [Term level 
queries](query-dsl/term-level-queries.md) 31 | * [Compound queries](query-dsl/compound-queries.md) 32 | * [Joining queries](query-dsl/joining-queries.md) 33 | * [Geo queries](query-dsl/geo-queries.md) 34 | * [Specialized queries](query-dsl/specialized-queries.md) 35 | * [Span queries](query-dsl/span-queries.md) 36 | * [Indexed Scripts API](indexed-scripts-api.md) 37 | * [Script Language](indexed-scripts-api/script-language.md) 38 | * [Java API Administration](java-api-administration.md) 39 | * [Indices Administration](java-api-administration/indices-administration.md) 40 | * [Cluster Administration](java-api-administration/cluster-administration.md) 41 | 42 | -------------------------------------------------------------------------------- /aggregations.md: -------------------------------------------------------------------------------- 1 | 2 | ## Aggregations 3 | 4 | 聚合 5 | 6 | Elasticsearch提供完整的Java API来使用聚合。 请参阅[聚合指南](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations.html)。 7 | 8 | 9 | 使用 `AggregationBuilders` 构建对象,增加到搜索请求中: 10 | 11 | ``` 12 | import org.elasticsearch.search.aggregations.AggregationBuilders; 13 | 14 | ``` 15 | 16 | ``` 17 | SearchResponse sr = node.client().prepareSearch() 18 | .setQuery( /* your query */ ) 19 | .addAggregation( /* add an aggregation */ ) 20 | .execute().actionGet(); 21 | ``` 22 | -------------------------------------------------------------------------------- /aggregations/bucket-aggregations.md: -------------------------------------------------------------------------------- 1 | 2 | ### Bucket aggregations 桶分聚合 3 | 4 | > Bucket aggregations 不像 metrics aggregations 那样计算指标,恰恰相反,它创建文档的buckets,每个buckets与标准(取决于聚合类型)相关联,它决定了当前上下文中的文档是否会“falls”到它。换句话说,bucket可以有效地定义文档集合。除了buckets本身,bucket集合还计算并返回“落入”每个bucket的文档数量。 5 | 6 | > 与度量聚合相比,Bucket聚合可以保存子聚合,这些子聚合将针对由其“父”bucket聚合创建的bucket进行聚合。 7 | 8 | > 有不同的bucket聚合器,每个具有不同的“bucketing”策略,一些定义一个单独的bucket,一些定义多个bucket的固定数量,另一些定义在聚合过程中动态创建bucket 9 | 10 | 11 | #### 
Global Aggregation 全局聚合 12 | 13 | > 定义搜索执行上下文中的所有文档的单个bucket,这个上下文由索引和您正在搜索的文档类型定义,但不受搜索查询本身的影响。 14 | 15 | 16 | > 全局聚合器只能作为顶层聚合器放置,因为将全局聚合器嵌入到另一个分组聚合器中是没有意义的。 17 | 18 | 19 | 下面是如何使用 `Java API` 使用[全局聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-global-aggregation.html) 20 | 21 | ##### 准备聚合请求 22 | 23 | 下面是如何创建聚合请求的是示例: 24 | 25 | 26 | ``` 27 | AggregationBuilders 28 | .global("agg") 29 | .subAggregation(AggregationBuilders.terms("genders").field("gender")); 30 | ``` 31 | 32 | ##### 使用聚合请求 33 | 34 | 35 | ``` 36 | import org.elasticsearch.search.aggregations.bucket.global.Global; 37 | 38 | ``` 39 | 40 | ``` 41 | // sr is here your SearchResponse object 42 | Global agg = sr.getAggregations().get("agg"); 43 | agg.getDocCount(); // Doc count 44 | ``` 45 | 46 | #### Filter Aggregation 过滤聚合 47 | 48 | > 过滤聚合——基于一个条件,来对当前的文档进行过滤的聚合。 49 | 50 | 51 | 52 | 下面是如何使用 `Java API` 使用[过滤聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-filter-aggregation.html) 53 | 54 | ##### 准备聚合请求 55 | 56 | 下面是如何创建聚合请求的是示例: 57 | 58 | 59 | ``` 60 | AggregationBuilders 61 | .filter("agg", QueryBuilders.termQuery("gender", "male")); 62 | ``` 63 | 64 | ##### 使用聚合请求 65 | 66 | 67 | ``` 68 | import org.elasticsearch.search.aggregations.bucket.filter.Filter; 69 | 70 | ``` 71 | 72 | ``` 73 | // sr is here your SearchResponse object 74 | Filter agg = sr.getAggregations().get("agg"); 75 | agg.getDocCount(); // Doc count 76 | ``` 77 | 78 | #### Filters Aggregation 多过滤聚合 79 | 80 | > 多过滤聚合——基于多个过滤条件,来对当前文档进行【过滤】的聚合,每个过滤都包含所有满足它的文档(多个bucket中可能重复)。 81 | 82 | 83 | 84 | 下面是如何使用 `Java API` 使用[多过滤聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-filters-aggregation.html) 85 | 86 | ##### 准备聚合请求 87 | 88 | 下面是如何创建聚合请求的是示例: 89 | 90 | 91 | ``` 92 | AggregationBuilder aggregation = 93 | AggregationBuilders 94 | .filters("agg", 95 | new FiltersAggregator.KeyedFilter("men", 
QueryBuilders.termQuery("gender", "male")), 96 | new FiltersAggregator.KeyedFilter("women", QueryBuilders.termQuery("gender", "female"))); 97 | ``` 98 | 99 | ##### 使用聚合请求 100 | 101 | 102 | ``` 103 | import org.elasticsearch.search.aggregations.bucket.filters.Filters; 104 | 105 | ``` 106 | 107 | ``` 108 | // sr is here your SearchResponse object 109 | Filters agg = sr.getAggregations().get("agg"); 110 | 111 | // For each entry 112 | for (Filters.Bucket entry : agg.getBuckets()) { 113 | String key = entry.getKeyAsString(); // bucket key 114 | long docCount = entry.getDocCount(); // Doc count 115 | logger.info("key [{}], doc_count [{}]", key, docCount); 116 | } 117 | ``` 118 | 119 | 大概输出 120 | 121 | 122 | ``` 123 | key [men], doc_count [4982] 124 | key [women], doc_count [5018] 125 | ``` 126 | 127 | #### Missing Aggregation 基于字段数据的单桶聚合 128 | 129 | > 基于字段数据的单桶聚合,创建当前文档集上下文中缺少字段值的所有文档的bucket(桶)(有效地,丢失了一个字段或配置了NULL值集),此聚合器通常与其他字段数据桶聚合器(例如范围)结合使用,以返回由于缺少字段数据值而无法放置在任何其他存储区中的所有文档的信息。 130 | 131 | 132 | 133 | 下面是如何使用 `Java API` 使用[基于字段数据的单桶聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-missing-aggregation.html) 134 | 135 | ##### 准备聚合请求 136 | 137 | 下面是如何创建聚合请求的是示例: 138 | 139 | 140 | ``` 141 | AggregationBuilders.missing("agg").field("gender"); 142 | 143 | ``` 144 | 145 | ##### 使用聚合请求 146 | 147 | 148 | ``` 149 | import org.elasticsearch.search.aggregations.bucket.missing.Missing; 150 | 151 | ``` 152 | 153 | ``` 154 | // sr is here your SearchResponse object 155 | Missing agg = sr.getAggregations().get("agg"); 156 | agg.getDocCount(); // Doc count 157 | ``` 158 | 159 | #### Nested Aggregation 嵌套类型聚合 160 | 161 | > 基于嵌套(nested)数据类型,把该【嵌套类型的信息】聚合到单个桶里,然后就可以对嵌套类型做进一步的聚合操作。 162 | 163 | 164 | 165 | 下面是如何使用 `Java API` 使用[嵌套类型聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-nested-aggregation.html) 166 | 167 | ##### 准备聚合请求 168 | 169 | 下面是如何创建聚合请求的是示例: 170 | 171 | 172 | ``` 173 | AggregationBuilders 174 
| .nested("agg", "resellers"); 175 | 176 | ``` 177 | 178 | ##### 使用聚合请求 179 | 180 | 181 | ``` 182 | import org.elasticsearch.search.aggregations.bucket.nested.Nested; 183 | 184 | ``` 185 | 186 | ``` 187 | // sr is here your SearchResponse object 188 | Nested agg = sr.getAggregations().get("agg"); 189 | agg.getDocCount(); // Doc count 190 | ``` 191 | 192 | #### Reverse nested Aggregation 193 | 194 | > 一个特殊的单桶聚合,可以从嵌套文档中聚合父文档。实际上,这种聚合可以从嵌套的块结构中跳出来,并链接到其他嵌套的结构或根文档.这允许嵌套不是嵌套对象的一部分的其他聚合在嵌套聚合中。 195 | reverse_nested 聚合必须定义在nested之中 196 | 197 | 198 | 199 | 下面是如何使用 `Java API` 使用[Reverse nested Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-reverse-nested-aggregation.html) 200 | 201 | ##### 准备聚合请求 202 | 203 | 下面是如何创建聚合请求的是示例: 204 | 205 | 206 | ``` 207 | AggregationBuilder aggregation = 208 | AggregationBuilders 209 | .nested("agg", "resellers") 210 | .subAggregation( 211 | AggregationBuilders 212 | .terms("name").field("resellers.name") 213 | .subAggregation( 214 | AggregationBuilders 215 | .reverseNested("reseller_to_product") 216 | ) 217 | ); 218 | ``` 219 | 220 | ##### 使用聚合请求 221 | 222 | 223 | ``` 224 | import org.elasticsearch.search.aggregations.bucket.nested.Nested; 225 | import org.elasticsearch.search.aggregations.bucket.nested.ReverseNested; 226 | import org.elasticsearch.search.aggregations.bucket.terms.Terms; 227 | 228 | ``` 229 | 230 | ``` 231 | // sr is here your SearchResponse object 232 | Nested agg = sr.getAggregations().get("agg"); 233 | Terms name = agg.getAggregations().get("name"); 234 | for (Terms.Bucket bucket : name.getBuckets()) { 235 | ReverseNested resellerToProduct = bucket.getAggregations().get("reseller_to_product"); 236 | resellerToProduct.getDocCount(); // Doc count 237 | } 238 | ``` 239 | 240 | 241 | #### Children Aggregation 242 | 243 | > 一种特殊的单桶聚合,可以将父文档类型上的桶聚合到子文档上。 244 | 245 | 246 | 247 | 248 | 下面是如何使用 `Java API` 使用[Children 
Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-children-aggregation.html) 249 | 250 | ##### 准备聚合请求 251 | 252 | 下面是如何创建聚合请求的是示例: 253 | 254 | 255 | ``` 256 | AggregationBuilder aggregation = 257 | AggregationBuilders 258 | .children("agg", "reseller"); // agg 是聚合名,reseller 是子类型 259 | ``` 260 | 261 | ##### 使用聚合请求 262 | 263 | 264 | ``` 265 | import org.elasticsearch.join.aggregations.Children; 266 | 267 | 268 | ``` 269 | 270 | ``` 271 | // sr is here your SearchResponse object 272 | Children agg = sr.getAggregations().get("agg"); 273 | agg.getDocCount(); // Doc count 274 | ``` 275 | 276 | #### Terms Aggregation 词元聚合 277 | 278 | > 基于某个field,该 field 内的每一个【唯一词元】为一个桶,并计算每个桶内文档个数。默认返回顺序是按照文档个数多少排序。当不返回所有 buckets 的情况,文档个数可能不准确。 279 | 280 | 281 | 282 | 283 | 下面是如何使用 `Java API` 使用[Terms Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-terms-aggregation.html) 284 | 285 | ##### 准备聚合请求 286 | 287 | 下面是如何创建聚合请求的是示例: 288 | 289 | 290 | ``` 291 | AggregationBuilders 292 | .terms("genders") 293 | .field("gender"); 294 | ``` 295 | 296 | ##### 使用聚合请求 297 | 298 | 299 | ``` 300 | import org.elasticsearch.search.aggregations.bucket.terms.Terms; 301 | 302 | ``` 303 | 304 | ``` 305 | // sr is here your SearchResponse object 306 | Terms genders = sr.getAggregations().get("genders"); 307 | 308 | // For each entry 309 | for (Terms.Bucket entry : genders.getBuckets()) { 310 | entry.getKey(); // Term 311 | entry.getDocCount(); // Doc count 312 | } 313 | ``` 314 | 315 | #### Order 排序 316 | 317 | 通过 `doc_count` 按升序排列: 318 | 319 | 320 | ``` 321 | AggregationBuilders 322 | .terms("genders") 323 | .field("gender") 324 | .order(Terms.Order.count(true)) 325 | ``` 326 | 327 | 按字词顺序,升序排列: 328 | 329 | ``` 330 | AggregationBuilders 331 | .terms("genders") 332 | .field("gender") 333 | .order(Terms.Order.term(true)) 334 | ``` 335 | 336 | 按metrics 子聚合排列(标示为聚合名) 337 | 338 | 339 | ``` 340 | AggregationBuilders 
341 | .terms("genders") 342 | .field("gender") 343 | .order(Terms.Order.aggregation("avg_height", false)) 344 | .subAggregation( 345 | AggregationBuilders.avg("avg_height").field("height") 346 | ) 347 | ``` 348 | 349 | 350 | 351 | #### Significant Terms Aggregation 352 | 353 | > 返回集合中感兴趣的或者不常见的词条的聚合 354 | 355 | 356 | 357 | 下面是如何使用 `Java API` 使用[Significant Terms Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-significantterms-aggregation.html) 358 | 359 | ##### 准备聚合请求 360 | 361 | 下面是如何创建聚合请求的是示例: 362 | 363 | 364 | ``` 365 | AggregationBuilder aggregation = 366 | AggregationBuilders 367 | .significantTerms("significant_countries") 368 | .field("address.country"); 369 | 370 | // Let say you search for men only 371 | SearchResponse sr = client.prepareSearch() 372 | .setQuery(QueryBuilders.termQuery("gender", "male")) 373 | .addAggregation(aggregation) 374 | .get(); 375 | ``` 376 | 377 | ##### 使用聚合请求 378 | 379 | 380 | ``` 381 | import org.elasticsearch.search.aggregations.bucket.significant.SignificantTerms; 382 | 383 | ``` 384 | 385 | ``` 386 | // sr is here your SearchResponse object 387 | SignificantTerms agg = sr.getAggregations().get("significant_countries"); 388 | 389 | // For each entry 390 | for (SignificantTerms.Bucket entry : agg.getBuckets()) { 391 | entry.getKey(); // Term 392 | entry.getDocCount(); // Doc count 393 | } 394 | ``` 395 | 396 | 397 | #### Range Aggregation 范围聚合 398 | 399 | > 基于某个值(可以是 field 或 script),以【字段范围】来桶分聚合。范围聚合包括 from 值,不包括 to 值(区间前闭后开)。 400 | 401 | 402 | 下面是如何使用 `Java API` 使用[Range Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-range-aggregation.html) 403 | 404 | ##### 准备聚合请求 405 | 406 | 下面是如何创建聚合请求的是示例: 407 | 408 | 409 | ``` 410 | AggregationBuilder aggregation = 411 | AggregationBuilders 412 | .range("agg") 413 | .field("height") 414 | .addUnboundedTo(1.0f) // from -infinity to 1.0 (excluded) 415 | .addRange(1.0f, 1.5f) // from 
1.0 to 1.5 (excluded) 416 | .addUnboundedFrom(1.5f); // from 1.5 to +infinity 417 | ``` 418 | 419 | ##### 使用聚合请求 420 | 421 | 422 | ``` 423 | import org.elasticsearch.search.aggregations.bucket.range.Range; 424 | 425 | ``` 426 | 427 | ``` 428 | // sr is here your SearchResponse object 429 | Range agg = sr.getAggregations().get("agg"); 430 | 431 | // For each entry 432 | for (Range.Bucket entry : agg.getBuckets()) { 433 | String key = entry.getKeyAsString(); // Range as key 434 | Number from = (Number) entry.getFrom(); // Bucket from 435 | Number to = (Number) entry.getTo(); // Bucket to 436 | long docCount = entry.getDocCount(); // Doc count 437 | 438 | logger.info("key [{}], from [{}], to [{}], doc_count [{}]", key, from, to, docCount); 439 | } 440 | ``` 441 | 442 | 输出: 443 | 444 | 445 | ``` 446 | key [*-1.0], from [-Infinity], to [1.0], doc_count [9] 447 | key [1.0-1.5], from [1.0], to [1.5], doc_count [21] 448 | key [1.5-*], from [1.5], to [Infinity], doc_count [20] 449 | ``` 450 | 451 | #### Date Range Aggregation 日期范围聚合 452 | 453 | > 日期范围聚合——基于日期类型的值,以【日期范围】来桶分聚合。 454 | 455 | > 日期范围可以用各种 Date Math 表达式。 456 | 457 | > 同样的,包括 from 的值,不包括 to 的值。 458 | 459 | 460 | 461 | 462 | 下面是如何使用 `Java API` 使用[Date Range Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-daterange-aggregation.html) 463 | 464 | ##### 准备聚合请求 465 | 466 | 下面是如何创建聚合请求的是示例: 467 | 468 | 469 | ``` 470 | AggregationBuilder aggregation = 471 | AggregationBuilders 472 | .dateRange("agg") 473 | .field("dateOfBirth") 474 | .format("yyyy") 475 | .addUnboundedTo("1950") // from -infinity to 1950 (excluded) 476 | .addRange("1950", "1960") // from 1950 to 1960 (excluded) 477 | .addUnboundedFrom("1960"); // from 1960 to +infinity 478 | ``` 479 | 480 | ##### 使用聚合请求 481 | 482 | 483 | ``` 484 | import org.elasticsearch.search.aggregations.bucket.range.Range; 485 | ``` 486 | 487 | ``` 488 | // sr is here your SearchResponse object 489 | Range agg = 
sr.getAggregations().get("agg"); 490 | 491 | // For each entry 492 | for (Range.Bucket entry : agg.getBuckets()) { 493 | String key = entry.getKeyAsString(); // Date range as key 494 | DateTime fromAsDate = (DateTime) entry.getFrom(); // Date bucket from as a Date 495 | DateTime toAsDate = (DateTime) entry.getTo(); // Date bucket to as a Date 496 | long docCount = entry.getDocCount(); // Doc count 497 | 498 | logger.info("key [{}], from [{}], to [{}], doc_count [{}]", key, fromAsDate, toAsDate, docCount); 499 | } 500 | ``` 501 | 502 | 输出: 503 | 504 | 505 | ``` 506 | key [*-1950], from [null], to [1950-01-01T00:00:00.000Z], doc_count [8] 507 | key [1950-1960], from [1950-01-01T00:00:00.000Z], to [1960-01-01T00:00:00.000Z], doc_count [5] 508 | key [1960-*], from [1960-01-01T00:00:00.000Z], to [null], doc_count [37] 509 | ``` 510 | 511 | #### Ip Range Aggregation Ip范围聚合 512 | 513 | 514 | 下面是如何使用 `Java API` 使用[Ip Range Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-iprange-aggregation.html) 515 | 516 | ##### 准备聚合请求 517 | 518 | 下面是如何创建聚合请求的是示例: 519 | 520 | 521 | ``` 522 | AggregatorBuilder aggregation = 523 | AggregationBuilders 524 | .ipRange("agg") 525 | .field("ip") 526 | .addUnboundedTo("192.168.1.0") // from -infinity to 192.168.1.0 (excluded) 527 | .addRange("192.168.1.0", "192.168.2.0") // from 192.168.1.0 to 192.168.2.0 (excluded) 528 | .addUnboundedFrom("192.168.2.0"); // from 192.168.2.0 to +infinity 529 | 530 | ``` 531 | 532 | ##### 使用聚合请求 533 | 534 | 535 | ``` 536 | import org.elasticsearch.search.aggregations.bucket.range.Range; 537 | ``` 538 | 539 | ``` 540 | // sr is here your SearchResponse object 541 | Range agg = sr.getAggregations().get("agg"); 542 | 543 | // For each entry 544 | for (Range.Bucket entry : agg.getBuckets()) { 545 | String key = entry.getKeyAsString(); // Ip range as key 546 | String fromAsString = entry.getFromAsString(); // Ip bucket from as a String 547 | String toAsString = 
entry.getToAsString(); // Ip bucket to as a String 548 | long docCount = entry.getDocCount(); // Doc count 549 | 550 | logger.info("key [{}], from [{}], to [{}], doc_count [{}]", key, fromAsString, toAsString, docCount); 551 | } 552 | ``` 553 | 554 | 大概输出: 555 | 556 | 557 | ``` 558 | key [*-192.168.1.0], from [null], to [192.168.1.0], doc_count [8] 559 | key [192.168.1.0-192.168.2.0], from [192.168.1.0], to [192.168.2.0], doc_count [5] 560 | key [192.168.2.0-*], from [192.168.2.0], to [null], doc_count [37] 561 | ``` 562 | 563 | 564 | #### Histogram Aggregation 直方图聚合 565 | 566 | > 基于文档中的某个【数值类型】字段,通过计算来动态地分桶。 567 | 568 | 下面是如何使用 `Java API` 使用[Histogram Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-histogram-aggregation.html) 569 | 570 | ##### 准备聚合请求 571 | 572 | 下面是如何创建聚合请求的示例: 573 | 574 | 575 | ``` 576 | AggregationBuilder aggregation = 577 | AggregationBuilders 578 | .histogram("agg") 579 | .field("height") 580 | .interval(1); 581 | ``` 582 | 583 | ##### 使用聚合请求 584 | 585 | 586 | ``` 587 | import org.elasticsearch.search.aggregations.bucket.histogram.Histogram; 588 | ``` 589 | 590 | ``` 591 | // sr is here your SearchResponse object 592 | Histogram agg = sr.getAggregations().get("agg"); 593 | 594 | // For each entry 595 | for (Histogram.Bucket entry : agg.getBuckets()) { 596 | Number key = (Number) entry.getKey(); // Key 597 | long docCount = entry.getDocCount(); // Doc count 598 | 599 | logger.info("key [{}], doc_count [{}]", key, docCount); 600 | } 601 | 602 | ``` 603 | 604 | 605 | #### Date Histogram Aggregation 日期直方图聚合 606 | 607 | > 与直方图类似的多bucket聚合,但只能应用于日期值。 608 | 609 | 下面是如何使用 `Java API` 使用[Date Histogram Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-datehistogram-aggregation.html) 610 | 611 | ##### 准备聚合请求 612 | 613 | 下面是如何创建聚合请求的示例: 614 | 615 | 616 | ``` 617 | AggregationBuilder aggregation = 618 | AggregationBuilders 619 |
.dateHistogram("agg") 620 | .field("dateOfBirth") 621 | .dateHistogramInterval(DateHistogramInterval.YEAR); 622 | 623 | ``` 624 | 625 | 或者把时间间隔设置为10天 626 | 627 | 628 | ``` 629 | AggregationBuilder aggregation = 630 | AggregationBuilders 631 | .dateHistogram("agg") 632 | .field("dateOfBirth") 633 | .dateHistogramInterval(DateHistogramInterval.days(10)); 634 | ``` 635 | 636 | 637 | ##### 使用聚合请求 638 | 639 | 640 | ``` 641 | import org.elasticsearch.search.aggregations.bucket.histogram.Histogram; 642 | ``` 643 | 644 | ``` 645 | // sr is here your SearchResponse object 646 | Histogram agg = sr.getAggregations().get("agg"); 647 | 648 | // For each entry 649 | for (Histogram.Bucket entry : agg.getBuckets()) { 650 | DateTime key = (DateTime) entry.getKey(); // Key 651 | String keyAsString = entry.getKeyAsString(); // Key as String 652 | long docCount = entry.getDocCount(); // Doc count 653 | 654 | logger.info("key [{}], date [{}], doc_count [{}]", keyAsString, key.getYear(), docCount); 655 | } 656 | 657 | 658 | ``` 659 | 660 | 输出: 661 | 662 | ``` 663 | key [1942-01-01T00:00:00.000Z], date [1942], doc_count [1] 664 | key [1945-01-01T00:00:00.000Z], date [1945], doc_count [1] 665 | key [1946-01-01T00:00:00.000Z], date [1946], doc_count [1] 666 | ... 
667 | key [2005-01-01T00:00:00.000Z], date [2005], doc_count [1] 668 | key [2007-01-01T00:00:00.000Z], date [2007], doc_count [2] 669 | key [2008-01-01T00:00:00.000Z], date [2008], doc_count [3] 670 | ``` 671 | 672 | 673 | #### Geo Distance Aggregation 地理距离聚合 674 | 675 | > 在geo_point字段上工作的多bucket聚合和概念上的工作非常类似于range(范围)聚合.用户可以定义原点的点和距离范围的集合。聚合计算每个文档值与原点的距离,并根据范围确定其所属的bucket(桶)(如果文档和原点之间的距离落在bucket(桶)的距离范围内,则文档属于bucket(桶) ) 676 | 677 | 678 | 下面是如何使用 `Java API` 使用[ Geo Distance Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-geodistance-aggregation.html) 679 | 680 | ##### 准备聚合请求 681 | 682 | 下面是如何创建聚合请求的是示例: 683 | 684 | 685 | ``` 686 | AggregationBuilder aggregation = 687 | AggregationBuilders 688 | .geoDistance("agg", new GeoPoint(48.84237171118314,2.33320027692004)) 689 | .field("address.location") 690 | .unit(DistanceUnit.KILOMETERS) 691 | .addUnboundedTo(3.0) 692 | .addRange(3.0, 10.0) 693 | .addRange(10.0, 500.0); 694 | 695 | ``` 696 | 697 | 698 | ##### 使用聚合请求 699 | 700 | 701 | ``` 702 | import org.elasticsearch.search.aggregations.bucket.range.Range; 703 | 704 | ``` 705 | 706 | ``` 707 | // sr is here your SearchResponse object 708 | Range agg = sr.getAggregations().get("agg"); 709 | 710 | // For each entry 711 | for (Range.Bucket entry : agg.getBuckets()) { 712 | String key = entry.getKeyAsString(); // key as String 713 | Number from = (Number) entry.getFrom(); // bucket from value 714 | Number to = (Number) entry.getTo(); // bucket to value 715 | long docCount = entry.getDocCount(); // Doc count 716 | 717 | logger.info("key [{}], from [{}], to [{}], doc_count [{}]", key, from, to, docCount); 718 | } 719 | 720 | ``` 721 | 722 | 输出: 723 | 724 | ``` 725 | key [*-3.0], from [0.0], to [3.0], doc_count [161] 726 | key [3.0-10.0], from [3.0], to [10.0], doc_count [460] 727 | key [10.0-500.0], from [10.0], to [500.0], doc_count [4925] 728 | ``` 729 | 730 | #### Geo Hash Grid Aggregation GeoHash网格聚合 731 | 732 | > 
在geo_point字段和组上工作的多bucket聚合将指向网格中表示单元格的bucket。生成的网格可以是稀疏的,并且只包含具有匹配数据的单元格。每个单元格使用具有用户可定义精度的 [geohash](http://en.wikipedia.org/wiki/Geohash) 进行标记。 733 | 734 | 735 | 736 | 下面是如何使用 `Java API` 使用[Geo Hash Grid Aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-bucket-geohashgrid-aggregation.html) 737 | 738 | ##### 准备聚合请求 739 | 740 | 下面是如何创建聚合请求的是示例: 741 | 742 | 743 | ``` 744 | AggregationBuilder aggregation = 745 | AggregationBuilders 746 | .geohashGrid("agg") 747 | .field("address.location") 748 | .precision(4); 749 | 750 | ``` 751 | 752 | 753 | ##### 使用聚合请求 754 | 755 | 756 | ``` 757 | import org.elasticsearch.search.aggregations.bucket.geogrid.GeoHashGrid; 758 | ``` 759 | 760 | ``` 761 | // sr is here your SearchResponse object 762 | GeoHashGrid agg = sr.getAggregations().get("agg"); 763 | 764 | // For each entry 765 | for (GeoHashGrid.Bucket entry : agg.getBuckets()) { 766 | String keyAsString = entry.getKeyAsString(); // key as String 767 | GeoPoint key = (GeoPoint) entry.getKey(); // key as geo point 768 | long docCount = entry.getDocCount(); // Doc count 769 | 770 | logger.info("key [{}], point {}, doc_count [{}]", keyAsString, key, docCount); 771 | } 772 | 773 | ``` 774 | 775 | 输出: 776 | 777 | ``` 778 | key [gbqu], point [47.197265625, -1.58203125], doc_count [1282] 779 | key [gbvn], point [50.361328125, -4.04296875], doc_count [1248] 780 | key [u1j0], point [50.712890625, 7.20703125], doc_count [1156] 781 | key [u0j2], point [45.087890625, 7.55859375], doc_count [1138] 782 | ... 
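```

上面每个桶的 key(例如 `gbqu`)就是一个 geohash 前缀,`.precision(4)` 决定了前缀长度(前缀越长,网格单元越小)。为了直观理解这些 key 是怎么算出来的,下面用纯 Java 实现标准的 geohash 编码算法。注意这只是示意实现,并非 Elasticsearch 内部代码;坐标取自维基百科上的经典示例,与本文的样本数据无关:

```java
public class GeohashDemo {

    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // 标准 geohash 编码:交替对经度、纬度区间做二分,每 5 个比特输出一个 base32 字符
    public static String encode(double lat, double lon, int precision) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean evenBit = true; // 偶数位编码经度,奇数位编码纬度
        int bit = 0;
        int ch = 0;
        while (hash.length() < precision) {
            if (evenBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { ch = (ch << 1) | 1; lonMin = mid; } else { ch = ch << 1; lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { ch = (ch << 1) | 1; latMin = mid; } else { ch = ch << 1; latMax = mid; }
            }
            evenBit = !evenBit;
            if (++bit == 5) { // 攒够 5 个比特,输出一个字符
                hash.append(BASE32.charAt(ch));
                bit = 0;
                ch = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 11)); // u4pruydqqvj
        System.out.println(encode(57.64911, 10.40744, 4));  // 对应 .precision(4) 的 4 位前缀
    }
}
```

由于编码是逐位细分的,同一坐标在更高精度下的 geohash 总是低精度结果的扩展,这也是聚合可以按前缀把邻近的点归入同一网格单元的原因。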
783 | -------------------------------------------------------------------------------- /aggregations/metrics-aggregations.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ### Metrics aggregations 4 | > 度量(metrics)类聚合基于以某种方式从被聚合文档中提取出来的值进行计算。这些值通常取自文档的特定字段(使用字段数据),也可以通过脚本生成。 5 | 6 | > 数值型度量聚合是输出具体数值的一种度量聚合。一些聚合输出单个数值(例如 avg),被称作 single-value numeric metrics aggregation;另一些产生多个数值(例如 stats),称作 multi-value numeric metrics aggregation。这两种聚合只有作为桶聚合的子聚合时才会表现出差异(有些桶聚合可以基于每个桶的数值度量对返回的桶进行排序)。 7 | 8 | #### Min Aggregation 最小值聚合 9 | 10 | 下面是如何用 Java API 使用[最小值聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-min-aggregation.html) 11 | 12 | ##### 准备聚合请求 13 | 14 | 下面是如何创建聚合请求的示例: 15 | 16 | 17 | ``` 18 | MinAggregationBuilder aggregation = 19 | AggregationBuilders 20 | .min("agg") 21 | .field("height"); 22 | ``` 23 | 24 | ##### 使用聚合请求 25 | 26 | 27 | ``` 28 | import org.elasticsearch.search.aggregations.metrics.min.Min; 29 | 30 | ``` 31 | 32 | 33 | ``` 34 | // sr is here your SearchResponse object 35 | Min agg = sr.getAggregations().get("agg"); 36 | double value = agg.getValue(); 37 | ``` 38 | 39 | #### Max Aggregation 最大值聚合 40 | 下面是如何用 Java API 使用[最大值聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-max-aggregation.html) 41 | 42 | ##### 准备聚合请求 43 | 44 | 下面是如何创建聚合请求的示例: 45 | 46 | 47 | ``` 48 | MaxAggregationBuilder aggregation = 49 | AggregationBuilders 50 | .max("agg") 51 | .field("height"); 52 | ``` 53 | 54 | ##### 使用聚合请求 55 | 56 | 57 | ``` 58 | import org.elasticsearch.search.aggregations.metrics.max.Max; 59 | 60 | ``` 61 | 62 | 63 | ``` 64 | // sr is here your SearchResponse object 65 | Max agg = sr.getAggregations().get("agg"); 66 | double value = agg.getValue(); 67 | ``` 68 | 69 | 70 | #### Sum Aggregation 求和聚合 71 | 72 | 下面是如何用 Java API 
使用[求和聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-sum-aggregation.html) 73 | 74 | ##### 准备聚合请求 75 | 76 | 下面是如何创建聚合请求的是示例: 77 | 78 | 79 | ``` 80 | SumAggregationBuilder aggregation = 81 | AggregationBuilders 82 | .sum("agg") 83 | .field("height"); 84 | ``` 85 | 86 | ##### 使用聚合请求 87 | 88 | 89 | ``` 90 | import org.elasticsearch.search.aggregations.metrics.sum.Sum; 91 | 92 | ``` 93 | 94 | 95 | ``` 96 | // sr is here your SearchResponse object 97 | Sum agg = sr.getAggregations().get("agg"); 98 | double value = agg.getValue(); 99 | ``` 100 | 101 | 102 | #### Avg Aggregation 平均值聚合 103 | 104 | 下面是如何用Java API 使用[平均值聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-avg-aggregation.html) 105 | 106 | ##### 准备聚合请求 107 | 108 | 下面是如何创建聚合请求的是示例: 109 | 110 | 111 | ``` 112 | AvgAggregationBuilder aggregation = 113 | AggregationBuilders 114 | .avg("agg") 115 | .field("height"); 116 | ``` 117 | 118 | ##### 使用聚合请求 119 | 120 | 121 | ``` 122 | import org.elasticsearch.search.aggregations.metrics.avg.Avg; 123 | 124 | ``` 125 | 126 | 127 | ``` 128 | // sr is here your SearchResponse object 129 | Avg agg = sr.getAggregations().get("agg"); 130 | double value = agg.getValue(); 131 | ``` 132 | 133 | #### Stats Aggregation 统计聚合 134 | 135 | > 统计聚合——基于文档的某个值,计算出一些统计信息(min、max、sum、count、avg), 用于计算的值可以是特定的数值型字段,也可以通过脚本计算而来。 136 | 137 | 下面是如何用Java API 使用[统计聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-stats-aggregation.html) 138 | 139 | ##### 准备聚合请求 140 | 141 | 下面是如何创建聚合请求的是示例: 142 | 143 | 144 | ``` 145 | StatsAggregationBuilder aggregation = 146 | AggregationBuilders 147 | .stats("agg") 148 | .field("height"); 149 | ``` 150 | 151 | ##### 使用聚合请求 152 | 153 | 154 | ``` 155 | import org.elasticsearch.search.aggregations.metrics.stats.Stats; 156 | 157 | ``` 158 | 159 | ``` 160 | // sr is here your SearchResponse object 161 | Stats agg = sr.getAggregations().get("agg"); 
162 | double min = agg.getMin();
163 | double max = agg.getMax();
164 | double avg = agg.getAvg();
165 | double sum = agg.getSum();
166 | long count = agg.getCount();
167 | ```
168 | 
169 | 
170 | #### Extended Stats Aggregation 扩展统计聚合
171 | 
172 | > 扩展统计聚合——基于文档的某个值,计算出一些统计信息(比普通的stats聚合多了sum_of_squares、variance、std_deviation、std_deviation_bounds),用于计算的值可以是特定的数值型字段,也可以通过脚本计算而来。
173 | 
174 | 
175 | 
176 | 下面是如何用Java API 使用[扩展统计聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-extendedstats-aggregation.html)
177 | 
178 | ##### 准备聚合请求
179 | 
180 | 下面是如何创建聚合请求的示例:
181 | 
182 | 
183 | ```
184 | ExtendedStatsAggregationBuilder aggregation =
185 |         AggregationBuilders
186 |                 .extendedStats("agg")
187 |                 .field("height");
188 | ```
189 | 
190 | ##### 使用聚合请求
191 | 
192 | 
193 | ```
194 | import org.elasticsearch.search.aggregations.metrics.stats.extended.ExtendedStats;
195 | 
196 | 
197 | ```
198 | 
199 | ```
200 | // sr is here your SearchResponse object
201 | ExtendedStats agg = sr.getAggregations().get("agg");
202 | double min = agg.getMin();
203 | double max = agg.getMax();
204 | double avg = agg.getAvg();
205 | double sum = agg.getSum();
206 | long count = agg.getCount();
207 | double stdDeviation = agg.getStdDeviation();
208 | double sumOfSquares = agg.getSumOfSquares();
209 | double variance = agg.getVariance();
210 | ```
211 | 
212 | 
213 | #### Value Count Aggregation 值计数聚合
214 | 
215 | > 值计数聚合——计算聚合文档中某个值的个数,用于计算的值可以是特定的数值型字段,也可以通过脚本计算而来。
216 | 
217 | > 该聚合一般与其它 single-value 聚合联合使用,比如在计算一个字段的平均值的时候,可能还会关注这个平均值是由多少个值计算而来。
218 | 
219 | 
220 | 下面是如何用Java API 使用[值计数聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-valuecount-aggregation.html)
221 | 
222 | ##### 准备聚合请求
223 | 
224 | 下面是如何创建聚合请求的示例:
225 | 
226 | 
227 | ```
228 | ValueCountAggregationBuilder aggregation =
229 |         AggregationBuilders
230 |                 .count("agg")
231 |                 .field("height");
232 | ```
233 | 
234 | ##### 使用聚合请求
235 | 
236 | 
237 | ```
238 | import
org.elasticsearch.search.aggregations.metrics.valuecount.ValueCount;
239 | 
240 | 
241 | ```
242 | 
243 | ```
244 | // sr is here your SearchResponse object
245 | ValueCount agg = sr.getAggregations().get("agg");
246 | long value = agg.getValue();
247 | ```
248 | 
249 | 
250 | #### Percentile Aggregation 百分位聚合
251 | 
252 | > 百分位聚合——基于聚合文档中某个数值类型的值,求这些值中的一个或者多个百分位数,用于计算的值可以是特定的数值型字段,也可以通过脚本计算而来。
253 | 
254 | 
255 | 下面是如何用Java API 使用[百分位聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-percentile-aggregation.html)
256 | 
257 | ##### 准备聚合请求
258 | 
259 | 下面是如何创建聚合请求的示例:
260 | 
261 | 
262 | ```
263 | PercentilesAggregationBuilder aggregation =
264 |         AggregationBuilders
265 |                 .percentiles("agg")
266 |                 .field("height");
267 | ```
268 | 
269 | 可以提供百分位数,而不是使用默认值:
270 | 
271 | ```
272 | PercentilesAggregationBuilder aggregation =
273 |         AggregationBuilders
274 |                 .percentiles("agg")
275 |                 .field("height")
276 |                 .percentiles(1.0, 5.0, 10.0, 20.0, 30.0, 75.0, 95.0, 99.0);
277 | ```
278 | 
279 | ##### 使用聚合请求
280 | 
281 | 
282 | ```
283 | import org.elasticsearch.search.aggregations.metrics.percentiles.Percentile;
284 | import org.elasticsearch.search.aggregations.metrics.percentiles.Percentiles;
285 | 
286 | ```
287 | 
288 | ```
289 | // sr is here your SearchResponse object
290 | Percentiles agg = sr.getAggregations().get("agg");
291 | // For each entry
292 | for (Percentile entry : agg) {
293 |     double percent = entry.getPercent();    // Percent
294 |     double value = entry.getValue();        // Value
295 | 
296 |     logger.info("percent [{}], value [{}]", percent, value);
297 | }
298 | ```
299 | 
300 | 大概输出:
301 | 
302 | 
303 | ```
304 | percent [1.0], value [0.814338896154595]
305 | percent [5.0], value [0.8761912455821302]
306 | percent [25.0], value [1.173346540141847]
307 | percent [50.0], value [1.5432023318692198]
308 | percent [75.0], value [1.923915462033674]
309 | percent [95.0], value [2.2273644908535335]
310 | percent [99.0], value [2.284989339108279]
311 | ```
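
> 百分位数(percentile)与下一节的百分位等级(percentile rank)容易混淆。为了直观理解,下面给出一段与 Elasticsearch API 无关的纯 Java 最小示意(类名与方法名均为演示而虚构):`percentile` 用最近秩法取第 p 百分位对应的值,`percentileRank` 统计小于等于某个值的观测值所占百分比。注意 Elasticsearch 实际使用 TDigest 等近似算法,返回的是近似值,与这种精确计算会略有差异:

```java
import java.util.Arrays;

// 纯 Java 演示百分位数与百分位等级的含义(精确计算,仅用于理解概念)
public class PercentileDemo {

    // 返回 values 中第 percent 百分位对应的值(最近秩法)
    public static double percentile(double[] values, double percent) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percent / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // 返回小于等于 value 的观测值所占的百分比,即 value 的百分位等级
    public static double percentileRank(double[] values, double value) {
        long count = Arrays.stream(values).filter(v -> v <= value).count();
        return 100.0 * count / values.length;
    }

    public static void main(String[] args) {
        double[] heights = {1.24, 1.5, 1.8, 1.91, 2.22};
        System.out.println(percentile(heights, 50.0));      // 1.8,即中位数
        System.out.println(percentileRank(heights, 1.91));  // 80.0,即 1.91 处于第80百分位等级
    }
}
```

例如对上面的身高数据,`percentileRank(heights, 1.91)` 返回 `80.0`,含义与下一节 percentile ranks 输出中 `value [1.91]` 对应的 percent 是同一类指标。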
312 | 
313 | #### Percentile Ranks Aggregation 百分位等级聚合
314 | 
315 | > 一个multi-value指标聚合,它通过从聚合文档中提取数值来计算一个或多个百分位等级。这些值可以从特定数值字段中提取,也可以由提供的脚本生成。
316 | 
317 | > 注意:百分位等级聚合与百分位聚合一样,结果通常是近似值(percentiles are (usually approximate)),关于近似算法以及压缩(Compression)对内存使用的影响,请参考官方文档。百分位等级展示的是小于等于某一值的观测值所占的百分比。例如,假如某一值大于等于95%的观测值,则称它处于第95百分位等级。假设你的数据由网页加载时间组成,你可能有一个服务协议:95%的页面需要在15ms内加载完成,99%的页面需要在30ms内加载完成。
318 | 
319 | 
320 | 下面是如何用Java API 使用[百分位等级聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-percentile-rank-aggregation.html)
321 | 
322 | ##### 准备聚合请求
323 | 
324 | 下面是如何创建聚合请求的示例,与百分位聚合不同,这里需要通过 `values` 指定待查询的值:
325 | 
326 | 
327 | ```
328 | PercentileRanksAggregationBuilder aggregation =
329 |         AggregationBuilders
330 |                 .percentileRanks("agg")
331 |                 .field("height")
332 |                 .values(1.24, 1.91, 2.22);
333 | ```
334 | 
335 | 
336 | ##### 使用聚合请求
337 | 
338 | 
339 | ```
340 | import org.elasticsearch.search.aggregations.metrics.percentiles.Percentile;
341 | import org.elasticsearch.search.aggregations.metrics.percentiles.PercentileRanks;
342 | 
343 | ```
344 | 
345 | ```
346 | // sr is here your SearchResponse object
347 | PercentileRanks agg = sr.getAggregations().get("agg");
348 | // For each entry
349 | for (Percentile entry : agg) {
350 |     double percent = entry.getPercent();    // Percent
351 |     double value = entry.getValue();        // Value
352 | 
353 |     logger.info("percent [{}], value [{}]", percent, value);
354 | }
355 | ```
356 | 
357 | 大概输出:
358 | 
359 | 
360 | ```
361 | percent [29.664353095090945], value [1.24]
362 | percent [73.9335313461868], value [1.91]
363 | percent [94.40095147327283], value [2.22]
364 | ```
365 | 
366 | #### Cardinality Aggregation 基数聚合
367 | 
368 | > 基数聚合——基于文档的某个值,计算文档非重复的个数(去重计数)。这些值可以从特定数值字段中提取,也可以由提供的脚本生成。
369 | 
370 | 
371 | 下面是如何用Java API 使用[基数聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-cardinality-aggregation.html)
372 | 
373 | ##### 准备聚合请求
374 | 
375 | 下面是如何创建聚合请求的示例:
376 | 
377 | ```
378 | CardinalityAggregationBuilder aggregation =
379 |         AggregationBuilders
380 | 
.cardinality("agg")
381 |                 .field("tags");
382 | ```
383 | 
384 | 
385 | ##### 使用聚合请求
386 | 
387 | 
388 | ```
389 | import org.elasticsearch.search.aggregations.metrics.cardinality.Cardinality;
390 | ```
391 | 
392 | ```
393 | // sr is here your SearchResponse object
394 | Cardinality agg = sr.getAggregations().get("agg");
395 | long value = agg.getValue();
396 | ```
397 | 
398 | 
399 | #### Geo Bounds Aggregation 地理边界聚合
400 | 
401 | > 地理边界聚合——基于文档的某个字段(geo-point类型字段),计算出该字段所有地理坐标点的边界(左上角/右下角坐标点)。
402 | 
403 | 
404 | 下面是如何用Java API 使用[地理边界聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-geobounds-aggregation.html)
405 | 
406 | ##### 准备聚合请求
407 | 
408 | 下面是如何创建聚合请求的示例:
409 | 
410 | 
411 | ```
412 | GeoBoundsAggregationBuilder aggregation =
413 |         GeoBoundsAggregationBuilder
414 |                 .geoBounds("agg")
415 |                 .field("address.location")
416 |                 .wrapLongitude(true);
417 | ```
418 | 
419 | 
420 | ##### 使用聚合请求
421 | 
422 | 
423 | ```
424 | import org.elasticsearch.search.aggregations.metrics.geobounds.GeoBounds;
425 | 
426 | ```
427 | 
428 | ```
429 | // sr is here your SearchResponse object
430 | GeoBounds agg = sr.getAggregations().get("agg");
431 | GeoPoint bottomRight = agg.bottomRight();
432 | GeoPoint topLeft = agg.topLeft();
433 | logger.info("bottomRight {}, topLeft {}", bottomRight, topLeft);
434 | ```
435 | 
436 | 大概会输出:
437 | 
438 | 
439 | ```
440 | bottomRight [40.70500764381921, 13.952946866893775], topLeft [53.49603022435221, -4.190029308156676]
441 | 
442 | ```
443 | 
444 | 
445 | #### Top Hits Aggregation 最高匹配权值聚合
446 | 
447 | > 最高匹配权值聚合——跟踪聚合中相关性最高的文档。该聚合一般用做 sub-aggregation,以此来聚合每个桶中的最高匹配的文档。
448 | 
449 | 
450 | 
451 | 下面是如何用Java API 使用[最高匹配权值聚合](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations-metrics-top-hits-aggregation.html)
452 | 
453 | ##### 准备聚合请求
454 | 
455 | 下面是如何创建聚合请求的示例:
456 | 
457 | 
458 | ```
459 | AggregationBuilder aggregation =
460 |     AggregationBuilders
461 |         .terms("agg").field("gender")
462 |         .subAggregation(
463 | AggregationBuilders.topHits("top") 464 | ); 465 | ``` 466 | 467 | 大多数标准的搜索选项可以使用,比如:`from`, `size`, `sort`, `highlight`, `explain` … 468 | 469 | 470 | ``` 471 | AggregationBuilder aggregation = 472 | AggregationBuilders 473 | .terms("agg").field("gender") 474 | .subAggregation( 475 | AggregationBuilders.topHits("top") 476 | .explain(true) 477 | .size(1) 478 | .from(10) 479 | ); 480 | ``` 481 | 482 | 483 | ##### 使用聚合请求 484 | 485 | 486 | ``` 487 | import org.elasticsearch.search.aggregations.bucket.terms.Terms; 488 | import org.elasticsearch.search.aggregations.metrics.tophits.TopHits; 489 | 490 | ``` 491 | 492 | ``` 493 | // sr is here your SearchResponse object 494 | Terms agg = sr.getAggregations().get("agg"); 495 | 496 | // For each entry 497 | for (Terms.Bucket entry : agg.getBuckets()) { 498 | String key = entry.getKey(); // bucket key 499 | long docCount = entry.getDocCount(); // Doc count 500 | logger.info("key [{}], doc_count [{}]", key, docCount); 501 | 502 | // We ask for top_hits for each bucket 503 | TopHits topHits = entry.getAggregations().get("top"); 504 | for (SearchHit hit : topHits.getHits().getHits()) { 505 | logger.info(" -> id [{}], _source [{}]", hit.getId(), hit.getSourceAsString()); 506 | } 507 | } 508 | ``` 509 | 510 | 大概会输出: 511 | 512 | 513 | ``` 514 | key [male], doc_count [5107] 515 | -> id [AUnzSZze9k7PKXtq04x2], _source [{"gender":"male",...}] 516 | -> id [AUnzSZzj9k7PKXtq04x4], _source [{"gender":"male",...}] 517 | -> id [AUnzSZzl9k7PKXtq04x5], _source [{"gender":"male",...}] 518 | key [female], doc_count [4893] 519 | -> id [AUnzSZzM9k7PKXtq04xy], _source [{"gender":"female",...}] 520 | -> id [AUnzSZzp9k7PKXtq04x8], _source [{"gender":"female",...}] 521 | -> id [AUnzSZ0W9k7PKXtq04yS], _source [{"gender":"female",...}] 522 | 523 | ``` 524 | 525 | 526 | #### Scripted Metric Aggregation 527 | 528 | > 此功能为实验性的,不建议生产使用,所以也不做过多说明 有兴趣可以自己参考 
[官方文档](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/_metrics_aggregations.html#java-aggs-metrics-scripted-metric) 529 | 530 | -------------------------------------------------------------------------------- /aggregations/structuring-aggregations.md: -------------------------------------------------------------------------------- 1 | 2 | ### Structuring aggregations 3 | 4 | 结构化聚合 5 | 6 | 如 [ Aggregations guide](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-aggregations.html) 中所述,可以在聚合中定义子聚合。 7 | 8 | 聚合可能是 `Metrics` 聚合(一个跟踪和计算指标的聚合)或者 `Bucket` 聚合 (构建桶聚合) 9 | 10 | 11 | 例如,这里是一个3级聚合组成的聚合: 12 | 13 | - Terms aggregation (bucket) 14 | - Date Histogram aggregation (bucket) 15 | - Average aggregation (metric) 16 | 17 | 18 | ``` 19 | SearchResponse sr = node.client().prepareSearch() 20 | .addAggregation( 21 | AggregationBuilders.terms("by_country").field("country") 22 | .subAggregation(AggregationBuilders.dateHistogram("by_year") 23 | .field("dateOfBirth") 24 | .dateHistogramInterval(DateHistogramInterval.YEAR) 25 | .subAggregation(AggregationBuilders.avg("avg_children").field("children")) 26 | ) 27 | ) 28 | .execute().actionGet(); 29 | ``` 30 | -------------------------------------------------------------------------------- /assets/Cover_1600_2400.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/assets/Cover_1600_2400.jpg -------------------------------------------------------------------------------- /assets/Cover_400_600.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/assets/Cover_400_600.jpg -------------------------------------------------------------------------------- /assets/Cover_800_1200.jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/assets/Cover_800_1200.jpg -------------------------------------------------------------------------------- /assets/qrcode_for_gh_26893aa0a4ea_258.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/assets/qrcode_for_gh_26893aa0a4ea_258.jpg -------------------------------------------------------------------------------- /client.md: -------------------------------------------------------------------------------- 1 | ## Client 2 | 3 | Java 客户端连接 Elasticsearch 4 | 5 | 一个是`TransportClient`,一个是`NodeClient`,还有一个`XPackTransportClient` 6 | 7 | - TransportClient: 8 | 9 | 作为一个外部访问者,请求ES的集群,对于集群而言,它是一个外部因素。 10 | 11 | 12 | - NodeClient 13 | 14 | 作为ES集群的一个节点,它是ES中的一环,其他的节点对它是感知的。 15 | 16 | - XPackTransportClient: 17 | 18 | 服务安装了 `x-pack` 插件 19 | 20 | > 重要:客户端版本应该和服务端版本保持一致 21 | 22 | > TransportClient旨在被Java高级REST客户端取代,该客户端执行HTTP请求而不是序列化的Java请求。 在即将到来的Elasticsearch版本中将不赞成使用TransportClient,建议使用Java高级REST客户端。 23 | 24 | 25 | > 上面的警告比较尴尬,但是在 5xx版本中使用还是没有问题的,可能使用rest 客户端兼容性更好做一些。 26 | 27 | [Elasticsearch Java Rest API 手册](https://www.gitbook.com/book/quanke/elasticsearch-java-rest) -------------------------------------------------------------------------------- /client/transport-client.md: -------------------------------------------------------------------------------- 1 | 2 | ### Transport Client 3 | 4 | #### 不设置集群名称 5 | 6 | ``` 7 | // on startup 8 | 9 | //此步骤添加IP,至少一个,如果设置了"client.transport.sniff"= true 一个就够了,因为添加了自动嗅探配置 10 | TransportClient client = new PreBuiltTransportClient(Settings.EMPTY) 11 | .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("host1"), 9300)) 12 | .addTransportAddress(new 
InetSocketTransportAddress(InetAddress.getByName("host2"), 9300));
13 | 
14 | // on shutdown 关闭client
15 | 
16 | client.close();
17 | ```
18 | 
19 | #### 设置集群名称
20 | 
21 | ```
22 | Settings settings = Settings.builder()
23 |         .put("cluster.name", "myClusterName").build(); //设置ES集群的名称
24 | TransportClient client = new PreBuiltTransportClient(settings);
25 | //Add transport addresses and do something with the client...
26 | ```
27 | 
28 | #### 增加自动嗅探配置
29 | ```
30 | Settings settings = Settings.builder()
31 |         .put("client.transport.sniff", true).build(); //自动嗅探整个集群的状态,把集群中其他ES节点的ip添加到本地的客户端列表中
32 | TransportClient client = new PreBuiltTransportClient(settings);
33 | ```
34 | 
35 | #### 其他配置
36 | 
37 | ```
38 | client.transport.ignore_cluster_name    //设置 true ,忽略连接节点集群名验证
39 | client.transport.ping_timeout           //ping一个节点的响应时间 默认5秒
40 | client.transport.nodes_sampler_interval //sample/ping 节点的时间间隔,默认是5s
41 | ```
42 | > 对于ES Client,有两种形式,一个是TransportClient,一个是NodeClient。两个的区别为:
43 | TransportClient作为一个外部访问者,通过 TCP 传输协议(默认 9300 端口)去请求ES的集群,对于集群而言,它是一个外部因素。
44 | NodeClient顾名思义,是作为ES集群的一个节点,它是ES中的一环,其他的节点对它是感知的,不像TransportClient那样,ES集群对它一无所知。NodeClient通信的性能会更好,但是因为是ES的一环,所以它出问题,也会给ES集群带来问题。NodeClient可以设置不作为数据节点,在elasticsearch.yml中设置,这样就不会在此节点上分配数据。
45 | 
46 | 如果用ES的节点,大家仁者见仁智者见智,各按所需。
47 | 
48 | #### 实例
49 | 
50 | ```
51 | Settings esSettings = Settings.builder()
52 | 
53 |     .put("cluster.name", clusterName) //设置ES集群的名称
54 | 
55 |     .put("client.transport.sniff", true) //自动嗅探整个集群的状态,把集群中其他ES节点的ip添加到本地的客户端列表中
56 | 
57 |     .build();
58 | 
59 | client = new PreBuiltTransportClient(esSettings);//初始化client较老版本发生了变化,此方法有几个重载方法,初始化插件等。
60 | 
61 | //此步骤添加IP,至少一个,其实一个就够了,因为添加了自动嗅探配置
62 | 
63 | client.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(ip), esPort));
64 | 
65 | ``` -------------------------------------------------------------------------------- /client/xpack-transport-client.md: -------------------------------------------------------------------------------- 1 | 
2 | ### XPackTransportClient
3 | 如果 `Elasticsearch` 服务安装了 `x-pack` 插件,需要`PreBuiltXPackTransportClient`实例才能访问
4 | 
5 | 
6 | 使用Maven管理项目,把下面代码增加到`pom.xml`;
7 | 
8 | > 一定要修改默认仓库地址为https://artifacts.elastic.co/maven ,因为这个库没有上传到Maven中央仓库
9 | 
10 | ```
11 | <project ...>
12 | 
13 |    <repositories>
14 |       <!-- add the elasticsearch repo -->
15 |       <repository>
16 |          <id>elasticsearch-releases</id>
17 |          <url>https://artifacts.elastic.co/maven</url>
18 |          <releases>
19 |             <enabled>true</enabled>
20 |          </releases>
21 |          <snapshots>
22 |             <enabled>false</enabled>
23 |          </snapshots>
24 |       </repository>
25 |       ...
26 |    </repositories>
27 |    ...
28 | 
29 |    <dependencies>
30 |       <!-- add the x-pack jar as a dependency -->
31 |       <dependency>
32 |          <groupId>org.elasticsearch.client</groupId>
33 |          <artifactId>x-pack-transport</artifactId>
34 |          <version>5.6.3</version>
35 |       </dependency>
36 |       ...
37 |    </dependencies>
38 |    ...
39 | 
40 | </project>
41 | ```
42 | 
43 | #### 实例
44 | 
45 | 
46 | ```
47 | Settings settings = Settings.builder().put("cluster.name", "xxx")
48 |         .put("xpack.security.transport.ssl.enabled", false)
49 |         .put("xpack.security.user", "xxx:xxx")
50 |         .put("client.transport.sniff", true).build();
51 | try {
52 |     client = new PreBuiltXPackTransportClient(settings)
53 |             .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("xxx.xxx.xxx.xxx"), 9300))
54 |             .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("xxx.xxx.xxx.xxx"), 9300));
55 | } catch (UnknownHostException e) {
56 |     e.printStackTrace();
57 | }
58 | ```
59 | 
60 | 更多请浏览 dayu-spring-boot-starter 开源项目 -------------------------------------------------------------------------------- /dependency.md: -------------------------------------------------------------------------------- 1 | 
2 | ## 安装
3 | 
4 | ### Maven Repository
5 | 
6 | Elasticsearch Java API包已经上传到 [Maven Central](http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22elasticsearch%22)
7 | 
8 | 在`pom.xml`文件中增加:
9 | 
10 | > transport 版本号最好就是与Elasticsearch版本号一致。
11 | 
12 | ```
13 | <dependency>
14 |     <groupId>org.elasticsearch.client</groupId>
15 |     <artifactId>transport</artifactId>
16 |     <version>5.6.3</version>
17 | </dependency>
18 | ``` -------------------------------------------------------------------------------- /document-apis.md: -------------------------------------------------------------------------------- 1 | ## Document APIs
2 | 
3 | 本节介绍以下 CRUD API:
4 | 
5 | 
6 | 单文档 APIs
7 | 
8 | * [Index API](document-apis/index-api.md)
9 | * [Get
API](document-apis/get-api.md)
10 | * [Delete API](document-apis/delete-api.md)
11 | * [Delete By Query API](document-apis/delete-by-query-api.md)
12 | * [Update API](document-apis/update-api.md)
13 | 
14 | 多文档 APIs
15 | 
16 | * [Multi Get API](document-apis/multi-get-api.md)
17 | * [Bulk API](document-apis/bulk-api.md)
18 | * [Using Bulk Processor](document-apis/using-bulk-processor.md)
19 | 
20 | 
21 | 
22 | 
23 | > 注意:所有的单文档的CRUD API,index参数只能接受单一的索引库名称,或者是一个指向单一索引库的alias。 -------------------------------------------------------------------------------- /document-apis/bulk-api.md: -------------------------------------------------------------------------------- 1 | 
2 | ### Bulk API
3 | 
4 | Bulk API,批量插入:
5 | 
6 | ```
7 | import static org.elasticsearch.common.xcontent.XContentFactory.*;
8 | ```
9 | 
10 | ```
11 | BulkRequestBuilder bulkRequest = client.prepareBulk();
12 | 
13 | // either use client#prepare, or use Requests# to directly build index/delete requests
14 | bulkRequest.add(client.prepareIndex("twitter", "tweet", "1")
15 |         .setSource(jsonBuilder()
16 |                     .startObject()
17 |                         .field("user", "kimchy")
18 |                         .field("postDate", new Date())
19 |                         .field("message", "trying out Elasticsearch")
20 |                     .endObject()
21 |                   )
22 |         );
23 | 
24 | bulkRequest.add(client.prepareIndex("twitter", "tweet", "2")
25 |         .setSource(jsonBuilder()
26 |                     .startObject()
27 |                         .field("user", "kimchy")
28 |                         .field("postDate", new Date())
29 |                         .field("message", "another post")
30 |                     .endObject()
31 |                   )
32 |         );
33 | 
34 | BulkResponse bulkResponse = bulkRequest.get();
35 | if (bulkResponse.hasFailures()) {
36 |     // process failures by iterating through each bulk response item
37 |     //处理失败
38 | }
39 | ``` -------------------------------------------------------------------------------- /document-apis/delete-api.md: -------------------------------------------------------------------------------- 1 | 
2 | ### Delete API
3 | 
4 | 根据ID删除:
5 | 
6 | ```
7 | DeleteResponse response =
client.prepareDelete("twitter", "tweet", "1").get();
8 | 
9 | ```
10 | 
11 | 更多请查看 [delete API](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docs-delete.html) 文档
12 | 
13 | #### 配置线程
14 | 
15 | `operationThreaded` 设置为 `true` 是在不同的线程里执行此次操作
16 | 
17 | 下面的例子是`operationThreaded` 设置为 `false` :
18 | ```
19 | GetResponse response = client.prepareGet("twitter", "tweet", "1")
20 |         .setOperationThreaded(false)
21 |         .get();
22 | ```
23 | 
24 | ```
25 | DeleteResponse response = client.prepareDelete("twitter", "tweet", "1")
26 |         .setOperationThreaded(false)
27 |         .get();
28 | ```
29 | -------------------------------------------------------------------------------- /document-apis/delete-by-query-api.md: -------------------------------------------------------------------------------- 1 | 
2 | ### Delete By Query API
3 | 
4 | 通过查询条件删除
5 | 
6 | ```
7 | BulkByScrollResponse response =
8 |     DeleteByQueryAction.INSTANCE.newRequestBuilder(client)
9 |         .filter(QueryBuilders.matchQuery("gender", "male")) //查询条件
10 |         .source("persons")                                  //index(索引名)
11 |         .get();                                             //执行
12 | 
13 | long deleted = response.getDeleted();                       //删除文档的数量
14 | ```
15 | 
16 | 如果需要执行的时间比较长,可以使用异步的方式处理,结果在回调里面获取
17 | 
18 | 
19 | ```
20 | DeleteByQueryAction.INSTANCE.newRequestBuilder(client)
21 |     .filter(QueryBuilders.matchQuery("gender", "male"))     //查询
22 |     .source("persons")                                      //index(索引名)
23 |     .execute(new ActionListener<BulkByScrollResponse>() {   //回调监听
24 |         @Override
25 |         public void onResponse(BulkByScrollResponse response) {
26 |             long deleted = response.getDeleted();           //删除文档的数量
27 |         }
28 |         @Override
29 |         public void onFailure(Exception e) {
30 |             // Handle the exception
31 |         }
32 |     });
33 | ```
34 | -------------------------------------------------------------------------------- /document-apis/get-api.md: -------------------------------------------------------------------------------- 1 | 
2 | ### Get API
3 | 
4 | 根据id查看文档:
5 | 
6 | ```
7 | GetResponse response = client.prepareGet("twitter", "tweet", "1").get();
8 | 
9 | ```
10 | 
11 | 更多请查看 [rest
get API](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docs-get.html) 文档 12 | 13 | #### 配置线程 14 | 15 | `operationThreaded` 设置为 `true` 是在不同的线程里执行此次操作 16 | 17 | 下面的例子是`operationThreaded` 设置为 `false` : 18 | ``` 19 | GetResponse response = client.prepareGet("twitter", "tweet", "1") 20 | .setOperationThreaded(false) 21 | .get(); 22 | ``` -------------------------------------------------------------------------------- /document-apis/index-api.md: -------------------------------------------------------------------------------- 1 | ### Index API 2 | Index API 允许我们存储一个JSON格式的文档,使数据可以被搜索。文档通过index、type、id唯一确定。我们可以自己提供一个id,或者也使用Index API 为我们自动生成一个。 3 | 4 | 这里有几种不同的方式来产生JSON格式的文档(document): 5 | 6 | - 手动方式,使用原生的byte[]或者String 7 | - 使用Map方式,会自动转换成与之等价的JSON 8 | - 使用第三方库来序列化beans,如Jackson 9 | - 使用内置的帮助类 XContentFactory.jsonBuilder() 10 | 11 | #### 手动方式 12 | 13 | [数据格式](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping-date-format.html) 14 | ``` 15 | String json = "{" + 16 | "\"user\":\"kimchy\"," + 17 | "\"postDate\":\"2013-01-30\"," + 18 | "\"message\":\"trying out Elasticsearch\"" + 19 | "}"; 20 | ``` 21 | ##### 实例 22 | 23 | ``` 24 | /** 25 | * 手动生成JSON 26 | */ 27 | @Test 28 | public void CreateJSON(){ 29 | 30 | String json = "{" + 31 | "\"user\":\"fendo\"," + 32 | "\"postDate\":\"2013-01-30\"," + 33 | "\"message\":\"Hell word\"" + 34 | "}"; 35 | 36 | IndexResponse response = client.prepareIndex("fendo", "fendodate") 37 | .setSource(json) 38 | .get(); 39 | System.out.println(response.getResult()); 40 | 41 | } 42 | ``` 43 | 44 | #### Map方式 45 | Map是key:value数据类型,可以代表json结构. 
46 | 
47 | ```
48 | Map<String, Object> json = new HashMap<String, Object>();
49 | json.put("user","kimchy");
50 | json.put("postDate",new Date());
51 | json.put("message","trying out Elasticsearch");
52 | ```
53 | ##### 实例
54 | 
55 | ```
56 | /**
57 |  * 使用集合
58 |  */
59 | @Test
60 | public void CreateList(){
61 | 
62 |     Map<String, Object> json = new HashMap<String, Object>();
63 |     json.put("user","kimchy");
64 |     json.put("postDate","2013-01-30");
65 |     json.put("message","trying out Elasticsearch");
66 | 
67 |     IndexResponse response = client.prepareIndex("fendo", "fendodate")
68 |             .setSource(json)
69 |             .get();
70 |     System.out.println(response.getResult());
71 | 
72 | }
73 | ```
74 | 
75 | #### 序列化方式
76 | ElasticSearch已经使用了jackson,可以直接使用它把javabean转为json.
77 | 
78 | ```
79 | import com.fasterxml.jackson.databind.*;
80 | 
81 | // instance a json mapper
82 | ObjectMapper mapper = new ObjectMapper(); // create once, reuse
83 | 
84 | // generate json
85 | byte[] json = mapper.writeValueAsBytes(yourbeaninstance);
86 | ```
87 | ##### 实例
88 | 
89 | ```
90 | /**
91 |  * 使用JACKSON序列化
92 |  * @throws Exception
93 |  */
94 | @Test
95 | public void CreateJACKSON() throws Exception{
96 | 
97 |     CsdnBlog csdn=new CsdnBlog();
98 |     csdn.setAuthor("fendo");
99 |     csdn.setContent("这是JAVA书籍");
100 |     csdn.setTag("C");
101 |     csdn.setView("100");
102 |     csdn.setTitile("编程");
103 |     csdn.setDate(new Date().toString());
104 | 
105 |     // instance a json mapper
106 |     ObjectMapper mapper = new ObjectMapper(); // create once, reuse
107 | 
108 |     // generate json
109 |     byte[] json = mapper.writeValueAsBytes(csdn);
110 | 
111 |     IndexResponse response = client.prepareIndex("fendo", "fendodate")
112 |             .setSource(json)
113 |             .get();
114 |     System.out.println(response.getResult());
115 | }
116 | ```
117 | 
118 | #### XContentBuilder帮助类方式
119 | ElasticSearch提供了一个内置的帮助类XContentBuilder来产生JSON文档。另外,索引完成后还可以从返回的 IndexResponse 中获取如下信息:
120 | 
121 | ```
122 | // Index name
123 | String _index = response.getIndex();
124 | // Type name
125 | String _type = response.getType();
126 | // Document ID (generated or not)
127 | String _id =
response.getId(); 128 | // Version (if it's the first time you index this document, you will get: 1) 129 | long _version = response.getVersion(); 130 | // status has stored current instance statement. 131 | RestStatus status = response.status(); 132 | ``` 133 | 134 | ##### 实例 135 | 136 | ``` 137 | /** 138 | * 使用ElasticSearch 帮助类 139 | * @throws IOException 140 | */ 141 | @Test 142 | public void CreateXContentBuilder() throws IOException{ 143 | 144 | XContentBuilder builder = XContentFactory.jsonBuilder() 145 | .startObject() 146 | .field("user", "ccse") 147 | .field("postDate", new Date()) 148 | .field("message", "this is Elasticsearch") 149 | .endObject(); 150 | 151 | IndexResponse response = client.prepareIndex("fendo", "fendodata").setSource(builder).get(); 152 | System.out.println("创建成功!"); 153 | 154 | 155 | } 156 | ``` 157 | 158 | #### 综合实例 159 | ``` 160 | 161 | import java.io.IOException; 162 | import java.net.InetAddress; 163 | import java.net.UnknownHostException; 164 | import java.util.Date; 165 | import java.util.HashMap; 166 | import java.util.Map; 167 | 168 | import org.elasticsearch.action.index.IndexResponse; 169 | import org.elasticsearch.client.transport.TransportClient; 170 | import org.elasticsearch.common.settings.Settings; 171 | import org.elasticsearch.common.transport.InetSocketTransportAddress; 172 | import org.elasticsearch.common.xcontent.XContentBuilder; 173 | import org.elasticsearch.common.xcontent.XContentFactory; 174 | import org.elasticsearch.transport.client.PreBuiltTransportClient; 175 | import org.junit.Before; 176 | import org.junit.Test; 177 | 178 | import com.fasterxml.jackson.core.JsonProcessingException; 179 | import com.fasterxml.jackson.databind.ObjectMapper; 180 | 181 | public class CreateIndex { 182 | 183 | private TransportClient client; 184 | 185 | @Before 186 | public void getClient() throws Exception{ 187 | //设置集群名称 188 | Settings settings = Settings.builder().put("cluster.name", "my-application").build();// 集群名 189 | 
//创建client
190 |         client = new PreBuiltTransportClient(settings)
191 |                 .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
192 |     }
193 | 
194 |     /**
195 |      * 手动生成JSON
196 |      */
197 |     @Test
198 |     public void CreateJSON(){
199 | 
200 |         String json = "{" +
201 |                 "\"user\":\"fendo\"," +
202 |                 "\"postDate\":\"2013-01-30\"," +
203 |                 "\"message\":\"Hell word\"" +
204 |             "}";
205 | 
206 |         IndexResponse response = client.prepareIndex("fendo", "fendodate")
207 |                 .setSource(json)
208 |                 .get();
209 |         System.out.println(response.getResult());
210 | 
211 |     }
212 | 
213 | 
214 |     /**
215 |      * 使用集合
216 |      */
217 |     @Test
218 |     public void CreateList(){
219 | 
220 |         Map<String, Object> json = new HashMap<String, Object>();
221 |         json.put("user","kimchy");
222 |         json.put("postDate","2013-01-30");
223 |         json.put("message","trying out Elasticsearch");
224 | 
225 |         IndexResponse response = client.prepareIndex("fendo", "fendodate")
226 |                 .setSource(json)
227 |                 .get();
228 |         System.out.println(response.getResult());
229 | 
230 |     }
231 | 
232 |     /**
233 |      * 使用JACKSON序列化
234 |      * @throws Exception
235 |      */
236 |     @Test
237 |     public void CreateJACKSON() throws Exception{
238 | 
239 |         CsdnBlog csdn=new CsdnBlog();
240 |         csdn.setAuthor("fendo");
241 |         csdn.setContent("这是JAVA书籍");
242 |         csdn.setTag("C");
243 |         csdn.setView("100");
244 |         csdn.setTitile("编程");
245 |         csdn.setDate(new Date().toString());
246 | 
247 |         // instance a json mapper
248 |         ObjectMapper mapper = new ObjectMapper(); // create once, reuse
249 | 
250 |         // generate json
251 |         byte[] json = mapper.writeValueAsBytes(csdn);
252 | 
253 |         IndexResponse response = client.prepareIndex("fendo", "fendodate")
254 |                 .setSource(json)
255 |                 .get();
256 |         System.out.println(response.getResult());
257 |     }
258 | 
259 |     /**
260 |      * 使用ElasticSearch 帮助类
261 |      * @throws IOException
262 |      */
263 |     @Test
264 |     public void CreateXContentBuilder() throws IOException{
265 | 
266 |         XContentBuilder builder = XContentFactory.jsonBuilder()
267 |                 .startObject()
268 | 
.field("user", "ccse") 269 | .field("postDate", new Date()) 270 | .field("message", "this is Elasticsearch") 271 | .endObject(); 272 | 273 | IndexResponse response = client.prepareIndex("fendo", "fendodata").setSource(builder).get(); 274 | System.out.println("创建成功!"); 275 | 276 | 277 | } 278 | 279 | } 280 | ``` 281 | 282 | 283 | > 你还可以通过startArray(string)和endArray()方法添加数组。.field()方法可以接受多种对象类型。你可以给它传递数字、日期、甚至其他XContentBuilder对象。 284 | -------------------------------------------------------------------------------- /document-apis/multi-get-api.md: -------------------------------------------------------------------------------- 1 | 2 | ### Multi Get API 3 | 一次获取多个文档 4 | 5 | 6 | ``` 7 | MultiGetResponse multiGetItemResponses = client.prepareMultiGet() 8 | .add("twitter", "tweet", "1") //一个id的方式 9 | .add("twitter", "tweet", "2", "3", "4") //多个id的方式 10 | .add("another", "type", "foo") //可以从另外一个索引获取 11 | .get(); 12 | 13 | for (MultiGetItemResponse itemResponse : multiGetItemResponses) { //迭代返回值 14 | GetResponse response = itemResponse.getResponse(); 15 | if (response.isExists()) { //判断是否存在 16 | String json = response.getSourceAsString(); //_source 字段 17 | } 18 | } 19 | ``` 20 | 更多请浏览REST [multi get](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docs-multi-get.html) 文档 21 | -------------------------------------------------------------------------------- /document-apis/update-api.md: -------------------------------------------------------------------------------- 1 | 2 | ### Update API 3 | 4 | 有两种方式更新索引: 5 | - 创建 `UpdateRequest`,通过client发送; 6 | - 使用 `prepareUpdate()` 方法; 7 | 8 | #### 使用UpdateRequest 9 | 10 | ``` 11 | UpdateRequest updateRequest = new UpdateRequest(); 12 | updateRequest.index("index"); 13 | updateRequest.type("type"); 14 | updateRequest.id("1"); 15 | updateRequest.doc(jsonBuilder() 16 | .startObject() 17 | .field("gender", "male") 18 | .endObject()); 19 | client.update(updateRequest).get(); 20 | ``` 21 | 22 | #### 使用 `prepareUpdate()` 方法 23 | 
24 | > 这里官方的示例有问题,new Script()参数错误,所以以下代码是我自己写的(2017/11/10)
25 | 
26 | ```
27 | client.prepareUpdate("ttl", "doc", "1")
28 |         .setScript(new Script(ScriptType.INLINE, Script.DEFAULT_SCRIPT_LANG, "ctx._source.gender = \"male\"", Collections.emptyMap())) //5.x 中 ScriptType 位于 org.elasticsearch.script 包;脚本也可以是本地文件存储的,如果使用文件存储的脚本,需要设置 ScriptType.FILE
29 |         .get();
30 | 
31 | client.prepareUpdate("ttl", "doc", "1")
32 |         .setDoc(jsonBuilder()   //合并到现有文档
33 |             .startObject()
34 |                 .field("gender", "male")
35 |             .endObject())
36 |         .get();
37 | ```
38 | 
39 | #### Update by script
40 | 
41 | 使用脚本更新文档
42 | 
43 | ```
44 | UpdateRequest updateRequest = new UpdateRequest("ttl", "doc", "1")
45 |         .script(new Script("ctx._source.gender = \"male\""));
46 | client.update(updateRequest).get();
47 | 
48 | ```
49 | 
50 | #### Update by merging documents
51 | 
52 | 合并文档
53 | 
54 | ```
55 | UpdateRequest updateRequest = new UpdateRequest("index", "type", "1")
56 |         .doc(jsonBuilder()
57 |             .startObject()
58 |                 .field("gender", "male")
59 |             .endObject());
60 | client.update(updateRequest).get();
61 | ```
62 | 
63 | 
64 | #### Upsert
65 | 更新插入,如果存在文档就更新,如果不存在就插入
66 | 
67 | 
68 | ```
69 | IndexRequest indexRequest = new IndexRequest("index", "type", "1")
70 |         .source(jsonBuilder()
71 |             .startObject()
72 |                 .field("name", "Joe Smith")
73 |                 .field("gender", "male")
74 |             .endObject());
75 | UpdateRequest updateRequest = new UpdateRequest("index", "type", "1")
76 |         .doc(jsonBuilder()
77 |             .startObject()
78 |                 .field("gender", "male")
79 |             .endObject())
80 |         .upsert(indexRequest); //如果不存在此文档 ,就增加 `indexRequest`
81 | client.update(updateRequest).get();
82 | ```
83 | 
84 | 如果 `index/type/1` 已经存在,执行后文档会合并为类似下面的内容:
85 | 
86 | 
87 | ```
88 | {
89 |     "name"  : "Joe Dalton",
90 |     "gender": "male"
91 | }
92 | ```
93 | 
94 | 如果不存在,则会插入 `indexRequest` 中的新文档:
95 | 
96 | 
97 | ```
98 | {
99 |     "name" : "Joe Smith",
100 |     "gender": "male"
101 | }
102 | ```
103 | -------------------------------------------------------------------------------- /document-apis/using-bulk-processor.md:
-------------------------------------------------------------------------------- 1 | 2 | ### 使用 Bulk Processor 3 | BulkProcessor 提供了一个简单的接口,在给定的大小数量上定时批量自动请求 4 | 5 | #### 创建`BulkProcessor`实例 6 | 7 | 首先创建`BulkProcessor`实例 8 | 9 | ``` 10 | import org.elasticsearch.action.bulk.BackoffPolicy; 11 | import org.elasticsearch.action.bulk.BulkProcessor; 12 | import org.elasticsearch.common.unit.ByteSizeUnit; 13 | import org.elasticsearch.common.unit.ByteSizeValue; 14 | import org.elasticsearch.common.unit.TimeValue; 15 | ``` 16 | 17 | ``` 18 | BulkProcessor bulkProcessor = BulkProcessor.builder( 19 | client, //增加elasticsearch客户端 20 | new BulkProcessor.Listener() { 21 | @Override 22 | public void beforeBulk(long executionId, 23 | BulkRequest request) { ... } //调用bulk之前执行 ,例如你可以通过request.numberOfActions()方法知道numberOfActions 24 | 25 | @Override 26 | public void afterBulk(long executionId, 27 | BulkRequest request, 28 | BulkResponse response) { ... } //调用bulk之后执行 ,例如你可以通过request.hasFailures()方法知道是否执行失败 29 | 30 | @Override 31 | public void afterBulk(long executionId, 32 | BulkRequest request, 33 | Throwable failure) { ... 
} //调用失败抛 Throwable 34 | }) 35 | .setBulkActions(10000) //每10000个请求执行一次bulk 36 | .setBulkSize(new ByteSizeValue(5, ByteSizeUnit.MB)) //每达到5MB执行一次bulk 37 | .setFlushInterval(TimeValue.timeValueSeconds(5)) //无论请求数量多少,每5秒钟刷新一次。 38 | .setConcurrentRequests(1) //设置并发请求的数量。值为0意味着只允许执行一个请求。值为1意味着允许执行1个并发请求。 39 | .setBackoffPolicy( 40 | BackoffPolicy.exponentialBackoff(TimeValue.timeValueMillis(100), 3)) //设置自定义重试机制:最初等待100毫秒,之后成倍增加,最多重试3次。当批量请求因计算资源不足而抛出 EsRejectedExecutionException 异常时触发重试,可以通过 BackoffPolicy.noBackoff() 方法关闭重试机制 41 | .build(); 42 | ``` 43 | #### BulkProcessor 默认设置 44 | - bulkActions 1000 45 | - bulkSize 5mb 46 | - 不设置flushInterval 47 | - concurrentRequests 为 1 ,异步执行 48 | - backoffPolicy 指数退避,初始等待50毫秒,最多重试8次 49 | 50 | #### 增加requests 51 | 然后增加`requests`到`BulkProcessor` 52 | ``` 53 | bulkProcessor.add(new IndexRequest("twitter", "tweet", "1").source(/* your doc here */)); 54 | bulkProcessor.add(new DeleteRequest("twitter", "tweet", "2")); 55 | ``` 56 | #### 关闭 Bulk Processor 57 | 当所有文档都处理完成,使用`awaitClose` 或 `close` 方法关闭`BulkProcessor`: 58 | 59 | 60 | ``` 61 | bulkProcessor.awaitClose(10, TimeUnit.MINUTES); 62 | 63 | ``` 64 | 或 65 | 66 | ``` 67 | bulkProcessor.close(); 68 | 69 | ``` 70 | 71 | #### 在测试中使用Bulk Processor 72 | 73 | 如果你在测试中使用`Bulk Processor`,可以将并发请求数设置为0,让flush操作以同步方式执行: 74 | ``` 75 | BulkProcessor bulkProcessor = BulkProcessor.builder(client, new BulkProcessor.Listener() { /* Listener methods */ }) 76 | .setBulkActions(10000) 77 | .setConcurrentRequests(0) 78 | .build(); 79 | 80 | // Add your requests 81 | bulkProcessor.add(/* Your requests */); 82 | 83 | // Flush any remaining requests 84 | bulkProcessor.flush(); 85 | 86 | // Or close the bulkProcessor if you don't need it anymore 87 | bulkProcessor.close(); 88 | 89 | // Refresh your indices 90 | client.admin().indices().prepareRefresh().get(); 91 | 92 | // Now you can start searching! 
93 | client.prepareSearch().get(); 94 | ``` -------------------------------------------------------------------------------- /indexed-scripts-api.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/indexed-scripts-api.md -------------------------------------------------------------------------------- /indexed-scripts-api/script-language.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/indexed-scripts-api/script-language.md -------------------------------------------------------------------------------- /java-api-administration.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/java-api-administration.md -------------------------------------------------------------------------------- /java-api-administration/cluster-administration.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/java-api-administration/cluster-administration.md -------------------------------------------------------------------------------- /java-api-administration/indices-administration.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/quanke/elasticsearch-java/296e1c18f4fec25e9917a7a6fba26a496e36d7c1/java-api-administration/indices-administration.md -------------------------------------------------------------------------------- /query-dsl.md: -------------------------------------------------------------------------------- 1 | 2 | ## Query DSL 3 | 4 | > Elasticsearch 提供了一个基于 JSON 的完整的查询 DSL 来定义查询。 
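例如,一个 term 查询用 JSON DSL 表达出来大致如下(此处的字段名与取值仅为示意):

```json
{
  "query": {
    "term": { "user": "kimchy" }
  }
}
```

下文介绍的 Java 查询构建器,最终生成的就是这样的 JSON 请求体。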
5 | 6 | Elasticsearch以类似于REST [Query DSL](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl.html) 的方式提供完整的 Java 查询 DSL。 查询构建器的工厂是 `QueryBuilders`。 一旦查询准备就绪,就可以使用[Search API](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-search.html) 。 7 | 8 | 要使用QueryBuilder,只需将它们导入到类中: 9 | 10 | ``` 11 | import static org.elasticsearch.index.query.QueryBuilders.*; 12 | 13 | ``` 14 | 15 | > 注意,可以使用`QueryBuilder`对象上的`toString()`方法打印生成的 JSON 查询。 16 | 17 | `QueryBuilder`可以传给任何接受查询的 API,如`count`和`search`。 18 | 19 | -------------------------------------------------------------------------------- /query-dsl/compound-queries.md: -------------------------------------------------------------------------------- 1 | 2 | ### Compound queries 3 | 4 | 复合查询用来包装其他复合或者叶子查询,一方面可综合其结果和分数,从而改变它的行为,另一方面可从查询切换到过滤器上下文。此类查询包含: 5 | 6 | - constant_score 查询 7 | 8 | 这是一个包装其他查询的查询,并且在过滤器上下文中执行。与此查询匹配的所有文档都返回相同的"常量" _score 。 9 | 10 | 查看[Constant Score Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-constant-score-query.html) 11 | 12 | 13 | ``` 14 | QueryBuilder qb = constantScoreQuery( 15 | termQuery("name","kimchy") //查询语句 16 | ) 17 | .boost(2.0f); //分数 18 | ``` 19 | 20 | 21 | - bool 查询 22 | 23 | 组合多个叶子查询或复合查询条件的默认查询类型,例如must, should, must_not, 以及 filter 条件。在 must 和 should 子句中,分数会相互结合(匹配的子句越多,分数越高),而 must_not 和 filter 子句在过滤器上下文中执行。 24 | 25 | 查看[Bool Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-bool-query.html) 26 | 27 | ``` 28 | QueryBuilder qb = boolQuery() 29 | .must(termQuery("content", "test1")) //must query 30 | .must(termQuery("content", "test4")) 31 | .mustNot(termQuery("content", "test2")) //must not query 32 | .should(termQuery("content", "test3")) // should query 33 | .filter(termQuery("content", "test5")); // 与一般查询作用一样,只不过不参与评分 34 | ``` 35 | 36 | - dis_max 查询 37 | 38 | 包装多个查询,返回与任意一个查询子句匹配的文档。与 bool 查询将所有匹配子句的分数相结合不同,dis_max 查询只使用最佳匹配子句的分数。 39 | 40 | 查看[Dis Max 
Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-dis-max-query.html) 41 | 42 | ``` 43 | QueryBuilder qb = disMaxQuery() 44 | .add(termQuery("name", "kimchy")) 45 | .add(termQuery("name", "elasticsearch")) 46 | .boost(1.2f) //boost factor 47 | .tieBreaker(0.7f); //tie breaker 48 | ``` 49 | 50 | - function_score 查询 51 | 52 | 使用函数修改主查询返回的分数,以考虑诸如流行度,新近度,距离或使用脚本实现的自定义算法等因素。 53 | 54 | 查看[Function Score Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-function-score-query.html) 55 | 56 | 57 | ``` 58 | import static org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders.*; 59 | 60 | ``` 61 | 62 | 63 | ``` 64 | FilterFunctionBuilder[] functions = { 65 | new FunctionScoreQueryBuilder.FilterFunctionBuilder( 66 | matchQuery("name", "kimchy"), //根据查询添加第一个function 67 | randomFunction("ABCDEF")), //根据给定的种子随机分数 68 | new FunctionScoreQueryBuilder.FilterFunctionBuilder( 69 | exponentialDecayFunction("age", 0L, 1L)) //根据年龄字段添加另一个function 70 | 71 | }; 72 | QueryBuilder qb = QueryBuilders.functionScoreQuery(functions); 73 | ``` 74 | 75 | 76 | - boosting 查询 77 | 78 | 返回匹配 positive 查询的文档,并且当减少文档的分数时其结果也匹配 negative 查询。 79 | 80 | 81 | 查看[Boosting Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-boosting-query.html) 82 | 83 | 84 | ``` 85 | QueryBuilder qb = boostingQuery( 86 | termQuery("name","kimchy"), 87 | termQuery("name","dadoonet")) 88 | .negativeBoost(0.2f); 89 | ``` 90 | 91 | - indices 查询 92 | 93 | 对指定的索引执行一个查询,对其他索引执行另一个查询。 94 | 95 | 查看[Indices Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-indices-query.html) 96 | 97 | 98 | > 在5.0.0中已弃用。用搜索 _index 字段来代替 99 | 100 | ``` 101 | // Using another query when no match for the main one 102 | QueryBuilder qb = indicesQuery( 103 | termQuery("tag", "wow"), 104 | "index1", "index2" 105 | ).noMatchQuery(termQuery("tag", "kow")); 106 | 107 | ``` 108 | 109 | 110 | ``` 111 | // Using all (match all) or none (match no 
documents) 112 | QueryBuilder qb = indicesQuery( 113 | termQuery("tag", "wow"), 114 | "index1", "index2" 115 | ).noMatchQuery("all"); 116 | ``` 117 | 118 | -------------------------------------------------------------------------------- /query-dsl/full-text-queries.md: -------------------------------------------------------------------------------- 1 | 2 | ### Full text queries 全文搜索 3 | 4 | 高级别的全文搜索通常用于在全文字段(例如:一封邮件的正文)上进行全文搜索。它们了解如何分析查询的字段,并在执行之前将每个字段的分析器(或搜索分析器)应用于查询字符串。 5 | 6 | 这样的查询有以下这些: 7 | 8 | - 匹配查询(match query) 9 | 10 | 用于执行全文查询的标准查询,包括模糊匹配和词组或邻近程度的查询 11 | 12 | 查看[ Match Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-match-query.html) 13 | 14 | 15 | ``` 16 | QueryBuilder qb = matchQuery( 17 | "name", //field 字段 18 | "kimchy elasticsearch" // text 19 | ); 20 | ``` 21 | 22 | - 多字段查询(multi_match query) 23 | 24 | 可以用来对多个字段的版本进行匹配查询 25 | 26 | 查看 [Multi Match Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-multi-match-query.html) 27 | 28 | 29 | ``` 30 | QueryBuilder qb = multiMatchQuery( 31 | "kimchy elasticsearch", //text 32 | "user", "message" //fields 多个字段 33 | ); 34 | ``` 35 | 36 | 37 | - 常用术语查询(common_terms query) 38 | 39 | 可以对一些比较专业的偏门词语进行的更加专业的查询 40 | 41 | 查看[Common Terms Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-common-terms-query.html) 42 | 43 | 44 | ``` 45 | QueryBuilder qb = commonTermsQuery("name", //field 字段 46 | "kimchy"); // value 47 | ``` 48 | 49 | 50 | - 查询语句查询(query_string query) 51 | 52 | 与lucene查询语句的语法结合的更加紧密的一种查询,允许你在一个查询语句中使用多个 特殊条件关键字(如:AND|OR|NOT )对多个字段进行查询,当然这种查询仅限`专家用户`去使用。 53 | 54 | 查看[Query String Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-query-string-query.html) 55 | 56 | 57 | ``` 58 | QueryBuilder qb = queryStringQuery("+kimchy -elasticsearch"); //text 59 | 60 | ``` 61 | 62 | 63 | - 简单查询语句(simple_query_string) 64 | 65 | 是一种适合直接暴露给用户的简单的且具有非常完善的查询语法的查询语句 66 | 67 | 查看[Simple Query String 
Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-simple-query-string-query.html) 68 | 69 | 70 | ``` 71 | QueryBuilder qb = simpleQueryStringQuery("+kimchy -elasticsearch"); //text 72 | ``` 73 | -------------------------------------------------------------------------------- /query-dsl/geo-queries.md: -------------------------------------------------------------------------------- 1 | 2 | ### Geo queries 地理位置查询 3 | 4 | Elasticsearch支持两种类型的地理数据:geo_point类型支持成对的纬度/经度,geo_shape类型支持点、线、圆、多边形、多个多边形等。 5 | 这一组查询包括: 6 | 7 | - geo_shape查询 8 | 9 | 查找与指定的地理形状相交、被其包含或与其不相交的文档。 10 | 11 | 查看[Geo Shape Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-geo-shape-query.html) 12 | 13 | > `geo_shape` 类型使用 [`Spatial4J`](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.locationtech.spatial4j%22%20AND%20a%3A%22spatial4j%22) 和 [`JTS`](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22com.vividsolutions%22%20AND%20a%3A%22jts%22) ,这两者都是可选的依赖项。 因此,必须将 [`Spatial4J`](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.locationtech.spatial4j%22%20AND%20a%3A%22spatial4j%22) 和 [`JTS`](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22com.vividsolutions%22%20AND%20a%3A%22jts%22) 添加到 `classpath` 中才能使用此类型: 14 | 15 | ``` 16 | <dependency> 17 |     <groupId>org.locationtech.spatial4j</groupId> 18 |     <artifactId>spatial4j</artifactId> 19 |     <version>0.6</version> 20 | </dependency> 21 | 22 | <dependency> 23 |     <groupId>com.vividsolutions</groupId> 24 |     <artifactId>jts</artifactId> 25 |     <version>1.13</version> 26 |     <exclusions> 27 |         <exclusion> 28 |             <groupId>xerces</groupId> 29 |             <artifactId>xercesImpl</artifactId> 30 |         </exclusion> 31 |     </exclusions> 32 | </dependency> 33 | ``` 34 | 35 | 36 | ``` 37 | // Import ShapeRelation and ShapeBuilder 38 | import org.elasticsearch.common.geo.ShapeRelation; 39 | import org.elasticsearch.common.geo.builders.ShapeBuilder; 40 | ``` 41 | 42 | 43 | ``` 44 | List<Coordinate> points = new ArrayList<>(); 45 | points.add(new Coordinate(0, 0)); 46 | points.add(new Coordinate(0, 10)); 47 | points.add(new Coordinate(10, 10)); 48 | points.add(new Coordinate(10, 0)); 49 | points.add(new Coordinate(0, 0)); 50 | 51 | QueryBuilder qb = geoShapeQuery( 52 | "pin.location", //field 53 | 
ShapeBuilders.newMultiPoint(points)) //shape 54 | .relation(ShapeRelation.WITHIN); //relation 可以是 ShapeRelation.CONTAINS, ShapeRelation.WITHIN, ShapeRelation.INTERSECTS 或 ShapeRelation.DISJOINT 55 | ``` 56 | 57 | 58 | 59 | 60 | ``` 61 | // Using pre-indexed shapes 62 | QueryBuilder qb = geoShapeQuery( 63 | "pin.location", //field 64 | "DEU", //The ID of the document containing the pre-indexed shape. 65 | "countries") //Index type where the pre-indexed shape is. 66 | .relation(ShapeRelation.WITHIN) //relation 67 | .indexedShapeIndex("shapes") //Name of the index where the pre-indexed shape is. Defaults to shapes. 68 | .indexedShapePath("location"); //The field specified as path containing the pre-indexed shape. Defaults to shape. 69 | ``` 70 | 71 | 72 | - geo_bounding_box 查询 73 | 74 | 查找地理点落在指定矩形范围内的文档。 75 | 76 | 查看[Geo Bounding Box Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-geo-bounding-box-query.html) 77 | 78 | 79 | ``` 80 | QueryBuilder qb = geoBoundingBoxQuery("pin.location") //field 81 | .setCorners(40.73, -74.1, //bounding box top left point 82 | 40.717, -73.99); //bounding box bottom right point 83 | ``` 84 | 85 | 86 | - geo_distance 查询 87 | 88 | 查找在一个中心点指定范围内的地理点文档。 89 | 90 | 查看[Geo Distance Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-geo-distance-query.html) 91 | 92 | 93 | ``` 94 | QueryBuilder qb = geoDistanceQuery("pin.location") //field 95 | .point(40, -70) //center point 96 | .distance(200, DistanceUnit.KILOMETERS); //distance from center point 97 | ``` 98 | 99 | 100 | - geo_polygon 查询 101 | 102 | 103 | 查找指定多边形内地理点的文档。 104 | 105 | 查看[Geo Polygon Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-geo-polygon-query.html) 106 | 107 | 108 | ``` 109 | List<GeoPoint> points = new ArrayList<>(); //add your polygon of points a document should fall within 110 | points.add(new GeoPoint(40, -70)); 111 | points.add(new GeoPoint(30, -80)); 112 | points.add(new GeoPoint(20, -90)); 
113 | 114 | QueryBuilder qb = 115 | geoPolygonQuery("pin.location", points); //initialise the query with field and points 116 | ``` 117 | -------------------------------------------------------------------------------- /query-dsl/joining-queries.md: -------------------------------------------------------------------------------- 1 | 2 | ### Joining queries 3 | 4 | 在像 ElasticSearch 这样的分布式系统中执行全 SQL 风格的连接查询代价昂贵,是不可行的。相应地,为了实现水平规模地扩展,ElasticSearch 提供了两种形式的 join。 5 | 6 | - nested query (嵌套查询) 7 | 8 | 文档中可能包含嵌套类型的字段,这些字段用来索引一些数组对象,每个对象都可以作为一条独立的文档被查询出来(用嵌套查询) 9 | 10 | 查看[Nested Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-nested-query.html) 11 | 12 | ``` 13 | QueryBuilder qb = nestedQuery( 14 | "obj1", //nested 嵌套文档的路径 15 | boolQuery() // 查询 查询中引用的任何字段都必须使用完整路径(fully qualified)。 16 | .must(matchQuery("obj1.name", "blue")) 17 | .must(rangeQuery("obj1.count").gt(5)), 18 | ScoreMode.Avg // score 模型 ScoreMode.Max, ScoreMode.Min, ScoreMode.Total, ScoreMode.Avg or ScoreMode.None 19 | ); 20 | ``` 21 | 22 | - has_child (有子查询) and has_parent (有父查询) queries 23 | 24 | 一类父子关系可以存在单个的索引的两个类型的文档之间。has_child 查询将返回其子文档能满足特定的查询的父文档,而 has_parent 则返回其父文档能满足特定查询的子文档 25 | 26 | 27 | #### Has Child Query 28 | 29 | 查看[Has Child Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-has-child-query.html) 30 | 31 | 使用 `has_child` 查询时,必须使用`PreBuiltTransportClient`而不是常规 `Client`,这个点很重要: 32 | 33 | 34 | ``` 35 | Settings settings = Settings.builder().put("cluster.name", "elasticsearch").build(); 36 | TransportClient client = new PreBuiltTransportClient(settings); 37 | client.addTransportAddress(new InetSocketTransportAddress(new InetSocketAddress(InetAddresses.forString("127.0.0.1"), 9300))); 38 | 39 | ``` 40 | 41 | 否则,`parent-join` 模块不会被加载,并且不能从transport client 使用`has_child`查询。 42 | 43 | 44 | ``` 45 | QueryBuilder qb = JoinQueryBuilders.hasChildQuery( 46 | "blog_tag", //要查询的子类型 47 | termQuery("tag","something"), //查询 48 | ScoreMode.Avg //score 模型 
ScoreMode.Avg, ScoreMode.Max, ScoreMode.Min, ScoreMode.None or ScoreMode.Total 49 | ); 50 | ``` 51 | 52 | 53 | #### Has Parent Query 54 | 55 | 查看[Has Parent](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-has-parent-query.html) 56 | 57 | 使用`has_parent`查询时,必须使用`PreBuiltTransportClient`而不是常规 `Client`,这个点很重要: 58 | 59 | 60 | ``` 61 | Settings settings = Settings.builder().put("cluster.name", "elasticsearch").build(); 62 | TransportClient client = new PreBuiltTransportClient(settings); 63 | client.addTransportAddress(new InetSocketTransportAddress(new InetSocketAddress(InetAddresses.forString("127.0.0.1"), 9300))); 64 | 65 | ``` 66 | 67 | 否则,`parent-join` 模块不会被加载,并且不能从transport client 使用`has_child`查询。 68 | 69 | 70 | ``` 71 | QueryBuilder qb = JoinQueryBuilders.hasParentQuery( 72 | "blog", //要查询的子类型 73 | termQuery("tag","something"), //查询 74 | false //是否从父hit的score 传给子 hit 75 | ); 76 | ``` 77 | 78 | 参考 term 查询中的[terms-lookup mechanism](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html#query-dsl-terms-lookup) ,它允许你在另一个文档的值中创建一个term 查询。 79 | -------------------------------------------------------------------------------- /query-dsl/match-all-query.md: -------------------------------------------------------------------------------- 1 | ### Match All Query 2 | 3 | > 最简单的查询,它匹配所有文档 4 | 5 | 查看 [Match All Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-match-all-query.html) 6 | 7 | 8 | ``` 9 | QueryBuilder qb = matchAllQuery(); 10 | ``` -------------------------------------------------------------------------------- /query-dsl/span-queries.md: -------------------------------------------------------------------------------- 1 | 2 | ### Span queries 3 | 4 | - span_term查询 5 | 6 | 等同于 [term query](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-term-level-queries.html#java-query-dsl-term-query) ,但与其他Span查询一起使用。 7 | 8 | 查看 [ Span Term 
Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-span-term-query.html) 9 | 10 | 11 | ``` 12 | QueryBuilder qb = spanTermQuery( 13 | "user", //field 14 | "kimchy" //value 15 | ); 16 | ``` 17 | 18 | 19 | - span_multi查询 20 | 21 | 22 | 包含term, range, prefix, wildcard, regexp 或者 fuzzy 查询。 23 | 24 | 查看[Span Multi Term Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-span-multi-term-query.html) 25 | 26 | ``` 27 | QueryBuilder qb = spanMultiTermQueryBuilder( 28 | prefixQuery("user", "ki") //可以是MultiTermQueryBuilder 的 扩展 比如:FuzzyQueryBuilder, PrefixQueryBuilder, RangeQueryBuilder, RegexpQueryBuilder , WildcardQueryBuilder。 29 | ); 30 | ``` 31 | 32 | - span_first查询 33 | 34 | 接受另一个Span查询,其匹配必须出现在字段的前N个位置。 35 | 36 | 查看[Span First Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-span-first-query.html) 37 | 38 | ``` 39 | QueryBuilder qb = spanFirstQuery( 40 | spanTermQuery("user", "kimchy"), //query 41 | 3 //max end position 42 | ); 43 | ``` 44 | 45 | - span_near查询 46 | 47 | 接受多个Span查询,其匹配必须在彼此的指定距离内,并且可能顺序相同。 48 | 49 | 查看[Span Near Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-span-near-query.html) 50 | 51 | ``` 52 | QueryBuilder qb = spanNearQuery( 53 | spanTermQuery("field","value1"), //span term queries 54 | 12) //slop factor: the maximum number of intervening unmatched positions 55 | .addClause(spanTermQuery("field","value2")) //span term queries 56 | .addClause(spanTermQuery("field","value3")) //span term queries 57 | .inOrder(false); //whether matches are required to be in-order 58 | ``` 59 | 60 | - span_or查询 61 | 62 | 组合多个Span查询 - 返回与任何指定查询匹配的文档。 63 | 64 | 查看[Span Or Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-span-or-query.html) 65 | 66 | 67 | ``` 68 | QueryBuilder qb = spanOrQuery( 69 | spanTermQuery("field","value1")) 70 | .addClause(spanTermQuery("field","value2")) 71 | .addClause(spanTermQuery("field","value3")); //span 
term queries 72 | ``` 73 | 74 | 75 | - span_not查询 76 | 77 | 包装另一个Span查询,并排除与该查询匹配的所有文档。 78 | 79 | 查看[Span Not Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-span-not-query.html) 80 | 81 | 82 | ``` 83 | QueryBuilder qb = spanNotQuery( 84 | spanTermQuery("field","value1"), //span query whose matches are filtered 85 | spanTermQuery("field","value2")); //span query whose matches must not overlap those returned 86 | 87 | ``` 88 | 89 | 90 | - span_containing 查询 91 | 92 | 接受Span查询的列表,但仅返回与第二个Spans查询匹配的Span。 93 | 94 | 查看[ Span Containing Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-span-containing-query.html) 95 | 96 | 97 | ``` 98 | QueryBuilder qb = spanContainingQuery( 99 | spanNearQuery(spanTermQuery("field1","bar"), 5) //big part 100 | .addClause(spanTermQuery("field1","baz")) 101 | .inOrder(true), 102 | spanTermQuery("field1","foo")); //little part 103 | ``` 104 | 105 | - span_within查询 106 | 107 | 只要其 span 位于由其他Span查询列表返回的范围内,就会返回单个Span查询的结果, 108 | 109 | 查看[Span Within Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-span-within-query.html) 110 | 111 | ``` 112 | QueryBuilder qb = spanWithinQuery( 113 | spanNearQuery(spanTermQuery("field1", "bar"), 5) //big part 114 | .addClause(spanTermQuery("field1", "baz")) 115 | .inOrder(true), 116 | spanTermQuery("field1", "foo")); //little part 117 | ``` 118 | 119 | -------------------------------------------------------------------------------- /query-dsl/specialized-queries.md: -------------------------------------------------------------------------------- 1 | 2 | ### Specialized queries 3 | 4 | - more_like_this query(相似度查询) 5 | 6 | 这个查询能检索到与指定文本、文档或者文档集合相似的文档。 7 | 8 | 查看[More Like This Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-mlt-query.html) 9 | 10 | ``` 11 | String[] fields = {"name.first", "name.last"}; //fields 12 | String[] texts = {"text like this one"}; //text 13 | Item[] items = null; 14 | 15 | 
QueryBuilder qb = moreLikeThisQuery(fields, texts, items) 16 | .minTermFreq(1) //ignore threshold 17 | .maxQueryTerms(12); //max num of Terms in generated queries 18 | ``` 19 | 20 | - script query 21 | 22 | 该查询允许脚本充当过滤器。 另请参阅 [function_score query](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-compound-queries.html#java-query-dsl-function-score-query) 。 23 | 24 | 25 | 查看[Script Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-script-query.html) 26 | 27 | 28 | 29 | ``` 30 | QueryBuilder qb = scriptQuery( 31 | new Script("doc['num1'].value > 1") //inlined script 32 | ); 33 | ``` 34 | 35 | 如果已经在每个数据节点上存储了名为 `myscript.painless` 的脚本,内容如下: 36 | 37 | ``` 38 | doc['num1'].value > params.param1 39 | ``` 40 | 41 | 然后使用: 42 | 43 | ``` 44 | QueryBuilder qb = scriptQuery( 45 | new Script( 46 | ScriptType.FILE, //脚本类型 ScriptType.FILE, ScriptType.INLINE, ScriptType.INDEXED 47 | "painless", //Scripting engine 脚本引擎 48 | "myscript", //Script name 脚本名 49 | Collections.singletonMap("param1", 5)) //Parameters as a Map<String, Object> 50 | ); 51 | ``` 52 | 53 | - Percolate Query 54 | 55 | 查看[Percolate Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-percolate-query.html) 56 | 57 | 58 | 59 | ``` 60 | Settings settings = Settings.builder().put("cluster.name", "elasticsearch").build(); 61 | TransportClient client = new PreBuiltTransportClient(settings); 62 | client.addTransportAddress(new InetSocketTransportAddress(new InetSocketAddress(InetAddresses.forString("127.0.0.1"), 9300))); 63 | 64 | ``` 65 | 66 | 在可以使用`percolate`查询之前,应该添加`percolator`映射,并且应该对包含`percolator`查询的文档建立索引: 67 | 68 | ``` 69 | // create an index with a percolator field with the name 'query': 70 | client.admin().indices().prepareCreate("myIndexName") 71 | .addMapping("query", "query", "type=percolator") 72 | .addMapping("docs", "content", "type=text") 73 | .get(); 74 | 75 | //This is the query we're registering in the percolator 76 | QueryBuilder qb = 
termQuery("content", "amazing"); 77 | 78 | //Index the query = register it in the percolator 79 | client.prepareIndex("myIndexName", "query", "myDesignatedQueryName") 80 | .setSource(jsonBuilder() 81 | .startObject() 82 | .field("query", qb) // Register the query 83 | .endObject()) 84 | .setRefreshPolicy(RefreshPolicy.IMMEDIATE) // Needed when the query shall be available immediately 85 | .get(); 86 | ``` 87 | 88 | 在上面的索引操作中,注册的查询名为 `myDesignatedQueryName` 89 | 90 | 要用已注册的查询对文档进行检查(percolate),使用以下代码: 91 | 92 | 93 | ``` 94 | //Build a document to check against the percolator 95 | XContentBuilder docBuilder = XContentFactory.jsonBuilder().startObject(); 96 | docBuilder.field("content", "This is amazing!"); 97 | docBuilder.endObject(); //End of the JSON root object 98 | 99 | PercolateQueryBuilder percolateQuery = new PercolateQueryBuilder("query", "docs", docBuilder.bytes()); 100 | 101 | // Percolate, by executing the percolator query in the query dsl: 102 | SearchResponse response = client.prepareSearch("myIndexName") 103 | .setQuery(percolateQuery) 104 | .get(); 105 | //Iterate over the results 106 | for(SearchHit hit : response.getHits()) { 107 | // Percolator queries as hit 108 | } 109 | ``` 110 | -------------------------------------------------------------------------------- /query-dsl/term-level-queries.md: -------------------------------------------------------------------------------- 1 | 2 | ### Term level queries 术语查询 3 | 4 | 虽然全文查询会在执行之前分析查询字符串,但术语级别查询直接操作存储在倒排索引中的确切词条。 5 | 6 | 它们通常用于数字、日期和枚举等结构化数据,而不是全文字段;也允许你绕过分析过程,直接构造低级别的查询。 7 | 8 | 这样的查询有以下这些: 9 | 10 | - Term Query(项查询) 11 | 12 | 查询包含在指定字段中指定的确切值的文档。 13 | 14 | 查看[Term Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-term-query.html) 15 | 16 | ``` 17 | 18 | QueryBuilder qb = termQuery( 19 | "name", //field 20 | "kimchy" //text 21 | ); 22 | ``` 23 | 24 | - Terms Query(多项查询) 25 | 26 | 查询包含任意一个在指定字段中指定的多个确切值的文档。 27 | 28 | 查看[Terms 
Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-terms-query.html) 29 | 30 | ``` 31 | QueryBuilder qb = termsQuery("tags", //field 32 | "blue", "pill"); //values 33 | ``` 34 | 35 | - Range Query(范围查询) 36 | 37 | 查询指定字段包含指定范围内的值(日期,数字或字符串)的文档。 38 | 39 | 查看[Range Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-range-query.html) 40 | 41 | 方法: 42 | 43 | 1. gte() :范围查询将匹配字段值大于或等于此参数值的文档。 44 | 2. gt() :范围查询将匹配字段值大于此参数值的文档。 45 | 3. lte() :范围查询将匹配字段值小于或等于此参数值的文档。 46 | 4. lt() :范围查询将匹配字段值小于此参数值的文档。 47 | 5. from() 开始值 to() 结束值 这两个函数与includeLower()和includeUpper()函数配套使用。 48 | 6. includeLower(true) 表示 from() 查询将匹配字段值大于或等于此参数值的文档。 49 | 7. includeLower(false) 表示 from() 查询将匹配字段值大于此参数值的文档。 50 | 8. includeUpper(true) 表示 to() 查询将匹配字段值小于或等于此参数值的文档。 51 | 9. includeUpper(false) 表示 to() 查询将匹配字段值小于此参数值的文档。 52 | 53 | ``` 54 | QueryBuilder qb = rangeQuery("price") //field 55 | .from(5) //开始值,与includeLower()和includeUpper()函数配套使用 56 | .to(10) //结束值,与includeLower()和includeUpper()函数配套使用 57 | .includeLower(true) // true: 表示 from() 查询将匹配字段值大于或等于此参数值的文档; false:表示 from() 查询将匹配字段值大于此参数值的文档。 58 | .includeUpper(false); //true:表示 to() 查询将匹配字段值小于或等于此参数值的文档; false:表示 to() 查询将匹配字段值小于此参数值的文档。 59 | ``` 60 | 61 | #### 实例 62 | 63 | 64 | ``` 65 | // Query 66 | RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("age"); 67 | rangeQueryBuilder.from(19); 68 | rangeQueryBuilder.to(21); 69 | rangeQueryBuilder.includeLower(true); 70 | rangeQueryBuilder.includeUpper(true); 71 | //RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("age").gte(19).lte(21); 72 | // Search 73 | SearchRequestBuilder searchRequestBuilder = client.prepareSearch(index); 74 | searchRequestBuilder.setTypes(type); 75 | searchRequestBuilder.setQuery(rangeQueryBuilder); 76 | // 执行 77 | SearchResponse searchResponse = searchRequestBuilder.execute().actionGet(); 78 | ``` 79 | 80 | 上面代码中的查询语句与下面的是等价的: 81 | 82 | 83 | ``` 84 | QueryBuilder queryBuilder = 
QueryBuilders.rangeQuery("age").gte(19).lte(21); 85 | 86 | ``` 87 | 88 | - Exists Query(存在查询) 89 | 90 | 查询指定的字段包含任何非空值的文档,如果指定字段上至少存在一个no-null的值就会返回该文档。 91 | 92 | 查看[Exists Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-exists-query.html) 93 | 94 | 95 | ``` 96 | QueryBuilder qb = existsQuery("name"); 97 | 98 | ``` 99 | #### 实例 100 | 101 | 102 | ``` 103 | // Query 104 | ExistsQueryBuilder existsQueryBuilder = QueryBuilders.existsQuery("name"); 105 | // Search 106 | SearchRequestBuilder searchRequestBuilder = client.prepareSearch(index); 107 | searchRequestBuilder.setTypes(type); 108 | searchRequestBuilder.setQuery(existsQueryBuilder); 109 | // 执行 110 | SearchResponse searchResponse = searchRequestBuilder.get(); 111 | ``` 112 | 113 | 举例说明,下面的几个文档都会得到上面代码的匹配: 114 | 115 | 116 | ``` 117 | { "name": "yoona" } 118 | { "name": "" } 119 | { "name": "-" } 120 | { "name": ["yoona"] } 121 | { "name": ["yoona", null ] } 122 | ``` 123 | 124 | 第一个是字符串,是一个非null的值。 125 | 126 | 第二个是空字符串,也是非null。 127 | 128 | 第三个使用标准分析器的情况下尽管不会返回词条,但是原始字段值是非null的(Even though the standard analyzer would emit zero tokens, the original field is non-null)。 129 | 130 | 第五个中至少有一个是非null值。 131 | 132 | 下面几个文档不会得到上面代码的匹配: 133 | 134 | 135 | ``` 136 | { "name": null } 137 | { "name": [] } 138 | { "name": [null] } 139 | { "user": "bar" } 140 | ``` 141 | 142 | 第一个是null值。 143 | 144 | 第二个没有值。 145 | 146 | 第三个只有null值,至少需要一个非null值。 147 | 148 | 第四个与指定字段不匹配。 149 | - Prefix Query(前缀查询) 150 | 151 | 查找指定字段包含以指定的精确前缀开头的值的文档。 152 | 153 | 查看[Prefix Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-prefix-query.html) 154 | 155 | 156 | ``` 157 | QueryBuilder qb = prefixQuery( 158 | "brand", //field 159 | "heine" //prefix 160 | ); 161 | ``` 162 | 163 | 164 | - Wildcard Query(通配符查询) 165 | 166 | 查询指定字段包含与指定模式匹配的值的文档,其中该模式支持单字符通配符(?)和多字符通配符(*),和前缀查询一样,通配符查询指定字段是未分析的(not analyzed) 167 | 168 | 
可以使用星号代替0个或多个字符,使用问号代替一个字符。星号表示匹配的数量不受限制,而后者的匹配字符数则受到限制。这个技巧主要用于英文搜索中,如输入““computer*”,就可以找到“computer、computers、computerised、computerized”等单词,而输入“comp?ter”,则只能找到“computer、compater、competer”等单词。注意的是通配符查询不太注重性能,在可能时尽量避免,特别是要避免前缀通配符(以通配符开始的词条)。 169 | 170 | 171 | ``` 172 | QueryBuilder qb = wildcardQuery("user", "k?mc*"); 173 | 174 | ``` 175 | 176 | 177 | #### 实例 178 | 179 | ``` 180 | // Query 181 | WildcardQueryBuilder wildcardQueryBuilder = QueryBuilders.wildcardQuery("country", "西*牙"); 182 | // Search 183 | SearchRequestBuilder searchRequestBuilder = client.prepareSearch(index); 184 | searchRequestBuilder.setTypes(type); 185 | searchRequestBuilder.setQuery(wildcardQueryBuilder); 186 | // 执行 187 | SearchResponse searchResponse = searchRequestBuilder.get(); 188 | ``` 189 | 190 | 191 | 查看[Wildcard Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-wildcard-query.html) 192 | 193 | - Regexp Query(正则表达式查询) 194 | 195 | 查询指定的字段包含与指定的正则表达式匹配的值的文档。 196 | 197 | 和前缀查询一样,正则表达式查询指定字段是未分析的(not analyzed)。正则表达式查询的性能取决于所选的正则表达式。如果我们的正则表达式匹配许多词条,查询将很慢。一般规则是,正则表达式匹配的词条数越高,查询越慢。 198 | 199 | 200 | 查看[Regexp Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-regexp-query.html) 201 | 202 | 203 | 204 | ``` 205 | QueryBuilder qb = regexpQuery( 206 | "name.first", //field 207 | "s.*y"); //regexp 208 | ``` 209 | 210 | #### 实例 211 | 212 | 213 | ``` 214 | // Query 215 | RegexpQueryBuilder regexpQueryBuilder = QueryBuilders.regexpQuery("country", "(西班|葡萄)牙"); 216 | 217 | // Search 218 | SearchRequestBuilder searchRequestBuilder = client.prepareSearch(index); 219 | searchRequestBuilder.setTypes(type); 220 | searchRequestBuilder.setQuery(regexpQueryBuilder); 221 | // 执行 222 | SearchResponse searchResponse = searchRequestBuilder.get(); 223 | ``` 224 | 225 | 226 | - Fuzzy Query(模糊查询) 227 | 228 | 查询指定字段包含与指定术语模糊相似的术语的文档。模糊性测量为1或2的 Levenshtein。 229 | 230 | 
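fuzzy 查询所依据的 Levenshtein 编辑距离,可以用下面这个与 Elasticsearch 无关的最简 Java 示意来理解(标准动态规划实现,类名与示例字符串均为本文假设,仅用于说明该度量本身):

```java
// 演示:Levenshtein 编辑距离,即把一个字符串变成另一个所需的最少单字符编辑(插入/删除/替换)次数
public class LevenshteinDemo {

    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        // 空串变换:只能逐个插入或删除
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1; // 字符相同则替换代价为0
                d[i][j] = Math.min(
                        Math.min(d[i - 1][j] + 1,      // 删除
                                 d[i][j - 1] + 1),     // 插入
                        d[i - 1][j - 1] + cost);       // 替换(或不变)
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // "kimzhy" 与 "kimchy" 只差一次替换,编辑距离为 1
        System.out.println(distance("kimzhy", "kimchy")); // 输出 1
    }
}
```

正因为 "kimzhy" 与 "kimchy" 的编辑距离为 1,落在 fuzzy 查询默认允许的 1~2 范围内,下文示例中搜索 "kimzhy" 才能匹配到包含 "kimchy" 的文档。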
For string fields, the fuzzy query matches documents based on the Levenshtein edit distance between the supplied query term and the indexed terms; for numeric and date fields, it matches using a +/- margin on the field value. This query is CPU-intensive, but it is useful when fuzzy matching is needed, for example to tolerate user misspellings. You can also append the character "~" to a search term to trigger a fuzzy query.


See [Fuzzy Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-fuzzy-query.html)


```
QueryBuilder qb = fuzzyQuery(
    "name",     //field
    "kimzhy"    //text
);
```


- Type Query

Matches documents of the specified type.

See [Type Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-type-query.html)


```
QueryBuilder qb = typeQuery("my_type");    //type
```




- Ids Query

Matches documents with the specified type(s) and IDs.

See [Ids Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-ids-query.html)


```
QueryBuilder qb = idsQuery("my_type", "type2")
    .addIds("1", "4", "100");

QueryBuilder qb = idsQuery()    // the type is optional and can be omitted
    .addIds("1", "4", "100");
```
-------------------------------------------------------------------------------- /search-api.md: --------------------------------------------------------------------------------

# Search API

The search API executes a search query and returns the matching results. It can search one or more indices / types, and the query can be expressed with the [query Java API](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-query-dsl.html). Here is an example:


```
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import static org.elasticsearch.index.query.QueryBuilders.*;
```

```
SearchResponse response = client.prepareSearch("index1", "index2")
.setTypes("type1", "type2")
.setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
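// DFS_QUERY_THEN_FETCH first collects distributed term and document
// frequencies across shards for more accurate scoring; the default is
// QUERY_THEN_FETCH (faster, scoring per shard), so this line is optional.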
.setQuery(QueryBuilders.termQuery("multi", "test"))                 // Query condition
.setPostFilter(QueryBuilders.rangeQuery("age").from(12).to(18))     // Filter
.setFrom(0).setSize(60).setExplain(true)
.get();
```

All parameters are optional; the simplest possible call is:


```
// MatchAll on the whole cluster with all default options
SearchResponse response = client.prepareSearch().get();
```

> Although the Java API also defines the `QUERY_AND_FETCH` and `DFS_QUERY_AND_FETCH` search types, these modes should be chosen by the system, not specified explicitly by users.

For more information, see the [REST search](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search.html) documentation.
-------------------------------------------------------------------------------- /search-api/multisearch-api.md: --------------------------------------------------------------------------------


### MultiSearch API

The multi search API executes several search requests within the same API call. Its endpoint is `_msearch`.

First, see the [MultiSearch API Query](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-multi-search.html) documentation.

```
SearchRequestBuilder srb1 = client
    .prepareSearch().setQuery(QueryBuilders.queryStringQuery("elasticsearch")).setSize(1);
SearchRequestBuilder srb2 = client
    .prepareSearch().setQuery(QueryBuilders.matchQuery("name", "kimchy")).setSize(1);

MultiSearchResponse sr = client.prepareMultiSearch()
        .add(srb1)
        .add(srb2)
        .get();

// You will get all individual responses from MultiSearchResponse#getResponses()
long nbHits = 0;
for (MultiSearchResponse.Item item : sr.getResponses()) {
    SearchResponse response = item.getResponse();
    nbHits += response.getHits().getTotalHits();
}
```

#### Example

- [MultiSearchAPI.java](https://gitee.com/quanke/elasticsearch-java-study/blob/master/src/test/java/name/quanke/es/study/search/MultiSearchAPI.java)

- [Complete examples for this handbook](https://gitee.com/quanke/elasticsearch-java-study)
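Note that the sub-requests of a multi search succeed or fail independently: if one of them fails (for example because it targets a missing index), its item carries the failure instead of a response, and `item.getResponse()` returns `null`. A defensive version of the loop above might therefore check each item first (a sketch; the error message text is illustrative):

```
long nbHits = 0;
for (MultiSearchResponse.Item item : sr.getResponses()) {
    if (item.isFailure()) {
        // A failed sub-search has no SearchResponse, only a failure message
        System.err.println("sub-search failed: " + item.getFailureMessage());
        continue;
    }
    nbHits += item.getResponse().getHits().getTotalHits();
}
```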
-------------------------------------------------------------------------------- /search-api/search-template.md: --------------------------------------------------------------------------------



### Search Template


First, see the [Search Template](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-template.html) documentation.

> The `/_search/template` endpoint lets you pre-render search requests with the `mustache` templating language, filling an existing template with template parameters, before the search request is executed.

Define your template parameters as a `Map<String, Object>`:

```
Map<String, Object> template_params = new HashMap<>();
template_params.put("param_gender", "male");
```

You can use stored search templates in `config/scripts`. For example, given a file named `config/scripts/template_gender.mustache` containing:

```
{
    "query" : {
        "match" : {
            "gender" : "{{param_gender}}"
        }
    }
}
```

Create the search template request:


```
SearchResponse sr = new SearchTemplateRequestBuilder(client)
        .setScript("template_gender")           //template name
        .setScriptType(ScriptType.FILE)         //template stored on disk in template_gender.mustache
        .setScriptParams(template_params)       //parameters
        .setRequest(new SearchRequest())        //set the execution context (ie: define the index name here)
        .get()
        .getResponse();
```


You can also store the `template` in the `cluster state`:

> The cluster state is cluster-wide information: it contains the metadata of every shard in the cluster (routing rules, locations, sizes, and so on) and is kept in sync on every node. See: [Why Elasticsearch application developers need to understand cluster state](https://segmentfault.com/a/1190000008812263)


```
client.admin().cluster().preparePutStoredScript()
        .setScriptLang("mustache")
        .setId("template_gender")
        .setSource(new BytesArray(
                "{\n" +
                "    \"query\" : {\n" +
                "        \"match\" : {\n" +
                "            \"gender\" : \"{{param_gender}}\"\n" +
                "        }\n" +
                "    }\n" +
                "}")).get();
```

Execute a stored template with `ScriptType.STORED`:


```
SearchResponse sr = new SearchTemplateRequestBuilder(client)
.setScript("template_gender")           //template name
.setScriptType(ScriptType.STORED)       //template stored in the cluster state
.setScriptParams(template_params)       //parameters
.setRequest(new SearchRequest())        //set the execution context (ie: define the index name here)
.get()                                  //execute and get the template response
.getResponse();
```

You can also execute inline templates:



```
sr = new SearchTemplateRequestBuilder(client)
        .setScript("{\n" +                      //the template source, passed inline
                "    \"query\" : {\n" +
                "        \"match\" : {\n" +
                "            \"gender\" : \"{{param_gender}}\"\n" +
                "        }\n" +
                "    }\n" +
                "}")
        .setScriptType(ScriptType.INLINE)       //the template is passed inline
        .setScriptParams(template_params)       //parameters
        .setRequest(new SearchRequest())        //set the execution context (ie: define the index name here)
        .get()                                  //execute and get the template response
        .getResponse();
```
-------------------------------------------------------------------------------- /search-api/terminate-after.md: --------------------------------------------------------------------------------


### Terminate After

Sets the maximum number of documents to collect. If it is set, check `isTerminatedEarly()` on the `SearchResponse` object to find out whether the search terminated early because that number was reached:


```
SearchResponse sr = client.prepareSearch(INDEX)
    .setTerminateAfter(1000)    //terminate early once this many documents have been collected
    .get();

if (sr.isTerminatedEarly()) {
    // We finished early
}
```
-------------------------------------------------------------------------------- /search-api/using-aggregations.md: --------------------------------------------------------------------------------


### Using Aggregations

The following code shows how to add two aggregations within a search:

> The aggregations framework helps provide aggregated data based on a search query. It is based on simple building blocks called aggregations, which compose complex summaries of the data in an ordered way.

> An aggregation can be seen as a unit of work that builds analytic information over a set of documents. The execution context defines what this document set is (for example, a top-level aggregation executes within the context of the search request's query/filters).

```
SearchResponse sr = client.prepareSearch()
    .setQuery(QueryBuilders.matchAllQuery())
    .addAggregation(
            AggregationBuilders.terms("agg1").field("field")
    )
    .addAggregation(
            AggregationBuilders.dateHistogram("agg2")
                    .field("birth")
.dateHistogramInterval(DateHistogramInterval.YEAR)
    )
    .get();

// Get your aggregation results
Terms agg1 = sr.getAggregations().get("agg1");
Histogram agg2 = sr.getAggregations().get("agg2");
```

For details, see the [Aggregations Java API](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-aggs.html) documentation.
-------------------------------------------------------------------------------- /search-api/using-scrolls-in-java.md: --------------------------------------------------------------------------------


### Using scrolls in Java

First, read the [scroll documentation](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-request-scroll.html).

> An ordinary search request returns a single "page" of results in one response, no matter how large the data set is. The Scroll API lets you retrieve large amounts of data (even all of it): it performs an initial search and then keeps pulling batches of results from Elasticsearch until none remain, much like a cursor in a traditional database.
> Scrolling is not intended for real time user requests, but rather for processing large amounts of data. The results that are returned from a scroll request reflect the state of the index at the time that the initial search request was made, like a snapshot in time. Subsequent changes to documents (index, update or delete) will only affect later search requests.


```
import static org.elasticsearch.index.query.QueryBuilders.*;
```

```
QueryBuilder qb = termQuery("multi", "test");

SearchResponse scrollResp = client.prepareSearch("test")
    .addSort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC)
    .setScroll(new TimeValue(60000))    //to use scrolling, the initial search request should specify the scroll parameter, which tells Elasticsearch how long to keep the search context alive (the scroll timeout)
    .setQuery(qb)
    .setSize(100).get();    //max of 100 hits will be returned for each scroll
//Scroll until no hits are returned
do {
    for (SearchHit hit : scrollResp.getHits().getHits()) {
        //Handle the hit...
    }

    scrollResp = client.prepareSearchScroll(scrollResp.getScrollId()).setScroll(new TimeValue(60000)).execute().actionGet();
} while(scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.
```

> If the scroll timeout has expired and you keep searching with that scroll ID, an error is raised:


```
Caused by: SearchContextMissingException[No search context found for id [2861]]
at org.elasticsearch.search.SearchService.findContext(SearchService.java:613)
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:403)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryScrollTransportHandler.messageReceived(SearchServiceTransportAction.java:384)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryScrollTransportHandler.messageReceived(SearchServiceTransportAction.java:381)
at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
```
> Although the search context is cleared automatically once the scroll timeout expires, keeping a scroll open is expensive, so you should use the Clear-Scroll API to release it as soon as the scroll is no longer in use.

## Clearing scrolls

```
/**
 * Clear a list of scroll IDs
 * @param client
 * @param scrollIdList
 * @return
 */
public static boolean clearScroll(Client client, List<String> scrollIdList){
    ClearScrollRequestBuilder clearScrollRequestBuilder = client.prepareClearScroll();
    clearScrollRequestBuilder.setScrollIds(scrollIdList);
    ClearScrollResponse response = clearScrollRequestBuilder.get();
    return response.isSucceeded();
}
/**
 * Clear a single scroll ID
 * @param client
 * @param scrollId
 * @return
 */
public static boolean clearScroll(Client client, String scrollId){
    ClearScrollRequestBuilder clearScrollRequestBuilder = client.prepareClearScroll();
    clearScrollRequestBuilder.addScrollId(scrollId);
    ClearScrollResponse response = clearScrollRequestBuilder.get();
    return response.isSucceeded();
}
```

#### Example

```

public class ScrollsAPI extends ElasticsearchClientBase {

    private String scrollId;

    @Test
    public void testScrolls() throws Exception {

        SearchResponse scrollResp = client.prepareSearch("twitter")
                .addSort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC)
                .setScroll(new TimeValue(60000))    //tell Elasticsearch how long to keep the search context alive (the scroll timeout)
                .setQuery(QueryBuilders.termQuery("user", "kimchy"))    // Query condition
                .setSize(5).get();    //max of 5 hits will be returned for each scroll
        //Scroll until no hits are returned

        scrollId = scrollResp.getScrollId();
        do {
            for (SearchHit hit : scrollResp.getHits().getHits()) {
                //Handle the hit...

                System.out.println(hit.getSource().toString());
            }

            scrollResp = client.prepareSearchScroll(scrollId).setScroll(new TimeValue(60000)).execute().actionGet();
        }
        while (scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.
    }

    @Override
    public void tearDown() throws Exception {
        ClearScrollRequestBuilder clearScrollRequestBuilder = client.prepareClearScroll();
        clearScrollRequestBuilder.addScrollId(scrollId);
        ClearScrollResponse response = clearScrollRequestBuilder.get();

        if (response.isSucceeded()) {
            System.out.println("Scroll cleared successfully");
        }

        super.tearDown();
    }
}

```

- [ScrollsAPI.java](https://gitee.com/quanke/elasticsearch-java-study/blob/master/src/test/java/name/quanke/es/study/search/ScrollsAPI.java)

- [Complete examples for this handbook](https://gitee.com/quanke/elasticsearch-java-study)
--------------------------------------------------------------------------------
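The scroll-and-clear pattern above boils down to a simple cursor loop: fetch a batch, process it, repeat until a batch comes back empty, then release the cursor. The control flow can be sketched in plain Java with an in-memory list standing in for the index; the `Cursor` class here is purely illustrative and not part of the Elasticsearch API:

```java
import java.util.ArrayList;
import java.util.List;

public class ScrollPattern {
    // Illustrative stand-in for a scroll: hands out fixed-size batches of a list
    static class Cursor {
        private final List<Integer> docs;
        private final int batchSize;
        private int pos = 0;
        private boolean open = true;

        Cursor(List<Integer> docs, int batchSize) {
            this.docs = docs;
            this.batchSize = batchSize;
        }

        List<Integer> nextBatch() {
            List<Integer> batch = new ArrayList<>(
                    docs.subList(pos, Math.min(pos + batchSize, docs.size())));
            pos += batch.size();
            return batch;
        }

        void clear() { open = false; }          // analogous to the Clear-Scroll API
        boolean isOpen() { return open; }
    }

    public static void main(String[] args) {
        List<Integer> index = new ArrayList<>();
        for (int i = 0; i < 12; i++) index.add(i);

        Cursor cursor = new Cursor(index, 5);   // like setSize(5)
        int processed = 0;
        try {
            List<Integer> batch;
            do {
                batch = cursor.nextBatch();     // like prepareSearchScroll(...).get()
                processed += batch.size();      // "handle the hits"
            } while (!batch.isEmpty());         // an empty batch ends the scroll
        } finally {
            cursor.clear();                     // always release the scroll context
        }
        System.out.println(processed + " " + cursor.isOpen());  // prints "12 false"
    }
}
```

Putting the clear call in a `finally` block mirrors the `tearDown()` approach in the test above: the scroll context is released even if handling a batch throws.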