├── .gitignore ├── LICENSE ├── README.md ├── pom.xml ├── sql └── dynamic_synonym_rule.sql └── src └── main ├── assemblies └── plugin.xml ├── java └── com │ └── ginobefunny │ └── elasticsearch │ └── plugins │ └── synonym │ ├── DynamicSynonymPlugin.java │ ├── DynamicSynonymTokenFilterFactory.java │ └── service │ ├── Configuration.java │ ├── DynamicSynonymTokenFilter.java │ ├── SimpleSynonymMap.java │ ├── SynonymRuleManager.java │ └── utils │ ├── JDBCUtils.java │ └── Monitor.java └── resources ├── license-check ├── license_header.txt └── license_header_definition.xml └── plugin-descriptor.properties /.gitignore: -------------------------------------------------------------------------------- 1 | .idea/ 2 | target/ 3 | *.iml 4 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 
25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. 
For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. 
If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. 
You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. 
(Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright {yyyy} {name of copyright owner} 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # elasticsearch-dynamic-synonym 2 | An Elasticsearch token filter that supports dynamically loading synonyms.
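3 | 4 | Elasticsearch ships with a built-in synonym token filter, but its synonyms can only be configured statically, either in a file or inline in the analyzer settings. Adding or changing a synonym means editing the configuration and restarting, which is inconvenient. After studying Elasticsearch's synonym code, I wrote a plugin whose synonyms can be maintained dynamically; it is already used in production. 5 | 6 | # The built-in Elasticsearch SynonymTokenFilter 7 | The built-in synonym filter supports configuring synonyms inline in the analyzer settings (the synonyms parameter) or in a file (the synonyms_path parameter). It is configured as follows: 8 | 9 | { 10 | "index" : { 11 | "analysis" : { 12 | "analyzer" : { 13 | "synonym_analyzer" : { 14 | "tokenizer" : "whitespace", 15 | "filter" : ["my_synonym"] 16 | } 17 | }, 18 | "filter" : { 19 | "my_synonym" : { 20 | "type" : "synonym", 21 | "expand": true, 22 | "ignore_case": true, 23 | "synonyms_path" : "analysis/synonym.txt", 24 | "synonyms" : ["阿迪, 阿迪达斯, adidasi => Adidas","Nike, 耐克, naike"] 25 | } 26 | } 27 | } 28 | } 29 | } 30 | 31 | Synonym rules can be written in either [Solr synonyms](https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis-synonym-tokenfilter.html#_solr_synonyms) or [WordNet synonyms](https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis-synonym-tokenfilter.html#_wordnet_synonyms) format; the Solr format is the one generally used. A Solr rule is either an explicit mapping or a set of equivalent synonyms, which differ as follows: 32 | 33 | 34 | 35 | // Explicit mapping: the tokens 【阿迪】, 【阿迪达斯】 and 【adidasi】 are converted to 【Adidas】 before being stored in the inverted index 36 | 阿迪, 阿迪达斯, adidasi => Adidas 37 | 38 | // Equivalent synonyms 39 | // When expand is true, any one of the following tokens causes all three tokens to be stored in the inverted index 40 | // When expand is false, any one of the following tokens causes only the first token, 【Nike】, to be stored in the inverted index 41 | Nike, 耐克, naike 42 | 43 | # DynamicSynonymTokenFilter 44 | ## Implementation 45 | - DynamicSynonymTokenFilter is modeled on SynonymTokenFilter but simplified: it uses a HashMap to store the mappings between synonyms; 46 | - DynamicSynonymTokenFilter supports only Solr synonyms, and also supports the expand and ignore_case parameters; 47 | - DynamicSynonymTokenFilter keeps the synonym configuration in a database and polls it (using the version field to detect rule changes) so that synonyms can be managed dynamically; 48 | 49 | ## Installation 50 | 1. Download the plugin source 51 | 52 | git clone git@github.com:ginobefun/elasticsearch-dynamic-synonym.git 53 | 54 | 2. Build the plugin with Maven 55 | 56 | mvn clean install -DskipTests 57 | 58 | 3. Create a dynamic-synonym directory under ES_HOME/plugins and unzip target/releases/elasticsearch-dynamic-synonym-\.zip into it 59 | 60 | 4. Create the Elasticsearch synonym database and a user in MySQL 61 | 62 | create database elasticsearch; 63 | DROP 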
TABLE IF EXISTS `dynamic_synonym_rule`; 64 | CREATE TABLE `dynamic_synonym_rule` ( 65 | `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, 66 | `rule` varchar(255) NOT NULL, 67 | `status` tinyint(1) NOT NULL DEFAULT '1' COMMENT '1: available, 0:unavailable', 68 | `version` int(11) NOT NULL, 69 | PRIMARY KEY (`id`), 70 | KEY `IDX_DYNAMIC_SYNONYM_VERSION` (`version`), 71 | KEY `IDX_DYNAMIC_SYNONYM_RULE` (`rule`) 72 | ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8; 73 | 74 | -- ---------------------------- 75 | -- insert sample records 76 | -- ---------------------------- 77 | INSERT INTO `dynamic_synonym_rule` VALUES ('1', '阿迪, 阿迪达斯, adidasi => Adidas', '1', '1'); 78 | INSERT INTO `dynamic_synonym_rule` VALUES ('2', 'Nike, 耐克, naike', '1', '2'); 79 | 80 | 81 | 5. Restart Elasticsearch 82 | 83 | ## Configuration 84 | 85 | Configure the analyzer and filter when creating the Elasticsearch index: 86 | 87 | PUT /index_synonym 88 | { 89 | "settings": { 90 | "analysis": { 91 | "analyzer": { 92 | "analyzer_with_dynamic_synonym": { 93 | "type": "custom", 94 | "tokenizer": "whitespace", 95 | "filter": ["my_synonym"] 96 | } 97 | }, 98 | "filter": { 99 | "my_synonym": { 100 | "type": "dynamic-synonym", 101 | "expand": true, 102 | "ignore_case": true, 103 | "tokenizer": "whitespace", 104 | "db_url": "jdbc:mysql://localhost:3306/elasticsearch?user=es_user&password=es_pwd&useUnicode=true&characterEncoding=UTF8" 105 | } 106 | } 107 | } 108 | } 109 | } 110 | 111 | Set the mapping 112 | 113 | POST /index_synonym/product/_mapping 114 | { 115 | "product": { 116 | "properties": { 117 | "productName": { 118 | "type": "text", 119 | "analyzer": "analyzer_with_dynamic_synonym" 120 | } 121 | } 122 | } 123 | } 124 | 125 | 126 | ## Usage 127 | Index some test data 128 | 129 | POST /index_synonym/product/1 130 | {"productName":"This is a nike shoes"} 131 | 132 | POST /index_synonym/product/2 133 | {"productName":"This is a nike sports jacket"} 134 | 135 | POST /index_synonym/product/3 136 | {"productName":"This is a adidas shoes"} 137 | 138 | POST 
/index_synonym/product/4 139 | {"productName":"This is a adidas sports jacket"} 140 | 141 | POST /index_synonym/product/5 142 | {"productName":"This is a vans shoes"} 143 | 144 | POST /index_synonym/product/6 145 | {"productName":"This is a vans sports jacket"} 146 | 147 | 148 | Test the analyzer with 【耐克】 149 | 150 | POST index_synonym/_search 151 | { 152 | "query": { 153 | "match": { 154 | "productName": "耐克" 155 | } 156 | } 157 | } 158 | 159 | { 160 | "took": 7, 161 | "timed_out": false, 162 | "_shards": { 163 | "total": 5, 164 | "successful": 5, 165 | "failed": 0 166 | }, 167 | "hits": { 168 | "total": 2, 169 | "max_score": 2.4740286, 170 | "hits": [ 171 | { 172 | "_index": "index_synonym", 173 | "_type": "product", 174 | "_id": "2", 175 | "_score": 2.4740286, 176 | "_source": { 177 | "productName": "This is a nike sports jacket" 178 | } 179 | }, 180 | { 181 | "_index": "index_synonym", 182 | "_type": "product", 183 | "_id": "1", 184 | "_score": 0.85747814, 185 | "_source": { 186 | "productName": "This is a nike shoes" 187 | } 188 | } 189 | ] 190 | } 191 | } 192 | 193 | Insert a new synonym rule into the database, then test 【范斯】 194 | 195 | INSERT INTO `dynamic_synonym_rule` VALUES ('3', 'Vans, 范斯', '1', '3'); 196 | 197 | // wait for 2 minutes to reload 198 | [2017-03-15 15:52:28,895][INFO ][node ] [node-local] started 199 | [2017-03-15 15:55:29,645][INFO ][dynamic-synonym ] Start to reload synonym rule... 200 | [2017-03-15 15:55:29,661][INFO ][dynamic-synonym ] Succeed to reload 3 synonym rule! 
201 | 202 | POST index_synonym/_search 203 | { 204 | "query": { 205 | "match": { 206 | "productName": "范斯" 207 | } 208 | } 209 | } 210 | 211 | { 212 | "took": 4, 213 | "timed_out": false, 214 | "_shards": { 215 | "total": 5, 216 | "successful": 5, 217 | "failed": 0 218 | }, 219 | "hits": { 220 | "total": 2, 221 | "max_score": 1.9490025, 222 | "hits": [ 223 | { 224 | "_index": "index_synonym", 225 | "_type": "product", 226 | "_id": "6", 227 | "_score": 1.9490025, 228 | "_source": { 229 | "productName": "This is a vans sports jacket" 230 | } 231 | }, 232 | { 233 | "_index": "index_synonym", 234 | "_type": "product", 235 | "_id": "5", 236 | "_score": 0.53484553, 237 | "_source": { 238 | "productName": "This is a vans shoes" 239 | } 240 | } 241 | ] 242 | } 243 | } 244 | 245 | # Summary and future improvements 246 | - By studying the Elasticsearch source I implemented a simplified synonym plugin; synonym rules can be added, removed and modified dynamically through its configuration; 247 | - Note one important limitation of dynamic updates: data already in the index is not affected by a synonym update, so consider whether you can tolerate this. A common approach is to manage synonyms in batches at set times and rebuild the index after each update; 248 | - The plugin also has a known issue: the synonym mappings are held in memory as global data, so configuring several different synonym filters causes problems. The mappings generated by the first filter to initialize successfully are the ones used. This may be improved in a later version. 249 | 250 | # References 251 | - [Using Synonyms](https://www.elastic.co/guide/en/elasticsearch/guide/current/using-synonyms.html) 252 | - [Synonym Token Filter](https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis-synonym-tokenfilter.html) 253 | -------------------------------------------------------------------------------- /pom.xml: -------------------------------------------------------------------------------- 1 | 2 | 5 | 6 | ElasticSearch Dynamic Synonym Token Filter Plugin 7 | 4.0.0 8 | com.ginobefunny.elasticsearch.plugins 9 | elasticsearch-dynamic-synonym 10 | ${elasticsearch.version} 11 | jar 12 | ElasticSearch Plugin for Dynamic Synonym Token Filter. 
13 | 2017 14 | 15 | 16 | 5.2.2 17 | 1.8 18 | ${project.basedir}/src/main/assemblies/plugin.xml 19 | dynamic-synonym 20 | com.ginobefunny.elasticsearch.plugins.synonym.DynamicSynonymPlugin 21 | true 22 | false 23 | true 24 | 25 | 26 | 27 | 28 | The Apache Software License, Version 2.0 29 | http://www.apache.org/licenses/LICENSE-2.0.txt 30 | repo 31 | 32 | 33 | 34 | 35 | 36 | ginobefun 37 | ginobefun@163.com 38 | 39 | 40 | 41 | 42 | scm:git:git@github.com:ginobefun/elasticsearch-dynamic-synonym.git 43 | scm:git:git@github.com:ginobefun/elasticsearch-dynamic-synonym.git 44 | http://github.com/ginobefun/elasticsearch-dynamic-synonym 45 | 46 | 47 | 48 | org.sonatype.oss 49 | oss-parent 50 | 9 51 | 52 | 53 | 54 | 55 | oss.sonatype.org 56 | https://oss.sonatype.org/content/repositories/snapshots 57 | 58 | 59 | oss.sonatype.org 60 | https://oss.sonatype.org/service/local/staging/deploy/maven2/ 61 | 62 | 63 | 64 | 65 | 66 | oss.sonatype.org 67 | OSS Sonatype 68 | true 69 | true 70 | http://oss.sonatype.org/content/repositories/releases/ 71 | 72 | 73 | 74 | 75 | 76 | org.elasticsearch 77 | elasticsearch 78 | ${elasticsearch.version} 79 | compile 80 | 81 | 82 | 83 | mysql 84 | mysql-connector-java 85 | 5.1.30 86 | 87 | 88 | 89 | org.apache.logging.log4j 90 | log4j-api 91 | 2.3 92 | 93 | 94 | 95 | org.apache.httpcomponents 96 | httpclient 97 | 4.5.2 98 | 99 | 100 | 101 | 102 | 103 | 104 | org.apache.maven.plugins 105 | maven-compiler-plugin 106 | 3.5.1 107 | 108 | ${maven.compiler.target} 109 | ${maven.compiler.target} 110 | 111 | 112 | 113 | org.apache.maven.plugins 114 | maven-surefire-plugin 115 | 2.11 116 | 117 | 118 | **/*Tests.java 119 | 120 | 121 | 122 | 123 | org.apache.maven.plugins 124 | maven-source-plugin 125 | 2.1.2 126 | 127 | 128 | attach-sources 129 | 130 | jar 131 | 132 | 133 | 134 | 135 | 136 | maven-assembly-plugin 137 | 138 | 139 | false 140 | ${project.build.directory}/releases/ 141 | 142 | ${basedir}/src/main/assemblies/plugin.xml 143 | 144 | 145 | 146 | 
fully.qualified.MainClass 147 | 148 | 149 | 150 | 151 | 152 | package 153 | 154 | single 155 | 156 | 157 | 158 | 159 | 160 | 161 | 162 | 163 | disable-java8-doclint 164 | 165 | [1.8,) 166 | 167 | 168 | -Xdoclint:none 169 | 170 | 171 | 172 | release 173 | 174 | 175 | 176 | org.sonatype.plugins 177 | nexus-staging-maven-plugin 178 | 1.6.3 179 | true 180 | 181 | oss 182 | https://oss.sonatype.org/ 183 | true 184 | 185 | 186 | 187 | org.apache.maven.plugins 188 | maven-release-plugin 189 | 2.1 190 | 191 | true 192 | false 193 | release 194 | deploy 195 | 196 | 197 | 198 | org.apache.maven.plugins 199 | maven-compiler-plugin 200 | 3.5.1 201 | 202 | ${maven.compiler.target} 203 | ${maven.compiler.target} 204 | 205 | 206 | 207 | org.apache.maven.plugins 208 | maven-source-plugin 209 | 2.2.1 210 | 211 | 212 | attach-sources 213 | 214 | jar-no-fork 215 | 216 | 217 | 218 | 219 | 220 | org.apache.maven.plugins 221 | maven-javadoc-plugin 222 | 2.9 223 | 224 | 225 | attach-javadocs 226 | 227 | jar 228 | 229 | 230 | 231 | 232 | 233 | 234 | 235 | 236 | -------------------------------------------------------------------------------- /sql/dynamic_synonym_rule.sql: -------------------------------------------------------------------------------- 1 | DROP TABLE IF EXISTS `dynamic_synonym_rule`; 2 | CREATE TABLE `dynamic_synonym_rule` ( 3 | `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, 4 | `rule` varchar(255) NOT NULL, 5 | `status` tinyint(1) NOT NULL DEFAULT '1' COMMENT '1: available, 0:unavailable', 6 | `version` int(11) NOT NULL, 7 | PRIMARY KEY (`id`), 8 | KEY `IDX_DYNAMIC_SYNONYM_VERSION` (`version`), 9 | KEY `IDX_DYNAMIC_SYNONYM_RULE` (`rule`) 10 | ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8; 11 | 12 | -- ---------------------------- 13 | -- insert sample records 14 | -- ---------------------------- 15 | INSERT INTO `dynamic_synonym_rule` VALUES ('1', '阿迪, 阿迪达斯, adidasi => Adidas', '1', '1'); 16 | INSERT INTO `dynamic_synonym_rule` VALUES ('2', 'Nike, 耐克, naike', 
'1', '2'); 17 | 18 | -------------------------------------------------------------------------------- /src/main/assemblies/plugin.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | elasticsearch-dynamic-synonym-release 4 | 5 | zip 6 | 7 | false 8 | 9 | 10 | 11 | ${project.basedir}/src/main/resources/plugin-descriptor.properties 12 | 13 | true 14 | 15 | 16 | 17 | 18 | / 19 | true 20 | true 21 | 22 | org.elasticsearch:elasticsearch 23 | org.apache.logging.log4j:log4j-api 24 | 25 | 26 | 27 | / 28 | true 29 | true 30 | 31 | org.apache.httpcomponents:httpclient 32 | 33 | 34 | 35 | 36 | -------------------------------------------------------------------------------- /src/main/java/com/ginobefunny/elasticsearch/plugins/synonym/DynamicSynonymPlugin.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Licensed under the Apache License, Version 2.0 (the "License"); 3 | * you may not use this file except in compliance with the License. 4 | * You may obtain a copy of the License at 5 | * 6 | * http://www.apache.org/licenses/LICENSE-2.0 7 | * 8 | * Unless required by applicable law or agreed to in writing, software 9 | * distributed under the License is distributed on an "AS IS" BASIS, 10 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the License for the specific language governing permissions and 12 | * limitations under the License. 
13 | */ 14 | package com.ginobefunny.elasticsearch.plugins.synonym; 15 | 16 | import org.elasticsearch.common.settings.Settings; 17 | import org.elasticsearch.env.Environment; 18 | import org.elasticsearch.index.IndexSettings; 19 | import org.elasticsearch.index.analysis.TokenFilterFactory; 20 | import org.elasticsearch.indices.analysis.AnalysisModule; 21 | import org.elasticsearch.plugins.AnalysisPlugin; 22 | import org.elasticsearch.plugins.Plugin; 23 | 24 | import java.io.IOException; 25 | import java.util.HashMap; 26 | import java.util.Map; 27 | 28 | /** 29 | * Created by ginozhang on 2017/1/12. 30 | */ 31 | public class DynamicSynonymPlugin extends Plugin implements AnalysisPlugin { 32 | 33 | /** Plugin name **/ 34 | public static final String PLUGIN_NAME = "dynamic-synonym"; 35 | 36 | @Override 37 | public Map<String, AnalysisModule.AnalysisProvider<TokenFilterFactory>> getTokenFilters() { 38 | Map<String, AnalysisModule.AnalysisProvider<TokenFilterFactory>> tokenFilters = new HashMap<>(); 39 | 40 | tokenFilters.put(PLUGIN_NAME, requiresAnalysisSettings((is, env, name, settings) -> new DynamicSynonymTokenFilterFactory(is, env, name, settings))); 41 | 42 | return tokenFilters; 43 | } 44 | 45 | private <T> AnalysisModule.AnalysisProvider<T> requiresAnalysisSettings(AnalysisModule.AnalysisProvider<T> provider) { 46 | return new AnalysisModule.AnalysisProvider<T>() { 47 | 48 | @Override 49 | public T get(IndexSettings indexSettings, Environment environment, String name, Settings settings) throws IOException { 50 | return provider.get(indexSettings, environment, name, settings); 51 | } 52 | 53 | @Override 54 | public boolean requiresAnalysisSettings() { 55 | return true; 56 | } 57 | }; 58 | } 59 | } -------------------------------------------------------------------------------- /src/main/java/com/ginobefunny/elasticsearch/plugins/synonym/DynamicSynonymTokenFilterFactory.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Licensed under the Apache License, Version 2.0 (the "License"); 3 | * you may not use this file except in compliance with the 
License. 4 | * You may obtain a copy of the License at 5 | * 6 | * http://www.apache.org/licenses/LICENSE-2.0 7 | * 8 | * Unless required by applicable law or agreed to in writing, software 9 | * distributed under the License is distributed on an "AS IS" BASIS, 10 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the License for the specific language governing permissions and 12 | * limitations under the License. 13 | */ 14 | package com.ginobefunny.elasticsearch.plugins.synonym; 15 | 16 | import com.ginobefunny.elasticsearch.plugins.synonym.service.Configuration; 17 | import com.ginobefunny.elasticsearch.plugins.synonym.service.DynamicSynonymTokenFilter; 18 | import com.ginobefunny.elasticsearch.plugins.synonym.service.SynonymRuleManager; 19 | import org.apache.lucene.analysis.Analyzer; 20 | import org.apache.lucene.analysis.TokenStream; 21 | import org.apache.lucene.analysis.core.KeywordAnalyzer; 22 | import org.apache.lucene.analysis.core.SimpleAnalyzer; 23 | import org.apache.lucene.analysis.core.WhitespaceAnalyzer; 24 | import org.apache.lucene.analysis.standard.StandardAnalyzer; 25 | import org.elasticsearch.common.settings.Settings; 26 | import org.elasticsearch.env.Environment; 27 | import org.elasticsearch.index.IndexSettings; 28 | import org.elasticsearch.index.analysis.AbstractTokenFilterFactory; 29 | 30 | import java.io.IOException; 31 | 32 | public class DynamicSynonymTokenFilterFactory extends AbstractTokenFilterFactory { 33 | 34 | public DynamicSynonymTokenFilterFactory(IndexSettings indexSettings, Environment env, 35 | String name, Settings settings) throws IOException { 36 | super(indexSettings, name, settings); 37 | 38 | // get the filter setting params 39 | final boolean ignoreCase = settings.getAsBoolean("ignore_case", false); 40 | final boolean expand = settings.getAsBoolean("expand", true); 41 | final String dbUrl = settings.get("db_url"); 42 | final String tokenizerName = settings.get("tokenizer", 
"whitespace"); 43 | 44 | Analyzer analyzer; 45 | if ("standand".equalsIgnoreCase(tokenizerName)) { 46 | analyzer = new StandardAnalyzer(); 47 | } else if ("keyword".equalsIgnoreCase(tokenizerName)) { 48 | analyzer = new KeywordAnalyzer(); 49 | } else if ("simple".equalsIgnoreCase(tokenizerName)) { 50 | analyzer = new SimpleAnalyzer(); 51 | } else { 52 | analyzer = new WhitespaceAnalyzer(); 53 | } 54 | 55 | // NOTE: the manager will only init once 56 | SynonymRuleManager.initial(new Configuration(ignoreCase, expand, analyzer, dbUrl)); 57 | } 58 | 59 | @Override 60 | public TokenStream create(TokenStream tokenStream) { 61 | return new DynamicSynonymTokenFilter(tokenStream); 62 | } 63 | } 64 | -------------------------------------------------------------------------------- /src/main/java/com/ginobefunny/elasticsearch/plugins/synonym/service/Configuration.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Licensed under the Apache License, Version 2.0 (the "License"); 3 | * you may not use this file except in compliance with the License. 4 | * You may obtain a copy of the License at 5 | * 6 | * http://www.apache.org/licenses/LICENSE-2.0 7 | * 8 | * Unless required by applicable law or agreed to in writing, software 9 | * distributed under the License is distributed on an "AS IS" BASIS, 10 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the License for the specific language governing permissions and 12 | * limitations under the License. 13 | */ 14 | package com.ginobefunny.elasticsearch.plugins.synonym.service; 15 | 16 | import org.apache.lucene.analysis.Analyzer; 17 | 18 | /** 19 | * Created by ginozhang on 2017/1/12. 
20 | */ 21 | public class Configuration { 22 | 23 | private final boolean ignoreCase; 24 | 25 | private final boolean expand; 26 | 27 | private final String dbUrl; 28 | 29 | private final Analyzer analyzer; 30 | 31 | public Configuration(boolean ignoreCase, boolean expand, Analyzer analyzer, String dbUrl) { 32 | this.ignoreCase = ignoreCase; 33 | this.expand = expand; 34 | this.analyzer = analyzer; 35 | this.dbUrl = dbUrl; 36 | } 37 | 38 | public Analyzer getAnalyzer() { 39 | return analyzer; 40 | } 41 | 42 | public boolean isIgnoreCase() { 43 | return ignoreCase; 44 | } 45 | 46 | public boolean isExpand() { 47 | return expand; 48 | } 49 | 50 | public String getDBUrl() { 51 | return dbUrl; 52 | } 53 | } 54 | -------------------------------------------------------------------------------- /src/main/java/com/ginobefunny/elasticsearch/plugins/synonym/service/DynamicSynonymTokenFilter.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Licensed under the Apache License, Version 2.0 (the "License"); 3 | * you may not use this file except in compliance with the License. 4 | * You may obtain a copy of the License at 5 | * 6 | * http://www.apache.org/licenses/LICENSE-2.0 7 | * 8 | * Unless required by applicable law or agreed to in writing, software 9 | * distributed under the License is distributed on an "AS IS" BASIS, 10 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the License for the specific language governing permissions and 12 | * limitations under the License. 
13 | */ 14 | package com.ginobefunny.elasticsearch.plugins.synonym.service; 15 | 16 | import org.apache.lucene.analysis.TokenFilter; 17 | import org.apache.lucene.analysis.TokenStream; 18 | import org.apache.lucene.analysis.tokenattributes.CharTermAttribute; 19 | import org.apache.lucene.analysis.tokenattributes.OffsetAttribute; 20 | import org.apache.lucene.analysis.tokenattributes.TypeAttribute; 21 | 22 | import java.io.IOException; 23 | import java.util.List; 24 | 25 | /** 26 | * Created by ginozhang on 2017/1/12. 27 | */ 28 | public class DynamicSynonymTokenFilter extends TokenFilter { 29 | 30 | public static final String TYPE_SYNONYM = "SYNONYM"; 31 | 32 | private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class); 33 | 34 | private final TypeAttribute typeAtt = addAttribute(TypeAttribute.class); 35 | 36 | private final OffsetAttribute offset = addAttribute(OffsetAttribute.class); 37 | 38 | private String currentInput = null; 39 | 40 | private int startOffset = 0; 41 | 42 | private int endOffset = 0; 43 | 44 | private List<String> currentWords = null; 45 | 46 | private int currentIndex = 0; 47 | 48 | public DynamicSynonymTokenFilter(TokenStream input) { 49 | super(input); 50 | } 51 | 52 | @Override 53 | public boolean incrementToken() throws IOException { 54 | if (currentInput == null) { 55 | if (!input.incrementToken()) { 56 | return false; 57 | } 58 | 59 | currentInput = new String(termAtt.buffer(), 0, termAtt.length()); 60 | startOffset = offset.startOffset(); 61 | endOffset = offset.endOffset(); 62 | currentWords = SynonymRuleManager.getSingleton().getSynonymWords(currentInput); 63 | if (currentWords == null || currentWords.isEmpty()) { 64 | currentInput = null; 65 | 66 | // no synonyms for this token: return the current token unchanged 67 | return true; 68 | } 69 | currentIndex = 0; 70 | } 71 | 72 | if (currentIndex >= currentWords.size()) { 73 | currentInput = null; 74 | return incrementToken(); 75 | } 76 | 77 | String newWords = currentWords.get(currentIndex); 78 | currentIndex++; 79 | 
clearAttributes(); 80 | char[] output = newWords.toCharArray(); 81 | termAtt.copyBuffer(output, 0, output.length); 82 | typeAtt.setType(TYPE_SYNONYM); 83 | offset.setOffset(startOffset, endOffset); 84 | return true; 85 | } 86 | 87 | @Override 88 | public void reset() throws IOException { 89 | super.reset(); 90 | currentInput = null; 91 | startOffset = 0; 92 | endOffset = 0; 93 | currentWords = null; 94 | currentIndex = 0; 95 | } 96 | } 97 | -------------------------------------------------------------------------------- /src/main/java/com/ginobefunny/elasticsearch/plugins/synonym/service/SimpleSynonymMap.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Licensed under the Apache License, Version 2.0 (the "License"); 3 | * you may not use this file except in compliance with the License. 4 | * You may obtain a copy of the License at 5 | * 6 | * http://www.apache.org/licenses/LICENSE-2.0 7 | * 8 | * Unless required by applicable law or agreed to in writing, software 9 | * distributed under the License is distributed on an "AS IS" BASIS, 10 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the License for the specific language governing permissions and 12 | * limitations under the License. 13 | */ 14 | package com.ginobefunny.elasticsearch.plugins.synonym.service; 15 | 16 | import com.ginobefunny.elasticsearch.plugins.synonym.service.utils.Monitor; 17 | import org.apache.logging.log4j.Logger; 18 | import org.apache.lucene.analysis.Analyzer; 19 | import org.apache.lucene.analysis.TokenStream; 20 | import org.apache.lucene.analysis.tokenattributes.CharTermAttribute; 21 | import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute; 22 | import org.elasticsearch.common.logging.ESLoggerFactory; 23 | 24 | import java.io.IOException; 25 | import java.util.*; 26 | 27 | /** 28 | * Created by ginozhang on 2017/1/12. 
29 | * SEE: org.apache.lucene.analysis.synonym.SolrSynonymParser 30 | */ 31 | public class SimpleSynonymMap { 32 | 33 | private static final Logger LOGGER = ESLoggerFactory.getLogger(SimpleSynonymMap.class.getName()); 34 | 35 | private Map<String, List<String>> ruleMap = new HashMap<String, List<String>>(); 36 | 37 | private final Configuration configuration; 38 | 39 | public SimpleSynonymMap(Configuration cfg) { 40 | this.configuration = cfg; 41 | } 42 | 43 | public void addRule(String rule) { 44 | try { 45 | addInternal(rule); 46 | } catch (Throwable t) { 47 | LOGGER.error("Add synonym rule failed. rule: " + rule, t); 48 | } 49 | } 50 | 51 | private void addInternal(String line) throws IOException { 52 | String sides[] = split(line, "=>"); 53 | if (sides.length > 1) { // explicit mapping 54 | if (sides.length != 2) { 55 | throw new IllegalArgumentException("more than one explicit mapping specified on the same line"); 56 | } 57 | 58 | List<String> inputList = new ArrayList<>(); 59 | String inputStrings[] = split(sides[0], ","); 60 | for (int i = 0; i < inputStrings.length; i++) { 61 | inputList.addAll(analyze(process(inputStrings[i]))); 62 | } 63 | 64 | List<String> outputList = new ArrayList<>(); 65 | String outputStrings[] = split(sides[1], ","); 66 | for (int i = 0; i < outputStrings.length; i++) { 67 | outputList.addAll(analyze(process(outputStrings[i]))); 68 | } 69 | 70 | // these mappings are explicit and never preserve original 71 | for (String input : inputList) { 72 | for (String output : outputList) { 73 | addToRuleMap(input, output); 74 | } 75 | } 76 | } else { 77 | List<String> inputList = new ArrayList<>(); 78 | String inputStrings[] = split(line, ","); 79 | for (int i = 0; i < inputStrings.length; i++) { 80 | inputList.addAll(analyze(process(inputStrings[i]))); 81 | } 82 | 83 | if (configuration.isExpand()) { 84 | // all pairs 85 | for (String input : inputList) { 86 | for (String output : inputList) { 87 | addToRuleMap(input, output); 88 | } 89 | } 90 | } else { 91 | // all subsequent inputs map to first one; we also add 
inputs[0] here 92 | // so that we "effectively" (because we remove the original input and 93 | // add back a synonym with the same text) change that token's type to 94 | // SYNONYM (matching legacy behavior): 95 | for (int i = 0; i < inputList.size(); i++) { 96 | addToRuleMap(inputList.get(i), inputList.get(0)); 97 | } 98 | } 99 | } 100 | } 101 | 102 | private Set<String> analyze(String text) throws IOException { 103 | Set<String> result = new HashSet<>(); 104 | Analyzer analyzer = configuration.getAnalyzer(); 105 | try (TokenStream ts = analyzer.tokenStream("", text)) { 106 | CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class); 107 | PositionIncrementAttribute posIncAtt = ts.addAttribute(PositionIncrementAttribute.class); 108 | ts.reset(); 109 | while (ts.incrementToken()) { 110 | int length = termAtt.length(); 111 | if (length == 0) { 112 | throw new IllegalArgumentException("term: " + text + " analyzed to a zero-length token"); 113 | } 114 | if (posIncAtt.getPositionIncrement() != 1) { 115 | throw new IllegalArgumentException("term: " + text + " analyzed to a token with posinc != 1"); 116 | } 117 | 118 | result.add(new String(termAtt.buffer(), 0, termAtt.length())); 119 | } 120 | 121 | ts.end(); 122 | return result; 123 | } 124 | } 125 | 126 | private void addToRuleMap(String inputString, String outputString) { 127 | List<String> outputs = ruleMap.get(inputString); 128 | if (outputs == null) { 129 | outputs = new ArrayList<>(); 130 | ruleMap.put(inputString, outputs); 131 | } 132 | 133 | if (!outputs.contains(outputString)) { 134 | outputs.add(outputString); 135 | } 136 | } 137 | 138 | private static String[] split(String s, String separator) { 139 | List<String> list = new ArrayList<>(2); 140 | StringBuilder sb = new StringBuilder(); 141 | int pos = 0, end = s.length(); 142 | while (pos < end) { 143 | if (s.startsWith(separator, pos)) { 144 | if (sb.length() > 0) { 145 | list.add(sb.toString()); 146 | sb = new StringBuilder(); 147 | } 148 | pos += separator.length(); 149 | 
continue; 150 | } 151 | 152 | char ch = s.charAt(pos++); 153 | if (ch == '\\') { 154 | sb.append(ch); 155 | if (pos >= end) break; // ERROR, or let it go? 156 | ch = s.charAt(pos++); 157 | } 158 | 159 | sb.append(ch); 160 | } 161 | 162 | if (sb.length() > 0) { 163 | list.add(sb.toString()); 164 | } 165 | 166 | return list.toArray(new String[list.size()]); 167 | } 168 | 169 | private String process(String input) { 170 | 171 | String inputStr = configuration.isIgnoreCase() ? input.trim().toLowerCase(Locale.getDefault()) : input; 172 | if (inputStr.indexOf("\\") >= 0) { 173 | StringBuilder sb = new StringBuilder(); 174 | for (int i = 0; i < inputStr.length(); i++) { 175 | char ch = inputStr.charAt(i); 176 | if (ch == '\\' && i < inputStr.length() - 1) { 177 | sb.append(inputStr.charAt(++i)); 178 | } else { 179 | sb.append(ch); 180 | } 181 | } 182 | return sb.toString(); 183 | } 184 | return inputStr; 185 | } 186 | 187 | public List<String> getSynonymWords(String input) { 188 | if (!ruleMap.containsKey(input)) { 189 | return null; 190 | } 191 | 192 | return ruleMap.get(input); 193 | } 194 | 195 | } 196 | -------------------------------------------------------------------------------- /src/main/java/com/ginobefunny/elasticsearch/plugins/synonym/service/SynonymRuleManager.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Licensed under the Apache License, Version 2.0 (the "License"); 3 | * you may not use this file except in compliance with the License. 4 | * You may obtain a copy of the License at 5 | * 6 | * http://www.apache.org/licenses/LICENSE-2.0 7 | * 8 | * Unless required by applicable law or agreed to in writing, software 9 | * distributed under the License is distributed on an "AS IS" BASIS, 10 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the License for the specific language governing permissions and 12 | * limitations under the License. 
13 | */ 14 | package com.ginobefunny.elasticsearch.plugins.synonym.service; 15 | 16 | import com.ginobefunny.elasticsearch.plugins.synonym.service.utils.JDBCUtils; 17 | import com.ginobefunny.elasticsearch.plugins.synonym.service.utils.Monitor; 18 | import org.apache.logging.log4j.Logger; 19 | import org.elasticsearch.common.logging.ESLoggerFactory; 20 | 21 | import java.util.List; 22 | import java.util.concurrent.Executors; 23 | import java.util.concurrent.ScheduledExecutorService; 24 | import java.util.concurrent.ThreadFactory; 25 | import java.util.concurrent.TimeUnit; 26 | 27 | /** 28 | * Created by ginozhang on 2017/1/12. 29 | */ 30 | public class SynonymRuleManager { 31 | 32 | private static final Logger LOGGER = ESLoggerFactory.getLogger(SynonymRuleManager.class.getName()); 33 | 34 | private static final int DB_CHECK_INTERVAL_SECONDS = 60; 35 | 36 | private static final ScheduledExecutorService executorService = Executors.newScheduledThreadPool(1, new ThreadFactory() { 37 | @Override 38 | public Thread newThread(Runnable r) { 39 | return new Thread(r, "monitor-thread"); 40 | } 41 | }); 42 | 43 | private static SynonymRuleManager singleton; 44 | 45 | private Configuration configuration; 46 | 47 | private SimpleSynonymMap synonymMap; 48 | 49 | public static synchronized SynonymRuleManager initial(Configuration cfg) { 50 | if (singleton == null) { 51 | synchronized (SynonymRuleManager.class) { 52 | if (singleton == null) { 53 | singleton = new SynonymRuleManager(); 54 | singleton.configuration = cfg; 55 | long loadedMaxVersion = singleton.loadSynonymRule(); 56 | executorService.scheduleWithFixedDelay(new Monitor(cfg, loadedMaxVersion), 1, 57 | DB_CHECK_INTERVAL_SECONDS, TimeUnit.SECONDS); 58 | } 59 | } 60 | } 61 | 62 | return singleton; 63 | } 64 | 65 | public static SynonymRuleManager getSingleton() { 66 | if (singleton == null) { 67 | throw new IllegalStateException("Please initial first."); 68 | } 69 | return singleton; 70 | } 71 | 72 | public List<String> getSynonymWords(String inputToken) { 73 | if 
(this.synonymMap == null) { 74 | return null; 75 | } 76 | 77 | return this.synonymMap.getSynonymWords(inputToken); 78 | } 79 | 80 | private long loadSynonymRule() { 81 | try { 82 | long currentMaxVersion = JDBCUtils.queryMaxSynonymRuleVersion(configuration.getDBUrl()); 83 | List<String> synonymRuleList = JDBCUtils.querySynonymRules(configuration.getDBUrl(), currentMaxVersion); 84 | this.synonymMap = new SimpleSynonymMap(this.configuration); 85 | for (String rule : synonymRuleList) { 86 | this.synonymMap.addRule(rule); 87 | } 88 | 89 | LOGGER.info("Load {} synonym rule succeed!", synonymRuleList.size()); 90 | return currentMaxVersion; 91 | } catch (Exception e) { 92 | LOGGER.error("Load synonym rule failed!", e); 93 | //throw new RuntimeException(e); 94 | return 0L; 95 | } 96 | } 97 | 98 | public boolean reloadSynonymRule(long maxVersion) { 99 | LOGGER.info("Start to reload synonym rule..."); 100 | boolean reloadResult = true; 101 | try { 102 | SynonymRuleManager tmpManager = new SynonymRuleManager(); 103 | tmpManager.configuration = getSingleton().configuration; 104 | List<String> synonymRuleList = JDBCUtils.querySynonymRules(configuration.getDBUrl(), maxVersion); 105 | SimpleSynonymMap tempSynonymMap = new SimpleSynonymMap(tmpManager.configuration); 106 | for (String rule : synonymRuleList) { 107 | tempSynonymMap.addRule(rule); 108 | } 109 | 110 | this.synonymMap = tempSynonymMap; 111 | LOGGER.info("Succeed to reload {} synonym rule!", synonymRuleList.size()); 112 | } catch (Throwable t) { 113 | LOGGER.error("Failed to reload synonym rule!", t); 114 | reloadResult = false; 115 | } 116 | 117 | return reloadResult; 118 | } 119 | } 120 | -------------------------------------------------------------------------------- /src/main/java/com/ginobefunny/elasticsearch/plugins/synonym/service/utils/JDBCUtils.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Licensed under the Apache License, Version 2.0 (the "License"); 3 | * you may not use 
this file except in compliance with the License. 4 | * You may obtain a copy of the License at 5 | * 6 | * http://www.apache.org/licenses/LICENSE-2.0 7 | * 8 | * Unless required by applicable law or agreed to in writing, software 9 | * distributed under the License is distributed on an "AS IS" BASIS, 10 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the License for the specific language governing permissions and 12 | * limitations under the License. 13 | */ 14 | package com.ginobefunny.elasticsearch.plugins.synonym.service.utils; 15 | 16 | import java.sql.*; 17 | import java.util.ArrayList; 18 | import java.util.List; 19 | 20 | /** 21 | * Created by ginozhang on 2017/1/12. 22 | */ 23 | public final class JDBCUtils { 24 | 25 | public static long queryMaxSynonymRuleVersion(String dbUrl) throws Exception { 26 | long maxVersion = 0; 27 | Connection conn = null; 28 | Statement stmt = null; 29 | ResultSet rs = null; 30 | try { 31 | Class.forName("com.mysql.jdbc.Driver"); 32 | conn = DriverManager.getConnection(dbUrl); 33 | stmt = conn.createStatement(); 34 | String sql = "SELECT max(version) VERSION FROM dynamic_synonym_rule"; 35 | rs = stmt.executeQuery(sql); 36 | while (rs.next()) { 37 | maxVersion = rs.getLong("VERSION"); 38 | } 39 | } finally { 40 | closeQuietly(conn, stmt, rs); 41 | } 42 | 43 | return maxVersion; 44 | } 45 | 46 | public static List<String> querySynonymRules(String dbUrl, long latestVersion) throws Exception { 47 | List<String> list = new ArrayList<>(); 48 | Connection conn = null; 49 | Statement stmt = null; 50 | ResultSet rs = null; 51 | try { 52 | Class.forName("com.mysql.jdbc.Driver"); 53 | conn = DriverManager.getConnection(dbUrl); 54 | stmt = conn.createStatement(); 55 | String sql; 56 | if (latestVersion > 0) { 57 | sql = "SELECT rule FROM dynamic_synonym_rule WHERE version <= " + latestVersion + " and status = 1"; 58 | } else { 59 | sql = "SELECT rule FROM dynamic_synonym_rule WHERE status = 1"; 60 | } 61 | 62 | rs = 
stmt.executeQuery(sql); 63 | while (rs.next()) { 64 | list.add(rs.getString("rule")); 65 | } 66 | } finally { 67 | closeQuietly(conn, stmt, rs); 68 | } 69 | 70 | return list; 71 | 72 | } 73 | 74 | private static void closeQuietly(Connection conn, Statement stmt, ResultSet rs) { 75 | if (rs != null) { 76 | try { 77 | rs.close(); 78 | } catch (SQLException e) { 79 | } 80 | } 81 | if (stmt != null) { 82 | try { 83 | stmt.close(); 84 | } catch (SQLException e) { 85 | } 86 | } 87 | if (conn != null) { 88 | try { 89 | conn.close(); 90 | } catch (SQLException e) { 91 | } 92 | } 93 | } 94 | 95 | } 96 | -------------------------------------------------------------------------------- /src/main/java/com/ginobefunny/elasticsearch/plugins/synonym/service/utils/Monitor.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Licensed under the Apache License, Version 2.0 (the "License"); 3 | * you may not use this file except in compliance with the License. 4 | * You may obtain a copy of the License at 5 | * 6 | * http://www.apache.org/licenses/LICENSE-2.0 7 | * 8 | * Unless required by applicable law or agreed to in writing, software 9 | * distributed under the License is distributed on an "AS IS" BASIS, 10 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the License for the specific language governing permissions and 12 | * limitations under the License. 13 | */ 14 | package com.ginobefunny.elasticsearch.plugins.synonym.service.utils; 15 | 16 | import com.ginobefunny.elasticsearch.plugins.synonym.service.Configuration; 17 | import com.ginobefunny.elasticsearch.plugins.synonym.service.SynonymRuleManager; 18 | import org.apache.logging.log4j.Logger; 19 | import org.elasticsearch.common.logging.ESLoggerFactory; 20 | 21 | /** 22 | * Created by ginozhang on 2017/1/12. 
23 | */ 24 | public class Monitor implements Runnable { 25 | 26 | private static final Logger LOGGER = ESLoggerFactory.getLogger(Monitor.class.getName()); 27 | 28 | private Configuration configuration; 29 | 30 | private long lastUpdateVersion; 31 | 32 | public Monitor(Configuration cfg, long initialVersion) { 33 | this.configuration = cfg; 34 | this.lastUpdateVersion = initialVersion; 35 | } 36 | 37 | @Override 38 | public void run() { 39 | try { 40 | long currentMaxVersion = JDBCUtils.queryMaxSynonymRuleVersion(configuration.getDBUrl()); 41 | if (currentMaxVersion > lastUpdateVersion) { 42 | if (SynonymRuleManager.getSingleton().reloadSynonymRule(currentMaxVersion)) { 43 | lastUpdateVersion = currentMaxVersion; 44 | } 45 | } 46 | } catch (Exception e) { 47 | LOGGER.error("Failed to reload synonym rule!", e); 48 | } 49 | } 50 | 51 | } 52 | -------------------------------------------------------------------------------- /src/main/resources/license-check/license_header.txt: -------------------------------------------------------------------------------- 1 | Licensed under the Apache License, Version 2.0 (the "License"); 2 | you may not use this file except in compliance with the License. 3 | You may obtain a copy of the License at 4 | 5 | http://www.apache.org/licenses/LICENSE-2.0 6 | 7 | Unless required by applicable law or agreed to in writing, software 8 | distributed under the License is distributed on an "AS IS" BASIS, 9 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 10 | See the License for the specific language governing permissions and 11 | limitations under the License. 
-------------------------------------------------------------------------------- /src/main/resources/license-check/license_header_definition.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | /* 5 | * 6 | */ 7 | (\s|\t)*/\*.*$ 8 | .*\*/(\s|\t)*$ 9 | false 10 | true 11 | 12 | -------------------------------------------------------------------------------- /src/main/resources/plugin-descriptor.properties: -------------------------------------------------------------------------------- 1 | # 'description': simple summary of the plugin 2 | description=${project.description} 3 | # 4 | # 'version': plugin's version 5 | version=${project.version} 6 | # 7 | # 'name': the plugin name 8 | name=${elasticsearch.plugin.name} 9 | 10 | ### mandatory elements for site plugins: 11 | # 12 | # 'site': set to true to indicate contents of the _site/ 13 | # directory in the root of the plugin should be served. 14 | site=${elasticsearch.plugin.site} 15 | # 16 | ### mandatory elements for jvm plugins : 17 | # 18 | # 'jvm': true if the 'classname' class should be loaded 19 | # from jar files in the root directory of the plugin. 20 | # Note that only jar files in the root directory are 21 | # added to the classpath for the plugin! If you need 22 | # other resources, package them into a resources jar. 23 | jvm=${elasticsearch.plugin.jvm} 24 | # 25 | # 'classname': the name of the class to load, fully-qualified. 26 | classname=${elasticsearch.plugin.classname} 27 | # 28 | # 'java.version' version of java the code is built against 29 | # use the system property java.specification.version 30 | # version string must be a sequence of nonnegative decimal integers 31 | # separated by "."'s and may have leading zeros 32 | java.version=${maven.compiler.target} 33 | # 34 | # 'elasticsearch.version' version of elasticsearch compiled against 35 | # You will have to release a new version of the plugin for each new 36 | # elasticsearch release. 
This version is checked when the plugin 37 | # is loaded so Elasticsearch will refuse to start in the presence of 38 | # plugins with the incorrect elasticsearch.version. 39 | elasticsearch.version=${elasticsearch.version} 40 | # 41 | ### deprecated elements for jvm plugins : 42 | # 43 | # 'isolated': true if the plugin should have its own classloader. 44 | # passing false is deprecated, and only intended to support plugins 45 | # that have hard dependencies against each other. If this is 46 | # not specified, then the plugin is isolated by default. 47 | isolated=${elasticsearch.plugin.isolated} 48 | # --------------------------------------------------------------------------------