├── .github ├── CONTRIBUTING.md ├── ISSUE_TEMPLATE.md └── PULL_REQUEST_TEMPLATE.md ├── .gitignore ├── .travis.yml ├── CHANGELOG.md ├── CONTRIBUTORS ├── Gemfile ├── LICENSE ├── NOTICE.TXT ├── README.md ├── Rakefile ├── docs └── index.asciidoc ├── lib └── logstash │ └── filters │ └── kv.rb ├── logstash-filter-kv.gemspec └── spec └── filters └── kv_spec.rb /.github/CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to Logstash 2 | 3 | All contributions are welcome: ideas, patches, documentation, bug reports, 4 | complaints, etc! 5 | 6 | Programming is not a required skill, and there are many ways to help out! 7 | It is more important to us that you are able to contribute. 8 | 9 | That said, some basic guidelines, which you are free to ignore :) 10 | 11 | ## Want to learn? 12 | 13 | Want to lurk about and see what others are doing with Logstash? 14 | 15 | * The irc channel (#logstash on irc.freenode.org) is a good place for this 16 | * The [forum](https://discuss.elastic.co/c/logstash) is also 17 | great for learning from others. 18 | 19 | ## Got Questions? 20 | 21 | Have a problem you want Logstash to solve for you? 22 | 23 | * You can ask a question in the [forum](https://discuss.elastic.co/c/logstash) 24 | * Alternately, you are welcome to join the IRC channel #logstash on 25 | irc.freenode.org and ask for help there! 26 | 27 | ## Have an Idea or Feature Request? 28 | 29 | * File a ticket on [GitHub](https://github.com/elastic/logstash/issues). Please remember that GitHub is used only for issues and feature requests. If you have a general question, the [forum](https://discuss.elastic.co/c/logstash) or IRC would be the best place to ask. 30 | 31 | ## Something Not Working? Found a Bug? 32 | 33 | If you think you found a bug, it probably is a bug. 34 | 35 | * If it is a general Logstash or a pipeline issue, file it in [Logstash GitHub](https://github.com/elasticsearch/logstash/issues) 36 | * If it is specific to a plugin, please file it in the respective repository under [logstash-plugins](https://github.com/logstash-plugins) 37 | * or ask the [forum](https://discuss.elastic.co/c/logstash). 38 | 39 | # Contributing Documentation and Code Changes 40 | 41 | If you have a bugfix or new feature that you would like to contribute to 42 | logstash, and you think it will take more than a few minutes to produce the fix 43 | (ie; write code), it is worth discussing the change with the Logstash users and developers first! You can reach us via [GitHub](https://github.com/elastic/logstash/issues), the [forum](https://discuss.elastic.co/c/logstash), or via IRC (#logstash on freenode irc) 44 | Please note that Pull Requests without tests will not be merged. If you would like to contribute but do not have experience with writing tests, please ping us on IRC/forum or create a PR and ask our help. 45 | 46 | ## Contributing to plugins 47 | 48 | Check our [documentation](https://www.elastic.co/guide/en/logstash/current/contributing-to-logstash.html) on how to contribute to plugins or write your own! It is super easy! 49 | 50 | ## Contribution Steps 51 | 52 | 1. Test your changes! [Run](https://github.com/elastic/logstash#testing) the test suite 53 | 2. Please make sure you have signed our [Contributor License 54 | Agreement](https://www.elastic.co/contributor-agreement/). We are not 55 | asking you to assign copyright to us, but to give us the right to distribute 56 | your code without restriction. 
We ask this of all contributors in order to 57 | assure our users of the origin and continuing existence of the code. You 58 | only need to sign the CLA once. 59 | 3. Send a pull request! Push your changes to your fork of the repository and 60 | [submit a pull 61 | request](https://help.github.com/articles/using-pull-requests). In the pull 62 | request, describe what your changes do and mention any bugs/issues related 63 | to the pull request. 64 | 65 | 66 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | Please post all product and debugging questions on our [forum](https://discuss.elastic.co/c/logstash). Your questions will reach our wider community members there, and if we confirm that there is a bug, then we can open a new issue here. 2 | 3 | For all general issues, please provide the following details for fast resolution: 4 | 5 | - Version: 6 | - Operating System: 7 | - Config File (if you have sensitive info, please remove it): 8 | - Sample Data: 9 | - Steps to Reproduce: 10 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | Thanks for contributing to Logstash! If you haven't already signed our CLA, here's a handy link: https://www.elastic.co/contributor-agreement/ 2 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.gem 2 | Gemfile.lock 3 | .bundle 4 | vendor 5 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | import: 2 | - logstash-plugins/.ci:travis/travis.yml@1.x -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | ## 4.7.1 2 | - Improved action call-out of log warning when this plugin cannot enforce timeouts [#93](https://github.com/logstash-plugins/logstash-filter-kv/pull/93) 3 | 4 | ## 4.7.0 5 | - Allow attaching multiple tags on failure. The `tag_on_failure` option now also supports an array of strings [#100](https://github.com/logstash-plugins/logstash-filter-kv/issues/100). Fixes [#92](https://github.com/logstash-plugins/logstash-filter-kv/issues/92) 6 | 7 | ## 4.6.0 8 | - Added `allow_empty_values` option [#72](https://github.com/logstash-plugins/logstash-filter-kv/pull/72) 9 | 10 | ## 4.5.0 11 | - Feat: check that target is set in ECS mode [#96](https://github.com/logstash-plugins/logstash-filter-kv/pull/96) 12 | 13 | ## 4.4.1 14 | - Fixed issue where a `field_split_pattern` containing a literal backslash failed to match correctly [#87](https://github.com/logstash-plugins/logstash-filter-kv/issues/87) 15 | 16 | ## 4.4.0 17 | - Changed timeout handling using the Timeout class [#84](https://github.com/logstash-plugins/logstash-filter-kv/pull/84) 18 | 19 | ## 4.3.3 20 | - Fixed asciidoc formatting in docs 21 | 22 | ## 4.3.2 23 | - Resolved potential race condition in pipeline shutdown where the timeout enforcer could be shut down while work was still in-flight, potentially leading to stuck pipelines. 
24 | - Resolved potential race condition in pipeline shutdown where work could be submitted to the timeout enforcer after it had been shutdown, potentially leading to stuck pipelines. 25 | 26 | ## 4.3.1 27 | - Fixed asciidoc formatting in documentation [#81](https://github.com/logstash-plugins/logstash-filter-kv/pull/81) 28 | 29 | ## 4.3.0 30 | - Added a timeout enforcer which prevents inputs that are pathological against the generated parser from blocking 31 | the pipeline. By default, timeout is a generous 30s, but can be configured or disabled entirely with the new 32 | `timeout_millis` and `tag_on_timeout` directives ([#79](https://github.com/logstash-plugins/logstash-filter-kv/pull/79)) 33 | - Made error-handling configurable with `tag_on_failure` directive. 34 | 35 | ## 4.2.1 36 | - Fixes performance regression introduced in 4.1.0 ([#70](https://github.com/logstash-plugins/logstash-filter-kv/issues/70)) 37 | 38 | ## 4.2.0 39 | - Added `whitespace => strict` mode, which allows the parser to behave more predictably when input is known to avoid unnecessary whitespace. 40 | - Added error handling, which tags the event with `_kv_filter_error` if an exception is raised while handling an event instead of allowing the plugin to crash. 41 | 42 | ## 4.1.2 43 | - bugfix: improves trim_key and trim_value to trim any _sequence_ of matching characters from the beginning and ends of the corresponding keys and values; a previous implementation limitited trim to a single character from each end, which was surprising. 44 | - bugfix: fixes issue where we can fail to correctly break up a sequence that includes a partially-quoted value followed by another fully-quoted value by slightly reducing greediness of quoted-value captures. 45 | 46 | ## 4.1.1 47 | - bugfix: correctly handle empty values between value separator and field separator (#58) 48 | 49 | ## 4.1.0 50 | - feature: add option to split fields and values using a regex pattern (#55) 51 | 52 | ## 4.0.3 53 | - Update gemspec summary 54 | 55 | ## 4.0.2 56 | - Fix some documentation issues 57 | 58 | ## 4.0.0 59 | - breaking: trim and trimkey options are renamed to trim_value and trim_key 60 | - bugfix: trim_value and trim_key options now remove only leading and trailing characters (#10) 61 | - feature: new options remove_char_value and remove_char_key to remove all characters from keys/values whatever their position 62 | 63 | ## 3.1.1 64 | - internal,deps: Relax constraint on logstash-core-plugin-api to >= 1.60 <= 2.99 65 | 66 | ## 3.1.0 67 | - Adds :transform_value and :transform_key options to lowercase/upcase or capitalize all keys/values 68 | ## 3.0.1 69 | - internal: Republish all the gems under jruby. 70 | 71 | ## 3.0.0 72 | - internal,deps: Update the plugin to the version 2.0 of the plugin api, this change is required for Logstash 5.0 compatibility. See https://github.com/elastic/logstash/issues/5141 73 | 74 | ## 2.0.7 75 | - feature: With include_brackets enabled, angle brackets (\< and \>) are treated the same as square brackets and parentheses, making it easy to parse strings like "a=\ c=\". 76 | - feature: An empty value_split option value now gives a useful error message. 
77 | 78 | ## 2.0.6 79 | - internal,deps: Depend on logstash-core-plugin-api instead of logstash-core, removing the need to mass update plugins on major releases of logstash 80 | 81 | ## 2.0.5 82 | - internal,deps: New dependency requirements for logstash-core for the 5.0 release 83 | 84 | ## 2.0.4 85 | - bugfix: Fields without values could claim the next field + value under certain circumstances. Reported in #22 86 | 87 | ## 2.0.3 88 | - bugfix: fixed short circuit expressions, some optimizations, added specs, PR #20 89 | - bugfix: fixed event field assignment, PR #21 90 | 91 | ## 2.0.0 92 | - internal: Plugins were updated to follow the new shutdown semantic, this mainly allows Logstash to instruct input plugins to terminate gracefully, 93 | instead of using Thread.raise on the plugins' threads. Ref: https://github.com/elastic/logstash/pull/3895 94 | - internal,deps: Dependency on logstash-core update to 2.0 95 | 96 | ## 1.1.0 97 | - feature: support spaces between key and value_split, 98 | support brackets and recursive option. 99 | -------------------------------------------------------------------------------- /CONTRIBUTORS: -------------------------------------------------------------------------------- 1 | The following is a list of people who have contributed ideas, code, bug 2 | reports, or in general have helped logstash along its way. 3 | 4 | Contributors: 5 | * Abhijeet Rastogi (shadyabhi) 6 | * Alex Wheeler (awheeler) 7 | * Andrej Olejník (olej-a) 8 | * Colin Surprenant (colinsurprenant) 9 | * James Turnbull (jamtur01) 10 | * Jordan Sissel (jordansissel) 11 | * Kurt Hurtado (kurtado) 12 | * Matt Dainty (bodgit) 13 | * Michael Richards (mjr5749) 14 | * Paul Fletcher-Hill (pfletch1023) 15 | * Philippe Weber (wiibaa) 16 | * Pier-Hugues Pellerin (ph) 17 | * R. Toma (rtoma) 18 | * Richard Pijnenburg (electrical) 19 | * Scott Bessler (scottbessler) 20 | * Suyog Rao (suyograo) 21 | * piavlo 22 | * Fabien Baligand (fbaligand) 23 | 24 | Note: If you've sent us patches, bug reports, or otherwise contributed to 25 | Logstash, and you aren't on the list above and want to be, please let us know 26 | and we'll make sure you're here. Contributions from folks like you are what make 27 | open source awesome. 28 | -------------------------------------------------------------------------------- /Gemfile: -------------------------------------------------------------------------------- 1 | source 'https://rubygems.org' 2 | 3 | gemspec 4 | 5 | logstash_path = ENV["LOGSTASH_PATH"] || "../../logstash" 6 | use_logstash_source = ENV["LOGSTASH_SOURCE"] && ENV["LOGSTASH_SOURCE"].to_s == "1" 7 | 8 | if Dir.exist?(logstash_path) && use_logstash_source 9 | gem 'logstash-core', :path => "#{logstash_path}/logstash-core" 10 | gem 'logstash-core-plugin-api', :path => "#{logstash_path}/logstash-core-plugin-api" 11 | end 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 
15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 
135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright 2020 Elastic and contributors 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 
194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /NOTICE.TXT: -------------------------------------------------------------------------------- 1 | Elasticsearch 2 | Copyright 2012-2015 Elasticsearch 3 | 4 | This product includes software developed by The Apache Software 5 | Foundation (http://www.apache.org/). -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Logstash Plugin 2 | 3 | [![Travis Build Status](https://travis-ci.com/logstash-plugins/logstash-filter-kv.svg)](https://travis-ci.com/logstash-plugins/logstash-filter-kv) 4 | 5 | This is a plugin for [Logstash](https://github.com/elastic/logstash). 6 | 7 | It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way. 8 | 9 | ## Documentation 10 | 11 | Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/). 12 | 13 | - For formatting code or config example, you can use the asciidoc `[source,ruby]` directive 14 | - For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide 15 | 16 | ## Need Help? 17 | 18 | Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum. 19 | 20 | ## Developing 21 | 22 | ### 1. Plugin Developement and Testing 23 | 24 | #### Code 25 | - To get started, you'll need JRuby with the Bundler gem installed. 26 | 27 | - Create a new plugin or clone and existing from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example). 28 | 29 | - Install dependencies 30 | ```sh 31 | bundle install 32 | ``` 33 | 34 | #### Test 35 | 36 | - Update your dependencies 37 | 38 | ```sh 39 | bundle install 40 | ``` 41 | 42 | - Run tests 43 | 44 | ```sh 45 | bundle exec rspec 46 | ``` 47 | 48 | ### 2. Running your unpublished Plugin in Logstash 49 | 50 | #### 2.1 Run in a local Logstash clone 51 | 52 | - Edit Logstash `Gemfile` and add the local plugin path, for example: 53 | ```ruby 54 | gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome" 55 | ``` 56 | - Install plugin 57 | ```sh 58 | # Logstash 2.3 and higher 59 | bin/logstash-plugin install --no-verify 60 | 61 | # Prior to Logstash 2.3 62 | bin/plugin install --no-verify 63 | 64 | ``` 65 | - Run Logstash with your plugin 66 | ```sh 67 | bin/logstash -e 'filter {awesome {}}' 68 | ``` 69 | At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash. 
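For this kv filter in particular, a quick smoke test of a local build (the command below is only illustrative) is to pipe a sample line through a minimal pipeline and inspect the parsed fields:
```sh
# assumes the local plugin is wired into the Logstash Gemfile as shown above
echo 'ip=1.2.3.4 error=REFUSED' | bin/logstash -e 'filter { kv { } } output { stdout { codec => rubydebug } }'
```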
70 | 71 | #### 2.2 Run in an installed Logstash 72 | 73 | You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory or you can build the gem and install it using: 74 | 75 | - Build your plugin gem 76 | ```sh 77 | gem build logstash-filter-awesome.gemspec 78 | ``` 79 | - Install the plugin from the Logstash home 80 | ```sh 81 | # Logstash 2.3 and higher 82 | bin/logstash-plugin install --no-verify 83 | 84 | # Prior to Logstash 2.3 85 | bin/plugin install --no-verify 86 | 87 | ``` 88 | - Start Logstash and proceed to test the plugin 89 | 90 | ## Contributing 91 | 92 | All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin. 93 | 94 | Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here. 95 | 96 | It is more important to the community that you are able to contribute. 97 | 98 | For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file. -------------------------------------------------------------------------------- /Rakefile: -------------------------------------------------------------------------------- 1 | @files=[] 2 | 3 | task :default do 4 | system("rake -T") 5 | end 6 | 7 | require "logstash/devutils/rake" 8 | -------------------------------------------------------------------------------- /docs/index.asciidoc: -------------------------------------------------------------------------------- 1 | :plugin: kv 2 | :type: filter 3 | 4 | /////////////////////////////////////////// 5 | START - GENERATED VARIABLES, DO NOT EDIT! 6 | /////////////////////////////////////////// 7 | :version: %VERSION% 8 | :release_date: %RELEASE_DATE% 9 | :changelog_url: %CHANGELOG_URL% 10 | :include_path: ../../../../logstash/docs/include 11 | /////////////////////////////////////////// 12 | END - GENERATED VARIABLES, DO NOT EDIT! 13 | /////////////////////////////////////////// 14 | 15 | [id="plugins-{type}s-{plugin}"] 16 | 17 | === Kv filter plugin 18 | 19 | include::{include_path}/plugin_header.asciidoc[] 20 | 21 | ==== Description 22 | 23 | This filter helps automatically parse messages (or specific event fields) 24 | which are of the `foo=bar` variety. 25 | 26 | For example, if you have a log message which contains `ip=1.2.3.4 27 | error=REFUSED`, you can parse those automatically by configuring: 28 | [source,ruby] 29 | filter { 30 | kv { } 31 | } 32 | 33 | The above will result in a message of `ip=1.2.3.4 error=REFUSED` having 34 | the fields: 35 | 36 | * `ip: 1.2.3.4` 37 | * `error: REFUSED` 38 | 39 | This is great for postfix, iptables, and other types of logs that 40 | tend towards `key=value` syntax. 41 | 42 | You can configure any arbitrary strings to split your data on, 43 | in case your data is not structured using `=` signs and whitespace. 44 | For example, this filter can also be used to parse query parameters like 45 | `foo=bar&baz=fizz` by setting the `field_split` parameter to `&`. 46 | 47 | [id="plugins-{type}s-{plugin}-ecs_metadata"] 48 | ==== Event Metadata and the Elastic Common Schema (ECS) 49 | 50 | The plugin behaves the same regardless of ECS compatibility, except giving a warning when ECS is enabled and `target` isn't set. 51 | 52 | TIP: Set the `target` option to avoid potential schema conflicts. 
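For example, a configuration along the following lines (the target name `kv_pairs` is only an illustration) keeps all parsed keys under one field instead of writing them to the root of the event:
[source,ruby]
    filter {
      kv {
        target => "kv_pairs"
      }
    }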
53 |
54 | [id="plugins-{type}s-{plugin}-options"]
55 | ==== Kv Filter Configuration Options
56 |
57 | This plugin supports the following configuration options plus the <<plugins-{type}s-{plugin}-common-options>> described later.
58 |
59 | [cols="<,<,<",options="header",]
60 | |=======================================================================
61 | |Setting |Input type|Required
62 | | <<plugins-{type}s-{plugin}-allow_duplicate_values>> |<<boolean,boolean>>|No
63 | | <<plugins-{type}s-{plugin}-allow_empty_values>> |<<boolean,boolean>>|No
64 | | <<plugins-{type}s-{plugin}-default_keys>> |<<hash,hash>>|No
65 | | <<plugins-{type}s-{plugin}-ecs_compatibility>> | <<string,string>>|No
66 | | <<plugins-{type}s-{plugin}-exclude_keys>> |<<array,array>>|No
67 | | <<plugins-{type}s-{plugin}-field_split>> |<<string,string>>|No
68 | | <<plugins-{type}s-{plugin}-field_split_pattern>> |<<string,string>>|No
69 | | <<plugins-{type}s-{plugin}-include_brackets>> |<<boolean,boolean>>|No
70 | | <<plugins-{type}s-{plugin}-include_keys>> |<<array,array>>|No
71 | | <<plugins-{type}s-{plugin}-prefix>> |<<string,string>>|No
72 | | <<plugins-{type}s-{plugin}-recursive>> |<<boolean,boolean>>|No
73 | | <<plugins-{type}s-{plugin}-remove_char_key>> |<<string,string>>|No
74 | | <<plugins-{type}s-{plugin}-remove_char_value>> |<<string,string>>|No
75 | | <<plugins-{type}s-{plugin}-source>> |<<string,string>>|No
76 | | <<plugins-{type}s-{plugin}-tag_on_failure>> |<<array,array>>|No
77 | | <<plugins-{type}s-{plugin}-tag_on_timeout>> |<<string,string>>|No
78 | | <<plugins-{type}s-{plugin}-target>> |<<string,string>>|No
79 | | <<plugins-{type}s-{plugin}-timeout_millis>> |<<number,number>>|No
80 | | <<plugins-{type}s-{plugin}-transform_key>> |<<string,string>>, one of `["lowercase", "uppercase", "capitalize"]`|No
81 | | <<plugins-{type}s-{plugin}-transform_value>> |<<string,string>>, one of `["lowercase", "uppercase", "capitalize"]`|No
82 | | <<plugins-{type}s-{plugin}-trim_key>> |<<string,string>>|No
83 | | <<plugins-{type}s-{plugin}-trim_value>> |<<string,string>>|No
84 | | <<plugins-{type}s-{plugin}-value_split>> |<<string,string>>|No
85 | | <<plugins-{type}s-{plugin}-value_split_pattern>> |<<string,string>>|No
86 | | <<plugins-{type}s-{plugin}-whitespace>> |<<string,string>>, one of `["strict", "lenient"]`|No
87 | |=======================================================================
88 |
89 | Also see <<plugins-{type}s-{plugin}-common-options>> for a list of options supported by all
90 | filter plugins.
91 |
92 | &nbsp;
93 |
94 | [id="plugins-{type}s-{plugin}-allow_duplicate_values"]
95 | ===== `allow_duplicate_values`
96 |
97 | * Value type is <<boolean,boolean>>
98 | * Default value is `true`
99 |
100 | A bool option for removing duplicate key/value pairs. When set to false, only
101 | one unique key/value pair will be preserved.
102 |
103 | For example, consider a source like `from=me from=me`. `[from]` will map to
104 | an Array with two elements: `["me", "me"]`. To only keep unique key/value pairs,
105 | you could use this configuration:
106 | [source,ruby]
107 |     filter {
108 |       kv {
109 |         allow_duplicate_values => false
110 |       }
111 |     }
112 |
113 | [id="plugins-{type}s-{plugin}-allow_empty_values"]
114 | ===== `allow_empty_values`
115 |
116 | * Value type is <<boolean,boolean>>
117 | * Default value is `false`
118 |
119 | A bool option for explicitly including empty values.
120 | When set to true, empty values will be added to the event.
121 |
122 | NOTE: Parsing empty values typically requires <<plugins-{type}s-{plugin}-whitespace,`whitespace => strict`>>.
123 |
124 | [id="plugins-{type}s-{plugin}-default_keys"]
125 | ===== `default_keys`
126 |
127 | * Value type is <<hash,hash>>
128 | * Default value is `{}`
129 |
130 | A hash specifying the default keys and their values which should be added to the event
131 | in case these keys do not exist in the source field being parsed.
132 | [source,ruby]
133 |     filter {
134 |       kv {
135 |         default_keys => [ "from", "logstash@example.com",
136 |                           "to", "default@dev.null" ]
137 |       }
138 |     }
139 |
140 | [id="plugins-{type}s-{plugin}-ecs_compatibility"]
141 | ===== `ecs_compatibility`
142 |
143 | * Value type is <<string,string>>
144 | * Supported values are:
145 | ** `disabled`: does not use ECS-compatible field names
146 | ** `v1`: Elastic Common Schema compliant behavior (warns when `target` isn't set)
147 |
148 | Controls this plugin's compatibility with the {ecs-ref}[Elastic Common Schema (ECS)].
149 | See <<plugins-{type}s-{plugin}-ecs_metadata>> for detailed information.
150 |
151 | [id="plugins-{type}s-{plugin}-exclude_keys"]
152 | ===== `exclude_keys`
153 |
154 | * Value type is <<array,array>>
155 | * Default value is `[]`
156 |
157 | An array specifying the parsed keys which should not be added to the event.
158 | By default no keys will be excluded.
159 |
160 | For example, consider a source like `Hey, from=, to=def foo=bar`.
161 | To exclude `from` and `to`, but retain the `foo` key, you could use this configuration:
162 | [source,ruby]
163 |     filter {
164 |       kv {
165 |         exclude_keys => [ "from", "to" ]
166 |       }
167 |     }
168 |
169 | [id="plugins-{type}s-{plugin}-field_split"]
170 | ===== `field_split`
171 |
172 | * Value type is <<string,string>>
173 | * Default value is `" "`
174 |
175 | A string of characters to use as single-character field delimiters for parsing out key-value pairs.
176 |
177 | These characters form a regex character class and thus you must escape special regex
178 | characters like `[` or `]` using `\`.
179 |
180 | *Example with URL Query Strings*
181 |
182 | For example, to split out the args from a url query string such as
183 | `?pin=12345~0&d=123&e=foo@bar.com&oq=bobo&ss=12345`:
184 | [source,ruby]
185 |     filter {
186 |       kv {
187 |         field_split => "&?"
188 |       }
189 |     }
190 |
191 | The above splits on both `&` and `?` characters, giving you the following
192 | fields:
193 |
194 | * `pin: 12345~0`
195 | * `d: 123`
196 | * `e: foo@bar.com`
197 | * `oq: bobo`
198 | * `ss: 12345`
199 |
200 | [id="plugins-{type}s-{plugin}-field_split_pattern"]
201 | ===== `field_split_pattern`
202 |
203 | * Value type is <<string,string>>
204 | * There is no default value for this setting.
205 |
206 | A regex expression to use as field delimiter for parsing out key-value pairs.
207 | Useful to define multi-character field delimiters.
208 | Setting the `field_split_pattern` option will take precedence over the `field_split` option.
209 |
210 | Note that you should avoid using captured groups in your regex and you should be
211 | cautious with lookaheads or lookbehinds and positional anchors.
212 |
213 | For example, to split fields on a repetition of one or more colons
214 | `k1=v1:k2=v2::k3=v3:::k4=v4`:
215 | [source,ruby]
216 |     filter { kv { field_split_pattern => ":+" } }
217 |
218 | To split fields on a regex character that needs escaping, like the plus sign in
219 | `k1=v1++k2=v2++k3=v3++k4=v4`:
220 | [source,ruby]
221 |     filter { kv { field_split_pattern => "\\+\\+" } }
222 |
223 | [id="plugins-{type}s-{plugin}-include_brackets"]
224 | ===== `include_brackets`
225 |
226 | * Value type is <<boolean,boolean>>
227 | * Default value is `true`
228 |
229 | A boolean specifying whether to treat square brackets, angle brackets,
230 | and parentheses as value "wrappers" that should be removed from the value.
231 | [source,ruby]
232 |     filter {
233 |       kv {
234 |         include_brackets => true
235 |       }
236 |     }
237 |
238 | For example, the result of this line:
239 | `bracketsone=(hello world) bracketstwo=[hello world] bracketsthree=<hello world>`
240 |
241 | will be:
242 |
243 | * bracketsone: hello world
244 | * bracketstwo: hello world
245 | * bracketsthree: hello world
246 |
247 | instead of:
248 |
249 | * bracketsone: (hello
250 | * bracketstwo: [hello
251 | * bracketsthree: <hello
252 |
253 |
254 | [id="plugins-{type}s-{plugin}-include_keys"]
255 | ===== `include_keys`
256 |
257 | * Value type is <<array,array>>
258 | * Default value is `[]`
259 |
260 | An array specifying the parsed keys which should be added to the event.
261 | By default all keys will be added.
262 |
263 | For example, consider a source like `Hey, from=, to=def foo=bar`.
264 | To include `from` and `to`, but exclude the `foo` key, you could use this configuration:
265 | [source,ruby]
266 |     filter {
267 |       kv {
268 |         include_keys => [ "from", "to" ]
269 |       }
270 |     }
271 |
272 | [id="plugins-{type}s-{plugin}-prefix"]
273 | ===== `prefix`
274 |
275 | * Value type is <<string,string>>
276 | * Default value is `""`
277 |
278 | A string to prepend to all of the extracted keys.
279 |
280 | For example, to prepend arg_ to all keys:
281 | [source,ruby]
282 |     filter { kv { prefix => "arg_" } }
283 |
284 | [id="plugins-{type}s-{plugin}-recursive"]
285 | ===== `recursive`
286 |
287 | * Value type is <<boolean,boolean>>
288 | * Default value is `false`
289 |
290 | A boolean specifying whether to drill down into values
291 | and recursively get more key-value pairs from it.
292 | The extra key-value pairs will be stored as subkeys of the root key.
293 |
294 | Default is not to recurse into values.
295 | [source,ruby]
296 |     filter {
297 |       kv {
298 |         recursive => "true"
299 |       }
300 |     }
301 |
302 |
303 | [id="plugins-{type}s-{plugin}-remove_char_key"]
304 | ===== `remove_char_key`
305 |
306 | * Value type is <<string,string>>
307 | * There is no default value for this setting.
308 |
309 | A string of characters to remove from the key.
310 |
311 | These characters form a regex character class and thus you must escape special regex
312 | characters like `[` or `]` using `\`.
313 |
314 | Contrary to the trim option, all characters are removed from the key, whatever their position.
315 |
316 | For example, to remove `<` `>` `[` `]` and `,` characters from keys:
317 | [source,ruby]
318 |     filter {
319 |       kv {
320 |         remove_char_key => "<>\[\],"
321 |       }
322 |     }
323 |
324 | [id="plugins-{type}s-{plugin}-remove_char_value"]
325 | ===== `remove_char_value`
326 |
327 | * Value type is <<string,string>>
328 | * There is no default value for this setting.
329 |
330 | A string of characters to remove from the value.
331 |
332 | These characters form a regex character class and thus you must escape special regex
333 | characters like `[` or `]` using `\`.
334 |
335 | Contrary to the trim option, all characters are removed from the value, whatever their position.
336 |
337 | For example, to remove `<`, `>`, `[`, `]` and `,` characters from values:
338 | [source,ruby]
339 |     filter {
340 |       kv {
341 |         remove_char_value => "<>\[\],"
342 |       }
343 |     }
344 |
345 | [id="plugins-{type}s-{plugin}-source"]
346 | ===== `source`
347 |
348 | * Value type is <<string,string>>
349 | * Default value is `"message"`
350 |
351 | The field to perform `key=value` searching on.
352 |
353 | For example, to process the `not_the_message` field:
354 | [source,ruby]
355 |     filter { kv { source => "not_the_message" } }
356 |
357 | [id="plugins-{type}s-{plugin}-target"]
358 | ===== `target`
359 |
360 | * Value type is <<string,string>>
361 | * There is no default value for this setting.
362 |
363 | The name of the container to put all of the key-value pairs into.
364 |
365 | If this setting is omitted, fields will be written to the root of the
366 | event, as individual fields.
367 |
368 | For example, to place all keys into the event field kv:
369 | [source,ruby]
370 |     filter { kv { target => "kv" } }
371 |
372 | [id="plugins-{type}s-{plugin}-tag_on_failure"]
373 | ===== `tag_on_failure`
374 |
375 | * Value type is <<array,array>>
376 | * The default value for this setting is `["_kv_filter_error"]`.
377 |
378 | When a kv operation causes a runtime exception to be thrown within the plugin,
379 | the operation is safely aborted without crashing the plugin, and the event is
380 | tagged with the provided values.
381 |
382 | [id="plugins-{type}s-{plugin}-tag_on_timeout"]
383 | ===== `tag_on_timeout`
384 |
385 | * Value type is <<string,string>>
386 | * The default value for this setting is `_kv_filter_timeout`.
387 |
388 | When timeouts are enabled and a kv operation is aborted, the event is tagged
389 | with the provided value (see: <<plugins-{type}s-{plugin}-timeout_millis>>).
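For example, a configuration along these lines (the timeout value and tag name are only illustrative) lowers the timeout and applies a custom tag when parsing is aborted:
[source,ruby]
    filter {
      kv {
        timeout_millis => 5000
        tag_on_timeout => "_kv_parse_timeout"
      }
    }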
390 |
391 | [id="plugins-{type}s-{plugin}-timeout_millis"]
392 | ===== `timeout_millis`
393 |
394 | * Value type is <<number,number>>
395 | * The default value for this setting is 30000 (30 seconds).
396 | * Set to zero (`0`) to disable timeouts
397 |
398 | Timeouts provide a safeguard against inputs that are pathological against the
399 | regular expressions that are used to extract key/value pairs. When parsing an
400 | event exceeds this threshold the operation is aborted and the event is tagged
401 | in order to prevent the operation from blocking the pipeline
402 | (see: <<plugins-{type}s-{plugin}-tag_on_timeout>>).
403 |
404 | [id="plugins-{type}s-{plugin}-transform_key"]
405 | ===== `transform_key`
406 |
407 | * Value can be any of: `lowercase`, `uppercase`, `capitalize`
408 | * There is no default value for this setting.
409 |
410 | Transform keys to lower case, upper case or capitals.
411 |
412 | For example, to lowercase all keys:
413 | [source,ruby]
414 |     filter {
415 |       kv {
416 |         transform_key => "lowercase"
417 |       }
418 |     }
419 |
420 | [id="plugins-{type}s-{plugin}-transform_value"]
421 | ===== `transform_value`
422 |
423 | * Value can be any of: `lowercase`, `uppercase`, `capitalize`
424 | * There is no default value for this setting.
425 |
426 | Transform values to lower case, upper case or capitals.
427 |
428 | For example, to capitalize all values:
429 | [source,ruby]
430 |     filter {
431 |       kv {
432 |         transform_value => "capitalize"
433 |       }
434 |     }
435 |
436 | [id="plugins-{type}s-{plugin}-trim_key"]
437 | ===== `trim_key`
438 |
439 | * Value type is <<string,string>>
440 | * There is no default value for this setting.
441 |
442 | A string of characters to trim from the key. This is useful if your
443 | keys are wrapped in brackets or start with space.
444 |
445 | These characters form a regex character class and thus you must escape special regex
446 | characters like `[` or `]` using `\`.
447 |
448 | Only leading and trailing characters are trimmed from the key.
449 |
450 | For example, to trim `<` `>` `[` `]` and `,` characters from keys:
451 | [source,ruby]
452 |     filter {
453 |       kv {
454 |         trim_key => "<>\[\],"
455 |       }
456 |     }
457 |
458 | [id="plugins-{type}s-{plugin}-trim_value"]
459 | ===== `trim_value`
460 |
461 | * Value type is <<string,string>>
462 | * There is no default value for this setting.
463 |
464 |
465 | A string of characters to trim from the value. This is useful if your
466 | values are wrapped in brackets or are terminated with commas (like postfix
467 | logs).
468 |
469 | These characters form a regex character class and thus you must escape special regex
470 | characters like `[` or `]` using `\`.
471 |
472 | Only leading and trailing characters are trimmed from the value.
473 |
474 | For example, to trim `<`, `>`, `[`, `]` and `,` characters from values:
475 | [source,ruby]
476 |     filter {
477 |       kv {
478 |         trim_value => "<>\[\],"
479 |       }
480 |     }
481 |
482 | [id="plugins-{type}s-{plugin}-value_split"]
483 | ===== `value_split`
484 |
485 | * Value type is <<string,string>>
486 | * Default value is `"="`
487 |
488 | A non-empty string of characters to use as single-character value delimiters for parsing out key-value pairs.
489 |
490 | These characters form a regex character class and thus you must escape special regex
491 | characters like `[` or `]` using `\`.
492 | 493 | For example, to identify key-values such as 494 | `key1:value1 key2:value2`: 495 | [source,ruby] 496 | filter { kv { value_split => ":" } } 497 | 498 | 499 | [id="plugins-{type}s-{plugin}-value_split_pattern"] 500 | ===== `value_split_pattern` 501 | 502 | * Value type is <> 503 | * There is no default value for this setting. 504 | 505 | A regex expression to use as value delimiter for parsing out key-value pairs. 506 | Useful to define multi-character value delimiters. 507 | Setting the `value_split_pattern` options will take precedence over the `value_split option`. 508 | 509 | Note that you should avoid using captured groups in your regex and you should be 510 | cautious with lookaheads or lookbehinds and positional anchors. 511 | 512 | See `field_split_pattern` for examples. 513 | 514 | [id="plugins-{type}s-{plugin}-whitespace"] 515 | ===== `whitespace` 516 | 517 | * Value can be any of: `lenient`, `strict` 518 | * Default value is `lenient` 519 | 520 | An option specifying whether to be _lenient_ or _strict_ with the acceptance of unnecessary 521 | whitespace surrounding the configured value-split sequence. 522 | 523 | By default the plugin is run in `lenient` mode, which ignores spaces that occur before or 524 | after the value-splitter. While this allows the plugin to make reasonable guesses with most 525 | input, in some situations it may be too lenient. 526 | 527 | You may want to enable `whitespace => strict` mode if you have control of the input data and 528 | can guarantee that no extra spaces are added surrounding the pattern you have defined for 529 | splitting values. Doing so will ensure that a _field-splitter_ sequence immediately following 530 | a _value-splitter_ will be interpreted as an empty field. 531 | 532 | [id="plugins-{type}s-{plugin}-common-options"] 533 | include::{include_path}/{type}.asciidoc[] 534 | -------------------------------------------------------------------------------- /lib/logstash/filters/kv.rb: -------------------------------------------------------------------------------- 1 | # encoding: utf-8 2 | 3 | require "logstash/filters/base" 4 | require "logstash/namespace" 5 | require 'logstash/plugin_mixins/ecs_compatibility_support' 6 | require 'logstash/plugin_mixins/ecs_compatibility_support/target_check' 7 | require 'logstash/plugin_mixins/validator_support/field_reference_validation_adapter' 8 | require "timeout" 9 | 10 | # This filter helps automatically parse messages (or specific event fields) 11 | # which are of the `foo=bar` variety. 12 | # 13 | # For example, if you have a log message which contains `ip=1.2.3.4 14 | # error=REFUSED`, you can parse those automatically by configuring: 15 | # [source,ruby] 16 | # filter { 17 | # kv { } 18 | # } 19 | # 20 | # The above will result in a message of `ip=1.2.3.4 error=REFUSED` having 21 | # the fields: 22 | # 23 | # * `ip: 1.2.3.4` 24 | # * `error: REFUSED` 25 | # 26 | # This is great for postfix, iptables, and other types of logs that 27 | # tend towards `key=value` syntax. 28 | # 29 | # You can configure any arbitrary strings to split your data on, 30 | # in case your data is not structured using `=` signs and whitespace. 31 | # For example, this filter can also be used to parse query parameters like 32 | # `foo=bar&baz=fizz` by setting the `field_split` parameter to `&`. 
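# As a brief sketch (the field names `request_args` and `query` below are purely
# illustrative), a query-string-style input could be parsed into its own subtree with:
# [source,ruby]
#     filter {
#       kv {
#         source      => "request_args"
#         field_split => "&"
#         target      => "query"
#       }
#     }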
33 | class LogStash::Filters::KV < LogStash::Filters::Base 34 | config_name "kv" 35 | 36 | include LogStash::PluginMixins::ECSCompatibilitySupport 37 | include LogStash::PluginMixins::ECSCompatibilitySupport::TargetCheck 38 | 39 | extend LogStash::PluginMixins::ValidatorSupport::FieldReferenceValidationAdapter 40 | 41 | # Constants used for transform check 42 | TRANSFORM_LOWERCASE_KEY = "lowercase" 43 | TRANSFORM_UPPERCASE_KEY = "uppercase" 44 | TRANSFORM_CAPITALIZE_KEY = "capitalize" 45 | 46 | # A string of characters to trim from the value. This is useful if your 47 | # values are wrapped in brackets or are terminated with commas (like postfix 48 | # logs). 49 | # 50 | # These characters form a regex character class and thus you must escape special regex 51 | # characters like `[` or `]` using `\`. 52 | # 53 | # Only leading and trailing characters are trimed from the value. 54 | # 55 | # For example, to trim `<`, `>`, `[`, `]` and `,` characters from values: 56 | # [source,ruby] 57 | # filter { 58 | # kv { 59 | # trim_value => "<>\[\]," 60 | # } 61 | # } 62 | config :trim_value, :validate => :string 63 | 64 | # A string of characters to trim from the key. This is useful if your 65 | # keys are wrapped in brackets or start with space. 66 | # 67 | # These characters form a regex character class and thus you must escape special regex 68 | # characters like `[` or `]` using `\`. 69 | # 70 | # Only leading and trailing characters are trimmed from the key. 71 | # 72 | # For example, to trim `<` `>` `[` `]` and `,` characters from keys: 73 | # [source,ruby] 74 | # filter { 75 | # kv { 76 | # trim_key => "<>\[\]," 77 | # } 78 | # } 79 | config :trim_key, :validate => :string 80 | 81 | # A string of characters to remove from the value. 82 | # 83 | # These characters form a regex character class and thus you must escape special regex 84 | # characters like `[` or `]` using `\`. 85 | # 86 | # Contrary to trim option, all characters are removed from the value, whatever their position. 87 | # 88 | # For example, to remove `<`, `>`, `[`, `]` and `,` characters from values: 89 | # [source,ruby] 90 | # filter { 91 | # kv { 92 | # remove_char_value => "<>\[\]," 93 | # } 94 | # } 95 | config :remove_char_value, :validate => :string 96 | 97 | # A string of characters to remove from the key. 98 | # 99 | # These characters form a regex character class and thus you must escape special regex 100 | # characters like `[` or `]` using `\`. 101 | # 102 | # Contrary to trim option, all characters are removed from the key, whatever their position. 103 | # 104 | # For example, to remove `<` `>` `[` `]` and `,` characters from keys: 105 | # [source,ruby] 106 | # filter { 107 | # kv { 108 | # remove_char_key => "<>\[\]," 109 | # } 110 | # } 111 | config :remove_char_key, :validate => :string 112 | 113 | # Transform values to lower case, upper case or capitals. 114 | # 115 | # For example, to capitalize all values: 116 | # [source,ruby] 117 | # filter { 118 | # kv { 119 | # transform_value => "capitalize" 120 | # } 121 | # } 122 | config :transform_value, :validate => [TRANSFORM_LOWERCASE_KEY, TRANSFORM_UPPERCASE_KEY, TRANSFORM_CAPITALIZE_KEY] 123 | 124 | # Transform keys to lower case, upper case or capitals. 
125 | # 126 | # For example, to lowercase all keys: 127 | # [source,ruby] 128 | # filter { 129 | # kv { 130 | # transform_key => "lowercase" 131 | # } 132 | # } 133 | config :transform_key, :validate => [TRANSFORM_LOWERCASE_KEY, TRANSFORM_UPPERCASE_KEY, TRANSFORM_CAPITALIZE_KEY] 134 | 135 | # A string of characters to use as single-character field delimiters for parsing out key-value pairs. 136 | # 137 | # These characters form a regex character class and thus you must escape special regex 138 | # characters like `[` or `]` using `\`. 139 | # 140 | # #### Example with URL Query Strings 141 | # 142 | # For example, to split out the args from a url query string such as 143 | # `?pin=12345~0&d=123&e=foo@bar.com&oq=bobo&ss=12345`: 144 | # [source,ruby] 145 | # filter { 146 | # kv { 147 | # field_split => "&?" 148 | # } 149 | # } 150 | # 151 | # The above splits on both `&` and `?` characters, giving you the following 152 | # fields: 153 | # 154 | # * `pin: 12345~0` 155 | # * `d: 123` 156 | # * `e: foo@bar.com` 157 | # * `oq: bobo` 158 | # * `ss: 12345` 159 | config :field_split, :validate => :string, :default => ' ' 160 | 161 | # A regex expression to use as field delimiter for parsing out key-value pairs. 162 | # Useful to define multi-character field delimiters. 163 | # Setting the field_split_pattern options will take precedence over the field_split option. 164 | # 165 | # Note that you should avoid using captured groups in your regex and you should be 166 | # cautious with lookaheads or lookbehinds and positional anchors. 167 | # 168 | # For example, to split fields on a repetition of one or more colons 169 | # `k1=v1:k2=v2::k3=v3:::k4=v4`: 170 | # [source,ruby] 171 | # filter { kv { field_split_pattern => ":+" } } 172 | # 173 | # To split fields on a regex character that need escaping like the plus sign 174 | # `k1=v1++k2=v2++k3=v3++k4=v4`: 175 | # [source,ruby] 176 | # filter { kv { field_split_pattern => "\\+\\+" } } 177 | config :field_split_pattern, :validate => :string 178 | 179 | # A non-empty string of characters to use as single-character value delimiters for parsing out key-value pairs. 180 | # 181 | # These characters form a regex character class and thus you must escape special regex 182 | # characters like `[` or `]` using `\`. 183 | # 184 | # For example, to identify key-values such as 185 | # `key1:value1 key2:value2`: 186 | # [source,ruby] 187 | # filter { kv { value_split => ":" } } 188 | config :value_split, :validate => :string, :default => '=' 189 | 190 | # A regex expression to use as value delimiter for parsing out key-value pairs. 191 | # Useful to define multi-character value delimiters. 192 | # Setting the value_split_pattern options will take precedence over the value_split option. 193 | # 194 | # Note that you should avoid using captured groups in your regex and you should be 195 | # cautious with lookaheads or lookbehinds and positional anchors. 196 | # 197 | # See field_split_pattern for examples. 198 | config :value_split_pattern, :validate => :string 199 | 200 | # A string to prepend to all of the extracted keys. 
201 | # 202 | # For example, to prepend arg_ to all keys: 203 | # [source,ruby] 204 | # filter { kv { prefix => "arg_" } } 205 | config :prefix, :validate => :string, :default => '' 206 | 207 | # The field to perform `key=value` searching on 208 | # 209 | # For example, to process the `not_the_message` field: 210 | # [source,ruby] 211 | # filter { kv { source => "not_the_message" } } 212 | config :source, :validate => :field_reference, :default => "message" 213 | 214 | # The name of the container to put all of the key-value pairs into. 215 | # 216 | # If this setting is omitted, fields will be written to the root of the 217 | # event, as individual fields. 218 | # 219 | # For example, to place all keys into the event field kv: 220 | # [source,ruby] 221 | # filter { kv { target => "kv" } } 222 | config :target, :validate => :field_reference 223 | 224 | # An array specifying the parsed keys which should be added to the event. 225 | # By default all keys will be added. 226 | # 227 | # For example, consider a source like `Hey, from=, to=def foo=bar`. 228 | # To include `from` and `to`, but exclude the `foo` key, you could use this configuration: 229 | # [source,ruby] 230 | # filter { 231 | # kv { 232 | # include_keys => [ "from", "to" ] 233 | # } 234 | # } 235 | config :include_keys, :validate => :array, :default => [] 236 | 237 | # An array specifying the parsed keys which should not be added to the event. 238 | # By default no keys will be excluded. 239 | # 240 | # For example, consider a source like `Hey, from=, to=def foo=bar`. 241 | # To exclude `from` and `to`, but retain the `foo` key, you could use this configuration: 242 | # [source,ruby] 243 | # filter { 244 | # kv { 245 | # exclude_keys => [ "from", "to" ] 246 | # } 247 | # } 248 | config :exclude_keys, :validate => :array, :default => [] 249 | 250 | # A hash specifying the default keys and their values which should be added to the event 251 | # in case these keys do not exist in the source field being parsed. 252 | # [source,ruby] 253 | # filter { 254 | # kv { 255 | # default_keys => [ "from", "logstash@example.com", 256 | # "to", "default@dev.null" ] 257 | # } 258 | # } 259 | config :default_keys, :validate => :hash, :default => {} 260 | 261 | # A bool option for removing duplicate key/value pairs. When set to false, only 262 | # one unique key/value pair will be preserved. 263 | # 264 | # For example, consider a source like `from=me from=me`. `[from]` will map to 265 | # an Array with two elements: `["me", "me"]`. To only keep unique key/value pairs, 266 | # you could use this configuration: 267 | # [source,ruby] 268 | # filter { 269 | # kv { 270 | # allow_duplicate_values => false 271 | # } 272 | # } 273 | config :allow_duplicate_values, :validate => :boolean, :default => true 274 | 275 | # A bool option for keeping empty or nil values. 276 | config :allow_empty_values, :validate => :boolean, :default => false 277 | 278 | # A boolean specifying whether to treat square brackets, angle brackets, 279 | # and parentheses as value "wrappers" that should be removed from the value. 
280 | # [source,ruby]
281 | #     filter {
282 | #       kv {
283 | #         include_brackets => true
284 | #       }
285 | #     }
286 | #
287 | # For example, the result of this line:
288 | # `bracketsone=(hello world) bracketstwo=[hello world] bracketsthree=<hello world>`
289 | #
290 | # will be:
291 | #
292 | # * bracketsone: hello world
293 | # * bracketstwo: hello world
294 | # * bracketsthree: hello world
295 | #
296 | # instead of:
297 | #
298 | # * bracketsone: (hello
299 | # * bracketstwo: [hello
300 | # * bracketsthree: <hello
301 | #
302 | config :include_brackets, :validate => :boolean, :default => true
303 |
304 | # A boolean specifying whether to drill down into values
305 | # and recursively get more key-value pairs from them.
306 | # The extra key-value pairs will be stored as subkeys of the root key.
307 | #
308 | # The default is not to recurse into values.
309 | # [source,ruby]
310 | #     filter {
311 | #       kv {
312 | #         recursive => "true"
313 | #       }
314 | #     }
315 | #
316 | config :recursive, :validate => :boolean, :default => false
317 |
318 | # An option specifying whether to be _lenient_ or _strict_ with the acceptance of unnecessary
319 | # whitespace surrounding the configured value-split sequence.
320 | #
321 | # By default the plugin is run in `lenient` mode, which ignores spaces that occur before or
322 | # after the value-splitter. While this allows the plugin to make reasonable guesses with most
323 | # input, in some situations it may be too lenient.
324 | #
325 | # You may want to enable `whitespace => strict` mode if you have control of the input data and
326 | # can guarantee that no extra spaces are added surrounding the pattern you have defined for
327 | # splitting values. Doing so will ensure that a _field-splitter_ sequence immediately following
328 | # a _value-splitter_ will be interpreted as an empty field.
329 | #
330 | config :whitespace, :validate => %w(strict lenient), :default => "lenient"
331 |
332 | # Attempt to terminate regexps after this amount of time.
333 | # This applies per source field value if the event has multiple values in the source field.
334 | # Set to 0 to disable timeouts.
335 | config :timeout_millis, :validate => :number, :default => 30_000
336 |
337 | # Tag to apply if a kv regexp times out.
338 | config :tag_on_timeout, :validate => :string, :default => '_kv_filter_timeout'
339 |
340 | # Tag (or list of tags) to apply if the kv filter raises an error.
341 | config :tag_on_failure, :validate => :array, :default => ['_kv_filter_error']
342 |
343 |
344 | EMPTY_STRING = ''.freeze
345 |
346 | def register
347 | # Too late to set the regexp interruptible flag, at least warn if it is not set.
348 | require 'java'
349 | if java.lang.System.getProperty("jruby.regexp.interruptible") != "true" && @timeout_millis > 0
350 | logger.warn("KV Filter registered with `timeout_millis` safeguard enabled, but a required flag is missing so timeouts cannot be reliably enforced. " +
351 | "Without this safeguard, runaway matchers in the KV filter may lead to high CPU usage and stalled pipelines. " +
352 | "To resolve, add `-Djruby.regexp.interruptible=true` to your `config/jvm.options` and restart the Logstash process.")
353 | end
354 |
355 | if @value_split.empty?
356 | raise LogStash::ConfigurationError, I18n.t(
357 | "logstash.runner.configuration.invalid_plugin_register",
358 | :plugin => "filter",
359 | :type => "kv",
360 | :error => "Configuration option 'value_split' must be a non-empty string"
361 | )
362 | end
363 |
364 | if @field_split_pattern && @field_split_pattern.empty?
365 | raise LogStash::ConfigurationError, I18n.t( 366 | "logstash.runner.configuration.invalid_plugin_register", 367 | :plugin => "filter", 368 | :type => "kv", 369 | :error => "Configuration option 'field_split_pattern' must be a non-empty string" 370 | ) 371 | end 372 | 373 | if @value_split_pattern && @value_split_pattern.empty? 374 | raise LogStash::ConfigurationError, I18n.t( 375 | "logstash.runner.configuration.invalid_plugin_register", 376 | :plugin => "filter", 377 | :type => "kv", 378 | :error => "Configuration option 'value_split_pattern' must be a non-empty string" 379 | ) 380 | end 381 | 382 | @trim_value_re = Regexp.new("^[#{@trim_value}]+|[#{@trim_value}]+$") if @trim_value 383 | @trim_key_re = Regexp.new("^[#{@trim_key}]+|[#{@trim_key}]+$") if @trim_key 384 | 385 | @remove_char_value_re = Regexp.new("[#{@remove_char_value}]") if @remove_char_value 386 | @remove_char_key_re = Regexp.new("[#{@remove_char_key}]") if @remove_char_key 387 | 388 | optional_whitespace = / */ 389 | eof = /$/ 390 | 391 | field_split_pattern = Regexp::compile(@field_split_pattern || "[#{@field_split}]") 392 | value_split_pattern = Regexp::compile(@value_split_pattern || "[#{@value_split}]") 393 | 394 | # in legacy-compatible lenient mode, the value splitter can be wrapped in optional whitespace 395 | if @whitespace == 'lenient' 396 | value_split_pattern = /#{optional_whitespace}#{value_split_pattern}#{optional_whitespace}/ 397 | end 398 | 399 | # a key is a _captured_ sequence of characters or escaped spaces before optional whitespace 400 | # and followed by either a `value_split`, a `field_split`, or EOF. 401 | key_pattern = (original_params.include?('value_split_pattern') || original_params.include?('field_split_pattern')) ? 402 | unquoted_capture_until_pattern(value_split_pattern, field_split_pattern) : 403 | unquoted_capture_until_charclass(@value_split + @field_split) 404 | 405 | value_pattern = begin 406 | # each component expression within value_pattern _must_ capture exactly once. 407 | value_patterns = [] 408 | 409 | value_patterns << quoted_capture(%q(")) # quoted double 410 | value_patterns << quoted_capture(%q(')) # quoted single 411 | if @include_brackets 412 | value_patterns << quoted_capture('(', ')') # bracketed paren 413 | value_patterns << quoted_capture('[', ']') # bracketed square 414 | value_patterns << quoted_capture('<', '>') # bracketed angle 415 | end 416 | 417 | # an unquoted value is a _captured_ sequence of characters or escaped spaces before a `field_split` or EOF. 418 | value_patterns << (original_params.include?('field_split_pattern') ? 419 | unquoted_capture_until_pattern(field_split_pattern) : 420 | unquoted_capture_until_charclass(@field_split)) 421 | 422 | Regexp.union(value_patterns) 423 | end 424 | 425 | @scan_re = /#{key_pattern}#{value_split_pattern}#{value_pattern}?#{Regexp::union(field_split_pattern, eof)}/ 426 | @value_split_re = value_split_pattern 427 | 428 | @logger.debug? && @logger.debug("KV scan regex", :regex => @scan_re.inspect) 429 | 430 | # divide by float to allow fractional seconds, the Timeout class timeout value is in seconds but the underlying 431 | # executor resolution is in microseconds so fractional second parameter down to microseconds is possible. 
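    # For instance (illustrative only): `timeout_millis => 250` becomes 250 / 1000.0 == 0.25 seconds,
    # whereas integer division (250 / 1000) would truncate to 0 and, given the guard in `filter`,
    # silently disable the timeout.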
432 | # see https://github.com/jruby/jruby/blob/9.2.7.0/core/src/main/java/org/jruby/ext/timeout/Timeout.java#L125 433 | @timeout_seconds = @timeout_millis / 1000.0 434 | end 435 | 436 | def filter(event) 437 | value = event.get(@source) 438 | 439 | # if timeout is 0 avoid creating a closure although Timeout.timeout has a bypass for 0s timeouts. 440 | kv = @timeout_seconds > 0.0 ? Timeout.timeout(@timeout_seconds, TimeoutException) { parse_value(value, event) } : parse_value(value, event) 441 | 442 | # Add default key-values for missing keys 443 | kv = @default_keys.merge(kv) 444 | 445 | return if kv.empty? 446 | 447 | if @target 448 | if event.include?(@target) 449 | @logger.debug? && @logger.debug("Overwriting existing target field", field: @target, existing_value: event.get(@target)) 450 | end 451 | event.set(@target, kv) 452 | else 453 | kv.each{|k, v| event.set(k, v)} 454 | end 455 | 456 | filter_matched(event) 457 | 458 | rescue TimeoutException => e 459 | logger.warn("Timeout reached in KV filter with value #{summarize(value)}") 460 | event.tag(@tag_on_timeout) 461 | rescue => ex 462 | meta = { :exception => ex.message } 463 | meta[:backtrace] = ex.backtrace if logger.debug? 464 | logger.warn('Exception while parsing KV', meta) 465 | @tag_on_failure.each { |tag| event.tag(tag) } 466 | end 467 | 468 | def close 469 | end 470 | 471 | private 472 | 473 | def parse_value(value, event) 474 | kv = Hash.new 475 | 476 | case value 477 | when nil 478 | # Nothing to do 479 | when String 480 | parse(value, event, kv) 481 | when Array 482 | value.each { |v| parse(v, event, kv) } 483 | else 484 | @logger.warn("kv filter has no support for this type of data", :type => value.class, :value => value) 485 | end 486 | 487 | kv 488 | end 489 | 490 | # @overload summarize(value) 491 | # @param value [Array] 492 | # @return [String] 493 | # @overload summarize(value) 494 | # @param value [String] 495 | # @return [String] 496 | def summarize(value) 497 | if value.kind_of?(Array) 498 | value.map(&:to_s).map do |entry| 499 | summarize(entry) 500 | end.to_s 501 | end 502 | 503 | value = value.to_s 504 | 505 | value.bytesize < 255 ? "`#{value.dump}`" : "(entry too large to show; showing first 255 characters) `#{value[0..255].dump}`[...]" 506 | end 507 | 508 | def has_value_splitter?(s) 509 | s =~ @value_split_re 510 | end 511 | 512 | # Helper function for generating single-capture `Regexp` that, when matching a string bound by the given quotes 513 | # or brackets, will capture the content that is between the quotes or brackets. 514 | # 515 | # @api private 516 | # @param quote_sequence [String] a character sequence that begins a quoted expression 517 | # @param close_quote_sequence [String] a character sequence that ends a quoted expression; (default: quote_sequence) 518 | # @return [Regexp] with a single capture group representing content that is between the given quotes 519 | def quoted_capture(quote_sequence, close_quote_sequence=quote_sequence) 520 | fail('quote_sequence must be non-empty!') if quote_sequence.nil? || quote_sequence.empty? 521 | fail('close_quote_sequence must be non-empty!') if close_quote_sequence.nil? || close_quote_sequence.empty? 
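    # Illustrative sketch of the result: quoted_capture('(', ')') builds a regexp roughly
    # equivalent to /\(((?:\\.|[^\)])+)?\)/, so matching it against "(hello world)" places
    # "hello world" in capture group 1, while "()" matches with a nil capture (an empty value).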
522 | 523 | open_pattern = /#{Regexp.quote(quote_sequence)}/ 524 | close_pattern = /#{Regexp.quote(close_quote_sequence)}/ 525 | 526 | # matches a sequence of zero or more characters are _not_ the `close_quote_sequence` 527 | quoted_value_pattern = unquoted_capture_until_charclass(Regexp.quote(close_quote_sequence)) 528 | 529 | /#{open_pattern}#{quoted_value_pattern}?#{close_pattern}/ 530 | end 531 | 532 | # Helper function for generating *capturing* `Regexp` that will match any sequence of characters that are either 533 | # backslash-escaped OR *NOT* matching any of the given pattern(s) 534 | # 535 | # @api private 536 | # @param *until_lookahead_patterns [Regexp] 537 | # @return [Regexp] 538 | def unquoted_capture_until_pattern(*patterns) 539 | pattern = patterns.size > 1 ? Regexp.union(patterns) : patterns.first 540 | /((?:(?!#{pattern})(?:\\.|.))+)/ 541 | end 542 | 543 | # Helper function for generating *capturing* `Regexp` that will _efficiently_ match any sequence of characters 544 | # that are either backslash-escaped or do _not_ belong to the given charclass. 545 | # 546 | # @api private 547 | # @param charclass [String] characters to be injected directly into a regexp charclass; special characters must be pre-escaped. 548 | # @return [Regexp] 549 | def unquoted_capture_until_charclass(charclass) 550 | /((?:\\.|[^#{charclass}])+)/ 551 | end 552 | 553 | def transform(text, method) 554 | case method 555 | when TRANSFORM_LOWERCASE_KEY 556 | return text.downcase 557 | when TRANSFORM_UPPERCASE_KEY 558 | return text.upcase 559 | when TRANSFORM_CAPITALIZE_KEY 560 | return text.capitalize 561 | end 562 | end 563 | 564 | # Parses the given `text`, using the `event` for context, into the provided `kv_keys` hash 565 | # 566 | # @param text [String]: the text to parse 567 | # @param event [LogStash::Event]: the event from which to extract context (e.g., sprintf vs (in|ex)clude keys) 568 | # @param kv_keys [Hash{String=>Object}]: the hash in which to inject found key/value pairs 569 | # 570 | # @return [void] 571 | def parse(text, event, kv_keys) 572 | # short circuit parsing if the text does not contain the @value_split 573 | return unless has_value_splitter?(text) 574 | 575 | # Interpret dynamic keys for @include_keys and @exclude_keys 576 | include_keys = @include_keys.map{|key| event.sprintf(key)} 577 | exclude_keys = @exclude_keys.map{|key| event.sprintf(key)} 578 | 579 | text.scan(@scan_re) do |key, *value_candidates| 580 | value = value_candidates.compact.first || EMPTY_STRING 581 | next if value.empty? && !@allow_empty_values 582 | 583 | key = key.gsub(@trim_key_re, EMPTY_STRING) if @trim_key 584 | key = key.gsub(@remove_char_key_re, EMPTY_STRING) if @remove_char_key 585 | key = transform(key, @transform_key) if @transform_key 586 | 587 | # Bail out as per the values of include_keys and exclude_keys 588 | next if not include_keys.empty? and not include_keys.include?(key) 589 | # next unless include_keys.include?(key) 590 | next if exclude_keys.include?(key) 591 | 592 | key = event.sprintf(@prefix) + key 593 | 594 | value = value.gsub(@trim_value_re, EMPTY_STRING) if @trim_value 595 | value = value.gsub(@remove_char_value_re, EMPTY_STRING) if @remove_char_value 596 | value = transform(value, @transform_value) if @transform_value 597 | 598 | # Bail out if inserting duplicate value in key mapping when unique_values 599 | # option is set to true. 
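          # For illustration: with the default `allow_duplicate_values => true`, an input such as
          # "foo=bar foo=bar" ends up as kv_keys == { "foo" => ["bar", "bar"] }; with the option
          # set to false, the repeated pair is skipped here and kv_keys == { "foo" => "bar" }.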
600 | next if not @allow_duplicate_values and kv_keys.has_key?(key) and kv_keys[key].include?(value) 601 | 602 | # recursively get more kv pairs from the value 603 | if @recursive 604 | innerKv = {} 605 | parse(value, event, innerKv) 606 | value = innerKv unless innerKv.empty? 607 | end 608 | 609 | if kv_keys.has_key?(key) 610 | if kv_keys[key].is_a?(Array) 611 | kv_keys[key].push(value) 612 | else 613 | kv_keys[key] = [kv_keys[key], value] 614 | end 615 | else 616 | kv_keys[key] = value 617 | end 618 | end 619 | end 620 | 621 | class TimeoutException < RuntimeError 622 | end 623 | end 624 | -------------------------------------------------------------------------------- /logstash-filter-kv.gemspec: -------------------------------------------------------------------------------- 1 | Gem::Specification.new do |s| 2 | 3 | s.name = 'logstash-filter-kv' 4 | s.version = '4.7.1' 5 | s.licenses = ['Apache License (2.0)'] 6 | s.summary = "Parses key-value pairs" 7 | s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program" 8 | s.authors = ["Elastic"] 9 | s.email = 'info@elastic.co' 10 | s.homepage = "http://www.elastic.co/guide/en/logstash/current/index.html" 11 | s.require_paths = ["lib"] 12 | 13 | # Files 14 | s.files = Dir["lib/**/*","spec/**/*","*.gemspec","*.md","CONTRIBUTORS","Gemfile","LICENSE","NOTICE.TXT", "vendor/jar-dependencies/**/*.jar", "vendor/jar-dependencies/**/*.rb", "VERSION", "docs/**/*"] 15 | 16 | # Tests 17 | s.test_files = s.files.grep(%r{^(test|spec|features)/}) 18 | 19 | # Special flag to let us know this is actually a logstash plugin 20 | s.metadata = { "logstash_plugin" => "true", "logstash_group" => "filter" } 21 | 22 | # Gem dependencies 23 | s.add_runtime_dependency "logstash-core-plugin-api", ">= 1.60", "<= 2.99" 24 | s.add_runtime_dependency 'logstash-mixin-ecs_compatibility_support', '~> 1.3' 25 | s.add_runtime_dependency 'logstash-mixin-validator_support', '~> 1.0' 26 | 27 | s.add_development_dependency 'logstash-devutils' 28 | s.add_development_dependency 'insist' 29 | end 30 | -------------------------------------------------------------------------------- /spec/filters/kv_spec.rb: -------------------------------------------------------------------------------- 1 | # encoding: utf-8 2 | 3 | require "logstash/devutils/rspec/spec_helper" 4 | require "insist" 5 | require "logstash/filters/kv" 6 | 7 | # Logstash starts JRuby with a special flag to ensure that regexp's are 8 | # executed in an interruptible fashion. 9 | require 'java' 10 | if java.lang.System.getProperty("jruby.regexp.interruptible") != "true" 11 | fail("Java must be started with `-Djruby.regexp.interruptible=true`") 12 | end 13 | 14 | describe LogStash::Filters::KV do 15 | 16 | describe "defaults" do 17 | # The logstash config goes here. 18 | # At this time, only filters are supported. 
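  # The `config`/`sample` helpers below come from logstash-devutils' spec_helper and `insist`
  # from the insist gem; a minimal spec in this file's style looks like this (hypothetical):
  #
  #   config <<-CONFIG
  #     filter { kv { } }
  #   CONFIG
  #
  #   sample "a=1 b=2" do
  #     insist { subject.get("a") } == "1"
  #     insist { subject.get("b") } == "2"
  #   end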
19 | config <<-CONFIG
20 | filter {
21 | kv { }
22 | }
23 | CONFIG
24 |
25 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world' bracketsone=(hello world) bracketstwo=[hello world] bracketsthree=<hello world>" do
26 | insist { subject.get("hello") } == "world"
27 | insist { subject.get("foo") } == "bar"
28 | insist { subject.get("baz") } == "fizz"
29 | insist { subject.get("doublequoted") } == "hello world"
30 | insist { subject.get("singlequoted") } == "hello world"
31 | insist { subject.get("bracketsone") } == "hello world"
32 | insist { subject.get("bracketstwo") } == "hello world"
33 | insist { subject.get("bracketsthree") } == "hello world"
34 | end
35 | end
36 |
37 | describe "test transforming keys to uppercase and values to lowercase" do
38 | config <<-CONFIG
39 | filter {
40 | kv {
41 | transform_key => "uppercase"
42 | transform_value => "lowercase"
43 | }
44 | }
45 | CONFIG
46 |
47 | sample "hello = world Foo =Bar BAZ= FIZZ doublequoteD = \"hellO worlD\" Singlequoted= 'Hello World' brAckets =(hello World)" do
48 | insist { subject.get("HELLO") } == "world"
49 | insist { subject.get("FOO") } == "bar"
50 | insist { subject.get("BAZ") } == "fizz"
51 | insist { subject.get("DOUBLEQUOTED") } == "hello world"
52 | insist { subject.get("SINGLEQUOTED") } == "hello world"
53 | insist { subject.get("BRACKETS") } == "hello world"
54 | end
55 | end
56 |
57 | describe 'whitespace => strict' do
58 | config <<-CONFIG
59 | filter {
60 | kv {
61 | whitespace => strict
62 | }
63 | }
64 | CONFIG
65 |
66 | context 'unquoted values' do
67 | sample "IN=eth0 OUT= MAC=0f:5f:5e:aa:d3:a2:21:ff:09:00:0f:e1:c8:17 SRC=192.168.0.1" do
68 | insist { subject.get('IN') } == 'eth0'
69 | insist { subject.get('OUT') } == nil # when whitespace is strict, OUT is empty and thus uncaptured.
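      # (For contrast, under the default lenient mode the space after "OUT=" would be absorbed by
      # the value-splitter, so the following "MAC=..." token would likely be captured as the value
      # of OUT instead of OUT being left unset.)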
70 | insist { subject.get('MAC') } == '0f:5f:5e:aa:d3:a2:21:ff:09:00:0f:e1:c8:17' 71 | insist { subject.get('SRC') } == '192.168.0.1' 72 | end 73 | end 74 | 75 | context 'mixed quotations' do 76 | sample 'hello=world goodbye=cruel\\ world empty_quoted="" quoted="value1" empty_unquoted= unquoted=value2 empty_bracketed=[] bracketed=[value3] cake=delicious' do 77 | insist { subject.get('hello') } == 'world' 78 | insist { subject.get('goodbye') } == 'cruel\\ world' 79 | insist { subject.get('empty_quoted') } == nil 80 | insist { subject.get('quoted') } == 'value1' 81 | insist { subject.get('empty_unquoted') } == nil 82 | insist { subject.get('unquoted') } == 'value2' 83 | insist { subject.get('empty_bracketed') } == nil 84 | insist { subject.get('bracketed') } == 'value3' 85 | insist { subject.get('cake') } == 'delicious' 86 | end 87 | end 88 | 89 | context 'when given sloppy input, it extracts only the unambiguous bits' do 90 | sample "hello = world foo =bar baz= fizz whitespace=none doublequoted = \"hello world\" singlequoted= 'hello world' brackets =(hello world) strict=true" do 91 | insist { subject.get('whitespace') } == 'none' 92 | insist { subject.get('strict') } == 'true' 93 | 94 | insist { subject.to_hash.keys.sort } == %w(@timestamp @version message strict whitespace) 95 | end 96 | end 97 | end 98 | 99 | describe "test transforming keys to lowercase and values to uppercase" do 100 | config <<-CONFIG 101 | filter { 102 | kv { 103 | transform_key => "lowercase" 104 | transform_value => "uppercase" 105 | } 106 | } 107 | CONFIG 108 | 109 | sample "Hello = World fOo =bar baz= FIZZ DOUBLEQUOTED = \"hellO worlD\" singlequoted= 'hEllo wOrld' brackets =(HELLO world)" do 110 | insist { subject.get("hello") } == "WORLD" 111 | insist { subject.get("foo") } == "BAR" 112 | insist { subject.get("baz") } == "FIZZ" 113 | insist { subject.get("doublequoted") } == "HELLO WORLD" 114 | insist { subject.get("singlequoted") } == "HELLO WORLD" 115 | insist { subject.get("brackets") } == "HELLO WORLD" 116 | end 117 | end 118 | 119 | describe "test transforming keys and values to capitals" do 120 | config <<-CONFIG 121 | filter { 122 | kv { 123 | transform_key => "capitalize" 124 | transform_value => "capitalize" 125 | } 126 | } 127 | CONFIG 128 | 129 | sample "Hello = World fOo =bar baz= FIZZ DOUBLEQUOTED = \"hellO worlD\" singlequoted= 'hEllo wOrld' brackets =(HELLO world)" do 130 | insist { subject.get("Hello") } == "World" 131 | insist { subject.get("Foo") } == "Bar" 132 | insist { subject.get("Baz") } == "Fizz" 133 | insist { subject.get("Doublequoted") } == "Hello world" 134 | insist { subject.get("Singlequoted") } == "Hello world" 135 | insist { subject.get("Brackets") } == "Hello world" 136 | end 137 | end 138 | 139 | describe "test spaces attached to the field_split" do 140 | config <<-CONFIG 141 | filter { 142 | kv { } 143 | } 144 | CONFIG 145 | 146 | sample "hello = world foo =bar baz= fizz doublequoted = \"hello world\" singlequoted= 'hello world' brackets =(hello world)" do 147 | insist { subject.get("hello") } == "world" 148 | insist { subject.get("foo") } == "bar" 149 | insist { subject.get("baz") } == "fizz" 150 | insist { subject.get("doublequoted") } == "hello world" 151 | insist { subject.get("singlequoted") } == "hello world" 152 | insist { subject.get("brackets") } == "hello world" 153 | end 154 | end 155 | 156 | describe "LOGSTASH-624: allow escaped space in key or value " do 157 | config <<-CONFIG 158 | filter { 159 | kv { value_split => ':' } 160 | } 161 | CONFIG 162 | 163 | sample 
'IKE:=Quick\ Mode\ completion IKE\ IDs:=subnet:\ x.x.x.x\ (mask=\ 255.255.255.254)\ and\ host:\ y.y.y.y' do 164 | insist { subject.get("IKE") } == '=Quick\ Mode\ completion' 165 | insist { subject.get('IKE\ IDs') } == '=subnet:\ x.x.x.x\ (mask=\ 255.255.255.254)\ and\ host:\ y.y.y.y' 166 | end 167 | end 168 | 169 | describe "test value_split" do 170 | context "using an alternate splitter" do 171 | config <<-CONFIG 172 | filter { 173 | kv { value_split => ':' } 174 | } 175 | CONFIG 176 | 177 | sample "hello:=world foo:bar baz=:fizz doublequoted:\"hello world\" singlequoted:'hello world' brackets:(hello world)" do 178 | insist { subject.get("hello") } == "=world" 179 | insist { subject.get("foo") } == "bar" 180 | insist { subject.get("baz=") } == "fizz" 181 | insist { subject.get("doublequoted") } == "hello world" 182 | insist { subject.get("singlequoted") } == "hello world" 183 | insist { subject.get("brackets") } == "hello world" 184 | end 185 | end 186 | end 187 | 188 | # these specs are quite implementation specific by testing on the private method 189 | # has_value_splitter? - this is what I figured would help fixing the short circuit 190 | # broken code that was previously in place 191 | describe "short circuit" do 192 | subject do 193 | plugin = LogStash::Filters::KV.new(options) 194 | plugin.register 195 | plugin 196 | end 197 | let(:data) { {"message" => message} } 198 | let(:event) { LogStash::Event.new(data) } 199 | 200 | context "plain message" do 201 | let(:options) { {} } 202 | 203 | context "without splitter" do 204 | let(:message) { "foo:bar" } 205 | it "should short circuit" do 206 | expect(subject.send(:has_value_splitter?, message)).to be_falsey 207 | expect(subject).to receive(:has_value_splitter?).with(message).once.and_return(false) 208 | subject.filter(event) 209 | end 210 | end 211 | 212 | context "with splitter" do 213 | let(:message) { "foo=bar" } 214 | it "should not short circuit" do 215 | expect(subject.send(:has_value_splitter?, message)).to be_truthy 216 | expect(subject).to receive(:has_value_splitter?).with(message).once.and_return(true) 217 | subject.filter(event) 218 | end 219 | end 220 | end 221 | 222 | context "recursive message" do 223 | context "without inner splitter" do 224 | let(:inner) { "bar" } 225 | let(:message) { "foo=#{inner}" } 226 | let(:options) { {"recursive" => "true"} } 227 | 228 | it "should extract kv" do 229 | subject.filter(event) 230 | expect(event.get("foo")).to eq(inner) 231 | end 232 | 233 | it "should short circuit" do 234 | expect(subject.send(:has_value_splitter?, message)).to be_truthy 235 | expect(subject.send(:has_value_splitter?, inner)).to be_falsey 236 | expect(subject).to receive(:has_value_splitter?).with(message).once.and_return(true) 237 | expect(subject).to receive(:has_value_splitter?).with(inner).once.and_return(false) 238 | subject.filter(event) 239 | end 240 | end 241 | 242 | context "with inner splitter" do 243 | let(:foo_val) { "1" } 244 | let(:baz_val) { "2" } 245 | let(:inner) { "baz=#{baz_val}" } 246 | let(:message) { "foo=#{foo_val} bar=(#{inner})" } # foo=1 bar=(baz=2) 247 | let(:options) { {"recursive" => "true"} } 248 | 249 | it "should extract kv" do 250 | subject.filter(event) 251 | expect(event.get("foo")).to eq(foo_val) 252 | expect(event.get("[bar][baz]")).to eq(baz_val) 253 | end 254 | 255 | it "should short circuit" do 256 | expect(subject.send(:has_value_splitter?, message)).to be_truthy 257 | expect(subject.send(:has_value_splitter?, foo_val)).to be_falsey 258 | 259 | 
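        # (Descriptive note: the recursive path only descends into a value when that value itself
        # contains the value splitter, so here "bar=(baz=2)" triggers an inner parse of "baz=2"
        # while the plain value "1" is left untouched.)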
expect(subject.send(:has_value_splitter?, inner)).to be_truthy 260 | expect(subject.send(:has_value_splitter?, baz_val)).to be_falsey 261 | 262 | expect(subject).to receive(:has_value_splitter?).with(message).once.and_return(true) 263 | expect(subject).to receive(:has_value_splitter?).with(foo_val).once.and_return(false) 264 | 265 | expect(subject).to receive(:has_value_splitter?).with(inner).once.and_return(true) 266 | expect(subject).to receive(:has_value_splitter?).with(baz_val).once.and_return(false) 267 | 268 | subject.filter(event) 269 | end 270 | end 271 | end 272 | end 273 | 274 | describe "test field_split" do 275 | config <<-CONFIG 276 | filter { 277 | kv { field_split => '?&' } 278 | } 279 | CONFIG 280 | 281 | sample "?hello=world&foo=bar&baz=fizz&doublequoted=\"hello world\"&singlequoted='hello world'&ignoreme&foo12=bar12" do 282 | insist { subject.get("hello") } == "world" 283 | insist { subject.get("foo") } == "bar" 284 | insist { subject.get("baz") } == "fizz" 285 | insist { subject.get("doublequoted") } == "hello world" 286 | insist { subject.get("singlequoted") } == "hello world" 287 | insist { subject.get("foo12") } == "bar12" 288 | end 289 | end 290 | 291 | describe "test include_brackets is false" do 292 | config <<-CONFIG 293 | filter { 294 | kv { include_brackets => "false" } 295 | } 296 | CONFIG 297 | 298 | sample "bracketsone=(hello world) bracketstwo=[hello world]" do 299 | insist { subject.get("bracketsone") } == "(hello" 300 | insist { subject.get("bracketstwo") } == "[hello" 301 | end 302 | end 303 | 304 | describe "test recursive" do 305 | config <<-CONFIG 306 | filter { 307 | kv { 308 | recursive => 'true' 309 | } 310 | } 311 | CONFIG 312 | 313 | sample 'IKE="Quick Mode completion" IKE\ IDs = (subnet= x.x.x.x mask= 255.255.255.254 and host=y.y.y.y)' do 314 | insist { subject.get("IKE") } == 'Quick Mode completion' 315 | insist { subject.get('IKE\ IDs')['subnet'] } == 'x.x.x.x' 316 | insist { subject.get('IKE\ IDs')['mask'] } == '255.255.255.254' 317 | insist { subject.get('IKE\ IDs')['host'] } == 'y.y.y.y' 318 | end 319 | end 320 | 321 | describe "delimited fields should override space default (reported by LOGSTASH-733)" do 322 | config <<-CONFIG 323 | filter { 324 | kv { field_split => "|" } 325 | } 326 | CONFIG 327 | 328 | sample "field1=test|field2=another test|field3=test3" do 329 | insist { subject.get("field1") } == "test" 330 | insist { subject.get("field2") } == "another test" 331 | insist { subject.get("field3") } == "test3" 332 | end 333 | end 334 | 335 | describe "test prefix" do 336 | config <<-CONFIG 337 | filter { 338 | kv { prefix => '__' } 339 | } 340 | CONFIG 341 | 342 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'" do 343 | insist { subject.get("__hello") } == "world" 344 | insist { subject.get("__foo") } == "bar" 345 | insist { subject.get("__baz") } == "fizz" 346 | insist { subject.get("__doublequoted") } == "hello world" 347 | insist { subject.get("__singlequoted") } == "hello world" 348 | end 349 | 350 | end 351 | 352 | describe "speed test", :performance => true do 353 | count = 10000 + rand(3000) 354 | config <<-CONFIG 355 | input { 356 | generator { 357 | count => #{count} 358 | type => foo 359 | message => "hello=world bar='baz fizzle'" 360 | } 361 | } 362 | 363 | filter { 364 | kv { } 365 | } 366 | 367 | output { 368 | null { } 369 | } 370 | CONFIG 371 | 372 | start = Time.now 373 | agent do 374 | duration = (Time.now - start) 375 | puts "filters/kv rate: #{"%02.0f/sec" % (count / 
duration)}, elapsed: #{duration}s" 376 | end 377 | end 378 | 379 | describe "add_tag" do 380 | context "should activate when successful" do 381 | config <<-CONFIG 382 | filter { 383 | kv { add_tag => "hello" } 384 | } 385 | CONFIG 386 | 387 | sample "hello=world" do 388 | insist { subject.get("hello") } == "world" 389 | insist { subject.get("tags") }.include?("hello") 390 | end 391 | end 392 | context "should not activate when failing" do 393 | config <<-CONFIG 394 | filter { 395 | kv { add_tag => "hello" } 396 | } 397 | CONFIG 398 | 399 | sample "this is not key value" do 400 | insist { subject.get("tags") }.nil? 401 | end 402 | end 403 | end 404 | 405 | describe "add_field" do 406 | context "should activate when successful" do 407 | config <<-CONFIG 408 | filter { 409 | kv { add_field => [ "whoa", "fancypants" ] } 410 | } 411 | CONFIG 412 | 413 | sample "hello=world" do 414 | insist { subject.get("hello") } == "world" 415 | insist { subject.get("whoa") } == "fancypants" 416 | end 417 | end 418 | 419 | context "should not activate when failing" do 420 | config <<-CONFIG 421 | filter { 422 | kv { add_tag => "hello" } 423 | } 424 | CONFIG 425 | 426 | sample "this is not key value" do 427 | reject { subject.get("whoa") } == "fancypants" 428 | end 429 | end 430 | end 431 | 432 | #New tests 433 | describe "test target" do 434 | config <<-CONFIG 435 | filter { 436 | kv { target => 'kv' } 437 | } 438 | CONFIG 439 | 440 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'" do 441 | insist { subject.get("kv")["hello"] } == "world" 442 | insist { subject.get("kv")["foo"] } == "bar" 443 | insist { subject.get("kv")["baz"] } == "fizz" 444 | insist { subject.get("kv")["doublequoted"] } == "hello world" 445 | insist { subject.get("kv")["singlequoted"] } == "hello world" 446 | insist {subject.get("kv").count } == 5 447 | end 448 | 449 | end 450 | 451 | describe "test empty target" do 452 | config <<-CONFIG 453 | filter { 454 | kv { target => 'kv' } 455 | } 456 | CONFIG 457 | 458 | sample "hello:world:foo:bar:baz:fizz" do 459 | insist { subject.get("kv") } == nil 460 | end 461 | end 462 | 463 | describe "test data from specific sub source" do 464 | config <<-CONFIG 465 | filter { 466 | kv { 467 | source => "data" 468 | } 469 | } 470 | CONFIG 471 | sample({"data" => "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'"}) do 472 | insist { subject.get("hello") } == "world" 473 | insist { subject.get("foo") } == "bar" 474 | insist { subject.get("baz") } == "fizz" 475 | insist { subject.get("doublequoted") } == "hello world" 476 | insist { subject.get("singlequoted") } == "hello world" 477 | end 478 | end 479 | 480 | describe "test data from specific top source" do 481 | config <<-CONFIG 482 | filter { 483 | kv { 484 | source => "@data" 485 | } 486 | } 487 | CONFIG 488 | sample({"@data" => "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'"}) do 489 | insist { subject.get("hello") } == "world" 490 | insist { subject.get("foo") } == "bar" 491 | insist { subject.get("baz") } == "fizz" 492 | insist { subject.get("doublequoted") } == "hello world" 493 | insist { subject.get("singlequoted") } == "hello world" 494 | end 495 | end 496 | 497 | describe 'field_split_pattern with literal backslashes' do 498 | config <<-CONFIG 499 | filter { 500 | kv { 501 | source => headers 502 | field_split_pattern => "\\\\r\\\\n" 503 | value_split_pattern => ": " 504 | whitespace => strict 505 | target => headerskv 506 | } 
507 | } 508 | CONFIG 509 | 510 | sample({"headers"=>"Host: foo.com\\r\\nUser-Agent: Qwerty/1.2.3 (www.qwerty.org)\\r\\nContent-Type: text/xml; charset=utf-8\\r\\nAccept: */*\\r\\nAccept-Encoding: gzip, deflate\\r\\nContent-Length: 123\\r\\nX-UUID: 0:15713435944943992\\r\\n\\r\\n"}) do 511 | insist { subject.get("[headerskv][Host]") } == "foo.com" 512 | insist { subject.get("[headerskv][User-Agent]") } == "Qwerty/1.2.3 (www.qwerty.org)" 513 | insist { subject.get("[headerskv][Content-Type]") } == "text/xml; charset=utf-8" 514 | insist { subject.get("[headerskv][Accept]") } == "*/*" 515 | insist { subject.get("[headerskv][Accept-Encoding]") } == "gzip, deflate" 516 | insist { subject.get("[headerskv][Content-Length]") } == "123" 517 | insist { subject.get("[headerskv][X-UUID]") } == "0:15713435944943992" 518 | end 519 | end 520 | 521 | describe "test data from specific sub source and target" do 522 | config <<-CONFIG 523 | filter { 524 | kv { 525 | source => "data" 526 | target => "kv" 527 | } 528 | } 529 | CONFIG 530 | sample({"data" => "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'"}) do 531 | insist { subject.get("kv")["hello"] } == "world" 532 | insist { subject.get("kv")["foo"] } == "bar" 533 | insist { subject.get("kv")["baz"] } == "fizz" 534 | insist { subject.get("kv")["doublequoted"] } == "hello world" 535 | insist { subject.get("kv")["singlequoted"] } == "hello world" 536 | insist { subject.get("kv").count } == 5 537 | end 538 | end 539 | 540 | describe "test data from nil sub source, should not issue a warning" do 541 | config <<-CONFIG 542 | filter { 543 | kv { 544 | source => "non-exisiting-field" 545 | target => "kv" 546 | } 547 | } 548 | CONFIG 549 | sample "" do 550 | insist { subject.get("non-exisiting-field") } == nil 551 | insist { subject.get("kv") } == nil 552 | end 553 | end 554 | 555 | describe "test include_keys" do 556 | config <<-CONFIG 557 | filter { 558 | kv { 559 | include_keys => [ "foo", "singlequoted" ] 560 | } 561 | } 562 | CONFIG 563 | 564 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'" do 565 | insist { subject.get("foo") } == "bar" 566 | insist { subject.get("singlequoted") } == "hello world" 567 | end 568 | end 569 | 570 | describe "test exclude_keys" do 571 | config <<-CONFIG 572 | filter { 573 | kv { 574 | exclude_keys => [ "foo", "singlequoted" ] 575 | } 576 | } 577 | CONFIG 578 | 579 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'" do 580 | insist { subject.get("hello") } == "world" 581 | insist { subject.get("baz") } == "fizz" 582 | insist { subject.get("doublequoted") } == "hello world" 583 | end 584 | end 585 | 586 | describe "test include_keys with prefix" do 587 | config <<-CONFIG 588 | filter { 589 | kv { 590 | include_keys => [ "foo", "singlequoted" ] 591 | prefix => "__" 592 | } 593 | } 594 | CONFIG 595 | 596 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'" do 597 | insist { subject.get("__foo") } == "bar" 598 | insist { subject.get("__singlequoted") } == "hello world" 599 | end 600 | end 601 | 602 | describe "test exclude_keys with prefix" do 603 | config <<-CONFIG 604 | filter { 605 | kv { 606 | exclude_keys => [ "foo", "singlequoted" ] 607 | prefix => "__" 608 | } 609 | } 610 | CONFIG 611 | 612 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'" do 613 | insist { subject.get("__hello") } == "world" 614 | insist { 
subject.get("__baz") } == "fizz" 615 | insist { subject.get("__doublequoted") } == "hello world" 616 | end 617 | end 618 | 619 | describe "test include_keys with dynamic key" do 620 | config <<-CONFIG 621 | filter { 622 | kv { 623 | source => "data" 624 | include_keys => [ "%{key}"] 625 | } 626 | } 627 | CONFIG 628 | 629 | sample({"data" => "foo=bar baz=fizz", "key" => "foo"}) do 630 | insist { subject.get("foo") } == "bar" 631 | insist { subject.get("baz") } == nil 632 | end 633 | end 634 | 635 | describe "test exclude_keys with dynamic key" do 636 | config <<-CONFIG 637 | filter { 638 | kv { 639 | source => "data" 640 | exclude_keys => [ "%{key}"] 641 | } 642 | } 643 | CONFIG 644 | 645 | sample({"data" => "foo=bar baz=fizz", "key" => "foo"}) do 646 | insist { subject.get("foo") } == nil 647 | insist { subject.get("baz") } == "fizz" 648 | end 649 | end 650 | 651 | describe "test include_keys and exclude_keys" do 652 | config <<-CONFIG 653 | filter { 654 | kv { 655 | # This should exclude everything as a result of both settings. 656 | include_keys => [ "foo", "singlequoted" ] 657 | exclude_keys => [ "foo", "singlequoted" ] 658 | } 659 | } 660 | CONFIG 661 | 662 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'" do 663 | %w(hello foo baz doublequoted singlequoted).each do |field| 664 | reject { subject }.include?(field) 665 | end 666 | end 667 | end 668 | 669 | describe "test default_keys" do 670 | config <<-CONFIG 671 | filter { 672 | kv { 673 | default_keys => [ "foo", "xxx", 674 | "goo", "yyy" ] 675 | } 676 | } 677 | CONFIG 678 | 679 | sample "hello=world foo=bar baz=fizz doublequoted=\"hello world\" singlequoted='hello world'" do 680 | insist { subject.get("hello") } == "world" 681 | insist { subject.get("foo") } == "bar" 682 | insist { subject.get("goo") } == "yyy" 683 | insist { subject.get("baz") } == "fizz" 684 | insist { subject.get("doublequoted") } == "hello world" 685 | insist { subject.get("singlequoted") } == "hello world" 686 | end 687 | end 688 | 689 | describe "overwriting a string field (often the source)" do 690 | config <<-CONFIG 691 | filter { 692 | kv { 693 | source => "happy" 694 | target => "happy" 695 | } 696 | } 697 | CONFIG 698 | 699 | sample({"happy" => "foo=bar baz=fizz"}) do 700 | insist { subject.get("[happy][foo]") } == "bar" 701 | insist { subject.get("[happy][baz]") } == "fizz" 702 | end 703 | 704 | end 705 | 706 | describe "Removing duplicate key/value pairs" do 707 | config <<-CONFIG 708 | filter { 709 | kv { 710 | field_split => "&" 711 | source => "source" 712 | allow_duplicate_values => false 713 | } 714 | } 715 | CONFIG 716 | 717 | sample({"source" => "foo=bar&foo=yeah&foo=yeah"}) do 718 | insist { subject.get("[foo]") } == ["bar", "yeah"] 719 | end 720 | end 721 | 722 | describe "Allowing empty values" do 723 | config <<-CONFIG 724 | filter { 725 | kv { 726 | field_split => " " 727 | source => "source" 728 | allow_empty_values => true 729 | whitespace => strict 730 | } 731 | } 732 | CONFIG 733 | 734 | sample({"source" => "present=one empty= emptyquoted='' present=two emptybracketed=[] endofinput="}) do 735 | insist { subject.get('[present]') } == ['one','two'] 736 | insist { subject.get('[empty]') } == '' 737 | insist { subject.get('[emptyquoted]') } == '' 738 | insist { subject.get('[emptybracketed]') } == '' 739 | insist { subject.get('[endofinput]') } == '' 740 | end 741 | end 742 | 743 | describe "Allow duplicate key/value pairs by default" do 744 | config <<-CONFIG 745 | filter { 746 | kv { 747 | 
field_split => "&" 748 | source => "source" 749 | } 750 | } 751 | CONFIG 752 | 753 | sample({"source" => "foo=bar&foo=yeah&foo=yeah"}) do 754 | insist { subject.get("[foo]") } == ["bar", "yeah", "yeah"] 755 | end 756 | end 757 | 758 | describe "keys without values (reported in #22)" do 759 | subject do 760 | plugin = LogStash::Filters::KV.new(options) 761 | plugin.register 762 | plugin 763 | end 764 | 765 | let(:f1) { "AccountStatus" } 766 | let(:v1) { "4" } 767 | let(:f2) { "AdditionalInformation" } 768 | let(:f3) { "Code" } 769 | let(:f4) { "HttpStatusCode" } 770 | let(:f5) { "IsSuccess" } 771 | let(:v5) { "True" } 772 | let(:f6) { "Message" } 773 | 774 | let(:message) { "#{f1}: #{v1}\r\n#{f2}\r\n\r\n#{f3}: \r\n#{f4}: \r\n#{f5}: #{v5}\r\n#{f6}: \r\n" } 775 | let(:data) { {"message" => message} } 776 | let(:event) { LogStash::Event.new(data) } 777 | let(:options) { 778 | { 779 | "field_split" => "\r\n", 780 | "value_split" => " ", 781 | "trim_key" => ":" 782 | } 783 | } 784 | 785 | context "key and splitters with no value" do 786 | it "should ignore the incomplete key/value pairs" do 787 | subject.filter(event) 788 | expect(event.get(f1)).to eq(v1) 789 | expect(event.get(f5)).to eq(v5) 790 | expect(event.include?(f2)).to be false 791 | expect(event.include?(f3)).to be false 792 | expect(event.include?(f4)).to be false 793 | expect(event.include?(f6)).to be false 794 | end 795 | end 796 | end 797 | 798 | describe "trim_key/trim_value options : trim only leading and trailing spaces in keys/values (reported in #10)" do 799 | subject do 800 | plugin = LogStash::Filters::KV.new(options) 801 | plugin.register 802 | plugin 803 | end 804 | 805 | let(:message) { "key1= value1 with spaces | key2 with spaces =value2" } 806 | let(:data) { {"message" => message} } 807 | let(:event) { LogStash::Event.new(data) } 808 | let(:options) { 809 | { 810 | "field_split" => "\|", 811 | "value_split" => "=", 812 | "trim_value" => " ", 813 | "trim_key" => " " 814 | } 815 | } 816 | 817 | context "key and value with leading, trailing and middle spaces" do 818 | it "should trim only leading and trailing spaces" do 819 | subject.filter(event) 820 | expect(event.get("key1")).to eq("value1 with spaces") 821 | expect(event.get("key2 with spaces")).to eq("value2") 822 | end 823 | end 824 | end 825 | 826 | describe "trim_key/trim_value options : trim multiple matching characters from either end" do 827 | subject do 828 | plugin = LogStash::Filters::KV.new(options) 829 | plugin.register 830 | plugin 831 | end 832 | 833 | let(:data) { {"message" => message} } 834 | let(:event) { LogStash::Event.new(data) } 835 | 836 | 837 | context 'repeated same-character sequence' do 838 | let(:message) { "key1= value1 with spaces | key2 with spaces =value2" } 839 | let(:options) { 840 | { 841 | "field_split" => "|", 842 | "value_split" => "=", 843 | "trim_value" => " ", 844 | "trim_key" => " " 845 | } 846 | } 847 | 848 | it 'trims all the right bits' do 849 | subject.filter(event) 850 | expect(event.get('key1')).to eq('value1 with spaces') 851 | expect(event.get('key2 with spaces')).to eq('value2') 852 | end 853 | end 854 | 855 | context 'multi-character sequence' do 856 | let(:message) { "to=, orig_to=, %+relay=mail.example.com[private/dovecot-lmtp], delay=2.2, delays=1.9/0.01/0.01/0.21, dsn=2.0.0, status=sent (250 2.0.0 YERDHejiRSXFDSdfUXTV Saved) " } 857 | let(:options) { 858 | { 859 | "field_split" => " ", 860 | "value_split" => "=", 861 | "trim_value" => "<>,", 862 | "trim_key" => "%+" 863 | } 864 | } 865 | 866 | it 'trims all the 
right bits' do 867 | subject.filter(event) 868 | expect(event.get('to')).to eq('foo@example.com') 869 | expect(event.get('orig_to')).to eq('bar@example.com') 870 | expect(event.get('relay')).to eq('mail.example.com[private/dovecot-lmtp]') 871 | expect(event.get('delay')).to eq('2.2') 872 | expect(event.get('delays')).to eq('1.9/0.01/0.01/0.21') 873 | expect(event.get('dsn')).to eq('2.0.0') 874 | expect(event.get('status')).to eq('sent') 875 | end 876 | end 877 | end 878 | 879 | describe "remove_char_key/remove_char_value options : remove all characters in keys/values whatever their position" do 880 | subject do 881 | plugin = LogStash::Filters::KV.new(options) 882 | plugin.register 883 | plugin 884 | end 885 | 886 | let(:message) { "key1= value1 with spaces | key2 with spaces =value2" } 887 | let(:data) { {"message" => message} } 888 | let(:event) { LogStash::Event.new(data) } 889 | let(:options) { 890 | { 891 | "field_split" => "\|", 892 | "value_split" => "=", 893 | "remove_char_value" => " ", 894 | "remove_char_key" => " " 895 | } 896 | } 897 | 898 | context "key and value with leading, trailing and middle spaces" do 899 | it "should remove all spaces" do 900 | subject.filter(event) 901 | expect(event.get("key1")).to eq("value1withspaces") 902 | expect(event.get("key2withspaces")).to eq("value2") 903 | end 904 | end 905 | end 906 | 907 | describe "an empty value_split option should be reported" do 908 | config <<-CONFIG 909 | filter { 910 | kv { 911 | value_split => "" 912 | } 913 | } 914 | CONFIG 915 | 916 | sample({"message" => "random message"}) do 917 | insist { subject }.raises(LogStash::ConfigurationError) 918 | end 919 | end 920 | end 921 | 922 | describe "multi character splitting" do 923 | subject do 924 | plugin = LogStash::Filters::KV.new(options) 925 | plugin.register 926 | plugin 927 | end 928 | 929 | let(:data) { {"message" => message} } 930 | let(:event) { LogStash::Event.new(data) } 931 | 932 | shared_examples "parsing all fields and values" do 933 | it "parses all fields and values" do 934 | subject.filter(event) 935 | expect(event.get("hello")).to eq("world") 936 | expect(event.get("foo")).to eq("bar") 937 | expect(event.get("baz")).to eq("fizz") 938 | expect(event.get("doublequoted")).to eq("hello world") 939 | expect(event.get("singlequoted")).to eq("hello world") 940 | expect(event.get("bracketsone")).to eq("hello world") 941 | expect(event.get("bracketstwo")).to eq("hello world") 942 | expect(event.get("bracketsthree")).to eq("hello world") 943 | end 944 | end 945 | 946 | context "empty value_split_pattern" do 947 | let(:options) { { "value_split_pattern" => "" } } 948 | it "should raise ConfigurationError" do 949 | expect{subject}.to raise_error(LogStash::ConfigurationError) 950 | end 951 | end 952 | 953 | context "empty field_split_pattern" do 954 | let(:options) { { "field_split_pattern" => "" } } 955 | it "should raise ConfigurationError" do 956 | expect{subject}.to raise_error(LogStash::ConfigurationError) 957 | end 958 | end 959 | 960 | context "single split" do 961 | let(:message) { "hello:world foo:bar baz:fizz doublequoted:\"hello world\" singlequoted:'hello world' bracketsone:(hello world) bracketstwo:[hello world] bracketsthree:" } 962 | let(:options) { 963 | { 964 | "field_split" => " ", 965 | "value_split" => ":", 966 | } 967 | } 968 | it_behaves_like "parsing all fields and values" 969 | end 970 | 971 | context "value split multi" do 972 | let(:message) { "hello::world foo::bar baz::fizz doublequoted::\"hello world\" singlequoted::'hello world' 
bracketsone::(hello world) bracketstwo::[hello world] bracketsthree::" } 973 | let(:options) { 974 | { 975 | "field_split" => " ", 976 | "value_split_pattern" => "::", 977 | } 978 | } 979 | it_behaves_like "parsing all fields and values" 980 | end 981 | 982 | context 'multi-char field split pattern with value that begins quoted and contains more unquoted' do 983 | let(:message) { 'foo=bar!!!!!baz="quoted stuff" and more unquoted!!!!!msg="fully-quoted with a part! of the separator"!!!!!blip="this!!!!!is it"!!!!!empty=""!!!!!non-empty="foo"' } 984 | let(:options) { 985 | { 986 | "field_split_pattern" => "!!!!!" 987 | } 988 | } 989 | it 'gets the right bits' do 990 | subject.filter(event) 991 | expect(event.get("foo")).to eq('bar') 992 | expect(event.get("baz")).to eq('"quoted stuff" and more unquoted') 993 | expect(event.get("msg")).to eq('fully-quoted with a part! of the separator') 994 | expect(event.get("blip")).to eq('this!!!!!is it') 995 | expect(event.get("empty")).to be_nil 996 | expect(event.get("non-empty")).to eq('foo') 997 | end 998 | end 999 | 1000 | context 'standard field split pattern with value that begins quoted and contains more unquoted' do 1001 | let(:message) { 'foo=bar baz="quoted stuff" and more unquoted msg="some fully-quoted message " empty="" non-empty="foo"' } 1002 | let(:options) { 1003 | { 1004 | } 1005 | } 1006 | it 'gets the right bits' do 1007 | subject.filter(event) 1008 | expect(event.get("foo")).to eq('bar') 1009 | expect(event.get("baz")).to eq('quoted stuff') # NOTE: outside the quotes is truncated because field split pattern wins. 1010 | expect(event.get("msg")).to eq('some fully-quoted message ') 1011 | expect(event.get("empty")).to be_nil 1012 | expect(event.get("non-empty")).to eq('foo') 1013 | end 1014 | end 1015 | 1016 | context "field and value split multi" do 1017 | let(:message) { "hello::world__foo::bar__baz::fizz__doublequoted::\"hello world\"__singlequoted::'hello world'__bracketsone::(hello world)__bracketstwo::[hello world]__bracketsthree::" } 1018 | let(:options) { 1019 | { 1020 | "field_split_pattern" => "__", 1021 | "value_split_pattern" => "::", 1022 | } 1023 | } 1024 | it_behaves_like "parsing all fields and values" 1025 | end 1026 | 1027 | context "field and value split multi with regex" do 1028 | let(:message) { "hello:world_foo::bar__baz:::fizz___doublequoted:::\"hello world\"____singlequoted:::::'hello world'____bracketsone:::(hello world)__bracketstwo:[hello world]_bracketsthree::::::" } 1029 | let(:options) { 1030 | { 1031 | "field_split_pattern" => "_+", 1032 | "value_split_pattern" => ":+", 1033 | } 1034 | } 1035 | it_behaves_like "parsing all fields and values" 1036 | end 1037 | 1038 | context "field and value split multi using singe char" do 1039 | let(:message) { "hello:world foo:bar baz:fizz doublequoted:\"hello world\" singlequoted:'hello world' bracketsone:(hello world) bracketstwo:[hello world] bracketsthree:" } 1040 | let(:options) { 1041 | { 1042 | "field_split_pattern" => " ", 1043 | "value_split_pattern" => ":", 1044 | } 1045 | } 1046 | it_behaves_like "parsing all fields and values" 1047 | end 1048 | 1049 | context "field and value split multi using escaping" do 1050 | let(:message) { "hello++world??foo++bar??baz++fizz??doublequoted++\"hello world\"??singlequoted++'hello world'??bracketsone++(hello world)??bracketstwo++[hello world]??bracketsthree++" } 1051 | let(:options) { 1052 | { 1053 | "field_split_pattern" => "\\?\\?", 1054 | "value_split_pattern" => "\\+\\+", 1055 | } 1056 | } 1057 | it_behaves_like "parsing 
all fields and values" 1058 | end 1059 | 1060 | context "example from @guyboertje in #15" do 1061 | let(:message) { 'key1: val1; key2: val2; key3: https://site/?g={......"...; CLR rv:11.0)"..}; key4: val4;' } 1062 | let(:options) { 1063 | { 1064 | "field_split_pattern" => ";\s*(?=key.+?:)|;$", 1065 | "value_split_pattern" => ":\s+", 1066 | } 1067 | } 1068 | 1069 | it "parses all fields and values" do 1070 | subject.filter(event) 1071 | 1072 | expect(event.get("key1")).to eq("val1") 1073 | expect(event.get("key2")).to eq("val2") 1074 | expect(event.get("key3")).to eq("https://site/?g={......\"...; CLR rv:11.0)\"..}") 1075 | expect(event.get("key4")).to eq("val4") 1076 | end 1077 | end 1078 | 1079 | describe "handles empty values" do 1080 | let(:message) { 'a=1|b=|c=3' } 1081 | 1082 | shared_examples "parse empty values" do 1083 | it "splits correctly upon empty value" do 1084 | subject.filter(event) 1085 | 1086 | expect(event.get("a")).to eq("1") 1087 | expect(event.get("b")).to be_nil 1088 | expect(event.get("c")).to eq("3") 1089 | end 1090 | end 1091 | 1092 | context "using char class splitters" do 1093 | let(:options) { 1094 | { 1095 | "field_split" => "|", 1096 | "value_split" => "=", 1097 | } 1098 | } 1099 | it_behaves_like "parse empty values" 1100 | end 1101 | 1102 | context "using pattern splitters" do 1103 | let(:options) { 1104 | { 1105 | "field_split_pattern" => '\|', 1106 | "value_split_pattern" => "=", 1107 | } 1108 | } 1109 | it_behaves_like "parse empty values" 1110 | end 1111 | end 1112 | end 1113 | 1114 | context 'runtime errors' do 1115 | 1116 | let(:options) { {} } 1117 | let(:plugin) do 1118 | LogStash::Filters::KV.new(options).instance_exec { register; self } 1119 | end 1120 | 1121 | let(:data) { {"message" => message} } 1122 | let(:event) { LogStash::Event.new(data) } 1123 | let(:message) { "foo=bar hello=world" } 1124 | 1125 | 1126 | before(:each) do 1127 | expect(plugin).to receive(:parse) { fail('intentional') } 1128 | end 1129 | 1130 | context 'when a runtime error is raised' do 1131 | it 'does not cascade the exception to crash the plugin' do 1132 | plugin.filter(event) 1133 | end 1134 | it 'tags the event with "_kv_filter_error"' do 1135 | plugin.filter(event) 1136 | expect(event.get('tags')).to_not be_nil 1137 | expect(event.get('tags')).to include('_kv_filter_error') 1138 | end 1139 | it 'logs an informative message' do 1140 | logger_double = double('Logger').as_null_object 1141 | expect(plugin).to receive(:logger).and_return(logger_double).at_least(:once) 1142 | expect(logger_double).to receive(:warn).with('Exception while parsing KV', anything) 1143 | 1144 | plugin.filter(event) 1145 | end 1146 | context 'when a custom tag is defined' do 1147 | let(:options) { super().merge("tag_on_failure" => "KV-ERROR")} 1148 | it 'tags the event with the custom tag' do 1149 | plugin.filter(event) 1150 | expect(event.get('tags')).to_not be_nil 1151 | expect(event.get('tags')).to include('KV-ERROR') 1152 | expect(event.get('tags')).to_not include('_kv_filter_error') 1153 | end 1154 | end 1155 | context 'when multiple custom tags are defined' do 1156 | let(:options) { super().merge("tag_on_failure" => ["kv_FAIL_one", "_kv_fail_TWO"])} 1157 | it 'tags the event with the custom tag' do 1158 | plugin.filter(event) 1159 | expect(event.get('tags')).to_not be_nil 1160 | expect(event.get('tags')).to include('kv_FAIL_one') 1161 | expect(event.get('tags')).to include('_kv_fail_TWO') 1162 | expect(event.get('tags')).to_not include('_kv_filter_error') 1163 | end 1164 | end 1165 | end 
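  # (Quantifying the note below, as an illustration: against a run of n "=" characters with no
  #  terminating ":", a pattern like /(?:=+=+)+:/ can force the regexp engine to try on the order
  #  of 2^n ways of splitting the run between the two "=+" groups before failing, which is the
  #  behaviour the timeout safeguard exercised by the next group is meant to interrupt.)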
1166 | end 1167 | 1168 | # This group intentionally uses patterns that are vulnerable to pathological inputs to test timeouts. 1169 | # 1170 | # patterns of the form `/(?:x+x+)+y/` are vulnerable to inputs that have long sequences matching `/x/` 1171 | # that are _not_ followed by a sequence matching `/y/`. 1172 | context 'timeouts' do 1173 | let(:options) do 1174 | { 1175 | "value_split_pattern" => "(?:=+=+)+:" 1176 | } 1177 | end 1178 | subject(:plugin) do 1179 | LogStash::Filters::KV.new(options).instance_exec { register; self } 1180 | end 1181 | 1182 | let(:data) { {"message" => message} } 1183 | let(:event) { LogStash::Event.new(data) } 1184 | let(:message) { "foo=bar hello=world" } 1185 | 1186 | after(:each) { plugin.close } 1187 | 1188 | # since we are dealing with potentially-pathological specs, ensure specs fail in a timely 1189 | # manner if they block for longer than `spec_blocking_threshold_seconds`. 1190 | let(:spec_blocking_threshold_seconds) { 10 } 1191 | around(:each) do |example| 1192 | begin 1193 | blocking_exception_class = Class.new(::Exception) # avoid RuntimeError, which is handled in KV#filter 1194 | Timeout.timeout(spec_blocking_threshold_seconds, blocking_exception_class, &example) 1195 | rescue blocking_exception_class 1196 | fail('execution blocked') 1197 | end 1198 | end 1199 | 1200 | context 'when timeouts are enabled' do 1201 | let(:options) { super().merge("timeout_millis" => 250) } 1202 | let(:spec_blocking_threshold_seconds) { 3 } 1203 | 1204 | context 'when given a pathological input' do 1205 | let(:message) { "foo========:bar baz================================================bingo" } 1206 | 1207 | it 'tags the event' do 1208 | plugin.filter(event) 1209 | 1210 | expect(event.get('tags')).to be_a_kind_of(Enumerable) 1211 | expect(event.get('tags')).to include('_kv_filter_timeout') 1212 | end 1213 | 1214 | context 'when given a custom `tag_on_timeout`' do 1215 | let(:options) { super().merge('tag_on_timeout' => 'BADKV') } 1216 | 1217 | it 'tags the event with the custom tag' do 1218 | plugin.filter(event) 1219 | 1220 | expect(event.get('tags')).to be_a_kind_of(Enumerable) 1221 | expect(event.get('tags')).to include('BADKV') 1222 | end 1223 | end 1224 | 1225 | context 'when default_keys are provided' do 1226 | let(:options) { super().merge("default_keys" => {"default" => "key"})} 1227 | 1228 | it 'does not populate default keys' do 1229 | plugin.filter(event) 1230 | 1231 | expect(event).to_not include('default') 1232 | end 1233 | end 1234 | context 'when filter_matched hooks are provided' do 1235 | let(:options) { super().merge("add_field" => {"kv" => "success"})} 1236 | 1237 | it 'does not call filter_matched hooks' do 1238 | plugin.filter(event) 1239 | 1240 | expect(event).to_not include('kv') 1241 | end 1242 | end 1243 | end 1244 | 1245 | context 'when given a non-pathological input' do 1246 | let(:message) { "foo==:bar baz==:bingo" } 1247 | 1248 | it 'extracts the k/v' do 1249 | plugin.filter(event) 1250 | 1251 | expect(event.get('foo')).to eq('bar') 1252 | expect(event.get('baz')).to eq('bingo') 1253 | end 1254 | end 1255 | end 1256 | 1257 | context 'when timeouts are explicitly disabled' do 1258 | let(:options) { super().merge("timeout_millis" => 0) } 1259 | 1260 | context 'when given a pathological input' do 1261 | let(:message) { "foo========:bar baz================================================================bingo"} 1262 | 1263 | it 'blocks for at least 3 seconds' do 1264 | blocking_exception_class = Class.new(::Exception) # avoid RuntimeError, 
which is handled in KV#filter 1265 | expect do 1266 | Timeout.timeout(3, blocking_exception_class) do 1267 | plugin.filter(event) 1268 | end 1269 | end.to raise_exception(blocking_exception_class) 1270 | end 1271 | end 1272 | 1273 | context 'when given a non-pathological input' do 1274 | let(:message) { "foo==:bar baz==:bingo" } 1275 | 1276 | it 'extracts the k/v' do 1277 | plugin.filter(event) 1278 | 1279 | expect(event.get('foo')).to eq('bar') 1280 | expect(event.get('baz')).to eq('bingo') 1281 | end 1282 | end 1283 | end 1284 | end 1285 | --------------------------------------------------------------------------------