├── .github ├── CONTRIBUTING.md ├── ISSUE_TEMPLATE.md └── PULL_REQUEST_TEMPLATE.md ├── .gitignore ├── .travis.yml ├── CHANGELOG.md ├── CONTRIBUTORS ├── Gemfile ├── LICENSE ├── NOTICE.TXT ├── README.md ├── Rakefile ├── docs └── index.asciidoc ├── lib └── logstash │ └── codecs │ ├── cef.rb │ └── cef │ └── timestamp_normalizer.rb ├── logstash-codec-cef.gemspec └── spec └── codecs ├── cef └── timestamp_normalizer_spec.rb └── cef_spec.rb /.github/CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to Logstash 2 | 3 | All contributions are welcome: ideas, patches, documentation, bug reports, 4 | complaints, etc! 5 | 6 | Programming is not a required skill, and there are many ways to help out! 7 | It is more important to us that you are able to contribute. 8 | 9 | That said, some basic guidelines, which you are free to ignore :) 10 | 11 | ## Want to learn? 12 | 13 | Want to lurk about and see what others are doing with Logstash? 14 | 15 | * The irc channel (#logstash on irc.freenode.org) is a good place for this 16 | * The [forum](https://discuss.elastic.co/c/logstash) is also 17 | great for learning from others. 18 | 19 | ## Got Questions? 20 | 21 | Have a problem you want Logstash to solve for you? 22 | 23 | * You can ask a question in the [forum](https://discuss.elastic.co/c/logstash) 24 | * Alternately, you are welcome to join the IRC channel #logstash on 25 | irc.freenode.org and ask for help there! 26 | 27 | ## Have an Idea or Feature Request? 28 | 29 | * File a ticket on [GitHub](https://github.com/elastic/logstash/issues). Please remember that GitHub is used only for issues and feature requests. If you have a general question, the [forum](https://discuss.elastic.co/c/logstash) or IRC would be the best place to ask. 30 | 31 | ## Something Not Working? Found a Bug? 32 | 33 | If you think you found a bug, it probably is a bug. 34 | 35 | * If it is a general Logstash or a pipeline issue, file it in [Logstash GitHub](https://github.com/elasticsearch/logstash/issues) 36 | * If it is specific to a plugin, please file it in the respective repository under [logstash-plugins](https://github.com/logstash-plugins) 37 | * or ask the [forum](https://discuss.elastic.co/c/logstash). 38 | 39 | # Contributing Documentation and Code Changes 40 | 41 | If you have a bugfix or new feature that you would like to contribute to 42 | logstash, and you think it will take more than a few minutes to produce the fix 43 | (ie; write code), it is worth discussing the change with the Logstash users and developers first! You can reach us via [GitHub](https://github.com/elastic/logstash/issues), the [forum](https://discuss.elastic.co/c/logstash), or via IRC (#logstash on freenode irc) 44 | Please note that Pull Requests without tests will not be merged. If you would like to contribute but do not have experience with writing tests, please ping us on IRC/forum or create a PR and ask our help. 45 | 46 | ## Contributing to plugins 47 | 48 | Check our [documentation](https://www.elastic.co/guide/en/logstash/current/contributing-to-logstash.html) on how to contribute to plugins or write your own! It is super easy! 49 | 50 | ## Contribution Steps 51 | 52 | 1. Test your changes! [Run](https://github.com/elastic/logstash#testing) the test suite 53 | 2. Please make sure you have signed our [Contributor License 54 | Agreement](https://www.elastic.co/contributor-agreement/). We are not 55 | asking you to assign copyright to us, but to give us the right to distribute 56 | your code without restriction. We ask this of all contributors in order to 57 | assure our users of the origin and continuing existence of the code. You 58 | only need to sign the CLA once. 59 | 3. Send a pull request! Push your changes to your fork of the repository and 60 | [submit a pull 61 | request](https://help.github.com/articles/using-pull-requests). In the pull 62 | request, describe what your changes do and mention any bugs/issues related 63 | to the pull request. 64 | 65 | 66 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | Please post all product and debugging questions on our [forum](https://discuss.elastic.co/c/logstash). Your questions will reach our wider community members there, and if we confirm that there is a bug, then we can open a new issue here. 2 | 3 | For all general issues, please provide the following details for fast resolution: 4 | 5 | - Version: 6 | - Operating System: 7 | - Config File (if you have sensitive info, please remove it): 8 | - Sample Data: 9 | - Steps to Reproduce: 10 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | Thanks for contributing to Logstash! If you haven't already signed our CLA, here's a handy link: https://www.elastic.co/contributor-agreement/ 2 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | build 2 | vendor 3 | tools 4 | .VERSION.mk 5 | *.gem 6 | *.lock 7 | *.swp 8 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | import: 2 | - logstash-plugins/.ci:travis/travis.yml@1.x -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | ## 6.2.8 2 | - [Doc] Added `raw_data_field` to docs. [#105](https://github.com/logstash-plugins/logstash-codec-cef/pull/105) 3 | 4 | ## 6.2.7 5 | - Fix: when decoding in an ecs_compatibility mode, timestamp-normalized fields now handle provided-but-empty values [#102](https://github.com/logstash-plugins/logstash-codec-cef/issues/102) 6 | 7 | ## 6.2.6 8 | - Fix: when decoding, escaped newlines and carriage returns in extension values are now correctly decoded into literal newlines and carriage returns respectively [#98](https://github.com/logstash-plugins/logstash-codec-cef/pull/98) 9 | - Fix: when decoding, non-CEF payloads are identified and intercepted to prevent data-loss and corruption. They now cause a descriptive log message to be emitted, and are emitted as their own `_cefparsefailure`-tagged event containing the original bytes in its `message` field [#99](https://github.com/logstash-plugins/logstash-codec-cef/issues/99) 10 | - Fix: when decoding while configured with a `delimiter`, flushing this codec now correctly consumes the remainder of its internal buffer. This resolves an issue where bytes that are written without a trailing delimiter could be lost [#100](https://github.com/logstash-plugins/logstash-codec-cef/issues/100) 11 | 12 | ## 6.2.5 13 | - [DOC] Update link to CEF implementation guide [#97](https://github.com/logstash-plugins/logstash-codec-cef/pull/97) 14 | 15 | ## 6.2.4 16 | - [DOC] Emphasize importance of delimiter setting for byte stream inputs [#95](https://github.com/logstash-plugins/logstash-codec-cef/pull/95) 17 | 18 | ## 6.2.3 19 | - Feat: event_factory support [#94](https://github.com/logstash-plugins/logstash-codec-cef/pull/94) 20 | 21 | ## 6.2.2 22 | - Fixed invalid Field Reference that could occur when ECS mode was enabled and the CEF field `fileHash` was parsed. 23 | - Added expanded mapping for numbered `deviceCustom*` and `deviceCustom*Label` fields so that all now include numbers 1 through 15. [#89](https://github.com/logstash-plugins/logstash-codec-cef/pull/89). 24 | 25 | ## 6.2.1 26 | - Added field mapping to docs. 27 | - Fixed ECS mapping of `deviceMacAddress` field. [#88](https://github.com/logstash-plugins/logstash-codec-cef/pull/88). 28 | 29 | ## 6.2.0 30 | - Introduce ECS Compatibility mode [#83](https://github.com/logstash-plugins/logstash-codec-cef/pull/83). 31 | 32 | ## 6.1.2 33 | - Added error log with full payload when something bad happens in decoding a message [#84](https://github.com/logstash-plugins/logstash-codec-cef/pull/84) 34 | 35 | ## 6.1.1 36 | - Improved encoding performance, especially when encoding many extension fields [#81](https://github.com/logstash-plugins/logstash-codec-cef/pull/81) 37 | 38 | ## 6.1.0 39 | - Fixed CEF short to long name translation for ahost/agentHostName field, according to documentation [#75](https://github.com/logstash-plugins/logstash-codec-cef/pull/75) 40 | 41 | ## 6.0.1 42 | - Fixed support for deep dot notation [#73](https://github.com/logstash-plugins/logstash-codec-cef/pull/73) 43 | 44 | ## 6.0.0 45 | - Removed obsolete `sev` and `deprecated_v1_fields` fields 46 | 47 | ## 5.0.7 48 | - Fixed minor doc inconsistencies (added reverse_mapping to options table, moved it to alpha order in option descriptions, fixed typo) 49 | [#60](https://github.com/logstash-plugins/logstash-codec-cef/pull/60) 50 | 51 | ## 5.0.6 52 | - Added reverse_mapping option, which can be used to make encoder compliant to spec [#51](https://github.com/logstash-plugins/logstash-codec-cef/pull/51) 53 | 54 | ## 5.0.5 55 | - Fix handling of malformed inputs that have illegal unescaped-equals characters in extension field values (restores behaviour from <= v5.0.3 in some edge-cases) ([#56](https://github.com/logstash-plugins/logstash-codec-cef/issues/56)) 56 | 57 | ## 5.0.4 58 | - Fix bug in parsing headers where certain legal escape sequences could cause non-escaped pipe characters to be ignored. 59 | - Fix bug in parsing extension values where a legal unescaped space in a field's value could be interpreted as a field separator (#54) 60 | - Add explicit handling for extension key names that use array-like syntax that isn't legal with the strict-mode field-reference parser (e.g., `fieldname[0]` becomes `[fieldname][0]`). 61 | 62 | ## 5.0.3 63 | - Fix handling of higher-plane UTF-8 characters in message body 64 | 65 | ## 5.0.2 66 | - Update gemspec summary 67 | 68 | ## 5.0.1 69 | - Fix some documentation issues 70 | 71 | ## 5.0.0 72 | - move `sev` and `deprecated_v1_fields` fields from deprecated to obsolete 73 | 74 | ## 4.1.2 75 | - added mapping for outcome = eventOutcome from CEF whitepaper (ref:p26/39) 76 | 77 | ## 4.1.1 78 | - changed rt from receiptTime to deviceReceiptTime (ref:p27/39) 79 | - changed tokenizer to include additional fields (ad.fieldname) 80 | 81 | ## 4.1.0 82 | - Add `delimiter` setting. This allows the decoder to be used with inputs like the TCP input where event delimiters are used. 83 | 84 | ## 4.0.0 85 | - Implements the dictionary translation for abbreviated CEF field names from chapter Chapter 2: ArcSight Extension Dictionary page 3 of 39 [CEF specification](https://protect724.hp.com/docs/DOC-1072). 86 | - add `_cefparsefailure` tag on failed decode 87 | 88 | ## 3.0.0 89 | - breaking: Updated plugin to use new Java Event APIs 90 | 91 | ## 2.1.3 92 | - Switch in-place sub! to sub when extracting `cef_version`. new Logstash Java Event does not support in-place String changes. 93 | 94 | ## 2.1.2 95 | - Depend on logstash-core-plugin-api instead of logstash-core, removing the need to mass update plugins on major releases of logstash 96 | 97 | ## 2.1.1 98 | - New dependency requirements for logstash-core for the 5.0 release 99 | 100 | ## 2.1.0 101 | - Implements `encode` with escaping according to the [CEF specification](https://protect724.hp.com/docs/DOC-1072). 102 | - Config option `sev` is deprecated, use `severity` instead. 103 | 104 | ## 2.0.0 105 | - Plugins were updated to follow the new shutdown semantic, this mainly allows Logstash to instruct input plugins to terminate gracefully, 106 | instead of using Thread.raise on the plugins' threads. Ref: https://github.com/elastic/logstash/pull/3895 107 | - Dependency on logstash-core update to 2.0 108 | -------------------------------------------------------------------------------- /CONTRIBUTORS: -------------------------------------------------------------------------------- 1 | The following is a list of people who have contributed ideas, code, bug 2 | reports, or in general have helped logstash along its way. 3 | 4 | Maintainers: 5 | * Lucas Bremgartner (breml) 6 | 7 | Contributors: 8 | * Aaron Mildenstein (untergeek) 9 | * Colin Surprenant (colinsurprenant) 10 | * Jason Kendall (coolacid) 11 | * Jordan Sissel (jordansissel) 12 | * João Duarte (jsvd) 13 | * Nick Ethier (nickethier) 14 | * Nicholas Lim (nich07as) 15 | * Pete Fritchman (fetep) 16 | * Pier-Hugues Pellerin (ph) 17 | * Karl Stoney (Stono) 18 | * Lucas Bremgartner (breml) 19 | 20 | Note: If you've sent us patches, bug reports, or otherwise contributed to 21 | Logstash, and you aren't on the list above and want to be, please let us know 22 | and we'll make sure you're here. Contributions from folks like you are what make 23 | open source awesome. 24 | -------------------------------------------------------------------------------- /Gemfile: -------------------------------------------------------------------------------- 1 | source 'https://rubygems.org' 2 | 3 | gemspec 4 | 5 | logstash_path = ENV["LOGSTASH_PATH"] || "../../logstash" 6 | use_logstash_source = ENV["LOGSTASH_SOURCE"] && ENV["LOGSTASH_SOURCE"].to_s == "1" 7 | 8 | if Dir.exist?(logstash_path) && use_logstash_source 9 | gem 'logstash-core', :path => "#{logstash_path}/logstash-core" 10 | gem 'logstash-core-plugin-api', :path => "#{logstash_path}/logstash-core-plugin-api" 11 | end 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright 2020 Elastic and contributors 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /NOTICE.TXT: -------------------------------------------------------------------------------- 1 | Elasticsearch 2 | Copyright 2012-2015 Elasticsearch 3 | 4 | This product includes software developed by The Apache Software 5 | Foundation (http://www.apache.org/). -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Logstash Plugin 2 | 3 | [![Travis Build Status](https://travis-ci.com/logstash-plugins/logstash-codec-cef.svg)](https://travis-ci.com/logstash-plugins/logstash-codec-cef) 4 | 5 | This is a plugin for [Logstash](https://github.com/elastic/logstash). 6 | 7 | It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way. 8 | 9 | ## Documentation 10 | 11 | Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/). 12 | 13 | - For formatting code or config example, you can use the asciidoc `[source,ruby]` directive 14 | - For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide 15 | 16 | ## Need Help? 17 | 18 | Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum. 19 | 20 | ## Developing with Docker 21 | You can use a docker container with all of the requirements pre installed to save you installing the development environment on your host. 22 | 23 | ### 1. Starting the container 24 | Simply type `docker-compose run devenv` and you'll be entered into the container. Then you'll need to do `jruby -S bundle install` to get all the dependencies down. 25 | 26 | ### 2. Running tests 27 | Once you've done #1 above, you can run your tests with `jruby -S bundle exec rspec` 28 | 29 | ## Developing without Docker 30 | 31 | ### 1. Plugin Developement and Testing 32 | 33 | #### Code 34 | - To get started, you'll need JRuby with the Bundler gem installed. 35 | 36 | - Create a new plugin or clone and existing from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example). 37 | 38 | - Install dependencies 39 | ```sh 40 | bundle install 41 | ``` 42 | 43 | #### Test 44 | 45 | - Update your dependencies 46 | 47 | ```sh 48 | bundle install 49 | ``` 50 | 51 | - Run tests 52 | 53 | ```sh 54 | bundle exec rspec 55 | ``` 56 | 57 | ### 2. Running your unpublished Plugin in Logstash 58 | 59 | #### 2.1 Run in a local Logstash clone 60 | 61 | - Edit Logstash `Gemfile` and add the local plugin path, for example: 62 | ```ruby 63 | gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome" 64 | ``` 65 | - Install plugin 66 | ```sh 67 | # Logstash 2.3 and higher 68 | bin/logstash-plugin install --no-verify 69 | 70 | # Prior to Logstash 2.3 71 | bin/plugin install --no-verify 72 | 73 | ``` 74 | - Run Logstash with your plugin 75 | ```sh 76 | bin/logstash -e 'filter {awesome {}}' 77 | ``` 78 | At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash. 79 | 80 | #### 2.2 Run in an installed Logstash 81 | 82 | You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory or you can build the gem and install it using: 83 | 84 | - Build your plugin gem 85 | ```sh 86 | gem build logstash-filter-awesome.gemspec 87 | ``` 88 | - Install the plugin from the Logstash home 89 | ```sh 90 | # Logstash 2.3 and higher 91 | bin/logstash-plugin install --no-verify 92 | 93 | # Prior to Logstash 2.3 94 | bin/plugin install --no-verify 95 | 96 | ``` 97 | - Start Logstash and proceed to test the plugin 98 | 99 | ## Contributing 100 | 101 | All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin. 102 | 103 | Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here. 104 | 105 | It is more important to the community that you are able to contribute. 106 | 107 | For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file. 108 | -------------------------------------------------------------------------------- /Rakefile: -------------------------------------------------------------------------------- 1 | @files=[] 2 | 3 | task :default do 4 | system("rake -T") 5 | end 6 | 7 | require "logstash/devutils/rake" 8 | -------------------------------------------------------------------------------- /docs/index.asciidoc: -------------------------------------------------------------------------------- 1 | :plugin: cef 2 | :type: codec 3 | 4 | /////////////////////////////////////////// 5 | START - GENERATED VARIABLES, DO NOT EDIT! 6 | /////////////////////////////////////////// 7 | :version: %VERSION% 8 | :release_date: %RELEASE_DATE% 9 | :changelog_url: %CHANGELOG_URL% 10 | :include_path: ../../../../logstash/docs/include 11 | /////////////////////////////////////////// 12 | END - GENERATED VARIABLES, DO NOT EDIT! 13 | /////////////////////////////////////////// 14 | 15 | [id="plugins-{type}s-{plugin}"] 16 | 17 | === Cef codec plugin 18 | 19 | include::{include_path}/plugin_header.asciidoc[] 20 | 21 | ==== Description 22 | 23 | Implementation of a Logstash codec for the ArcSight Common Event Format (CEF). 24 | It is based on https://www.microfocus.com/documentation/arcsight/arcsight-smartconnectors/pdfdoc/common-event-format-v25/common-event-format-v25.pdf[Implementing ArcSight CEF Revision 25, September 2017]. 25 | 26 | If this codec receives a payload from an input that is not a valid CEF message, then it 27 | produces an event with the payload as the 'message' field and a '_cefparsefailure' tag. 28 | 29 | ==== Compatibility with the Elastic Common Schema (ECS) 30 | 31 | This plugin can be used to decode CEF events _into_ the Elastic Common Schema, or to encode ECS-compatible events into CEF. 32 | It can also be used _without_ ECS, encoding and decoding events using only CEF-defined field names and keys. 33 | 34 | The ECS Compatibility mode for a specific plugin instance can be controlled by setting <> when defining the codec: 35 | 36 | [source,sh] 37 | ----- 38 | input { 39 | tcp { 40 | # ... 41 | codec => cef { 42 | ecs_compatibility => v1 43 | } 44 | } 45 | } 46 | ----- 47 | 48 | If left unspecified, the value of the `pipeline.ecs_compatibility` setting is used. 49 | 50 | ===== Timestamps and ECS compatiblity 51 | 52 | When decoding in ECS Compatibility Mode, timestamp-type fields are parsed and normalized 53 | to specific points on the timeline. 54 | 55 | Because the CEF format allows ambiguous timestamp formats, some reasonable assumptions are made: 56 | 57 | - When the timestamp does not include a year, we assume it happened in the recent past 58 | (or _very_ near future to accommodate out-of-sync clocks and timezone offsets). 59 | - When the timestamp does not include UTC-offset information, we use the event's 60 | timezone (`dtz` or `deviceTimeZone` field), or fall through to this plugin's 61 | <>. 62 | - Localized timestamps are parsed using the provided <>. 63 | 64 | [id="plugins-{type}s-{plugin}-field-mapping"] 65 | ===== Field mapping 66 | 67 | The header fields from each CEF payload is expanded to the following fields, depending on whether ECS is enabled. 68 | 69 | [id="plugins-{type}s-{plugin}-header-field"] 70 | ====== Header field mapping 71 | |===== 72 | |ECS Disabled | ECS Field 73 | 74 | |`cefVersion` |`[cef][version]` 75 | |`deviceVendor` |`[observer][vendor]` 76 | |`deviceProduct` |`[observer][product]` 77 | |`deviceVersion` |`[observer][version]` 78 | |`deviceEventClassId`|`[event][code]` 79 | |`name` |`[cef][name]` 80 | |`severity` |`[event][severity]` 81 | |===== 82 | 83 | When decoding CEF payloads with `ecs_compatibility => disabled`, the abbreviated CEF Keys found in extensions are expanded, and CEF Field Names are inserted at the root level of the event. 84 | 85 | When decoding in an ECS Compatibility mode, the ECS Fields are populated from the corresponding CEF Field Names _or_ CEF Keys found in the payload's extensions. 86 | 87 | The following is a mapping between these fields. 88 | 89 | // Templates for short-hand notes in the table below 90 | :cef-ambiguous-higher: pass:quotes[Multiple possible CEF fields map to this ECS Field. When decoding, the last entry encountered wins. When encoding, this field has _higher_ priority.] 91 | :cef-ambiguous-lower: pass:quotes[Multiple possible CEF fields map to this ECS Field. When decoding, the last entry encountered wins. When encoding, this field has _lower_ priority.] 92 | :cef-normalize-timestamp: pass:quotes[This field contains a timestamp. In ECS Compatibility Mode, it is parsed to a specific point in time.] 93 | :cef-plugin-config-condition: pass:quotes[When plugin configured with] 94 | 95 | 96 | [id="plugins-{type}s-{plugin}-ext-field"] 97 | ====== Extension field mapping 98 | |======================================================================================================================= 99 | |CEF Field Name (optional CEF Key) |ECS Field 100 | 101 | |`agentAddress` (`agt`) |`[agent][ip]` 102 | |`agentDnsDomain` |`[cef][agent][registered_domain]` 103 | 104 | {cef-ambiguous-higher} 105 | |`agentHostName` (`ahost`) |`[agent][name]` 106 | |`agentId` (`aid`) |`[agent][id]` 107 | |`agentMacAddress` (`amac`) |`[agent][mac]` 108 | |`agentNtDomain` |`[cef][agent][registered_domain]` 109 | 110 | {cef-ambiguous-lower} 111 | |`agentReceiptTime` (`art`) |`[event][created]` 112 | 113 | {cef-normalize-timestamp} 114 | |`agentTimeZone` (`atz`) |`[cef][agent][timezone]` 115 | |`agentTranslatedAddress` |`[cef][agent][nat][ip]` 116 | |`agentTranslatedZoneExternalID` |`[cef][agent][translated_zone][external_id]` 117 | |`agentTranslatedZoneURI` |`[cef][agent][translated_zone][uri]` 118 | |`agentType` (`at`) |`[agent][type]` 119 | |`agentVersion` (`av`) |`[agent][version]` 120 | |`agentZoneExternalID` |`[cef][agent][zone][external_id]` 121 | |`agentZoneURI` |`[cef][agent][zone][uri]` 122 | |`applicationProtocol` (`app`) |`[network][protocol]` 123 | |`baseEventCount` (`cnt`) |`[cef][base_event_count]` 124 | |`bytesIn` (`in`) |`[source][bytes]` 125 | |`bytesOut` (`out`) |`[destination][bytes]` 126 | |`categoryDeviceType` (`catdt`) |`[cef][device_type]` 127 | |`customerExternalID` |`[organization][id]` 128 | |`customerURI` |`[organization][name]` 129 | |`destinationAddress` (`dst`) |`[destination][ip]` 130 | |`destinationDnsDomain` |`[destination][registered_domain]` 131 | 132 | {cef-ambiguous-higher} 133 | |`destinationGeoLatitude` (`dlat`) |`[destination][geo][location][lat]` 134 | |`destinationGeoLongitude` (`dlong`) |`[destination][geo][location][lon]` 135 | |`destinationHostName` (`dhost`) |`[destination][domain]` 136 | |`destinationMacAddress` (`dmac`) |`[destination][mac]` 137 | |`destinationNtDomain` (`dntdom`) |`[destination][registered_domain]` 138 | 139 | {cef-ambiguous-lower} 140 | |`destinationPort` (`dpt`) |`[destination][port]` 141 | |`destinationProcessId` (`dpid`) |`[destination][process][pid]` 142 | |`destinationProcessName` (`dproc`) |`[destination][process][name]` 143 | |`destinationServiceName` |`[destination][service][name]` 144 | |`destinationTranslatedAddress` |`[destination][nat][ip]` 145 | |`destinationTranslatedPort` |`[destination][nat][port]` 146 | |`destinationTranslatedZoneExternalID` |`[cef][destination][translated_zone][external_id]` 147 | |`destinationTranslatedZoneURI` |`[cef][destination][translated_zone][uri]` 148 | |`destinationUserId` (`duid`) |`[destination][user][id]` 149 | |`destinationUserName` (`duser`) |`[destination][user][name]` 150 | |`destinationUserPrivileges` (`dpriv`) |`[destination][user][group][name]` 151 | |`destinationZoneExternalID` |`[cef][destination][zone][external_id]` 152 | |`destinationZoneURI` |`[cef][destination][zone][uri]` 153 | |`deviceAction` (`act`) |`[event][action]` 154 | .2+|`deviceAddress` (`dvc`) |`[observer][ip]` 155 | 156 | {cef-plugin-config-condition} `device => observer` 157 | |`[host][ip]` 158 | 159 | {cef-plugin-config-condition} `device => host` 160 | |`deviceCustomFloatingPoint1` (`cfp1`) |`[cef][device_custom_floating_point_1][value]` 161 | |`deviceCustomFloatingPoint1Label` (`cfp1Label`)|`[cef][device_custom_floating_point_1][label]` 162 | |`deviceCustomFloatingPoint2` (`cfp2`) |`[cef][device_custom_floating_point_2][value]` 163 | |`deviceCustomFloatingPoint2Label` (`cfp2Label`)|`[cef][device_custom_floating_point_2][label]` 164 | |`deviceCustomFloatingPoint3` (`cfp3`) |`[cef][device_custom_floating_point_3][value]` 165 | |`deviceCustomFloatingPoint3Label` (`cfp3Label`)|`[cef][device_custom_floating_point_3][label]` 166 | |`deviceCustomFloatingPoint4` (`cfp4`) |`[cef][device_custom_floating_point_4][value]` 167 | |`deviceCustomFloatingPoint4Label` (`cfp4Label`)|`[cef][device_custom_floating_point_4][label]` 168 | |`deviceCustomFloatingPoint5` (`cfp5`) |`[cef][device_custom_floating_point_5][value]` 169 | |`deviceCustomFloatingPoint5Label` (`cfp5Label`)|`[cef][device_custom_floating_point_5][label]` 170 | |`deviceCustomFloatingPoint6` (`cfp6`) |`[cef][device_custom_floating_point_6][value]` 171 | |`deviceCustomFloatingPoint6Label` (`cfp6Label`)|`[cef][device_custom_floating_point_6][label]` 172 | |`deviceCustomFloatingPoint7` (`cfp7`) |`[cef][device_custom_floating_point_7][value]` 173 | |`deviceCustomFloatingPoint7Label` (`cfp7Label`)|`[cef][device_custom_floating_point_7][label]` 174 | |`deviceCustomFloatingPoint8` (`cfp8`) |`[cef][device_custom_floating_point_8][value]` 175 | |`deviceCustomFloatingPoint8Label` (`cfp8Label`)|`[cef][device_custom_floating_point_8][label]` 176 | |`deviceCustomFloatingPoint9` (`cfp9`) |`[cef][device_custom_floating_point_9][value]` 177 | |`deviceCustomFloatingPoint9Label` (`cfp9Label`)|`[cef][device_custom_floating_point_9][label]` 178 | |`deviceCustomFloatingPoint10` (`cfp10`) |`[cef][device_custom_floating_point_10][value]` 179 | |`deviceCustomFloatingPoint10Label` (`cfp10Label`)|`[cef][device_custom_floating_point_10][label]` 180 | |`deviceCustomFloatingPoint11` (`cfp11`) |`[cef][device_custom_floating_point_11][value]` 181 | |`deviceCustomFloatingPoint11Label` (`cfp11Label`)|`[cef][device_custom_floating_point_11][label]` 182 | |`deviceCustomFloatingPoint12` (`cfp12`) |`[cef][device_custom_floating_point_12][value]` 183 | |`deviceCustomFloatingPoint12Label` (`cfp12Label`)|`[cef][device_custom_floating_point_12][label]` 184 | |`deviceCustomFloatingPoint13` (`cfp13`) |`[cef][device_custom_floating_point_13][value]` 185 | |`deviceCustomFloatingPoint13Label` (`cfp13Label`)|`[cef][device_custom_floating_point_13][label]` 186 | |`deviceCustomFloatingPoint14` (`cfp14`) |`[cef][device_custom_floating_point_14][value]` 187 | |`deviceCustomFloatingPoint14Label` (`cfp14Label`)|`[cef][device_custom_floating_point_14][label]` 188 | |`deviceCustomFloatingPoint15` (`cfp15`) |`[cef][device_custom_floating_point_15][value]` 189 | |`deviceCustomFloatingPoint15Label` (`cfp15Label`)|`[cef][device_custom_floating_point_15][label]` 190 | |`deviceCustomIPv6Address1` (`c6a1`) |`[cef][device_custom_ipv6_address_1][value]` 191 | |`deviceCustomIPv6Address1Label` (`c6a1Label`) |`[cef][device_custom_ipv6_address_1][label]` 192 | |`deviceCustomIPv6Address2` (`c6a2`) |`[cef][device_custom_ipv6_address_2][value]` 193 | |`deviceCustomIPv6Address2Label` (`c6a2Label`) |`[cef][device_custom_ipv6_address_2][label]` 194 | |`deviceCustomIPv6Address3` (`c6a3`) |`[cef][device_custom_ipv6_address_3][value]` 195 | |`deviceCustomIPv6Address3Label` (`c6a3Label`) |`[cef][device_custom_ipv6_address_3][label]` 196 | |`deviceCustomIPv6Address4` (`c6a4`) |`[cef][device_custom_ipv6_address_4][value]` 197 | |`deviceCustomIPv6Address4Label` (`c6a4Label`) |`[cef][device_custom_ipv6_address_4][label]` 198 | |`deviceCustomIPv6Address5` (`c6a5`) |`[cef][device_custom_ipv6_address_5][value]` 199 | |`deviceCustomIPv6Address5Label` (`c6a5Label`) |`[cef][device_custom_ipv6_address_5][label]` 200 | |`deviceCustomIPv6Address6` (`c6a6`) |`[cef][device_custom_ipv6_address_6][value]` 201 | |`deviceCustomIPv6Address6Label` (`c6a6Label`) |`[cef][device_custom_ipv6_address_6][label]` 202 | |`deviceCustomIPv6Address7` (`c6a7`) |`[cef][device_custom_ipv6_address_7][value]` 203 | |`deviceCustomIPv6Address7Label` (`c6a7Label`) |`[cef][device_custom_ipv6_address_7][label]` 204 | |`deviceCustomIPv6Address8` (`c6a8`) |`[cef][device_custom_ipv6_address_8][value]` 205 | |`deviceCustomIPv6Address8Label` (`c6a8Label`) |`[cef][device_custom_ipv6_address_8][label]` 206 | |`deviceCustomIPv6Address9` (`c6a9`) |`[cef][device_custom_ipv6_address_9][value]` 207 | |`deviceCustomIPv6Address9Label` (`c6a9Label`) |`[cef][device_custom_ipv6_address_9][label]` 208 | |`deviceCustomIPv6Address10` (`c6a10`) |`[cef][device_custom_ipv6_address_10][value]` 209 | |`deviceCustomIPv6Address10Label` (`c6a10Label`)|`[cef][device_custom_ipv6_address_10][label]` 210 | |`deviceCustomIPv6Address11` (`c6a11`) |`[cef][device_custom_ipv6_address_11][value]` 211 | |`deviceCustomIPv6Address11Label` (`c6a11Label`)|`[cef][device_custom_ipv6_address_11][label]` 212 | |`deviceCustomIPv6Address12` (`c6a12`) |`[cef][device_custom_ipv6_address_12][value]` 213 | |`deviceCustomIPv6Address12Label` (`c6a12Label`)|`[cef][device_custom_ipv6_address_12][label]` 214 | |`deviceCustomIPv6Address13` (`c6a13`) |`[cef][device_custom_ipv6_address_13][value]` 215 | |`deviceCustomIPv6Address13Label` (`c6a13Label`)|`[cef][device_custom_ipv6_address_13][label]` 216 | |`deviceCustomIPv6Address14` (`c6a14`) |`[cef][device_custom_ipv6_address_14][value]` 217 | |`deviceCustomIPv6Address14Label` (`c6a14Label`)|`[cef][device_custom_ipv6_address_14][label]` 218 | |`deviceCustomIPv6Address15` (`c6a15`) |`[cef][device_custom_ipv6_address_15][value]` 219 | |`deviceCustomIPv6Address15Label` (`c6a15Label`)|`[cef][device_custom_ipv6_address_15][label]` 220 | |`deviceCustomNumber1` (`cn1`) |`[cef][device_custom_number_1][value]` 221 | |`deviceCustomNumber1Label` (`cn1Label`) |`[cef][device_custom_number_1][label]` 222 | |`deviceCustomNumber2` (`cn2`) |`[cef][device_custom_number_2][value]` 223 | |`deviceCustomNumber2Label` (`cn2Label`) |`[cef][device_custom_number_2][label]` 224 | |`deviceCustomNumber3` (`cn3`) |`[cef][device_custom_number_3][value]` 225 | |`deviceCustomNumber3Label` (`cn3Label`) |`[cef][device_custom_number_3][label]` 226 | |`deviceCustomNumber4` (`cn4`) |`[cef][device_custom_number_4][value]` 227 | |`deviceCustomNumber4Label` (`cn4Label`) |`[cef][device_custom_number_4][label]` 228 | |`deviceCustomNumber5` (`cn5`) |`[cef][device_custom_number_5][value]` 229 | |`deviceCustomNumber5Label` (`cn5Label`) |`[cef][device_custom_number_5][label]` 230 | |`deviceCustomNumber6` (`cn6`) |`[cef][device_custom_number_6][value]` 231 | |`deviceCustomNumber6Label` (`cn6Label`) |`[cef][device_custom_number_6][label]` 232 | |`deviceCustomNumber7` (`cn7`) |`[cef][device_custom_number_7][value]` 233 | |`deviceCustomNumber7Label` (`cn7Label`) |`[cef][device_custom_number_7][label]` 234 | |`deviceCustomNumber8` (`cn8`) |`[cef][device_custom_number_8][value]` 235 | |`deviceCustomNumber8Label` (`cn8Label`) |`[cef][device_custom_number_8][label]` 236 | |`deviceCustomNumber9` (`cn9`) |`[cef][device_custom_number_9][value]` 237 | |`deviceCustomNumber9Label` (`cn9Label`) |`[cef][device_custom_number_9][label]` 238 | |`deviceCustomNumber10` (`cn10`) |`[cef][device_custom_number_10][value]` 239 | |`deviceCustomNumber10Label` (`cn10Label`) |`[cef][device_custom_number_10][label]` 240 | |`deviceCustomNumber11` (`cn11`) |`[cef][device_custom_number_11][value]` 241 | |`deviceCustomNumber11Label` (`cn11Label`) |`[cef][device_custom_number_11][label]` 242 | |`deviceCustomNumber12` (`cn12`) |`[cef][device_custom_number_12][value]` 243 | |`deviceCustomNumber12Label` (`cn12Label`) |`[cef][device_custom_number_12][label]` 244 | |`deviceCustomNumber13` (`cn13`) |`[cef][device_custom_number_13][value]` 245 | |`deviceCustomNumber13Label` (`cn13Label`) |`[cef][device_custom_number_13][label]` 246 | |`deviceCustomNumber14` (`cn14`) |`[cef][device_custom_number_14][value]` 247 | |`deviceCustomNumber14Label` (`cn14Label`) |`[cef][device_custom_number_14][label]` 248 | |`deviceCustomNumber15` (`cn15`) |`[cef][device_custom_number_15][value]` 249 | |`deviceCustomNumber15Label` (`cn15Label`) |`[cef][device_custom_number_15][label]` 250 | |`deviceCustomString1` (`cs1`) |`[cef][device_custom_string_1][value]` 251 | |`deviceCustomString1Label` (`cs1Label`) |`[cef][device_custom_string_1][label]` 252 | |`deviceCustomString2` (`cs2`) |`[cef][device_custom_string_2][value]` 253 | |`deviceCustomString2Label` (`cs2Label`) |`[cef][device_custom_string_2][label]` 254 | |`deviceCustomString3` (`cs3`) |`[cef][device_custom_string_3][value]` 255 | |`deviceCustomString3Label` (`cs3Label`) |`[cef][device_custom_string_3][label]` 256 | |`deviceCustomString4` (`cs4`) |`[cef][device_custom_string_4][value]` 257 | |`deviceCustomString4Label` (`cs4Label`) |`[cef][device_custom_string_4][label]` 258 | |`deviceCustomString5` (`cs5`) |`[cef][device_custom_string_5][value]` 259 | |`deviceCustomString5Label` (`cs5Label`) |`[cef][device_custom_string_5][label]` 260 | |`deviceCustomString6` (`cs6`) |`[cef][device_custom_string_6][value]` 261 | |`deviceCustomString6Label` (`cs6Label`) |`[cef][device_custom_string_6][label]` 262 | |`deviceCustomString7` (`cs7`) |`[cef][device_custom_string_7][value]` 263 | |`deviceCustomString7Label` (`cs7Label`) |`[cef][device_custom_string_7][label]` 264 | |`deviceCustomString8` (`cs8`) |`[cef][device_custom_string_8][value]` 265 | |`deviceCustomString8Label` (`cs8Label`) |`[cef][device_custom_string_8][label]` 266 | |`deviceCustomString9` (`cs9`) |`[cef][device_custom_string_9][value]` 267 | |`deviceCustomString9Label` (`cs9Label`) |`[cef][device_custom_string_9][label]` 268 | |`deviceCustomString10` (`cs10`) |`[cef][device_custom_string_10][value]` 269 | |`deviceCustomString10Label` (`cs10Label`) |`[cef][device_custom_string_10][label]` 270 | |`deviceCustomString11` (`cs11`) |`[cef][device_custom_string_11][value]` 271 | |`deviceCustomString11Label` (`cs11Label`) |`[cef][device_custom_string_11][label]` 272 | |`deviceCustomString12` (`cs12`) |`[cef][device_custom_string_12][value]` 273 | |`deviceCustomString12Label` (`cs12Label`) |`[cef][device_custom_string_12][label]` 274 | |`deviceCustomString13` (`cs13`) |`[cef][device_custom_string_13][value]` 275 | |`deviceCustomString13Label` (`cs13Label`) |`[cef][device_custom_string_13][label]` 276 | |`deviceCustomString14` (`cs14`) |`[cef][device_custom_string_14][value]` 277 | |`deviceCustomString14Label` (`cs14Label`) |`[cef][device_custom_string_14][label]` 278 | |`deviceCustomString15` (`cs15`) |`[cef][device_custom_string_15][value]` 279 | |`deviceCustomString15Label` (`cs15Label`) |`[cef][device_custom_string_15][label]` 280 | |`deviceDirection` |`[network][direction]` 281 | .2+|`deviceDnsDomain` |`[observer][registered_domain]` 282 | 283 | {cef-plugin-config-condition} `device => observer`. 284 | |`[host][registered_domain]` 285 | 286 | {cef-plugin-config-condition} `device => host`. 287 | |`deviceEventCategory` (`cat`) |`[cef][category]` 288 | .2+|`deviceExternalId` |`[observer][name]` 289 | 290 | {cef-plugin-config-condition} `device => observer`. 291 | |`[host][id]` 292 | 293 | {cef-plugin-config-condition} `device => host`. 294 | |`deviceFacility` |`[log][syslog][facility][code]` 295 | .2+|`deviceHostName` (`dvchost`) |`[observer][hostname]` 296 | 297 | {cef-plugin-config-condition} `device => observer`. 298 | |`[host][name]` 299 | 300 | {cef-plugin-config-condition} `device => host`. 301 | |`deviceInboundInterface` |`[observer][ingress][interface][name]` 302 | .2+|`deviceMacAddress` (`dvcmac`) |`[observer][mac]` 303 | 304 | {cef-plugin-config-condition} `device => observer`. 305 | |`[host][mac]` 306 | 307 | {cef-plugin-config-condition} `device => host`. 308 | |`deviceNtDomain` |`[cef][nt_domain]` 309 | |`deviceOutboundInterface` |`[observer][egress][interface][name]` 310 | |`devicePayloadId` |`[cef][payload_id]` 311 | |`deviceProcessId` (`dvcpid`) |`[process][pid]` 312 | |`deviceProcessName` |`[process][name]` 313 | |`deviceReceiptTime` (`rt`) |`@timestamp` 314 | 315 | {cef-normalize-timestamp} 316 | |`deviceTimeZone` (`dtz`) |`[event][timezone]` 317 | |`deviceTranslatedAddress` |`[host][nat][ip]` 318 | |`deviceTranslatedZoneExternalID` |`[cef][translated_zone][external_id]` 319 | |`deviceTranslatedZoneURI` |`[cef][translated_zone][uri]` 320 | |`deviceVersion` |`[observer][version]` 321 | |`deviceZoneExternalID` |`[cef][zone][external_id]` 322 | |`deviceZoneURI` |`[cef][zone][uri]` 323 | |`endTime` (`end`) |`[event][end]` 324 | 325 | {cef-normalize-timestamp} 326 | |`eventId` |`[event][id]` 327 | |`eventOutcome` (`outcome`) |`[event][outcome]` 328 | |`externalId` |`[cef][external_id]` 329 | |`fileCreateTime` |`[file][created]` 330 | |`fileHash` |`[file][hash]` 331 | |`fileId` |`[file][inode]` 332 | |`fileModificationTime` |`[file][mtime]` 333 | 334 | {cef-normalize-timestamp} 335 | |`fileName` (`fname`) |`[file][name]` 336 | |`filePath` |`[file][path]` 337 | |`filePermission` |`[file][group]` 338 | |`fileSize` (`fsize`) |`[file][size]` 339 | |`fileType` |`[file][extension]` 340 | |`managerReceiptTime` (`mrt`) |`[event][ingested]` 341 | 342 | {cef-normalize-timestamp} 343 | |`message` (`msg`) |`[message]` 344 | |`oldFileCreateTime` |`[cef][old_file][created]` 345 | 346 | {cef-normalize-timestamp} 347 | |`oldFileHash` |`[cef][old_file][hash]` 348 | |`oldFileId` |`[cef][old_file][inode]` 349 | |`oldFileModificationTime` |`[cef][old_file][mtime]` 350 | 351 | {cef-normalize-timestamp} 352 | |`oldFileName` |`[cef][old_file][name]` 353 | |`oldFilePath` |`[cef][old_file][path]` 354 | |`oldFilePermission` |`[cef][old_file][group]` 355 | |`oldFileSize` |`[cef][old_file][size]` 356 | |`oldFileType` |`[cef][old_file][extension]` 357 | |`rawEvent` |`[event][original]` 358 | |`Reason` (`reason`) |`[event][reason]` 359 | |`requestClientApplication` |`[user_agent][original]` 360 | |`requestContext` |`[http][request][referrer]` 361 | |`requestCookies` |`[cef][request][cookies]` 362 | |`requestMethod` |`[http][request][method]` 363 | |`requestUrl` (`request`) |`[url][original]` 364 | |`sourceAddress` (`src`) |`[source][ip]` 365 | |`sourceDnsDomain` |`[source][registered_domain]` 366 | 367 | {cef-ambiguous-higher} 368 | |`sourceGeoLatitude` (`slat`) |`[source][geo][location][lat]` 369 | |`sourceGeoLongitude` (`slong`) |`[source][geo][location][lon]` 370 | |`sourceHostName` (`shost`) |`[source][domain]` 371 | |`sourceMacAddress` (`smac`) |`[source][mac]` 372 | |`sourceNtDomain` (`sntdom`) |`[source][registered_domain]` 373 | 374 | {cef-ambiguous-lower} 375 | |`sourcePort` (`spt`) |`[source][port]` 376 | |`sourceProcessId` (`spid`) |`[source][process][pid]` 377 | |`sourceProcessName` (`sproc`) |`[source][process][name]` 378 | |`sourceServiceName` |`[source][service][name]` 379 | |`sourceTranslatedAddress` |`[source][nat][ip]` 380 | |`sourceTranslatedPort` |`[source][nat][port]` 381 | |`sourceTranslatedZoneExternalID` |`[cef][source][translated_zone][external_id]` 382 | |`sourceTranslatedZoneURI` |`[cef][source][translated_zone][uri]` 383 | |`sourceUserId` (`suid`) |`[source][user][id]` 384 | |`sourceUserName` (`suser`) |`[source][user][name]` 385 | |`sourceUserPrivileges` (`spriv`) |`[source][user][group][name]` 386 | |`sourceZoneExternalID` |`[cef][source][zone][external_id]` 387 | |`sourceZoneURI` |`[cef][source][zone][uri]` 388 | |`startTime` (`start`) |`[event][start]` 389 | 390 | {cef-normalize-timestamp} 391 | |`transportProtocol` (`proto`) |`[network][transport]` 392 | |`type` |`[cef][type]` 393 | |======================================================================================================================= 394 | 395 | 396 | [id="plugins-{type}s-{plugin}-options"] 397 | ==== Cef Codec Configuration Options 398 | 399 | [cols="<,<,<",options="header",] 400 | |======================================================================= 401 | |Setting |Input type|Required 402 | | <> |<>|No 403 | | <> |<>|No 404 | | <> |<>|No 405 | | <> |<>|No 406 | | <> |<>|No 407 | | <> |<>|No 408 | | <> |<>|No 409 | | <> |<>|No 410 | | <> |<>|No 411 | | <> |<>|No 412 | | <> |<>|No 413 | | <> |<>|No 414 | | <> |<>|No 415 | | <> |<>|No 416 | |======================================================================= 417 | 418 |   419 | 420 | [id="plugins-{type}s-{plugin}-default_timezone"] 421 | ===== `default_timezone` 422 | 423 | * Value type is <> 424 | * Supported values are: 425 | ** https://en.wikipedia.org/wiki/List_of_tz_database_time_zones[Timezone names] (such as `Europe/Moscow`, `America/Argentina/Buenos_Aires`) 426 | ** UTC Offsets (such as `-08:00`, `+03:00`) 427 | * The default value is your system time zone 428 | * This option has no effect when _encoding_. 429 | 430 | When parsing timestamp fields in ECS mode and encountering timestamps that 431 | do not contain UTC-offset information, the `deviceTimeZone` (`dtz`) field 432 | from the CEF payload is used to interpret the given time. If the event does 433 | not include timezone information, this `default_timezone` is used instead. 434 | 435 | [id="plugins-{type}s-{plugin}-delimiter"] 436 | ===== `delimiter` 437 | 438 | * Value type is <> 439 | * There is no default value for this setting. 440 | 441 | If your input puts a delimiter between each CEF event, you'll want to set 442 | this to be that delimiter. 443 | 444 | NOTE: Byte stream inputs such as TCP require delimiter to be specified. Otherwise input can be truncated or incorrectly split. 445 | 446 | **Example** 447 | 448 | [source,ruby] 449 | ----- 450 | input { 451 | tcp { 452 | codec => cef { delimiter => "\r\n" } 453 | # ... 454 | } 455 | } 456 | ----- 457 | 458 | This setting allows the following character sequences to have special meaning: 459 | 460 | * `\\r` (backslash "r") - means carriage return (ASCII 0x0D) 461 | * `\\n` (backslash "n") - means newline (ASCII 0x0A) 462 | 463 | [id="plugins-{type}s-{plugin}-device"] 464 | ===== `device` 465 | 466 | * Value type is <> 467 | * Supported values are: 468 | ** `observer`: indicates that device-specific fields represent the device used to _observe_ the event. 469 | ** `host`: indicates that device-specific fields represent the device on which the event _occurred_. 470 | * The default value for this setting is `observer`. 471 | * Option has no effect when < disabled`>>. 472 | * Option has no effect when _encoding_ 473 | 474 | Defines a set of device-specific CEF fields as either representing the device on which an 475 | event _occurred_, or merely the device from which the event was _observed_. 476 | This causes the relevant fields to be routed to either the `host` or the `observer` 477 | top-level groupings. 478 | 479 | If the codec handles data from a variety of sources, the ECS recommendation is to use `observer`. 480 | 481 | [id="plugins-{type}s-{plugin}-ecs_compatibility"] 482 | ===== `ecs_compatibility` 483 | 484 | * Value type is <> 485 | * Supported values are: 486 | ** `disabled`: uses CEF-defined field names in the event (e.g., `bytesIn`, `sourceAddress`) 487 | ** `v1`: supports ECS-compatible event fields (e.g., `[source][bytes]`, `[source][ip]`) 488 | * Default value depends on which version of Logstash is running: 489 | ** When Logstash provides a `pipeline.ecs_compatibility` setting, its value is used as the default 490 | ** Otherwise, the default value is `disabled`. 491 | 492 | Controls this plugin's compatibility with the {ecs-ref}[Elastic Common Schema (ECS)]. 493 | 494 | [id="plugins-{type}s-{plugin}-fields"] 495 | ===== `fields` 496 | 497 | * Value type is <> 498 | * Default value is `[]` 499 | * Option has no effect when _decoding_ 500 | 501 | When this codec is used in an Output Plugin, a list of fields can be provided to be included in CEF extensions part as key/value pairs. 502 | 503 | [id="plugins-{type}s-{plugin}-locale"] 504 | ===== `locale` 505 | 506 | * Value type is <> 507 | * Supported values are: 508 | ** Abbreviated language_COUNTRY format (e.g., `en_GB`, `pt_BR`) 509 | ** Valid https://tools.ietf.org/html/bcp47[IETF BCP 47] language tag (e.g., `zh-cmn-Hans-CN`) 510 | * The default value is your system locale 511 | * Option has no effect when _encoding_ 512 | 513 | When parsing timestamp fields in ECS mode and encountering timestamps in 514 | a localized format, this `locale` is used to interpret locale-specific strings 515 | such as month abbreviations. 516 | 517 | [id="plugins-{type}s-{plugin}-name"] 518 | ===== `name` 519 | 520 | * Value type is <> 521 | * Default value is `"Logstash"` 522 | * Option has no effect when _decoding_ 523 | 524 | When this codec is used in an Output Plugin, this option can be used to specify the 525 | value of the name field in the CEF header. The new value can include `%{foo}` strings 526 | to help you build a new value from other parts of the event. 527 | 528 | [id="plugins-{type}s-{plugin}-product"] 529 | ===== `product` 530 | 531 | * Value type is <> 532 | * Default value is `"Logstash"` 533 | * Option has no effect when _decoding_ 534 | 535 | When this codec is used in an Output Plugin, this option can be used to specify the 536 | value of the device product field in CEF header. The new value can include `%{foo}` strings 537 | to help you build a new value from other parts of the event. 538 | 539 | [id="plugins-{type}s-{plugin}-raw_data_field"] 540 | ===== `raw_data_field` 541 | 542 | * Value type is <> 543 | * There is no default value for this setting 544 | 545 | Store the raw data to the field, for example `[event][original]`. Existing target field will be overriden. 546 | 547 | [id="plugins-{type}s-{plugin}-reverse_mapping"] 548 | ===== `reverse_mapping` 549 | 550 | * Value type is <> 551 | * Default value is `false` 552 | * Option has no effect when _decoding_ 553 | 554 | Set to true to adhere to the specifications and encode using the CEF key name (short name) for the CEF field names. 555 | 556 | [id="plugins-{type}s-{plugin}-severity"] 557 | ===== `severity` 558 | 559 | * Value type is <> 560 | * Default value is `"6"` 561 | * Option has no effect when _decoding_ 562 | 563 | When this codec is used in an Output Plugin, this option can be used to specify the 564 | value of the severity field in CEF header. The new value can include `%{foo}` strings 565 | to help you build a new value from other parts of the event. 566 | 567 | Defined as field of type string to allow sprintf. The value will be validated 568 | to be an integer in the range from 0 to 10 (including). 569 | All invalid values will be mapped to the default of 6. 570 | 571 | [id="plugins-{type}s-{plugin}-signature"] 572 | ===== `signature` 573 | 574 | * Value type is <> 575 | * Default value is `"Logstash"` 576 | * Option has no effect when _decoding_ 577 | 578 | When this codec is used in an Output Plugin, this option can be used to specify the 579 | value of the signature ID field in CEF header. The new value can include `%{foo}` strings 580 | to help you build a new value from other parts of the event. 581 | 582 | [id="plugins-{type}s-{plugin}-vendor"] 583 | ===== `vendor` 584 | 585 | * Value type is <> 586 | * Default value is `"Elasticsearch"` 587 | * Option has no effect when _decoding_ 588 | 589 | When this codec is used in an Output Plugin, this option can be used to specify the 590 | value of the device vendor field in CEF header. The new value can include `%{foo}` strings 591 | to help you build a new value from other parts of the event. 592 | 593 | [id="plugins-{type}s-{plugin}-version"] 594 | ===== `version` 595 | 596 | * Value type is <> 597 | * Default value is `"1.0"` 598 | * Option has no effect when _decoding_ 599 | 600 | When this codec is used in an Output Plugin, this option can be used to specify the 601 | value of the device version field in CEF header. The new value can include `%{foo}` strings 602 | to help you build a new value from other parts of the event. 603 | -------------------------------------------------------------------------------- /lib/logstash/codecs/cef.rb: -------------------------------------------------------------------------------- 1 | # encoding: utf-8 2 | require "logstash/util/buftok" 3 | require "logstash/util/charset" 4 | require "logstash/codecs/base" 5 | require "json" 6 | require "time" 7 | 8 | require 'logstash/plugin_mixins/ecs_compatibility_support' 9 | require 'logstash/plugin_mixins/event_support/event_factory_adapter' 10 | 11 | # Implementation of a Logstash codec for the ArcSight Common Event Format (CEF) 12 | # Based on Revision 20 of Implementing ArcSight CEF, dated from June 05, 2013 13 | # https://community.saas.hpe.com/dcvta86296/attachments/dcvta86296/connector-documentation/1116/1/CommonEventFormatv23.pdf 14 | # 15 | # If this codec receives a payload from an input that is not a valid CEF message, then it will 16 | # produce an event with the payload as the 'message' field and a '_cefparsefailure' tag. 17 | class LogStash::Codecs::CEF < LogStash::Codecs::Base 18 | config_name "cef" 19 | 20 | include LogStash::PluginMixins::ECSCompatibilitySupport(:disabled, :v1, :v8 => :v1) 21 | include LogStash::PluginMixins::EventSupport::EventFactoryAdapter 22 | 23 | InvalidTimestamp = Class.new(StandardError) 24 | 25 | # Device vendor field in CEF header. The new value can include `%{foo}` strings 26 | # to help you build a new value from other parts of the event. 27 | config :vendor, :validate => :string, :default => "Elasticsearch" 28 | 29 | # Device product field in CEF header. The new value can include `%{foo}` strings 30 | # to help you build a new value from other parts of the event. 31 | config :product, :validate => :string, :default => "Logstash" 32 | 33 | # Device version field in CEF header. The new value can include `%{foo}` strings 34 | # to help you build a new value from other parts of the event. 35 | config :version, :validate => :string, :default => "1.0" 36 | 37 | # Signature ID field in CEF header. The new value can include `%{foo}` strings 38 | # to help you build a new value from other parts of the event. 39 | config :signature, :validate => :string, :default => "Logstash" 40 | 41 | # Name field in CEF header. The new value can include `%{foo}` strings 42 | # to help you build a new value from other parts of the event. 43 | config :name, :validate => :string, :default => "Logstash" 44 | 45 | # Severity field in CEF header. The new value can include `%{foo}` strings 46 | # to help you build a new value from other parts of the event. 47 | # 48 | # Defined as field of type string to allow sprintf. The value will be validated 49 | # to be an integer in the range from 0 to 10 (including). 50 | # All invalid values will be mapped to the default of 6. 51 | config :severity, :validate => :string, :default => "6" 52 | 53 | # Fields to be included in CEV extension part as key/value pairs 54 | config :fields, :validate => :array, :default => [] 55 | 56 | # When encoding to CEF, set this to true to adhere to the specifications and 57 | # encode using the CEF key name (short name) for the CEF field names. 58 | # Defaults to false to preserve previous behaviour that was to use the long 59 | # version of the CEF field names. 60 | config :reverse_mapping, :validate => :boolean, :default => false 61 | 62 | # If your input puts a delimiter between each CEF event, you'll want to set 63 | # this to be that delimiter. 64 | # 65 | # For example, with the TCP input, you probably want to put this: 66 | # 67 | # input { 68 | # tcp { 69 | # codec => cef { delimiter => "\r\n" } 70 | # # ... 71 | # } 72 | # } 73 | # 74 | # This setting allows the following character sequences to have special meaning: 75 | # 76 | # * `\\r` (backslash "r") - means carriage return (ASCII 0x0D) 77 | # * `\\n` (backslash "n") - means newline (ASCII 0x0A) 78 | config :delimiter, :validate => :string 79 | 80 | # When parsing timestamps that do not include a UTC offset in payloads that do not 81 | # include the device's timezone, the default timezone is used. 82 | # If none is provided the system timezone is used. 83 | config :default_timezone, :validate => :string 84 | 85 | # The locale is used to parse abbreviated month names from some CEF timestamp 86 | # formats. 87 | # If none is provided, the system default is used. 88 | config :locale, :validate => :string 89 | 90 | # If raw_data_field is set, during decode of an event an additional field with 91 | # the provided name is added, which contains the raw data. 92 | config :raw_data_field, :validate => :string 93 | 94 | # Defines whether a set of device-specific CEF fields represent the _observer_, 95 | # or the actual `host` on which the event occurred. If this codec handles a mix, 96 | # it is safe to use the default `observer`. 97 | config :device, :validate => %w(observer host), :default => 'observer' 98 | 99 | # A CEF Header is a sequence of zero or more: 100 | # - backslash-escaped pipes; OR 101 | # - backslash-escaped backslashes; OR 102 | # - non-pipe characters 103 | HEADER_PATTERN = /(?:\\\||\\\\|[^|])*?/ 104 | 105 | # Cache of a scanner pattern that _captures_ a HEADER followed by EOF or an unescaped pipe 106 | HEADER_NEXT_FIELD_PATTERN = /(#{HEADER_PATTERN})#{Regexp.quote('|')}/ 107 | 108 | # Cache of a gsub pattern that matches a backslash-escaped backslash or backslash-escaped pipe, _capturing_ the escaped character 109 | HEADER_ESCAPE_CAPTURE = /\\([\\|])/ 110 | 111 | # While the original CEF spec calls out that extension keys must be alphanumeric and must not contain spaces, 112 | # in practice many "CEF" producers like the Arcsight smart connector produce non-legal keys including underscores, 113 | # commas, periods, and square-bracketed index offsets. 114 | # 115 | # To support this, we look for a specific sequence of characters that are followed by an equals sign. This pattern 116 | # will correctly identify all strictly-legal keys, and will also match those that include a dot-joined "subkeys" and 117 | # square-bracketed array indexing 118 | # 119 | # That sequence must begin with one or more `\w` (word: alphanumeric + underscore), which _optionally_ may be followed 120 | # by one or more "subkey" sequences and an optional square-bracketed index. 121 | # 122 | # To be understood by this implementation, a "subkey" sequence must consist of a literal dot (`.`) followed by one or 123 | # more characters that do not convey semantic meaning within CEF (e.g., literal-dot (`.`), literal-equals (`=`), 124 | # whitespace (`\s`), literal-pipe (`|`), literal-backslash ('\'), or literal-square brackets (`[` or `]`)). 125 | EXTENSION_KEY_PATTERN = /(?:\w+(?:\.[^\.=\s\|\\\[\]]+)*(?:\[[0-9]+\])?(?==))/ 126 | 127 | # Some CEF extension keys seen in the wild use an undocumented array-like syntax that may not be compatible with 128 | # the Event API's strict-mode FieldReference parser (e.g., `fieldname[0]`). 129 | # Cache of a `String#sub` pattern matching array-like syntax and capturing both the base field name and the 130 | # array-indexing portion so we can convert to a valid FieldReference (e.g., `[fieldname][0]`). 131 | EXTENSION_KEY_ARRAY_CAPTURE = /^([^\[\]]+)((?:\[[0-9]+\])+)$/ # '[\1]\2' 132 | 133 | # In extensions, spaces may be included in an extension value without any escaping, 134 | # so an extension value is a sequence of zero or more: 135 | # - non-whitespace character; OR 136 | # - runs of whitespace that are NOT followed by something that looks like a key-equals sequence 137 | EXTENSION_VALUE_PATTERN = /(?:\S|\s++(?!#{EXTENSION_KEY_PATTERN}=))*/ 138 | 139 | # Cache of a pattern that _captures_ the NEXT extension field key/value pair 140 | EXTENSION_NEXT_KEY_VALUE_PATTERN = /^(#{EXTENSION_KEY_PATTERN})=(#{EXTENSION_VALUE_PATTERN})\s*/ 141 | 142 | ## 143 | # @see CEF#sanitize_header_field 144 | HEADER_FIELD_SANITIZER_MAPPING = { 145 | "\\" => "\\\\", 146 | "|" => "\\|", 147 | "\n" => " ", 148 | "\r" => " ", 149 | } 150 | HEADER_FIELD_SANITIZER_PATTERN = Regexp.union(HEADER_FIELD_SANITIZER_MAPPING.keys) 151 | private_constant :HEADER_FIELD_SANITIZER_MAPPING, :HEADER_FIELD_SANITIZER_PATTERN 152 | 153 | ## 154 | # @see CEF#sanitize_extension_val 155 | EXTENSION_VALUE_SANITIZER_MAPPING = { 156 | "\\" => "\\\\", 157 | "=" => "\\=", 158 | "\n" => "\\n", 159 | "\r" => "\\n", 160 | }.freeze 161 | EXTENSION_VALUE_SANITIZER_PATTERN = Regexp.union(EXTENSION_VALUE_SANITIZER_MAPPING.keys) 162 | private_constant :EXTENSION_VALUE_SANITIZER_MAPPING, :EXTENSION_VALUE_SANITIZER_PATTERN 163 | 164 | 165 | LITERAL_BACKSLASH = "\\".freeze 166 | private_constant :LITERAL_BACKSLASH 167 | LITERAL_NEWLINE = "\n".freeze 168 | private_constant :LITERAL_NEWLINE 169 | LITERAL_CARRIAGE_RETURN = "\r".freeze 170 | private_constant :LITERAL_CARRIAGE_RETURN 171 | 172 | ## 173 | # @see CEF#desanitize_extension_val 174 | EXTENSION_VALUE_SANITIZER_REVERSE_MAPPING = { 175 | LITERAL_BACKSLASH+LITERAL_BACKSLASH => LITERAL_BACKSLASH, 176 | LITERAL_BACKSLASH+'=' => '=', 177 | LITERAL_BACKSLASH+'n' => LITERAL_NEWLINE, 178 | LITERAL_BACKSLASH+'r' => LITERAL_CARRIAGE_RETURN, 179 | }.freeze 180 | EXTENSION_VALUE_SANITIZER_REVERSE_PATTERN = Regexp.union(EXTENSION_VALUE_SANITIZER_REVERSE_MAPPING.keys) 181 | private_constant :EXTENSION_VALUE_SANITIZER_REVERSE_MAPPING, :EXTENSION_VALUE_SANITIZER_REVERSE_PATTERN 182 | 183 | 184 | CEF_PREFIX = 'CEF:'.freeze 185 | 186 | public 187 | def initialize(params={}) 188 | super(params) 189 | 190 | # CEF input MUST be UTF-8, per the CEF White Paper that serves as the format's specification: 191 | # https://web.archive.org/web/20160422182529/https://kc.mcafee.com/resources/sites/MCAFEE/content/live/CORP_KNOWLEDGEBASE/78000/KB78712/en_US/CEF_White_Paper_20100722.pdf 192 | @utf8_charset = LogStash::Util::Charset.new('UTF-8') 193 | @utf8_charset.logger = self.logger 194 | 195 | if @delimiter 196 | # Logstash configuration doesn't have built-in support for escaping, 197 | # so we implement it here. Feature discussion for escaping is here: 198 | # https://github.com/elastic/logstash/issues/1645 199 | @delimiter = @delimiter.gsub("\\r", "\r").gsub("\\n", "\n") 200 | @buffer = FileWatch::BufferedTokenizer.new(@delimiter) 201 | end 202 | 203 | require_relative 'cef/timestamp_normalizer' 204 | @timestamp_normalizer = TimestampNormalizer.new(locale: @locale, timezone: @default_timezone) 205 | 206 | generate_header_fields! 207 | generate_mappings! 208 | end 209 | 210 | public 211 | def decode(data, &block) 212 | if @delimiter 213 | @logger.trace("Buffering #{data.bytesize}B of data") if @logger.trace? 214 | @buffer.extract(data).each do |line| 215 | @logger.trace("Decoding #{line.bytesize + @delimiter.bytesize}B of buffered data") if @logger.trace? 216 | handle(line, &block) 217 | end 218 | else 219 | @logger.trace("Decoding #{data.bytesize}B of unbuffered data") if @logger.trace? 220 | handle(data, &block) 221 | end 222 | end 223 | 224 | def flush(&block) 225 | if @delimiter && (remainder = @buffer.flush) 226 | @logger.trace("Flushing #{remainder.bytesize}B of buffered data") if @logger.trace? 227 | handle(remainder, &block) unless remainder.empty? 228 | end 229 | end 230 | 231 | def handle(data, &block) 232 | original_data = data.dup 233 | event = event_factory.new_event 234 | event.set(raw_data_field, data) unless raw_data_field.nil? 235 | 236 | @utf8_charset.convert(data) 237 | 238 | # Several of the many operations in the rest of this method will fail when they encounter UTF8-tagged strings 239 | # that contain invalid byte sequences; fail early to avoid wasted work. 240 | fail('invalid byte sequence in UTF-8') unless data.valid_encoding? 241 | 242 | # Strip any quotations at the start and end, flex connectors seem to send this 243 | if data[0] == "\"" 244 | data = data[1..-2] 245 | end 246 | 247 | # Use a scanning parser to capture the HEADER_FIELDS 248 | unprocessed_data = data.chomp 249 | if unprocessed_data.include?(LITERAL_NEWLINE) 250 | fail("message is not valid CEF because it contains unescaped newline characters; " + 251 | "use the `delimiter` setting to enable in-codec buffering and delimiter-splitting") 252 | end 253 | @header_fields.each_with_index do |field_name, idx| 254 | match_data = HEADER_NEXT_FIELD_PATTERN.match(unprocessed_data) 255 | if match_data.nil? 256 | fail("message is not valid CEF; found #{idx} of 7 required pipe-terminated header fields") 257 | end 258 | 259 | escaped_field_value = match_data[1] 260 | next if escaped_field_value.nil? 261 | 262 | # process legal header escape sequences 263 | unescaped_field_value = escaped_field_value.gsub(HEADER_ESCAPE_CAPTURE, '\1') 264 | 265 | event.set(field_name, unescaped_field_value) 266 | unprocessed_data = match_data.post_match 267 | end 268 | 269 | #Remainder is message 270 | message = unprocessed_data 271 | 272 | # Try and parse out the syslog header if there is one 273 | cef_version_field = @header_fields[0] 274 | if (cef_version = event.get(cef_version_field)).include?(' ') 275 | split_cef_version = cef_version.rpartition(' ') 276 | event.set(@syslog_header, split_cef_version[0]) 277 | event.set(cef_version_field, split_cef_version[2]) 278 | end 279 | 280 | # Get rid of the CEF bit in the version 281 | event.set(cef_version_field, delete_cef_prefix(event.get(cef_version_field))) 282 | 283 | # Use a scanning parser to capture the Extension Key/Value Pairs 284 | if message && !message.empty? 285 | message = message.strip 286 | extension_fields = {} 287 | 288 | while (match = message.match(EXTENSION_NEXT_KEY_VALUE_PATTERN)) 289 | extension_field_key, raw_extension_field_value = match.captures 290 | message = match.post_match 291 | 292 | # expand abbreviated extension field keys 293 | extension_field_key = @decode_mapping.fetch(extension_field_key, extension_field_key) 294 | 295 | # convert extension field name to strict legal field_reference, fixing field names with ambiguous array-like syntax 296 | extension_field_key = extension_field_key.sub(EXTENSION_KEY_ARRAY_CAPTURE, '[\1]\2') if extension_field_key.end_with?(']') 297 | 298 | # process legal extension field value escapes 299 | extension_field_value = desanitize_extension_val(raw_extension_field_value) 300 | 301 | extension_fields[extension_field_key] = extension_field_value 302 | end 303 | if !message.empty? 304 | fail("invalid extensions; keyless value present `#{message}`") 305 | end 306 | 307 | # in ECS mode, normalize timestamps including timezone. 308 | if ecs_compatibility != :disabled 309 | device_timezone = extension_fields['[event][timezone]'] 310 | @timestamp_fields.each do |timestamp_field_name| 311 | raw_timestamp = extension_fields.delete(timestamp_field_name) or next 312 | value = normalize_timestamp(raw_timestamp, device_timezone) 313 | event.set(timestamp_field_name, value) 314 | end 315 | end 316 | 317 | extension_fields.each do |field_key, field_value| 318 | event.set(field_key, field_value) 319 | end 320 | end 321 | 322 | yield event 323 | rescue => e 324 | @logger.error("Failed to decode CEF payload. Generating failure event with payload in message field.", 325 | log_metadata(:original_data => original_data)) 326 | yield event_factory.new_event("message" => data, "tags" => ["_cefparsefailure"]) 327 | end 328 | 329 | public 330 | def encode(event) 331 | # "CEF:0|Elasticsearch|Logstash|1.0|Signature|Name|Sev|" 332 | 333 | vendor = sanitize_header_field(event.sprintf(@vendor)) 334 | vendor = self.class.get_config["vendor"][:default] if vendor.empty? 335 | 336 | product = sanitize_header_field(event.sprintf(@product)) 337 | product = self.class.get_config["product"][:default] if product.empty? 338 | 339 | version = sanitize_header_field(event.sprintf(@version)) 340 | version = self.class.get_config["version"][:default] if version.empty? 341 | 342 | signature = sanitize_header_field(event.sprintf(@signature)) 343 | signature = self.class.get_config["signature"][:default] if signature.empty? 344 | 345 | name = sanitize_header_field(event.sprintf(@name)) 346 | name = self.class.get_config["name"][:default] if name.empty? 347 | 348 | severity = sanitize_severity(event, @severity) 349 | 350 | # Should also probably set the fields sent 351 | header = ["CEF:0", vendor, product, version, signature, name, severity].join("|") 352 | values = @fields.map { |fieldname| get_value(fieldname, event) }.compact.join(" ") 353 | 354 | @on_event.call(event, "#{header}|#{values}#{@delimiter}") 355 | end 356 | 357 | private 358 | 359 | def generate_header_fields! 360 | # @header_fields is an _ordered_ set of fields. 361 | @header_fields = [ 362 | ecs_select[disabled: 'cefVersion', v1: '[cef][version]'], 363 | ecs_select[disabled: 'deviceVendor', v1: '[observer][vendor]'], 364 | ecs_select[disabled: 'deviceProduct', v1: '[observer][product]'], 365 | ecs_select[disabled: 'deviceVersion', v1: '[observer][version]'], 366 | ecs_select[disabled: 'deviceEventClassId', v1: '[event][code]'], 367 | ecs_select[disabled: 'name', v1: '[cef][name]'], 368 | ecs_select[disabled: 'severity', v1: '[event][severity]'] 369 | ].map(&:freeze).freeze 370 | # the @syslog_header is the field name used when a syslog header preceeds the CEF Version. 371 | @syslog_header = ecs_select[disabled:'syslog',v1:'[log][syslog][header]'] 372 | end 373 | 374 | ## 375 | # produces log metadata, injecting the current exception and log-level-relevant backtraces 376 | # @param context [Hash{Symbol=>Object}]: the base context 377 | def log_metadata(context={}) 378 | return context unless $! 379 | 380 | exception_context = {} 381 | exception_context[:exception] = "#{$!.class}: #{$!.message}" 382 | exception_context[:backtrace] = $!.backtrace if @logger.debug? 383 | 384 | exception_context.merge(context) 385 | end 386 | 387 | class CEFField 388 | ## 389 | # @param name [String]: the full CEF name of a field 390 | # @param key [String] (optional): an abbreviated CEF key to use when encoding a value with `reverse_mapping => true` 391 | # when left unspecified, the `key` is the field's `name`. 392 | # @param ecs_field [String] (optional): an ECS-compatible field reference to use, with square-bracket syntax. 393 | # when left unspecified, the `ecs_field` is the field's `name`. 394 | # @param legacy [String] (optional): a legacy CEF name to support in pass-through. 395 | # in decoding mode without ECS, field name will be used as-provided. 396 | # in encoding mode without ECS when provided to `fields` and `reverse_mapping => false`, 397 | # field name will be used as-provided. 398 | # @param priority [Integer] (optional): when multiple fields resolve to the same ECS field name, the field with the 399 | # highest `prioriry` will be used by the encoder. 400 | def initialize(name, key: name, ecs_field: name, legacy:nil, priority:0, normalize:nil) 401 | @name = name 402 | @key = key 403 | @ecs_field = ecs_field 404 | @legacy = legacy 405 | @priority = priority 406 | @normalize = normalize 407 | end 408 | attr_reader :name 409 | attr_reader :key 410 | attr_reader :ecs_field 411 | attr_reader :legacy 412 | attr_reader :priority 413 | attr_reader :normalize 414 | end 415 | 416 | def generate_mappings! 417 | encode_mapping = Hash.new 418 | decode_mapping = Hash.new 419 | timestamp_fields = Set.new 420 | [ 421 | CEFField.new("agentAddress", key: "agt", ecs_field: "[agent][ip]"), 422 | CEFField.new("agentDnsDomain", ecs_field: "[cef][agent][registered_domain]", priority: 10), 423 | CEFField.new("agentHostName", key: "ahost", ecs_field: "[agent][name]"), 424 | CEFField.new("agentId", key: "aid", ecs_field: "[agent][id]"), 425 | CEFField.new("agentMacAddress", key: "amac", ecs_field: "[agent][mac]"), 426 | CEFField.new("agentNtDomain", ecs_field: "[cef][agent][registered_domain]"), 427 | CEFField.new("agentReceiptTime", key: "art", ecs_field: "[event][created]", normalize: :timestamp), 428 | CEFField.new("agentTimeZone", key: "atz", ecs_field: "[cef][agent][timezone]"), 429 | CEFField.new("agentTranslatedAddress", ecs_field: "[cef][agent][nat][ip]"), 430 | CEFField.new("agentTranslatedZoneExternalID", ecs_field: "[cef][agent][translated_zone][external_id]"), 431 | CEFField.new("agentTranslatedZoneURI", ecs_field: "[cef][agent][translated_zone][uri]"), 432 | CEFField.new("agentType", key: "at", ecs_field: "[agent][type]"), 433 | CEFField.new("agentVersion", key: "av", ecs_field: "[agent][version]"), 434 | CEFField.new("agentZoneExternalID", ecs_field: "[cef][agent][zone][external_id]"), 435 | CEFField.new("agentZoneURI", ecs_field: "[cef][agent][zone][uri]"), 436 | CEFField.new("applicationProtocol", key: "app", ecs_field: "[network][protocol]"), 437 | CEFField.new("baseEventCount", key: "cnt", ecs_field: "[cef][base_event_count]"), 438 | CEFField.new("bytesIn", key: "in", ecs_field: "[source][bytes]"), 439 | CEFField.new("bytesOut", key: "out", ecs_field: "[destination][bytes]"), 440 | CEFField.new("categoryDeviceType", key: "catdt", ecs_field: "[cef][device_type]"), 441 | CEFField.new("customerExternalID", ecs_field: "[organization][id]"), 442 | CEFField.new("customerURI", ecs_field: "[organization][name]"), 443 | CEFField.new("destinationAddress", key: "dst", ecs_field: "[destination][ip]"), 444 | CEFField.new("destinationDnsDomain", ecs_field: "[destination][registered_domain]", priority: 10), 445 | CEFField.new("destinationGeoLatitude", key: "dlat", ecs_field: "[destination][geo][location][lat]", legacy: "destinationLatitude"), 446 | CEFField.new("destinationGeoLongitude", key: "dlong", ecs_field: "[destination][geo][location][lon]", legacy: "destinationLongitude"), 447 | CEFField.new("destinationHostName", key: "dhost", ecs_field: "[destination][domain]"), 448 | CEFField.new("destinationMacAddress", key: "dmac", ecs_field: "[destination][mac]"), 449 | CEFField.new("destinationNtDomain", key: "dntdom", ecs_field: "[destination][registered_domain]"), 450 | CEFField.new("destinationPort", key: "dpt", ecs_field: "[destination][port]"), 451 | CEFField.new("destinationProcessId", key: "dpid", ecs_field: "[destination][process][pid]"), 452 | CEFField.new("destinationProcessName", key: "dproc", ecs_field: "[destination][process][name]"), 453 | CEFField.new("destinationServiceName", ecs_field: "[destination][service][name]"), 454 | CEFField.new("destinationTranslatedAddress", ecs_field: "[destination][nat][ip]"), 455 | CEFField.new("destinationTranslatedPort", ecs_field: "[destination][nat][port]"), 456 | CEFField.new("destinationTranslatedZoneExternalID", ecs_field: "[cef][destination][translated_zone][external_id]"), 457 | CEFField.new("destinationTranslatedZoneURI", ecs_field: "[cef][destination][translated_zone][uri]"), 458 | CEFField.new("destinationUserId", key: "duid", ecs_field: "[destination][user][id]"), 459 | CEFField.new("destinationUserName", key: "duser", ecs_field: "[destination][user][name]"), 460 | CEFField.new("destinationUserPrivileges", key: "dpriv", ecs_field: "[destination][user][group][name]"), 461 | CEFField.new("destinationZoneExternalID", ecs_field: "[cef][destination][zone][external_id]"), 462 | CEFField.new("destinationZoneURI", ecs_field: "[cef][destination][zone][uri]"), 463 | CEFField.new("deviceAction", key: "act", ecs_field: "[event][action]"), 464 | CEFField.new("deviceAddress", key: "dvc", ecs_field: "[#{@device}][ip]"), 465 | (1..15).map do |idx| 466 | [ 467 | CEFField.new("deviceCustomFloatingPoint#{idx}", key: "cfp#{idx}", ecs_field: "[cef][device_custom_floating_point_#{idx}][value]"), 468 | CEFField.new("deviceCustomFloatingPoint#{idx}Label", key: "cfp#{idx}Label", ecs_field: "[cef][device_custom_floating_point_#{idx}][label]"), 469 | CEFField.new("deviceCustomIPv6Address#{idx}", key: "c6a#{idx}", ecs_field: "[cef][device_custom_ipv6_address_#{idx}][value]"), 470 | CEFField.new("deviceCustomIPv6Address#{idx}Label", key: "c6a#{idx}Label", ecs_field: "[cef][device_custom_ipv6_address_#{idx}][label]"), 471 | CEFField.new("deviceCustomNumber#{idx}", key: "cn#{idx}", ecs_field: "[cef][device_custom_number_#{idx}][value]"), 472 | CEFField.new("deviceCustomNumber#{idx}Label", key: "cn#{idx}Label", ecs_field: "[cef][device_custom_number_#{idx}][label]"), 473 | CEFField.new("deviceCustomString#{idx}", key: "cs#{idx}", ecs_field: "[cef][device_custom_string_#{idx}][value]"), 474 | CEFField.new("deviceCustomString#{idx}Label", key: "cs#{idx}Label", ecs_field: "[cef][device_custom_string_#{idx}][label]"), 475 | ] 476 | end, 477 | CEFField.new("deviceDirection", ecs_field: "[network][direction]"), 478 | CEFField.new("deviceDnsDomain", ecs_field: "[#{@device}][registered_domain]", priority: 10), 479 | CEFField.new("deviceEventCategory", key: "cat", ecs_field: "[cef][category]"), 480 | CEFField.new("deviceExternalId", ecs_field: (@device == 'host' ? "[host][id]" : "[observer][name]")), 481 | CEFField.new("deviceFacility", ecs_field: "[log][syslog][facility][code]"), 482 | CEFField.new("deviceHostName", key: "dvchost", ecs_field: (@device == 'host' ? '[host][name]' : '[observer][hostname]')), 483 | CEFField.new("deviceInboundInterface", ecs_field: "[observer][ingress][interface][name]"), 484 | CEFField.new("deviceMacAddress", key: "dvcmac", ecs_field: "[#{@device}][mac]"), 485 | CEFField.new("deviceNtDomain", ecs_field: "[cef][nt_domain]"), 486 | CEFField.new("deviceOutboundInterface", ecs_field: "[observer][egress][interface][name]"), 487 | CEFField.new("devicePayloadId", ecs_field: "[cef][payload_id]"), 488 | CEFField.new("deviceProcessId", key: "dvcpid", ecs_field: "[process][pid]"), 489 | CEFField.new("deviceProcessName", ecs_field: "[process][name]"), 490 | CEFField.new("deviceReceiptTime", key: "rt", ecs_field: "@timestamp", normalize: :timestamp), 491 | CEFField.new("deviceTimeZone", key: "dtz", ecs_field: "[event][timezone]", legacy: "destinationTimeZone"), 492 | CEFField.new("deviceTranslatedAddress", ecs_field: "[host][nat][ip]"), 493 | CEFField.new("deviceTranslatedZoneExternalID", ecs_field: "[cef][translated_zone][external_id]"), 494 | CEFField.new("deviceTranslatedZoneURI", ecs_field: "[cef][translated_zone][uri]"), 495 | CEFField.new("deviceVersion", ecs_field: "[observer][version]"), 496 | CEFField.new("deviceZoneExternalID", ecs_field: "[cef][zone][external_id]"), 497 | CEFField.new("deviceZoneURI", ecs_field: "[cef][zone][uri]"), 498 | CEFField.new("endTime", key: "end", ecs_field: "[event][end]", normalize: :timestamp), 499 | CEFField.new("eventId", ecs_field: "[event][id]"), 500 | CEFField.new("eventOutcome", key: "outcome", ecs_field: "[event][outcome]"), 501 | CEFField.new("externalId", ecs_field: "[cef][external_id]"), 502 | CEFField.new("fileCreateTime", ecs_field: "[file][created]"), 503 | CEFField.new("fileHash", ecs_field: "[file][hash]"), 504 | CEFField.new("fileId", ecs_field: "[file][inode]"), 505 | CEFField.new("fileModificationTime", ecs_field: "[file][mtime]", normalize: :timestamp), 506 | CEFField.new("fileName", key: "fname", ecs_field: "[file][name]"), 507 | CEFField.new("filePath", ecs_field: "[file][path]"), 508 | CEFField.new("filePermission", ecs_field: "[file][group]"), 509 | CEFField.new("fileSize", key: "fsize", ecs_field: "[file][size]"), 510 | CEFField.new("fileType", ecs_field: "[file][extension]"), 511 | CEFField.new("managerReceiptTime", key: "mrt", ecs_field: "[event][ingested]", normalize: :timestamp), 512 | CEFField.new("message", key: "msg", ecs_field: "[message]"), 513 | CEFField.new("oldFileCreateTime", ecs_field: "[cef][old_file][created]", normalize: :timestamp), 514 | CEFField.new("oldFileHash", ecs_field: "[cef][old_file][hash]"), 515 | CEFField.new("oldFileId", ecs_field: "[cef][old_file][inode]"), 516 | CEFField.new("oldFileModificationTime", ecs_field: "[cef][old_file][mtime]", normalize: :timestamp), 517 | CEFField.new("oldFileName", ecs_field: "[cef][old_file][name]"), 518 | CEFField.new("oldFilePath", ecs_field: "[cef][old_file][path]"), 519 | CEFField.new("oldFilePermission", ecs_field: "[cef][old_file][group]"), 520 | CEFField.new("oldFileSize", ecs_field: "[cef][old_file][size]"), 521 | CEFField.new("oldFileType", ecs_field: "[cef][old_file][extension]"), 522 | CEFField.new("rawEvent", ecs_field: "[event][original]"), 523 | CEFField.new("Reason", key: "reason", ecs_field: "[event][reason]"), 524 | CEFField.new("requestClientApplication", ecs_field: "[user_agent][original]"), 525 | CEFField.new("requestContext", ecs_field: "[http][request][referrer]"), 526 | CEFField.new("requestCookies", ecs_field: "[cef][request][cookies]"), 527 | CEFField.new("requestMethod", ecs_field: "[http][request][method]"), 528 | CEFField.new("requestUrl", key: "request", ecs_field: "[url][original]"), 529 | CEFField.new("sourceAddress", key: "src", ecs_field: "[source][ip]"), 530 | CEFField.new("sourceDnsDomain", ecs_field: "[source][registered_domain]", priority: 10), 531 | CEFField.new("sourceGeoLatitude", key: "slat", ecs_field: "[source][geo][location][lat]", legacy: "sourceLatitude"), 532 | CEFField.new("sourceGeoLongitude", key: "slong", ecs_field: "[source][geo][location][lon]", legacy: "sourceLongitude"), 533 | CEFField.new("sourceHostName", key: "shost", ecs_field: "[source][domain]"), 534 | CEFField.new("sourceMacAddress", key: "smac", ecs_field: "[source][mac]"), 535 | CEFField.new("sourceNtDomain", key: "sntdom", ecs_field: "[source][registered_domain]"), 536 | CEFField.new("sourcePort", key: "spt", ecs_field: "[source][port]"), 537 | CEFField.new("sourceProcessId", key: "spid", ecs_field: "[source][process][pid]"), 538 | CEFField.new("sourceProcessName", key: "sproc", ecs_field: "[source][process][name]"), 539 | CEFField.new("sourceServiceName", ecs_field: "[source][service][name]"), 540 | CEFField.new("sourceTranslatedAddress", ecs_field: "[source][nat][ip]"), 541 | CEFField.new("sourceTranslatedPort", ecs_field: "[source][nat][port]"), 542 | CEFField.new("sourceTranslatedZoneExternalID", ecs_field: "[cef][source][translated_zone][external_id]"), 543 | CEFField.new("sourceTranslatedZoneURI", ecs_field: "[cef][source][translated_zone][uri]"), 544 | CEFField.new("sourceUserId", key: "suid", ecs_field: "[source][user][id]"), 545 | CEFField.new("sourceUserName", key: "suser", ecs_field: "[source][user][name]"), 546 | CEFField.new("sourceUserPrivileges", key: "spriv", ecs_field: "[source][user][group][name]"), 547 | CEFField.new("sourceZoneExternalID", ecs_field: "[cef][source][zone][external_id]"), 548 | CEFField.new("sourceZoneURI", ecs_field: "[cef][source][zone][uri]"), 549 | CEFField.new("startTime", key: "start", ecs_field: "[event][start]", normalize: :timestamp), 550 | CEFField.new("transportProtocol", key: "proto", ecs_field: "[network][transport]"), 551 | CEFField.new("type", ecs_field: "[cef][type]"), 552 | ].flatten.sort_by(&:priority).each do |cef| 553 | field_name = ecs_select[disabled:cef.name, v1:cef.ecs_field] 554 | 555 | # whether the source is a cef_key or cef_name, normalize to field_name 556 | decode_mapping[cef.key] = field_name 557 | decode_mapping[cef.name] = field_name 558 | 559 | # whether source is a cef_name or a field_name, normalize to target 560 | normalized_encode_target = @reverse_mapping ? cef.key : cef.name 561 | encode_mapping[field_name] = normalized_encode_target 562 | encode_mapping[cef.name] = normalized_encode_target unless cef.name == field_name 563 | 564 | # if a field has an alias, normalize pass-through 565 | if cef.legacy 566 | decode_mapping[cef.legacy] = ecs_select[disabled:cef.legacy, v1:cef.ecs_field] 567 | encode_mapping[cef.legacy] = @reverse_mapping ? cef.key : cef.legacy 568 | end 569 | 570 | timestamp_fields << field_name if ecs_compatibility != :disabled && cef.normalize == :timestamp 571 | end 572 | 573 | @decode_mapping = decode_mapping.dup.freeze 574 | @encode_mapping = encode_mapping.dup.freeze 575 | @timestamp_fields = timestamp_fields.dup.freeze 576 | end 577 | 578 | # Escape pipes and backslashes in the header. Equal signs are ok. 579 | # Newlines are forbidden. 580 | def sanitize_header_field(value) 581 | value.to_s 582 | .gsub("\r\n", "\n") 583 | .gsub(HEADER_FIELD_SANITIZER_PATTERN, HEADER_FIELD_SANITIZER_MAPPING) 584 | end 585 | 586 | # Keys must be made up of a single word, with no spaces 587 | # must be alphanumeric 588 | def sanitize_extension_key(value) 589 | value.to_s 590 | .gsub(/[^a-zA-Z0-9]/, "") 591 | end 592 | 593 | # Escape equal signs in the extensions. Canonicalize newlines. 594 | # CEF spec leaves it up to us to choose \r or \n for newline. 595 | # We choose \n as the default. 596 | def sanitize_extension_val(value) 597 | value.to_s 598 | .gsub("\r\n", "\n") 599 | .gsub(EXTENSION_VALUE_SANITIZER_PATTERN, EXTENSION_VALUE_SANITIZER_MAPPING) 600 | end 601 | 602 | def desanitize_extension_val(value) 603 | value.to_s.gsub(EXTENSION_VALUE_SANITIZER_REVERSE_PATTERN, EXTENSION_VALUE_SANITIZER_REVERSE_MAPPING) 604 | end 605 | 606 | def normalize_timestamp(value, device_timezone_name) 607 | return nil if value.nil? || value.to_s.strip.empty? 608 | 609 | normalized = @timestamp_normalizer.normalize(value, device_timezone_name).iso8601(9) 610 | 611 | LogStash::Timestamp.new(normalized) 612 | rescue => e 613 | @logger.error("Failed to parse CEF timestamp value `#{value}` (#{e.message})") 614 | raise InvalidTimestamp.new("Not a valid CEF timestamp: `#{value}`") 615 | end 616 | 617 | def get_value(fieldname, event) 618 | val = event.get(fieldname) 619 | 620 | return nil if val.nil? 621 | 622 | key = @encode_mapping.fetch(fieldname, fieldname) 623 | key = sanitize_extension_key(key) 624 | 625 | case val 626 | when Array, Hash 627 | return "#{key}=#{sanitize_extension_val(val.to_json)}" 628 | when LogStash::Timestamp 629 | return "#{key}=#{val.to_s}" 630 | else 631 | return "#{key}=#{sanitize_extension_val(val)}" 632 | end 633 | end 634 | 635 | def sanitize_severity(event, severity) 636 | severity = sanitize_header_field(event.sprintf(severity)).strip 637 | severity = self.class.get_config["severity"][:default] unless valid_severity?(severity) 638 | severity.to_i.to_s 639 | end 640 | 641 | def valid_severity?(sev) 642 | f = Float(sev) 643 | # check if it's an integer or a float with no remainder 644 | # and if the value is between 0 and 10 (inclusive) 645 | (f % 1 == 0) && f.between?(0,10) 646 | rescue TypeError, ArgumentError 647 | false 648 | end 649 | 650 | if Gem::Requirement.new(">= 2.5.0").satisfied_by? Gem::Version.new(RUBY_VERSION) 651 | def delete_cef_prefix(cef_version) 652 | cef_version.delete_prefix(CEF_PREFIX) 653 | end 654 | else 655 | def delete_cef_prefix(cef_version) 656 | cef_version.start_with?(CEF_PREFIX) ? cef_version[CEF_PREFIX.length..-1] : cef_version 657 | end 658 | end 659 | end 660 | -------------------------------------------------------------------------------- /lib/logstash/codecs/cef/timestamp_normalizer.rb: -------------------------------------------------------------------------------- 1 | # encoding: utf-8 2 | 3 | require 'java' 4 | 5 | # The CEF specification allows a variety of timestamp formats, some of which 6 | # cannot be unambiguously parsed to a specific points in time, and may require 7 | # additional side-channel information to do so, namely: 8 | # - the time zone or UTC offset (which MAY be included in a separate field) 9 | # - the locale (for parsing abbreviated month names) 10 | # - the year (assume "recent") 11 | # 12 | # This normalizer attempts to use the provided context and make reasonable 13 | # assumptions when parsing ambiguous dates. 14 | class LogStash::Codecs::CEF::TimestampNormalizer 15 | 16 | java_import java.time.Clock 17 | java_import java.time.LocalDate 18 | java_import java.time.LocalTime 19 | java_import java.time.MonthDay 20 | java_import java.time.OffsetDateTime 21 | java_import java.time.ZoneId 22 | java_import java.time.ZonedDateTime 23 | java_import java.time.format.DateTimeFormatter 24 | java_import java.util.Locale 25 | 26 | def initialize(locale:nil, timezone:nil, clock: Clock.systemUTC) 27 | @clock = clock 28 | 29 | java_locale = locale ? get_locale(locale) : Locale.get_default 30 | java_timezone = timezone ? ZoneId.of(timezone) : ZoneId.system_default 31 | 32 | @cef_timestamp_format_parser = DateTimeFormatter 33 | .ofPattern("MMM dd[ yyyy] HH:mm:ss[.SSSSSSSSS][.SSSSSS][.SSS][ zzz]") 34 | .withZone(java_timezone) 35 | .withLocale(java_locale) 36 | end 37 | 38 | INTEGER_OR_DECIMAL_PATTERN = /\A[1-9][0-9]*(?:\.[0-9]+)?\z/ 39 | private_constant :INTEGER_OR_DECIMAL_PATTERN 40 | 41 | # @param value [String,Time,Numeric] 42 | # The value to parse. `Time`s are returned without modification, and `Numeric` values 43 | # are treated as millis-since-epoch (as are fully-numeric strings). 44 | # Strings are parsed unsing any of the supported CEF formats, and when the timestamp 45 | # does not encode a year, we assume the year from contextual information like the 46 | # current time. 47 | # @param device_timezone_name [String,nil] (optional): 48 | # If known, the time-zone or UTC offset of the device that encoded the timestamp. 49 | # This value is used to determine the offset when no offset is encoded in the timestamp. 50 | # If not provided, the system default time zone is used instead. 51 | # @return [Time] 52 | def normalize(value, device_timezone_name=nil) 53 | return value if value.kind_of?(Time) 54 | 55 | case value 56 | when Numeric then Time.at(Rational(value, 1000)) 57 | when INTEGER_OR_DECIMAL_PATTERN then Time.at(Rational(value, 1000)) 58 | else 59 | parse_cef_format_string(value.to_s, device_timezone_name) 60 | end 61 | end 62 | 63 | private 64 | 65 | def get_locale(spec) 66 | if spec.nil? 67 | Locale.get_default 68 | elsif spec =~ /\A([a-z]{2})_([A-Z]{2})\z/ 69 | lang, country = Regexp.last_match(1), Regexp.last_match(2) 70 | Locale.new(lang, country) 71 | else 72 | Locale.for_language_tag(spec) 73 | end 74 | end 75 | 76 | def parse_cef_format_string(value, context_timezone=nil) 77 | cef_timestamp_format_parser = @cef_timestamp_format_parser 78 | cef_timestamp_format_parser = cef_timestamp_format_parser.with_zone(java.time.ZoneId.of(context_timezone)) unless context_timezone.nil? 79 | 80 | parsed_time = cef_timestamp_format_parser.parse_best(value, 81 | ->(v){ ZonedDateTime.from(v) }, 82 | ->(v){ OffsetDateTime.from(v) }, 83 | ->(v){ resolve_assuming_year(v) }).to_instant 84 | 85 | # Ruby's `Time::at(sec, microseconds_with_frac)` 86 | Time.at(parsed_time.get_epoch_second, Rational(parsed_time.get_nano, 1000)) 87 | end 88 | 89 | def resolve_assuming_year(parsed_temporal_accessor) 90 | parsed_monthday = MonthDay.from(parsed_temporal_accessor) 91 | parsed_time = LocalTime.from(parsed_temporal_accessor) 92 | parsed_zone = ZoneId.from(parsed_temporal_accessor) 93 | 94 | now = ZonedDateTime.now(@clock.with_zone(parsed_zone)) 95 | 96 | parsed_timestamp_with_current_year = ZonedDateTime.of(parsed_monthday.at_year(now.get_year), parsed_time, parsed_zone) 97 | 98 | if (parsed_timestamp_with_current_year > now.plus_days(2)) 99 | # e.g., on May 12, parsing a date from May 15 or later is plausibly from 100 | # the prior calendar year and not actually from the future 101 | return ZonedDateTime.of(parsed_monthday.at_year(now.get_year - 1), parsed_time, parsed_zone) 102 | elsif now.get_month_value == 12 && (parsed_timestamp_with_current_year.plus_years(1) <= now.plus_days(2)) 103 | # e.g., on December 31, parsing a date from January 1 could plausibly be 104 | # from the very-near future but next calendar year due to out-of-sync 105 | # clocks, mismatched timezones, etc. 106 | return ZonedDateTime.of(parsed_monthday.at_year(now.get_year + 1), parsed_time, parsed_zone) 107 | else 108 | # otherwise, assume current calendar year 109 | return parsed_timestamp_with_current_year 110 | end 111 | end 112 | end 113 | -------------------------------------------------------------------------------- /logstash-codec-cef.gemspec: -------------------------------------------------------------------------------- 1 | Gem::Specification.new do |s| 2 | 3 | s.name = 'logstash-codec-cef' 4 | s.version = '6.2.8' 5 | s.platform = 'java' 6 | s.licenses = ['Apache License (2.0)'] 7 | s.summary = "Reads the ArcSight Common Event Format (CEF)." 8 | s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program" 9 | s.authors = ["Elastic"] 10 | s.email = 'info@elastic.co' 11 | s.homepage = "http://www.elastic.co/guide/en/logstash/current/index.html" 12 | s.require_paths = ["lib"] 13 | 14 | # Files 15 | s.files = Dir["lib/**/*","spec/**/*","*.gemspec","*.md","CONTRIBUTORS","Gemfile","LICENSE","NOTICE.TXT", "vendor/jar-dependencies/**/*.jar", "vendor/jar-dependencies/**/*.rb", "VERSION", "docs/**/*"] 16 | 17 | # Tests 18 | s.test_files = s.files.grep(%r{^(test|spec|features)/}) 19 | 20 | # Special flag to let us know this is actually a logstash plugin 21 | s.metadata = { "logstash_plugin" => "true", "logstash_group" => "codec" } 22 | 23 | # Gem dependencies 24 | s.add_runtime_dependency "logstash-core-plugin-api", ">= 1.60", "<= 2.99" 25 | s.add_runtime_dependency "logstash-mixin-ecs_compatibility_support", '~> 1.3' 26 | s.add_runtime_dependency "logstash-mixin-event_support", '~> 1.0' 27 | 28 | s.add_development_dependency 'logstash-devutils' 29 | s.add_development_dependency 'insist' 30 | end 31 | -------------------------------------------------------------------------------- /spec/codecs/cef/timestamp_normalizer_spec.rb: -------------------------------------------------------------------------------- 1 | # encoding: utf-8 2 | 3 | require 'logstash/util' 4 | require "logstash/devutils/rspec/spec_helper" 5 | require "insist" 6 | require "logstash/codecs/cef" 7 | require 'logstash/codecs/cef/timestamp_normalizer' 8 | 9 | describe LogStash::Codecs::CEF::TimestampNormalizer do 10 | 11 | subject(:timestamp_normalizer) { described_class.new } 12 | let(:parsed_result) { timestamp_normalizer.normalize(parsable_string) } 13 | 14 | context "parsing dates with a year specified" do 15 | let(:parsable_string) { "Jun 17 2027 17:57:06.456" } 16 | it 'parses the year correctly' do 17 | expect(parsed_result.year).to eq(2027) 18 | end 19 | end 20 | 21 | context "unparsable inputs" do 22 | let(:parsable_string) { "Last Thursday" } 23 | it "raises a StandardError exception that can be caught upstream" do 24 | expect { parsed_result }.to raise_error(StandardError, /#{Regexp::escape parsable_string}/) 25 | end 26 | end 27 | 28 | context "side-channel time zone indicators" do 29 | let(:context_timezone) { 'America/New_York' } 30 | let(:parsed_result) { timestamp_normalizer.normalize(parsable_string, context_timezone) } 31 | 32 | context "when parsed input does not include offset information" do 33 | let(:parsable_string) { "Jun 17 2027 17:57:06.456" } 34 | 35 | it 'offsets to the context timezone time' do 36 | expect(parsed_result).to eq(Time.parse("2027-06-17T21:57:06.456Z")) 37 | end 38 | end 39 | context "when parsed input includes offset information" do 40 | let(:parsable_string) { "Jun 17 2027 17:57:06.456 -07:00" } 41 | 42 | it 'uses the parsed offset' do 43 | expect(parsed_result).to eq(Time.parse("2027-06-18T00:57:06.456Z")) 44 | end 45 | end 46 | context "when parsed input is a millis-since-epoch timestamp" do 47 | let(:parsable_string) { "1616623591694" } 48 | 49 | it "does not offset the time" do 50 | expect(parsed_result).to eq(Time.at(Rational(1616623591694,1_000))) 51 | expect(parsed_result.nsec).to eq(694_000_000) 52 | end 53 | end 54 | context "when parsed input is a millis-since-epoch timestamp with decimal part and microsecond precision" do 55 | let(:parsable_string) { "1616623591694.176" } 56 | 57 | it "does not offset the time" do 58 | expect(parsed_result).to eq(Time.at(Rational(1616623591694176,1_000_000))) 59 | expect(parsed_result.nsec).to eq(694_176_000) 60 | end 61 | end 62 | context "when parsed input is a millis-since-epoch timestamp with decimal part and nanosecond precision" do 63 | let(:parsable_string) { "1616623591694.176789" } 64 | 65 | it "does not offset the time" do 66 | expect(parsed_result).to eq(Time.at(Rational(1616623591694176789,1_000_000_000))) 67 | expect(parsed_result.nsec).to eq(694_176_789) 68 | end 69 | end 70 | end 71 | 72 | context "when locale is specified" do 73 | let(:locale_language) { 'de' } 74 | let(:locale_country) { 'DE' } 75 | let(:locale_spec) { "#{locale_language}_#{locale_country}" } 76 | 77 | # Due to locale-provider loading changes in JDK 9, abbreviations for months 78 | # depend on a combination of the JDK version and the `java.locale.providers` 79 | # system property. 80 | # Instead of hard-coding a localized month name, use this process's locales 81 | # to generate one. 82 | let(:java_locale) { java.util.Locale.new(locale_language, locale_country) } 83 | let(:localized_march_abbreviation) do 84 | months = java.text.DateFormatSymbols.new(java_locale).get_short_months 85 | months[2] # march 86 | end 87 | 88 | subject(:timestamp_normalizer) { described_class.new(locale: locale_spec) } 89 | 90 | let(:parsable_string) { "#{localized_march_abbreviation} 17 2019 17:57:06.456 +01:00" } 91 | 92 | it 'uses the locale to parse the date' do 93 | expect(parsed_result).to eq(Time.parse("2019-03-17T17:57:06.456+01:00")) 94 | end 95 | end 96 | 97 | context "parsing dates with sub-second precision" do 98 | context "whole second precision" do 99 | let(:parsable_string) { "Mar 17 2021 12:34:56 +00:00" } 100 | it "is accurate to the second" do 101 | expect(parsed_result.nsec).to eq(000_000_000) 102 | expect(parsed_result).to eq(Time.parse("2021-03-17T12:34:56Z")) 103 | end 104 | end 105 | context "millisecond sub-second precision" do 106 | let(:parsable_string) { "Mar 17 2021 12:34:56.987" } 107 | let(:format_string) { "%b %d %H:%M:%S.%3N" } 108 | it "is accurate to the millisecond" do 109 | expect(parsed_result.nsec).to eq(987_000_000) 110 | expect(parsed_result).to eq(Time.parse("2021-03-17T12:34:56.987Z")) 111 | end 112 | end 113 | context "microsecond sub-second precision" do 114 | let(:parsable_string) { "Mar 17 2021 12:34:56.987654" } 115 | let(:format_string) { "%b %d %H:%M:%S.%6N" } 116 | it "is accurate to the microsecond" do 117 | expect(parsed_result.nsec).to eq(987_654_000) 118 | expect(parsed_result).to eq(Time.parse("2021-03-17T12:34:56.987654Z")) 119 | end 120 | end 121 | context "nanosecond sub-second precision" do 122 | let(:parsable_string) { "Mar 17 2021 12:34:56.987654321" } 123 | let(:format_string) { "%b %d %H:%M:%S.%9N" } 124 | it "is accurate to the nanosecond" do 125 | expect(parsed_result.nsec).to eq(987_654_321) 126 | expect(parsed_result).to eq(Time.parse("2021-03-17T12:34:56.987654321Z")) 127 | end 128 | end 129 | end 130 | 131 | context "parsing dates with no year specified" do 132 | let(:time_of_parse) { fail(NotImplementedError) } 133 | let(:format_to_parse) { "%b %d %H:%M:%S.%3N" } 134 | let(:offset_days) { fail(NotImplementedError) } 135 | let(:time_to_parse) { (time_of_parse + (offset_days * 86400)) } 136 | let(:parsable_string) { time_to_parse.strftime(format_to_parse) } 137 | 138 | 139 | let(:anchored_clock) do 140 | instant = java.time.Instant.of_epoch_second(time_of_parse.to_i) 141 | zone = java.time.ZoneId.system_default 142 | 143 | java.time.Clock.fixed(instant, zone) 144 | end 145 | 146 | subject(:timestamp_normalizer) { described_class.new(clock: anchored_clock) } 147 | 148 | let(:parsed_result) { timestamp_normalizer.normalize(parsable_string) } 149 | 150 | context 'when parsing a date during late December' do 151 | let(:time_of_parse) { Time.parse("2020-12-31T23:53:08.123456789Z") } 152 | context 'and handling a date string from early january' do 153 | let(:time_to_parse) { Time.parse("2021-01-01T00:00:08.123456789Z") } 154 | it 'assumes that the date being parsed is in the very near future' do 155 | expect(parsed_result.month).to eq(1) 156 | expect(parsed_result.year).to eq(time_of_parse.year + 1) 157 | end 158 | end 159 | context 'and handling a yearless date string from mid january' do 160 | let(:time_to_parse) { Time.parse("2021-01-17T00:00:08.123456789Z") } 161 | it 'assumes that the date being parsed is in the distant past' do 162 | expect(parsed_result.month).to eq(1) 163 | expect(parsed_result.year).to eq(time_of_parse.year) 164 | end 165 | end 166 | end 167 | 168 | # As a smoke test to validate the guess-the-year feature when the provided CEF timestamp 169 | # does not include the year, we iterate through a variety of dates that we want to parse, 170 | # and with each of them we parse with a mock clock as if we were performing the parsing 171 | # operation at a variety of date-times relative to the timestamp represented. 172 | %w( 173 | 2021-01-20T04:10:22.961Z 174 | 2021-06-08T03:38:55.518Z 175 | 2021-07-12T18:46:12.149Z 176 | 2021-08-12T04:17:36.680Z 177 | 2021-08-12T13:20:14.951Z 178 | 2021-09-17T13:18:57.534Z 179 | 2021-09-23T16:35:40.404Z 180 | 2021-10-30T18:52:29.263Z 181 | 2021-11-11T00:52:39.409Z 182 | 2021-11-19T13:37:07.189Z 183 | 2021-12-02T01:09:21.846Z 184 | 2021-12-11T16:35:05.641Z 185 | 2021-12-15T14:17:22.152Z 186 | 2021-12-19T05:53:57.200Z 187 | 2021-12-20T16:18:17.637Z 188 | 2021-12-22T12:06:48.965Z 189 | 2021-12-26T04:45:14.964Z 190 | 2022-01-05T09:42:39.895Z 191 | 2022-02-02T04:58:22.080Z 192 | 2022-02-05T08:10:15.386Z 193 | 2022-02-15T16:48:27.083Z 194 | 2022-02-31T13:26:55.298Z 195 | 2022-03-10T20:16:25.732Z 196 | 2022-03-20T23:38:58.734Z 197 | 2022-03-30T03:42:09.546Z 198 | 2022-04-09T05:55:18.697Z 199 | 2022-04-14T05:05:29.278Z 200 | 2022-04-25T15:29:19.567Z 201 | 2022-05-02T08:34:21.666Z 202 | 2022-05-24T02:59:02.257Z 203 | 2022-07-25T01:58:35.713Z 204 | 2022-07-27T03:27:57.568Z 205 | 2022-07-28T20:28:22.704Z 206 | 2022-09-21T08:59:10.508Z 207 | 2022-10-29T23:54:02.372Z 208 | 2022-11-12T15:22:51.758Z 209 | 2022-11-22T22:02:33.278Z 210 | 2022-12-30T03:18:38.333Z 211 | 2023-01-02T16:55:57.829Z 212 | 2023-01-13T16:37:38.078Z 213 | 2023-01-27T07:27:09.296Z 214 | 2023-01-30T17:56:43.665Z 215 | 2023-02-18T11:41:18.886Z 216 | 2023-02-28T18:51:59.504Z 217 | 2023-03-10T06:52:14.285Z 218 | 2023-04-17T16:25:06.489Z 219 | 2023-04-18T20:46:29.611Z 220 | 2023-04-27T10:21:41.036Z 221 | 2023-05-08T02:54:57.131Z 222 | 2023-05-13T01:17:37.396Z 223 | 2023-05-24T18:23:05.136Z 224 | 2023-06-01T11:09:48.129Z 225 | 2023-06-22T07:44:56.876Z 226 | 2023-06-25T20:17:44.394Z 227 | 2023-06-25T20:53:36.329Z 228 | 2023-07-24T13:07:58.536Z 229 | 2023-07-27T21:35:54.299Z 230 | 2023-08-07T11:15:33.803Z 231 | 2023-08-12T18:45:46.791Z 232 | 2023-08-19T23:22:19.717Z 233 | 2023-08-22T23:19:41.075Z 234 | 2023-08-25T15:22:47.405Z 235 | 2023-09-03T14:34:13.345Z 236 | 2023-09-28T05:48:20.040Z 237 | 2023-09-29T21:14:15.531Z 238 | 2023-11-12T21:25:55.233Z 239 | 2023-11-30T00:41:21.834Z 240 | 2023-12-11T10:14:51.676Z 241 | 2023-12-14T18:02:33.005Z 242 | 2023-12-18T09:00:43.589Z 243 | 2023-12-20T20:02:42.205Z 244 | 2023-12-22T10:13:37.553Z 245 | 2023-12-27T19:42:37.905Z 246 | 2023-12-31T17:52:50.101Z 247 | 2024-02-29T01:23:45.678Z 248 | ).map {|ts| Time.parse(ts) }.each do |timestamp| 249 | cef_parsable_timestamp = timestamp.strftime("%b %d %H:%M:%S.%3N Z") 250 | 251 | context "when parsing the string `#{cef_parsable_timestamp}`" do 252 | 253 | let(:expected_result) { timestamp } 254 | let(:parsable_string) { cef_parsable_timestamp } 255 | 256 | { 257 | 'very recent past' => -30.789, # ~ 30 seconds ago 258 | 'somewhat recent past' => -608976.678, # ~ 1 week days ago 259 | 'distant past' => -29879991.916, # ~ 11-1/2 months days ago 260 | 'near future' => 132295.719, # ~ 1.5 days from now 261 | }.each do |desc, shift| 262 | shifted_now = timestamp - shift 263 | context "when that string could plausibly be in the #{desc} (NOW: #{shifted_now.iso8601(3)})" do 264 | let(:time_of_parse) { shifted_now } 265 | it "produces a time in the #{desc} (#{timestamp.iso8601(3)})" do 266 | expect(parsed_result).to eq(expected_result) 267 | end 268 | end 269 | end 270 | end 271 | end 272 | end 273 | end 274 | -------------------------------------------------------------------------------- /spec/codecs/cef_spec.rb: -------------------------------------------------------------------------------- 1 | # encoding: utf-8 2 | require 'logstash/util' 3 | require "logstash/devutils/rspec/spec_helper" 4 | require "insist" 5 | require "logstash/codecs/cef" 6 | require "logstash/event" 7 | require "json" 8 | 9 | require 'logstash/plugin_mixins/ecs_compatibility_support/spec_helper' 10 | 11 | describe LogStash::Codecs::CEF do 12 | subject(:codec) do 13 | next LogStash::Codecs::CEF.new 14 | end 15 | 16 | context "#encode", :ecs_compatibility_support do 17 | subject(:codec) { LogStash::Codecs::CEF.new } 18 | 19 | let(:results) { [] } 20 | 21 | context "with delimiter set" do 22 | # '\r\n' in single quotes to simulate the real input from a config 23 | # containing \r\n as 4-character sequence in the config: 24 | # 25 | # delimiter => "\r\n" 26 | # 27 | # Related: https://github.com/elastic/logstash/issues/1645 28 | subject(:codec) { LogStash::Codecs::CEF.new("delimiter" => '\r\n') } 29 | 30 | it "should append the delimiter to the result" do 31 | codec.on_event { |data, newdata| results << newdata } 32 | codec.encode(LogStash::Event.new({})) 33 | expect(results.first).to end_with("\r\n") 34 | end 35 | end 36 | 37 | it "should not fail if fields is nil" do 38 | codec.on_event{|data, newdata| results << newdata} 39 | event = LogStash::Event.new("foo" => "bar") 40 | codec.encode(event) 41 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|$/m) 42 | end 43 | 44 | it "should assert all header fields are present" do 45 | codec.on_event{|data, newdata| results << newdata} 46 | codec.fields = [] 47 | event = LogStash::Event.new("foo" => "bar") 48 | codec.encode(event) 49 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|$/m) 50 | end 51 | 52 | it "should use default values for empty header fields" do 53 | codec.on_event{|data, newdata| results << newdata} 54 | codec.vendor = "" 55 | codec.product = "" 56 | codec.version = "" 57 | codec.signature = "" 58 | codec.name = "" 59 | codec.severity = "" 60 | codec.fields = [] 61 | event = LogStash::Event.new("foo" => "bar") 62 | codec.encode(event) 63 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|$/m) 64 | end 65 | 66 | it "should use configured values for header fields" do 67 | codec.on_event{|data, newdata| results << newdata} 68 | codec.vendor = "vendor" 69 | codec.product = "product" 70 | codec.version = "2.0" 71 | codec.signature = "signature" 72 | codec.name = "name" 73 | codec.severity = "1" 74 | codec.fields = [] 75 | event = LogStash::Event.new("foo" => "bar") 76 | codec.encode(event) 77 | expect(results.first).to match(/^CEF:0\|vendor\|product\|2.0\|signature\|name\|1\|$/m) 78 | end 79 | 80 | it "should use sprintf for header fields" do 81 | codec.on_event{|data, newdata| results << newdata} 82 | codec.vendor = "%{vendor}" 83 | codec.product = "%{product}" 84 | codec.version = "%{version}" 85 | codec.signature = "%{signature}" 86 | codec.name = "%{name}" 87 | codec.severity = "%{severity}" 88 | codec.fields = [] 89 | event = LogStash::Event.new("vendor" => "vendor", "product" => "product", "version" => "2.0", "signature" => "signature", "name" => "name", "severity" => "1") 90 | codec.encode(event) 91 | expect(results.first).to match(/^CEF:0\|vendor\|product\|2.0\|signature\|name\|1\|$/m) 92 | end 93 | 94 | it "should use default, if severity is not numeric" do 95 | codec.on_event{|data, newdata| results << newdata} 96 | codec.severity = "foo" 97 | codec.fields = [] 98 | event = LogStash::Event.new("foo" => "bar") 99 | codec.encode(event) 100 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|$/m) 101 | end 102 | 103 | it "should use default, if severity is > 10" do 104 | codec.on_event{|data, newdata| results << newdata} 105 | codec.severity = "11" 106 | codec.fields = [] 107 | event = LogStash::Event.new("foo" => "bar") 108 | codec.encode(event) 109 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|$/m) 110 | end 111 | 112 | it "should use default, if severity is < 0" do 113 | codec.on_event{|data, newdata| results << newdata} 114 | codec.severity = "-1" 115 | codec.fields = [] 116 | event = LogStash::Event.new("foo" => "bar") 117 | codec.encode(event) 118 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|$/m) 119 | end 120 | 121 | it "should use default, if severity is float with decimal part" do 122 | codec.on_event{|data, newdata| results << newdata} 123 | codec.severity = "5.4" 124 | codec.fields = [] 125 | event = LogStash::Event.new("foo" => "bar") 126 | codec.encode(event) 127 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|$/m) 128 | end 129 | 130 | it "should append fields as key/value pairs in cef extension part" do 131 | codec.on_event{|data, newdata| results << newdata} 132 | codec.fields = [ "foo", "bar" ] 133 | event = LogStash::Event.new("foo" => "foo value", "bar" => "bar value") 134 | codec.encode(event) 135 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|foo=foo value bar=bar value$/m) 136 | end 137 | 138 | it "should ignore fields in fields if not present in event" do 139 | codec.on_event{|data, newdata| results << newdata} 140 | codec.fields = [ "foo", "bar", "baz" ] 141 | event = LogStash::Event.new("foo" => "foo value", "baz" => "baz value") 142 | codec.encode(event) 143 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|foo=foo value baz=baz value$/m) 144 | end 145 | 146 | it "should sanitize header fields" do 147 | codec.on_event{|data, newdata| results << newdata} 148 | codec.vendor = "ven\ndor" 149 | codec.product = "pro|duct" 150 | codec.version = "ver\\sion" 151 | codec.signature = "sig\r\nnature" 152 | codec.name = "na\rme" 153 | codec.severity = "4\n" 154 | codec.fields = [] 155 | event = LogStash::Event.new("foo" => "bar") 156 | codec.encode(event) 157 | expect(results.first).to match(/^CEF:0\|ven dor\|pro\\\|duct\|ver\\\\sion\|sig nature\|na me\|4\|$/m) 158 | end 159 | 160 | it "should sanitize extension keys" do 161 | codec.on_event{|data, newdata| results << newdata} 162 | codec.fields = [ "f o\no", "@b-a_r" ] 163 | event = LogStash::Event.new("f o\no" => "foo value", "@b-a_r" => "bar value") 164 | codec.encode(event) 165 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|foo=foo value bar=bar value$/m) 166 | end 167 | 168 | it "should sanitize extension values" do 169 | codec.on_event{|data, newdata| results << newdata} 170 | codec.fields = [ "foo", "bar", "baz" ] 171 | event = LogStash::Event.new("foo" => "foo\\value\n", "bar" => "bar=value\r") 172 | codec.encode(event) 173 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|foo=foo\\\\value\\n bar=bar\\=value\\n$/m) 174 | end 175 | 176 | it "should encode a hash value" do 177 | codec.on_event{|data, newdata| results << newdata} 178 | codec.fields = [ "foo" ] 179 | event = LogStash::Event.new("foo" => { "bar" => "bar value", "baz" => "baz value" }) 180 | codec.encode(event) 181 | foo = results.first[/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|foo=(.*)$/, 1] 182 | expect(foo).not_to be_nil 183 | foo_hash = JSON.parse(foo) 184 | expect(foo_hash).to eq({"bar" => "bar value", "baz" => "baz value"}) 185 | end 186 | 187 | it "should encode an array value" do 188 | codec.on_event{|data, newdata| results << newdata} 189 | codec.fields = [ "foo" ] 190 | event = LogStash::Event.new("foo" => [ "bar", "baz" ]) 191 | codec.encode(event) 192 | foo = results.first[/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|foo=(.*)$/, 1] 193 | expect(foo).not_to be_nil 194 | foo_array = JSON.parse(foo) 195 | expect(foo_array).to eq(["bar", "baz"]) 196 | end 197 | 198 | it "should encode a hash in an array value" do 199 | codec.on_event{|data, newdata| results << newdata} 200 | codec.fields = [ "foo" ] 201 | event = LogStash::Event.new("foo" => [ { "bar" => "bar value" }, "baz" ]) 202 | codec.encode(event) 203 | foo = results.first[/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|foo=(.*)$/, 1] 204 | expect(foo).not_to be_nil 205 | foo_array = JSON.parse(foo) 206 | expect(foo_array).to eq([{"bar" => "bar value"}, "baz"]) 207 | end 208 | 209 | it "should encode a LogStash::Timestamp" do 210 | codec.on_event{|data, newdata| results << newdata} 211 | codec.fields = [ "foo" ] 212 | event = LogStash::Event.new("foo" => LogStash::Timestamp.new) 213 | codec.encode(event) 214 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|foo=[0-9TZ.:-]+$/m) 215 | end 216 | 217 | ecs_compatibility_matrix(:disabled,:v1) do |ecs_select| 218 | before(:each) do 219 | allow_any_instance_of(described_class).to receive(:ecs_compatibility).and_return(ecs_compatibility) 220 | end 221 | 222 | it "should encode the CEF field names to their long versions" do 223 | # This is with the default value of "reverse_mapping" that is "false". 224 | codec.on_event{|data, newdata| results << newdata} 225 | codec.fields = [ "deviceAction", "applicationProtocol", "deviceCustomIPv6Address1", "deviceCustomIPv6Address1Label", "deviceCustomIPv6Address2", "deviceCustomIPv6Address2Label", "deviceCustomIPv6Address3", "deviceCustomIPv6Address3Label", "deviceCustomIPv6Address4", "deviceCustomIPv6Address4Label", "deviceEventCategory", "deviceCustomFloatingPoint1", "deviceCustomFloatingPoint1Label", "deviceCustomFloatingPoint2", "deviceCustomFloatingPoint2Label", "deviceCustomFloatingPoint3", "deviceCustomFloatingPoint3Label", "deviceCustomFloatingPoint4", "deviceCustomFloatingPoint4Label", "deviceCustomNumber1", "deviceCustomNumber1Label", "deviceCustomNumber2", "deviceCustomNumber2Label", "deviceCustomNumber3", "deviceCustomNumber3Label", "baseEventCount", "deviceCustomString1", "deviceCustomString1Label", "deviceCustomString2", "deviceCustomString2Label", "deviceCustomString3", "deviceCustomString3Label", "deviceCustomString4", "deviceCustomString4Label", "deviceCustomString5", "deviceCustomString5Label", "deviceCustomString6", "deviceCustomString6Label", "destinationHostName", "destinationMacAddress", "destinationNtDomain", "destinationProcessId", "destinationUserPrivileges", "destinationProcessName", "destinationPort", "destinationAddress", "destinationUserId", "destinationUserName", "deviceAddress", "deviceHostName", "deviceProcessId", "endTime", "fileName", "fileSize", "bytesIn", "message", "bytesOut", "eventOutcome", "transportProtocol", "requestUrl", "deviceReceiptTime", "sourceHostName", "sourceMacAddress", "sourceNtDomain", "sourceProcessId", "sourceUserPrivileges", "sourceProcessName", "sourcePort", "sourceAddress", "startTime", "sourceUserId", "sourceUserName", "agentHostName", "agentReceiptTime", "agentType", "agentId", "agentAddress", "agentVersion", "agentTimeZone", "destinationTimeZone", "sourceLongitude", "sourceLatitude", "destinationLongitude", "destinationLatitude", "categoryDeviceType", "managerReceiptTime", "agentMacAddress" ] 226 | event = LogStash::Event.new("deviceAction" => "foobar", "applicationProtocol" => "foobar", "deviceCustomIPv6Address1" => "foobar", "deviceCustomIPv6Address1Label" => "foobar", "deviceCustomIPv6Address2" => "foobar", "deviceCustomIPv6Address2Label" => "foobar", "deviceCustomIPv6Address3" => "foobar", "deviceCustomIPv6Address3Label" => "foobar", "deviceCustomIPv6Address4" => "foobar", "deviceCustomIPv6Address4Label" => "foobar", "deviceEventCategory" => "foobar", "deviceCustomFloatingPoint1" => "foobar", "deviceCustomFloatingPoint1Label" => "foobar", "deviceCustomFloatingPoint2" => "foobar", "deviceCustomFloatingPoint2Label" => "foobar", "deviceCustomFloatingPoint3" => "foobar", "deviceCustomFloatingPoint3Label" => "foobar", "deviceCustomFloatingPoint4" => "foobar", "deviceCustomFloatingPoint4Label" => "foobar", "deviceCustomNumber1" => "foobar", "deviceCustomNumber1Label" => "foobar", "deviceCustomNumber2" => "foobar", "deviceCustomNumber2Label" => "foobar", "deviceCustomNumber3" => "foobar", "deviceCustomNumber3Label" => "foobar", "baseEventCount" => "foobar", "deviceCustomString1" => "foobar", "deviceCustomString1Label" => "foobar", "deviceCustomString2" => "foobar", "deviceCustomString2Label" => "foobar", "deviceCustomString3" => "foobar", "deviceCustomString3Label" => "foobar", "deviceCustomString4" => "foobar", "deviceCustomString4Label" => "foobar", "deviceCustomString5" => "foobar", "deviceCustomString5Label" => "foobar", "deviceCustomString6" => "foobar", "deviceCustomString6Label" => "foobar", "destinationHostName" => "foobar", "destinationMacAddress" => "foobar", "destinationNtDomain" => "foobar", "destinationProcessId" => "foobar", "destinationUserPrivileges" => "foobar", "destinationProcessName" => "foobar", "destinationPort" => "foobar", "destinationAddress" => "foobar", "destinationUserId" => "foobar", "destinationUserName" => "foobar", "deviceAddress" => "foobar", "deviceHostName" => "foobar", "deviceProcessId" => "foobar", "endTime" => "foobar", "fileName" => "foobar", "fileSize" => "foobar", "bytesIn" => "foobar", "message" => "foobar", "bytesOut" => "foobar", "eventOutcome" => "foobar", "transportProtocol" => "foobar", "requestUrl" => "foobar", "deviceReceiptTime" => "foobar", "sourceHostName" => "foobar", "sourceMacAddress" => "foobar", "sourceNtDomain" => "foobar", "sourceProcessId" => "foobar", "sourceUserPrivileges" => "foobar", "sourceProcessName"=> "foobar", "sourcePort" => "foobar", "sourceAddress" => "foobar", "startTime" => "foobar", "sourceUserId" => "foobar", "sourceUserName" => "foobar", "agentHostName" => "foobar", "agentReceiptTime" => "foobar", "agentType" => "foobar", "agentId" => "foobar", "agentAddress" => "foobar", "agentVersion" => "foobar", "agentTimeZone" => "foobar", "destinationTimeZone" => "foobar", "sourceLongitude" => "foobar", "sourceLatitude" => "foobar", "destinationLongitude" => "foobar", "destinationLatitude" => "foobar", "categoryDeviceType" => "foobar", "managerReceiptTime" => "foobar", "agentMacAddress" => "foobar") 227 | codec.encode(event) 228 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|deviceAction=foobar applicationProtocol=foobar deviceCustomIPv6Address1=foobar deviceCustomIPv6Address1Label=foobar deviceCustomIPv6Address2=foobar deviceCustomIPv6Address2Label=foobar deviceCustomIPv6Address3=foobar deviceCustomIPv6Address3Label=foobar deviceCustomIPv6Address4=foobar deviceCustomIPv6Address4Label=foobar deviceEventCategory=foobar deviceCustomFloatingPoint1=foobar deviceCustomFloatingPoint1Label=foobar deviceCustomFloatingPoint2=foobar deviceCustomFloatingPoint2Label=foobar deviceCustomFloatingPoint3=foobar deviceCustomFloatingPoint3Label=foobar deviceCustomFloatingPoint4=foobar deviceCustomFloatingPoint4Label=foobar deviceCustomNumber1=foobar deviceCustomNumber1Label=foobar deviceCustomNumber2=foobar deviceCustomNumber2Label=foobar deviceCustomNumber3=foobar deviceCustomNumber3Label=foobar baseEventCount=foobar deviceCustomString1=foobar deviceCustomString1Label=foobar deviceCustomString2=foobar deviceCustomString2Label=foobar deviceCustomString3=foobar deviceCustomString3Label=foobar deviceCustomString4=foobar deviceCustomString4Label=foobar deviceCustomString5=foobar deviceCustomString5Label=foobar deviceCustomString6=foobar deviceCustomString6Label=foobar destinationHostName=foobar destinationMacAddress=foobar destinationNtDomain=foobar destinationProcessId=foobar destinationUserPrivileges=foobar destinationProcessName=foobar destinationPort=foobar destinationAddress=foobar destinationUserId=foobar destinationUserName=foobar deviceAddress=foobar deviceHostName=foobar deviceProcessId=foobar endTime=foobar fileName=foobar fileSize=foobar bytesIn=foobar message=foobar bytesOut=foobar eventOutcome=foobar transportProtocol=foobar requestUrl=foobar deviceReceiptTime=foobar sourceHostName=foobar sourceMacAddress=foobar sourceNtDomain=foobar sourceProcessId=foobar sourceUserPrivileges=foobar sourceProcessName=foobar sourcePort=foobar sourceAddress=foobar startTime=foobar sourceUserId=foobar sourceUserName=foobar agentHostName=foobar agentReceiptTime=foobar agentType=foobar agentId=foobar agentAddress=foobar agentVersion=foobar agentTimeZone=foobar destinationTimeZone=foobar sourceLongitude=foobar sourceLatitude=foobar destinationLongitude=foobar destinationLatitude=foobar categoryDeviceType=foobar managerReceiptTime=foobar agentMacAddress=foobar$/m) 229 | end 230 | 231 | if ecs_select.active_mode != :disabled 232 | let(:event_flat_hash) do 233 | { 234 | "[event][action]" => "floop", # deviceAction 235 | "[network][protocol]" => "https", # applicationProtocol 236 | "[cef][device_custom_ipv6_address_1][value]" => "4302:c0a5:0bb9:2dfd:7b4e:97f7:a328:98a9", # deviceCustomIPv6Address1 237 | "[cef][device_custom_ipv6_address_1][label]" => "internal-interface", # deviceCustomIPv6Address1Label 238 | "[observer][ip]" => "123.45.67.89", # deviceAddress 239 | "[observer][hostname]" => "banana", # deviceHostName 240 | "[user_agent][original]" => "'Foo-Bar/2018.1.7; Email:user@example.com; Guid:test='", # requestClientApplication 241 | "[source][registered_domain]" => "monkey.see" # sourceDnsDomain 242 | } 243 | end 244 | 245 | let(:event) do 246 | event_flat_hash.each_with_object(LogStash::Event.new) do |(fr,v),memo| 247 | memo.set(fr, v) 248 | end 249 | end 250 | 251 | it 'encodes the ECS field names to their CEF name' do 252 | codec.on_event{|data, newdata| results << newdata} 253 | codec.fields = event_flat_hash.keys 254 | 255 | codec.encode(event) 256 | 257 | expect(results.first).to match(%r{^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|deviceAction=floop applicationProtocol=https deviceCustomIPv6Address1=4302:c0a5:0bb9:2dfd:7b4e:97f7:a328:98a9 deviceCustomIPv6Address1Label=internal-interface deviceAddress=123\.45\.67\.89 deviceHostName=banana requestClientApplication='Foo-Bar/2018\.1\.7; Email:user@example\.com; Guid:test\\=' sourceDnsDomain=monkey.see$}m) 258 | end 259 | end 260 | 261 | context "with reverse_mapping set to true" do 262 | subject(:codec) { LogStash::Codecs::CEF.new("reverse_mapping" => true) } 263 | 264 | it "should encode the CEF field names to their short versions" do 265 | codec.on_event{|data, newdata| results << newdata} 266 | codec.fields = [ "deviceAction", "applicationProtocol", "deviceCustomIPv6Address1", "deviceCustomIPv6Address1Label", "deviceCustomIPv6Address2", "deviceCustomIPv6Address2Label", "deviceCustomIPv6Address3", "deviceCustomIPv6Address3Label", "deviceCustomIPv6Address4", "deviceCustomIPv6Address4Label", "deviceEventCategory", "deviceCustomFloatingPoint1", "deviceCustomFloatingPoint1Label", "deviceCustomFloatingPoint2", "deviceCustomFloatingPoint2Label", "deviceCustomFloatingPoint3", "deviceCustomFloatingPoint3Label", "deviceCustomFloatingPoint4", "deviceCustomFloatingPoint4Label", "deviceCustomNumber1", "deviceCustomNumber1Label", "deviceCustomNumber2", "deviceCustomNumber2Label", "deviceCustomNumber3", "deviceCustomNumber3Label", "baseEventCount", "deviceCustomString1", "deviceCustomString1Label", "deviceCustomString2", "deviceCustomString2Label", "deviceCustomString3", "deviceCustomString3Label", "deviceCustomString4", "deviceCustomString4Label", "deviceCustomString5", "deviceCustomString5Label", "deviceCustomString6", "deviceCustomString6Label", "destinationHostName", "destinationMacAddress", "destinationNtDomain", "destinationProcessId", "destinationUserPrivileges", "destinationProcessName", "destinationPort", "destinationAddress", "destinationUserId", "destinationUserName", "deviceAddress", "deviceHostName", "deviceProcessId", "endTime", "fileName", "fileSize", "bytesIn", "message", "bytesOut", "eventOutcome", "transportProtocol", "requestUrl", "deviceReceiptTime", "sourceHostName", "sourceMacAddress", "sourceNtDomain", "sourceProcessId", "sourceUserPrivileges", "sourceProcessName", "sourcePort", "sourceAddress", "startTime", "sourceUserId", "sourceUserName", "agentHostName", "agentReceiptTime", "agentType", "agentId", "agentAddress", "agentVersion", "agentTimeZone", "destinationTimeZone", "sourceLongitude", "sourceLatitude", "destinationLongitude", "destinationLatitude", "categoryDeviceType", "managerReceiptTime", "agentMacAddress" ] 267 | event = LogStash::Event.new("deviceAction" => "foobar", "applicationProtocol" => "foobar", "deviceCustomIPv6Address1" => "foobar", "deviceCustomIPv6Address1Label" => "foobar", "deviceCustomIPv6Address2" => "foobar", "deviceCustomIPv6Address2Label" => "foobar", "deviceCustomIPv6Address3" => "foobar", "deviceCustomIPv6Address3Label" => "foobar", "deviceCustomIPv6Address4" => "foobar", "deviceCustomIPv6Address4Label" => "foobar", "deviceEventCategory" => "foobar", "deviceCustomFloatingPoint1" => "foobar", "deviceCustomFloatingPoint1Label" => "foobar", "deviceCustomFloatingPoint2" => "foobar", "deviceCustomFloatingPoint2Label" => "foobar", "deviceCustomFloatingPoint3" => "foobar", "deviceCustomFloatingPoint3Label" => "foobar", "deviceCustomFloatingPoint4" => "foobar", "deviceCustomFloatingPoint4Label" => "foobar", "deviceCustomNumber1" => "foobar", "deviceCustomNumber1Label" => "foobar", "deviceCustomNumber2" => "foobar", "deviceCustomNumber2Label" => "foobar", "deviceCustomNumber3" => "foobar", "deviceCustomNumber3Label" => "foobar", "baseEventCount" => "foobar", "deviceCustomString1" => "foobar", "deviceCustomString1Label" => "foobar", "deviceCustomString2" => "foobar", "deviceCustomString2Label" => "foobar", "deviceCustomString3" => "foobar", "deviceCustomString3Label" => "foobar", "deviceCustomString4" => "foobar", "deviceCustomString4Label" => "foobar", "deviceCustomString5" => "foobar", "deviceCustomString5Label" => "foobar", "deviceCustomString6" => "foobar", "deviceCustomString6Label" => "foobar", "destinationHostName" => "foobar", "destinationMacAddress" => "foobar", "destinationNtDomain" => "foobar", "destinationProcessId" => "foobar", "destinationUserPrivileges" => "foobar", "destinationProcessName" => "foobar", "destinationPort" => "foobar", "destinationAddress" => "foobar", "destinationUserId" => "foobar", "destinationUserName" => "foobar", "deviceAddress" => "foobar", "deviceHostName" => "foobar", "deviceProcessId" => "foobar", "endTime" => "foobar", "fileName" => "foobar", "fileSize" => "foobar", "bytesIn" => "foobar", "message" => "foobar", "bytesOut" => "foobar", "eventOutcome" => "foobar", "transportProtocol" => "foobar", "requestUrl" => "foobar", "deviceReceiptTime" => "foobar", "sourceHostName" => "foobar", "sourceMacAddress" => "foobar", "sourceNtDomain" => "foobar", "sourceProcessId" => "foobar", "sourceUserPrivileges" => "foobar", "sourceProcessName"=> "foobar", "sourcePort" => "foobar", "sourceAddress" => "foobar", "startTime" => "foobar", "sourceUserId" => "foobar", "sourceUserName" => "foobar", "agentHostName" => "foobar", "agentReceiptTime" => "foobar", "agentType" => "foobar", "agentId" => "foobar", "agentAddress" => "foobar", "agentVersion" => "foobar", "agentTimeZone" => "foobar", "destinationTimeZone" => "foobar", "sourceLongitude" => "foobar", "sourceLatitude" => "foobar", "destinationLongitude" => "foobar", "destinationLatitude" => "foobar", "categoryDeviceType" => "foobar", "managerReceiptTime" => "foobar", "agentMacAddress" => "foobar") 268 | codec.encode(event) 269 | expect(results.first).to match(/^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|act=foobar app=foobar c6a1=foobar c6a1Label=foobar c6a2=foobar c6a2Label=foobar c6a3=foobar c6a3Label=foobar c6a4=foobar c6a4Label=foobar cat=foobar cfp1=foobar cfp1Label=foobar cfp2=foobar cfp2Label=foobar cfp3=foobar cfp3Label=foobar cfp4=foobar cfp4Label=foobar cn1=foobar cn1Label=foobar cn2=foobar cn2Label=foobar cn3=foobar cn3Label=foobar cnt=foobar cs1=foobar cs1Label=foobar cs2=foobar cs2Label=foobar cs3=foobar cs3Label=foobar cs4=foobar cs4Label=foobar cs5=foobar cs5Label=foobar cs6=foobar cs6Label=foobar dhost=foobar dmac=foobar dntdom=foobar dpid=foobar dpriv=foobar dproc=foobar dpt=foobar dst=foobar duid=foobar duser=foobar dvc=foobar dvchost=foobar dvcpid=foobar end=foobar fname=foobar fsize=foobar in=foobar msg=foobar out=foobar outcome=foobar proto=foobar request=foobar rt=foobar shost=foobar smac=foobar sntdom=foobar spid=foobar spriv=foobar sproc=foobar spt=foobar src=foobar start=foobar suid=foobar suser=foobar ahost=foobar art=foobar at=foobar aid=foobar agt=foobar av=foobar atz=foobar dtz=foobar slong=foobar slat=foobar dlong=foobar dlat=foobar catdt=foobar mrt=foobar amac=foobar$/m) 270 | end 271 | 272 | if ecs_select.active_mode != :disabled 273 | let(:event_flat_hash) do 274 | { 275 | "[event][action]" => "floop", # act 276 | "[network][protocol]" => "https", # app 277 | "[cef][device_custom_ipv6_address_1][value]" => "4302:c0a5:0bb9:2dfd:7b4e:97f7:a328:98a9", # c6a1 278 | "[cef][device_custom_ipv6_address_1][label]" => "internal-interface", # c6a1Label 279 | "[observer][ip]" => "123.45.67.89", # dvc 280 | "[observer][hostname]" => "banana", # dvchost 281 | "[user_agent][original]" => "'Foo-Bar/2018.1.7; Email:user@example.com; Guid:test='", 282 | "[source][registered_domain]" => "monkey.see" # sourceDnsDomain 283 | } 284 | end 285 | 286 | let(:event) do 287 | event_flat_hash.each_with_object(LogStash::Event.new) do |(fr,v),memo| 288 | memo.set(fr, v) 289 | end 290 | end 291 | 292 | 293 | it 'encodes the ECS field names to their CEF keys' do 294 | codec.on_event{|data, newdata| results << newdata} 295 | codec.fields = event_flat_hash.keys 296 | 297 | codec.encode(event) 298 | 299 | expect(results.first).to match(%r{^CEF:0\|Elasticsearch\|Logstash\|1.0\|Logstash\|Logstash\|6\|act=floop app=https c6a1=4302:c0a5:0bb9:2dfd:7b4e:97f7:a328:98a9 c6a1Label=internal-interface dvc=123\.45\.67\.89 dvchost=banana requestClientApplication='Foo-Bar/2018\.1\.7; Email:user@example\.com; Guid:test\\=' sourceDnsDomain=monkey.see$}m) 300 | end 301 | end 302 | end 303 | end 304 | end 305 | 306 | context "sanitize header field" do 307 | subject(:codec) { LogStash::Codecs::CEF.new } 308 | 309 | it "should sanitize" do 310 | expect(codec.send(:sanitize_header_field, "foo")).to be == "foo" 311 | expect(codec.send(:sanitize_header_field, "foo\nbar")).to be == "foo bar" 312 | expect(codec.send(:sanitize_header_field, "foo\rbar")).to be == "foo bar" 313 | expect(codec.send(:sanitize_header_field, "foo\r\nbar")).to be == "foo bar" 314 | expect(codec.send(:sanitize_header_field, "foo\r\nbar\r\nbaz")).to be == "foo bar baz" 315 | expect(codec.send(:sanitize_header_field, "foo\\bar")).to be == "foo\\\\bar" 316 | expect(codec.send(:sanitize_header_field, "foo|bar")).to be == "foo\\|bar" 317 | expect(codec.send(:sanitize_header_field, "foo=bar")).to be == "foo=bar" 318 | expect(codec.send(:sanitize_header_field, 123)).to be == "123" # Input value is a Fixnum 319 | expect(codec.send(:sanitize_header_field, 123.123)).to be == "123.123" # Input value is a Float 320 | expect(codec.send(:sanitize_header_field, [])).to be == "[]" # Input value is an Array 321 | expect(codec.send(:sanitize_header_field, {})).to be == "{}" # Input value is a Hash 322 | end 323 | end 324 | 325 | context "sanitize extension key" do 326 | subject(:codec) { LogStash::Codecs::CEF.new } 327 | 328 | it "should sanitize" do 329 | expect(codec.send(:sanitize_extension_key, " foo ")).to be == "foo" 330 | expect(codec.send(:sanitize_extension_key, " FOO 123 ")).to be == "FOO123" 331 | expect(codec.send(:sanitize_extension_key, "foo\nbar\rbaz")).to be == "foobarbaz" 332 | expect(codec.send(:sanitize_extension_key, "Foo_Bar\r\nBaz")).to be == "FooBarBaz" 333 | expect(codec.send(:sanitize_extension_key, "foo-@bar=baz")).to be == "foobarbaz" 334 | expect(codec.send(:sanitize_extension_key, "[foo]|bar.baz")).to be == "foobarbaz" 335 | expect(codec.send(:sanitize_extension_key, 123)).to be == "123" # Input value is a Fixnum 336 | expect(codec.send(:sanitize_extension_key, 123.123)).to be == "123123" # Input value is a Float, "." is not allowed and therefore removed 337 | expect(codec.send(:sanitize_extension_key, [])).to be == "" # Input value is an Array, "[" and "]" are not allowed and therefore removed 338 | expect(codec.send(:sanitize_extension_key, {})).to be == "" # Input value is a Hash, "{" and "}" are not allowed and therefore removed 339 | end 340 | end 341 | 342 | context "sanitize extension value" do 343 | subject(:codec) { LogStash::Codecs::CEF.new } 344 | 345 | it "should sanitize" do 346 | expect(codec.send(:sanitize_extension_val, "foo")).to be == "foo" 347 | expect(codec.send(:sanitize_extension_val, "foo\nbar")).to be == "foo\\nbar" 348 | expect(codec.send(:sanitize_extension_val, "foo\rbar")).to be == "foo\\nbar" 349 | expect(codec.send(:sanitize_extension_val, "foo\r\nbar")).to be == "foo\\nbar" 350 | expect(codec.send(:sanitize_extension_val, "foo\r\nbar\r\nbaz")).to be == "foo\\nbar\\nbaz" 351 | expect(codec.send(:sanitize_extension_val, "foo\\bar")).to be == "foo\\\\bar" 352 | expect(codec.send(:sanitize_extension_val, "foo|bar")).to be == "foo|bar" 353 | expect(codec.send(:sanitize_extension_val, "foo=bar")).to be == "foo\\=bar" 354 | expect(codec.send(:sanitize_extension_val, 123)).to be == "123" # Input value is a Fixnum 355 | expect(codec.send(:sanitize_extension_val, 123.123)).to be == "123.123" # Input value is a Float 356 | expect(codec.send(:sanitize_extension_val, [])).to be == "[]" # Input value is an Array 357 | expect(codec.send(:sanitize_extension_val, {})).to be == "{}" # Input value is a Hash 358 | end 359 | end 360 | 361 | context "valid_severity?" do 362 | subject(:codec) { LogStash::Codecs::CEF.new } 363 | 364 | it "should validate severity" do 365 | expect(codec.send(:valid_severity?, nil)).to be == false 366 | expect(codec.send(:valid_severity?, "")).to be == false 367 | expect(codec.send(:valid_severity?, "foo")).to be == false 368 | expect(codec.send(:valid_severity?, "1.5")).to be == false 369 | expect(codec.send(:valid_severity?, "-1")).to be == false 370 | expect(codec.send(:valid_severity?, "11")).to be == false 371 | expect(codec.send(:valid_severity?, "0")).to be == true 372 | expect(codec.send(:valid_severity?, "10")).to be == true 373 | expect(codec.send(:valid_severity?, "1.0")).to be == true 374 | expect(codec.send(:valid_severity?, 1)).to be == true 375 | expect(codec.send(:valid_severity?, 1.0)).to be == true 376 | end 377 | end 378 | 379 | module DecodeHelpers 380 | def validate(e) 381 | insist { e.is_a?(LogStash::Event) } 382 | send("validate_ecs_#{ecs_compatibility}", e) 383 | end 384 | 385 | def validate_ecs_v1(e) 386 | insist { e.get('[cef][version]') } == "0" 387 | insist { e.get('[observer][version]') } == "1.0" 388 | insist { e.get('[event][code]') } == "100" 389 | insist { e.get('[cef][name]') } == "trojan successfully stopped" 390 | insist { e.get('[event][severity]') } == "10" 391 | end 392 | 393 | def validate_ecs_disabled(e) 394 | insist { e.get('cefVersion') } == "0" 395 | insist { e.get('deviceVersion') } == "1.0" 396 | insist { e.get('deviceEventClassId') } == "100" 397 | insist { e.get('name') } == "trojan successfully stopped" 398 | insist { e.get('severity') } == "10" 399 | end 400 | 401 | ## 402 | # Use the given codec to decode the given data, ensuring exactly one event is emitted. 403 | # 404 | # If a block is given, yield the resulting event to the block _outside_ of `LogStash::Codecs::CEF#decode(String)` 405 | # in order to avoid mismatched-exceptions raised by RSpec triggering the codec's exception-handling. 406 | # 407 | # @param codec [#decode] 408 | # @param data [String] 409 | # @yieldparam event [Event] 410 | # @yieldreturn [void] 411 | # @return [Event] 412 | def decode_one(codec, data, flush: true, &block) 413 | events = do_decode(codec, data, flush: flush) 414 | fail("Expected one event, got #{events.size} events: #{events.inspect}") unless events.size == 1 415 | event = events.first 416 | 417 | if block 418 | enriched_event_validation(event) do |e| 419 | aggregate_failures('decode one') do 420 | yield e 421 | end 422 | end 423 | end 424 | 425 | event 426 | end 427 | 428 | ## 429 | # Use the given codec to decode the given data, returning an Array of the resulting Events 430 | # 431 | # If a block is given, each event is yielded to the block _outside_ of `LogStash::Codecs::CEF#decode(String)` 432 | # in order to avoid mismatched-exceptions raised by RSpec triggering the codec's exception-handling. 433 | # 434 | # @param codec [#decode] 435 | # @param data [String] 436 | # @yieldparam event [Event] 437 | # @yieldreturn [void] 438 | # @return [Array] 439 | def do_decode(codec, data, flush: true, &block) 440 | events = [] 441 | codec.decode(data) do |event| 442 | events << event 443 | end 444 | flush && codec.flush do |event| 445 | events << event 446 | end 447 | 448 | if block 449 | events.each do |event| 450 | enriched_event_validation(event, &block) 451 | end 452 | end 453 | 454 | events 455 | end 456 | 457 | ## 458 | # Enrich event validation by outputting the serialized event to stderr 459 | # if-and-only-if the provided block's rspec expectations are not met. 460 | # 461 | # @param event [#to_hash_with_metadata] 462 | def enriched_event_validation(event) 463 | yield(event) 464 | rescue RSpec::Expectations::ExpectationNotMetError 465 | $stderr.puts("\e[35m#{event.to_hash_with_metadata}\e[0m\n") 466 | raise 467 | end 468 | end 469 | 470 | context "#decode", :ecs_compatibility_support do 471 | ecs_compatibility_matrix(:disabled,:v1) do |ecs_select| 472 | before(:each) do 473 | allow_any_instance_of(described_class).to receive(:ecs_compatibility).and_return(ecs_compatibility) 474 | end 475 | 476 | let(:message) { "CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.192 dst=12.121.122.82 spt=1232" } 477 | 478 | include DecodeHelpers 479 | 480 | context "with delimiter set" do 481 | # '\r\n' in single quotes to simulate the real input from a config 482 | # containing \r\n as 4-character sequence in the config: 483 | # 484 | # delimiter => "\r\n" 485 | # 486 | # Related: https://github.com/elastic/logstash/issues/1645 487 | subject(:codec) { LogStash::Codecs::CEF.new("delimiter" => '\r\n') } 488 | 489 | let(:message_two) { "CEF:0|fun|whimsy|1.0|100|trojan successfully stopped|10|src=10.0.0.192 dst=12.121.122.82 spt=1232" } 490 | 491 | # testing implicit flush when 492 | it "should parse on the delimiter " do 493 | do_decode(subject, message, flush: false) do |e| 494 | raise Exception.new("Should not get here. If we do, it means the decoder emitted an event before the delimiter was seen?") 495 | end 496 | 497 | # the delimiter's presence flushes what we already received, but not the new bytes we send 498 | decode_one(subject, "\r\n#{message_two}", flush: false) do |e| 499 | validate(e) 500 | insist { e.get(ecs_select[disabled: "deviceVendor", v1:"[observer][vendor]"]) } == "security" 501 | insist { e.get(ecs_select[disabled: "deviceProduct", v1:"[observer][product]"]) } == "threatmanager" 502 | end 503 | 504 | # allowing a flush emits the buffered event with our new bits appended 505 | decode_one(subject, " split=perfect", flush: true) do |e| 506 | validate(e) 507 | insist { e.get(ecs_select[disabled: "deviceVendor", v1:"[observer][vendor]"]) } == "fun" 508 | insist { e.get(ecs_select[disabled: "deviceProduct", v1:"[observer][product]"]) } == "whimsy" 509 | insist { e.get("split") } == "perfect" 510 | end 511 | end 512 | 513 | it 'flushes on close' do 514 | # message does NOT have delimiter, but we still get our event 515 | decode_one(subject, message, flush: true) do |e| 516 | validate(e) 517 | insist { e.get(ecs_select[disabled: "deviceVendor", v1:"[observer][vendor]"]) } == "security" 518 | insist { e.get(ecs_select[disabled: "deviceProduct", v1:"[observer][product]"]) } == "threatmanager" 519 | end 520 | end 521 | 522 | it 'emits multiple from a single decode operation' do 523 | events = do_decode(subject, "#{message}\r\n#{message_two}") 524 | expect(events.size).to eq(2) 525 | 526 | enriched_event_validation(events[0]) do |event| 527 | validate(event) 528 | insist { event.get(ecs_select[disabled: "deviceVendor", v1:"[observer][vendor]"]) } == "security" 529 | insist { event.get(ecs_select[disabled: "deviceProduct", v1:"[observer][product]"]) } == "threatmanager" 530 | end 531 | 532 | enriched_event_validation(events[1]) do |event| 533 | validate(event) 534 | insist { event.get(ecs_select[disabled: "deviceVendor", v1:"[observer][vendor]"]) } == "fun" 535 | insist { event.get(ecs_select[disabled: "deviceProduct", v1:"[observer][product]"]) } == "whimsy" 536 | end 537 | end 538 | end 539 | 540 | # CEF requires seven pipe-terminated headers before optional extensions 541 | context 'with a non-CEF payload' do 542 | let(:logger_stub) { double('Logger').as_null_object } 543 | before(:each) do 544 | allow_any_instance_of(described_class).to receive(:logger).and_return(logger_stub) 545 | end 546 | 547 | context 'containing 0 header-like sections' do 548 | let(:message) { 'this is not cef' } 549 | it 'logs helpfully and produces a tagged event' do 550 | do_decode(subject,message) do |event| 551 | expect(event.get('tags')).to include('_cefparsefailure') 552 | expect(event.get('message')).to eq(message) 553 | end 554 | expect(logger_stub).to have_received(:error) 555 | .with(a_string_including('Failed to decode CEF payload. Generating failure event with payload in message field'), 556 | a_hash_including(exception: a_string_including("found 0 of 7 required pipe-terminated header fields"), 557 | original_data: message)) 558 | end 559 | end 560 | context 'containing 4 header-like sections' do 561 | let(:message) { "a|b|c with several \\| escaped\\| pipes|d|bananas" } 562 | it 'logs helpfully and produces a tagged event' do 563 | do_decode(subject,message) do |event| 564 | expect(event.get('tags')).to include('_cefparsefailure') 565 | expect(event.get('message')).to eq(message) 566 | end 567 | expect(logger_stub).to have_received(:error) 568 | .with(a_string_including('Failed to decode CEF payload. Generating failure event with payload in message field'), 569 | a_hash_including(exception: a_string_including("found 4 of 7 required pipe-terminated header fields"), 570 | original_data: message)) 571 | end 572 | end 573 | context 'containing non-key/value extensions' do 574 | let (:message) { "CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|this is in the extensions space but it is not valid because it is not equals-separated key/value" } 575 | it 'logs helpfully and produces a tagged event' do 576 | do_decode(subject,message) do |event| 577 | expect(event.get('tags')).to include('_cefparsefailure') 578 | expect(event.get('message')).to eq(message) 579 | end 580 | expect(logger_stub).to have_received(:error) 581 | .with(a_string_including('Failed to decode CEF payload. Generating failure event with payload in message field'), 582 | a_hash_including(exception: a_string_including("invalid extensions; keyless value present"), 583 | original_data: message)) 584 | end 585 | end 586 | context 'containing unescaped newlines' do 587 | # when not using a `delimiter`, we expect exactly one CEF log per call to decode. 588 | let (:message) { 589 | <<~EOMESSAGE 590 | CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.67 591 | CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.67 592 | CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.67 593 | CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.67 594 | CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.67 595 | EOMESSAGE 596 | } 597 | it 'logs helpfully and produces a tagged event' do 598 | do_decode(subject, message) do |event| 599 | expect(event.get('tags')).to include('_cefparsefailure') 600 | expect(event.get('message')).to eq(message) 601 | end 602 | expect(logger_stub).to have_received(:error) 603 | .with(a_string_including('Failed to decode CEF payload. Generating failure event with payload in message field'), 604 | a_hash_including(exception: a_string_including("message is not valid CEF because it contains unescaped newline characters", 605 | "use the `delimiter` setting to enable in-codec buffering and delimiter-splitting"), 606 | original_data: message)) 607 | end 608 | end 609 | end 610 | 611 | context 'when a CEF header ends with a pair of properly-escaped backslashes' do 612 | let(:backslash) { '\\' } 613 | let(:pipe) { '|' } 614 | let(:message) { "CEF:0|security|threatmanager|1.0|100|double backslash" + 615 | backslash + backslash + # escaped backslash 616 | backslash + backslash + # escaped backslash 617 | "|10|src=10.0.0.192 dst=12.121.122.82 spt=1232" } 618 | 619 | it 'should include the backslashes unescaped' do 620 | event = decode_one(subject, message) 621 | 622 | expect(event.get(ecs_select[disabled:'name', v1:'[cef][name]'])).to eq('double backslash' + backslash + backslash ) 623 | expect(event.get(ecs_select[disabled:'severity',v1:'[event][severity]'])).to eq('10') # ensure we didn't consume the separator 624 | end 625 | end 626 | 627 | it "should parse the cef headers" do 628 | decode_one(subject, message) do |e| 629 | validate(e) 630 | insist { e.get(ecs_select[disabled:"deviceVendor", v1:"[observer][vendor]"]) } == "security" 631 | insist { e.get(ecs_select[disabled:"deviceProduct",v1:"[observer][product]"]) } == "threatmanager" 632 | end 633 | end 634 | 635 | it "should parse the cef body" do 636 | decode_one(subject, message) do |e| 637 | insist { e.get(ecs_select[disabled:"sourceAddress", v1:"[source][ip]"])} == "10.0.0.192" 638 | insist { e.get(ecs_select[disabled:"destinationAddress",v1:"[destination][ip]"]) } == "12.121.122.82" 639 | insist { e.get(ecs_select[disabled:"sourcePort", v1:"[source][port]"]) } == "1232" 640 | end 641 | end 642 | 643 | let (:missing_headers) { "CEF:0|||1.0|100|trojan successfully stopped|10|src=10.0.0.192 dst=12.121.122.82 spt=1232" } 644 | it "should be OK with missing CEF headers (multiple pipes in sequence)" do 645 | decode_one(subject, missing_headers) do |e| 646 | validate(e) 647 | insist { e.get(ecs_select[disabled:"deviceVendor", v1:"[observer][vendor]"]) } == "" 648 | insist { e.get(ecs_select[disabled:"deviceProduct",v1:"[observer][product]"]) } == "" 649 | end 650 | end 651 | 652 | let (:leading_whitespace) { "CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10| src=10.0.0.192 dst=12.121.122.82 spt=1232" } 653 | it "should strip leading whitespace from the message" do 654 | decode_one(subject, leading_whitespace) do |e| 655 | validate(e) 656 | end 657 | end 658 | 659 | let (:escaped_pipes) { 'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|moo=this\|has an escaped pipe' } 660 | it "should be OK with escaped pipes in the message" do 661 | decode_one(subject, escaped_pipes) do |e| 662 | insist { e.get("moo") } == 'this\|has an escaped pipe' 663 | end 664 | end 665 | 666 | let (:pipes_in_message) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|moo=this|has an pipe'} 667 | it "should be OK with not escaped pipes in the message" do 668 | decode_one(subject, pipes_in_message) do |e| 669 | insist { e.get("moo") } == 'this|has an pipe' 670 | end 671 | end 672 | 673 | # while we may see these in practice, equals MUST be escaped in the extensions per the spec. 674 | let (:equal_in_message) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|moo=this =has = equals\='} 675 | it "should be OK with equal in the message" do 676 | decode_one(subject, equal_in_message) do |e| 677 | insist { e.get("moo") } == 'this =has = equals=' 678 | end 679 | end 680 | 681 | let(:literal_newline) { "\n" } 682 | let(:literal_carriage_return) { "\r" } 683 | let(:literal_equals) { "=" } 684 | let(:literal_backslash) { "\\" } 685 | let(:escaped_newline) { literal_backslash + 'n' } 686 | let(:escaped_carriage_return) { literal_backslash + 'r' } 687 | let(:escaped_equals) { literal_backslash + literal_equals } 688 | let(:escaped_backslash) { literal_backslash + literal_backslash } 689 | let(:escaped_sequences_in_extension_value) { "CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|foo=bar msg=this message has escaped equals #{escaped_equals} and escaped newlines #{escaped_newline} escaped carriage returns #{escaped_carriage_return} and escaped backslashes #{escaped_backslash} in it bar=baz" } 690 | it "decodes embedded newlines, carriage regurns, backslashes, and equals signs" do 691 | decode_one(subject, escaped_sequences_in_extension_value) do |e| 692 | insist { e.get("foo") } == 'bar' 693 | insist { e.get("message") } == "this message has escaped equals #{literal_equals} and escaped newlines #{literal_newline} escaped carriage returns #{literal_carriage_return} and escaped backslashes #{literal_backslash} in it" 694 | insist { e.get("bar") } == 'baz' 695 | end 696 | end 697 | 698 | context "zoneless deviceReceiptTime(rt) when deviceTimeZone(dtz) is provided" do 699 | let(:cef_formatted_timestamp) { 'Jul 19 2017 10:50:21.127' } 700 | let(:zone_name) { 'Europe/Moscow' } 701 | 702 | let(:utc_timestamp) { Time.iso8601("2017-07-19T07:50:21.127Z") } # In summer of 2017, Europe/Moscow was UTC+03:00 703 | 704 | let(:destination_time_zoned) { %Q{CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|Very-High| eventId=1 msg=Worm successfully stopped art=1500464384997 deviceSeverity=10 rt=#{cef_formatted_timestamp} src=10.0.0.1 sourceZoneURI=/All Zones/ArcSight System/Private Address Space Zones/RFC1918: 10.0.0.0-10.255.255.255 spt=1232 dst=2.1.2.2 destinationZoneURI=/All Zones/ArcSight System/Public Address Space Zones/RIPE NCC/2.0.0.0-2.255.255.255 (RIPE NCC) ahost=connector.rhel72 agt=192.168.231.129 agentZoneURI=/All Zones/ArcSight System/Private Address Space Zones/RFC1918: 192.168.0.0-192.168.255.255 amac=00-0C-29-51-8A-84 av=7.6.0.8009.0 atz=Europe/Lisbon at=syslog_file dvchost=client1 dtz=#{zone_name} _cefVer=0.1 aid=3UBajWl0BABCABBzZSlmUdw==} } 705 | 706 | if ecs_select.active_mode == :disabled 707 | it 'persists deviceReceiptTime and deviceTimeZone verbatim' do 708 | decode_one(subject, destination_time_zoned) do |event| 709 | expect(event.get('deviceReceiptTime')).to eq("Jul 19 2017 10:50:21.127") 710 | expect(event.get('deviceTimeZone')).to eq('Europe/Moscow') 711 | end 712 | end 713 | else 714 | it 'sets the @timestamp using the value in `rt` combined with the offset provided by `dtz`' do 715 | decode_one(subject, destination_time_zoned) do |event| 716 | expected_time = LogStash::Timestamp.new(utc_timestamp) 717 | expect(event.get('[@timestamp]').to_s).to eq(expected_time.to_s) 718 | expect(event.get('[event][timezone]')).to eq(zone_name) 719 | end 720 | end 721 | end 722 | end 723 | 724 | context "timestamp-normalized fields" do 725 | context 'empty values' do 726 | let(:message_with_empty_start) { %Q{CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|Very-High| eventId=1 msg=Worm successfully stopped start=} } 727 | if ecs_select.active_mode == :disabled 728 | it 'leaves the empty value in-tact' do 729 | decode_one(subject, message_with_empty_start) do |event| 730 | expect(event.get('startTime')).to eq('') 731 | end 732 | end 733 | else 734 | it 'stores a nil value' do 735 | decode_one(subject, message_with_empty_start) do |event| 736 | expect(event).to include '[event][start]' 737 | expect(event.get('[event][start]')).to be nil 738 | end 739 | end 740 | end 741 | end 742 | end 743 | 744 | let(:malformed_unescaped_equals_in_extension_value) { %q{CEF:0|FooBar|Web Gateway|1.2.3.45.67|200|Success|2|rt=Sep 07 2018 14:50:39 cat=Access Log dst=1.1.1.1 dhost=foo.example.com suser=redacted src=2.2.2.2 requestMethod=POST request='https://foo.example.com/bar/bingo/1' requestClientApplication='Foo-Bar/2018.1.7; Email:user@example.com; Guid:test=' cs1= cs1Label=Foo Bar} } 745 | it 'should split correctly' do 746 | decode_one(subject, malformed_unescaped_equals_in_extension_value) do |event| 747 | expect(event.get(ecs_select[disabled:"cefVersion", v1:"[cef][version]"])).to eq('0') 748 | expect(event.get(ecs_select[disabled:"deviceVendor", v1:"[observer][vendor]"])).to eq('FooBar') 749 | expect(event.get(ecs_select[disabled:"deviceProduct", v1:"[observer][product]"])).to eq('Web Gateway') 750 | expect(event.get(ecs_select[disabled:"deviceVersion", v1:"[observer][version]"])).to eq('1.2.3.45.67') 751 | expect(event.get(ecs_select[disabled:"deviceEventClassId",v1:"[event][code]"])).to eq('200') 752 | expect(event.get(ecs_select[disabled:"name", v1:"[cef][name]"])).to eq('Success') 753 | expect(event.get(ecs_select[disabled:"severity", v1:"[event][severity]"])).to eq('2') 754 | 755 | # extension key/value pairs 756 | if ecs_compatibility == :disabled 757 | expect(event.get('deviceReceiptTime')).to eq('Sep 07 2018 14:50:39') 758 | else 759 | expected_time = LogStash::Timestamp.new(Time.parse('Sep 07 2018 14:50:39')).to_s 760 | expect(event.get('[@timestamp]').to_s).to eq(expected_time) 761 | end 762 | expect(event.get(ecs_select[disabled:'deviceEventCategory', v1:'[cef][category]'])).to eq('Access Log') 763 | expect(event.get(ecs_select[disabled:'deviceVersion', v1:'[observer][version]'])).to eq('1.2.3.45.67') 764 | expect(event.get(ecs_select[disabled:'destinationAddress', v1:'[destination][ip]'])).to eq('1.1.1.1') 765 | expect(event.get(ecs_select[disabled:'destinationHostName', v1:'[destination][domain]'])).to eq('foo.example.com') 766 | expect(event.get(ecs_select[disabled:'sourceUserName', v1:'[source][user][name]'])).to eq('redacted') 767 | expect(event.get(ecs_select[disabled:'sourceAddress', v1:'[source][ip]'])).to eq('2.2.2.2') 768 | expect(event.get(ecs_select[disabled:'requestMethod', v1:'[http][request][method]'])).to eq('POST') 769 | expect(event.get(ecs_select[disabled:'requestUrl', v1:'[url][original]'])).to eq(%q{'https://foo.example.com/bar/bingo/1'}) 770 | # Although the value for `requestClientApplication` contains an illegal unquoted equals sign, the sequence 771 | # preceeding the unescaped-equals isn't shaped like a key, so we allow it to be a part of the value. 772 | expect(event.get(ecs_select[disabled:'requestClientApplication',v1:'[user_agent][original]'])).to eq(%q{'Foo-Bar/2018.1.7; Email:user@example.com; Guid:test='}) 773 | expect(event.get(ecs_select[disabled:'deviceCustomString1Label',v1:'[cef][device_custom_string_1][label]'])).to eq('Foo Bar') 774 | expect(event.get(ecs_select[disabled:'deviceCustomString1', v1:'[cef][device_custom_string_1][value]'])).to eq('') 775 | end 776 | end 777 | 778 | context('escaped-equals and unescaped-spaces in the extension values') do 779 | let(:query_string) { 'key1=value1&key2=value3 aa.bc&key3=value4'} 780 | let(:escaped_query_string) { query_string.gsub('=','\\=') } 781 | let(:cef_message) { "CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|go=start now query_string=#{escaped_query_string} final=done" } 782 | 783 | it 'captures the extension values correctly' do 784 | event = decode_one(subject, cef_message) 785 | 786 | expect(event.get('go')).to eq('start now') 787 | expect(event.get('query_string')).to eq(query_string) 788 | expect(event.get('final')).to eq('done') 789 | end 790 | end 791 | 792 | let (:escaped_backslash_in_header) {'CEF:0|secu\\\\rity|threat\\\\manager|1.\\\\0|10\\\\0|tro\\\\jan successfully stopped|\\\\10|'} 793 | it "should be OK with escaped backslash in the headers" do 794 | decode_one(subject, escaped_backslash_in_header) do |e| 795 | insist { e.get(ecs_select[disabled:"cefVersion", v1:"[cef][version]"]) } == '0' 796 | insist { e.get(ecs_select[disabled:"deviceVendor", v1:"[observer][vendor]"]) } == 'secu\\rity' 797 | insist { e.get(ecs_select[disabled:"deviceProduct", v1:"[observer][product]"]) } == 'threat\\manager' 798 | insist { e.get(ecs_select[disabled:"deviceVersion", v1:"[observer][version]"]) } == '1.\\0' 799 | insist { e.get(ecs_select[disabled:"deviceEventClassId",v1:"[event][code]"]) } == '10\\0' 800 | insist { e.get(ecs_select[disabled:"name", v1:"[cef][name]"]) } == 'tro\\jan successfully stopped' 801 | insist { e.get(ecs_select[disabled:"severity", v1:"[event][severity]"]) } == '\\10' 802 | end 803 | end 804 | 805 | let (:escaped_backslash_in_header_edge_case) {'CEF:0|security\\\\\\||threatmanager\\\\|1.0|100|trojan successfully stopped|10|'} 806 | it "should be OK with escaped backslash in the headers (edge case: escaped slash in front of pipe)" do 807 | decode_one(subject, escaped_backslash_in_header_edge_case) do |e| 808 | validate(e) 809 | insist { e.get(ecs_select[disabled:"deviceVendor", v1:"[observer][vendor]"]) } == 'security\\|' 810 | insist { e.get(ecs_select[disabled:"deviceProduct",v1:"[observer][product]"]) } == 'threatmanager\\' 811 | end 812 | end 813 | 814 | let (:escaped_pipes_in_header) {'CEF:0|secu\\|rity|threatmanager\\||1.\\|0|10\\|0|tro\\|jan successfully stopped|\\|10|'} 815 | it "should be OK with escaped pipes in the headers" do 816 | decode_one(subject, escaped_pipes_in_header) do |e| 817 | insist { e.get(ecs_select[disabled:"cefVersion", v1:"[cef][version]"]) } == '0' 818 | insist { e.get(ecs_select[disabled:"deviceVendor", v1:"[observer][vendor]"]) } == 'secu|rity' 819 | insist { e.get(ecs_select[disabled:"deviceProduct", v1:"[observer][product]"]) } == 'threatmanager|' 820 | insist { e.get(ecs_select[disabled:"deviceVersion", v1:"[observer][version]"]) } == '1.|0' 821 | insist { e.get(ecs_select[disabled:"deviceEventClassId",v1:"[event][code]"]) } == '10|0' 822 | insist { e.get(ecs_select[disabled:"name", v1:"[cef][name]"]) } == 'tro|jan successfully stopped' 823 | insist { e.get(ecs_select[disabled:"severity", v1:"[event][severity]"]) } == '|10' 824 | end 825 | end 826 | 827 | let (:backslash_in_message) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|moo=this \\has \\ backslashs\\'} 828 | it "should be OK with backslashs in the message" do 829 | decode_one(subject, backslash_in_message) do |e| 830 | insist { e.get("moo") } == 'this \\has \\ backslashs\\' 831 | end 832 | end 833 | 834 | let (:equal_in_header) {'CEF:0|security|threatmanager=equal|1.0|100|trojan successfully stopped|10|'} 835 | it "should be OK with equal in the headers" do 836 | decode_one(subject, equal_in_header) do |e| 837 | validate(e) 838 | insist { e.get(ecs_select[disabled:"deviceProduct",v1:"[observer][product]"]) } == "threatmanager=equal" 839 | end 840 | end 841 | 842 | let (:spaces_in_between_keys) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10| src=10.0.0.192 dst=12.121.122.82 spt=1232'} 843 | it "should be OK to have one or more spaces between keys" do 844 | decode_one(subject, spaces_in_between_keys) do |e| 845 | validate(e) 846 | insist { e.get(ecs_select[disabled:"sourceAddress",v1:"[source][ip]"]) } == "10.0.0.192" 847 | insist { e.get(ecs_select[disabled:"destinationAddress",v1:"[destination][ip]"]) } == "12.121.122.82" 848 | insist { e.get(ecs_select[disabled:"sourcePort",v1:"[source][port]"]) } == "1232" 849 | end 850 | end 851 | 852 | let (:dots_in_keys) {'CEF:0|Vendor|Device|Version|13|my message|5|dvchost=loghost cat=traffic deviceSeverity=notice ad.nn=TEST src=192.168.0.1 destinationPort=53'} 853 | it "should be OK with dots in keys" do 854 | decode_one(subject, dots_in_keys) do |e| 855 | insist { e.get(ecs_select[disabled:"deviceHostName",v1:"[observer][hostname]"]) } == "loghost" 856 | insist { e.get("ad.nn") } == 'TEST' 857 | insist { e.get(ecs_select[disabled:"sourceAddress",v1:"[source][ip]"]) } == '192.168.0.1' 858 | insist { e.get(ecs_select[disabled:"destinationPort",v1:"[destination][port]"]) } == '53' 859 | end 860 | end 861 | 862 | let (:allow_spaces_in_values) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.192 dst=12.121.122.82 spt=1232 dproc=InternetExplorer x.x.x.x'} 863 | it "should be OK to have one or more spaces in values" do 864 | decode_one(subject, allow_spaces_in_values) do |e| 865 | validate(e) 866 | insist { e.get(ecs_select[disabled:"sourceAddress",v1:"[source][ip]"]) } == "10.0.0.192" 867 | insist { e.get(ecs_select[disabled:"destinationAddress",v1:"[destination][ip]"]) } == "12.121.122.82" 868 | insist { e.get(ecs_select[disabled:"sourcePort",v1:"[source][port]"]) } == "1232" 869 | insist { e.get(ecs_select[disabled:"destinationProcessName",v1:"[destination][process][name]"]) } == "InternetExplorer x.x.x.x" 870 | end 871 | end 872 | 873 | let (:preserve_additional_fields_with_dot_notations) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.192 additional.dotfieldName=new_value ad.Authentification=MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 ad.Error_,Code=3221225578 dst=12.121.122.82 ad.field[0]=field0 ad.name[1]=new_name'} 874 | it "should keep ad.fields" do 875 | decode_one(subject, preserve_additional_fields_with_dot_notations) do |e| 876 | validate(e) 877 | insist { e.get(ecs_select[disabled:"sourceAddress",v1:"[source][ip]"]) } == "10.0.0.192" 878 | insist { e.get(ecs_select[disabled:"destinationAddress",v1:"[destination][ip]"]) } == "12.121.122.82" 879 | insist { e.get("[ad.field][0]") } == "field0" 880 | insist { e.get("[ad.name][1]") } == "new_name" 881 | insist { e.get("ad.Authentification") } == "MICROSOFT_AUTHENTICATION_PACKAGE_V1_0" 882 | insist { e.get('ad.Error_,Code') } == "3221225578" 883 | insist { e.get("additional.dotfieldName") } == "new_value" 884 | end 885 | end 886 | 887 | let(:preserve_complex_multiple_dot_notation_in_extension_fields) { 'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.192 additional.dotfieldName=new_value ad.Authentification=MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 ad.Error_,Code=3221225578 dst=12.121.122.82 ad.field[0]=field0 ad.foo.name[1]=new_name' } 888 | it "should keep ad.fields" do 889 | decode_one(subject, preserve_complex_multiple_dot_notation_in_extension_fields) do |e| 890 | validate(e) 891 | insist { e.get(ecs_select[disabled:"sourceAddress",v1:"[source][ip]"]) } == "10.0.0.192" 892 | insist { e.get(ecs_select[disabled:"destinationAddress",v1:"[destination][ip]"]) } == "12.121.122.82" 893 | insist { e.get("[ad.field][0]") } == "field0" 894 | insist { e.get("[ad.foo.name][1]") } == "new_name" 895 | insist { e.get("ad.Authentification") } == "MICROSOFT_AUTHENTICATION_PACKAGE_V1_0" 896 | insist { e.get('ad.Error_,Code') } == "3221225578" 897 | insist { e.get("additional.dotfieldName") } == "new_value" 898 | end 899 | end 900 | 901 | let (:preserve_random_values_key_value_pairs_alongside_with_additional_fields) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.192 cs4=401 random.user Admin 0 23041A10181C0000 23041810181C0000 /CN\=random.user/OU\=User Login End-Entity /CN\=TEST/OU\=Login CA TEST 34 additional.dotfieldName=new_value ad.Authentification=MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 ad.Error_,Code=3221225578 dst=12.121.122.82 ad.field[0]=field0 ad.name[1]=new_name'} 902 | it "should correctly parse random values even with additional fields in message" do 903 | decode_one(subject, preserve_random_values_key_value_pairs_alongside_with_additional_fields) do |e| 904 | validate(e) 905 | insist { e.get(ecs_select[disabled:"sourceAddress",v1:"[source][ip]"]) } == "10.0.0.192" 906 | insist { e.get(ecs_select[disabled:"destinationAddress",v1:"[destination][ip]"]) } == "12.121.122.82" 907 | insist { e.get("[ad.field][0]") } == "field0" 908 | insist { e.get("[ad.name][1]") } == "new_name" 909 | insist { e.get("ad.Authentification") } == "MICROSOFT_AUTHENTICATION_PACKAGE_V1_0" 910 | insist { e.get("ad.Error_,Code") } == "3221225578" 911 | insist { e.get("additional.dotfieldName") } == "new_value" 912 | insist { e.get(ecs_select[disabled:"deviceCustomString4",v1:"[cef][device_custom_string_4][value]"]) } == "401 random.user Admin 0 23041A10181C0000 23041810181C0000 /CN\=random.user/OU\=User Login End-Entity /CN\=TEST/OU\=Login CA TEST 34" 913 | end 914 | end 915 | 916 | let (:preserve_unmatched_key_mappings) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.192 dst=12.121.122.82 new_key_by_device=new_values here'} 917 | it "should preserve unmatched key mappings" do 918 | decode_one(subject, preserve_unmatched_key_mappings) do |e| 919 | validate(e) 920 | insist { e.get(ecs_select[disabled:"sourceAddress",v1:"[source][ip]"]) } == "10.0.0.192" 921 | insist { e.get(ecs_select[disabled:"destinationAddress",v1:"[destination][ip]"]) } == "12.121.122.82" 922 | insist { e.get("new_key_by_device") } == "new_values here" 923 | end 924 | end 925 | 926 | let (:translate_abbreviated_cef_fields) {'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.192 dst=12.121.122.82 proto=TCP shost=source.host.name dhost=destination.host.name spt=11024 dpt=9200 outcome=Success amac=00:80:48:1c:24:91'} 927 | it "should translate most known abbreviated CEF field names" do 928 | decode_one(subject, translate_abbreviated_cef_fields) do |e| 929 | validate(e) 930 | insist { e.get(ecs_select[disabled:"sourceAddress", v1:"[source][ip]"]) } == "10.0.0.192" 931 | insist { e.get(ecs_select[disabled:"destinationAddress", v1:"[destination][ip]"]) } == "12.121.122.82" 932 | insist { e.get(ecs_select[disabled:"transportProtocol", v1:"[network][transport]"]) } == "TCP" 933 | insist { e.get(ecs_select[disabled:"sourceHostName", v1:"[source][domain]"]) } == "source.host.name" 934 | insist { e.get(ecs_select[disabled:"destinationHostName",v1:"[destination][domain]"]) } == "destination.host.name" 935 | insist { e.get(ecs_select[disabled:"sourcePort", v1:"[source][port]"]) } == "11024" 936 | insist { e.get(ecs_select[disabled:"destinationPort", v1:"[destination][port]"]) } == "9200" 937 | insist { e.get(ecs_select[disabled:"eventOutcome", v1:"[event][outcome]"]) } == "Success" 938 | insist { e.get(ecs_select[disabled:"agentMacAddress", v1:"[agent][mac]"])} == "00:80:48:1c:24:91" 939 | end 940 | end 941 | 942 | let (:syslog) { "Syslogdate Sysloghost CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=10.0.0.192 dst=12.121.122.82 spt=1232" } 943 | it "Should detect headers before CEF starts" do 944 | decode_one(subject, syslog) do |e| 945 | validate(e) 946 | insist { e.get(ecs_select[disabled:'syslog',v1:'[log][syslog][header]']) } == 'Syslogdate Sysloghost' 947 | end 948 | end 949 | 950 | let(:log_with_fileHash) { "Syslogdate Sysloghost CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|fileHash=1bad1dea" } 951 | it 'decodes fileHash to [file][hash]' do 952 | decode_one(subject, log_with_fileHash) do |e| 953 | validate(e) 954 | insist { e.get(ecs_select[disabled:"fileHash", v1:"[file][hash]"]) } == "1bad1dea" 955 | end 956 | end 957 | 958 | let(:log_with_custom_typed_fields) { "Syslogdate Sysloghost CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|cfp15=3.1415926 cfp15Label=pi c6a12=::1 c6a12Label=localhost cn7=8191 cn7Label=mersenne cs4=silly cs4Label=theory" } 959 | it 'decodes to mapped numbered fields' do 960 | decode_one(subject, log_with_custom_typed_fields) do |e| 961 | validate(e) 962 | insist { e.get(ecs_select[disabled: "deviceCustomFloatingPoint15", v1: "[cef][device_custom_floating_point_15][value]"]) } == "3.1415926" 963 | insist { e.get(ecs_select[disabled: "deviceCustomFloatingPoint15Label", v1: "[cef][device_custom_floating_point_15][label]"]) } == "pi" 964 | insist { e.get(ecs_select[disabled: "deviceCustomIPv6Address12", v1: "[cef][device_custom_ipv6_address_12][value]"]) } == "::1" 965 | insist { e.get(ecs_select[disabled: "deviceCustomIPv6Address12Label", v1: "[cef][device_custom_ipv6_address_12][label]"]) } == "localhost" 966 | insist { e.get(ecs_select[disabled: "deviceCustomNumber7", v1: "[cef][device_custom_number_7][value]"]) } == "8191" 967 | insist { e.get(ecs_select[disabled: "deviceCustomNumber7Label", v1: "[cef][device_custom_number_7][label]"]) } == "mersenne" 968 | insist { e.get(ecs_select[disabled: "deviceCustomString4", v1: "[cef][device_custom_string_4][value]"]) } == "silly" 969 | insist { e.get(ecs_select[disabled: "deviceCustomString4Label", v1: "[cef][device_custom_string_4][label]"]) } == "theory" 970 | end 971 | end 972 | 973 | context 'with UTF-8 message' do 974 | let(:message) { 'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=192.168.1.11 target=aaaaaああああaaaa msg=Description Omitted' } 975 | 976 | # since this spec is encoded UTF-8, the literal strings it contains are encoded with UTF-8, 977 | # but codecs in Logstash tend to receive their input as BINARY (or: ASCII-8BIT); ensure that 978 | # we can handle either without losing the UTF-8 characters from the higher planes. 979 | %w( 980 | BINARY 981 | UTF-8 982 | ).each do |external_encoding| 983 | context "externally encoded as #{external_encoding}" do 984 | let(:message) { super().force_encoding(external_encoding) } 985 | it 'should keep the higher-plane characters' do 986 | decode_one(subject, message.dup) do |event| 987 | validate(event) 988 | insist { event.get("target") } == "aaaaaああああaaaa" 989 | insist { event.get("target").encoding } == Encoding::UTF_8 990 | end 991 | end 992 | end 993 | end 994 | end 995 | 996 | context 'non-UTF-8 message' do 997 | let(:logger_stub) { double('Logger').as_null_object } 998 | before(:each) do 999 | allow_any_instance_of(described_class).to receive(:logger).and_return(logger_stub) 1000 | end 1001 | let(:message) { 'CEF:0|security|threatmanager|1.0|100|trojan successfully stopped|10|src=192.168.1.11 target=aaaaaああああaaaa msg=Description Omitted'.encode('SHIFT_JIS') } 1002 | it 'should emit message unparsed with _cefparsefailure tag' do 1003 | decode_one(subject, message.dup) do |event| 1004 | insist { event.get("message").bytes.to_a } == message.bytes.to_a 1005 | insist { event.get("tags") } == ['_cefparsefailure'] 1006 | end 1007 | expect(logger_stub).to have_received(:error).with(/Failed to decode CEF payload/, any_args) 1008 | end 1009 | end 1010 | 1011 | context "with raw_data_field set" do 1012 | subject(:codec) { LogStash::Codecs::CEF.new("raw_data_field" => "message_raw") } 1013 | 1014 | it "should return the raw message in field message_raw" do 1015 | decode_one(subject, message.dup) do |e| 1016 | validate(e) 1017 | insist { e.get("message_raw") } == message 1018 | end 1019 | end 1020 | end 1021 | 1022 | context "legacy aliases" do 1023 | let(:cef_line) { "CEF:0|security|threatmanager|1.0|100|target acquired|10|destinationLongitude=-73.614830 destinationLatitude=45.505918 sourceLongitude=45.4628328 sourceLatitude=9.1076927" } 1024 | 1025 | it ecs_select[disabled:"creates the fields as provided",v1:"maps to ECS fields"] do 1026 | decode_one(codec, cef_line.dup) do |event| 1027 | # |---- LEGACY: AS-PROVIDED ----| |--------- ECS: MAP TO FIELD ----------| 1028 | expect(event.get(ecs_select[disabled:'destinationLongitude',v1:'[destination][geo][location][lon]'])).to eq('-73.614830') 1029 | expect(event.get(ecs_select[disabled:'destinationLatitude', v1:'[destination][geo][location][lat]'])).to eq('45.505918') 1030 | expect(event.get(ecs_select[disabled:'sourceLongitude', v1:'[source][geo][location][lon]' ])).to eq('45.4628328') 1031 | expect(event.get(ecs_select[disabled:'sourceLatitude', v1:'[source][geo][location][lat]' ])).to eq('9.1076927') 1032 | end 1033 | end 1034 | end 1035 | end 1036 | end 1037 | 1038 | context "encode and decode", :ecs_compatibility_support do 1039 | subject(:codec) { LogStash::Codecs::CEF.new } 1040 | 1041 | let(:results) { [] } 1042 | 1043 | ecs_compatibility_matrix(:disabled, :v1, :v8 => :v1) do |ecs_select| 1044 | before(:each) do 1045 | allow_any_instance_of(described_class).to receive(:ecs_compatibility).and_return(ecs_compatibility) 1046 | end 1047 | 1048 | let(:vendor_field) { ecs_select[disabled:'deviceVendor', v1:'[observer][vendor]'] } 1049 | let(:product_field) { ecs_select[disabled:'deviceProduct', v1:'[observer][product]']} 1050 | let(:version_field) { ecs_select[disabled:'deviceVersion', v1:'[observer][version]']} 1051 | let(:signature_field) { ecs_select[disabled:'deviceEventClassId', v1:'[event][code]']} 1052 | let(:name_field) { ecs_select[disabled:'name', v1:'[cef][name]']} 1053 | let(:severity_field) { ecs_select[disabled:'severity', v1:'[event][severity]']} 1054 | 1055 | let(:source_dns_domain_field) { ecs_select[disabled:'sourceDnsDomain',v1:'[source][registered_domain]'] } 1056 | 1057 | it "should return an equal event if encoded and decoded again" do 1058 | codec.on_event{|data, newdata| results << newdata} 1059 | codec.vendor = "%{" + vendor_field + "}" 1060 | codec.product = "%{" + product_field + "}" 1061 | codec.version = "%{" + version_field + "}" 1062 | codec.signature = "%{" + signature_field + "}" 1063 | codec.name = "%{" + name_field + "}" 1064 | codec.severity = "%{" + severity_field + "}" 1065 | codec.fields = [ "foo", source_dns_domain_field ] 1066 | event = LogStash::Event.new.tap do |e| 1067 | e.set(vendor_field, "vendor") 1068 | e.set(product_field, "product") 1069 | e.set(version_field, "2.0") 1070 | e.set(signature_field, "signature") 1071 | e.set(name_field, "name") 1072 | e.set(severity_field, "1") 1073 | e.set("foo", "bar") 1074 | e.set(source_dns_domain_field, "apple") 1075 | end 1076 | codec.encode(event) 1077 | codec.decode(results.first) do |e| 1078 | expect(e.get(vendor_field)).to be == event.get(vendor_field) 1079 | expect(e.get(product_field)).to be == event.get(product_field) 1080 | expect(e.get(version_field)).to be == event.get(version_field) 1081 | expect(e.get(signature_field)).to be == event.get(signature_field) 1082 | expect(e.get(name_field)).to be == event.get(name_field) 1083 | expect(e.get(severity_field)).to be == event.get(severity_field) 1084 | expect(e.get('foo')).to be == event.get('foo') 1085 | expect(e.get(source_dns_domain_field)).to be == event.get(source_dns_domain_field) 1086 | end 1087 | end 1088 | end 1089 | end 1090 | 1091 | end 1092 | --------------------------------------------------------------------------------