├── .github ├── CONTRIBUTING.md ├── ISSUE_TEMPLATE.md └── PULL_REQUEST_TEMPLATE.md ├── .gitignore ├── .travis.yml ├── CHANGELOG.md ├── CONTRIBUTORS ├── DEVELOPER.md ├── Gemfile ├── LICENSE ├── NOTICE.TXT ├── README.md ├── Rakefile ├── build.gradle ├── ci ├── build.sh ├── cleanup.sh ├── run.sh └── setup.sh ├── docs └── index.asciidoc ├── gradle.properties ├── gradle └── wrapper │ ├── gradle-wrapper.jar │ └── gradle-wrapper.properties ├── gradlew ├── gradlew.bat ├── kafka_test_setup.sh ├── kafka_test_teardown.sh ├── lib ├── logstash-input-kafka_jars.rb └── logstash │ └── inputs │ └── kafka.rb ├── logstash-input-kafka.gemspec └── spec ├── integration └── inputs │ └── kafka_spec.rb └── unit └── inputs └── kafka_spec.rb /.github/CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to Logstash 2 | 3 | All contributions are welcome: ideas, patches, documentation, bug reports, 4 | complaints, etc! 5 | 6 | Programming is not a required skill, and there are many ways to help out! 7 | It is more important to us that you are able to contribute. 8 | 9 | That said, some basic guidelines, which you are free to ignore :) 10 | 11 | ## Want to learn? 12 | 13 | Want to lurk about and see what others are doing with Logstash? 14 | 15 | * The irc channel (#logstash on irc.freenode.org) is a good place for this 16 | * The [forum](https://discuss.elastic.co/c/logstash) is also 17 | great for learning from others. 18 | 19 | ## Got Questions? 20 | 21 | Have a problem you want Logstash to solve for you? 22 | 23 | * You can ask a question in the [forum](https://discuss.elastic.co/c/logstash) 24 | * Alternately, you are welcome to join the IRC channel #logstash on 25 | irc.freenode.org and ask for help there! 26 | 27 | ## Have an Idea or Feature Request? 28 | 29 | * File a ticket on [GitHub](https://github.com/elastic/logstash/issues). Please remember that GitHub is used only for issues and feature requests. If you have a general question, the [forum](https://discuss.elastic.co/c/logstash) or IRC would be the best place to ask. 30 | 31 | ## Something Not Working? Found a Bug? 32 | 33 | If you think you found a bug, it probably is a bug. 34 | 35 | * If it is a general Logstash or a pipeline issue, file it in [Logstash GitHub](https://github.com/elasticsearch/logstash/issues) 36 | * If it is specific to a plugin, please file it in the respective repository under [logstash-plugins](https://github.com/logstash-plugins) 37 | * or ask the [forum](https://discuss.elastic.co/c/logstash). 38 | 39 | # Contributing Documentation and Code Changes 40 | 41 | If you have a bugfix or new feature that you would like to contribute to 42 | logstash, and you think it will take more than a few minutes to produce the fix 43 | (ie; write code), it is worth discussing the change with the Logstash users and developers first! You can reach us via [GitHub](https://github.com/elastic/logstash/issues), the [forum](https://discuss.elastic.co/c/logstash), or via IRC (#logstash on freenode irc) 44 | Please note that Pull Requests without tests will not be merged. If you would like to contribute but do not have experience with writing tests, please ping us on IRC/forum or create a PR and ask our help. 45 | 46 | ## Contributing to plugins 47 | 48 | Check our [documentation](https://www.elastic.co/guide/en/logstash/current/contributing-to-logstash.html) on how to contribute to plugins or write your own! It is super easy! 49 | 50 | ## Contribution Steps 51 | 52 | 1. Test your changes! 
[Run](https://github.com/elastic/logstash#testing) the test suite 53 | 2. Please make sure you have signed our [Contributor License 54 | Agreement](https://www.elastic.co/contributor-agreement/). We are not 55 | asking you to assign copyright to us, but to give us the right to distribute 56 | your code without restriction. We ask this of all contributors in order to 57 | assure our users of the origin and continuing existence of the code. You 58 | only need to sign the CLA once. 59 | 3. Send a pull request! Push your changes to your fork of the repository and 60 | [submit a pull 61 | request](https://help.github.com/articles/using-pull-requests). In the pull 62 | request, describe what your changes do and mention any bugs/issues related 63 | to the pull request. 64 | 65 | 66 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## Kafka Input Plugin's Issue Tracker Has Moved 2 | 3 | The Kafka Input Plugin is now a part of the [Kafka Integration Plugin][integration-source]. 4 | This project remains open for backports of fixes from that project to the 9.x series where possible, but issues should first be filed on the [integration plugin][integration-issues]. 5 | 6 | Please post all product and debugging questions on our [forum][logstash-forum]. 7 | Your questions will reach our wider community members there. If we confirm that there is a bug, then we can open a new issue on the appropriate project. 8 | 9 | [integration-source]: https://github.com/logstash-plugins/logstash-integration-kafka 10 | [integration-issues]: https://github.com/logstash-plugins/logstash-integration-kafka/issues/ 11 | [logstash-forum]: https://discuss.elastic.co/c/logstash 12 | 13 | 14 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## Kafka Input Plugin's Source Has Moved 2 | 3 | This Kafka Input Plugin is now a part of the [Kafka Integration Plugin][integration-source]. This project remains open for backports of fixes from that project to the 9.x series where possible, but pull-requests should first be made on the [integration plugin][integration-pull-requests]. 4 | 5 | If you have already made commits on a clone of this stand-alone repository, it's ok! Go ahead and open the Pull Request here, and open an Issue linking to it on the [integration plugin][integration-issues] -- we'll work with you to sort it all out and to get the backport applied. 6 | 7 | ## Contributor Agreement 8 | 9 | Thanks for contributing to Logstash! 
If you haven't already signed our CLA, here's a handy link: https://www.elastic.co/contributor-agreement/ 10 | 11 | [integration-source]: https://github.com/logstash-plugins/logstash-integration-kafka 12 | [integration-issues]: https://github.com/logstash-plugins/logstash-integration-kafka/issues/ 13 | [integration-pull-requests]: https://github.com/logstash-plugins/logstash-integration-kafka/pulls -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.gem 2 | Gemfile.lock 3 | .bundle 4 | .gradle 5 | .idea 6 | lib/log4j/ 7 | lib/net/ 8 | lib/org/ 9 | vendor/ 10 | build/ -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | --- 2 | sudo: false 3 | language: ruby 4 | cache: bundler 5 | matrix: 6 | include: 7 | - rvm: jruby-9.1.13.0 8 | env: LOGSTASH_BRANCH=master 9 | - rvm: jruby-9.1.13.0 10 | env: LOGSTASH_BRANCH=7.0 11 | - rvm: jruby-9.1.13.0 12 | env: LOGSTASH_BRANCH=6.7 13 | - rvm: jruby-9.1.13.0 14 | env: LOGSTASH_BRANCH=6.6 15 | - rvm: jruby-1.7.27 16 | env: LOGSTASH_BRANCH=5.6 17 | fast_finish: true 18 | install: true 19 | before_script: ci/build.sh 20 | script: ci/run.sh 21 | after_script: ci/cleanup.sh 22 | jdk: openjdk8 23 | before_install: gem install bundler -v '< 2' 24 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | ## 9.1.0 2 | - Updated Kafka client version to 2.3.0 3 | 4 | ## 9.0.1 5 | - Added support for `sasl_jaas_config` setting to allow JAAS config per plugin, rather than per JVM [#313](https://github.com/logstash-plugins/logstash-input-kafka/pull/313) 6 | 7 | ## 9.0.0 8 | - Removed obsolete `ssl` option 9 | 10 | ## 8.3.1 11 | - Added support for kafka property ssl.endpoint.identification.algorithm #302(https://github.com/logstash-plugins/logstash-input-kafka/pull/302) 12 | 13 | ## 8.3.0 14 | - Changed Kafka client version to 2.1.0 15 | 16 | ## 8.2.1 17 | - Changed Kafka client version to 2.0.1 [#295](https://github.com/logstash-plugins/logstash-input-kafka/pull/295) 18 | 19 | ## 8.2.0 20 | - Upgrade Kafka client to version 2.0.0 21 | 22 | ## 8.1.2 23 | - Docs: Correct list formatting for `decorate_events` 24 | - Docs: Add kafka default to `partition_assignment_strategy` 25 | 26 | ## 8.1.1 27 | - Fix race-condition where shutting down a Kafka Input before it has finished starting could cause Logstash to crash 28 | 29 | ## 8.1.0 30 | - Internal: Update build to gradle 31 | - Upgrade Kafka client to version 1.1.0 32 | 33 | ## 8.0.6 34 | - Fix broken 8.0.5 release 35 | 36 | ## 8.0.5 37 | - Docs: Set the default_codec doc attribute. 38 | 39 | ## 8.0.4 40 | - Upgrade Kafka client to version 1.0.0 41 | 42 | ## 8.0.3 43 | - Update gemspec summary 44 | 45 | ## 8.0.2 46 | - Fix some documentation issues 47 | 48 | ## 8.0.1 49 | - Fixed an issue that prevented setting a custom `metadata_max_age_ms` value 50 | 51 | ## 8.0.0 52 | - Breaking: mark deprecated `ssl` option as obsolete 53 | 54 | ## 7.0.0 55 | - Breaking: Nest the decorated fields under `@metadata` field to avoid mapping conflicts with beats. 
56 | Fixes #198, #180 57 | 58 | ## 6.3.4 59 | - Fix an issue that led to random failures in decoding messages when using more than one input thread 60 | 61 | ## 6.3.3 62 | - Upgrade Kafka client to version 0.11.0.0 63 | 64 | ## 6.3.1 65 | - fix: Added record timestamp in event decoration 66 | 67 | ## 6.3.0 68 | - Upgrade Kafka client to version 0.10.2.1 69 | 70 | ## 6.2.7 71 | - Fix NPE when SASL_SSL+PLAIN (no Kerberos) is specified. 72 | 73 | ## 6.2.6 74 | - fix: Client ID is no longer reused across multiple Kafka consumer instances 75 | 76 | ## 6.2.5 77 | - Fix a bug where consumer was not correctly setup when `SASL_SSL` option was specified. 78 | 79 | ## 6.2.4 80 | - Make error reporting more clear when connection fails 81 | 82 | ## 6.2.3 83 | - Docs: Update Kafka compatibility matrix 84 | 85 | ## 6.2.2 86 | - update kafka-clients dependency to 0.10.1.1 87 | 88 | ## 6.2.1 89 | - Docs: Clarify compatibility matrix and remove it from the changelog to avoid duplication. 90 | 91 | ## 6.2.0 92 | - Expose config `max_poll_interval_ms` to allow consumer to send heartbeats from a background thread 93 | - Expose config `fetch_max_bytes` to control client's fetch response size limit 94 | 95 | ## 6.1.0 96 | - Add Kerberos authentication support. 97 | 98 | ## 6.0.1 99 | - default `poll_timeout_ms` to 100ms 100 | 101 | ## 6.0.0 102 | - Breaking: Support for Kafka 0.10.1.0. Only supports brokers 0.10.1.x or later. 103 | 104 | ## 5.0.5 105 | - place setup_log4j for logging registration behind version check 106 | 107 | ## 5.0.4 108 | - Update to Kafka version 0.10.0.1 for bug fixes 109 | 110 | ## 5.0.3 111 | - Internal: gem cleanup 112 | 113 | ## 5.0.2 114 | - Release a new version of the gem that includes jars 115 | 116 | ## 5.0.1 117 | - Relax constraint on logstash-core-plugin-api to >= 1.60 <= 2.99 118 | 119 | ## 5.0.0 120 | - Support for Kafka 0.10 which is not backward compatible with 0.9 broker. 121 | 122 | ## 4.0.0 123 | - Republish all the gems under jruby. 124 | - Update the plugin to the version 2.0 of the plugin api, this change is required for Logstash 5.0 compatibility. See https://github.com/elastic/logstash/issues/5141 125 | - Support for Kafka 0.9 for LS 5.x 126 | 127 | ## 3.0.0.beta7 128 | - Fix Log4j warnings by setting up the logger 129 | 130 | ## 3.0.0.beta5 and 3.0.0.beta6 131 | - Internal: Use jar dependency 132 | - Fixed issue with snappy compression 133 | 134 | ## 3.0.0.beta3 and 3.0.0.beta4 135 | - Internal: Update gemspec dependency 136 | 137 | ## 3.0.0.beta2 138 | - internal: Use jar dependencies library instead of manually downloading jars 139 | - Fixes "java.lang.ClassNotFoundException: org.xerial.snappy.SnappyOutputStream" issue (#50) 140 | 141 | ## 3.0.0.beta2 142 | - Added SSL/TLS connection support to Kafka 143 | - Breaking: Changed default codec to plain instead of SSL. Json codec is really slow when used 144 | with inputs because inputs by default are single threaded. This makes it a bad 145 | first user experience. Plain codec is a much better default. 146 | 147 | ## 3.0.0.beta1 148 | - Refactor to use new Java based consumer, bypassing jruby-kafka 149 | - Breaking: Change configuration to match Kafka's configuration. This version is not backward compatible 150 | 151 | ## 2.0.7 152 | - Update to jruby-kafka 1.6 which includes Kafka 0.8.2.2 enabling LZ4 decompression. 
153 | 154 | ## 2.0.6 155 | - Depend on logstash-core-plugin-api instead of logstash-core, removing the need to mass update plugins on major releases of logstash 156 | 157 | ## 2.0.5 158 | - New dependency requirements for logstash-core for the 5.0 release 159 | 160 | ## 2.0.4 161 | - Fix safe shutdown while plugin waits on Kafka for new events 162 | - Expose auto_commit_interval_ms to control offset commit frequency 163 | 164 | ## 2.0.3 165 | - Fix infinite loop when no new messages are found in Kafka 166 | 167 | ## 2.0.0 168 | - Plugins were updated to follow the new shutdown semantic, this mainly allows Logstash to instruct input plugins to terminate gracefully, 169 | instead of using Thread.raise on the plugins' threads. Ref: https://github.com/elastic/logstash/pull/3895 170 | - Dependency on logstash-core update to 2.0 171 | -------------------------------------------------------------------------------- /CONTRIBUTORS: -------------------------------------------------------------------------------- 1 | The following is a list of people who have contributed ideas, code, bug 2 | reports, or in general have helped logstash along its way. 3 | 4 | Contributors: 5 | * Joseph Lawson (joekiller) 6 | * Pere Urbón (purbon) 7 | * Pier-Hugues Pellerin (ph) 8 | * Richard Pijnenburg (electrical) 9 | * Suyog Rao (suyograo) 10 | * Tal Levy (talevy) 11 | 12 | Note: If you've sent us patches, bug reports, or otherwise contributed to 13 | Logstash, and you aren't on the list above and want to be, please let us know 14 | and we'll make sure you're here. Contributions from folks like you are what make 15 | open source awesome. 16 | -------------------------------------------------------------------------------- /DEVELOPER.md: -------------------------------------------------------------------------------- 1 | logstash-input-kafka 2 | ==================== 3 | 4 | Apache Kafka input for Logstash. This input will consume messages from a Kafka topic using the high level consumer API exposed by Kafka. 5 | 6 | For more information about Kafka, refer to this [documentation](http://kafka.apache.org/documentation.html) 7 | 8 | Information about high level consumer API can be found [here](http://kafka.apache.org/documentation.html#highlevelconsumerapi) 9 | 10 | Logstash Configuration 11 | ==================== 12 | 13 | See http://kafka.apache.org/documentation.html#consumerconfigs for details about the Kafka consumer options. 14 | 15 | input { 16 | kafka { 17 | topic_id => ... # string (optional), default: nil, The topic to consume messages from. Can be a java regular expression for whitelist of topics. 18 | white_list => ... # string (optional), default: nil, Blacklist of topics to exclude from consumption. 19 | black_list => ... # string (optional), default: nil, Whitelist of topics to include for consumption. 20 | zk_connect => ... # string (optional), default: "localhost:2181", Specifies the ZooKeeper connection string in the form hostname:port 21 | group_id => ... # string (optional), default: "logstash", A string that uniquely identifies the group of consumer processes 22 | reset_beginning => ... # boolean (optional), default: false, Specify whether to jump to beginning of the queue when there is no initial offset in ZK 23 | auto_offset_reset => ... # string (optional), one of [ "largest", "smallest"] default => 'largest', Where consumer should start if group does not already have an established offset or offset is invalid 24 | consumer_threads => ... 
# number (optional), default: 1, Number of threads to read from the partitions 25 | queue_size => ... # number (optional), default: 20, Internal Logstash queue size used to hold events in memory 26 | rebalance_max_retries => ... # number (optional), default: 4 27 | rebalance_backoff_ms => ... # number (optional), default: 2000 28 | consumer_timeout_ms => ... # number (optional), default: -1 29 | consumer_restart_on_error => ... # boolean (optional), default: true 30 | consumer_restart_sleep_ms => ... # number (optional), default: 0 31 | decorate_events => ... # boolean (optional), default: false, Option to add Kafka metadata like topic, message size to the event 32 | consumer_id => ... # string (optional), default: nil 33 | fetch_message_max_bytes => ... # number (optional), default: 1048576 34 | } 35 | } 36 | 37 | The default codec is json 38 | 39 | Dependencies 40 | ==================== 41 | 42 | * Apache Kafka version 0.8.1.1 43 | * jruby-kafka library 44 | -------------------------------------------------------------------------------- /Gemfile: -------------------------------------------------------------------------------- 1 | source 'https://rubygems.org' 2 | 3 | gemspec 4 | 5 | logstash_path = ENV["LOGSTASH_PATH"] || "../../logstash" 6 | use_logstash_source = ENV["LOGSTASH_SOURCE"] && ENV["LOGSTASH_SOURCE"].to_s == "1" 7 | 8 | if Dir.exist?(logstash_path) && use_logstash_source 9 | gem 'logstash-core', :path => "#{logstash_path}/logstash-core" 10 | gem 'logstash-core-plugin-api', :path => "#{logstash_path}/logstash-core-plugin-api" 11 | end 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 
40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright 2020 Elastic and contributors 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /NOTICE.TXT: -------------------------------------------------------------------------------- 1 | Elasticsearch 2 | Copyright 2012-2015 Elasticsearch 3 | 4 | This product includes software developed by The Apache Software 5 | Foundation (http://www.apache.org/). 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Logstash Plugin 2 | 3 | [![Travis Build Status](https://travis-ci.com/logstash-plugins/logstash-input-kafka.svg)](https://travis-ci.com/logstash-plugins/logstash-input-kafka) 4 | 5 | This is a plugin for [Logstash](https://github.com/elastic/logstash). 6 | 7 | It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way. 8 | 9 | ## Kafka Input Plugin Has Moved 10 | 11 | This Kafka Input Plugin is now a part of the [Kafka Integration Plugin][integration-source]. This project remains open for backports of fixes from that project to the 9.x series where possible, but issues should first be filed on the [integration plugin][integration-issues]. 12 | 13 | [integration-source]: https://github.com/logstash-plugins/logstash-integration-kafka 14 | [integration-issues]: https://github.com/logstash-plugins/logstash-integration-kafka/issues/ 15 | 16 | 17 | ## Logging 18 | 19 | Kafka logs do not respect the Log4J2 root logger level and defaults to INFO, for other levels, you must explicitly set the log level in your Logstash deployment's `log4j2.properties` file, e.g.: 20 | ``` 21 | logger.kafka.name=org.apache.kafka 22 | logger.kafka.appenderRef.console.ref=console 23 | logger.kafka.level=debug 24 | ``` 25 | 26 | ## Documentation 27 | 28 | https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html 29 | 30 | Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/). 31 | 32 | - For formatting code or config example, you can use the asciidoc `[source,ruby]` directive 33 | - For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide 34 | 35 | ## Need Help? 36 | 37 | Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum. 38 | 39 | ## Developing 40 | 41 | ### 1. Plugin Developement and Testing 42 | 43 | #### Code 44 | - To get started, you'll need JRuby with the Bundler gem installed. 45 | 46 | - Create a new plugin or clone and existing from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example). 47 | 48 | - Install dependencies 49 | ```sh 50 | bundle install 51 | rake install_jars 52 | ``` 53 | 54 | #### Test 55 | 56 | - Update your dependencies 57 | 58 | ```sh 59 | bundle install 60 | rake install_jars 61 | ``` 62 | 63 | - Run tests 64 | 65 | ```sh 66 | bundle exec rspec 67 | ``` 68 | 69 | ### 2. 
Running your unpublished Plugin in Logstash 70 | 71 | #### 2.1 Run in a local Logstash clone 72 | 73 | - Edit Logstash `Gemfile` and add the local plugin path, for example: 74 | ```ruby 75 | gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome" 76 | ``` 77 | - Install plugin 78 | ```sh 79 | # Logstash 2.3 and higher 80 | bin/logstash-plugin install --no-verify 81 | 82 | # Prior to Logstash 2.3 83 | bin/plugin install --no-verify 84 | 85 | ``` 86 | - Run Logstash with your plugin 87 | ```sh 88 | bin/logstash -e 'filter {awesome {}}' 89 | ``` 90 | At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash. 91 | 92 | #### 2.2 Run in an installed Logstash 93 | 94 | You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory or you can build the gem and install it using: 95 | 96 | - Build your plugin gem 97 | ```sh 98 | gem build logstash-filter-awesome.gemspec 99 | ``` 100 | - Install the plugin from the Logstash home 101 | ```sh 102 | # Logstash 2.3 and higher 103 | bin/logstash-plugin install --no-verify 104 | 105 | # Prior to Logstash 2.3 106 | bin/plugin install --no-verify 107 | 108 | ``` 109 | - Start Logstash and proceed to test the plugin 110 | 111 | ## Contributing 112 | 113 | All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin. 114 | 115 | Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here. 116 | 117 | It is more important to the community that you are able to contribute. 118 | 119 | For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file. 120 | -------------------------------------------------------------------------------- /Rakefile: -------------------------------------------------------------------------------- 1 | 2 | # encoding: utf-8 3 | require "logstash/devutils/rake" 4 | require "jars/installer" 5 | require "fileutils" 6 | 7 | task :default do 8 | system('rake -vT') 9 | end 10 | 11 | task :vendor do 12 | exit(1) unless system './gradlew vendor' 13 | end 14 | 15 | task :clean do 16 | ["vendor/jar-dependencies", "Gemfile.lock"].each do |p| 17 | FileUtils.rm_rf(p) 18 | end 19 | end 20 | 21 | -------------------------------------------------------------------------------- /build.gradle: -------------------------------------------------------------------------------- 1 | import java.nio.file.Files 2 | import static java.nio.file.StandardCopyOption.REPLACE_EXISTING 3 | /* 4 | * Licensed to Elasticsearch under one or more contributor 5 | * license agreements. See the NOTICE file distributed with 6 | * this work for additional information regarding copyright 7 | * ownership. Elasticsearch licenses this file to you under 8 | * the Apache License, Version 2.0 (the "License"); you may 9 | * not use this file except in compliance with the License. 10 | * You may obtain a copy of the License at 11 | * 12 | * http://www.apache.org/licenses/LICENSE-2.0 13 | * 14 | * Unless required by applicable law or agreed to in writing, 15 | * software distributed under the License is distributed on an 16 | * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 17 | * KIND, either express or implied. 
See the License for the 18 | * specific language governing permissions and limitations 19 | * under the License. 20 | */ 21 | apply plugin: "java" 22 | apply plugin: 'maven' 23 | apply plugin: "distribution" 24 | apply plugin: "idea" 25 | 26 | group "org.logstash.inputs" 27 | 28 | sourceCompatibility = JavaVersion.VERSION_1_8 29 | 30 | buildscript { 31 | repositories { 32 | mavenCentral() 33 | jcenter() 34 | } 35 | 36 | } 37 | 38 | repositories { 39 | mavenCentral() 40 | } 41 | 42 | task wrapper(type: Wrapper) { 43 | gradleVersion = '4.0' 44 | } 45 | 46 | dependencies { 47 | compile 'org.apache.kafka:kafka-clients:2.3.0' 48 | compile 'com.github.luben:zstd-jni:1.4.2-1' 49 | compile 'org.slf4j:slf4j-api:1.7.26' 50 | compile 'org.lz4:lz4-java:1.6.0' 51 | compile 'org.xerial.snappy:snappy-java:1.1.7.3' 52 | 53 | } 54 | task generateGemJarRequiresFile { 55 | doLast { 56 | File jars_file = file('lib/logstash-input-kafka_jars.rb') 57 | jars_file.newWriter().withWriter { w -> 58 | w << "# AUTOGENERATED BY THE GRADLE SCRIPT. DO NOT EDIT.\n\n" 59 | w << "require \'jar_dependencies\'\n" 60 | configurations.runtime.allDependencies.each { 61 | w << "require_jar(\'${it.group}\', \'${it.name}\', \'${it.version}\')\n" 62 | } 63 | } 64 | } 65 | } 66 | 67 | task vendor { 68 | doLast { 69 | String vendorPathPrefix = "vendor/jar-dependencies" 70 | configurations.runtime.allDependencies.each { dep -> 71 | File f = configurations.runtime.filter { it.absolutePath.contains("${dep.group}/${dep.name}/${dep.version}") }.singleFile 72 | String groupPath = dep.group.replaceAll('\\.', '/') 73 | File newJarFile = file("${vendorPathPrefix}/${groupPath}/${dep.name}/${dep.version}/${dep.name}-${dep.version}.jar") 74 | newJarFile.mkdirs() 75 | Files.copy(f.toPath(), newJarFile.toPath(), REPLACE_EXISTING) 76 | } 77 | } 78 | } 79 | 80 | vendor.dependsOn(generateGemJarRequiresFile) 81 | -------------------------------------------------------------------------------- /ci/build.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # version: 1 3 | ######################################################## 4 | # 5 | # AUTOMATICALLY GENERATED! DO NOT EDIT 6 | # 7 | ######################################################## 8 | set -e 9 | 10 | ./ci/setup.sh 11 | 12 | export KAFKA_VERSION=2.1.1 13 | ./kafka_test_setup.sh 14 | bundle install 15 | bundle exec rake vendor 16 | -------------------------------------------------------------------------------- /ci/cleanup.sh: -------------------------------------------------------------------------------- 1 | ./kafka_test_teardown.sh 2 | -------------------------------------------------------------------------------- /ci/run.sh: -------------------------------------------------------------------------------- 1 | bundle exec rspec && bundle exec rspec --tag integration 2 | -------------------------------------------------------------------------------- /ci/setup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # version: 1 3 | ######################################################## 4 | # 5 | # AUTOMATICALLY GENERATED! 
DO NOT EDIT 6 | # 7 | ######################################################## 8 | set -e 9 | if [ "$LOGSTASH_BRANCH" ]; then 10 | echo "Building plugin using Logstash source" 11 | BASE_DIR=`pwd` 12 | echo "Checking out branch: $LOGSTASH_BRANCH" 13 | git clone -b $LOGSTASH_BRANCH https://github.com/elastic/logstash.git ../../logstash --depth 1 14 | printf "Checked out Logstash revision: %s\n" "$(git -C ../../logstash rev-parse HEAD)" 15 | cd ../../logstash 16 | echo "Building plugins with Logstash version:" 17 | cat versions.yml 18 | echo "---" 19 | # We need to build the jars for that specific version 20 | echo "Running gradle assemble in: `pwd`" 21 | ./gradlew assemble 22 | cd $BASE_DIR 23 | export LOGSTASH_SOURCE=1 24 | else 25 | echo "Building plugin using released gems on rubygems" 26 | fi 27 | -------------------------------------------------------------------------------- /docs/index.asciidoc: -------------------------------------------------------------------------------- 1 | :plugin: kafka 2 | :type: input 3 | :default_codec: plain 4 | 5 | 6 | ///////////////////////////////////////////// 7 | // Kafka Input Plugin Source Has Moved // 8 | // --------------------------------------- // 9 | // The Kafka Input Plugin is now a part // 10 | // of the Kafka Integration. // 11 | // // 12 | // This stand-alone plugin project remains // 13 | // open for backports to the 9.x series. // 14 | ///////////////////////////////////////////// 15 | 16 | 17 | /////////////////////////////////////////// 18 | START - GENERATED VARIABLES, DO NOT EDIT! 19 | /////////////////////////////////////////// 20 | :version: %VERSION% 21 | :release_date: %RELEASE_DATE% 22 | :changelog_url: %CHANGELOG_URL% 23 | :include_path: ../../../../logstash/docs/include 24 | /////////////////////////////////////////// 25 | END - GENERATED VARIABLES, DO NOT EDIT! 26 | /////////////////////////////////////////// 27 | 28 | 29 | [id="plugins-{type}s-{plugin}"] 30 | 31 | === Kafka input plugin 32 | 33 | include::{include_path}/plugin_header.asciidoc[] 34 | 35 | ==== Description 36 | 37 | This input will read events from a Kafka topic. 38 | 39 | This plugin uses Kafka Client 2.1.0. For broker compatibility, see the official https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix[Kafka compatibility reference]. If the linked compatibility wiki is not up-to-date, please contact Kafka support/community to confirm compatibility. 40 | 41 | If you require features not yet available in this plugin (including client version upgrades), please file an issue with details about what you need. 42 | 43 | This input supports connecting to Kafka over: 44 | 45 | * SSL (requires plugin version 3.0.0 or later) 46 | * Kerberos SASL (requires plugin version 5.1.0 or later) 47 | 48 | By default security is disabled but can be turned on as needed. 49 | 50 | The Logstash Kafka consumer handles group management and uses the default offset management 51 | strategy using Kafka topics. 52 | 53 | Logstash instances by default form a single logical group to subscribe to Kafka topics 54 | Each Logstash Kafka consumer can run multiple threads to increase read throughput. Alternatively, 55 | you could run multiple Logstash instances with the same `group_id` to spread the load across 56 | physical machines. Messages in a topic will be distributed to all Logstash instances with 57 | the same `group_id`. 
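For example, a minimal configuration sketch along these lines (the topic, group id, and thread count shown are placeholders, not recommendations) runs several consumer threads inside one pipeline while sharing a `group_id` across Logstash instances:
[source,ruby]
    input {
      kafka {
        bootstrap_servers => "localhost:9092"  # initial broker list used to discover the cluster
        topics            => ["logstash"]      # topic(s) to subscribe to
        group_id          => "logstash"        # same value on every instance that should share the load
        consumer_threads  => 4                 # ideally one thread per partition
      }
    }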
58 | 59 | Ideally you should have as many threads as the number of partitions for a perfect balance -- 60 | more threads than partitions means that some threads will be idle 61 | 62 | For more information see http://kafka.apache.org/documentation.html#theconsumer 63 | 64 | Kafka consumer configuration: http://kafka.apache.org/documentation.html#consumerconfigs 65 | 66 | ==== Metadata fields 67 | 68 | The following metadata from Kafka broker are added under the `[@metadata]` field: 69 | 70 | * `[@metadata][kafka][topic]`: Original Kafka topic from where the message was consumed. 71 | * `[@metadata][kafka][consumer_group]`: Consumer group 72 | * `[@metadata][kafka][partition]`: Partition info for this message. 73 | * `[@metadata][kafka][offset]`: Original record offset for this message. 74 | * `[@metadata][kafka][key]`: Record key, if any. 75 | * `[@metadata][kafka][timestamp]`: Timestamp in the Record. Depending on your broker configuration, this can be either when the record was created (default) or when it was received by the broker. See more about property log.message.timestamp.type at https://kafka.apache.org/10/documentation.html#brokerconfigs 76 | 77 | Metadata is only added to the event if the `decorate_events` option is set to true (it defaults to false). 78 | 79 | Please note that `@metadata` fields are not part of any of your events at output time. If you need these information to be 80 | inserted into your original event, you'll have to use the `mutate` filter to manually copy the required fields into your `event`. 81 | 82 | [id="plugins-{type}s-{plugin}-options"] 83 | ==== Kafka Input Configuration Options 84 | 85 | This plugin supports these configuration options plus the <> described later. 86 | 87 | NOTE: Some of these options map to a Kafka option. See the 88 | https://kafka.apache.org/documentation for more details. 89 | 90 | [cols="<,<,<",options="header",] 91 | |======================================================================= 92 | |Setting |Input type|Required 93 | | <> |<>|No 94 | | <> |<>|No 95 | | <> |<>|No 96 | | <> |<>|No 97 | | <> |<>|No 98 | | <> |<>|No 99 | | <> |<>|No 100 | | <> |<>|No 101 | | <> |<>|No 102 | | <> |<>|No 103 | | <> |<>|No 104 | | <> |<>|No 105 | | <> |<>|No 106 | | <> |<>|No 107 | | <> |<>|No 108 | | <> |a valid filesystem path|No 109 | | <> |a valid filesystem path|No 110 | | <> |<>|No 111 | | <> |<>|No 112 | | <> |<>|No 113 | | <> |<>|No 114 | | <> |<>|No 115 | | <> |<>|No 116 | | <> |<>|No 117 | | <> |<>|No 118 | | <> |<>|No 119 | | <> |<>|No 120 | | <> |<>|No 121 | | <> |<>|No 122 | | <> |<>|No 123 | | <> |<>|No 124 | | <> |<>, one of `["PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"]`|No 125 | | <> |<>|No 126 | | <> |<>|No 127 | | <> |<>|No 128 | | <> |<>|No 129 | | <> |a valid filesystem path|No 130 | | <> |<>|No 131 | | <> |<>|No 132 | | <> |a valid filesystem path|No 133 | | <> |<>|No 134 | | <> |<>|No 135 | | <> |<>|No 136 | | <> |<>|No 137 | | <> |<>|No 138 | |======================================================================= 139 | 140 | Also see <> for a list of options supported by all 141 | input plugins. 142 | 143 |   144 | 145 | [id="plugins-{type}s-{plugin}-auto_commit_interval_ms"] 146 | ===== `auto_commit_interval_ms` 147 | 148 | * Value type is <> 149 | * Default value is `"5000"` 150 | 151 | The frequency in milliseconds that the consumer offsets are committed to Kafka. 
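As noted in the Metadata fields section above, the decorated fields live under `@metadata` and are not emitted with the event unless copied explicitly; a sketch of that copy-via-`mutate` pattern (the `log_topic` field name is purely illustrative):
[source,ruby]
    input {
      kafka {
        topics          => ["logstash"]
        decorate_events => true
      }
    }
    filter {
      mutate {
        # copy the Kafka topic name from @metadata into a regular event field
        add_field => { "log_topic" => "%{[@metadata][kafka][topic]}" }
      }
    }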
152 | 153 | [id="plugins-{type}s-{plugin}-auto_offset_reset"] 154 | ===== `auto_offset_reset` 155 | 156 | * Value type is <> 157 | * There is no default value for this setting. 158 | 159 | What to do when there is no initial offset in Kafka or if an offset is out of range: 160 | 161 | * earliest: automatically reset the offset to the earliest offset 162 | * latest: automatically reset the offset to the latest offset 163 | * none: throw exception to the consumer if no previous offset is found for the consumer's group 164 | * anything else: throw exception to the consumer. 165 | 166 | [id="plugins-{type}s-{plugin}-bootstrap_servers"] 167 | ===== `bootstrap_servers` 168 | 169 | * Value type is <> 170 | * Default value is `"localhost:9092"` 171 | 172 | A list of URLs of Kafka instances to use for establishing the initial connection to the cluster. 173 | This list should be in the form of `host1:port1,host2:port2` These urls are just used 174 | for the initial connection to discover the full cluster membership (which may change dynamically) 175 | so this list need not contain the full set of servers (you may want more than one, though, in 176 | case a server is down). 177 | 178 | [id="plugins-{type}s-{plugin}-check_crcs"] 179 | ===== `check_crcs` 180 | 181 | * Value type is <> 182 | * There is no default value for this setting. 183 | 184 | Automatically check the CRC32 of the records consumed. This ensures no on-the-wire or on-disk 185 | corruption to the messages occurred. This check adds some overhead, so it may be 186 | disabled in cases seeking extreme performance. 187 | 188 | [id="plugins-{type}s-{plugin}-client_id"] 189 | ===== `client_id` 190 | 191 | * Value type is <> 192 | * Default value is `"logstash"` 193 | 194 | The id string to pass to the server when making requests. The purpose of this 195 | is to be able to track the source of requests beyond just ip/port by allowing 196 | a logical application name to be included. 197 | 198 | [id="plugins-{type}s-{plugin}-connections_max_idle_ms"] 199 | ===== `connections_max_idle_ms` 200 | 201 | * Value type is <> 202 | * There is no default value for this setting. 203 | 204 | Close idle connections after the number of milliseconds specified by this config. 205 | 206 | [id="plugins-{type}s-{plugin}-consumer_threads"] 207 | ===== `consumer_threads` 208 | 209 | * Value type is <> 210 | * Default value is `1` 211 | 212 | Ideally you should have as many threads as the number of partitions for a perfect 213 | balance — more threads than partitions means that some threads will be idle 214 | 215 | [id="plugins-{type}s-{plugin}-decorate_events"] 216 | ===== `decorate_events` 217 | 218 | * Value type is <> 219 | * Default value is `false` 220 | 221 | Option to add Kafka metadata like topic, message size to the event. 222 | This will add a field named `kafka` to the logstash event containing the following attributes: 223 | 224 | * `topic`: The topic this message is associated with 225 | * `consumer_group`: The consumer group used to read in this event 226 | * `partition`: The partition this message is associated with 227 | * `offset`: The offset from the partition this message is associated with 228 | * `key`: A ByteBuffer containing the message key 229 | 230 | [id="plugins-{type}s-{plugin}-enable_auto_commit"] 231 | ===== `enable_auto_commit` 232 | 233 | * Value type is <> 234 | * Default value is `"true"` 235 | 236 | This committed offset will be used when the process fails as the position from 237 | which the consumption will begin. 
238 | If true, periodically commit to Kafka the offsets of messages already returned by 239 | the consumer. If value is `false` however, the offset is committed every time the 240 | consumer fetches the data from the topic. 241 | 242 | [id="plugins-{type}s-{plugin}-exclude_internal_topics"] 243 | ===== `exclude_internal_topics` 244 | 245 | * Value type is <> 246 | * There is no default value for this setting. 247 | 248 | Whether records from internal topics (such as offsets) should be exposed to the consumer. 249 | If set to true the only way to receive records from an internal topic is subscribing to it. 250 | 251 | [id="plugins-{type}s-{plugin}-fetch_max_bytes"] 252 | ===== `fetch_max_bytes` 253 | 254 | * Value type is <> 255 | * There is no default value for this setting. 256 | 257 | The maximum amount of data the server should return for a fetch request. This is not an 258 | absolute maximum, if the first message in the first non-empty partition of the fetch is larger 259 | than this value, the message will still be returned to ensure that the consumer can make progress. 260 | 261 | [id="plugins-{type}s-{plugin}-fetch_max_wait_ms"] 262 | ===== `fetch_max_wait_ms` 263 | 264 | * Value type is <> 265 | * There is no default value for this setting. 266 | 267 | The maximum amount of time the server will block before answering the fetch request if 268 | there isn't sufficient data to immediately satisfy `fetch_min_bytes`. This 269 | should be less than or equal to the timeout used in `poll_timeout_ms` 270 | 271 | [id="plugins-{type}s-{plugin}-fetch_min_bytes"] 272 | ===== `fetch_min_bytes` 273 | 274 | * Value type is <> 275 | * There is no default value for this setting. 276 | 277 | The minimum amount of data the server should return for a fetch request. If insufficient 278 | data is available the request will wait for that much data to accumulate 279 | before answering the request. 280 | 281 | [id="plugins-{type}s-{plugin}-group_id"] 282 | ===== `group_id` 283 | 284 | * Value type is <> 285 | * Default value is `"logstash"` 286 | 287 | The identifier of the group this consumer belongs to. Consumer group is a single logical subscriber 288 | that happens to be made up of multiple processors. Messages in a topic will be distributed to all 289 | Logstash instances with the same `group_id` 290 | 291 | [id="plugins-{type}s-{plugin}-heartbeat_interval_ms"] 292 | ===== `heartbeat_interval_ms` 293 | 294 | * Value type is <> 295 | * There is no default value for this setting. 296 | 297 | The expected time between heartbeats to the consumer coordinator. Heartbeats are used to ensure 298 | that the consumer's session stays active and to facilitate rebalancing when new 299 | consumers join or leave the group. The value must be set lower than 300 | `session.timeout.ms`, but typically should be set no higher than 1/3 of that value. 301 | It can be adjusted even lower to control the expected time for normal rebalances. 302 | 303 | [id="plugins-{type}s-{plugin}-jaas_path"] 304 | ===== `jaas_path` 305 | 306 | * Value type is <> 307 | * There is no default value for this setting. 308 | 309 | The Java Authentication and Authorization Service (JAAS) API supplies user authentication and authorization 310 | services for Kafka. This setting provides the path to the JAAS file. 
Sample JAAS file for Kafka client: 311 | [source,java] 312 | ---------------------------------- 313 | KafkaClient { 314 | com.sun.security.auth.module.Krb5LoginModule required 315 | useTicketCache=true 316 | renewTicket=true 317 | serviceName="kafka"; 318 | }; 319 | ---------------------------------- 320 | 321 | Please note that specifying `jaas_path` and `kerberos_config` in the config file will add these 322 | to the global JVM system properties. This means if you have multiple Kafka inputs, all of them would be sharing the same 323 | `jaas_path` and `kerberos_config`. If this is not desirable, you would have to run separate instances of Logstash on 324 | different JVM instances. 325 | 326 | [id="plugins-{type}s-{plugin}-kerberos_config"] 327 | ===== `kerberos_config` 328 | 329 | * Value type is <> 330 | * There is no default value for this setting. 331 | 332 | Optional path to kerberos config file. This is krb5.conf style as detailed in https://web.mit.edu/kerberos/krb5-1.12/doc/admin/conf_files/krb5_conf.html 333 | 334 | [id="plugins-{type}s-{plugin}-key_deserializer_class"] 335 | ===== `key_deserializer_class` 336 | 337 | * Value type is <> 338 | * Default value is `"org.apache.kafka.common.serialization.StringDeserializer"` 339 | 340 | Java Class used to deserialize the record's key 341 | 342 | [id="plugins-{type}s-{plugin}-max_partition_fetch_bytes"] 343 | ===== `max_partition_fetch_bytes` 344 | 345 | * Value type is <> 346 | * There is no default value for this setting. 347 | 348 | The maximum amount of data per-partition the server will return. The maximum total memory used for a 349 | request will be `#partitions * max.partition.fetch.bytes`. This size must be at least 350 | as large as the maximum message size the server allows or else it is possible for the producer to 351 | send messages larger than the consumer can fetch. If that happens, the consumer can get stuck trying 352 | to fetch a large message on a certain partition. 353 | 354 | [id="plugins-{type}s-{plugin}-max_poll_interval_ms"] 355 | ===== `max_poll_interval_ms` 356 | 357 | * Value type is <> 358 | * There is no default value for this setting. 359 | 360 | The maximum delay between invocations of poll() when using consumer group management. This places 361 | an upper bound on the amount of time that the consumer can be idle before fetching more records. 362 | If poll() is not called before expiration of this timeout, then the consumer is considered failed and 363 | the group will rebalance in order to reassign the partitions to another member. 364 | The value of the configuration `request_timeout_ms` must always be larger than max_poll_interval_ms 365 | 366 | [id="plugins-{type}s-{plugin}-max_poll_records"] 367 | ===== `max_poll_records` 368 | 369 | * Value type is <> 370 | * There is no default value for this setting. 371 | 372 | The maximum number of records returned in a single call to poll(). 373 | 374 | [id="plugins-{type}s-{plugin}-metadata_max_age_ms"] 375 | ===== `metadata_max_age_ms` 376 | 377 | * Value type is <> 378 | * There is no default value for this setting. 379 | 380 | The period of time in milliseconds after which we force a refresh of metadata even if 381 | we haven't seen any partition leadership changes to proactively discover any new brokers or partitions 382 | 383 | [id="plugins-{type}s-{plugin}-partition_assignment_strategy"] 384 | ===== `partition_assignment_strategy` 385 | 386 | * Value type is <> 387 | * There is no default value for this setting. 
388 | 389 | The class name of the partition assignment strategy that the client uses to 390 | distribute partition ownership amongst consumer instances. Maps to 391 | the Kafka `partition.assignment.strategy` setting, which defaults to 392 | `org.apache.kafka.clients.consumer.RangeAssignor`. 393 | 394 | [id="plugins-{type}s-{plugin}-poll_timeout_ms"] 395 | ===== `poll_timeout_ms` 396 | 397 | * Value type is <> 398 | * Default value is `100` 399 | 400 | Time kafka consumer will wait to receive new messages from topics 401 | 402 | [id="plugins-{type}s-{plugin}-receive_buffer_bytes"] 403 | ===== `receive_buffer_bytes` 404 | 405 | * Value type is <> 406 | * There is no default value for this setting. 407 | 408 | The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. 409 | 410 | [id="plugins-{type}s-{plugin}-reconnect_backoff_ms"] 411 | ===== `reconnect_backoff_ms` 412 | 413 | * Value type is <> 414 | * There is no default value for this setting. 415 | 416 | The amount of time to wait before attempting to reconnect to a given host. 417 | This avoids repeatedly connecting to a host in a tight loop. 418 | This backoff applies to all requests sent by the consumer to the broker. 419 | 420 | [id="plugins-{type}s-{plugin}-request_timeout_ms"] 421 | ===== `request_timeout_ms` 422 | 423 | * Value type is <> 424 | * There is no default value for this setting. 425 | 426 | The configuration controls the maximum amount of time the client will wait 427 | for the response of a request. If the response is not received before the timeout 428 | elapses the client will resend the request if necessary or fail the request if 429 | retries are exhausted. 430 | 431 | [id="plugins-{type}s-{plugin}-retry_backoff_ms"] 432 | ===== `retry_backoff_ms` 433 | 434 | * Value type is <> 435 | * There is no default value for this setting. 436 | 437 | The amount of time to wait before attempting to retry a failed fetch request 438 | to a given topic partition. This avoids repeated fetching-and-failing in a tight loop. 439 | 440 | [id="plugins-{type}s-{plugin}-sasl_jaas_config"] 441 | ===== `sasl_jaas_config` 442 | 443 | * Value type is <> 444 | * There is no default value for this setting. 445 | 446 | JAAS configuration setting local to this plugin instance, as opposed to settings using config file configured using `jaas_path`, which are shared across the JVM. This allows each plugin instance to have its own configuration. 447 | 448 | If both `sasl_jaas_config` and `jaas_path` configurations are set, the setting here takes precedence. 449 | 450 | Example (setting for Azure Event Hub): 451 | [source,ruby] 452 | input { 453 | kafka { 454 | sasl_jaas_config => "org.apache.kafka.common.security.plain.PlainLoginModule required username='auser' password='apassword';" 455 | } 456 | } 457 | 458 | [id="plugins-{type}s-{plugin}-sasl_kerberos_service_name"] 459 | ===== `sasl_kerberos_service_name` 460 | 461 | * Value type is <> 462 | * There is no default value for this setting. 463 | 464 | The Kerberos principal name that Kafka broker runs as. 465 | This can be defined either in Kafka's JAAS config or in Kafka's config. 466 | 467 | [id="plugins-{type}s-{plugin}-sasl_mechanism"] 468 | ===== `sasl_mechanism` 469 | 470 | * Value type is <> 471 | * Default value is `"GSSAPI"` 472 | 473 | http://kafka.apache.org/documentation.html#security_sasl[SASL mechanism] used for client connections. 474 | This may be any mechanism for which a security provider is available. 475 | GSSAPI is the default mechanism. 
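For a broker that expects SASL/PLAIN over TLS, the SASL options above are typically combined with the `security_protocol` setting described next; a hedged sketch with placeholder host and credentials:
[source,ruby]
    input {
      kafka {
        bootstrap_servers => "broker.example.com:9093"   # placeholder SASL_SSL listener
        security_protocol => "SASL_SSL"
        sasl_mechanism    => "PLAIN"
        sasl_jaas_config  => "org.apache.kafka.common.security.plain.PlainLoginModule required username='auser' password='apassword';"
      }
    }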
476 | 477 | [id="plugins-{type}s-{plugin}-security_protocol"] 478 | ===== `security_protocol` 479 | 480 | * Value can be any of: `PLAINTEXT`, `SSL`, `SASL_PLAINTEXT`, `SASL_SSL` 481 | * Default value is `"PLAINTEXT"` 482 | 483 | Security protocol to use: one of `PLAINTEXT`, `SSL`, `SASL_PLAINTEXT`, or `SASL_SSL`. 484 | 485 | [id="plugins-{type}s-{plugin}-send_buffer_bytes"] 486 | ===== `send_buffer_bytes` 487 | 488 | * Value type is <<string,string>> 489 | * There is no default value for this setting. 490 | 491 | The size of the TCP send buffer (SO_SNDBUF) to use when sending data. 492 | 493 | [id="plugins-{type}s-{plugin}-session_timeout_ms"] 494 | ===== `session_timeout_ms` 495 | 496 | * Value type is <<string,string>> 497 | * There is no default value for this setting. 498 | 499 | The timeout after which, if no heartbeats have been received from the consumer, it is marked dead 500 | and a rebalance operation is triggered for the group identified by `group_id`. 501 | 502 | [id="plugins-{type}s-{plugin}-ssl_endpoint_identification_algorithm"] 503 | ===== `ssl_endpoint_identification_algorithm` 504 | 505 | * Value type is <<string,string>> 506 | * Default value is `"https"` 507 | 508 | The endpoint identification algorithm, defaults to `"https"`. Set to the empty string `""` to disable endpoint verification. 509 | 510 | 511 | [id="plugins-{type}s-{plugin}-ssl_key_password"] 512 | ===== `ssl_key_password` 513 | 514 | * Value type is <<password,password>> 515 | * There is no default value for this setting. 516 | 517 | The password of the private key in the key store file. 518 | 519 | [id="plugins-{type}s-{plugin}-ssl_keystore_location"] 520 | ===== `ssl_keystore_location` 521 | 522 | * Value type is <<path,path>> 523 | * There is no default value for this setting. 524 | 525 | If client authentication is required, this setting stores the keystore path. 526 | 527 | [id="plugins-{type}s-{plugin}-ssl_keystore_password"] 528 | ===== `ssl_keystore_password` 529 | 530 | * Value type is <<password,password>> 531 | * There is no default value for this setting. 532 | 533 | If client authentication is required, this setting stores the keystore password. 534 | 535 | [id="plugins-{type}s-{plugin}-ssl_keystore_type"] 536 | ===== `ssl_keystore_type` 537 | 538 | * Value type is <<string,string>> 539 | * There is no default value for this setting. 540 | 541 | The keystore type. 542 | 543 | [id="plugins-{type}s-{plugin}-ssl_truststore_location"] 544 | ===== `ssl_truststore_location` 545 | 546 | * Value type is <<path,path>> 547 | * There is no default value for this setting. 548 | 549 | The JKS truststore path to validate the Kafka broker's certificate. 550 | 551 | [id="plugins-{type}s-{plugin}-ssl_truststore_password"] 552 | ===== `ssl_truststore_password` 553 | 554 | * Value type is <<password,password>> 555 | * There is no default value for this setting. 556 | 557 | The truststore password. 558 | 559 | [id="plugins-{type}s-{plugin}-ssl_truststore_type"] 560 | ===== `ssl_truststore_type` 561 | 562 | * Value type is <<string,string>> 563 | * There is no default value for this setting. 564 | 565 | The truststore type. 566 | 567 | [id="plugins-{type}s-{plugin}-topics"] 568 | ===== `topics` 569 | 570 | * Value type is <<array,array>> 571 | * Default value is `["logstash"]` 572 | 573 | A list of topics to subscribe to, defaults to `["logstash"]`. 574 | 575 | [id="plugins-{type}s-{plugin}-topics_pattern"] 576 | ===== `topics_pattern` 577 | 578 | * Value type is <<string,string>> 579 | * There is no default value for this setting. 580 | 581 | A topic regex pattern to subscribe to. 582 | The `topics` configuration will be ignored when using this configuration. 
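As an illustration, here is a sketch that combines `topics_pattern` with the SSL settings above. The broker address, truststore path, and password are placeholders; the pattern mirrors the `logstash_topic_.*` pattern used by this plugin's integration tests:

[source,ruby]
input {
  kafka {
    bootstrap_servers       => "broker.example.com:9093"             # placeholder broker address
    security_protocol       => "SSL"
    ssl_truststore_location => "/etc/logstash/kafka.truststore.jks"  # placeholder path
    ssl_truststore_password => "changeit"                            # placeholder password
    topics_pattern          => "logstash_topic_.*"                   # `topics` is ignored when a pattern is set
  }
}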
583 | 584 | [id="plugins-{type}s-{plugin}-value_deserializer_class"] 585 | ===== `value_deserializer_class` 586 | 587 | * Value type is <> 588 | * Default value is `"org.apache.kafka.common.serialization.StringDeserializer"` 589 | 590 | Java Class used to deserialize the record's value 591 | 592 | 593 | 594 | [id="plugins-{type}s-{plugin}-common-options"] 595 | include::{include_path}/{type}.asciidoc[] 596 | 597 | :default_codec!: 598 | -------------------------------------------------------------------------------- /gradle.properties: -------------------------------------------------------------------------------- 1 | org.gradle.daemon=false 2 | -------------------------------------------------------------------------------- /gradle/wrapper/gradle-wrapper.jar: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/logstash-plugins/logstash-input-kafka/80ad0d8bb26e9b371f113e90dd56e3ee2a0e742e/gradle/wrapper/gradle-wrapper.jar -------------------------------------------------------------------------------- /gradle/wrapper/gradle-wrapper.properties: -------------------------------------------------------------------------------- 1 | #Wed Jun 21 11:39:16 CEST 2017 2 | distributionBase=GRADLE_USER_HOME 3 | distributionPath=wrapper/dists 4 | zipStoreBase=GRADLE_USER_HOME 5 | zipStorePath=wrapper/dists 6 | distributionUrl=https\://services.gradle.org/distributions/gradle-4.0-all.zip 7 | -------------------------------------------------------------------------------- /gradlew: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | 3 | ############################################################################## 4 | ## 5 | ## Gradle start up script for UN*X 6 | ## 7 | ############################################################################## 8 | 9 | # Attempt to set APP_HOME 10 | # Resolve links: $0 may be a link 11 | PRG="$0" 12 | # Need this for relative symlinks. 13 | while [ -h "$PRG" ] ; do 14 | ls=`ls -ld "$PRG"` 15 | link=`expr "$ls" : '.*-> \(.*\)$'` 16 | if expr "$link" : '/.*' > /dev/null; then 17 | PRG="$link" 18 | else 19 | PRG=`dirname "$PRG"`"/$link" 20 | fi 21 | done 22 | SAVED="`pwd`" 23 | cd "`dirname \"$PRG\"`/" >/dev/null 24 | APP_HOME="`pwd -P`" 25 | cd "$SAVED" >/dev/null 26 | 27 | APP_NAME="Gradle" 28 | APP_BASE_NAME=`basename "$0"` 29 | 30 | # Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. 31 | DEFAULT_JVM_OPTS="" 32 | 33 | # Use the maximum available, or set MAX_FD != -1 to use that value. 34 | MAX_FD="maximum" 35 | 36 | warn () { 37 | echo "$*" 38 | } 39 | 40 | die () { 41 | echo 42 | echo "$*" 43 | echo 44 | exit 1 45 | } 46 | 47 | # OS specific support (must be 'true' or 'false'). 48 | cygwin=false 49 | msys=false 50 | darwin=false 51 | nonstop=false 52 | case "`uname`" in 53 | CYGWIN* ) 54 | cygwin=true 55 | ;; 56 | Darwin* ) 57 | darwin=true 58 | ;; 59 | MINGW* ) 60 | msys=true 61 | ;; 62 | NONSTOP* ) 63 | nonstop=true 64 | ;; 65 | esac 66 | 67 | CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar 68 | 69 | # Determine the Java command to use to start the JVM. 70 | if [ -n "$JAVA_HOME" ] ; then 71 | if [ -x "$JAVA_HOME/jre/sh/java" ] ; then 72 | # IBM's JDK on AIX uses strange locations for the executables 73 | JAVACMD="$JAVA_HOME/jre/sh/java" 74 | else 75 | JAVACMD="$JAVA_HOME/bin/java" 76 | fi 77 | if [ ! 
-x "$JAVACMD" ] ; then 78 | die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME 79 | 80 | Please set the JAVA_HOME variable in your environment to match the 81 | location of your Java installation." 82 | fi 83 | else 84 | JAVACMD="java" 85 | which java >/dev/null 2>&1 || die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. 86 | 87 | Please set the JAVA_HOME variable in your environment to match the 88 | location of your Java installation." 89 | fi 90 | 91 | # Increase the maximum file descriptors if we can. 92 | if [ "$cygwin" = "false" -a "$darwin" = "false" -a "$nonstop" = "false" ] ; then 93 | MAX_FD_LIMIT=`ulimit -H -n` 94 | if [ $? -eq 0 ] ; then 95 | if [ "$MAX_FD" = "maximum" -o "$MAX_FD" = "max" ] ; then 96 | MAX_FD="$MAX_FD_LIMIT" 97 | fi 98 | ulimit -n $MAX_FD 99 | if [ $? -ne 0 ] ; then 100 | warn "Could not set maximum file descriptor limit: $MAX_FD" 101 | fi 102 | else 103 | warn "Could not query maximum file descriptor limit: $MAX_FD_LIMIT" 104 | fi 105 | fi 106 | 107 | # For Darwin, add options to specify how the application appears in the dock 108 | if $darwin; then 109 | GRADLE_OPTS="$GRADLE_OPTS \"-Xdock:name=$APP_NAME\" \"-Xdock:icon=$APP_HOME/media/gradle.icns\"" 110 | fi 111 | 112 | # For Cygwin, switch paths to Windows format before running java 113 | if $cygwin ; then 114 | APP_HOME=`cygpath --path --mixed "$APP_HOME"` 115 | CLASSPATH=`cygpath --path --mixed "$CLASSPATH"` 116 | JAVACMD=`cygpath --unix "$JAVACMD"` 117 | 118 | # We build the pattern for arguments to be converted via cygpath 119 | ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null` 120 | SEP="" 121 | for dir in $ROOTDIRSRAW ; do 122 | ROOTDIRS="$ROOTDIRS$SEP$dir" 123 | SEP="|" 124 | done 125 | OURCYGPATTERN="(^($ROOTDIRS))" 126 | # Add a user-defined pattern to the cygpath arguments 127 | if [ "$GRADLE_CYGPATTERN" != "" ] ; then 128 | OURCYGPATTERN="$OURCYGPATTERN|($GRADLE_CYGPATTERN)" 129 | fi 130 | # Now convert the arguments - kludge to limit ourselves to /bin/sh 131 | i=0 132 | for arg in "$@" ; do 133 | CHECK=`echo "$arg"|egrep -c "$OURCYGPATTERN" -` 134 | CHECK2=`echo "$arg"|egrep -c "^-"` ### Determine if an option 135 | 136 | if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then ### Added a condition 137 | eval `echo args$i`=`cygpath --path --ignore --mixed "$arg"` 138 | else 139 | eval `echo args$i`="\"$arg\"" 140 | fi 141 | i=$((i+1)) 142 | done 143 | case $i in 144 | (0) set -- ;; 145 | (1) set -- "$args0" ;; 146 | (2) set -- "$args0" "$args1" ;; 147 | (3) set -- "$args0" "$args1" "$args2" ;; 148 | (4) set -- "$args0" "$args1" "$args2" "$args3" ;; 149 | (5) set -- "$args0" "$args1" "$args2" "$args3" "$args4" ;; 150 | (6) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" ;; 151 | (7) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" ;; 152 | (8) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" ;; 153 | (9) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" "$args8" ;; 154 | esac 155 | fi 156 | 157 | # Escape application args 158 | save () { 159 | for i do printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/" ; done 160 | echo " " 161 | } 162 | APP_ARGS=$(save "$@") 163 | 164 | # Collect all arguments for the java command, following the shell quoting and substitution rules 165 | eval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS "\"-Dorg.gradle.appname=$APP_BASE_NAME\"" -classpath "\"$CLASSPATH\"" org.gradle.wrapper.GradleWrapperMain 
"$APP_ARGS" 166 | 167 | # by default we should be in the correct project dir, but when run from Finder on Mac, the cwd is wrong 168 | if [ "$(uname)" = "Darwin" ] && [ "$HOME" = "$PWD" ]; then 169 | cd "$(dirname "$0")" 170 | fi 171 | 172 | exec "$JAVACMD" "$@" 173 | -------------------------------------------------------------------------------- /gradlew.bat: -------------------------------------------------------------------------------- 1 | @if "%DEBUG%" == "" @echo off 2 | @rem ########################################################################## 3 | @rem 4 | @rem Gradle startup script for Windows 5 | @rem 6 | @rem ########################################################################## 7 | 8 | @rem Set local scope for the variables with windows NT shell 9 | if "%OS%"=="Windows_NT" setlocal 10 | 11 | set DIRNAME=%~dp0 12 | if "%DIRNAME%" == "" set DIRNAME=. 13 | set APP_BASE_NAME=%~n0 14 | set APP_HOME=%DIRNAME% 15 | 16 | @rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. 17 | set DEFAULT_JVM_OPTS= 18 | 19 | @rem Find java.exe 20 | if defined JAVA_HOME goto findJavaFromJavaHome 21 | 22 | set JAVA_EXE=java.exe 23 | %JAVA_EXE% -version >NUL 2>&1 24 | if "%ERRORLEVEL%" == "0" goto init 25 | 26 | echo. 27 | echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. 28 | echo. 29 | echo Please set the JAVA_HOME variable in your environment to match the 30 | echo location of your Java installation. 31 | 32 | goto fail 33 | 34 | :findJavaFromJavaHome 35 | set JAVA_HOME=%JAVA_HOME:"=% 36 | set JAVA_EXE=%JAVA_HOME%/bin/java.exe 37 | 38 | if exist "%JAVA_EXE%" goto init 39 | 40 | echo. 41 | echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME% 42 | echo. 43 | echo Please set the JAVA_HOME variable in your environment to match the 44 | echo location of your Java installation. 45 | 46 | goto fail 47 | 48 | :init 49 | @rem Get command-line arguments, handling Windows variants 50 | 51 | if not "%OS%" == "Windows_NT" goto win9xME_args 52 | 53 | :win9xME_args 54 | @rem Slurp the command line arguments. 55 | set CMD_LINE_ARGS= 56 | set _SKIP=2 57 | 58 | :win9xME_args_slurp 59 | if "x%~1" == "x" goto execute 60 | 61 | set CMD_LINE_ARGS=%* 62 | 63 | :execute 64 | @rem Setup the command line 65 | 66 | set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar 67 | 68 | @rem Execute Gradle 69 | "%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %CMD_LINE_ARGS% 70 | 71 | :end 72 | @rem End local scope for the variables with windows NT shell 73 | if "%ERRORLEVEL%"=="0" goto mainEnd 74 | 75 | :fail 76 | rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of 77 | rem the _cmd.exe /c_ return code! 
78 | if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1 79 | exit /b 1 80 | 81 | :mainEnd 82 | if "%OS%"=="Windows_NT" endlocal 83 | 84 | :omega 85 | -------------------------------------------------------------------------------- /kafka_test_setup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Setup Kafka and create test topics 3 | 4 | set -ex 5 | if [ -n "${KAFKA_VERSION+1}" ]; then 6 | echo "KAFKA_VERSION is $KAFKA_VERSION" 7 | else 8 | KAFKA_VERSION=2.1.1 9 | fi 10 | 11 | export _JAVA_OPTIONS="-Djava.net.preferIPv4Stack=true" 12 | 13 | rm -rf build 14 | mkdir build 15 | 16 | echo "Downloading Kafka version $KAFKA_VERSION" 17 | curl -s -o build/kafka.tgz "http://ftp.wayne.edu/apache/kafka/$KAFKA_VERSION/kafka_2.11-$KAFKA_VERSION.tgz" 18 | mkdir build/kafka && tar xzf build/kafka.tgz -C build/kafka --strip-components 1 19 | 20 | echo "Starting ZooKeeper" 21 | build/kafka/bin/zookeeper-server-start.sh -daemon build/kafka/config/zookeeper.properties 22 | sleep 10 23 | echo "Starting Kafka broker" 24 | build/kafka/bin/kafka-server-start.sh -daemon build/kafka/config/server.properties --override advertised.host.name=127.0.0.1 --override log.dirs="${PWD}/build/kafka-logs" 25 | sleep 10 26 | 27 | echo "Setting up test topics with test data" 28 | build/kafka/bin/kafka-topics.sh --create --partitions 3 --replication-factor 1 --topic logstash_topic_plain --zookeeper localhost:2181 29 | build/kafka/bin/kafka-topics.sh --create --partitions 3 --replication-factor 1 --topic logstash_topic_snappy --zookeeper localhost:2181 30 | build/kafka/bin/kafka-topics.sh --create --partitions 3 --replication-factor 1 --topic logstash_topic_lz4 --zookeeper localhost:2181 31 | curl -s -o build/apache_logs.txt https://s3.amazonaws.com/data.elasticsearch.org/apache_logs/apache_logs.txt 32 | cat build/apache_logs.txt | build/kafka/bin/kafka-console-producer.sh --topic logstash_topic_plain --broker-list localhost:9092 33 | cat build/apache_logs.txt | build/kafka/bin/kafka-console-producer.sh --topic logstash_topic_snappy --broker-list localhost:9092 --compression-codec snappy 34 | cat build/apache_logs.txt | build/kafka/bin/kafka-console-producer.sh --topic logstash_topic_lz4 --broker-list localhost:9092 --compression-codec lz4 35 | echo "Setup complete, running specs" 36 | -------------------------------------------------------------------------------- /kafka_test_teardown.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -ex 3 | 4 | echo "Stopping Kafka broker" 5 | build/kafka/bin/kafka-server-stop.sh 6 | echo "Stopping zookeeper" 7 | build/kafka/bin/zookeeper-server-stop.sh 8 | -------------------------------------------------------------------------------- /lib/logstash-input-kafka_jars.rb: -------------------------------------------------------------------------------- 1 | # AUTOGENERATED BY THE GRADLE SCRIPT. DO NOT EDIT. 
2 | 3 | require 'jar_dependencies' 4 | require_jar('org.apache.kafka', 'kafka-clients', '2.1.0') 5 | require_jar('com.github.luben', 'zstd-jni', '1.3.7-3') 6 | require_jar('org.slf4j', 'slf4j-api', '1.7.25') 7 | require_jar('org.lz4', 'lz4-java', '1.5.0') 8 | require_jar('org.xerial.snappy', 'snappy-java', '1.1.7.2') 9 | -------------------------------------------------------------------------------- /lib/logstash/inputs/kafka.rb: -------------------------------------------------------------------------------- 1 | require 'logstash/namespace' 2 | require 'logstash/inputs/base' 3 | require 'stud/interval' 4 | require 'java' 5 | require 'logstash-input-kafka_jars.rb' 6 | 7 | # This input will read events from a Kafka topic. It uses the 0.10 version of 8 | # the consumer API provided by Kafka to read messages from the broker. 9 | # 10 | # Here's a compatibility matrix that shows the Kafka client versions that are compatible with each combination 11 | # of Logstash and the Kafka input plugin: 12 | # 13 | # [options="header"] 14 | # |========================================================== 15 | # |Kafka Client Version |Logstash Version |Plugin Version |Why? 16 | # |0.8 |2.0.0 - 2.x.x |<3.0.0 |Legacy, 0.8 is still popular 17 | # |0.9 |2.0.0 - 2.3.x | 3.x.x |Works with the old Ruby Event API (`event['product']['price'] = 10`) 18 | # |0.9 |2.4.x - 5.x.x | 4.x.x |Works with the new getter/setter APIs (`event.set('[product][price]', 10)`) 19 | # |0.10.0.x |2.4.x - 5.x.x | 5.x.x |Not compatible with the <= 0.9 broker 20 | # |0.10.1.x |2.4.x - 5.x.x | 6.x.x | 21 | # |========================================================== 22 | # 23 | # NOTE: We recommended that you use matching Kafka client and broker versions. During upgrades, you should 24 | # upgrade brokers before clients because brokers target backwards compatibility. For example, the 0.9 broker 25 | # is compatible with both the 0.8 consumer and 0.9 consumer APIs, but not the other way around. 26 | # 27 | # This input supports connecting to Kafka over: 28 | # 29 | # * SSL (requires plugin version 3.0.0 or later) 30 | # * Kerberos SASL (requires plugin version 5.1.0 or later) 31 | # 32 | # By default security is disabled but can be turned on as needed. 33 | # 34 | # The Logstash Kafka consumer handles group management and uses the default offset management 35 | # strategy using Kafka topics. 36 | # 37 | # Logstash instances by default form a single logical group to subscribe to Kafka topics 38 | # Each Logstash Kafka consumer can run multiple threads to increase read throughput. Alternatively, 39 | # you could run multiple Logstash instances with the same `group_id` to spread the load across 40 | # physical machines. Messages in a topic will be distributed to all Logstash instances with 41 | # the same `group_id`. 42 | # 43 | # Ideally you should have as many threads as the number of partitions for a perfect balance -- 44 | # more threads than partitions means that some threads will be idle 45 | # 46 | # For more information see http://kafka.apache.org/documentation.html#theconsumer 47 | # 48 | # Kafka consumer configuration: http://kafka.apache.org/documentation.html#consumerconfigs 49 | # 50 | class LogStash::Inputs::Kafka < LogStash::Inputs::Base 51 | config_name 'kafka' 52 | 53 | default :codec, 'plain' 54 | 55 | # The frequency in milliseconds that the consumer offsets are committed to Kafka. 
56 | config :auto_commit_interval_ms, :validate => :string, :default => "5000" 57 | # What to do when there is no initial offset in Kafka or if an offset is out of range: 58 | # 59 | # * earliest: automatically reset the offset to the earliest offset 60 | # * latest: automatically reset the offset to the latest offset 61 | # * none: throw exception to the consumer if no previous offset is found for the consumer's group 62 | # * anything else: throw exception to the consumer. 63 | config :auto_offset_reset, :validate => :string 64 | # A list of URLs of Kafka instances to use for establishing the initial connection to the cluster. 65 | # This list should be in the form of `host1:port1,host2:port2` These urls are just used 66 | # for the initial connection to discover the full cluster membership (which may change dynamically) 67 | # so this list need not contain the full set of servers (you may want more than one, though, in 68 | # case a server is down). 69 | config :bootstrap_servers, :validate => :string, :default => "localhost:9092" 70 | # Automatically check the CRC32 of the records consumed. This ensures no on-the-wire or on-disk 71 | # corruption to the messages occurred. This check adds some overhead, so it may be 72 | # disabled in cases seeking extreme performance. 73 | config :check_crcs, :validate => :string 74 | # The id string to pass to the server when making requests. The purpose of this 75 | # is to be able to track the source of requests beyond just ip/port by allowing 76 | # a logical application name to be included. 77 | config :client_id, :validate => :string, :default => "logstash" 78 | # Close idle connections after the number of milliseconds specified by this config. 79 | config :connections_max_idle_ms, :validate => :string 80 | # Ideally you should have as many threads as the number of partitions for a perfect 81 | # balance — more threads than partitions means that some threads will be idle 82 | config :consumer_threads, :validate => :number, :default => 1 83 | # If true, periodically commit to Kafka the offsets of messages already returned by the consumer. 84 | # This committed offset will be used when the process fails as the position from 85 | # which the consumption will begin. 86 | config :enable_auto_commit, :validate => :string, :default => "true" 87 | # Whether records from internal topics (such as offsets) should be exposed to the consumer. 88 | # If set to true the only way to receive records from an internal topic is subscribing to it. 89 | config :exclude_internal_topics, :validate => :string 90 | # The maximum amount of data the server should return for a fetch request. This is not an 91 | # absolute maximum, if the first message in the first non-empty partition of the fetch is larger 92 | # than this value, the message will still be returned to ensure that the consumer can make progress. 93 | config :fetch_max_bytes, :validate => :string 94 | # The maximum amount of time the server will block before answering the fetch request if 95 | # there isn't sufficient data to immediately satisfy `fetch_min_bytes`. This 96 | # should be less than or equal to the timeout used in `poll_timeout_ms` 97 | config :fetch_max_wait_ms, :validate => :string 98 | # The minimum amount of data the server should return for a fetch request. If insufficient 99 | # data is available the request will wait for that much data to accumulate 100 | # before answering the request. 
101 | config :fetch_min_bytes, :validate => :string 102 | # The identifier of the group this consumer belongs to. Consumer group is a single logical subscriber 103 | # that happens to be made up of multiple processors. Messages in a topic will be distributed to all 104 | # Logstash instances with the same `group_id` 105 | config :group_id, :validate => :string, :default => "logstash" 106 | # The expected time between heartbeats to the consumer coordinator. Heartbeats are used to ensure 107 | # that the consumer's session stays active and to facilitate rebalancing when new 108 | # consumers join or leave the group. The value must be set lower than 109 | # `session.timeout.ms`, but typically should be set no higher than 1/3 of that value. 110 | # It can be adjusted even lower to control the expected time for normal rebalances. 111 | config :heartbeat_interval_ms, :validate => :string 112 | # Java Class used to deserialize the record's key 113 | config :key_deserializer_class, :validate => :string, :default => "org.apache.kafka.common.serialization.StringDeserializer" 114 | # The maximum delay between invocations of poll() when using consumer group management. This places 115 | # an upper bound on the amount of time that the consumer can be idle before fetching more records. 116 | # If poll() is not called before expiration of this timeout, then the consumer is considered failed and 117 | # the group will rebalance in order to reassign the partitions to another member. 118 | # The value of the configuration `request_timeout_ms` must always be larger than max_poll_interval_ms 119 | config :max_poll_interval_ms, :validate => :string 120 | # The maximum amount of data per-partition the server will return. The maximum total memory used for a 121 | # request will be #partitions * max.partition.fetch.bytes. This size must be at least 122 | # as large as the maximum message size the server allows or else it is possible for the producer to 123 | # send messages larger than the consumer can fetch. If that happens, the consumer can get stuck trying 124 | # to fetch a large message on a certain partition. 125 | config :max_partition_fetch_bytes, :validate => :string 126 | # The maximum number of records returned in a single call to poll(). 127 | config :max_poll_records, :validate => :string 128 | # The period of time in milliseconds after which we force a refresh of metadata even if 129 | # we haven't seen any partition leadership changes to proactively discover any new brokers or partitions 130 | config :metadata_max_age_ms, :validate => :string 131 | # The class name of the partition assignment strategy that the client will use to distribute 132 | # partition ownership amongst consumer instances 133 | config :partition_assignment_strategy, :validate => :string 134 | # The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. 135 | config :receive_buffer_bytes, :validate => :string 136 | # The amount of time to wait before attempting to reconnect to a given host. 137 | # This avoids repeatedly connecting to a host in a tight loop. 138 | # This backoff applies to all requests sent by the consumer to the broker. 139 | config :reconnect_backoff_ms, :validate => :string 140 | # The configuration controls the maximum amount of time the client will wait 141 | # for the response of a request. If the response is not received before the timeout 142 | # elapses the client will resend the request if necessary or fail the request if 143 | # retries are exhausted. 
144 | config :request_timeout_ms, :validate => :string 145 | # The amount of time to wait before attempting to retry a failed fetch request 146 | # to a given topic partition. This avoids repeated fetching-and-failing in a tight loop. 147 | config :retry_backoff_ms, :validate => :string 148 | # The size of the TCP send buffer (SO_SNDBUF) to use when sending data 149 | config :send_buffer_bytes, :validate => :string 150 | # The timeout after which, if the `poll_timeout_ms` is not invoked, the consumer is marked dead 151 | # and a rebalance operation is triggered for the group identified by `group_id` 152 | config :session_timeout_ms, :validate => :string 153 | # Java Class used to deserialize the record's value 154 | config :value_deserializer_class, :validate => :string, :default => "org.apache.kafka.common.serialization.StringDeserializer" 155 | # A list of topics to subscribe to, defaults to ["logstash"]. 156 | config :topics, :validate => :array, :default => ["logstash"] 157 | # A topic regex pattern to subscribe to. 158 | # The topics configuration will be ignored when using this configuration. 159 | config :topics_pattern, :validate => :string 160 | # Time kafka consumer will wait to receive new messages from topics 161 | config :poll_timeout_ms, :validate => :number, :default => 100 162 | # The truststore type. 163 | config :ssl_truststore_type, :validate => :string 164 | # The JKS truststore path to validate the Kafka broker's certificate. 165 | config :ssl_truststore_location, :validate => :path 166 | # The truststore password 167 | config :ssl_truststore_password, :validate => :password 168 | # The keystore type. 169 | config :ssl_keystore_type, :validate => :string 170 | # If client authentication is required, this setting stores the keystore path. 171 | config :ssl_keystore_location, :validate => :path 172 | # If client authentication is required, this setting stores the keystore password 173 | config :ssl_keystore_password, :validate => :password 174 | # The password of the private key in the key store file. 175 | config :ssl_key_password, :validate => :password 176 | # Algorithm to use when verifying host. Set to "" to disable 177 | config :ssl_endpoint_identification_algorithm, :validate => :string, :default => 'https' 178 | # Security protocol to use, which can be either of PLAINTEXT,SSL,SASL_PLAINTEXT,SASL_SSL 179 | config :security_protocol, :validate => ["PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"], :default => "PLAINTEXT" 180 | # http://kafka.apache.org/documentation.html#security_sasl[SASL mechanism] used for client connections. 181 | # This may be any mechanism for which a security provider is available. 182 | # GSSAPI is the default mechanism. 183 | config :sasl_mechanism, :validate => :string, :default => "GSSAPI" 184 | # The Kerberos principal name that Kafka broker runs as. 185 | # This can be defined either in Kafka's JAAS config or in Kafka's config. 186 | config :sasl_kerberos_service_name, :validate => :string 187 | # The Java Authentication and Authorization Service (JAAS) API supplies user authentication and authorization 188 | # services for Kafka. This setting provides the path to the JAAS file. 
Sample JAAS file for Kafka client: 189 | # [source,java] 190 | # ---------------------------------- 191 | # KafkaClient { 192 | # com.sun.security.auth.module.Krb5LoginModule required 193 | # useTicketCache=true 194 | # renewTicket=true 195 | # serviceName="kafka"; 196 | # }; 197 | # ---------------------------------- 198 | # 199 | # Please note that specifying `jaas_path` and `kerberos_config` in the config file will add these 200 | # to the global JVM system properties. This means if you have multiple Kafka inputs, all of them would be sharing the same 201 | # `jaas_path` and `kerberos_config`. If this is not desirable, you would have to run separate instances of Logstash on 202 | # different JVM instances. 203 | config :jaas_path, :validate => :path 204 | # JAAS configuration settings. This allows JAAS config to be a part of the plugin configuration and allows for different JAAS configuration per each plugin config. 205 | config :sasl_jaas_config, :validate => :string 206 | # Optional path to kerberos config file. This is krb5.conf style as detailed in https://web.mit.edu/kerberos/krb5-1.12/doc/admin/conf_files/krb5_conf.html 207 | config :kerberos_config, :validate => :path 208 | # Option to add Kafka metadata like topic, message size to the event. 209 | # This will add a field named `kafka` to the logstash event containing the following attributes: 210 | # `topic`: The topic this message is associated with 211 | # `consumer_group`: The consumer group used to read in this event 212 | # `partition`: The partition this message is associated with 213 | # `offset`: The offset from the partition this message is associated with 214 | # `key`: A ByteBuffer containing the message key 215 | # `timestamp`: The timestamp of this message 216 | config :decorate_events, :validate => :boolean, :default => false 217 | 218 | 219 | public 220 | def register 221 | @runner_threads = [] 222 | end # def register 223 | 224 | public 225 | def run(logstash_queue) 226 | @runner_consumers = consumer_threads.times.map { |i| create_consumer("#{client_id}-#{i}") } 227 | @runner_threads = @runner_consumers.map { |consumer| thread_runner(logstash_queue, consumer) } 228 | @runner_threads.each { |t| t.join } 229 | end # def run 230 | 231 | public 232 | def stop 233 | # if we have consumers, wake them up to unblock our runner threads 234 | @runner_consumers && @runner_consumers.each(&:wakeup) 235 | end 236 | 237 | public 238 | def kafka_consumers 239 | @runner_consumers 240 | end 241 | 242 | private 243 | def thread_runner(logstash_queue, consumer) 244 | Thread.new do 245 | begin 246 | unless @topics_pattern.nil? 247 | nooplistener = org.apache.kafka.clients.consumer.internals.NoOpConsumerRebalanceListener.new 248 | pattern = java.util.regex.Pattern.compile(@topics_pattern) 249 | consumer.subscribe(pattern, nooplistener) 250 | else 251 | consumer.subscribe(topics); 252 | end 253 | codec_instance = @codec.clone 254 | while !stop? 
255 | records = consumer.poll(poll_timeout_ms) 256 | next unless records.count > 0 257 | for record in records do 258 | codec_instance.decode(record.value.to_s) do |event| 259 | decorate(event) 260 | if @decorate_events 261 | event.set("[@metadata][kafka][topic]", record.topic) 262 | event.set("[@metadata][kafka][consumer_group]", @group_id) 263 | event.set("[@metadata][kafka][partition]", record.partition) 264 | event.set("[@metadata][kafka][offset]", record.offset) 265 | event.set("[@metadata][kafka][key]", record.key) 266 | event.set("[@metadata][kafka][timestamp]", record.timestamp) 267 | end 268 | logstash_queue << event 269 | end 270 | end 271 | # Manual offset commit 272 | if @enable_auto_commit == "false" 273 | consumer.commitSync 274 | end 275 | end 276 | rescue org.apache.kafka.common.errors.WakeupException => e 277 | raise e if !stop? 278 | ensure 279 | consumer.close 280 | end 281 | end 282 | end 283 | 284 | private 285 | def create_consumer(client_id) 286 | begin 287 | props = java.util.Properties.new 288 | kafka = org.apache.kafka.clients.consumer.ConsumerConfig 289 | 290 | props.put(kafka::AUTO_COMMIT_INTERVAL_MS_CONFIG, auto_commit_interval_ms) 291 | props.put(kafka::AUTO_OFFSET_RESET_CONFIG, auto_offset_reset) unless auto_offset_reset.nil? 292 | props.put(kafka::BOOTSTRAP_SERVERS_CONFIG, bootstrap_servers) 293 | props.put(kafka::CHECK_CRCS_CONFIG, check_crcs) unless check_crcs.nil? 294 | props.put(kafka::CLIENT_ID_CONFIG, client_id) 295 | props.put(kafka::CONNECTIONS_MAX_IDLE_MS_CONFIG, connections_max_idle_ms) unless connections_max_idle_ms.nil? 296 | props.put(kafka::ENABLE_AUTO_COMMIT_CONFIG, enable_auto_commit) 297 | props.put(kafka::EXCLUDE_INTERNAL_TOPICS_CONFIG, exclude_internal_topics) unless exclude_internal_topics.nil? 298 | props.put(kafka::FETCH_MAX_BYTES_CONFIG, fetch_max_bytes) unless fetch_max_bytes.nil? 299 | props.put(kafka::FETCH_MAX_WAIT_MS_CONFIG, fetch_max_wait_ms) unless fetch_max_wait_ms.nil? 300 | props.put(kafka::FETCH_MIN_BYTES_CONFIG, fetch_min_bytes) unless fetch_min_bytes.nil? 301 | props.put(kafka::GROUP_ID_CONFIG, group_id) 302 | props.put(kafka::HEARTBEAT_INTERVAL_MS_CONFIG, heartbeat_interval_ms) unless heartbeat_interval_ms.nil? 303 | props.put(kafka::KEY_DESERIALIZER_CLASS_CONFIG, key_deserializer_class) 304 | props.put(kafka::MAX_PARTITION_FETCH_BYTES_CONFIG, max_partition_fetch_bytes) unless max_partition_fetch_bytes.nil? 305 | props.put(kafka::MAX_POLL_RECORDS_CONFIG, max_poll_records) unless max_poll_records.nil? 306 | props.put(kafka::MAX_POLL_INTERVAL_MS_CONFIG, max_poll_interval_ms) unless max_poll_interval_ms.nil? 307 | props.put(kafka::METADATA_MAX_AGE_CONFIG, metadata_max_age_ms) unless metadata_max_age_ms.nil? 308 | props.put(kafka::PARTITION_ASSIGNMENT_STRATEGY_CONFIG, partition_assignment_strategy) unless partition_assignment_strategy.nil? 309 | props.put(kafka::RECEIVE_BUFFER_CONFIG, receive_buffer_bytes) unless receive_buffer_bytes.nil? 310 | props.put(kafka::RECONNECT_BACKOFF_MS_CONFIG, reconnect_backoff_ms) unless reconnect_backoff_ms.nil? 311 | props.put(kafka::REQUEST_TIMEOUT_MS_CONFIG, request_timeout_ms) unless request_timeout_ms.nil? 312 | props.put(kafka::RETRY_BACKOFF_MS_CONFIG, retry_backoff_ms) unless retry_backoff_ms.nil? 313 | props.put(kafka::SEND_BUFFER_CONFIG, send_buffer_bytes) unless send_buffer_bytes.nil? 314 | props.put(kafka::SESSION_TIMEOUT_MS_CONFIG, session_timeout_ms) unless session_timeout_ms.nil? 
315 | props.put(kafka::VALUE_DESERIALIZER_CLASS_CONFIG, value_deserializer_class) 316 | 317 | props.put("security.protocol", security_protocol) unless security_protocol.nil? 318 | 319 | if security_protocol == "SSL" 320 | set_trustore_keystore_config(props) 321 | elsif security_protocol == "SASL_PLAINTEXT" 322 | set_sasl_config(props) 323 | elsif security_protocol == "SASL_SSL" 324 | set_trustore_keystore_config(props) 325 | set_sasl_config(props) 326 | end 327 | 328 | org.apache.kafka.clients.consumer.KafkaConsumer.new(props) 329 | rescue => e 330 | logger.error("Unable to create Kafka consumer from given configuration", 331 | :kafka_error_message => e, 332 | :cause => e.respond_to?(:getCause) ? e.getCause() : nil) 333 | raise e 334 | end 335 | end 336 | 337 | def set_trustore_keystore_config(props) 338 | props.put("ssl.truststore.type", ssl_truststore_type) unless ssl_truststore_type.nil? 339 | props.put("ssl.truststore.location", ssl_truststore_location) unless ssl_truststore_location.nil? 340 | props.put("ssl.truststore.password", ssl_truststore_password.value) unless ssl_truststore_password.nil? 341 | 342 | # Client auth stuff 343 | props.put("ssl.keystore.type", ssl_keystore_type) unless ssl_keystore_type.nil? 344 | props.put("ssl.key.password", ssl_key_password.value) unless ssl_key_password.nil? 345 | props.put("ssl.keystore.location", ssl_keystore_location) unless ssl_keystore_location.nil? 346 | props.put("ssl.keystore.password", ssl_keystore_password.value) unless ssl_keystore_password.nil? 347 | props.put("ssl.endpoint.identification.algorithm", ssl_endpoint_identification_algorithm) unless ssl_endpoint_identification_algorithm.nil? 348 | end 349 | 350 | def set_sasl_config(props) 351 | java.lang.System.setProperty("java.security.auth.login.config",jaas_path) unless jaas_path.nil? 352 | java.lang.System.setProperty("java.security.krb5.conf",kerberos_config) unless kerberos_config.nil? 353 | 354 | props.put("sasl.mechanism",sasl_mechanism) 355 | if sasl_mechanism == "GSSAPI" && sasl_kerberos_service_name.nil? 356 | raise LogStash::ConfigurationError, "sasl_kerberos_service_name must be specified when SASL mechanism is GSSAPI" 357 | end 358 | 359 | props.put("sasl.kerberos.service.name",sasl_kerberos_service_name) unless sasl_kerberos_service_name.nil? 360 | props.put("sasl.jaas.config", sasl_jaas_config) unless sasl_jaas_config.nil? 361 | end 362 | end #class LogStash::Inputs::Kafka 363 | -------------------------------------------------------------------------------- /logstash-input-kafka.gemspec: -------------------------------------------------------------------------------- 1 | Gem::Specification.new do |s| 2 | s.name = 'logstash-input-kafka' 3 | s.version = '9.1.0' 4 | s.licenses = ['Apache-2.0'] 5 | s.summary = "Reads events from a Kafka topic" 6 | s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. 
This gem is not a stand-alone program" 7 | s.authors = ['Elasticsearch'] 8 | s.email = 'info@elastic.co' 9 | s.homepage = "http://www.elastic.co/guide/en/logstash/current/index.html" 10 | s.require_paths = ['lib', 'vendor/jar-dependencies'] 11 | 12 | # Files 13 | s.files = Dir["lib/**/*","spec/**/*","*.gemspec","*.md","CONTRIBUTORS","Gemfile","LICENSE","NOTICE.TXT", "vendor/jar-dependencies/**/*.jar", "vendor/jar-dependencies/**/*.rb", "VERSION", "docs/**/*"] 14 | 15 | # Tests 16 | s.test_files = s.files.grep(%r{^(test|spec|features)/}) 17 | 18 | # Special flag to let us know this is actually a logstash plugin 19 | s.metadata = { 'logstash_plugin' => 'true', 'logstash_group' => 'input'} 20 | 21 | s.add_development_dependency 'jar-dependencies', '~> 0.3.2' 22 | 23 | # Gem dependencies 24 | s.add_runtime_dependency "logstash-core-plugin-api", ">= 1.60", "<= 2.99" 25 | s.add_runtime_dependency 'logstash-codec-json' 26 | s.add_runtime_dependency 'logstash-codec-plain' 27 | s.add_runtime_dependency 'stud', '>= 0.0.22', '< 0.1.0' 28 | 29 | s.add_development_dependency 'logstash-devutils' 30 | s.add_development_dependency 'rspec-wait' 31 | end 32 | 33 | -------------------------------------------------------------------------------- /spec/integration/inputs/kafka_spec.rb: -------------------------------------------------------------------------------- 1 | # encoding: utf-8 2 | require "logstash/devutils/rspec/spec_helper" 3 | require "logstash/inputs/kafka" 4 | require "digest" 5 | require "rspec/wait" 6 | 7 | # Please run kafka_test_setup.sh prior to executing this integration test. 8 | describe "inputs/kafka", :integration => true do 9 | # Group ids to make sure that the consumers get all the logs. 10 | let(:group_id_1) {rand(36**8).to_s(36)} 11 | let(:group_id_2) {rand(36**8).to_s(36)} 12 | let(:group_id_3) {rand(36**8).to_s(36)} 13 | let(:group_id_4) {rand(36**8).to_s(36)} 14 | let(:group_id_5) {rand(36**8).to_s(36)} 15 | let(:plain_config) { { 'topics' => ['logstash_topic_plain'], 'codec' => 'plain', 'group_id' => group_id_1, 'auto_offset_reset' => 'earliest'} } 16 | let(:multi_consumer_config) { plain_config.merge({"group_id" => group_id_4, "client_id" => "spec", "consumer_threads" => 3}) } 17 | let(:snappy_config) { { 'topics' => ['logstash_topic_snappy'], 'codec' => 'plain', 'group_id' => group_id_1, 'auto_offset_reset' => 'earliest'} } 18 | let(:lz4_config) { { 'topics' => ['logstash_topic_lz4'], 'codec' => 'plain', 'group_id' => group_id_1, 'auto_offset_reset' => 'earliest'} } 19 | let(:pattern_config) { { 'topics_pattern' => 'logstash_topic_.*', 'group_id' => group_id_2, 'codec' => 'plain', 'auto_offset_reset' => 'earliest'} } 20 | let(:decorate_config) { { 'topics' => ['logstash_topic_plain'], 'codec' => 'plain', 'group_id' => group_id_3, 'auto_offset_reset' => 'earliest', 'decorate_events' => true} } 21 | let(:manual_commit_config) { { 'topics' => ['logstash_topic_plain'], 'codec' => 'plain', 'group_id' => group_id_5, 'auto_offset_reset' => 'earliest', 'enable_auto_commit' => 'false'} } 22 | let(:timeout_seconds) { 30 } 23 | let(:num_events) { 103 } 24 | 25 | describe "#kafka-topics" do 26 | def thread_it(kafka_input, queue) 27 | Thread.new do 28 | begin 29 | kafka_input.run(queue) 30 | end 31 | end 32 | end 33 | 34 | it "should consume all messages from plain 3-partition topic" do 35 | kafka_input = LogStash::Inputs::Kafka.new(plain_config) 36 | queue = Queue.new 37 | t = thread_it(kafka_input, queue) 38 | begin 39 | t.run 40 | wait(timeout_seconds).for {queue.length}.to 
eq(num_events) 41 | expect(queue.length).to eq(num_events) 42 | ensure 43 | t.kill 44 | t.join(30_000) 45 | end 46 | end 47 | 48 | it "should consume all messages from snappy 3-partition topic" do 49 | kafka_input = LogStash::Inputs::Kafka.new(snappy_config) 50 | queue = Queue.new 51 | t = thread_it(kafka_input, queue) 52 | begin 53 | t.run 54 | wait(timeout_seconds).for {queue.length}.to eq(num_events) 55 | expect(queue.length).to eq(num_events) 56 | ensure 57 | t.kill 58 | t.join(30_000) 59 | end 60 | end 61 | 62 | it "should consume all messages from lz4 3-partition topic" do 63 | kafka_input = LogStash::Inputs::Kafka.new(lz4_config) 64 | queue = Queue.new 65 | t = thread_it(kafka_input, queue) 66 | begin 67 | t.run 68 | wait(timeout_seconds).for {queue.length}.to eq(num_events) 69 | expect(queue.length).to eq(num_events) 70 | ensure 71 | t.kill 72 | t.join(30_000) 73 | end 74 | end 75 | 76 | it "should consumer all messages with multiple consumers" do 77 | kafka_input = LogStash::Inputs::Kafka.new(multi_consumer_config) 78 | queue = Queue.new 79 | t = thread_it(kafka_input, queue) 80 | begin 81 | t.run 82 | wait(timeout_seconds).for {queue.length}.to eq(num_events) 83 | expect(queue.length).to eq(num_events) 84 | kafka_input.kafka_consumers.each_with_index do |consumer, i| 85 | expect(consumer.metrics.keys.first.tags["client-id"]).to eq("spec-#{i}") 86 | end 87 | ensure 88 | t.kill 89 | t.join(30_000) 90 | end 91 | end 92 | end 93 | 94 | describe "#kafka-topics-pattern" do 95 | def thread_it(kafka_input, queue) 96 | Thread.new do 97 | begin 98 | kafka_input.run(queue) 99 | end 100 | end 101 | end 102 | 103 | it "should consume all messages from all 3 topics" do 104 | kafka_input = LogStash::Inputs::Kafka.new(pattern_config) 105 | queue = Queue.new 106 | t = thread_it(kafka_input, queue) 107 | begin 108 | t.run 109 | wait(timeout_seconds).for {queue.length}.to eq(3*num_events) 110 | expect(queue.length).to eq(3*num_events) 111 | ensure 112 | t.kill 113 | t.join(30_000) 114 | end 115 | end 116 | end 117 | 118 | describe "#kafka-decorate" do 119 | def thread_it(kafka_input, queue) 120 | Thread.new do 121 | begin 122 | kafka_input.run(queue) 123 | end 124 | end 125 | end 126 | 127 | it "should show the right topic and group name in decorated kafka section" do 128 | start = LogStash::Timestamp.now.time.to_i 129 | kafka_input = LogStash::Inputs::Kafka.new(decorate_config) 130 | queue = Queue.new 131 | t = thread_it(kafka_input, queue) 132 | begin 133 | t.run 134 | wait(timeout_seconds).for {queue.length}.to eq(num_events) 135 | expect(queue.length).to eq(num_events) 136 | event = queue.shift 137 | expect(event.get("[@metadata][kafka][topic]")).to eq("logstash_topic_plain") 138 | expect(event.get("[@metadata][kafka][consumer_group]")).to eq(group_id_3) 139 | expect(event.get("[@metadata][kafka][timestamp]")).to be >= start 140 | ensure 141 | t.kill 142 | t.join(30_000) 143 | end 144 | end 145 | end 146 | 147 | describe "#kafka-offset-commit" do 148 | def thread_it(kafka_input, queue) 149 | Thread.new do 150 | begin 151 | kafka_input.run(queue) 152 | end 153 | end 154 | end 155 | 156 | it "should manually commit offsets" do 157 | kafka_input = LogStash::Inputs::Kafka.new(manual_commit_config) 158 | queue = Queue.new 159 | t = thread_it(kafka_input, queue) 160 | begin 161 | t.run 162 | wait(timeout_seconds).for {queue.length}.to eq(num_events) 163 | expect(queue.length).to eq(num_events) 164 | ensure 165 | t.kill 166 | t.join(30_000) 167 | end 168 | end 169 | end 170 | end 171 | 
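For reference, the `decorate_config` scenario exercised above corresponds roughly to the following standalone pipeline configuration. This is a sketch for illustration only: the group id is a placeholder (the spec generates random group ids), and the stdout output stanza is not part of this repository.

input {
  kafka {
    topics            => ["logstash_topic_plain"]
    group_id          => "my_logstash_group"   # placeholder group id
    auto_offset_reset => "earliest"
    decorate_events   => true                  # adds [@metadata][kafka][topic], consumer_group, partition, offset, key, timestamp
  }
}
output {
  stdout { codec => rubydebug { metadata => true } }
}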
-------------------------------------------------------------------------------- /spec/unit/inputs/kafka_spec.rb: -------------------------------------------------------------------------------- 1 | # encoding: utf-8 2 | require "logstash/devutils/rspec/spec_helper" 3 | require "logstash/inputs/kafka" 4 | require "concurrent" 5 | 6 | class MockConsumer 7 | def initialize 8 | @wake = Concurrent::AtomicBoolean.new(false) 9 | end 10 | 11 | def subscribe(topics) 12 | end 13 | 14 | def poll(ms) 15 | if @wake.value 16 | raise org.apache.kafka.common.errors.WakeupException.new 17 | else 18 | 10.times.map do 19 | org.apache.kafka.clients.consumer.ConsumerRecord.new("logstash", 0, 0, "key", "value") 20 | end 21 | end 22 | end 23 | 24 | def close 25 | end 26 | 27 | def wakeup 28 | @wake.make_true 29 | end 30 | end 31 | 32 | describe LogStash::Inputs::Kafka do 33 | let(:config) { { 'topics' => ['logstash'], 'consumer_threads' => 4 } } 34 | subject { LogStash::Inputs::Kafka.new(config) } 35 | 36 | it "should register" do 37 | expect {subject.register}.to_not raise_error 38 | end 39 | end 40 | --------------------------------------------------------------------------------