├── AUTHORS.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── README └── inputs.conf.spec ├── appserver └── static │ └── screenshot.png ├── bin └── mail.py ├── default ├── app.conf ├── authorize.conf ├── inputs.conf ├── props.conf └── transforms.conf ├── lib ├── file_parser │ ├── __init__.py │ ├── docx.py │ ├── email_mime.py │ ├── utils.py │ └── zip.py ├── mail_constants.py ├── mail_exceptions.py ├── mail_utils.py ├── six.py └── splunklib │ ├── __init__.py │ ├── binding.py │ ├── client.py │ ├── data.py │ ├── modularinput │ ├── __init__.py │ ├── argument.py │ ├── event.py │ ├── event_writer.py │ ├── input_definition.py │ ├── scheme.py │ ├── script.py │ ├── utils.py │ └── validation_definition.py │ ├── ordereddict.py │ ├── results.py │ ├── searchcommands │ ├── __init__.py │ ├── decorators.py │ ├── environment.py │ ├── eventing_command.py │ ├── external_search_command.py │ ├── generating_command.py │ ├── internals.py │ ├── reporting_command.py │ ├── search_command.py │ ├── streaming_command.py │ └── validators.py │ └── six.py ├── metadata └── default.meta └── static ├── appIcon.png └── appIcon_2x.png /AUTHORS.md: -------------------------------------------------------------------------------- 1 | ======= 2 | Credits 3 | ======= 4 | 5 | Development Lead 6 | ---------------- 7 | 8 | * [Oluwaseun Remi-Omosowon](mailto:seunomosowon@gmail.com) 9 | 10 | Contributors 11 | ------------ 12 | 13 | * [François Lacombe](mailto:flacombe@adista.fr) 14 | * [Nathan Worsham](mailto:nworsham@gmail.com) 15 | * [Lowell Alleman](mailto:lowell@kintyre.co) 16 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | --- 2 | 3 | Contributing 4 | 5 | --- 6 | 7 | Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given. 8 | 9 | You can contribute in many ways: 10 | 11 | Types of Contributions 12 | 13 | 1. 
Report Bugs 14 | 15 | Report bugs at the [TA-mailclient repo on GitHub](https://github.com/seunomosowon/TA-mailclient/issues). 16 | 17 | If you are reporting a bug, please include: 18 | 19 | * Your operating system name and version. 20 | * Any details about your local setup that might be helpful in troubleshooting. 21 | * Detailed steps to reproduce the bug. 22 | 23 | 2. Fix Bugs 24 | 25 | Look through the GitHub issues for bugs. Anything tagged with "bug" is open to whoever wants to implement it. 26 | 27 | 3. Implement Features 28 | 29 | Look through the GitHub issues for features. Anything tagged with "feature" is open to whoever wants to implement it. 30 | 31 | 4. Write Documentation 32 | 33 | TA-mailclient could always use more documentation. Feel free to add documentation for an undocumented feature. 34 | 35 | 5. Submit Feedback 36 | 37 | Please rate the app on [Splunkbase](https://splunkbase.splunk.com/app/3200/) 38 | You can also send feedback or submit an issue on [GitHub](https://github.com/seunomosowon/TA-mailclient/issues). 39 | 40 | Feature requests can also be submitted in the same way. 41 | Remember that this is a volunteer-driven project, and that contributions are welcome :) 42 | 43 | This has been tested with Gmail, gmx.com, and a few other mail servers. You can also send a list of public mail servers that you use this with without issues. 44 | 45 | Feature requests that are yet to be added to GitHub include the following: 46 | * OAuth support for IMAP 47 | * Additional mailbox folder support for IMAP 48 | * Parameterization of mailbox limits for each run (currently set to 25) 49 | 50 | I'm also working on integrating with Travis CI to allow automatic tests and continuous integration. 51 | 52 | #Guidelines: 53 | 54 | Please fork the repo on [GitHub](https://github.com/seunomosowon/TA-mailclient/) and create a branch for local changes. Create a pull request to the development branch. 
55 | 56 | Thanks again for volunteering :smiley: 57 | 58 | Also remember to add your name to the list of contributors in AUTHORs.md 59 | 60 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 
35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. 
Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | ## Table of Contents 3 | 4 | ### OVERVIEW 5 | 6 | - About the TA-mailclient 7 | - Release notes 8 | - About this release 9 | - New features 10 | - To Do 11 | - Known issues 12 | - Third-party software attributions 13 | - Older Releases 14 | - Support and resources 15 | 16 | ### INSTALLATION AND CONFIGURATION 17 | 18 | - Hardware and software requirements 19 | - Splunk Enterprise system requirements 20 | - Download 21 | - Installation steps 22 | - Deploy to single server instance 23 | - Deploy to distributed deployment 24 | - Deploy to Splunk Cloud 25 | - Configure TA-mailclient 26 | - Parameters 27 | - Upgrade 28 | - Copyright & License 29 | 30 | ### USER GUIDE 31 | 32 | - Data types 33 | - Troubleshooting 34 | - Diagnostic & Debug Logs 35 | 36 | 37 | --- 38 | ### OVERVIEW 39 | 40 | #### About the TA-mailclient 41 | 42 | | Author | Oluwaseun Remi-Omosowon | 43 | | --- | --- | 44 | | App Version | 1.6.0 | 45 | | Vendor Products | | 46 | 47 | The 
TA-mailclient add-on fetches emails for Splunk to index from mailboxes 48 | using either POP3 or IMAP over SSL. 49 | 50 | The modular input also takes the password from inputs.conf in plain text, 51 | and replaces it with a placeholder, while storing it encrypted within Splunk. 52 | This is built using the Splunk SDK for Python and should work on any Splunk 53 | installation with Python available, including a search head cluster (SHC). 54 | Passwords should also get replicated between search head cluster members. 55 | 56 | This only fetches emails from the 'inbox' folder when using POP3. Additional mailbox folders can be indexed when using IMAP. 57 | 58 | Be sure to set the interval to run this as frequently as required. 59 | 60 | It supports all 'text/\*' content types and several well-known script types (.bat, .js, .sh) detailed below: 61 | 62 | ``` 63 | 'application/xml' 64 | 'application/xhtml' 65 | 'application/x-sh' 66 | 'application/x-csh' 67 | 'application/javascript' 68 | 'application/bat' 69 | 'application/x-bat' 70 | 'application/x-msdos-program' 71 | 'application/textedit' 72 | ``` 73 | Images, videos and executables are not indexed. 74 | 75 | ##### Scripts and binaries 76 | 77 | Includes: 78 | - Splunk SDK for Python (1.6.14) 79 | - Six Python 2/3 compatibility library (1.15.0) 80 | - mail_lib - helper modules used by the modular input: 81 | - constants.py - A number of constants / defaults used throughout the mail_lib module. 82 | - mail_common.py - Shared functions used to parse emails and attachments 83 | - exceptions raised by functions used in the mail_lib module. 
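
As a rough illustration of the content-type rule described above (all `text/*` types plus the listed `application/*` types), the check could be sketched in Python as follows. The names `SUPPORTED_APPLICATION_TYPES` and `is_indexable` are hypothetical and are not taken from the add-on's code; this is a minimal sketch, not the add-on's actual implementation.

```python
# Illustrative sketch only: these names are assumptions, not the add-on's API.
SUPPORTED_APPLICATION_TYPES = frozenset([
    "application/xml",
    "application/xhtml",
    "application/x-sh",
    "application/x-csh",
    "application/javascript",
    "application/bat",
    "application/x-bat",
    "application/x-msdos-program",
    "application/textedit",
])


def is_indexable(content_type):
    """Return True for text/* or one of the listed application/* types."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    main_type = content_type.split(";", 1)[0].strip().lower()
    return main_type.startswith("text/") or main_type in SUPPORTED_APPLICATION_TYPES
```

With this sketch, `is_indexable("text/plain; charset=utf-8")` is true, while an image type such as `image/png` is rejected, matching the statement that images, videos and executables are not indexed.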
84 | 85 | #### Release notes 86 | 87 | ##### About this release 88 | 89 | Version 1.6.0 of the TA-mailclient is compatible with: 90 | 91 | | Splunk Enterprise versions | 8.x, 7.x | 92 | | --- | --- | 93 | | CIM | Not Applicable | 94 | | Platforms | Platform independent | 95 | | Lookup file changes | No lookups included in this app | 96 | 97 | This version removes support for unencrypted connections to mailboxes to allow the app to pass Splunk Certification. 98 | The _is_secure_ parameter is no longer required and should be removed from the config. 99 | 100 | The administrator is responsible for setting the sourcetype to whatever is desired, 101 | as well as extracting CIM fields for the sourcetype. 102 | This app already includes several extractions for different parts of the message that can be reused. 103 | 104 | This app will not work on a universal forwarder, 105 | as it requires Python, which comes with a heavy forwarder (HF) or a full Splunk install. 106 | 107 | **Note:** Travis CI includes tests for the secure versions of both POP3 and IMAP. 108 | 109 | ##### New features 110 | 111 | TA-mailclient includes the following new features: 112 | 113 | - Added support for Python 3 114 | - Added six 1.15.0 115 | - Upgraded Splunk SDK to 1.6.14 116 | - Fixed CI/CD tests to work for POP3 on v7.3, fixed testing 117 | - Added fix for working with zip and docx files under Python 2 and Python 3 118 | - Added support for indexing emails from additional folders when using IMAP 119 | 120 | ##### To Do 121 | 122 | - Add attachment file hash to Splunk 123 | - Add support for doc / ppt / pptx 124 | 125 | ##### Known issues 126 | 127 | This is currently tested against 7.3, 8.0 and the latest version of Splunk Enterprise (v8.1 as of the time of this writing). 128 | Issues can be reported and tracked on GitHub at this time. 129 | 130 | 131 | ##### Third-party software attributions 132 | 133 | This uses the built-in poplib and imaplib modules that come with Python by default. 
134 | 135 | Contributions on GitHub are welcome and will be incorporated into the main release. 136 | Current contributors are listed in AUTHORS.md. 137 | 138 | 139 | ##### Older Releases 140 | * v1.6.0 141 | * Includes support for dropping attachments 142 | * Migrated CI/CD to CircleCI 143 | * Added appinspect testing to CI/CD pipeline 144 | * v1.5.5 145 | * Improved support for Python 3 146 | * Improved coding style to match new Splunk standards 147 | * Fixed bugs related to indexing zip and docx as a result of Python 2-3 compatibility 148 | * v1.4.0 149 | * Included support for Splunk v8.0 150 | * v1.3.5 151 | * Fixed bug introduced in v1.3.0 152 | * v1.3.0 153 | * Made it more modular to support more file types in zips and in emails 154 | * Added support for zips and files within zips 155 | * Fixed unicode conversion of emails following contributions from Francois Lacombe on GitHub 156 | - Also added static mail preamble for line break. Event breaking configuration may not be 157 | required since the modular input writes individual events separately, but it's always a good idea. 
158 | * Additional logging from pop3 / imap 159 | * Removed interval from inputs.conf.spec 160 | * Upgraded Splunk SDK to 1.6.2 161 | * Added additional test cases on Travis CI to test that functionality works 162 | * Modularized storage/password functions to make them reusable and simpler 163 | * Also fixed exception handling when dealing with storage/password 164 | * Fixed type casting for boolean parameters (is\_secure, include\_headers) and port validation 165 | * Rewrote sections of mail\_common 166 | * Merged functions from poputils / imaputils into main code and added additional logs from connection 167 | 168 | * v0.5.1 169 | * encoding corrections 170 | * deduplicate Date and MessageId from indexed headers 171 | * correction of MessageID extraction 172 | * changed the separator to a predefined one instead of Date and MessageID 173 | * activated and changed label for unsupported attachment 174 | 175 | * v0.5.0 176 | * Fixed UTF-8 encoding of mails before indexing. (Supporting Gmail and others) 177 | 178 | * v0.4.9 179 | * Changed encoding to support reading Gmail. 180 | 181 | * v0.4.8 182 | * removed error introduced in v0.4.7 183 | 184 | * v0.4.7 185 | * Removed password field validation to allow users to have complex or easy passwords of any length 186 | * Handled all mail exceptions 187 | 188 | * v0.4.6 189 | * Fixed bug. 190 | * Fixed header inclusion 191 | 192 | * v0.4.5 193 | * Fixed bug. Removed line which caused v0.4.4 to fail 194 | * Fixed header inclusion 195 | 196 | * v0.4.4 197 | * Updated app to ignore case of file attachment extension 198 | 199 | * v0.4.3 200 | * Made extensions case insensitive 201 | * Added support for indexing _.docx_ extensions 202 | * Generalised ```Mail.save_password()``` to allow reuse of code when writing other modular inputs. 
203 | * Optimized python import statements 204 | * Fixed deleting of mails in poplib which was broken in 0.4 205 | 206 | * v0.4.2 207 | * Added support for indexing mail headers 208 | 209 | * v0.4.1 210 | * Fixed bug with 0.4.0 211 | * Made updates to fix unneeded else statement which introduced bug in 0.4.0. 212 | 213 | * v0.4 214 | * Added support for decoding unicode characters in other languages and removing the unicode identifier in the header. 215 | * Improved support for indexing some file types even if the content-type is not set correctly. (as with Microsoft sending some files as binaries instead of text) 216 | * Added fundamental code to support indexing of attachments as a user-configurable option in a future release. 217 | * Added multiple field extractions for the email header and file attachments. 218 | * Introduced a bug which was corrected in 0.4.1 **Faulty version** 219 | 220 | **Note:** _filename_ and _filecontent_ are multi-value fields. 221 | 222 | * v0.3 223 | * Adds support for mailbox cleanup options 224 | 225 | * v0.2 226 | * Adds support for base64 encoded emails. 227 | 228 | 229 | #### Support and resources 230 | 231 | **Questions and answers** 232 | 233 | Access questions and answers specific to the TA-mailclient at (https://answers.splunk.com/). 234 | 235 | **Support** 236 | 237 | This Splunk support add-on is community / developer supported. 238 | 239 | Questions asked on Splunk Answers will be answered either by the community of users or by the developer when available. 240 | All support questions should include the version of Splunk and OS. 241 | 242 | You can also contact the developer directly via [Splunkbase](https://splunkbase.splunk.com/app/3200/). 243 | Feedback and feature requests can also be sent via Splunkbase. 244 | 245 | Issues can also be submitted at the [TA-mailclient repo on GitHub](https://github.com/seunomosowon/TA-mailclient/issues) 246 | 247 | Future releases will support 248 | 1. 
Support for configuration of mail limits in inputs.conf 249 | 2. Recursive option to read all folders inside Inbox, and not just emails within inbox. 250 | 3. Support indexing mails from additional folders in a mailbox 251 | 252 | **Note** : This has not been tested against an exhaustive list of mail servers, so I'll welcome the feedback. 253 | 254 | Also, feel free to send me a list of well-known servers that you're using this with without problems. 255 | 256 | Rate the add-on on [Splunkbase](https://splunkbase.splunk.com/app/3200/) if you use it and are happy with it, 257 | and share your feedback. Thanks! 258 | 259 | 260 | ## INSTALLATION AND CONFIGURATION 261 | ### Hardware and software requirements 262 | 263 | #### Hardware requirements 264 | 265 | TA-mailclient supports the following server platforms in the versions supported by Splunk Enterprise: 266 | 267 | - Linux 268 | - Windows 269 | 270 | The app was developed to be platform agnostic, but tests are mostly run on Linux. 271 | 272 | Please contact the developer with issues running this on Windows. See the Splunk documentation for hardware 273 | requirements for running a heavy forwarder. 274 | 275 | #### Software requirements 276 | 277 | To function properly, TA-mailclient has no external requirements but needs to be installed on a full Splunk 278 | install which provides Python and the required libraries (poplib and imaplib). 279 | 280 | #### Splunk Enterprise system requirements 281 | 282 | Because this add-on runs on Splunk Enterprise, all of the [Splunk Enterprise system requirements](http://docs.splunk.com/Documentation/Splunk/latest/Installation/Systemrequirements) apply. 
283 | 284 | #### Download 285 | 286 | Download the TA-mailclient at one of the following locations: 287 | - [Splunkbase](https://splunkbase.splunk.com/app/3200/#/details) 288 | - [GitHub](https://github.com/seunomosowon/TA-mailclient) 289 | 290 | #### Installation steps 291 | 292 | ##### Deploy to single server instance 293 | 294 | To install and configure this app on your supported standalone platform, do one of the following: 295 | 296 | - Install on a standalone Splunk Enterprise install via the GUI. [See Link](https://docs.splunk.com/Documentation/AddOns/released/Overview/Singleserverinstall) 297 | - Extract the technology add-on to ```$SPLUNK_HOME/etc/apps/``` and restart Splunk 298 | 299 | ##### Deploy to distributed deployment 300 | 301 | **Install to search head** - (Standalone or Search head cluster) 302 | 303 | - Deploy the props.conf and transforms.conf from TA-mailclient to the search head. 304 | If using a search head cluster, deploy the props.conf and transforms.conf via a search head deployer. 305 | 306 | 307 | **Install to indexers** 308 | 309 | - No app needs to be installed on indexers 310 | 311 | **Install to forwarders** 312 | 313 | - Follow the steps to install the TA-mailclient on a heavy forwarder. 314 | More instructions are available at the following [URL](https://docs.splunk.com/Documentation/AddOns/released/Overview/Distributedinstall#Heavy_forwarders) 315 | 316 | - Configure an email input by going to the setup page or configuring inputs.conf. 317 | 318 | ##### Deploy to Splunk Cloud 319 | 320 | For Splunk Cloud installations, install TA-mailclient on a heavy forwarder that has been configured to forward 321 | events to your Splunk Cloud instance. 322 | The sourcetype is set by the administrator of the heavy forwarder when configuring the inputs. 323 | 324 | You can work with Splunk Support on installing the Support add-on on Splunk Cloud for parsing the mails collected. 
325 | 326 | 327 | #### Configure TA-mailclient 328 | 329 | This app adds a mail:// modular input and supports a variety of parameters in inputs.conf. 330 | 331 | ``` 332 | [mail://email_address@domain.com] 333 | interval = 600 334 | mailserver = imap.domain.com 335 | password = mypassword 336 | protocol = IMAP|POP3 337 | disabled = 0 338 | mailbox_cleanup = delete 339 | additional_folders = test,rfc,spam 340 | 341 | ``` 342 | 343 | Once the input is read, the password gets replaced and shows as 'encrypted'. 344 | As such, the password for the mailbox must not be set to 'encrypted'. 345 | 346 | The input can be edited if the password needs to be updated, and the password stored in a password 347 | storage endpoint would get updated automatically. After that first read, passwords are not kept in clear text. 348 | 349 | A different sourcetype can be specified for each input, thus making it possible to have different sourcetypes 350 | for every mailbox. Mailbox cleanup is also managed automatically according to ```mailbox_cleanup```; with ```delete```, emails are deleted once they have been 351 | indexed. 352 | 353 | ##### Parameters 354 | 355 | **mailserver** - This is a mandatory field and should be the hostname or 356 | IP address for the mail server or client access server with support for retrieving emails via POP3 or IMAP 357 | 358 | **protocol** - This must be set to either POP3 or IMAP 359 | 360 | **password** - Passwords must be set for every account, 361 | or the input will get disabled. 362 | 363 | **mailbox_cleanup** - This indicates if every email should be deleted as it is read, 364 | or if deletion should be delayed until the next interval. 365 | Setting this to ```readonly``` prevents mails from being deleted. 366 | 367 | The default is ```readonly```. Supported options are: 368 | ```delayed|delete|readonly``` 369 | 370 | **interval** - This should be configured to run as frequently as required 371 | to retrieve emails. This modular input retrieves up to 20 emails at each run. 
372 | A future release of this input might allow the limit to be configured as a parameter to the modular input. 373 | 374 | This modular input supports multiple instances, and each input runs at its own interval. 375 | 376 | **include_headers** - This determines if email headers should be included. 377 | 378 | **additional_folders** - This is an optional parameter containing a comma-separated list of additional folders to be indexed if IMAP is configured for the mailbox. 379 | 380 | **drop_attachment** - This is an optional parameter to determine if email attachments should be discarded. 381 | 382 | ### Copyright & License 383 | 384 | A copy of the Apache License, Version 2.0 has been added to the add-on (see LICENSE), detailing its license. 385 | 386 | 387 | ## USER GUIDE 388 | 389 | ### Data types 390 | 391 | Data is indexed using a sourcetype specified by the administrator when configuring the inputs. 392 | If nothing is specified, events will get indexed with a sourcetype of `mail`. 393 | 394 | ### Troubleshooting 395 | 396 | Once an email is indexed, it will not be re-indexed unless the checkpoint directory is emptied. 397 | This can be achieved by running the following command: 398 | ``` 399 | splunk clean inputdata mail 400 | ``` 401 | 402 | #### Diagnostic & Debug Logs 403 | 404 | Logs can be found by searching the Splunk internal logs: 405 | 406 | ```index=_internal sourcetype=splunkd (component=ModularInputs OR component=ExecProcessor) mail.py``` 407 | 408 | 409 | Additional logging can be enabled by turning on debug logging for ExecProcessor and ModInputs. 410 | Set the logging level of both components to DEBUG: 411 | 412 | /opt/splunk/bin/splunk set log-level ExecProcessor -level DEBUG 413 | /opt/splunk/bin/splunk set log-level ModInputs -level DEBUG 414 | 415 | You can find additional ways to enable debug logging 416 | [here](http://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/Enabledebuglogging). 
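
As a hedged illustration of the parameter rules described under "Configure TA-mailclient", the sketch below checks `mail://` stanzas before they are deployed. It is not part of the add-on: the function name `check_mail_stanzas` is hypothetical, and it assumes the stanza text is simple enough for Python's `configparser` to read, which is not true of every Splunk .conf file (for example, files with repeated stanzas).

```python
import configparser

# Values taken from the Parameters section; the names here are illustrative only.
VALID_PROTOCOLS = {"POP3", "IMAP"}
VALID_CLEANUP = {"delete", "delayed", "readonly"}


def check_mail_stanzas(conf_text):
    """Return a list of (stanza, problem) tuples for mail:// stanzas."""
    # interpolation=None so that '%' characters in passwords are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    problems = []
    for stanza in parser.sections():
        if not stanza.startswith("mail://"):
            continue
        opts = parser[stanza]
        # mailserver, password and protocol are described as required.
        for required in ("mailserver", "password", "protocol"):
            if not opts.get(required, "").strip():
                problems.append((stanza, "missing " + required))
        protocol = opts.get("protocol", "").strip().upper()
        if protocol and protocol not in VALID_PROTOCOLS:
            problems.append((stanza, "protocol must be POP3 or IMAP"))
        # readonly is the documented default when mailbox_cleanup is unset.
        cleanup = opts.get("mailbox_cleanup", "readonly").strip()
        if cleanup not in VALID_CLEANUP:
            problems.append((stanza, "unsupported mailbox_cleanup: " + cleanup))
        # additional_folders is only described for IMAP mailboxes.
        if opts.get("additional_folders", "").strip() and protocol != "IMAP":
            problems.append((stanza, "additional_folders only applies to IMAP"))
    return problems
```

An empty list means no problems were found. The check is deliberately conservative and only covers the rules stated in this README.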
417 | -------------------------------------------------------------------------------- /README/inputs.conf.spec: -------------------------------------------------------------------------------- 1 | [mail://] 2 | * The name of the stanza should be an email address which would be used to connect to the server. 3 | 4 | protocol = [POP3|IMAP] 5 | * The protocol to be used to fetch emails from the server 6 | 7 | mailserver = 8 | * This is the mailserver to fetch mails from 9 | 10 | password = 11 | * The password for the account provided in the stanza name 12 | 13 | mailbox_cleanup = [delete,delayed,readonly] 14 | * This determines if the mails should be one of the following: 15 | * delete: deleted as they are indexed 16 | * delayed: deleted on next connection to the mailbox after verifying that the mail was indexed 17 | * readonly: mails will not be deleted. It will be read and left in the mailbox. 18 | * If this is not set, the default option used will be readonly 19 | 20 | include_headers = 21 | * This determines if email headers should be included. 
22 | 23 | maintain_rfc = 24 | * This determines if email will still maintain RFC compatability for parsing tools 25 | 26 | attach_message_primary = 27 | * This determines if an attached message will instead be the indexed email (assuming the outer message was just the delivery mechanism) 28 | 29 | additional_folders = 30 | * This suggests additional folders to read messages via IMAP 31 | 32 | drop_attachment = 33 | * This determines if an email attachment will be indexed -------------------------------------------------------------------------------- /appserver/static/screenshot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/seunomosowon/TA-mailclient/b4745263d53f03e06edf098a665a5597d40fe449/appserver/static/screenshot.png -------------------------------------------------------------------------------- /default/app.conf: -------------------------------------------------------------------------------- 1 | [install] 2 | is_configured = 0 3 | 4 | [ui] 5 | is_visible = 0 6 | label = Technology Add-on for Mail retrieval 7 | 8 | [launcher] 9 | author = seunomosowon 10 | description = Get mails from a mail server via POP3 or IMAP 11 | version = 1.6.0 12 | 13 | [package] 14 | id = TA-mailclient 15 | check_for_updates = true 16 | -------------------------------------------------------------------------------- /default/authorize.conf: -------------------------------------------------------------------------------- 1 | [capability::edit_modinput_mail] 2 | # Capability required to add mail inputs and edit settings. 
3 | 4 | [role_admin] 5 | edit_modinput_mail = enabled 6 | -------------------------------------------------------------------------------- /default/inputs.conf: -------------------------------------------------------------------------------- 1 | [mail] 2 | python.version = python3 3 | -------------------------------------------------------------------------------- /default/props.conf: -------------------------------------------------------------------------------- 1 | [source::mail:\/\/...] 2 | KV_MODE = auto 3 | SHOULD_LINEMERGE=false 4 | MAX_EVENTS=5000 5 | LINE_BREAKER=(VGhpcyBpcyBhIG1haWwgc2VwYXJhdG9yIGluIGJhc2U2NCBmb3Igb3VyIFNwbHVuayBpbmRleGluZwo=[\r\n]+) 6 | TIME_PREFIX= \nDate: 7 | MAX_TIMESTAMP_LOOKAHEAD = 32 8 | TIME_FORMAT= %a, %d %b %Y %H:%M:%S %z 9 | TRUNCATE=200000 10 | REPORT-file_attachments = file_attachment 11 | REPORT-multi_part = multi_part 12 | REPORT-attachment_filename = attachment_filename:kvextraction 13 | REPORT-attachment_md5 = attachment_md5:kvextraction 14 | REPORT-attachment_sha256 = attachment_sha256:kvextraction 15 | EXTRACT-Message_ID = (?i)^Message-ID:\h+[^\r\n>]+?)>?$ 16 | EXTRACT-From = ^From:\h+(?(?:"?(?[^<\r\n]+)"?\h+)?[^\r\n]+?)>?)$ 17 | EXTRACT-Subject = ^Subject:\h+(?[^\r\n]+)$ 18 | EXTRACT-TO = ^To:\h+(?(?:"?(?[^<\r\n]+)"?\h+)?[^\r\n]+?)[^\r\n]+)[\r\n]+(?.*)#END_ATTACHMENT:\s*\g{file_name} 3 | MV_ADD=true 4 | 5 | [multi_part] 6 | REGEX=(?ms)[\r\n]#START_OF_MULTIPART_(\d+)[\r\n](?.*)[\r\n]#END_OF_MULTIPART_\1[\r\n]* 7 | MV_ADD=true 8 | 9 | [attachment_md5:kvextraction] 10 | FORMAT = md5::$1 11 | REGEX = md5\s=\s(\w+) 12 | MV_ADD = true 13 | 14 | [attachment_sha256:kvextraction] 15 | FORMAT = sha256::$1 16 | REGEX = sha256\s=\s(\w+) 17 | MV_ADD = true 18 | 19 | [attachment_filename:kvextraction] 20 | FORMAT = file_name::$1 21 | REGEX = file_name\s=\s((?!None\s)[^\.]+(?:\.\w+)?)\s 22 | MV_ADD = true 23 | -------------------------------------------------------------------------------- /lib/file_parser/__init__.py: 
-------------------------------------------------------------------------------- 1 | from .utils import * 2 | 3 | __version_info__ = (1, 3, 0) 4 | __version__ = ".".join(map(str, __version_info__)) 5 | __all__ = ['ZIP_EXTENSIONS', 'TEXT_FILE_EXTENSIONS', 'SUPPORTED_CONTENT_TYPES', 6 | 'email_mime', 'docx', 'zip'] 7 | -------------------------------------------------------------------------------- /lib/file_parser/docx.py: -------------------------------------------------------------------------------- 1 | """ Parse .docx files """ 2 | from __future__ import unicode_literals 3 | 4 | from .utils import * 5 | from xml.dom.minidom import parse as parsexml 6 | from six import text_type, binary_type, BytesIO 7 | from six import ensure_binary, ensure_str 8 | import zipfile 9 | 10 | 11 | def parse_docx(part, part_name): 12 | """ 13 | This reads a docx file form a string and outputs just the text from the document 14 | along with the document's internal structure 15 | :param part: This is a MIME part from an email that contains a docx file 16 | :type part: Union[email.message.Message, basestring] 17 | :param part_name: This can be either a file name or string $EMAIL$ 18 | :type part_name basestring 19 | :return: This returns the texts from the word document. 20 | :rtype: list 21 | """ 22 | if part_name == EMAIL_PART: 23 | decoded_payload = part.get_payload(decode=True) 24 | zip_name = part.get_filename() or '' 25 | else: 26 | decoded_payload = part 27 | zip_name = part_name 28 | fp = BytesIO(decoded_payload) 29 | try: 30 | zfp = zipfile.ZipFile(fp) 31 | except zipfile.BadZipfile: 32 | return ['#UNSUPPORTED_ATTACHMENT: %s' % zip_name] 33 | return_doc = [] 34 | if zfp: 35 | return_doc.append(parsexml(zfp.open('[Content_Types].xml', 'r')).documentElement.toprettyxml()) 36 | """ 37 | I can check for Macros here 38 | if zfp.getinfo('word/vbaData.xml'): 39 | openXML standard supports any name for xml file. Need to check all files. 
40 | Add the contents pages to the top of word file for visual inspection of macros 41 | """ 42 | if 'word/document.xml' in zfp.namelist(): # getinfo() raises KeyError for a missing member instead of returning a falsy value 43 | doc_xml = parsexml(zfp.open('word/document.xml', 'r')) 44 | return_doc.append(''.join([ensure_str(node.firstChild.nodeValue) for node in doc_xml.getElementsByTagName('w:t') if node.firstChild is not None])) 45 | else: 46 | return_doc.append('#UNSUPPORTED_DOCX_FILE: file_name = %s' % zip_name) 47 | else: 48 | return_doc.append('#INVALID_DOCX_FILE: file_name = %s' % zip_name) 49 | return return_doc 50 | 51 | 52 | def parse_docx_from_mail(message): 53 | """ 54 | Parse a docx attachment carried in a MIME part. 55 | :param message: MIME part containing the docx file 56 | :type message: email.message.Message 57 | :return: the list produced by parse_docx 58 | """ 59 | return parse_docx(message, EMAIL_PART) 60 | 61 | 62 | def parse_docx_from_string(docx_as_string, file_name): 63 | """ 64 | Parse a docx file that is already held in memory. 65 | :param docx_as_string: string representation of docx file 66 | :type docx_as_string: basestring 67 | :param file_name: docx file name 68 | :type file_name: basestring 69 | :return: the list produced by parse_docx 70 | """ 71 | return parse_docx(docx_as_string, file_name) 72 | -------------------------------------------------------------------------------- /lib/file_parser/email_mime.py: -------------------------------------------------------------------------------- 1 | """ Parse email files """ 2 | from __future__ import unicode_literals 3 | from six import text_type, binary_type 4 | 5 | import email 6 | import re 7 | import os 8 | from . import zip 9 | import hashlib 10 | import quopri 11 | # noinspection PyUnresolvedReferences 12 | from base64 import b64decode 13 | try: 14 | from email.parser import Parser 15 | except ImportError: 16 | # Python 2 17 | from email.Parser import Parser 18 | 19 | from email.utils import mktime_tz, parsedate_tz 20 | from .utils import * 21 | 22 | 23 | def parse_email(email_as_string, include_headers, maintain_rfc, attach_message_primary): 24 | """ 25 | This function parses an email and returns an array with different parts of the message.
26 | :param email_as_string: This represents the email in a bytearray to be processed 27 | :type email_as_string: basestring 28 | :param include_headers: This parameter specifies if all headers should be included. 29 | :type include_headers: bool 30 | :param maintain_rfc: This parameter specifies if RFC format for email stays intact 31 | :type maintain_rfc: bool 32 | :param attach_message_primary: This parameter specifies if first attached email should 33 | be used as the message for indexing instead of the carrier email 34 | :type attach_message_primary: bool 35 | :return: Returns a list with the [date, Message-id, mail_message] 36 | :rtype: list 37 | """ 38 | message = email.message_from_string(email_as_string.strip()) or None 39 | if message is None: 40 | return [None, None, None] 41 | if attach_message_primary: 42 | message = change_primary_message(message) 43 | if maintain_rfc: 44 | index_mail = maintain_rfc_parse(message) 45 | else: 46 | mailheaders = Parser().parsestr(message.as_string(), True) 47 | headers = ["%s: %s" % (k, getheader(v)) for k, v in mailheaders.items() if k in MAIN_HEADERS] 48 | if include_headers: 49 | other_headers = ["%s: %s" % (k, getheader(v)) for k, v in mailheaders.items() if k not in MAIN_HEADERS] 50 | headers.extend(other_headers) 51 | body = [] 52 | if message.is_multipart(): 53 | part_number = 1 54 | for part in message.walk(): 55 | content_type = part.get_content_type() 56 | content_disposition = part.get('Content-Disposition') 57 | if content_type in ['multipart/alternative', 'multipart/mixed']: 58 | # The multipart/alternative part is usually empty. 
59 | body.append("Multipart envelope header: %s" % str(part.get_payload(decode=True))) 60 | continue 61 | body.append("#START_OF_MULTIPART_%d" % part_number) 62 | extension = str(os.path.splitext(part.get_filename() or '')[1]).lower() 63 | if extension in TEXT_FILE_EXTENSIONS or content_type in SUPPORTED_CONTENT_TYPES or \ 64 | part.get_content_maintype() == 'text' or extension in ZIP_EXTENSIONS: 65 | if part.get_filename(): 66 | body.append("#BEGIN_ATTACHMENT: %s" % str(part.get_filename())) 67 | if extension in ZIP_EXTENSIONS: 68 | body.append("\n".join(zip.parse_zip(part, EMAIL_PART))) 69 | else: 70 | body.append(recode_mail(part)) 71 | body.append("#END_ATTACHMENT: %s" % str(part.get_filename())) 72 | else: 73 | body.append(recode_mail(part)) 74 | else: 75 | body.append("#UNSUPPORTED_ATTACHMENT: file_name = %s - type = %s ; disposition=%s" % ( 76 | part.get_filename(), content_type, content_disposition)) 77 | body.append("#END_OF_MULTIPART_%d" % part_number) 78 | part_number += 1 79 | else: 80 | body.append(recode_mail(message)) 81 | """mail_for_index = [MESSAGE_PREAMBLE]""" 82 | mail_for_index = [] 83 | mail_for_index.extend(headers + body) 84 | index_mail = '\n'.join(s.decode('utf-8', 'ignore') if isinstance(s, binary_type) else s for s in mail_for_index) 85 | message_time = float(mktime_tz(parsedate_tz(message['Date']))) 86 | return [message_time, message['Message-ID'], index_mail] 87 | 88 | def change_primary_message(message): 89 | """ 90 | This function will look for an attached email and return it. This is inteded to use 91 | the attached email as the email to be indexed instead of the carrier email. 92 | It checks if the message is already in message format or in a binary format and also 93 | only the first attached email will become the primary if there are more than one. 94 | :param message: This represents the email to be checked for attached email. 
95 | :type message: email message object 96 | :return: Returns the attached email if one is found, otherwise the original message 97 | :rtype: email message object 98 | """ 99 | for i in message.walk(): 100 | if i.get_content_maintype()=='message': 101 | return i.get_payload()[0] 102 | elif i.get_content_subtype()=='octet-stream' and (i.get_filename() or '').lower().endswith('.eml'): 103 | if (i['Content-Transfer-Encoding'] or '').lower()=='base64': 104 | return email.message_from_string(b64decode(i.get_payload()).decode('utf-8', 'ignore')) # b64decode returns bytes on Python 3 105 | else: 106 | return email.message_from_string(i.get_payload()) 107 | return message # no attached email found, so keep the carrier message 108 | def maintain_rfc_parse(message): 109 | """ 110 | This function parses an email and returns it as a string with unsupported attachments replaced by metadata, 111 | but leaves the email still RFC compliant so that it works with the Mail-Parser Plus app. 112 | Attachment headers are left intact. 113 | :param message: This represents the email to be reformatted. 114 | :type message: email message object 115 | :return: Returns an email message formatted as a string 116 | :rtype: str 117 | """ 118 | if not message.is_multipart(): 119 | reformatted_message = quopri.decodestring( 120 | message.as_string().encode('ascii', 'ignore') 121 | ).decode("utf-8", 'ignore') 122 | return reformatted_message 123 | boundary = message.get_boundary() 124 | new_payload = '--' + boundary 125 | for i in message.get_payload(): 126 | content_type = i.get_content_type() 127 | extension = str(os.path.splitext(i.get_filename() or '')[1]).lower() 128 | if extension in TEXT_FILE_EXTENSIONS or content_type in SUPPORTED_CONTENT_TYPES or \ 129 | i.get_content_maintype() == 'text': 130 | text_content = i.as_string().encode('ascii', 'ignore') 131 | text_content = quopri.decodestring(text_content).decode("utf-8", 'ignore') 132 | new_payload += '\n' + text_content 133 | else: 134 | replace = re.sub(r'(?:\n\n)[\s\S]+',r'\n\n#UNSUPPORTED_ATTACHMENT:',i.as_string()) 135 | filename = i.get_filename() 136 | charset = i.get_content_charset() 137 | try: 138 | md5 =
hashlib.md5(i.get_payload(None,True)).hexdigest() 139 | sha256 = hashlib.sha256(i.get_payload(None,True)).hexdigest() 140 | except: 141 | md5 = '' 142 | sha256 = '' 143 | replace_string = """ 144 | file_name = %(filename)s 145 | type = %(content_type)s 146 | charset = %(charset)s 147 | md5 = %(md5)s 148 | sha256 = %(sha256)s 149 | """ 150 | metadata = replace_string % dict( 151 | content_type=content_type, 152 | filename=filename, 153 | charset=charset, 154 | md5=md5, 155 | sha256=sha256, 156 | ) 157 | new_payload += '\n' \ 158 | + replace \ 159 | + metadata 160 | new_payload += '\n--' + boundary 161 | new_payload += '--' 162 | message.set_payload(new_payload) 163 | return message.as_string() 164 | -------------------------------------------------------------------------------- /lib/file_parser/utils.py: -------------------------------------------------------------------------------- 1 | """ 2 | This includes common functions that are required when dealing with mails 3 | """ 4 | from __future__ import unicode_literals 5 | 6 | from email.header import decode_header 7 | from six import text_type, binary_type 8 | 9 | MAIN_HEADERS = ('Date', 'Message-Id', 'Message-ID', 'From', 'To', 'Subject') 10 | ZIP_EXTENSIONS = {'.zip', '.docx'} 11 | EMAIL_PART = '$EMAIL$' 12 | SUPPORTED_CONTENT_TYPES = {'application/xml', 'application/xhtml', 'application/x-sh', 'application/x-csh', 13 | 'application/javascript', 'application/bat', 'application/x-bat', 14 | 'application/x-msdos-program', 'application/textedit', 15 | 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'} 16 | TEXT_FILE_EXTENSIONS = {'.csv', '.txt', '.md', '.py', '.bat', '.sh', '.rb', '.js', '.asm', '.log'} 17 | """ 18 | It already indexes all text/* including: 19 | 'text/plain', 'text/html', 'text/x-asm', 'text/x-c','text/x-python-script','text/x-python' 20 | No need to add this to the supported types list 21 | """ 22 | 23 | 24 | def getheader(header_text, default="ascii"): 25 | """ This decodes 
sections of the email header which could be represented in utf8 or other iso languages""" 26 | headers = decode_header(header_text) 27 | header_sections = [text if isinstance(text, text_type) else text_type(text, charset or default, "ignore") for text, charset in headers] 28 | return "".join(header_sections) 29 | 30 | 31 | def recode_mail(part): 32 | cset = part.get_content_charset() 33 | if cset == "None": 34 | cset = "ascii" 35 | try: 36 | if not part.get_payload(decode=True): 37 | result = "" 38 | else: 39 | result = text_type(part.get_payload(decode=True), cset, "ignore").encode('utf8', 'xmlcharrefreplace').strip() 40 | except TypeError: 41 | result = part.get_payload(decode=True) 42 | if isinstance(result, text_type): 43 | result = result.encode('utf8', 'xmlcharrefreplace').strip() 44 | return result 45 | -------------------------------------------------------------------------------- /lib/file_parser/zip.py: -------------------------------------------------------------------------------- 1 | """Parse zip files""" 2 | from __future__ import unicode_literals 3 | from six import text_type, binary_type, BytesIO 4 | from six import ensure_binary, ensure_str 5 | from .utils import * 6 | from . import docx 7 | import os 8 | import zipfile 9 | 10 | 11 | def parse_zip(part, part_name): 12 | """ 13 | This reads a docx file form a string and outputs just the text from the document 14 | along with the document's internal structure 15 | :param part: This is a MIME message part from an email that contains a docx file 16 | :type part: Union[email.message.Message, basestring] 17 | :param part_name: This can be either file or email 18 | :type part_name basestring 19 | :return: This returns the texts from the word document. 
20 | :rtype: list 21 | """ 22 | if EMAIL_PART == part_name: 23 | decoded_payload = part.get_payload(decode=True) 24 | zip_name = part.get_filename() or '' 25 | else: 26 | decoded_payload = part 27 | zip_name = part_name 28 | fp = BytesIO(decoded_payload) 29 | try: 30 | zfp = zipfile.ZipFile(fp) 31 | except zipfile.BadZipfile: 32 | return ['#UNSUPPORTED_ATTACHMENT: %s' % zip_name] 33 | extension = os.path.splitext(zip_name)[1].lower() 34 | unzip_content = [] 35 | if zfp: 36 | ziplist = ['#BEGIN_ZIP_FILELIST: %s' % zip_name] 37 | ziplist.extend(zfp.namelist()) 38 | ziplist.append('#END_ZIP_FILELIST: %s' % zip_name) 39 | unzip_content.append("\n".join(ziplist)) 40 | if '.docx' == extension: 41 | unzip_content.extend(docx.parse_docx(part, part_name)) 42 | else: 43 | for each_compressedfile in zfp.namelist(): 44 | zipped_file = [] 45 | if not each_compressedfile.endswith('/'): 46 | zipped_fextension = text_type(os.path.splitext(each_compressedfile)[1]).lower() 47 | zipped_file = ["#BEGIN_ATTACHMENT: %s/%s" % (zip_name, each_compressedfile)] 48 | if zipped_fextension in TEXT_FILE_EXTENSIONS: 49 | f = zfp.open(each_compressedfile) 50 | for line in f: 51 | zipped_file.append(ensure_str(line, errors='ignore').rstrip('\r\n')) # tolerate non-UTF-8 bytes and CRLF line endings 52 | elif zipped_fextension in ZIP_EXTENSIONS: 53 | file_buff = zfp.open(each_compressedfile).read() 54 | zipped_file.extend(parse_zip(file_buff, each_compressedfile)) 55 | else: 56 | zipped_file.append("#UNSUPPORTED_CONTENT: file_name = %s" % each_compressedfile) 57 | zipped_file.append("#END_ATTACHMENT: %s/%s" % (zip_name, each_compressedfile)) 58 | unzip_content.append("\n".join(zipped_file)) 59 | return unzip_content 60 | 61 | 62 | def parse_zip_from_mail(message): 63 | """ 64 | Parse a zip attachment carried in a MIME part. 65 | :param message: MIME part containing the zip file 66 | :type message: email.message.Message 67 | :return: the list produced by parse_zip 68 | """ 69 | return parse_zip(message, EMAIL_PART) 70 | 71 | 72 | def parse_zip_from_string(file_as_string, file_name): 73 | """ 74 | 75 | :param file_as_string: string representation of
zip file 76 | :type file_as_string: basestring 77 | :param file_name: zip file name 78 | :type file_name: basestring 79 | :return: the list produced by parse_zip 80 | """ 81 | return parse_zip(file_as_string, file_name) 82 | -------------------------------------------------------------------------------- /lib/mail_constants.py: -------------------------------------------------------------------------------- 1 | # DEFAULTS 2 | from __future__ import unicode_literals 3 | 4 | IMAP_READONLY_FLAG = True 5 | INDEX_ATTACHMENT_DEFAULT = True 6 | DEFAULT_INCLUDE_HEADERS = True 7 | DEFAULT_INCLUDE_INBOX = True 8 | DEFAULT_MAINTAIN_RFC = False 9 | DEFAULT_ATTACH_MESSAGE_PRIMARY = False 10 | DEFAULT_MAILBOX_CLEANUP = 'readonly' 11 | DEFAULT_DROP_ATTACHMENT = False 12 | MAX_FETCH_COUNT = 25 13 | REALM = 'mail' 14 | PASSWORD_PLACEHOLDER = 'encrypted' 15 | REGEX_EMAIL = r'^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})$' 16 | REGEX_PASSWORD = r'^([\w!@#$%-]+)$' 17 | REGEX_HOSTNAME = r'^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|' \ 18 | r'[01]?[0-9][0-9]?)){3})$|^((([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])' \ 19 | r'\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]))$' 20 | MESSAGE_PREAMBLE = "VGhpcyBpcyBhIG1haWwgc2VwYXJhdG9yIGluIGJhc2U2NCBmb3Igb3VyIFNwbHVuayBpbmRleGluZwo=\n" 21 | -------------------------------------------------------------------------------- /lib/mail_exceptions.py: -------------------------------------------------------------------------------- 1 | from __future__ import unicode_literals 2 | 3 | """This contains exceptions defined for the Mail scheme""" 4 | 5 | 6 | class MailException(Exception): 7 | """ 8 | Exception raised for errors in the mail modular input. 9 | """ 10 | 11 | 12 | class MailExceptionInvalidProtocol(MailException): 13 | """ 14 | Raised if an invalid mail protocol is defined.
15 | This requires POP3 or IMAP 16 | """ 17 | 18 | def __init__(self): 19 | MailException.__init__(self, 'protocol must be set to either POP3 or IMAP') 20 | 21 | 22 | class MailExceptionStanzaNotEmail(MailException): 23 | """ 24 | Raised if the stanza is not an email address 25 | """ 26 | 27 | def __init__(self, message): 28 | self.input = message 29 | MailException.__init__(self, 'Input stanza must be an email address. Error parsing %s' % message) 30 | 31 | 32 | class MailProtocolError(MailException): 33 | """ 34 | Raised when a Poplib exception is thrown and caught 35 | """ 36 | 37 | def __init__(self, message): 38 | self.message = message 39 | MailException.__init__(self, 'Exception thrown by Poplib or Imaplib, %s' % message) 40 | 41 | 42 | class MailConnectionError(MailException): 43 | """ 44 | Raised when there's a connection error 45 | """ 46 | 47 | def __init__(self, message): 48 | self.message = message 49 | MailException.__init__(self, 'Mail connection error: %s' % message) 50 | 51 | 52 | class MailLoginFailed(MailException): 53 | """ 54 | Raised when there's a login failure 55 | """ 56 | 57 | def __init__(self, server, username): 58 | self.user = username 59 | MailException.__init__(self, 'Login failed on %s for username: %s' % (server, username)) 60 | 61 | 62 | -------------------------------------------------------------------------------- /lib/mail_utils.py: -------------------------------------------------------------------------------- 1 | from __future__ import unicode_literals 2 | 3 | from six import text_type, binary_type 4 | 5 | import hashlib 6 | import os 7 | import socket 8 | import re 9 | 10 | 11 | def mail_connectivity_test(server, protocol): 12 | """ 13 | This validates connectivity to given hostname and port 14 | :param server: This is the remote hostname or IP to be used for the test. 
15 | :type server: basestring 16 | :param protocol: The protocol to be used to fetch emails - IMAPS or POP3S 17 | :type protocol: basestring 18 | :return: Raises an exception back to the modinput validation if connectivity test fails 19 | """ 20 | try: 21 | captive_dns_addr = socket.gethostbyname(server) 22 | s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) 23 | s.settimeout(1) 24 | s.connect((captive_dns_addr, get_mail_port(protocol=protocol))) 25 | s.close() 26 | except socket.error as e: 27 | raise socket.error("Socket error : %s" % e) 28 | 29 | 30 | def save_checkpoint(checkpoint_dir, msg): 31 | """ 32 | This creates a checkpoint file in the checkpoint directory for the message. 33 | :param checkpoint_dir: This contains the path where checkpoint files will be saved 34 | :type checkpoint_dir: basestring 35 | :param msg: Contains a message that needs to indexed and 36 | :type msg: basestring 37 | """ 38 | filename = os.path.join(checkpoint_dir, hashlib.sha256(msg.encode("utf8", "backslashreplace")).hexdigest()) 39 | f = open(filename, 'w') 40 | f.close() 41 | 42 | 43 | def locate_checkpoint(checkpoint_dir, msg): 44 | """ 45 | This checks if a message has already been indexed by using a digest of the first 300 characters, 46 | which includes a date, message id, source and destination email addresses. 47 | :param checkpoint_dir: This contains the path where checkpoint files will be saved 48 | :type checkpoint_dir: basestring 49 | :param msg: Contains a message that needs to indexed and 50 | :type msg: basestring 51 | :return: Returns true if the message has been indexed previously, and false if not. 52 | :rtype: bool 53 | """ 54 | filename = os.path.join(checkpoint_dir, hashlib.sha256(msg.encode("utf8", "backslashreplace")).hexdigest()) 55 | try: 56 | open(filename, 'r').close() 57 | except (OSError, IOError): 58 | return False 59 | return True 60 | 61 | 62 | def bool_variable(x): 63 | """ 64 | 65 | :param x: variable to be converted to boolean. 
This defaults to true if unsupported values are passed to this 66 | :return: 67 | """ 68 | if x == "enabled": 69 | x = True 70 | elif x == "disabled": 71 | x = False 72 | elif x == "True": 73 | x = True 74 | elif x == "False": 75 | x = False 76 | elif x == "1" or x == "0": 77 | x = bool(int(x)) 78 | else: 79 | x = True 80 | return x 81 | 82 | 83 | def get_mail_port(protocol): 84 | """ 85 | This returns the server port to use for POP retrieval of mails 86 | :param protocol: The protocol to be used to fetch emails - IMAP or POP3 87 | :type protocol: basestring 88 | :return: Returns the correct port for either POP3 or POP3 over SSL 89 | :rtype: int 90 | """ 91 | if protocol == 'POP3': 92 | port = 995 93 | elif 'IMAP' == protocol: 94 | port = 993 95 | else: 96 | raise Exception("Invalid options passed to get_mail_port") 97 | return port 98 | 99 | def drop_attachment_from_event(message): 100 | """ 101 | This prevent the attachment content to be ingested in clear text by 102 | dropping its content. If attachment is unsupported, nothing done. 103 | :param message: Email message to be ingested as event in Splunk 104 | :type message: basestring 105 | :return: Return the email message with no attachment content 106 | :rtype: basestring 107 | """ 108 | pattern = r'^#BEGIN_ATTACHMENT:\s(.*)#END_ATTACHMENT:\s' 109 | return re.sub(pattern, "", message, flags=re.DOTALL|re.MULTILINE) -------------------------------------------------------------------------------- /lib/splunklib/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright 2011-2015 Splunk, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"): you may 4 | # not use this file except in compliance with the License. 
You may obtain 5 | # a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 12 | # License for the specific language governing permissions and limitations 13 | # under the License. 14 | 15 | """Python library for Splunk.""" 16 | 17 | from __future__ import absolute_import 18 | from splunklib.six.moves import map 19 | __version_info__ = (1, 6, 14) 20 | __version__ = ".".join(map(str, __version_info__)) 21 | -------------------------------------------------------------------------------- /lib/splunklib/data.py: -------------------------------------------------------------------------------- 1 | # Copyright 2011-2015 Splunk, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"): you may 4 | # not use this file except in compliance with the License. You may obtain 5 | # a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 12 | # License for the specific language governing permissions and limitations 13 | # under the License. 14 | 15 | """The **splunklib.data** module reads the responses from splunkd in Atom Feed 16 | format, which is the format used by most of the REST API. 17 | """ 18 | 19 | from __future__ import absolute_import 20 | import sys 21 | from xml.etree.ElementTree import XML 22 | from splunklib import six 23 | 24 | __all__ = ["load"] 25 | 26 | # LNAME refers to element names without namespaces; XNAME is the same 27 | # name, but with an XML namespace. 
28 | LNAME_DICT = "dict" 29 | LNAME_ITEM = "item" 30 | LNAME_KEY = "key" 31 | LNAME_LIST = "list" 32 | 33 | XNAMEF_REST = "{http://dev.splunk.com/ns/rest}%s" 34 | XNAME_DICT = XNAMEF_REST % LNAME_DICT 35 | XNAME_ITEM = XNAMEF_REST % LNAME_ITEM 36 | XNAME_KEY = XNAMEF_REST % LNAME_KEY 37 | XNAME_LIST = XNAMEF_REST % LNAME_LIST 38 | 39 | # Some responses don't use namespaces (eg: search/parse) so we look for 40 | # both the extended and local versions of the following names. 41 | 42 | def isdict(name): 43 | return name == XNAME_DICT or name == LNAME_DICT 44 | 45 | def isitem(name): 46 | return name == XNAME_ITEM or name == LNAME_ITEM 47 | 48 | def iskey(name): 49 | return name == XNAME_KEY or name == LNAME_KEY 50 | 51 | def islist(name): 52 | return name == XNAME_LIST or name == LNAME_LIST 53 | 54 | def hasattrs(element): 55 | return len(element.attrib) > 0 56 | 57 | def localname(xname): 58 | rcurly = xname.find('}') 59 | return xname if rcurly == -1 else xname[rcurly+1:] 60 | 61 | def load(text, match=None): 62 | """This function reads a string that contains the XML of an Atom Feed, then 63 | returns the 64 | data in a native Python structure (a ``dict`` or ``list``). If you also 65 | provide a tag name or path to match, only the matching sub-elements are 66 | loaded. 67 | 68 | :param text: The XML text to load. 69 | :type text: ``string`` 70 | :param match: A tag name or path to match (optional). 
71 | :type match: ``string`` 72 | """ 73 | if text is None: return None 74 | text = text.strip() 75 | if len(text) == 0: return None 76 | nametable = { 77 | 'namespaces': [], 78 | 'names': {} 79 | } 80 | 81 | # Convert to unicode encoding in only python 2 for xml parser 82 | if(sys.version_info < (3, 0, 0) and isinstance(text, unicode)): 83 | text = text.encode('utf-8') 84 | 85 | root = XML(text) 86 | items = [root] if match is None else root.findall(match) 87 | count = len(items) 88 | if count == 0: 89 | return None 90 | elif count == 1: 91 | return load_root(items[0], nametable) 92 | else: 93 | return [load_root(item, nametable) for item in items] 94 | 95 | # Load the attributes of the given element. 96 | def load_attrs(element): 97 | if not hasattrs(element): return None 98 | attrs = record() 99 | for key, value in six.iteritems(element.attrib): 100 | attrs[key] = value 101 | return attrs 102 | 103 | # Parse a element and return a Python dict 104 | def load_dict(element, nametable = None): 105 | value = record() 106 | children = list(element) 107 | for child in children: 108 | assert iskey(child.tag) 109 | name = child.attrib["name"] 110 | value[name] = load_value(child, nametable) 111 | return value 112 | 113 | # Loads the given elements attrs & value into single merged dict. 114 | def load_elem(element, nametable=None): 115 | name = localname(element.tag) 116 | attrs = load_attrs(element) 117 | value = load_value(element, nametable) 118 | if attrs is None: return name, value 119 | if value is None: return name, attrs 120 | # If value is simple, merge into attrs dict using special key 121 | if isinstance(value, six.string_types): 122 | attrs["$text"] = value 123 | return name, attrs 124 | # Both attrs & value are complex, so merge the two dicts, resolving collisions. 
125 | collision_keys = [] 126 | for key, val in six.iteritems(attrs): 127 | if key in value and key in collision_keys: 128 | value[key].append(val) 129 | elif key in value and key not in collision_keys: 130 | value[key] = [value[key], val] 131 | collision_keys.append(key) 132 | else: 133 | value[key] = val 134 | return name, value 135 | 136 | # Parse a <list> element and return a Python list 137 | def load_list(element, nametable=None): 138 | assert islist(element.tag) 139 | value = [] 140 | children = list(element) 141 | for child in children: 142 | assert isitem(child.tag) 143 | value.append(load_value(child, nametable)) 144 | return value 145 | 146 | # Load the given root element. 147 | def load_root(element, nametable=None): 148 | tag = element.tag 149 | if isdict(tag): return load_dict(element, nametable) 150 | if islist(tag): return load_list(element, nametable) 151 | k, v = load_elem(element, nametable) 152 | return Record.fromkv(k, v) 153 | 154 | # Load the children of the given element.
155 | def load_value(element, nametable=None): 156 | children = list(element) 157 | count = len(children) 158 | 159 | # No children, assume a simple text value 160 | if count == 0: 161 | text = element.text 162 | if text is None: 163 | return None 164 | text = text.strip() 165 | if len(text) == 0: 166 | return None 167 | return text 168 | 169 | # Look for the special case of a single well-known structure 170 | if count == 1: 171 | child = children[0] 172 | tag = child.tag 173 | if isdict(tag): return load_dict(child, nametable) 174 | if islist(tag): return load_list(child, nametable) 175 | 176 | value = record() 177 | for child in children: 178 | name, item = load_elem(child, nametable) 179 | # If we have seen this name before, promote the value to a list 180 | if name in value: 181 | current = value[name] 182 | if not isinstance(current, list): 183 | value[name] = [current] 184 | value[name].append(item) 185 | else: 186 | value[name] = item 187 | 188 | return value 189 | 190 | # A generic utility that enables "dot" access to dicts 191 | class Record(dict): 192 | """This generic utility class enables dot access to members of a Python 193 | dictionary. 194 | 195 | Any key that is also a valid Python identifier can be retrieved as a field. 196 | So, for an instance of ``Record`` called ``r``, ``r.key`` is equivalent to 197 | ``r['key']``. A key such as ``invalid-key`` or ``invalid.key`` cannot be 198 | retrieved as a field, because ``-`` and ``.`` are not allowed in 199 | identifiers. 200 | 201 | Keys of the form ``a.b.c`` are very natural to write in Python as fields. If 202 | a group of keys shares a prefix ending in ``.``, you can retrieve keys as a 203 | nested dictionary by calling only the prefix. For example, if ``r`` contains 204 | keys ``'foo'``, ``'bar.baz'``, and ``'bar.qux'``, ``r.bar`` returns a record 205 | with the keys ``baz`` and ``qux``. 
If a key contains multiple ``.``, each 206 | one is placed into a nested dictionary, so you can write ``r.bar.qux`` or 207 | ``r['bar.qux']`` interchangeably. 208 | """ 209 | sep = '.' 210 | 211 | def __call__(self, *args): 212 | if len(args) == 0: return self 213 | return Record((key, self[key]) for key in args) 214 | 215 | def __getattr__(self, name): 216 | try: 217 | return self[name] 218 | except KeyError: 219 | raise AttributeError(name) 220 | 221 | def __delattr__(self, name): 222 | del self[name] 223 | 224 | def __setattr__(self, name, value): 225 | self[name] = value 226 | 227 | @staticmethod 228 | def fromkv(k, v): 229 | result = record() 230 | result[k] = v 231 | return result 232 | 233 | def __getitem__(self, key): 234 | if key in self: 235 | return dict.__getitem__(self, key) 236 | key += self.sep 237 | result = record() 238 | for k,v in six.iteritems(self): 239 | if not k.startswith(key): 240 | continue 241 | suffix = k[len(key):] 242 | if '.' in suffix: 243 | ks = suffix.split(self.sep) 244 | z = result 245 | for x in ks[:-1]: 246 | if x not in z: 247 | z[x] = record() 248 | z = z[x] 249 | z[ks[-1]] = v 250 | else: 251 | result[suffix] = v 252 | if len(result) == 0: 253 | raise KeyError("No key or prefix: %s" % key) 254 | return result 255 | 256 | 257 | def record(value=None): 258 | """This function returns a :class:`Record` instance constructed with an 259 | initial value that you provide. 260 | 261 | :param `value`: An initial record value. 
262 | :type `value`: ``dict`` 263 | """ 264 | if value is None: value = {} 265 | return Record(value) 266 | 267 | -------------------------------------------------------------------------------- /lib/splunklib/modularinput/__init__.py: -------------------------------------------------------------------------------- 1 | """The following imports allow these classes to be imported via 2 | the splunklib.modularinput package like so: 3 | 4 | from splunklib.modularinput import * 5 | """ 6 | from .argument import Argument 7 | from .event import Event 8 | from .event_writer import EventWriter 9 | from .input_definition import InputDefinition 10 | from .scheme import Scheme 11 | from .script import Script 12 | from .validation_definition import ValidationDefinition 13 | -------------------------------------------------------------------------------- /lib/splunklib/modularinput/argument.py: -------------------------------------------------------------------------------- 1 | # Copyright 2011-2015 Splunk, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"): you may 4 | # not use this file except in compliance with the License. You may obtain 5 | # a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 12 | # License for the specific language governing permissions and limitations 13 | # under the License. 14 | 15 | from __future__ import absolute_import 16 | try: 17 | import xml.etree.ElementTree as ET 18 | except ImportError: 19 | import xml.etree.cElementTree as ET 20 | 21 | class Argument(object): 22 | """Class representing an argument to a modular input kind. 
23 | 24 | ``Argument`` is meant to be used with ``Scheme`` to generate an XML 25 | definition of the modular input kind that Splunk understands. 26 | 27 | ``name`` is the only required parameter for the constructor. 28 | 29 | **Example with least parameters**:: 30 | 31 | arg1 = Argument(name="arg1") 32 | 33 | **Example with all parameters**:: 34 | 35 | arg2 = Argument( 36 | name="arg2", 37 | description="This is an argument with lots of parameters", 38 | validation="is_pos_int('some_name')", 39 | data_type=Argument.data_type_number, 40 | required_on_edit=True, 41 | required_on_create=True 42 | ) 43 | """ 44 | 45 | # Constant values, do not change. 46 | # These should be used for setting the value of an Argument object's data_type field. 47 | data_type_boolean = "BOOLEAN" 48 | data_type_number = "NUMBER" 49 | data_type_string = "STRING" 50 | 51 | def __init__(self, name, description=None, validation=None, 52 | data_type=data_type_string, required_on_edit=False, required_on_create=False, title=None): 53 | """ 54 | :param name: ``string``, identifier for this argument in Splunk. 55 | :param description: ``string``, human-readable description of the argument. 56 | :param validation: ``string`` specifying how the argument should be validated, if using internal validation. 57 | If using external validation, this will be ignored. 58 | :param data_type: ``string``, data type of this field; use the class constants. 59 | "data_type_boolean", "data_type_number", or "data_type_string". 60 | :param required_on_edit: ``Boolean``, whether this arg is required when editing an existing modular input of this kind. 61 | :param required_on_create: ``Boolean``, whether this arg is required when creating a modular input of this kind. 62 | :param title: ``String``, a human-readable title for the argument. 
63 | """ 64 | self.name = name 65 | self.description = description 66 | self.validation = validation 67 | self.data_type = data_type 68 | self.required_on_edit = required_on_edit 69 | self.required_on_create = required_on_create 70 | self.title = title 71 | 72 | def add_to_document(self, parent): 73 | """Adds an ``Argument`` object to this ElementTree document. 74 | 75 | Adds an ``<arg>`` subelement to the parent element, typically ``<args>``, 76 | and sets up its subelements with their respective text. 77 | 78 | :param parent: An ``ET.Element`` to be the parent of a new subelement 79 | :returns: An ``ET.Element`` object representing this argument. 80 | """ 81 | arg = ET.SubElement(parent, "arg") 82 | arg.set("name", self.name) 83 | 84 | if self.title is not None: 85 | ET.SubElement(arg, "title").text = self.title 86 | 87 | if self.description is not None: 88 | ET.SubElement(arg, "description").text = self.description 89 | 90 | if self.validation is not None: 91 | ET.SubElement(arg, "validation").text = self.validation 92 | 93 | # add all other subelements to this Argument, represented by (tag, text) 94 | subelements = [ 95 | ("data_type", self.data_type), 96 | ("required_on_edit", self.required_on_edit), 97 | ("required_on_create", self.required_on_create) 98 | ] 99 | 100 | for name, value in subelements: 101 | ET.SubElement(arg, name).text = str(value).lower() 102 | 103 | return arg -------------------------------------------------------------------------------- /lib/splunklib/modularinput/event.py: -------------------------------------------------------------------------------- 1 | # Copyright 2011-2015 Splunk, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"): you may 4 | # not use this file except in compliance with the License.
You may obtain 5 | # a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 12 | # License for the specific language governing permissions and limitations 13 | # under the License. 14 | 15 | from __future__ import absolute_import 16 | from io import TextIOBase 17 | from splunklib.six import ensure_text 18 | 19 | try: 20 | import xml.etree.cElementTree as ET 21 | except ImportError: 22 | import xml.etree.ElementTree as ET 23 | 24 | class Event(object): 25 | """Represents an event or fragment of an event to be written by this modular input to Splunk. 26 | 27 | To write an event to a stream, call the ``write_to`` method, passing in a stream. 28 | """ 29 | def __init__(self, data=None, stanza=None, time=None, host=None, index=None, source=None, 30 | sourcetype=None, done=True, unbroken=True): 31 | """There are no required parameters for constructing an ``Event``. 32 | 33 | **Example with minimal configuration**:: 34 | 35 | my_event = Event( 36 | data="This is a test of my new event.", 37 | stanza="myStanzaName", 38 | time="%.3f" % 1372187084.000 39 | ) 40 | 41 | **Example with full configuration**:: 42 | 43 | excellent_event = Event( 44 | data="This is a test of my excellent event.", 45 | stanza="excellenceOnly", 46 | time="%.3f" % 1372274622.493, 47 | host="localhost", 48 | index="main", 49 | source="Splunk", 50 | sourcetype="misc", 51 | done=True, 52 | unbroken=True 53 | ) 54 | 55 | :param data: ``string``, the event's text. 56 | :param stanza: ``string``, name of the input this event should be sent to. 57 | :param time: ``float``, time in seconds, including up to 3 decimal places to represent milliseconds. 58 | :param host: ``string``, the event's host, ex: localhost.
59 | :param index: ``string``, the index this event is specified to write to, or None if default index. 60 | :param source: ``string``, the source of this event, or None to have Splunk guess. 61 | :param sourcetype: ``string``, source type currently set on this event, or None to have Splunk guess. 62 | :param done: ``boolean``, is this a complete ``Event``? False if an ``Event`` fragment. 63 | :param unbroken: ``boolean``, Is this event completely encapsulated in this ``Event`` object? 64 | """ 65 | self.data = data 66 | self.done = done 67 | self.host = host 68 | self.index = index 69 | self.source = source 70 | self.sourceType = sourcetype 71 | self.stanza = stanza 72 | self.time = time 73 | self.unbroken = unbroken 74 | 75 | def write_to(self, stream): 76 | """Write an XML representation of self, an ``Event`` object, to the given stream. 77 | 78 | The ``Event`` object will only be written if its data field is defined, 79 | otherwise a ``ValueError`` is raised. 80 | 81 | :param stream: stream to write XML to. 82 | """ 83 | if self.data is None: 84 | raise ValueError("Events must have at least the data field set to be written to XML.") 85 | 86 | event = ET.Element("event") 87 | if self.stanza is not None: 88 | event.set("stanza", self.stanza) 89 | event.set("unbroken", str(int(self.unbroken))) 90 | 91 | # if a time isn't set, let Splunk guess by not creating a