├── AUTHORS.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── README
│   └── inputs.conf.spec
├── appserver
│   └── static
│       └── screenshot.png
├── bin
│   └── mail.py
├── default
│   ├── app.conf
│   ├── authorize.conf
│   ├── inputs.conf
│   ├── props.conf
│   └── transforms.conf
├── lib
│   ├── file_parser
│   │   ├── __init__.py
│   │   ├── docx.py
│   │   ├── email_mime.py
│   │   ├── utils.py
│   │   └── zip.py
│   ├── mail_constants.py
│   ├── mail_exceptions.py
│   ├── mail_utils.py
│   ├── six.py
│   └── splunklib
│       ├── __init__.py
│       ├── binding.py
│       ├── client.py
│       ├── data.py
│       ├── modularinput
│       │   ├── __init__.py
│       │   ├── argument.py
│       │   ├── event.py
│       │   ├── event_writer.py
│       │   ├── input_definition.py
│       │   ├── scheme.py
│       │   ├── script.py
│       │   ├── utils.py
│       │   └── validation_definition.py
│       ├── ordereddict.py
│       ├── results.py
│       ├── searchcommands
│       │   ├── __init__.py
│       │   ├── decorators.py
│       │   ├── environment.py
│       │   ├── eventing_command.py
│       │   ├── external_search_command.py
│       │   ├── generating_command.py
│       │   ├── internals.py
│       │   ├── reporting_command.py
│       │   ├── search_command.py
│       │   ├── streaming_command.py
│       │   └── validators.py
│       └── six.py
├── metadata
│   └── default.meta
└── static
    ├── appIcon.png
    └── appIcon_2x.png
/AUTHORS.md:
--------------------------------------------------------------------------------
1 | =======
2 | Credits
3 | =======
4 |
5 | Development Lead
6 | ----------------
7 |
8 | * [Oluwaseun Remi-Omosowon](mailto:seunomosowon@gmail.com)
9 |
10 | Contributors
11 | ------------
12 |
13 | * [François Lacombe](mailto:flacombe@adista.fr)
14 | * [Nathan Worsham](mailto:nworsham@gmail.com)
15 | * [Lowell Alleman](mailto:lowell@kintyre.co)
16 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | ---
2 |
3 | Contributing
4 |
5 | ---
6 |
7 | Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
8 |
9 | You can contribute in many ways:
10 |
11 | Types of Contributions
12 |
13 | 1. Report Bugs
14 |
15 | Report bugs at the [TA-mailclient repo on GitHub](https://github.com/seunomosowon/TA-mailclient/issues).
16 |
17 | If you are reporting a bug, please include:
18 |
19 | * Your operating system name and version.
20 | * Any details about your local setup that might be helpful in troubleshooting.
21 | * Detailed steps to reproduce the bug.
22 |
23 | 2. Fix Bugs
24 |
25 | Look through the GitHub issues for bugs. Anything tagged with "bug" is open to whoever wants to implement it.
26 |
27 | 3. Implement Features
28 |
29 | Look through the GitHub issues for features. Anything tagged with "feature" is open to whoever wants to implement it.
30 |
31 | 4. Write Documentation
32 |
33 | TA-mailclient could always use more documentation. Feel free to add documentation for an undocumented feature.
34 |
35 | 5. Submit Feedback
36 |
37 | Please rate the app on [Splunkbase](https://splunkbase.splunk.com/app/3200/).
38 | You can also send feedback or submit an issue on [Github](https://github.com/seunomosowon/TA-mailclient/issues).
39 |
40 | Feature requests can also be submitted in the same way.
41 | Remember that this is a volunteer-driven project, and that contributions are welcome :)
42 |
43 | This has been tested with Gmail, gmx.com, and a few other mail servers. You can also send a list of public mail servers that you use this with without issues.
44 |
45 | Feature requests that are yet to be added to GitHub include the following:
46 | * OAuth support for IMAP
47 | * Additional mailbox folder support for IMAP
48 | * Parameterization of mailbox limits for each run (currently set to 25)
49 |
50 | I'm also working on integrating with Travis CI to allow automatic tests and continuous integration.
51 |
52 | # Guidelines
53 |
54 | Please fork the repo on [Github](https://github.com/seunomosowon/TA-mailclient/) and create a branch for local changes. Create a pull request to the development branch.
55 |
56 | Thanks again for volunteering :smiley:
57 |
58 | Also remember to add your name to the list of contributors in AUTHORS.md
59 |
60 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 |
2 | Apache License
3 | Version 2.0, January 2004
4 | http://www.apache.org/licenses/
5 |
6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 |
8 | 1. Definitions.
9 |
10 | "License" shall mean the terms and conditions for use, reproduction,
11 | and distribution as defined by Sections 1 through 9 of this document.
12 |
13 | "Licensor" shall mean the copyright owner or entity authorized by
14 | the copyright owner that is granting the License.
15 |
16 | "Legal Entity" shall mean the union of the acting entity and all
17 | other entities that control, are controlled by, or are under common
18 | control with that entity. For the purposes of this definition,
19 | "control" means (i) the power, direct or indirect, to cause the
20 | direction or management of such entity, whether by contract or
21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
22 | outstanding shares, or (iii) beneficial ownership of such entity.
23 |
24 | "You" (or "Your") shall mean an individual or Legal Entity
25 | exercising permissions granted by this License.
26 |
27 | "Source" form shall mean the preferred form for making modifications,
28 | including but not limited to software source code, documentation
29 | source, and configuration files.
30 |
31 | "Object" form shall mean any form resulting from mechanical
32 | transformation or translation of a Source form, including but
33 | not limited to compiled object code, generated documentation,
34 | and conversions to other media types.
35 |
36 | "Work" shall mean the work of authorship, whether in Source or
37 | Object form, made available under the License, as indicated by a
38 | copyright notice that is included in or attached to the work
39 | (an example is provided in the Appendix below).
40 |
41 | "Derivative Works" shall mean any work, whether in Source or Object
42 | form, that is based on (or derived from) the Work and for which the
43 | editorial revisions, annotations, elaborations, or other modifications
44 | represent, as a whole, an original work of authorship. For the purposes
45 | of this License, Derivative Works shall not include works that remain
46 | separable from, or merely link (or bind by name) to the interfaces of,
47 | the Work and Derivative Works thereof.
48 |
49 | "Contribution" shall mean any work of authorship, including
50 | the original version of the Work and any modifications or additions
51 | to that Work or Derivative Works thereof, that is intentionally
52 | submitted to Licensor for inclusion in the Work by the copyright owner
53 | or by an individual or Legal Entity authorized to submit on behalf of
54 | the copyright owner. For the purposes of this definition, "submitted"
55 | means any form of electronic, verbal, or written communication sent
56 | to the Licensor or its representatives, including but not limited to
57 | communication on electronic mailing lists, source code control systems,
58 | and issue tracking systems that are managed by, or on behalf of, the
59 | Licensor for the purpose of discussing and improving the Work, but
60 | excluding communication that is conspicuously marked or otherwise
61 | designated in writing by the copyright owner as "Not a Contribution."
62 |
63 | "Contributor" shall mean Licensor and any individual or Legal Entity
64 | on behalf of whom a Contribution has been received by Licensor and
65 | subsequently incorporated within the Work.
66 |
67 | 2. Grant of Copyright License. Subject to the terms and conditions of
68 | this License, each Contributor hereby grants to You a perpetual,
69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
70 | copyright license to reproduce, prepare Derivative Works of,
71 | publicly display, publicly perform, sublicense, and distribute the
72 | Work and such Derivative Works in Source or Object form.
73 |
74 | 3. Grant of Patent License. Subject to the terms and conditions of
75 | this License, each Contributor hereby grants to You a perpetual,
76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
77 | (except as stated in this section) patent license to make, have made,
78 | use, offer to sell, sell, import, and otherwise transfer the Work,
79 | where such license applies only to those patent claims licensable
80 | by such Contributor that are necessarily infringed by their
81 | Contribution(s) alone or by combination of their Contribution(s)
82 | with the Work to which such Contribution(s) was submitted. If You
83 | institute patent litigation against any entity (including a
84 | cross-claim or counterclaim in a lawsuit) alleging that the Work
85 | or a Contribution incorporated within the Work constitutes direct
86 | or contributory patent infringement, then any patent licenses
87 | granted to You under this License for that Work shall terminate
88 | as of the date such litigation is filed.
89 |
90 | 4. Redistribution. You may reproduce and distribute copies of the
91 | Work or Derivative Works thereof in any medium, with or without
92 | modifications, and in Source or Object form, provided that You
93 | meet the following conditions:
94 |
95 | (a) You must give any other recipients of the Work or
96 | Derivative Works a copy of this License; and
97 |
98 | (b) You must cause any modified files to carry prominent notices
99 | stating that You changed the files; and
100 |
101 | (c) You must retain, in the Source form of any Derivative Works
102 | that You distribute, all copyright, patent, trademark, and
103 | attribution notices from the Source form of the Work,
104 | excluding those notices that do not pertain to any part of
105 | the Derivative Works; and
106 |
107 | (d) If the Work includes a "NOTICE" text file as part of its
108 | distribution, then any Derivative Works that You distribute must
109 | include a readable copy of the attribution notices contained
110 | within such NOTICE file, excluding those notices that do not
111 | pertain to any part of the Derivative Works, in at least one
112 | of the following places: within a NOTICE text file distributed
113 | as part of the Derivative Works; within the Source form or
114 | documentation, if provided along with the Derivative Works; or,
115 | within a display generated by the Derivative Works, if and
116 | wherever such third-party notices normally appear. The contents
117 | of the NOTICE file are for informational purposes only and
118 | do not modify the License. You may add Your own attribution
119 | notices within Derivative Works that You distribute, alongside
120 | or as an addendum to the NOTICE text from the Work, provided
121 | that such additional attribution notices cannot be construed
122 | as modifying the License.
123 |
124 | You may add Your own copyright statement to Your modifications and
125 | may provide additional or different license terms and conditions
126 | for use, reproduction, or distribution of Your modifications, or
127 | for any such Derivative Works as a whole, provided Your use,
128 | reproduction, and distribution of the Work otherwise complies with
129 | the conditions stated in this License.
130 |
131 | 5. Submission of Contributions. Unless You explicitly state otherwise,
132 | any Contribution intentionally submitted for inclusion in the Work
133 | by You to the Licensor shall be under the terms and conditions of
134 | this License, without any additional terms or conditions.
135 | Notwithstanding the above, nothing herein shall supersede or modify
136 | the terms of any separate license agreement you may have executed
137 | with Licensor regarding such Contributions.
138 |
139 | 6. Trademarks. This License does not grant permission to use the trade
140 | names, trademarks, service marks, or product names of the Licensor,
141 | except as required for reasonable and customary use in describing the
142 | origin of the Work and reproducing the content of the NOTICE file.
143 |
144 | 7. Disclaimer of Warranty. Unless required by applicable law or
145 | agreed to in writing, Licensor provides the Work (and each
146 | Contributor provides its Contributions) on an "AS IS" BASIS,
147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
148 | implied, including, without limitation, any warranties or conditions
149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
150 | PARTICULAR PURPOSE. You are solely responsible for determining the
151 | appropriateness of using or redistributing the Work and assume any
152 | risks associated with Your exercise of permissions under this License.
153 |
154 | 8. Limitation of Liability. In no event and under no legal theory,
155 | whether in tort (including negligence), contract, or otherwise,
156 | unless required by applicable law (such as deliberate and grossly
157 | negligent acts) or agreed to in writing, shall any Contributor be
158 | liable to You for damages, including any direct, indirect, special,
159 | incidental, or consequential damages of any character arising as a
160 | result of this License or out of the use or inability to use the
161 | Work (including but not limited to damages for loss of goodwill,
162 | work stoppage, computer failure or malfunction, or any and all
163 | other commercial damages or losses), even if such Contributor
164 | has been advised of the possibility of such damages.
165 |
166 | 9. Accepting Warranty or Additional Liability. While redistributing
167 | the Work or Derivative Works thereof, You may choose to offer,
168 | and charge a fee for, acceptance of support, warranty, indemnity,
169 | or other liability obligations and/or rights consistent with this
170 | License. However, in accepting such obligations, You may act only
171 | on Your own behalf and on Your sole responsibility, not on behalf
172 | of any other Contributor, and only if You agree to indemnify,
173 | defend, and hold each Contributor harmless for any liability
174 | incurred by, or claims asserted against, such Contributor by reason
175 | of your accepting any such warranty or additional liability.
176 |
177 | END OF TERMS AND CONDITIONS
178 |
179 | APPENDIX: How to apply the Apache License to your work.
180 |
181 | To apply the Apache License to your work, attach the following
182 | boilerplate notice, with the fields enclosed by brackets "[]"
183 | replaced with your own identifying information. (Don't include
184 | the brackets!) The text should be enclosed in the appropriate
185 | comment syntax for the file format. We also recommend that a
186 | file or class name and description of purpose be included on the
187 | same "printed page" as the copyright notice for easier
188 | identification within third-party archives.
189 |
190 | Copyright [yyyy] [name of copyright owner]
191 |
192 | Licensed under the Apache License, Version 2.0 (the "License");
193 | you may not use this file except in compliance with the License.
194 | You may obtain a copy of the License at
195 |
196 | http://www.apache.org/licenses/LICENSE-2.0
197 |
198 | Unless required by applicable law or agreed to in writing, software
199 | distributed under the License is distributed on an "AS IS" BASIS,
200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201 | See the License for the specific language governing permissions and
202 | limitations under the License.
203 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 | ## Table of Contents
3 |
4 | ### OVERVIEW
5 |
6 | - About the TA-mailclient
7 | - Release notes
8 | - About this release
9 | - New features
10 | - To Do
11 | - Known issues
12 | - Third-party software attributions
13 | - Older Releases
14 | - Support and resources
15 |
16 | ### INSTALLATION AND CONFIGURATION
17 |
18 | - Hardware and software requirements
19 | - Splunk Enterprise system requirements
20 | - Download
21 | - Installation steps
22 | - Deploy to single server instance
23 | - Deploy to distributed deployment
24 | - Deploy to Splunk Cloud
25 | - Configure TA-mailclient
26 | - Parameters
27 | - Upgrade
28 | - Copyright & License
29 |
30 | ### USER GUIDE
31 |
32 | - Data types
33 | - Troubleshooting
34 | - Diagnostic & Debug Logs
35 |
36 |
37 | ---
38 | ### OVERVIEW
39 |
40 | #### About the TA-mailclient
41 |
42 | | Author | Oluwaseun Remi-Omosowon |
43 | | --- | --- |
44 | | App Version | 1.6.0 |
45 | | Vendor Products | poplib, imaplib, SDK for Python 1.6.14 |
46 |
47 | The TA-mailclient add-on fetches emails for Splunk to index from mailboxes
48 | using either POP3 or IMAP. As of this release, only SSL-secured connections are supported.
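
As a rough illustration of the protocol choice, the sketch below maps the `protocol` setting to a client class from the Python standard library. The helper name `secure_client_class` and the lookup table are hypothetical and are not taken from mail.py; only the SSL classes are listed because the release notes state that unencrypted connections are no longer supported.

```python
import imaplib
import poplib

# Hypothetical lookup table: protocol setting -> standard-library SSL client class.
SECURE_CLIENTS = {
    "IMAP": imaplib.IMAP4_SSL,
    "POP3": poplib.POP3_SSL,
}


def secure_client_class(protocol):
    """Return the SSL client class for 'IMAP' or 'POP3' (case-insensitive)."""
    try:
        return SECURE_CLIENTS[protocol.strip().upper()]
    except KeyError:
        raise ValueError("protocol must be POP3 or IMAP, got %r" % protocol)
```

Calling the returned class with a hostname, for example `secure_client_class("IMAP")("imap.domain.com")`, would open the connection; the lookup itself does no network I/O.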
49 |
50 | The modular input takes the password from inputs.conf in plain text,
51 | replaces it with a placeholder, and stores it encrypted within Splunk.
52 | It is built using the Splunk SDK for Python and should work on any Splunk
53 | installation with Python available, including a search head cluster (SHC).
54 | Passwords should also get replicated between search head cluster members.
55 |
56 | This only fetches emails from the 'inbox' folder when using POP3. Additional mailbox folders can be indexed when using IMAP.
57 |
58 | Be sure to set the interval to run this as frequently as required.
59 |
60 | It supports all 'text/\*' content types and several well known scripts (.bat, .js, .sh) detailed below:
61 |
62 | ```
63 | 'application/xml'
64 | 'application/xhtml'
65 | 'application/x-sh'
66 | 'application/x-csh'
67 | 'application/javascript'
68 | 'application/bat'
69 | 'application/x-bat'
70 | 'application/x-msdos-program'
71 | 'application/textedit'
72 | ```
73 | Images, videos and executables are not indexed.
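
The rule above can be sketched as a small predicate. This is only an illustration under assumptions: the function name `is_indexable` and the constant name are hypothetical, and the add-on's real check may differ (for example, it may also look at file extensions).

```python
# Non-text content types listed above as supported.
SUPPORTED_APPLICATION_TYPES = frozenset([
    'application/xml',
    'application/xhtml',
    'application/x-sh',
    'application/x-csh',
    'application/javascript',
    'application/bat',
    'application/x-bat',
    'application/x-msdos-program',
    'application/textedit',
])


def is_indexable(content_type):
    """Hypothetical check: True for any text/* type or a listed application type."""
    # Drop parameters such as '; charset=utf-8' before comparing.
    main_type = content_type.split(';', 1)[0].strip().lower()
    return main_type.startswith('text/') or main_type in SUPPORTED_APPLICATION_TYPES
```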
74 |
75 | ##### Scripts and binaries
76 |
77 | Includes:
78 | - Splunk SDK for Python (1.6.14)
79 | - Six python 2/3 compatibility (1.15.0)
80 | - lib - supporting modules used by the modular input
81 | - mail_constants.py - A number of constants / defaults used throughout the add-on.
82 | - mail_utils.py - Shared functions used to parse emails and attachments
83 | - mail_exceptions.py - Exceptions raised by functions used in the add-on.
84 |
85 | #### Release notes
86 |
87 | ##### About this release
88 |
89 | Version 1.6.0 of the TA-mailclient is compatible with:
90 |
91 | | Splunk Enterprise versions | 8.x, 7.x |
92 | | --- | --- |
93 | | CIM | Not Applicable |
94 | | Platforms | Platform independent |
95 | | Lookup file changes | No lookups included in this app |
96 |
97 | This version removes support for unencrypted connections to mailboxes to allow the app to pass Splunk Certification.
98 | The _is_secure_ parameter is no longer required and should be removed from the config.
99 |
100 | The administrator is responsible for setting the sourcetype to whatever is desired,
101 | as well as extracting CIM fields for the sourcetype.
102 | This app already includes several extractions for different parts of the message that can be reused.
103 |
104 | This app will not work on a universal forwarder,
105 | as it requires Python, which ships with a heavy forwarder (HF) or a full Splunk install.
106 |
107 | **Note:** Travis CI includes tests for both secure versions of POP3 / IMAP.
108 |
109 | ##### New features
110 |
111 | TA-mailclient includes the following new features:
112 |
113 | - Added support for Python 3
114 | - Added six 1.15.0
115 | - Upgraded Splunk SDK to 1.6.14
116 | - Fix CI/CD tests to work for POP3 on v7.3, fix testing
117 | - Added Fix for working with Zips and docx with python2/python3
118 | - Added support for indexing emails from additional folders when using IMAP
119 |
120 | ##### To Do
121 |
122 | - Add attachment file hash to Splunk
123 | - Add support for doc / ppt / pptx
124 |
125 | ##### Known issues
126 |
127 | This is currently tested against 7.3, 8.0 and the latest version of Splunk Enterprise (v8.1 as of the time of this writing).
128 | Issues can be reported and tracked on Github at this time.
129 |
130 |
131 | ##### Third-party software attributions
132 |
133 | This uses the inbuilt poplib and imaplib that comes with Python by default.
134 |
135 | Contributions on github are welcome and will be incorporated into the main release.
136 | Current contributors are listed in AUTHORS.md.
137 |
138 |
139 | ##### Older Releases
140 | * v1.6.0
141 | * Includes support for dropping attachments
142 | * Migrated CICD to CircleCI
143 | * Added appinspect testing to CI/CD pipeline
144 | * v1.5.5
145 | * Improved support for Python 3
146 | * Improved coding style to match new Splunk standards
147 | * Fixed bugs related to indexing zip and docx as a result of Python 2-3 compatibility
148 | * v1.4.0
149 | * Included support for Splunk v8.0
150 | * v1.3.5
151 | * Fixed bug introduced in v1.3.0
152 | * v1.3.0
153 | * Made it more modular to support more file types in zips and in emails
154 | * Added support for zips and files within zips
155 | * Fixed unicode conversion of emails following contributions from Francois Lacombe on GitHub
156 | - Also added static mail preamble for line break. Event breaking configuration may not be
157 | required since the modular input writes individual events separately, but it's always a good idea.
158 | * Additional logging from pop3 / imap
159 | * Removed interval from inputs.conf.spec
160 | * Upgraded Splunk SDK to 1.6.2
161 | * Added additional test cases on Travis CI to test that functionality works
162 | * modularized storage/password functions to make them reusable and simpler
163 | * Also fixed exception handling when dealing with storage/password
164 | * Fixed type casting for boolean parameters (is\_secure, include\_headers) and port validation
165 | * Rewrote sections of mail\_common
166 | * Merged functions from poputils / imaputils into main code and added additional logs from connection
167 |
168 | * v0.5.1
169 | * encoding corrections
170 | * deduplicate Date and MessageId from indexed headers
171 | * correction of MessageID extraction
172 | * changed the separator to a predefined one instead of Date and MessageID
173 | * activated and changed label for unsupported attachment
174 |
175 | * v0.5.0
176 | * Fixed UTF-8 encoding of mails before indexing. (Supporting Gmail and others)
177 |
178 | * v0.4.9
179 | * Changed encoding to support reading gmail.
180 |
181 | * v0.4.8
182 | * removed error introduced in v0.4.7
183 |
184 | * v0.4.7
185 | * Removed password field validation to allow users to have complex or easy passwords of any length
186 | * Handled all mail exceptions
187 |
188 | * v0.4.6
189 | * Fixed bug.
190 | * Fixed header inclusion
191 |
192 | * v0.4.5
193 | * Fixed bug. Removed line which caused v0.4.4 to fail
194 | * Fixed header inclusion
195 |
196 | * v0.4.4
197 | * Updated app to ignore case of file attachment extension
198 |
199 | * v0.4.3
200 | * Made extensions case insensitive
201 | * Added support for indexing _.docx_ extensions
202 | * Generalised ```Mail.save_password()``` to allow reuse of code when writing other modular inputs.
203 | * Optimized python import statements
204 | * Fixed deleting of mails in poplib which was broken in 0.4
205 |
206 | * v0.4.2
207 | * Added support for indexing mail headers
208 |
209 | * v0.4.1
210 | * Fixed bug with 0.4.0
211 | * Made updates to fix unneeded else statement which introduced bug in 0.4.0.
212 |
213 | * v0.4
214 | * Added support for decoding unicode characters in other languages and removing the unicode identifier in the header.
215 | * Improved support for indexing some file types even if the content-type is not set correctly. (as with Microsoft sending some files as binaries instead of text)
216 | * Added fundamental code to support indexing of attachment as a configurable option in future release by the user.
217 | * Added multiple field extractions for the email header and file attachments.
218 | * Introduced a bug which was corrected in 0.4.1 **Faulty version**
219 |
220 | **Note:** _filename_ and _filecontent_ are multi-value fields.
221 |
222 | * v0.3
223 | * Adds support for mailbox cleanup options
224 |
225 | * v0.2
226 | * Adds support for base64 encoded emails.
227 |
228 |
229 | #### Support and resources
230 |
231 | **Questions and answers**
232 |
233 | Access questions and answers specific to the TA-mailclient at (https://answers.splunk.com/).
234 |
235 | **Support**
236 |
237 | This Splunk support add-on is community / developer supported.
238 |
239 | Questions asked on Splunk answers will be answered either by the community of users or by the developer when available.
240 | All support questions should include the version of Splunk and OS.
241 |
242 | You can also contact the developer directly via [Splunkbase](https://splunkbase.splunk.com/app/3200/).
243 | Feedback and feature requests can also be sent via Splunkbase.
244 |
245 | Issues can also be submitted at the [TA-mailclient repo on GitHub](https://github.com/seunomosowon/TA-mailclient/issues).
246 |
247 | Future releases will support:
248 | 1. Configuration of mail limits in inputs.conf
249 | 2. A recursive option to read all folders inside Inbox, and not just emails within inbox.
250 | 3. Indexing mails from additional folders in a mailbox
251 |
252 | **Note** : This has not been tested against an exhaustive list of mail servers, so I'll welcome the feedback.
253 |
254 | Also, feel free to send me a list of well known servers that you're using this with without problems.
255 |
256 | Rate the add-on on [Splunkbase](https://splunkbase.splunk.com/app/3200/) if you use it and are happy with it,
257 | and share your feedback. Thanks!
258 |
259 |
260 | ## INSTALLATION AND CONFIGURATION
261 | ### Hardware and software requirements
262 |
263 | #### Hardware requirements
264 |
265 | TA-mailclient supports the following server platforms in the versions supported by Splunk Enterprise:
266 |
267 | - Linux
268 | - Windows
269 |
270 | The app was developed to be platform agnostic, but tests are mostly run on Linux.
271 |
272 | Please contact the developer with issues running this on Windows. See the Splunk documentation for hardware
273 | requirements for running a heavy forwarder.
274 |
275 | #### Software requirements
276 |
277 | To function properly, TA-mailclient has no external requirements but needs to be installed on a full Splunk
278 | install which provides python and the required libraries (poplib and imaplib).
279 |
280 | #### Splunk Enterprise system requirements
281 |
282 | Because this add-on runs on Splunk Enterprise, all of the [Splunk Enterprise system requirements](http://docs.splunk.com/Documentation/Splunk/latest/Installation/Systemrequirements) apply.
283 |
284 | #### Download
285 |
286 | Download the TA-mailclient at one of the following locations:
287 | - [Splunkbase](https://splunkbase.splunk.com/app/3200/#/details)
288 | - [Github](https://github.com/seunomosowon/TA-mailclient)
289 |
290 | #### Installation steps
291 |
292 | ##### Deploy to single server instance
293 |
294 | To install and configure this app on your supported standalone platform, do one of the following:
295 |
296 | - Install on a standalone Splunk Enterprise install via the GUI. [See Link](https://docs.splunk.com/Documentation/AddOns/released/Overview/Singleserverinstall)
297 | - Extract the technology add-on to ```$SPLUNK_HOME/etc/apps/``` and restart Splunk
298 |
299 | ##### Deploy to distributed deployment
300 |
301 | **Install to search head** - (Standalone or Search head cluster)
302 |
303 | - Deploy the props.conf and transforms.conf from TA-mailclient to the search head.
304 | If using search head cluster, deploy the props.conf and transforms.conf via a search head deployer.
305 |
306 |
307 | **Install to indexers**
308 |
309 | - No App needs to be installed on indexers
310 |
311 | **Install to forwarders**
312 |
313 | - Follow the steps to install the TA-mailclient on a heavy forwarder.
314 | More instructions available at the following [URL](https://docs.splunk.com/Documentation/AddOns/released/Overview/Distributedinstall#Heavy_forwarders)
315 |
316 | - Configure an email input by going to the setup page or configuring inputs.conf.
317 |
318 | ##### Deploy to Splunk Cloud
319 |
320 | For Splunk cloud installations, install TA-mailclient on a heavy forwarder that has been configured to forward
321 | events to your Splunk Cloud instance.
322 | The sourcetype is set by the administrator of the heavy forwarder when configuring the inputs.
323 |
324 | You can work with Splunk Support on installing the Support add-on on Splunk Cloud for parsing the mails collected.
325 |
326 |
327 | #### Configure TA-mailclient
328 |
329 | This app adds a mail:// modular input and supports a variety of parameters in inputs.conf.
330 |
331 | ```
332 | [mail://email_address@domain.com]
333 | interval = 600
334 | mailserver = imap.domain.com
335 | password = mypassword
336 | protocol = IMAP|POP3
337 | disabled = 0
338 | mailbox_cleanup = delete
339 | additional_folders = test,rfc,spam
340 |
341 | ```
342 |
343 | Once the input is read, the password gets replaced and shows as 'encrypted'.
344 | As such, the password for the mailbox must not be set to 'encrypted'.
345 |
346 | The input can be edited if the password needs to be updated, and the password stored in a password
347 | storage endpoint would get updated automatically. Passwords are never stored in clear text.
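
The masking behaviour described above can be sketched as follows. This is a simplified illustration and not the add-on's code: `mask_password` is a hypothetical name, the stanza is modelled as a plain dict with a `name` key, and `secret_store` is a dict standing in for Splunk's password storage endpoint.

```python
MASK = 'encrypted'  # placeholder shown in inputs.conf once the input has been read


def mask_password(stanza, secret_store):
    """Move a clear-text password from `stanza` into `secret_store`.

    `stanza` is a dict of input settings plus a 'name' key; `secret_store`
    is any dict-like object. Both shapes are assumptions for illustration.
    """
    password = stanza.get('password')
    if password and password != MASK:
        # A new or updated clear-text password: store it and mask the config.
        secret_store[stanza['name']] = password
        stanza = dict(stanza, password=MASK)
    return stanza
```

Because the placeholder is used to tell a masked stanza from a fresh one, a real mailbox password equal to `encrypted` would never be stored, which is why that value must be avoided.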
348 |
349 | A different sourcetype can be specified for each input, thus making it possible to have different sourcetypes
350 | for every mailbox. Mailbox cleanup is also managed automatically: with `mailbox_cleanup = delete`, emails are
351 | deleted once they have been indexed.
352 |
353 | ##### Parameters
354 |
355 | **mailserver** - This is a mandatory field and should be the hostname or
356 | IP address for the mail server or client access server with support for retrieving emails via POP3 or IMAP
357 |
358 | **protocol** - This must be set to either POP3 or IMAP
359 |
360 | **password** - Passwords must be set for every account,
361 | or the input will get disabled.
362 |
363 | **mailbox_cleanup** - This indicates if every email should be deleted as it is read,
364 | or delayed until the next interval.
365 | Setting this to ```readonly``` prevents mails from being deleted.
366 |
367 | The default is ```readonly```. Supported options are:
368 | ```delayed|delete|readonly```
369 |
370 | **interval** - This should be configured to run as frequently as required
371 | to retrieve emails. This modular input retrieves up to 20 emails at each run.
372 | A future release to this input might allow the limit to be configured as a parameter to the modular input.
373 |
374 | This modular input supports multiple instances, and each input runs at separate intervals.
375 |
376 | **include_headers** - This determines if email headers should be included.
377 |
378 | **additional_folders** - This is an optional parameter containing a comma-separated list of additional folders to be indexed if IMAP is configured for the mailbox.
379 |
380 | **drop_attachment** - This is an optional parameter to determine if email attachment should be discarded.
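
The parameter rules above could be checked along these lines. This is a minimal sketch under assumptions: `normalize_params` is a hypothetical helper that takes the stanza as a dict of strings, and the add-on's own validation may be stricter or structured differently.

```python
VALID_PROTOCOLS = ('POP3', 'IMAP')
VALID_CLEANUP = ('delayed', 'delete', 'readonly')


def normalize_params(params):
    """Hypothetical validation of a mail:// stanza given as a dict of strings."""
    protocol = params.get('protocol', '').upper()
    if protocol not in VALID_PROTOCOLS:
        raise ValueError('protocol must be POP3 or IMAP')
    if not params.get('mailserver'):
        raise ValueError('mailserver is mandatory')
    # readonly is the documented default when mailbox_cleanup is not set.
    cleanup = params.get('mailbox_cleanup', 'readonly').lower()
    if cleanup not in VALID_CLEANUP:
        raise ValueError('mailbox_cleanup must be one of %s' % '|'.join(VALID_CLEANUP))
    # additional_folders only applies to IMAP; empty entries are dropped.
    folders = []
    if protocol == 'IMAP':
        raw = params.get('additional_folders', '')
        folders = [name.strip() for name in raw.split(',') if name.strip()]
    return {
        'protocol': protocol,
        'mailserver': params['mailserver'],
        'mailbox_cleanup': cleanup,
        'additional_folders': folders,
    }
```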
381 |
382 | ### Copyright & License
383 |
384 | A copy of the Apache License, Version 2.0 has been added to the add-on detailing its license.
385 |
386 |
387 | ## USER GUIDE
388 |
389 | ### Data types
390 |
391 | Data is indexed using a sourcetype specified by the administrator when configuring the inputs.
392 | If nothing is specified, events will get indexed with a sourcetype of `mail`.
393 |
394 | ### Troubleshooting
395 |
396 | Once an email is indexed, it will not be re-indexed unless the checkpoint directory is emptied.
397 | This can be achieved by running the following command:
398 | ```
399 | splunk clean inputdata mail
400 | ```
401 |
402 | #### Diagnostic & Debug Logs
403 |
404 | Logs can be found by searching Splunk internal logs
405 |
406 | ```index=_internal sourcetype=splunkd (component=ModularInputs OR component=ExecProcessor) mail.py```
407 |
408 |
409 | Additional logging can be enabled by turning on debug logging for ExecProcessor and ModInputs.
410 | Set the logging level of both components to DEBUG:
411 |
412 | /opt/splunk/bin/splunk set log-level ExecProcessor -level DEBUG
413 | /opt/splunk/bin/splunk set log-level ModInputs -level DEBUG
414 |
415 | You can find additional ways to enable debug logging
416 | [here](http://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/Enabledebuglogging).
417 |
--------------------------------------------------------------------------------
/README/inputs.conf.spec:
--------------------------------------------------------------------------------
1 | [mail://]
2 | * The name of the stanza should be an email address which would be used to connect to the server.
3 |
4 | protocol = [POP3|IMAP]
5 | * The protocol to be used to fetch emails from the server
6 |
7 | mailserver =
8 | * This is the mailserver to fetch mails from
9 |
10 | password =
11 | * The password for the account provided in the stanza name
12 |
13 | mailbox_cleanup = [delete,delayed,readonly]
14 | * This determines how mails are handled after they are read. It should be one of the following:
15 | * delete: deleted as they are indexed
16 | * delayed: deleted on next connection to the mailbox after verifying that the mail was indexed
17 | * readonly: mails will not be deleted. They will be read and left in the mailbox.
18 | * If this is not set, the default option used will be readonly
19 |
20 | include_headers =
21 | * This determines if email headers should be included.
22 |
23 | maintain_rfc =
24 | * This determines if the email will still maintain RFC compatibility for parsing tools
25 |
26 | attach_message_primary =
27 | * This determines if an attached message will instead be the indexed email (assuming the outer message was just the delivery mechanism)
28 |
29 | additional_folders =
30 | * This lists additional folders to read messages from via IMAP
31 |
32 | drop_attachment =
33 | * This determines if email attachment content will be dropped instead of being indexed
--------------------------------------------------------------------------------
/appserver/static/screenshot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seunomosowon/TA-mailclient/b4745263d53f03e06edf098a665a5597d40fe449/appserver/static/screenshot.png
--------------------------------------------------------------------------------
/default/app.conf:
--------------------------------------------------------------------------------
1 | [install]
2 | is_configured = 0
3 |
4 | [ui]
5 | is_visible = 0
6 | label = Technology Add-on for Mail retrieval
7 |
8 | [launcher]
9 | author = seunomosowon
10 | description = Get mails from a mail server via POP3 or IMAP
11 | version = 1.6.0
12 |
13 | [package]
14 | id = TA-mailclient
15 | check_for_updates = true
16 |
--------------------------------------------------------------------------------
/default/authorize.conf:
--------------------------------------------------------------------------------
1 | [capability::edit_modinput_mail]
2 | # Capability required to add mail inputs and edit settings.
3 |
4 | [role_admin]
5 | edit_modinput_mail = enabled
6 |
--------------------------------------------------------------------------------
/default/inputs.conf:
--------------------------------------------------------------------------------
1 | [mail]
2 | python.version = python3
3 |
--------------------------------------------------------------------------------
/default/props.conf:
--------------------------------------------------------------------------------
1 | [source::mail:\/\/...]
2 | KV_MODE = auto
3 | SHOULD_LINEMERGE=false
4 | MAX_EVENTS=5000
5 | LINE_BREAKER=(VGhpcyBpcyBhIG1haWwgc2VwYXJhdG9yIGluIGJhc2U2NCBmb3Igb3VyIFNwbHVuayBpbmRleGluZwo=[\r\n]+)
6 | TIME_PREFIX= \nDate:
7 | MAX_TIMESTAMP_LOOKAHEAD = 32
8 | TIME_FORMAT= %a, %d %b %Y %H:%M:%S %z
9 | TRUNCATE=200000
10 | REPORT-file_attachments = file_attachment
11 | REPORT-multi_part = multi_part
12 | REPORT-attachment_filename = attachment_filename:kvextraction
13 | REPORT-attachment_md5 = attachment_md5:kvextraction
14 | REPORT-attachment_sha256 = attachment_sha256:kvextraction
15 | EXTRACT-Message_ID = (?i)^Message-ID:\h+(?[^\r\n>]+?)>?$
16 | EXTRACT-From = ^From:\h+(?(?:"?(?[^<\r\n]+)"?\h+)?(?[^\r\n]+?)>?)$
17 | EXTRACT-Subject = ^Subject:\h+(?[^\r\n]+)$
18 | EXTRACT-TO = ^To:\h+(?(?:"?(?[^<\r\n]+)"?\h+)?(?[^\r\n]+?))$
19 | FIELDALIAS-dest = host AS dest
20 | FIELDALIAS-mid = MessageID AS message_id
21 | FIELDALIAS-src_user = from AS src_user
22 | FIELDALIAS-sender = from_email AS sender
23 | FIELDALIAS-recipient = to AS recipient
24 | FIELDALIAS-file_hash = sha256 AS file_hash
25 | ANNOTATE_PUNCT = false
26 |
27 |
--------------------------------------------------------------------------------
/default/transforms.conf:
--------------------------------------------------------------------------------
1 | [file_attachment]
2 | REGEX=(?ms)#BEGIN_ATTACHMENT:\s(?[^\r\n]+)[\r\n]+(?.*)#END_ATTACHMENT:\s*\g{file_name}
3 | MV_ADD=true
4 |
5 | [multi_part]
6 | REGEX=(?ms)[\r\n]#START_OF_MULTIPART_(\d+)[\r\n](?.*)[\r\n]#END_OF_MULTIPART_\1[\r\n]*
7 | MV_ADD=true
8 |
9 | [attachment_md5:kvextraction]
10 | FORMAT = md5::$1
11 | REGEX = md5\s=\s(\w+)
12 | MV_ADD = true
13 |
14 | [attachment_sha256:kvextraction]
15 | FORMAT = sha256::$1
16 | REGEX = sha256\s=\s(\w+)
17 | MV_ADD = true
18 |
19 | [attachment_filename:kvextraction]
20 | FORMAT = file_name::$1
21 | REGEX = file_name\s=\s((?!None\s)[^\.]+(?:\.\w+)?)\s
22 | MV_ADD = true
23 |
--------------------------------------------------------------------------------
/lib/file_parser/__init__.py:
--------------------------------------------------------------------------------
1 | from .utils import *
2 |
3 | __version_info__ = (1, 3, 0)
4 | __version__ = ".".join(map(str, __version_info__))
5 | __all__ = ['ZIP_EXTENSIONS', 'TEXT_FILE_EXTENSIONS', 'SUPPORTED_CONTENT_TYPES',
6 | 'email_mime', 'docx', 'zip']
7 |
--------------------------------------------------------------------------------
/lib/file_parser/docx.py:
--------------------------------------------------------------------------------
1 | """ Parse .docx files """
2 | from __future__ import unicode_literals
3 |
4 | from .utils import *
5 | from xml.dom.minidom import parse as parsexml
6 | from six import text_type, binary_type, BytesIO
7 | from six import ensure_binary, ensure_str
8 | import zipfile
9 |
10 |
11 | def parse_docx(part, part_name):
12 | """
13 | This reads a docx file from a string and outputs just the text from the document
14 | along with the document's internal structure
15 | :param part: This is a MIME part from an email that contains a docx file
16 | :type part: Union[email.message.Message, basestring]
17 | :param part_name: This can be either a file name or string $EMAIL$
18 | :type part_name: basestring
19 | :return: This returns the texts from the word document.
20 | :rtype: list
21 | """
22 | if part_name == EMAIL_PART:
23 | decoded_payload = part.get_payload(decode=True)
24 | zip_name = part.get_filename() or ''
25 | else:
26 | decoded_payload = part
27 | zip_name = part_name
28 | fp = BytesIO(decoded_payload)
29 | try:
30 | zfp = zipfile.ZipFile(fp)
31 | except zipfile.BadZipfile:
32 | return ['#UNSUPPORTED_ATTACHMENT: %s' % zip_name]
33 | return_doc = []
34 | if zfp:
35 | return_doc.append(parsexml(zfp.open('[Content_Types].xml', 'r')).documentElement.toprettyxml())
36 | """
37 | I can check for Macros here
38 | if zfp.getinfo('word/vbaData.xml'):
39 | openXML standard supports any name for xml file. Need to check all files.
40 | Add the contents pages to the top of word file for visual inspection of macros
41 | """
42 | if 'word/document.xml' in zfp.namelist():
43 | doc_xml = parsexml(zfp.open('word/document.xml', 'r'))
44 | return_doc.append(''.join([ensure_str(node.firstChild.nodeValue) for node in doc_xml.getElementsByTagName('w:t')]))
45 | else:
46 | return_doc.append('#UNSUPPORTED_DOCX_FILE: file_name = %s' % zip_name)
47 | else:
48 | return_doc.append('#INVALID_DOCX_FILE: file_name = %s' % zip_name)
49 | return return_doc
50 |
51 |
52 | def parse_docx_from_mail(message):
53 | """
54 |
55 | :param message: MIME part from an email that contains a docx file
56 | :type message: email.message.Message
57 | :return: list with the document structure and text, as returned by parse_docx
58 | """
59 | return parse_docx(message, EMAIL_PART)
60 |
61 |
62 | def parse_docx_from_string(docx_as_string, file_name):
63 | """
64 |
65 | :param docx_as_string: string representation of docx file
66 | :type docx_as_string: basestring
67 | :param file_name: docx file name
68 | :type file_name: basestring
69 | :return: list with the document structure and text, as returned by parse_docx
70 | """
71 | return parse_docx(docx_as_string, file_name)
72 |
--------------------------------------------------------------------------------
/lib/file_parser/email_mime.py:
--------------------------------------------------------------------------------
1 | """ Parse emails files """
2 | from __future__ import unicode_literals
3 | from six import text_type, binary_type
4 |
5 | import email
6 | import re
7 | import os
8 | from . import zip
9 | import hashlib
10 | import quopri
11 | # noinspection PyUnresolvedReferences
12 | from base64 import b64decode
13 | try:
14 | from email.parser import Parser
15 | except ImportError:
16 | # Python 2
17 | from email.Parser import Parser
18 |
19 | from email.utils import mktime_tz, parsedate_tz
20 | from .utils import *
21 |
22 |
23 | def parse_email(email_as_string, include_headers, maintain_rfc, attach_message_primary):
24 | """
25 | This function parses an email and returns an array with different parts of the message.
26 | :param email_as_string: This represents the email in a bytearray to be processed
27 | :type email_as_string: basestring
28 | :param include_headers: This parameter specifies if all headers should be included.
29 | :type include_headers: bool
30 | :param maintain_rfc: This parameter specifies if RFC format for email stays intact
31 | :type maintain_rfc: bool
32 | :param attach_message_primary: This parameter specifies if first attached email should
33 | be used as the message for indexing instead of the carrier email
34 | :type attach_message_primary: bool
35 | :return: Returns a list with the [date, Message-id, mail_message]
36 | :rtype: list
37 | """
38 | message = email.message_from_string(email_as_string.strip()) or None
39 | if message is None:
40 | return [None, None, None]
41 | if attach_message_primary:
42 | message = change_primary_message(message)
43 | if maintain_rfc:
44 | index_mail = maintain_rfc_parse(message)
45 | else:
46 | mailheaders = Parser().parsestr(message.as_string(), True)
47 | headers = ["%s: %s" % (k, getheader(v)) for k, v in mailheaders.items() if k in MAIN_HEADERS]
48 | if include_headers:
49 | other_headers = ["%s: %s" % (k, getheader(v)) for k, v in mailheaders.items() if k not in MAIN_HEADERS]
50 | headers.extend(other_headers)
51 | body = []
52 | if message.is_multipart():
53 | part_number = 1
54 | for part in message.walk():
55 | content_type = part.get_content_type()
56 | content_disposition = part.get('Content-Disposition')
57 | if content_type in ['multipart/alternative', 'multipart/mixed']:
58 | # The multipart/alternative part is usually empty.
59 | body.append("Multipart envelope header: %s" % str(part.get_payload(decode=True)))
60 | continue
61 | body.append("#START_OF_MULTIPART_%d" % part_number)
62 | extension = str(os.path.splitext(part.get_filename() or '')[1]).lower()
63 | if extension in TEXT_FILE_EXTENSIONS or content_type in SUPPORTED_CONTENT_TYPES or \
64 | part.get_content_maintype() == 'text' or extension in ZIP_EXTENSIONS:
65 | if part.get_filename():
66 | body.append("#BEGIN_ATTACHMENT: %s" % str(part.get_filename()))
67 | if extension in ZIP_EXTENSIONS:
68 | body.append("\n".join(zip.parse_zip(part, EMAIL_PART)))
69 | else:
70 | body.append(recode_mail(part))
71 | body.append("#END_ATTACHMENT: %s" % str(part.get_filename()))
72 | else:
73 | body.append(recode_mail(part))
74 | else:
75 | body.append("#UNSUPPORTED_ATTACHMENT: file_name = %s - type = %s ; disposition=%s" % (
76 | part.get_filename(), content_type, content_disposition))
77 | body.append("#END_OF_MULTIPART_%d" % part_number)
78 | part_number += 1
79 | else:
80 | body.append(recode_mail(message))
81 | """mail_for_index = [MESSAGE_PREAMBLE]"""
82 | mail_for_index = []
83 | mail_for_index.extend(headers + body)
84 | index_mail = '\n'.join(s.decode('utf-8', 'ignore') if isinstance(s, binary_type) else s for s in mail_for_index)
85 | message_time = float(mktime_tz(parsedate_tz(message['Date'])))
86 | return [message_time, message['Message-ID'], index_mail]
87 |
88 | def change_primary_message(message):
89 |     """
90 |     This function will look for an attached email and return it. This is intended to use
91 |     the attached email as the email to be indexed instead of the carrier email.
92 |     It handles attachments that are already in message format or in a binary format; only
93 |     the first attached email becomes the primary if there is more than one.
94 |     :param message: This represents the email to be checked for attached email.
95 |     :type message: email message object
96 |     :return: Returns an email message object (the carrier email if nothing is attached)
97 |     :rtype: email message object
98 |     """
99 |     for i in message.walk():
100 |         if i.get_content_maintype() == 'message':
101 |             return i.get_payload()[0]
102 |         elif i.get_content_subtype() == 'octet-stream' and (i.get_filename() or '').lower().endswith('.eml'):
103 |             # decode=True undoes the Content-Transfer-Encoding (e.g. base64) and yields bytes
104 |             raw = i.get_payload(decode=True)
105 |             return email.message_from_string(raw.decode('utf-8', 'ignore'))
106 |     return message  # no attached email found: fall back to the carrier email
107 |
108 | def maintain_rfc_parse(message):
109 | """
110 | This function parses an email and returns an array with different parts of the message
111 | but leaves the email still RFC compliant so that it works with Mail-Parser Plus app.
112 | Attachment headers are left intact.
113 | :param message: This represents the email to be checked for attached email.
114 | :type message: email message object
115 | :return: Returns an email message formatted as a string
116 | :rtype: str
117 | """
118 | if not message.is_multipart():
119 | reformatted_message = quopri.decodestring(
120 | message.as_string().encode('ascii', 'ignore')
121 | ).decode("utf-8", 'ignore')
122 | return reformatted_message
123 | boundary = message.get_boundary()
124 | new_payload = '--' + boundary
125 | for i in message.get_payload():
126 | content_type = i.get_content_type()
127 | extension = str(os.path.splitext(i.get_filename() or '')[1]).lower()
128 | if extension in TEXT_FILE_EXTENSIONS or content_type in SUPPORTED_CONTENT_TYPES or \
129 | i.get_content_maintype() == 'text':
130 | text_content = i.as_string().encode('ascii', 'ignore')
131 | text_content = quopri.decodestring(text_content).decode("utf-8", 'ignore')
132 | new_payload += '\n' + text_content
133 | else:
134 | replace = re.sub(r'(?:\n\n)[\s\S]+',r'\n\n#UNSUPPORTED_ATTACHMENT:',i.as_string())
135 | filename = i.get_filename()
136 | charset = i.get_content_charset()
137 | try:
138 | md5 = hashlib.md5(i.get_payload(None,True)).hexdigest()
139 | sha256 = hashlib.sha256(i.get_payload(None,True)).hexdigest()
140 | except Exception:
141 | md5 = ''
142 | sha256 = ''
143 | replace_string = """
144 | file_name = %(filename)s
145 | type = %(content_type)s
146 | charset = %(charset)s
147 | md5 = %(md5)s
148 | sha256 = %(sha256)s
149 | """
150 | metadata = replace_string % dict(
151 | content_type=content_type,
152 | filename=filename,
153 | charset=charset,
154 | md5=md5,
155 | sha256=sha256,
156 | )
157 | new_payload += '\n' \
158 | + replace \
159 | + metadata
160 | new_payload += '\n--' + boundary
161 | new_payload += '--'
162 | message.set_payload(new_payload)
163 | return message.as_string()
164 |
--------------------------------------------------------------------------------
/lib/file_parser/utils.py:
--------------------------------------------------------------------------------
1 | """
2 | This includes common functions that are required when dealing with mails
3 | """
4 | from __future__ import unicode_literals
5 |
6 | from email.header import decode_header
7 | from six import text_type, binary_type
8 |
9 | MAIN_HEADERS = ('Date', 'Message-Id', 'Message-ID', 'From', 'To', 'Subject')
10 | ZIP_EXTENSIONS = {'.zip', '.docx'}
11 | EMAIL_PART = '$EMAIL$'
12 | SUPPORTED_CONTENT_TYPES = {'application/xml', 'application/xhtml', 'application/x-sh', 'application/x-csh',
13 | 'application/javascript', 'application/bat', 'application/x-bat',
14 | 'application/x-msdos-program', 'application/textedit',
15 | 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'}
16 | TEXT_FILE_EXTENSIONS = {'.csv', '.txt', '.md', '.py', '.bat', '.sh', '.rb', '.js', '.asm', '.log'}
17 | """
18 | It already indexes all text/* including:
19 | 'text/plain', 'text/html', 'text/x-asm', 'text/x-c','text/x-python-script','text/x-python'
20 | No need to add this to the supported types list
21 | """
22 |
23 |
24 | def getheader(header_text, default="ascii"):
25 | """ This decodes sections of the email header which could be represented in utf8 or other iso languages"""
26 | headers = decode_header(header_text)
27 | header_sections = [text if isinstance(text, text_type) else text_type(text, charset or default, "ignore") for text, charset in headers]
28 | return "".join(header_sections)
29 |
30 |
31 | def recode_mail(part):
32 | cset = part.get_content_charset()
33 | if cset == "None":
34 | cset = "ascii"
35 | try:
36 | if not part.get_payload(decode=True):
37 | result = ""
38 | else:
39 | result = text_type(part.get_payload(decode=True), cset, "ignore").encode('utf8', 'xmlcharrefreplace').strip()
40 | except TypeError:
41 | result = part.get_payload(decode=True)
42 | if isinstance(result, text_type):
43 | result = result.encode('utf8', 'xmlcharrefreplace').strip()
44 | return result
45 |
--------------------------------------------------------------------------------
/lib/file_parser/zip.py:
--------------------------------------------------------------------------------
1 | """Parse zip files"""
2 | from __future__ import unicode_literals
3 | from six import text_type, binary_type, BytesIO
4 | from six import ensure_binary, ensure_str
5 | from .utils import *
6 | from . import docx
7 | import os
8 | import zipfile
9 |
10 |
11 | def parse_zip(part, part_name):
12 | """
13 | This reads a zip file from a string and outputs the list of files in the archive
14 | along with the contents of the supported files it contains
15 | :param part: This is a MIME message part from an email that contains a zip file
16 | :type part: Union[email.message.Message, basestring]
17 | :param part_name: This can be either a file name or string $EMAIL$
18 | :type part_name: basestring
19 | :return: This returns the file list and extracted contents of the archive.
20 | :rtype: list
21 | """
22 | if EMAIL_PART == part_name:
23 | decoded_payload = part.get_payload(decode=True)
24 | zip_name = part.get_filename() or ''
25 | else:
26 | decoded_payload = part
27 | zip_name = part_name
28 | fp = BytesIO(decoded_payload)
29 | try:
30 | zfp = zipfile.ZipFile(fp)
31 | except zipfile.BadZipfile:
32 | return ['#UNSUPPORTED_ATTACHMENT: %s' % zip_name]
33 | extension = os.path.splitext(zip_name)[1].lower()
34 | unzip_content = []
35 | if zfp:
36 | ziplist = ['#BEGIN_ZIP_FILELIST: %s' % zip_name]
37 | ziplist.extend(zfp.namelist())
38 | ziplist.append('#END_ZIP_FILELIST: %s' % zip_name)
39 | unzip_content.append("\n".join(ziplist))
40 | if '.docx' == extension:
41 | unzip_content.extend(docx.parse_docx(part, part_name))
42 | else:
43 | for each_compressedfile in zfp.namelist():
44 | zipped_file = []
45 | if not each_compressedfile.endswith('/'):
46 | zipped_fextension = text_type(os.path.splitext(each_compressedfile)[1]).lower()
47 | zipped_file = ["#BEGIN_ATTACHMENT: %s/%s" % (zip_name, each_compressedfile)]
48 | if zipped_fextension in TEXT_FILE_EXTENSIONS:
49 | f = zfp.open(each_compressedfile)
50 | for line in f:
51 | zipped_file.append(ensure_str(line).rstrip('\n'))
52 | elif zipped_fextension in ZIP_EXTENSIONS:
53 | file_buff = zfp.open(each_compressedfile).read()
54 | zipped_file.extend(parse_zip(file_buff, each_compressedfile))
55 | else:
56 | zipped_file.append("#UNSUPPORTED_CONTENT: file_name = %s" % each_compressedfile)
57 | zipped_file.append("#END_ATTACHMENT: %s/%s" % (zip_name, each_compressedfile))
58 | unzip_content.append("\n".join(zipped_file))
59 | return unzip_content
60 |
61 |
62 | def parse_zip_from_mail(message):
63 | """
64 |
65 | :param message: MIME part from an email that contains a zip file
66 | :type message: email.message.Message
67 | :return: list with the file list and extracted contents, as returned by parse_zip
68 | """
69 | return parse_zip(message, EMAIL_PART)
70 |
71 |
72 | def parse_zip_from_string(file_as_string, file_name):
73 | """
74 |
75 | :param file_as_string: string representation of zip file
76 | :type file_as_string: basestring
77 | :param file_name: zip file name
78 | :type file_name: basestring
79 | :return: list with the file list and extracted contents, as returned by parse_zip
80 | """
81 | return parse_zip(file_as_string, file_name)
82 |
--------------------------------------------------------------------------------
/lib/mail_constants.py:
--------------------------------------------------------------------------------
1 | # DEFAULTS
2 | from __future__ import unicode_literals
3 |
4 | IMAP_READONLY_FLAG = True
5 | INDEX_ATTACHMENT_DEFAULT = True
6 | DEFAULT_INCLUDE_HEADERS = True
7 | DEFAULT_INCLUDE_INBOX = True
8 | DEFAULT_MAINTAIN_RFC = False
9 | DEFAULT_ATTACH_MESSAGE_PRIMARY = False
10 | DEFAULT_MAILBOX_CLEANUP = 'readonly'
11 | DEFAULT_DROP_ATTACHMENT = False
12 | MAX_FETCH_COUNT = 25
13 | REALM = 'mail'
14 | PASSWORD_PLACEHOLDER = 'encrypted'
15 | REGEX_EMAIL = r'^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})$'
16 | REGEX_PASSWORD = r'^([\w!@#$%-]+)$'
17 | REGEX_HOSTNAME = r'^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|' \
18 | r'[01]?[0-9][0-9]?)){3})$|^((([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])' \
19 | r'\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]))$'
20 | MESSAGE_PREAMBLE = "VGhpcyBpcyBhIG1haWwgc2VwYXJhdG9yIGluIGJhc2U2NCBmb3Igb3VyIFNwbHVuayBpbmRleGluZwo=\n"
21 |
--------------------------------------------------------------------------------
/lib/mail_exceptions.py:
--------------------------------------------------------------------------------
1 | """This contains exceptions defined for the Mail scheme"""
2 |
3 | from __future__ import unicode_literals
4 |
5 |
6 | class MailException(Exception):
7 | """
8 | Exception raised for errors in the mail modular input.
9 | """
10 |
11 |
12 | class MailExceptionInvalidProtocol(MailException):
13 | """
14 | Raised if an invalid mail protocol is defined.
15 | This requires POP3 or IMAP
16 | """
17 |
18 | def __init__(self):
19 | MailException.__init__(self, 'protocol must be set to either POP3 or IMAP')
20 |
21 |
22 | class MailExceptionStanzaNotEmail(MailException):
23 | """
24 | Raised if the stanza is not an email address
25 | """
26 |
27 | def __init__(self, message):
28 | self.input = message
29 | MailException.__init__(self, 'Input stanza must be an email address. Error parsing %s' % message)
30 |
31 |
32 | class MailProtocolError(MailException):
33 | """
34 | Raised when a Poplib exception is thrown and caught
35 | """
36 |
37 | def __init__(self, message):
38 | self.message = message
39 | MailException.__init__(self, 'Exception thrown by Poplib or Imaplib, %s' % message)
40 |
41 |
42 | class MailConnectionError(MailException):
43 | """
44 | Raised when there's a connection error
45 | """
46 |
47 | def __init__(self, message):
48 | self.message = message
49 | MailException.__init__(self, 'Mail connection error: %s' % message)
50 |
51 |
52 | class MailLoginFailed(MailException):
53 | """
54 | Raised when there's a login failure
55 | """
56 |
57 | def __init__(self, server, username):
58 | self.user = username
59 | MailException.__init__(self, 'Login failed on %s for username: %s' % (server, username))
60 |
61 |
62 |
--------------------------------------------------------------------------------
/lib/mail_utils.py:
--------------------------------------------------------------------------------
1 | from __future__ import unicode_literals
2 |
3 | from six import text_type, binary_type
4 |
5 | import hashlib
6 | import os
7 | import socket
8 | import re
9 |
10 |
11 | def mail_connectivity_test(server, protocol):
12 | """
13 | This validates connectivity to given hostname and port
14 | :param server: This is the remote hostname or IP to be used for the test.
15 | :type server: basestring
16 | :param protocol: The protocol to be used to fetch emails - IMAP or POP3 (tested on its SSL port)
17 | :type protocol: basestring
18 | :return: Raises an exception back to the modinput validation if connectivity test fails
19 | """
20 | try:
21 | captive_dns_addr = socket.gethostbyname(server)
22 | s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
23 | s.settimeout(1)
24 | s.connect((captive_dns_addr, get_mail_port(protocol=protocol)))
25 | s.close()
26 | except socket.error as e:
27 | raise socket.error("Socket error : %s" % e)
28 |
29 |
30 | def save_checkpoint(checkpoint_dir, msg):
31 | """
32 | This creates a checkpoint file in the checkpoint directory for the message.
33 | :param checkpoint_dir: This contains the path where checkpoint files will be saved
34 | :type checkpoint_dir: basestring
35 | :param msg: Contains the message key to be indexed; its SHA-256 digest names the checkpoint file
36 | :type msg: basestring
37 | """
38 | filename = os.path.join(checkpoint_dir, hashlib.sha256(msg.encode("utf8", "backslashreplace")).hexdigest())
39 | f = open(filename, 'w')
40 | f.close()
41 |
42 |
43 | def locate_checkpoint(checkpoint_dir, msg):
44 | """
45 | This checks if a message has already been indexed by using a digest of the first 300 characters,
46 | which includes a date, message id, source and destination email addresses.
47 | :param checkpoint_dir: This contains the path where checkpoint files will be saved
48 | :type checkpoint_dir: basestring
49 | :param msg: Contains the message key to be indexed; its SHA-256 digest names the checkpoint file
50 | :type msg: basestring
51 | :return: Returns true if the message has been indexed previously, and false if not.
52 | :rtype: bool
53 | """
54 | filename = os.path.join(checkpoint_dir, hashlib.sha256(msg.encode("utf8", "backslashreplace")).hexdigest())
55 | try:
56 | open(filename, 'r').close()
57 | except (OSError, IOError):
58 | return False
59 | return True
60 |
61 |
62 | def bool_variable(x):
63 | """
64 |
65 | :param x: variable to be converted to boolean. This defaults to true if unsupported values are passed to this
66 | :return:
67 | """
68 | if x == "enabled":
69 | x = True
70 | elif x == "disabled":
71 | x = False
72 | elif x == "True":
73 | x = True
74 | elif x == "False":
75 | x = False
76 | elif x == "1" or x == "0":
77 | x = bool(int(x))
78 | else:
79 | x = True
80 | return x
81 |
82 |
83 | def get_mail_port(protocol):
84 | """
85 | This returns the server port to use for retrieval of mails over SSL
86 | :param protocol: The protocol to be used to fetch emails - IMAP or POP3
87 | :type protocol: basestring
88 | :return: Returns the SSL port for the protocol: 995 for POP3 or 993 for IMAP
89 | :rtype: int
90 | """
91 | if protocol == 'POP3':
92 | port = 995
93 | elif 'IMAP' == protocol:
94 | port = 993
95 | else:
96 | raise Exception("Invalid options passed to get_mail_port")
97 | return port
98 |
99 | def drop_attachment_from_event(message):
100 | """
101 | This prevents attachment content from being ingested in clear text by
102 | dropping it. If the attachment is unsupported, nothing is done.
103 | :param message: Email message to be ingested as event in Splunk
104 | :type message: basestring
105 | :return: Return the email message with no attachment content
106 | :rtype: basestring
107 | """
108 | pattern = r'^#BEGIN_ATTACHMENT:\s(.*)#END_ATTACHMENT:\s'
109 | return re.sub(pattern, "", message, flags=re.DOTALL|re.MULTILINE)
--------------------------------------------------------------------------------
/lib/splunklib/__init__.py:
--------------------------------------------------------------------------------
1 | # Copyright 2011-2015 Splunk, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License"): you may
4 | # not use this file except in compliance with the License. You may obtain
5 | # a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
12 | # License for the specific language governing permissions and limitations
13 | # under the License.
14 |
15 | """Python library for Splunk."""
16 |
17 | from __future__ import absolute_import
18 | from splunklib.six.moves import map
19 | __version_info__ = (1, 6, 14)
20 | __version__ = ".".join(map(str, __version_info__))
21 |
--------------------------------------------------------------------------------
/lib/splunklib/data.py:
--------------------------------------------------------------------------------
1 | # Copyright 2011-2015 Splunk, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License"): you may
4 | # not use this file except in compliance with the License. You may obtain
5 | # a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
12 | # License for the specific language governing permissions and limitations
13 | # under the License.
14 |
15 | """The **splunklib.data** module reads the responses from splunkd in Atom Feed
16 | format, which is the format used by most of the REST API.
17 | """
18 |
19 | from __future__ import absolute_import
20 | import sys
21 | from xml.etree.ElementTree import XML
22 | from splunklib import six
23 |
24 | __all__ = ["load"]
25 |
26 | # LNAME refers to element names without namespaces; XNAME is the same
27 | # name, but with an XML namespace.
28 | LNAME_DICT = "dict"
29 | LNAME_ITEM = "item"
30 | LNAME_KEY = "key"
31 | LNAME_LIST = "list"
32 |
33 | XNAMEF_REST = "{http://dev.splunk.com/ns/rest}%s"
34 | XNAME_DICT = XNAMEF_REST % LNAME_DICT
35 | XNAME_ITEM = XNAMEF_REST % LNAME_ITEM
36 | XNAME_KEY = XNAMEF_REST % LNAME_KEY
37 | XNAME_LIST = XNAMEF_REST % LNAME_LIST
38 |
39 | # Some responses don't use namespaces (eg: search/parse) so we look for
40 | # both the extended and local versions of the following names.
41 |
42 | def isdict(name):
43 | return name == XNAME_DICT or name == LNAME_DICT
44 |
45 | def isitem(name):
46 | return name == XNAME_ITEM or name == LNAME_ITEM
47 |
48 | def iskey(name):
49 | return name == XNAME_KEY or name == LNAME_KEY
50 |
51 | def islist(name):
52 | return name == XNAME_LIST or name == LNAME_LIST
53 |
54 | def hasattrs(element):
55 | return len(element.attrib) > 0
56 |
57 | def localname(xname):
58 | rcurly = xname.find('}')
59 | return xname if rcurly == -1 else xname[rcurly+1:]
60 |
61 | def load(text, match=None):
62 | """This function reads a string that contains the XML of an Atom Feed, then
63 | returns the
64 | data in a native Python structure (a ``dict`` or ``list``). If you also
65 | provide a tag name or path to match, only the matching sub-elements are
66 | loaded.
67 |
68 | :param text: The XML text to load.
69 | :type text: ``string``
70 | :param match: A tag name or path to match (optional).
71 | :type match: ``string``
72 | """
73 | if text is None: return None
74 | text = text.strip()
75 | if len(text) == 0: return None
76 | nametable = {
77 | 'namespaces': [],
78 | 'names': {}
79 | }
80 |
81 | # Convert to unicode encoding in only python 2 for xml parser
82 | if(sys.version_info < (3, 0, 0) and isinstance(text, unicode)):
83 | text = text.encode('utf-8')
84 |
85 | root = XML(text)
86 | items = [root] if match is None else root.findall(match)
87 | count = len(items)
88 | if count == 0:
89 | return None
90 | elif count == 1:
91 | return load_root(items[0], nametable)
92 | else:
93 | return [load_root(item, nametable) for item in items]
94 |
95 | # Load the attributes of the given element.
96 | def load_attrs(element):
97 | if not hasattrs(element): return None
98 | attrs = record()
99 | for key, value in six.iteritems(element.attrib):
100 | attrs[key] = value
101 | return attrs
102 |
103 | # Parse a <dict> element and return a Python dict
104 | def load_dict(element, nametable = None):
105 | value = record()
106 | children = list(element)
107 | for child in children:
108 | assert iskey(child.tag)
109 | name = child.attrib["name"]
110 | value[name] = load_value(child, nametable)
111 | return value
112 |
113 | # Load the given element's attrs & value into a single merged dict.
114 | def load_elem(element, nametable=None):
115 | name = localname(element.tag)
116 | attrs = load_attrs(element)
117 | value = load_value(element, nametable)
118 | if attrs is None: return name, value
119 | if value is None: return name, attrs
120 | # If value is simple, merge into attrs dict using special key
121 | if isinstance(value, six.string_types):
122 | attrs["$text"] = value
123 | return name, attrs
124 | # Both attrs & value are complex, so merge the two dicts, resolving collisions.
125 | collision_keys = []
126 | for key, val in six.iteritems(attrs):
127 | if key in value and key in collision_keys:
128 | value[key].append(val)
129 | elif key in value and key not in collision_keys:
130 | value[key] = [value[key], val]
131 | collision_keys.append(key)
132 | else:
133 | value[key] = val
134 | return name, value
135 |
136 | # Parse a <list> element and return a Python list
137 | def load_list(element, nametable=None):
138 | assert islist(element.tag)
139 | value = []
140 | children = list(element)
141 | for child in children:
142 | assert isitem(child.tag)
143 | value.append(load_value(child, nametable))
144 | return value
145 |
146 | # Load the given root element.
147 | def load_root(element, nametable=None):
148 | tag = element.tag
149 | if isdict(tag): return load_dict(element, nametable)
150 | if islist(tag): return load_list(element, nametable)
151 | k, v = load_elem(element, nametable)
152 | return Record.fromkv(k, v)
153 |
154 | # Load the children of the given element.
155 | def load_value(element, nametable=None):
156 | children = list(element)
157 | count = len(children)
158 |
159 | # No children, assume a simple text value
160 | if count == 0:
161 | text = element.text
162 | if text is None:
163 | return None
164 | text = text.strip()
165 | if len(text) == 0:
166 | return None
167 | return text
168 |
169 | # Look for the special case of a single well-known structure
170 | if count == 1:
171 | child = children[0]
172 | tag = child.tag
173 | if isdict(tag): return load_dict(child, nametable)
174 | if islist(tag): return load_list(child, nametable)
175 |
176 | value = record()
177 | for child in children:
178 | name, item = load_elem(child, nametable)
179 | # If we have seen this name before, promote the value to a list
180 | if name in value:
181 | current = value[name]
182 | if not isinstance(current, list):
183 | value[name] = [current]
184 | value[name].append(item)
185 | else:
186 | value[name] = item
187 |
188 | return value
189 |
190 | # A generic utility that enables "dot" access to dicts
191 | class Record(dict):
192 | """This generic utility class enables dot access to members of a Python
193 | dictionary.
194 |
195 | Any key that is also a valid Python identifier can be retrieved as a field.
196 | So, for an instance of ``Record`` called ``r``, ``r.key`` is equivalent to
197 | ``r['key']``. A key such as ``invalid-key`` or ``invalid.key`` cannot be
198 | retrieved as a field, because ``-`` and ``.`` are not allowed in
199 | identifiers.
200 |
201 | Keys of the form ``a.b.c`` are very natural to write in Python as fields. If
202 | a group of keys shares a prefix ending in ``.``, you can retrieve keys as a
203 | nested dictionary by calling only the prefix. For example, if ``r`` contains
204 | keys ``'foo'``, ``'bar.baz'``, and ``'bar.qux'``, ``r.bar`` returns a record
205 | with the keys ``baz`` and ``qux``. If a key contains multiple ``.``, each
206 | one is placed into a nested dictionary, so you can write ``r.bar.qux`` or
207 | ``r['bar.qux']`` interchangeably.
208 | """
209 | sep = '.'
210 |
211 | def __call__(self, *args):
212 | if len(args) == 0: return self
213 | return Record((key, self[key]) for key in args)
214 |
215 | def __getattr__(self, name):
216 | try:
217 | return self[name]
218 | except KeyError:
219 | raise AttributeError(name)
220 |
221 | def __delattr__(self, name):
222 | del self[name]
223 |
224 | def __setattr__(self, name, value):
225 | self[name] = value
226 |
227 | @staticmethod
228 | def fromkv(k, v):
229 | result = record()
230 | result[k] = v
231 | return result
232 |
233 | def __getitem__(self, key):
234 | if key in self:
235 | return dict.__getitem__(self, key)
236 | key += self.sep
237 | result = record()
238 | for k,v in six.iteritems(self):
239 | if not k.startswith(key):
240 | continue
241 | suffix = k[len(key):]
242 | if '.' in suffix:
243 | ks = suffix.split(self.sep)
244 | z = result
245 | for x in ks[:-1]:
246 | if x not in z:
247 | z[x] = record()
248 | z = z[x]
249 | z[ks[-1]] = v
250 | else:
251 | result[suffix] = v
252 | if len(result) == 0:
253 | raise KeyError("No key or prefix: %s" % key)
254 | return result
255 |
256 |
257 | def record(value=None):
258 | """This function returns a :class:`Record` instance constructed with an
259 | initial value that you provide.
260 |
261 | :param `value`: An initial record value.
262 | :type `value`: ``dict``
263 | """
264 | if value is None: value = {}
265 | return Record(value)
266 |
267 |
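
The conversion performed by ``load`` and the prefix lookup described in the ``Record`` docstring can be sketched without importing ``splunklib``. The block below is a simplified, stand-alone illustration, not the module's implementation: ``convert``, ``convert_value``, ``prefix_view`` and the ``SAMPLE`` payload are hypothetical names introduced here, and the sketch only handles ``dict``/``list``/``key``/``item`` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample payload using the REST namespace seen in data.py.
SAMPLE = """
<s:dict xmlns:s="http://dev.splunk.com/ns/rest">
  <s:key name="name">mail</s:key>
  <s:key name="ports">
    <s:list><s:item>993</s:item><s:item>995</s:item></s:list>
  </s:key>
  <s:key name="empty"></s:key>
</s:dict>
"""


def local(tag):
    """Drop a leading '{namespace}' from an element tag."""
    return tag.rsplit("}", 1)[-1]


def convert(element):
    """Turn a <dict> or <list> element into a Python dict or list."""
    kind = local(element.tag)
    if kind == "dict":
        return {key.attrib["name"]: convert_value(key) for key in element}
    if kind == "list":
        return [convert_value(item) for item in element]
    raise ValueError("this sketch only handles <dict> and <list>")


def convert_value(element):
    """Return stripped text for a leaf, or recurse into a single child."""
    children = list(element)
    if not children:
        text = (element.text or "").strip()
        return text or None
    if len(children) == 1:
        return convert(children[0])
    raise ValueError("this sketch does not merge arbitrary child elements")


def prefix_view(mapping, prefix, sep="."):
    """Collect keys starting with prefix + sep into a nested dict."""
    head = prefix + sep
    nested = {}
    for key, value in mapping.items():
        if not key.startswith(head):
            continue
        parts = key[len(head):].split(sep)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


result = convert(ET.fromstring(SAMPLE.strip()))
view = prefix_view({"foo": 1, "bar.baz": 2, "bar.qux.x": 3}, "bar")
print(result)
print(view)
```

In the real module the nested result is a ``Record``, so the same lookup is written ``r.bar`` or ``r['bar']``; the plain ``dict`` here only keeps the sketch independent of the library.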
--------------------------------------------------------------------------------
/lib/splunklib/modularinput/__init__.py:
--------------------------------------------------------------------------------
1 | """The following imports allow these classes to be imported via
2 | the splunklib.modularinput package like so:
3 |
4 | from splunklib.modularinput import *
5 | """
6 | from .argument import Argument
7 | from .event import Event
8 | from .event_writer import EventWriter
9 | from .input_definition import InputDefinition
10 | from .scheme import Scheme
11 | from .script import Script
12 | from .validation_definition import ValidationDefinition
13 |
--------------------------------------------------------------------------------
/lib/splunklib/modularinput/argument.py:
--------------------------------------------------------------------------------
1 | # Copyright 2011-2015 Splunk, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License"): you may
4 | # not use this file except in compliance with the License. You may obtain
5 | # a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
12 | # License for the specific language governing permissions and limitations
13 | # under the License.
14 |
15 | from __future__ import absolute_import
16 | try:
17 | import xml.etree.ElementTree as ET
18 | except ImportError:
19 | import xml.etree.cElementTree as ET
20 |
21 | class Argument(object):
22 | """Class representing an argument to a modular input kind.
23 |
24 | ``Argument`` is meant to be used with ``Scheme`` to generate an XML
25 | definition of the modular input kind that Splunk understands.
26 |
27 | ``name`` is the only required parameter for the constructor.
28 |
29 |     **Example with the fewest parameters**::
30 |
31 | arg1 = Argument(name="arg1")
32 |
33 | **Example with all parameters**::
34 |
35 | arg2 = Argument(
36 | name="arg2",
37 | description="This is an argument with lots of parameters",
38 | validation="is_pos_int('some_name')",
39 | data_type=Argument.data_type_number,
40 | required_on_edit=True,
41 | required_on_create=True
42 | )
43 | """
44 |
45 | # Constant values, do not change.
46 | # These should be used for setting the value of an Argument object's data_type field.
47 | data_type_boolean = "BOOLEAN"
48 | data_type_number = "NUMBER"
49 | data_type_string = "STRING"
50 |
51 | def __init__(self, name, description=None, validation=None,
52 | data_type=data_type_string, required_on_edit=False, required_on_create=False, title=None):
53 | """
54 | :param name: ``string``, identifier for this argument in Splunk.
55 | :param description: ``string``, human-readable description of the argument.
56 | :param validation: ``string`` specifying how the argument should be validated, if using internal validation.
57 | If using external validation, this will be ignored.
58 |         :param data_type: ``string``, data type of this field; use one of the class constants
59 |             ``data_type_boolean``, ``data_type_number``, or ``data_type_string``.
60 | :param required_on_edit: ``Boolean``, whether this arg is required when editing an existing modular input of this kind.
61 | :param required_on_create: ``Boolean``, whether this arg is required when creating a modular input of this kind.
62 | :param title: ``String``, a human-readable title for the argument.
63 | """
64 | self.name = name
65 | self.description = description
66 | self.validation = validation
67 | self.data_type = data_type
68 | self.required_on_edit = required_on_edit
69 | self.required_on_create = required_on_create
70 | self.title = title
71 |
72 | def add_to_document(self, parent):
73 | """Adds an ``Argument`` object to this ElementTree document.
74 |
75 |         Adds an ``<arg>`` subelement to the parent element, typically ``<args>``,
76 |         and sets up its subelements with their respective text.
77 |
78 | :param parent: An ``ET.Element`` to be the parent of a new subelement
79 | :returns: An ``ET.Element`` object representing this argument.
80 | """
81 | arg = ET.SubElement(parent, "arg")
82 | arg.set("name", self.name)
83 |
84 | if self.title is not None:
85 | ET.SubElement(arg, "title").text = self.title
86 |
87 | if self.description is not None:
88 | ET.SubElement(arg, "description").text = self.description
89 |
90 | if self.validation is not None:
91 | ET.SubElement(arg, "validation").text = self.validation
92 |
93 | # add all other subelements to this Argument, represented by (tag, text)
94 | subelements = [
95 | ("data_type", self.data_type),
96 | ("required_on_edit", self.required_on_edit),
97 | ("required_on_create", self.required_on_create)
98 | ]
99 |
100 | for name, value in subelements:
101 | ET.SubElement(arg, name).text = str(value).lower()
102 |
103 | return arg
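
The XML that ``add_to_document`` builds can be sketched with the standard library alone. This is a stand-alone approximation under stated assumptions, not the SDK's code: ``argument_xml`` is a hypothetical helper, and the ``<args>`` parent element is assumed here to stand in for whatever element the caller passes as ``parent``.

```python
import xml.etree.ElementTree as ET


def argument_xml(name, data_type="STRING", required_on_edit=False,
                 required_on_create=False, title=None, description=None):
    """Hypothetical helper: build <args><arg>...</arg></args> and return text."""
    args = ET.Element("args")  # assumed parent element
    arg = ET.SubElement(args, "arg")
    arg.set("name", name)

    # Optional subelements are only emitted when a value is provided.
    if title is not None:
        ET.SubElement(arg, "title").text = title
    if description is not None:
        ET.SubElement(arg, "description").text = description

    # Remaining subelements are always emitted, lower-cased as text.
    for tag, value in (("data_type", data_type),
                       ("required_on_edit", required_on_edit),
                       ("required_on_create", required_on_create)):
        ET.SubElement(arg, tag).text = str(value).lower()

    return ET.tostring(args, encoding="unicode")


xml_text = argument_xml("arg1", title="First argument")
print(xml_text)
```

Note that, as in the listing above, booleans and the data type constant are serialized through ``str(value).lower()``, so ``False`` becomes ``false`` and ``STRING`` becomes ``string``.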
--------------------------------------------------------------------------------
/lib/splunklib/modularinput/event.py:
--------------------------------------------------------------------------------
1 | # Copyright 2011-2015 Splunk, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License"): you may
4 | # not use this file except in compliance with the License. You may obtain
5 | # a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
12 | # License for the specific language governing permissions and limitations
13 | # under the License.
14 |
15 | from __future__ import absolute_import
16 | from io import TextIOBase
17 | from splunklib.six import ensure_text
18 |
19 | try:
20 | import xml.etree.cElementTree as ET
21 | except ImportError:
22 | import xml.etree.ElementTree as ET
23 |
24 | class Event(object):
25 | """Represents an event or fragment of an event to be written by this modular input to Splunk.
26 |
27 |     To write an event to a stream, call the ``write_to`` method, passing in a stream.
28 | """
29 | def __init__(self, data=None, stanza=None, time=None, host=None, index=None, source=None,
30 | sourcetype=None, done=True, unbroken=True):
31 |         """There are no required parameters for constructing an ``Event``.
32 |
33 | **Example with minimal configuration**::
34 |
35 | my_event = Event(
36 | data="This is a test of my new event.",
37 | stanza="myStanzaName",
38 | time="%.3f" % 1372187084.000
39 | )
40 |
41 | **Example with full configuration**::
42 |
43 | excellent_event = Event(
44 | data="This is a test of my excellent event.",
45 | stanza="excellenceOnly",
46 | time="%.3f" % 1372274622.493,
47 | host="localhost",
48 | index="main",
49 | source="Splunk",
50 | sourcetype="misc",
51 | done=True,
52 | unbroken=True
53 | )
54 |
55 | :param data: ``string``, the event's text.
56 | :param stanza: ``string``, name of the input this event should be sent to.
57 | :param time: ``float``, time in seconds, including up to 3 decimal places to represent milliseconds.
58 | :param host: ``string``, the event's host, ex: localhost.
59 | :param index: ``string``, the index this event is specified to write to, or None if default index.
60 | :param source: ``string``, the source of this event, or None to have Splunk guess.
61 | :param sourcetype: ``string``, source type currently set on this event, or None to have Splunk guess.
62 | :param done: ``boolean``, is this a complete ``Event``? False if an ``Event`` fragment.
63 | :param unbroken: ``boolean``, Is this event completely encapsulated in this ``Event`` object?
64 | """
65 | self.data = data
66 | self.done = done
67 | self.host = host
68 | self.index = index
69 | self.source = source
70 | self.sourceType = sourcetype
71 | self.stanza = stanza
72 | self.time = time
73 | self.unbroken = unbroken
74 |
75 | def write_to(self, stream):
76 | """Write an XML representation of self, an ``Event`` object, to the given stream.
77 |
78 | The ``Event`` object will only be written if its data field is defined,
79 | otherwise a ``ValueError`` is raised.
80 |
81 | :param stream: stream to write XML to.
82 | """
83 | if self.data is None:
84 | raise ValueError("Events must have at least the data field set to be written to XML.")
85 |
86 | event = ET.Element("event")
87 | if self.stanza is not None:
88 | event.set("stanza", self.stanza)
89 | event.set("unbroken", str(int(self.unbroken)))
90 |
91 | # if a time isn't set, let Splunk guess by not creating a