├── .github └── workflows │ └── ci.yml ├── .gitignore ├── CHANGES.md ├── LICENSE ├── MANIFEST.in ├── README.md ├── docs ├── .gitignore ├── Makefile └── source │ ├── CHANGES.md │ ├── conf.py │ └── index.rst ├── example_syslog_server.py ├── requirements-tests.txt ├── requirements.txt ├── setup.py ├── syslog_rfc5424_parser ├── __init__.py ├── constants.py ├── message.py └── parser.py ├── tests ├── .gitignore ├── __init__.py ├── test_message_parser.py └── test_parser.py └── tox.ini /.github/workflows/ci.yml: -------------------------------------------------------------------------------- 1 | name: 'CI' 2 | 3 | on: 4 | push: 5 | branches: [ master ] 6 | pull_request: 7 | 8 | jobs: 9 | flake8: 10 | runs-on: ubuntu-latest 11 | steps: 12 | - uses: actions/checkout@v2 13 | - name: set up python 14 | uses: actions/setup-python@v2 15 | with: 16 | python-version: 3.8 17 | - name: install flake8 18 | run: "python -m pip install flake8" 19 | - name: lint with flake8 20 | run: flake8 syslog_rfc5424_parser/ tests/ 21 | run-tests: 22 | runs-on: ubuntu-latest 23 | strategy: 24 | matrix: 25 | pythonversion: ['3.3', '3.4', '3.5', '3.6', '3.7', '3.8', '3.9', 'pypy-3.6'] 26 | steps: 27 | - uses: actions/checkout@v2 28 | - name: set up python 29 | uses: actions/setup-python@v2 30 | with: 31 | python-version: ${{ matrix.pythonversion }} 32 | - name: install dependencies 33 | run: "python -m pip install -r requirements-tests.txt -e ." 
34 | - name: test with pytest 35 | run: pytest --cov=syslog_rfc5424_parser --cov-report=term-missing --cov-fail-under=90 tests/ 36 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | __pycache__/ 3 | 4 | build/ 5 | dist/ 6 | sdist/ 7 | *.egg-info/ 8 | *.egg_info/ 9 | .coverage 10 | .tox 11 | .cache 12 | .python-version 13 | venv/ 14 | -------------------------------------------------------------------------------- /CHANGES.md: -------------------------------------------------------------------------------- 1 | NEXT 2 | ---- 3 | - Drop support for Python 2.x 4 | - Switch CI from Travis CI to Github Actions 5 | 6 | 0.3.2 7 | ---- 8 | - Fix `DeprecationWarning` (thanks again to @pvinci) 9 | - Drop Python 3.3 from officially-supported list of interpreters 10 | - Add Python 3.7 to interpreter list 11 | 12 | 0.3.1 13 | ----- 14 | - Only install `enum34` as a dependency on older version of Python (thanks Github user @pvinci) 15 | 16 | 0.3.0 17 | ----- 18 | - Switch from PyParsing to Lark for an almost 3x speedup 19 | - No changes to user-facing API 20 | - Add documentation via ReadTheDocs 21 | 22 | 0.2.0 23 | ----- 24 | - Allow message bodies to contain newlines. If you want to split on newlines, do it yourself up-front. 
Reported by 25 | GitHub user @tfogwill 26 | 27 | 0.1.6 28 | ----- 29 | - Require `pyparsing` 2.3 or above 30 | 31 | 0.1.5 32 | ----- 33 | - Pin `pyparsing` to less than version 2.3 until I make this work with the new API for grouping (reported by Github user @tfogwill) 34 | 35 | 0.1.4 36 | ----- 37 | - Properly handle messages with SD pairs that have an empty value (reported by Github user @eyalleshem) 38 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2016, EasyPost 2 | 3 | Permission to use, copy, modify, and/or distribute this software for any 4 | purpose with or without fee is hereby granted, provided that the above 5 | copyright notice and this permission notice appear in all copies. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES 8 | WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF 9 | MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 10 | SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES 11 | WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION 12 | OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN 13 | CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 14 | 15 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include *.md 2 | include LICENSE 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | This module implements an [RFC 5424](https://tools.ietf.org/html/rfc5424) IETF Syslog Protocol parser in Python, using the [lark](https://github.com/lark-parser/lark) parser-generator. It should work on Python 3.3+. 
2 | 3 | ![CI](https://github.com/EasyPost/syslog-rfc5424-parser/workflows/CI/badge.svg) 4 | [![PyPI version](https://badge.fury.io/py/syslog-rfc5424-parser.svg)](https://badge.fury.io/py/syslog-rfc5424-parser) 5 | [![Documentation Status](https://readthedocs.org/projects/syslog-rfc5424-parser/badge/?version=latest)](https://syslog-rfc5424-parser.readthedocs.io/en/latest/?badge=latest) 6 | 7 | The file [example_syslog_server.py](example_syslog_server.py) contains a fully-functional Syslog server which will receive messages on a UNIX domain socket and print them to stdout as JSON blobs. 8 | 9 | ### A word on performance 10 | On a fairly modern system (Xeon E3-1270v3), it takes about 230µs to parse a single syslog message and construct a SyslogMessage object (which is to say, you should be able to parse about 4300 per second with a single-threaded process). Are you really in that much of a rush, anyway? 11 | 12 | If you're interested in a faster, non-Python alternative, you may also enjoy 13 | [rust-syslog-rfc5424](https://github.com/Roguelazer/rust-syslog-rfc5424). 14 | -------------------------------------------------------------------------------- /docs/.gitignore: -------------------------------------------------------------------------------- 1 | ./build/ 2 | -------------------------------------------------------------------------------- /docs/Makefile: -------------------------------------------------------------------------------- 1 | # Minimal makefile for Sphinx documentation 2 | # 3 | 4 | # You can set these variables from the command line. 5 | SPHINXOPTS = 6 | SPHINXBUILD = sphinx-build 7 | SOURCEDIR = source 8 | BUILDDIR = build 9 | 10 | # Put it first so that "make" without argument is like "make help". 11 | help: 12 | @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) 13 | 14 | .PHONY: help Makefile 15 | 16 | # Catch-all target: route all unknown targets to Sphinx using the new 17 | # "make mode" option. 
$(O) is meant as a shortcut for $(SPHINXOPTS). 18 | %: Makefile 19 | @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) -------------------------------------------------------------------------------- /docs/source/CHANGES.md: -------------------------------------------------------------------------------- 1 | ../../CHANGES.md -------------------------------------------------------------------------------- /docs/source/conf.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Configuration file for the Sphinx documentation builder. 4 | # 5 | # This file does only contain a selection of the most common options. For a 6 | # full list see the documentation: 7 | # http://www.sphinx-doc.org/en/master/config 8 | 9 | # -- Path setup -------------------------------------------------------------- 10 | 11 | # If extensions (or modules to document with autodoc) are in another directory, 12 | # add these directories to sys.path here. If the directory is relative to the 13 | # documentation root, use os.path.abspath to make it absolute, like shown here. 14 | # 15 | # import os 16 | # import sys 17 | # sys.path.insert(0, os.path.abspath('.')) 18 | 19 | 20 | # -- Project information ----------------------------------------------------- 21 | 22 | project = u'syslog-rfc5424-parser' 23 | copyright = u'2016 - 2019, EasyPost' 24 | author = u'EasyPost' 25 | 26 | import syslog_rfc5424_parser 27 | version = syslog_rfc5424_parser.__version__ 28 | # The full version, including alpha/beta/rc tags 29 | release = syslog_rfc5424_parser.__version__ 30 | 31 | 32 | # -- General configuration --------------------------------------------------- 33 | 34 | # If your documentation needs a minimal Sphinx version, state it here. 35 | # 36 | # needs_sphinx = '1.0' 37 | 38 | # Add any Sphinx extension module names here, as strings. 
They can be 39 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom 40 | # ones. 41 | extensions = [ 42 | 'sphinx.ext.autodoc', 43 | 'sphinx.ext.intersphinx', 44 | 'sphinx.ext.imgmath', 45 | 'sphinx.ext.viewcode', 46 | ] 47 | 48 | # Add any paths that contain templates here, relative to this directory. 49 | templates_path = ['_templates'] 50 | 51 | from recommonmark.parser import CommonMarkParser 52 | 53 | source_parsers = { 54 | '.md': CommonMarkParser, 55 | } 56 | 57 | source_suffix = ['.rst', '.md'] 58 | 59 | # The master toctree document. 60 | master_doc = 'index' 61 | 62 | # The language for content autogenerated by Sphinx. Refer to documentation 63 | # for a list of supported languages. 64 | # 65 | # This is also used if you do content translation via gettext catalogs. 66 | # Usually you set "language" from the command line for these cases. 67 | language = None 68 | 69 | # List of patterns, relative to source directory, that match files and 70 | # directories to ignore when looking for source files. 71 | # This pattern also affects html_static_path and html_extra_path. 72 | exclude_patterns = [] 73 | 74 | # The name of the Pygments (syntax highlighting) style to use. 75 | pygments_style = None 76 | 77 | 78 | # -- Options for HTML output ------------------------------------------------- 79 | 80 | # The theme to use for HTML and HTML Help pages. See the documentation for 81 | # a list of builtin themes. 82 | # 83 | html_theme = 'alabaster' 84 | 85 | # Theme options are theme-specific and customize the look and feel of a theme 86 | # further. For a list of options available for each theme, see the 87 | # documentation. 88 | # 89 | # html_theme_options = {} 90 | 91 | # Add any paths that contain custom static files (such as style sheets) here, 92 | # relative to this directory. They are copied after the builtin static files, 93 | # so a file named "default.css" will overwrite the builtin "default.css". 
94 | html_static_path = ['_static'] 95 | 96 | # Custom sidebar templates, must be a dictionary that maps document names 97 | # to template names. 98 | # 99 | # The default sidebars (for documents that don't match any pattern) are 100 | # defined by theme itself. Builtin themes are using these templates by 101 | # default: ``['localtoc.html', 'relations.html', 'sourcelink.html', 102 | # 'searchbox.html']``. 103 | # 104 | # html_sidebars = {} 105 | 106 | 107 | # -- Options for HTMLHelp output --------------------------------------------- 108 | 109 | # Output file base name for HTML help builder. 110 | htmlhelp_basename = 'syslog-rfc5424-parserdoc' 111 | 112 | 113 | # -- Options for LaTeX output ------------------------------------------------ 114 | 115 | latex_elements = { 116 | # The paper size ('letterpaper' or 'a4paper'). 117 | # 118 | # 'papersize': 'letterpaper', 119 | 120 | # The font size ('10pt', '11pt' or '12pt'). 121 | # 122 | # 'pointsize': '10pt', 123 | 124 | # Additional stuff for the LaTeX preamble. 125 | # 126 | # 'preamble': '', 127 | 128 | # Latex figure (float) alignment 129 | # 130 | # 'figure_align': 'htbp', 131 | } 132 | 133 | # Grouping the document tree into LaTeX files. List of tuples 134 | # (source start file, target name, title, 135 | # author, documentclass [howto, manual, or own class]). 136 | latex_documents = [ 137 | (master_doc, 'syslog-rfc5424-parser.tex', u'syslog-rfc5424-parser Documentation', 138 | u'EasyPost', 'manual'), 139 | ] 140 | 141 | 142 | # -- Options for manual page output ------------------------------------------ 143 | 144 | # One entry per manual page. List of tuples 145 | # (source start file, name, description, authors, manual section). 
146 | man_pages = [ 147 | (master_doc, 'syslog-rfc5424-parser', u'syslog-rfc5424-parser Documentation', 148 | [author], 1) 149 | ] 150 | 151 | 152 | # -- Options for Texinfo output ---------------------------------------------- 153 | 154 | # Grouping the document tree into Texinfo files. List of tuples 155 | # (source start file, target name, title, author, 156 | # dir menu entry, description, category) 157 | texinfo_documents = [ 158 | (master_doc, 'syslog-rfc5424-parser', u'syslog-rfc5424-parser Documentation', 159 | author, 'syslog-rfc5424-parser', 'One line description of project.', 160 | 'Miscellaneous'), 161 | ] 162 | 163 | 164 | # -- Options for Epub output ------------------------------------------------- 165 | 166 | # Bibliographic Dublin Core info. 167 | epub_title = project 168 | 169 | # The unique identifier of the text. This can be a ISBN number 170 | # or the project homepage. 171 | # 172 | # epub_identifier = '' 173 | 174 | # A unique identification for the text. 175 | # 176 | # epub_uid = '' 177 | 178 | # A list of files that should not be packed into the epub file. 179 | epub_exclude_files = ['search.html'] 180 | 181 | 182 | # -- Extension configuration ------------------------------------------------- 183 | 184 | # -- Options for intersphinx extension --------------------------------------- 185 | 186 | # Example configuration for intersphinx: refer to the Python standard library. 187 | intersphinx_mapping = {'https://docs.python.org/': None} 188 | -------------------------------------------------------------------------------- /docs/source/index.rst: -------------------------------------------------------------------------------- 1 | .. syslog-rfc5424-parser documentation master file, created by 2 | sphinx-quickstart on Tue Jan 22 14:37:38 2019. 3 | You can adapt this file completely to your liking, but it should at least 4 | contain the root `toctree` directive. 
5 | 6 | This module implements an `RFC 5424 <https://tools.ietf.org/html/rfc5424>`_ IETF 7 | Syslog Protocol parser in Python, using the `lark <https://github.com/lark-parser/lark>`_ 8 | parser-generator. It should work on Python 2.7 or Python 3.3+. 9 | 10 | This work is available under the terms of the ISC License. 11 | 12 | Members 13 | ------- 14 | 15 | .. autoclass:: syslog_rfc5424_parser.SyslogMessage 16 | :members: 17 | :undoc-members: 18 | 19 | .. autoclass:: syslog_rfc5424_parser.ParseError 20 | :members: 21 | 22 | ChangeLog 23 | --------- 24 | 25 | .. toctree:: 26 | :maxdepth: 1 27 | 28 | CHANGES.md 29 | 30 | Indices and tables 31 | ================== 32 | 33 | * :ref:`genindex` 34 | * :ref:`modindex` 35 | * :ref:`search` 36 | 37 | -------------------------------------------------------------------------------- /example_syslog_server.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | from __future__ import print_function 4 | 5 | import argparse 6 | import socket 7 | import os 8 | import sys 9 | import json 10 | 11 | from syslog_rfc5424_parser import SyslogMessage, ParseError 12 | 13 | 14 | def main(): 15 | parser = argparse.ArgumentParser() 16 | parser.add_argument('-B', '--bind-path', required=True, 17 | help='Path at which to bind a Datagram-mode UNIX domain socket') 18 | args = parser.parse_args() 19 | 20 | s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) 21 | temp_name = args.bind_path + '.' + str(os.getpid()) 22 | s.bind(temp_name) 23 | os.rename(temp_name, args.bind_path) 24 | 25 | while True: 26 | message = s.recv(4096) 27 | # Technically, messages are only UTF-8 if they have a BOM; otherwise they're binary. However, I'm not 28 | # aware of any Syslog servers that handle that.
*shrug* 29 | message = message.decode('utf-8') 30 | try: 31 | message = SyslogMessage.parse(message) 32 | print(json.dumps(message.as_dict())) 33 | except ParseError as e: 34 | print(e, file=sys.stderr) 35 | 36 | 37 | if __name__ == '__main__': 38 | sys.exit(main()) 39 | -------------------------------------------------------------------------------- /requirements-tests.txt: -------------------------------------------------------------------------------- 1 | pytest==3.* 2 | pytest-cov==2.5.* 3 | flake8 4 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | lark-parser==0.6.* 2 | enum34;python_version<="3.3" 3 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import sys 2 | 3 | from setuptools import setup, find_packages 4 | 5 | 6 | def install_requires(): 7 | if sys.version_info >= (3, 4): 8 | return ['lark-parser==0.6.*'] 9 | else: 10 | return ['lark-parser==0.6.*', 'enum34'] 11 | 12 | setup( 13 | name="syslog-rfc5424-parser", 14 | version="0.3.2", 15 | author="James Brown", 16 | author_email="jbrown@easypost.com", 17 | url="https://github.com/easypost/syslog-rfc5424-parser", 18 | description="Parser for RFC5424-compatible Syslog messages", 19 | long_description=open('README.md', 'r').read(), 20 | long_description_content_type='text/markdown', 21 | license="ISC", 22 | install_requires=install_requires(), 23 | project_urls={ 24 | 'Issue Tracker': 'https://github.com/easypost/syslog-rfc5424-parser/issues', 25 | 'Documentation': 'https://syslog-rfc5424-parser.readthedocs.io/en/latest/', 26 | }, 27 | packages=find_packages(exclude=['tests']), 28 | classifiers=[ 29 | "Development Status :: 4 - Beta", 30 | "Environment :: Console", 31 | "Programming Language :: Python", 32 | "Programming Language :: Python :: 
3.6", 33 | "Programming Language :: Python :: 3.7", 34 | "Programming Language :: Python :: 3.8", 35 | "Programming Language :: Python :: 3.9", 36 | "Intended Audience :: Developers", 37 | "Operating System :: OS Independent", 38 | "License :: OSI Approved :: ISC License (ISCL)", 39 | ] 40 | ) 41 | -------------------------------------------------------------------------------- /syslog_rfc5424_parser/__init__.py: -------------------------------------------------------------------------------- 1 | from .message import SyslogMessage, ParseError 2 | 3 | version_info = (0, 3, 2) 4 | __version__ = '.'.join(map(str, version_info)) 5 | __author__ = 'James Brown ' 6 | 7 | __all__ = ['SyslogMessage', 'ParseError', 'version_info', '__version__', '__author__'] 8 | -------------------------------------------------------------------------------- /syslog_rfc5424_parser/constants.py: -------------------------------------------------------------------------------- 1 | from enum import IntEnum 2 | 3 | 4 | class SyslogFacility(IntEnum): 5 | kern = 0 6 | user = 1 7 | mail = 2 8 | daemon = 3 9 | auth = 4 10 | syslog = 5 11 | lpr = 6 12 | news = 7 13 | uucp = 8 14 | cron = 9 15 | authpriv = 10 16 | ftp = 11 17 | ntp = 12 18 | audit = 13 19 | alert = 14 20 | clockd = 15 21 | local0 = 16 22 | local1 = 17 23 | local2 = 18 24 | local3 = 19 25 | local4 = 20 26 | local5 = 21 27 | local6 = 22 28 | local7 = 23 29 | unknown = -1 30 | 31 | 32 | class SyslogSeverity(IntEnum): 33 | emerg = 0 34 | alert = 1 35 | crit = 2 36 | err = 3 37 | warning = 4 38 | notice = 5 39 | info = 6 40 | debug = 7 41 | -------------------------------------------------------------------------------- /syslog_rfc5424_parser/message.py: -------------------------------------------------------------------------------- 1 | import time 2 | 3 | import lark 4 | 5 | from . 
import parser 6 | from .constants import SyslogSeverity, SyslogFacility 7 | 8 | 9 | class ParseError(Exception): 10 | def __init__(self, description, message): 11 | self.description = description 12 | self.message = message 13 | 14 | def __repr__(self): 15 | return '{0}({1!r}, {2!r})'.format(self.__class__.__name__, self.description, self.message) # pragma: no cover 16 | 17 | def __str__(self): 18 | return '{0}: {1!r}'.format(self.description, self.message) # pragma: no cover 19 | 20 | 21 | class SyslogMessage(object): 22 | """Representation of a single RFC5424-format syslog message.""" 23 | 24 | __slots__ = ['severity', 'facility', 'version', 'timestamp', 'hostname', 'appname', 'procid', 'msgid', 'sd', 'msg'] 25 | 26 | def __init__(self, severity, facility, version=1, timestamp='-', hostname='-', appname='-', procid=None, 27 | msgid=None, sd='-', msg=None): 28 | """Initialize a syslog message (defaults correspond to the minimal default in the RFC)""" 29 | # I wish Python had initializer lists 30 | self.severity = severity 31 | self.facility = facility 32 | self.version = version 33 | self.timestamp = timestamp 34 | self.hostname = hostname 35 | self.appname = appname 36 | self.procid = procid 37 | self.msgid = msgid 38 | if sd == '-': 39 | self.sd = {} 40 | else: 41 | self.sd = sd 42 | self.msg = msg 43 | 44 | def __str__(self): 45 | """Return this object represented as appropriate for the wire""" 46 | if self.facility == SyslogFacility.unknown: 47 | raise ValueError('Cannot dump a SyslogMessage with unknown facility') 48 | pri = int(self.facility) * 8 + int(self.severity) 49 | if isinstance(self.timestamp, (int, float)): 50 | timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(self.timestamp)) 51 | else: 52 | timestamp = self.timestamp 53 | sd = [] 54 | for sd_id, sd_params in self.sd.items(): 55 | sd_data = [] 56 | for k, v in sd_params.items(): 57 | sd_data.append(' {k}="{v}"'.format(k=k, v=v)) 58 | sd.append('[{sd_id}{sd_data}]'.format(sd_id=sd_id, 
sd_data=''.join(sd_data))) 59 | if sd: 60 | sd = ''.join(sd) 61 | else: 62 | sd = '-' 63 | if self.msg: 64 | rest = ' {msg}'.format(msg=self.msg) 65 | else: 66 | rest = '' 67 | return '<{pri}>{version} {timestamp} {hostname} {appname} {procid} {msgid} {sd}{rest}'.format( 68 | pri=pri, version=self.version, timestamp=timestamp, hostname=self.hostname, 69 | appname=self.appname, procid='-' if self.procid is None else self.procid, 70 | msgid='-' if self.msgid is None else self.msgid, 71 | sd=sd, rest=rest 72 | ) 73 | 74 | @classmethod 75 | def parse(cls, message_string): 76 | """Construct a syslog message from a string""" 77 | try: 78 | groups = parser.parse(message_string) 79 | except lark.UnexpectedInput: 80 | raise ParseError('Unable to parse message', message_string) 81 | header = groups.header 82 | pri = int(header.pri) 83 | fac = pri >> 3 84 | sev = pri & 7 85 | severity = SyslogSeverity(sev) 86 | try: 87 | facility = SyslogFacility(fac) 88 | except Exception: 89 | facility = SyslogFacility.unknown 90 | version = header.version 91 | hostname = header.hostname 92 | timestamp = header.timestamp 93 | appname = header.appname 94 | procid = header.procid 95 | if procid == '-': 96 | procid = None 97 | msgid = header.msgid 98 | if msgid == '-': 99 | msgid = None 100 | sd = {} 101 | for item in groups.structured_data: 102 | sd.setdefault(item.sd_id, {}) 103 | for param_name, param_value in item.sd_params: 104 | sd[item.sd_id][param_name] = param_value 105 | return cls(severity=severity, facility=facility, version=version, hostname=hostname, 106 | timestamp=timestamp, appname=appname, procid=procid, msgid=msgid, msg=groups.message, 107 | sd=sd) 108 | 109 | def __repr__(self): 110 | return '{0}({1})'.format( 111 | self.__class__.__name__, 112 | ','.join('{0}={1!r}'.format(k, getattr(self, k)) for k in self.__slots__) 113 | ) 114 | 115 | def as_dict(self): 116 | """Dump this class to a dictionary of primitive objects, suitable for serializing with JSON/MsgPack/etc.""" 117 
| 118 | return dict( 119 | (k, getattr(self, k).name if k in ('severity', 'facility') else getattr(self, k)) 120 | for k in self.__slots__ 121 | ) 122 | -------------------------------------------------------------------------------- /syslog_rfc5424_parser/parser.py: -------------------------------------------------------------------------------- 1 | import collections 2 | 3 | from lark import Lark, Transformer 4 | 5 | 6 | GRAMMAR = r''' 7 | ?start : header _SP structured_data [ msg ] 8 | ?header : pri version _SP timestamp _SP hostname _SP appname _SP procid _SP msgid 9 | pri : "<" /[0-9]{1,3}/ ">" 10 | version : /[1-9][0-9]{0,2}/ 11 | timestamp : NILVALUE 12 | | /[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}/ time_secfrac? time_offset 13 | time_secfrac : /\.[0-9]{1,6}/ 14 | time_offset : ZULU 15 | | _ntime_offset 16 | _ntime_offset : /[+-][0-9]{2}:[0-9]{2}/ 17 | structured_data : NILVALUE 18 | | sd_element+ 19 | sd_element : "[" sd_id (" " sd_param)* "]" 20 | ?sd_id : sd_name 21 | sd_param : param_name "=" ESCAPED_STRING 22 | ?param_name : sd_name 23 | ?sd_name : /[^= \]\"]{1,32}/ 24 | appname : NILVALUE 25 | | /[!-~]{1,48}/ 26 | procid : NILVALUE 27 | | /[!-~]{1,128}/ 28 | msgid : NILVALUE 29 | | /[!-~]{1,32}/ 30 | hostname : NILVALUE 31 | | /[!-~]{1,255}/ 32 | msg : / .*/ms 33 | 34 | %import common.ESCAPED_STRING -> ESCAPED_STRING 35 | 36 | _SP: " " 37 | NILVALUE: "-" 38 | ZULU: "Z" 39 | ''' 40 | 41 | 42 | Header = collections.namedtuple('Header', ['pri', 'version', 'timestamp', 'hostname', 'appname', 'procid', 'msgid']) 43 | 44 | SDElement = collections.namedtuple('SDElement', ['sd_id', 'sd_params']) 45 | 46 | ParsedMessage = collections.namedtuple('ParsedMessage', ['header', 'structured_data', 'message']) 47 | 48 | 49 | class TreeTransformer(Transformer): 50 | def NILVALUE(self, inp): 51 | return '-' 52 | 53 | def pri(self, inp): 54 | return int(inp[0]) 55 | 56 | def version(self, inp): 57 | return int(inp[0]) 58 | 59 | def timestamp(self, inp): 60 | 
if len(inp) == 1: 61 | return inp[0] 62 | else: 63 | datetime = str(inp[0]) 64 | rest = [str(i.children[0]) for i in inp[1:]] 65 | return datetime + ''.join(rest) 66 | 67 | def hostname(self, inp): 68 | return str(inp[0]) 69 | 70 | def appname(self, inp): 71 | return str(inp[0]) 72 | 73 | def procid(self, inp): 74 | inp = str(inp[0]) 75 | if inp.isdigit(): 76 | return int(inp) 77 | return inp 78 | 79 | def msgid(self, inp): 80 | return str(inp[0]) 81 | 82 | def structured_data(self, inp): 83 | if len(inp) == 1 and inp[0] == "-": 84 | return [] 85 | output = [] 86 | for sd_element in inp: 87 | sd_id = str(sd_element.children[0]) 88 | sd_params = [] 89 | for sd_param in sd_element.children[1:]: 90 | param_name = str(sd_param.children[0]) 91 | param_value = str(sd_param.children[1])[1:-1] 92 | sd_params.append((param_name, param_value)) 93 | output.append(SDElement(sd_id=sd_id, sd_params=sd_params)) 94 | return output 95 | 96 | def msg(self, inp): 97 | return str(inp[0])[1:] 98 | 99 | def header(self, inp): 100 | return Header( 101 | pri=inp[0], 102 | version=inp[1], 103 | timestamp=inp[2], 104 | hostname=inp[3], 105 | appname=inp[4], 106 | procid=inp[5], 107 | msgid=inp[6] 108 | ) 109 | 110 | def start(self, inp): 111 | if len(inp) > 2: 112 | message = inp[2] 113 | else: 114 | message = None 115 | return ParsedMessage( 116 | header=inp[0], 117 | structured_data=inp[1], 118 | message=message 119 | ) 120 | 121 | 122 | _parser = Lark(GRAMMAR, parser='lalr', transformer=TreeTransformer()) 123 | 124 | 125 | def parse(s): 126 | tree = _parser.parse(s) 127 | return tree 128 | 129 | 130 | if __name__ == '__main__': 131 | import sys 132 | print(parse(sys.argv[1])) 133 | -------------------------------------------------------------------------------- /tests/.gitignore: -------------------------------------------------------------------------------- 1 | .cache/ 2 | -------------------------------------------------------------------------------- /tests/__init__.py: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/EasyPost/syslog-rfc5424-parser/cd3b13bfc7307d31ed3097086b69407cc735e08c/tests/__init__.py -------------------------------------------------------------------------------- /tests/test_message_parser.py: -------------------------------------------------------------------------------- 1 | import collections 2 | 3 | import pytest 4 | 5 | from syslog_rfc5424_parser import SyslogMessage, ParseError 6 | from syslog_rfc5424_parser.constants import SyslogFacility, SyslogSeverity 7 | 8 | 9 | Expected = collections.namedtuple('Expected', ['severity', 'facility', 'version', 'timestamp', 'hostname', 10 | 'appname', 'procid', 'msgid', 'msg', 'sd']) 11 | 12 | 13 | PARSE_VECTORS = ( 14 | ( 15 | '<1>1 - - - - - -', 16 | Expected(SyslogSeverity.alert, SyslogFacility.kern, 1, '-', '-', '-', None, None, None, {}) 17 | ), 18 | ( 19 | '<78>1 2016-01-15T00:04:01+00:00 host1 CROND 10391 - [meta sequenceId="29"] some_message', # noqa 20 | Expected(SyslogSeverity.info, SyslogFacility.cron, 1, '2016-01-15T00:04:01+00:00', 'host1', 'CROND', 10391, 21 | None, 'some_message', {'meta': {'sequenceId': '29'}}) 22 | ), 23 | ( 24 | '<29>1 2016-01-15T01:00:43Z some-host-name SEKRETPROGRAM prg - [origin x-service="svcname"][meta sequenceId="1"] 127.0.0.1 - - 1452819643 "GET /health HTTP/1.1" 200 175 "-" "hacheck 0.9.0" 20812 127.0.0.1:40150 1199', # noqa 25 | Expected(SyslogSeverity.notice, SyslogFacility.daemon, 1, '2016-01-15T01:00:43Z', 'some-host-name', 26 | 'SEKRETPROGRAM', 'prg', None, '127.0.0.1 - - 1452819643 "GET /health HTTP/1.1" 200 175 "-" "hacheck 0.9.0" 20812 127.0.0.1:40150 1199', # noqa 27 | {'meta': {'sequenceId': '1'}, 'origin': {'x-service': 'svcname'}}) 28 | ), 29 | ( 30 | '<190>1 2016-01-15T01:00:59+00:00 some-other-host 2016-01-15 - - [origin x-service="program"][meta sequenceId="4"] 01:00:59,989 PRG[14767:INFO] Starting up', # noqa 31 | Expected(SyslogSeverity.info, 
SyslogFacility.local7, 1, '2016-01-15T01:00:59+00:00', 'some-other-host', 32 | '2016-01-15', None, None, msg='01:00:59,989 PRG[14767:INFO] Starting up', 33 | sd={'meta': {'sequenceId': '4'}, 'origin': {'x-service': 'program'}}) 34 | ), 35 | # this one has a malformed PRI 36 | ( 37 | '<409>1 2016-01-15T00:00:00Z host2 prg - - - message', 38 | Expected(SyslogSeverity.alert, SyslogFacility.unknown, 1, '2016-01-15T00:00:00Z', 'host2', 39 | 'prg', None, None, 'message', {}) 40 | ), 41 | # this one has an SD-ID, but no SD-PARAMS 42 | ( 43 | '<78>1 2016-01-15T00:04:01+00:00 host1 CROND 10391 - [sdid] some_message', # noqa 44 | Expected(SyslogSeverity.info, SyslogFacility.cron, 1, '2016-01-15T00:04:01+00:00', 'host1', 'CROND', 10391, 45 | None, 'some_message', {'sdid': {}}) 46 | ), 47 | ( 48 | '<85>1 2017-03-02T13:21:15.733598-08:00 vrs-1 polkitd 20481 - -  msg', 49 | Expected(SyslogSeverity.notice, SyslogFacility.authpriv, 1, '2017-03-02T13:21:15.733598-08:00', 'vrs-1', 50 | 'polkitd', 20481, None, ' msg', {}) 51 | ), 52 | # reported in pr 2; empty sd-param body 53 | ( 54 | '<29>1 2018-05-14T08:23:01.520Z leyal_test4 mgd 13894 UI_CHILD_EXITED [junos@2636.1.1.1.2.57 pid="14374" return-value="5" core-dump-status="" command="/usr/sbin/mustd"]', # noqa 55 | Expected(SyslogSeverity.notice, SyslogFacility.daemon, 1, '2018-05-14T08:23:01.520Z', 56 | 'leyal_test4', 'mgd', 13894, 'UI_CHILD_EXITED', None, { 57 | 'junos@2636.1.1.1.2.57': { 58 | 'command': '/usr/sbin/mustd', 59 | 'core-dump-status': '', 60 | 'pid': '14374', 61 | 'return-value': '5', 62 | } 63 | }) 64 | ), 65 | # reported in issue #7; multi-line body 66 | ( 67 | '<78>1 2019-01-17T17:39:00Z localhost CROND 9999 - - some message\nwith embedded newlines', 68 | Expected(SyslogSeverity.info, SyslogFacility.cron, 1, '2019-01-17T17:39:00Z', 69 | 'localhost', 'CROND', 9999, None, 'some message\nwith embedded newlines', {}) 70 | ), 71 | # requested in #10 72 | ( 73 | '''<134>1 2019-01-20T23:43:41.087236Z 172.16.3.1 NAT 15634 
SADD [nsess SSUBIX="0" SVLAN="0" IATYP="IPv4" ISADDR="172.16.1.2" ISPORT="6303" XATYP="IPv4" XSADDR="10.0.0.3" XSPORT="16253" PROTO="6" XDADDR="172.16.2.2" XDPORT="80"] ''', # noqa 74 | Expected( 75 | SyslogSeverity.info, SyslogFacility.local0, 1, '2019-01-20T23:43:41.087236Z', 76 | '172.16.3.1', 'NAT', 15634, 77 | 'SADD', '''''', 78 | { 79 | 'nsess': { 80 | 'SSUBIX': '0', 81 | 'SVLAN': '0', 82 | 'IATYP': 'IPv4', 83 | 'ISADDR': '172.16.1.2', 84 | 'ISPORT': '6303', 85 | 'PROTO': '6', 86 | 'XATYP': 'IPv4', 87 | 'XSADDR': '10.0.0.3', 88 | 'XSPORT': '16253', 89 | 'XDADDR': '172.16.2.2', 90 | 'XDPORT': '80', 91 | } 92 | } 93 | ) 94 | ) 95 | 96 | ) 97 | 98 | 99 | # these only have one SD because ordering of SDs isn't consistent between runs 100 | ROUND_TRIP_VECTORS = ( 101 | '<1>1 - - - - - -', 102 | '<78>1 2016-01-15T00:04:01+00:00 host1 CROND 10391 - [meta sequenceId="29"] some_message', 103 | ) 104 | 105 | 106 | @pytest.mark.parametrize('input_line, expected', PARSE_VECTORS) 107 | def test_vector(input_line, expected): 108 | parsed = SyslogMessage.parse(input_line) 109 | assert parsed.severity == expected.severity 110 | assert parsed.facility == expected.facility 111 | assert parsed.version == expected.version 112 | assert parsed.timestamp == expected.timestamp 113 | assert parsed.hostname == expected.hostname 114 | assert parsed.appname == expected.appname 115 | assert parsed.procid == expected.procid 116 | assert parsed.msgid == expected.msgid 117 | assert parsed.msg == expected.msg 118 | assert parsed.sd == expected.sd 119 | 120 | 121 | def test_emitter(): 122 | m = SyslogMessage(facility=SyslogFacility.cron, severity=SyslogSeverity.info) 123 | assert '<78>1 - - - - - -' == str(m) 124 | 125 | 126 | def test_emitter_with_unix_timestamp(): 127 | m = SyslogMessage(facility=SyslogFacility.kern, severity=SyslogSeverity.emerg, timestamp=0) 128 | assert '<0>1 1970-01-01T00:00:00Z - - - - -' == str(m) 129 | 130 | 131 | @pytest.mark.parametrize('input_line', 
ROUND_TRIP_VECTORS) 132 | def test_emitter_round_trip(input_line): 133 | m = SyslogMessage.parse(input_line) 134 | assert str(m) == input_line 135 | 136 | 137 | @pytest.mark.parametrize('input_line, expected', PARSE_VECTORS) 138 | def test_as_dict(input_line, expected): 139 | m = SyslogMessage.parse(input_line) 140 | dictified = m.as_dict() 141 | expected_dict = expected._asdict() 142 | expected_dict['severity'] = expected_dict['severity'].name 143 | expected_dict['facility'] = expected_dict['facility'].name 144 | assert dictified == expected_dict 145 | 146 | 147 | def test_dumping_with_bad_pri_fails(): 148 | m = SyslogMessage(facility=SyslogFacility.unknown, severity=SyslogSeverity.emerg) 149 | with pytest.raises(ValueError): 150 | str(m) 151 | 152 | 153 | def test_unparseable(): 154 | with pytest.raises(ParseError): 155 | SyslogMessage.parse('garbage') 156 | 157 | 158 | def test_repr_does_not_raise(): 159 | m = SyslogMessage(facility=SyslogFacility.cron, severity=SyslogSeverity.info) 160 | repr(m) 161 | -------------------------------------------------------------------------------- /tests/test_parser.py: -------------------------------------------------------------------------------- 1 | from syslog_rfc5424_parser import parser 2 | 3 | 4 | def test_minimal(): 5 | message = '<1>1 - - - - - -' 6 | parsed = parser.parse(message) 7 | assert parsed.header.pri == 1 8 | assert parsed.header.version == 1 9 | assert parsed.header.timestamp == '-' 10 | assert parsed.header.hostname == '-' 11 | assert parsed.header.appname == '-' 12 | assert parsed.header.procid == '-' 13 | assert parsed.header.msgid == '-' 14 | assert parsed.structured_data == [] 15 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | [tox] 2 | basepython = python2.7 3 | envlist = py27, py34, py35, py36, py37 4 | 5 | [testenv] 6 | usedevelop = False 7 | basepython = python2.7 8 | 
install_command = pip install --upgrade {opts} {packages} 9 | deps = 10 | -rrequirements.txt 11 | -rrequirements-tests.txt 12 | flake8 13 | wheel 14 | commands = 15 | flake8 syslog_rfc5424_parser 16 | py.test --junit-prefix={envname}: --junit-xml={env:CIRCLE_TEST_REPORTS:.}/test-{envname}.xml -v tests/ 17 | 18 | [testenv:py27] 19 | basepython = python2.7 20 | 21 | [testenv:py34] 22 | basepython = python3.4 23 | 24 | [testenv:py35] 25 | basepython = python3.5 26 | 27 | [testenv:py36] 28 | basepython = python3.6 29 | 30 | [testenv:py37] 31 | basepython = python3.7 32 | 33 | [flake8] 34 | max-line-length=120 35 | --------------------------------------------------------------------------------