├── .gitignore
├── ANNOUNCE.md
├── README.rst
├── cronfed.py
├── example_mailbox
└── setup.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *.py[cod]
2 | .DS_Store
3 | *.log
4 | venv/*
5 |
6 | # emacs
7 | *~
8 | ._*
9 | .\#*
10 | \#*\#
11 |
12 | # C extensions
13 | *.so
14 |
15 | # Packages
16 | *.egg
17 | *.egg-info
18 | dist
19 | build
20 | eggs
21 | parts
22 | bin
23 | var
24 | sdist
25 | develop-eggs
26 | .installed.cfg
27 | lib
28 | lib64
29 |
30 | # Installer logs
31 | pip-log.txt
32 |
33 | # Unit test / coverage reports
34 | .coverage
35 | .tox
36 | nosetests.xml
37 |
38 | # Translations
39 | *.mo
40 |
41 | # Mr Developer
42 | .mr.developer.cfg
43 | .project
44 | .pydevproject
45 |
--------------------------------------------------------------------------------
/ANNOUNCE.md:
--------------------------------------------------------------------------------
1 | # Cronfed: Minimum Viable Monitoring
2 |
3 | Every project should be monitored. It’s common sense. We all know
4 | it. We’re all _sensible_ developers. What is constructed must
5 | eventually fall, and without monitoring, we won’t know when or how to
6 | fix it.
7 |
8 | And yet thousands of projects go without logging or monitoring,
9 | working fine until they don’t, preventably depressing countless
10 | others.
11 |
12 | [**Cronfed**](https://github.com/hatnote/cronfed) is a Python package
13 | that embodies a **Minimum Viable Monitoring** mindset. Something is
14 | better than nothing, and there are far too many cron jobs with
15 | nothing. Whether lazy or busy, take 5 minutes and clear your
16 | conscience.
17 |
18 | `pip install cronfed`
19 |
20 | `echo "*/15 * * * * /usr/bin/python -m cronfed --output /var/www/mysite/assets/cronfed.rss /var/mail/myuser 2>&1 | tee -a /home/myuser/project/logs/cronfed.txt"`
21 |
22 | Just replace “myuser”, the “mysite” path, and the “logs” path with
23 | the appropriate values and you’ll be ready to point your
24 | [feedreader](http://theoldreader.com) or [IFTTT](https://ifttt.com/)
25 | at your site and commence breathing easier. Learn more in [the Cronfed
26 | docs](http://hatnote.github.io/cronfed/).
27 |
28 | 
29 |
30 | _If only all solutions were this easy and consequence-free._
31 |
32 | We use [Cronfed](https://github.com/hatnote/cronfed) for Hatnote’s [Recent
33 | Changes Map](http://rcmap.hatnote.com) and [Weeklypedia](http://weekly.hatnote.com)
34 | (which just turned 1! [You should sign up](http://weekly.hatnote.com), learn
35 | about Wikipedia, and experience the reliability
36 | firsthand!). Predictably, everyone seems satisfied with a simple tool
37 | that does one job well.
38 |
--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
1 | Cronfed
2 | =======
3 |
4 | .. image:: https://farm9.staticflickr.com/8144/7544169948_8abb2bb2f3_m_d.jpg
5 |
6 | **Cronfed** monitors basic batch jobs, or any other cron-based
7 | scheduled commands, by parsing a given mailbox and turning it into an
8 | RSS feed. The feed can in turn be monitored with your browser_,
9 | feedreader_ or other RSS-compatible service (such as IFTTT_).
10 |
11 | Simply add a cron job to generate the feed, pointing it at a
12 | web-accessible location (such as a ``public_html`` directory or your
13 | site's assets directory). Check out the example for some real-world
14 | Cronfed usage, with an explanation of how cron and Cronfed work
15 | together.
16 |
17 | Cronfed is **Minimum Viable Monitoring**, aimed at providing a basic
18 | threshold of monitoring without complex automation or dependencies.
19 | It's targeted at smaller projects which otherwise might go without any
20 | monitoring at all. It's so easy to set up and use on the standard
21 | Linux/BSD machine that there's no reason to not use it from
22 | Day 1. While Cronfed makes attempts at limiting the amount of
23 | information externalized, it is not recommended for jobs with
24 | extremely sensitive information.
25 |
26 | *"Cronfed: It's the least you could do!"*
27 |
28 | Installation
29 | ------------
30 |
31 | Cronfed is pure Python, has no system library dependencies, and should
32 | work wonders on any POSIX machine with a functioning cron daemon and
33 | local mail system::
34 |
35 |     pip install cronfed
36 |
37 | Run ``python -m cronfed --help`` to see options, or read on for a
38 | usage example.
39 |
40 | Example
41 | -------
42 |
43 | First, let's look at a basic cron job. This one will fetch our data
44 | once an hour, on the hour::
45 |
46 |     0 * * * * /usr/bin/python /home/myuser/project/fetch.py 2>&1 | tee -a /home/myuser/project/logs/fetch.txt
47 |
48 | Notice how the output (``stdout`` + ``stderr``) is piped through the
49 | ``tee`` command, which writes it to a log file as well as to
50 | ``stdout``. ``cron`` captures that ``stdout`` and puts it
51 | into an email, which then gets sent to the user who owns the job. This
52 | usually means the email goes to ``myuser@localhost``, which on many
53 | distributions means that it is saved to ``/var/mail/myuser``. Do note
54 | that if the command generates no output, then ``cron`` **will not send
55 | an email**, so it's a good idea to have failing jobs emit an error message.
56 |
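Because ``cron`` stays silent when a job prints nothing, one option is to wrap the job so that it always produces a line of output. The following is a minimal sketch and not part of Cronfed; ``run_and_report`` and ``flaky_job`` are hypothetical names used only for illustration. On failure it returns the full traceback text, which begins with the ``Traceback (most recent`` marker that Cronfed searches for when it looks for a Python exception type.

```python
import sys
import traceback


def run_and_report(job):
    """Run job() and return (exit_code, message).

    Hypothetical helper: the message is never empty, so writing it to
    stdout guarantees that cron has something to mail.
    """
    try:
        result = job()
    except Exception:
        # format_exc() starts with "Traceback (most recent call last):"
        return 1, traceback.format_exc()
    return 0, 'job complete: %r\n' % (result,)


def flaky_job():
    # Stand-in for real work; fails on purpose to show the error path.
    return 1 / 0


code, message = run_and_report(flaky_job)
sys.stdout.write(message)
```

A real job script would likely pass ``code`` to ``sys.exit`` after writing the message, so that the exit status reflects the failure as well.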
57 | Once we're sure that email is being delivered, we're halfway
58 | there. Now we just need the actual Cronfed cronjob::
59 |
60 |     */15 * * * * /usr/bin/python -m cronfed --output /var/www/mysite/assets/cronfed.rss /var/mail/myuser 2>&1 | tee -a /home/myuser/project/logs/cronfed.txt
61 |
62 | In this example we have the installed ``cronfed`` module regenerating
63 | our feed every fifteen minutes. This is a pretty quick process in most
64 | cases, so feel free to run it more often. In this case, the output of
65 | cronfed itself is monitored in exactly the same way as normal cron
66 | jobs, with a logfile and email to ``myuser@localhost``.
67 |
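To check the generated feed without a feedreader, a few lines of standard-library Python are enough. This is a rough sketch under stated assumptions: ``SAMPLE_FEED`` is a hand-written, hypothetical document shaped like the feed Cronfed renders (an RSS 2.0 channel whose items carry ``title``, ``description``, ``link``, ``pubDate`` and ``guid``), and ``list_items`` is an illustrative helper, not a Cronfed API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped like a Cronfed feed with a single item.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Cronfed on myhost</title>
<item>
<title>Cron: &lt;myuser@myhost&gt; /usr/bin/python /home/myuser/project/fetch.py</title>
<description>At Fri, 27 Feb 2015 06:21:03 -0500, Cron ran: ...</description>
<link>http://github.com/hatnote/cronfed</link>
<pubDate>Fri, 27 Feb 2015 06:21:03 -0500</pubDate>
<guid>http://github.com/hatnote/cronfed/201502271121.t1RBL3RP022544</guid>
</item>
</channel></rss>"""


def list_items(feed_xml):
    """Return (pubDate, title) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext('pubDate'), item.findtext('title'))
            for item in root.findall('./channel/item')]


for pub_date, title in list_items(SAMPLE_FEED):
    print('%s  %s' % (pub_date, title))
```

In practice the XML would come from the file passed to ``--output``, for example by reading ``/var/www/mysite/assets/cronfed.rss`` and handing its contents to ``list_items``.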
68 | History
69 | -------
70 |
71 | Cronfed was created for `Hatnote`_ to monitor the periodic data
72 | refreshes necessary to generate `The Weeklypedia`_. See those cron
73 | jobs and more in the `Weeklypedia crontab`_.
74 |
75 | * Copyright: (c) 2015 by `Mark Williams`_ and `Mahmoud Hashemi`_
76 | * License: BSD, see LICENSE for more details.
77 |
78 |
79 | .. _browser: https://www.mozilla.org/en-US/firefox/new/
80 | .. _feedreader: https://theoldreader.com/
81 | .. _IFTTT: https://ifttt.com/
82 | .. _Hatnote: http://hatnote.com
83 | .. _The Weeklypedia: http://weekly.hatnote.com
84 | .. _Weeklypedia crontab: https://github.com/hatnote/weeklypedia/blob/master/weeklypedia/crontab
85 | .. _Mark Williams: https://github.com/markrwilliams/
86 | .. _Mahmoud Hashemi: https://github.com/mahmoud/
87 |
--------------------------------------------------------------------------------
/cronfed.py:
--------------------------------------------------------------------------------
1 |
2 | # Credit to Mark Williams for this
3 | import re
4 | import sys
5 | import socket
6 | import hashlib
7 | import argparse
8 | import datetime
9 | from contextlib import closing
10 | from collections import namedtuple
11 | import xml.etree.cElementTree as ET
12 |
13 | from boltons.tzutils import UTC
14 | from boltons.mboxutils import mbox_readonlydir
15 |
16 | GENERATOR_TEXT = 'cronfed v1.0'
17 |
18 | DEFAULT_LINK = 'http://github.com/hatnote/cronfed'
19 | DEFAULT_DESC = 'Fresh cron output from cronfed'
20 | DEFAULT_TITLE = 'Cronfed on %s' % socket.gethostname()
21 | DEFAULT_TTL = '30' # time to live (how often to recheck in mins)
22 | DEFAULT_GUID_URL_TMPL = None
23 | # NOTE: would've used isPermalink=false but IFTTT does not like that
24 |
25 | DEFAULT_EXCERPT = 16
26 | DEFAULT_EXCLUDE_EXC = False
27 | DEFAULT_SAVE = sys.maxint
28 |
29 |
30 | DEFAULT_SUBJECT_RE = re.compile(
31 |     'Cron <(?P<user>[^@].+)@(?P<host>[^>].+)> (?P<command>.*)')
32 | DEFAULT_SUBJECT_TMPL = 'Cron: <%(user)s@%(host)s> %(command)s'
33 | MESSAGE_ID_RE = re.compile('<(?P<id>[^@]+)@(?P<host>[^>+]+)>')
34 |
35 |
36 | def find_python_error_type(text):
37 |     from boltons.tbutils import ParsedTB
38 |     try:
39 |         tb_str = text[text.index('Traceback (most recent'):]
40 |     except ValueError:
41 |         raise ValueError('no traceback found')
42 |     parsed_tb = ParsedTB.from_string(tb_str)
43 |     return parsed_tb.exc_type
44 |
45 |
46 | class CronFeeder(object):
47 |     def __init__(self, mailbox_path, **kwargs):
48 |         self.mailbox_path = mailbox_path
49 |         self.output_path = kwargs.pop('output_path', None)
50 |         self.save_count = kwargs.pop('save_count', DEFAULT_SAVE)
51 |         self.feed_title = kwargs.pop('feed_title', DEFAULT_TITLE)
52 |         self.feed_desc = kwargs.pop('feed_desc', DEFAULT_DESC)
53 |         self.feed_link = kwargs.pop('feed_link', DEFAULT_LINK)
54 |         self.feed_ttl = kwargs.pop('feed_ttl', DEFAULT_TTL)
55 |         self.guid_url_tmpl = kwargs.pop('guid_url_tmpl', DEFAULT_GUID_URL_TMPL)
56 |         if not self.guid_url_tmpl:
57 |             self.guid_url_tmpl = self.feed_link + '/{guid}'
58 |
59 |         self.subject_re = kwargs.pop('subject_re', DEFAULT_SUBJECT_RE)
60 |         self.subject_tmpl = kwargs.pop('subject_tmpl', DEFAULT_SUBJECT_TMPL)
61 |
62 |         self.excerpt_len = kwargs.pop('excerpt_len', DEFAULT_EXCERPT)
63 |         self.exclude_exc = kwargs.pop('exclude_exc', DEFAULT_EXCLUDE_EXC)
64 |         if kwargs:
65 |             raise TypeError('unexpected keyword arguments: %r' % kwargs.keys())
66 |
67 |     def process(self):
68 |         with closing(mbox_readonlydir(self.mailbox_path)) as mbox:
69 |             emails = self._process_emails(mbox)
70 |             rss_items = []
71 |             for email in emails:
72 |                 rss_items.append(RSSItem.from_email(email,
73 |                                                     parser=self.subject_re,
74 |                                                     renderer=self.subject_tmpl,
75 |                                                     excerpt=self.excerpt_len))
76 |             rendered = self._render_feed(rss_items)
77 |
78 |         if self.output_path:
79 |             with open(self.output_path, 'w') as f:
80 |                 f.write(rendered)
81 |         else:
82 |             print rendered
83 |         return
84 |
85 |     def _process_emails(self, mbox):
86 |         ret = []
87 |         for key, email in reversed(mbox.items()):
88 |             if self.subject_re.match(email.get('subject', '')):
89 |                 if len(ret) < self.save_count:
90 |                     ret.append(email)
91 |                 else:
92 |                     del mbox[key]
93 |         return ret
94 |
95 |     def _render_feed(self, rss_items):
96 |         rss = ET.Element('rss', {'version': '2.0'})
97 |         channel = ET.SubElement(rss, 'channel')
98 |         title_elem = ET.SubElement(channel, 'title')
99 |         title_elem.text = self.feed_title
100 |         desc_elem = ET.SubElement(channel, 'description')
101 |         desc_elem.text = self.feed_desc
102 |         link_elem = ET.SubElement(channel, 'link')
103 |         link_elem.text = self.feed_link
104 |         ttl_elem = ET.SubElement(channel, 'ttl')
105 |         ttl_elem.text = self.feed_ttl
106 |
107 |         gen_elem = ET.SubElement(channel, 'generator')
108 |         gen_elem.text = GENERATOR_TEXT
109 |
110 |         lbd_elem = ET.SubElement(channel, 'lastBuildDate')
111 |         _now = datetime.datetime.now(tz=UTC)
112 |         lbd_elem.text = _now.strftime('%a, %d %b %Y %H:%M:%S %z')
113 |
114 |         for rss_item in rss_items:
115 |             item = ET.SubElement(channel, 'item')
116 |             for tag, text in rss_item._asdict().items():
117 |                 if tag == 'link' and text is None:
118 |                     text = self.feed_link  # TODO: make this configurable?
119 |                 elif tag == 'guid':
120 |                     text = self.guid_url_tmpl.format(guid=text)
121 |                 elem = ET.SubElement(item, tag)
122 |                 elem.text = text
123 |         return ET.tostring(rss, encoding='UTF-8')
124 |
125 |     @staticmethod
126 |     def get_argparser():
127 |         prs = argparse.ArgumentParser()
128 |         add_arg = prs.add_argument
129 |         add_arg('mailbox_path',
130 |                 help='path to the mailbox file to process for cron output')
131 |         add_arg('--output', '-o', default=None,
132 |                 help='where to write the output, defaults to stdout')
133 |         add_arg('--save', default=DEFAULT_SAVE, type=int,
134 |                 help='the number of cron emails to save, defaults to'
135 |                 ' saving all of them')
136 |         add_arg('--title', default=DEFAULT_TITLE,
137 |                 help='top-level title for the RSS feed')
138 |         add_arg('--desc', default=DEFAULT_DESC,
139 |                 help='top-level description for the RSS feed')
140 |         add_arg('--link', default=DEFAULT_LINK,
141 |                 help='top-level home page URL for the RSS feed')
142 |         add_arg('--ttl', default=DEFAULT_TTL,
143 |                 help='recommended minutes between feed reader update checks')
144 |
145 |         add_arg('--exclude-exc', default=DEFAULT_EXCLUDE_EXC,
146 |                 action='store_true', help='do not search for or include'
147 |                 ' Python exception types in the feed')
148 |         add_arg('--excerpt', default=DEFAULT_EXCERPT, type=int,
149 |                 help='how much cron job output to include, defaults to a small'
150 |                 ' amount, specify 0 to disable excerpting')
151 |         add_arg('--guid-url-tmpl', default=DEFAULT_GUID_URL_TMPL,
152 |                 help='template used to generate individual item links')
153 |         return prs
154 |
155 |     @classmethod
156 |     def from_args(cls):
157 |         kwarg_map = {'save': 'save_count',
158 |                      'output': 'output_path',
159 |                      'excerpt': 'excerpt_len',
160 |                      'title': 'feed_title',
161 |                      'desc': 'feed_desc',
162 |                      'link': 'feed_link',
163 |                      'ttl': 'feed_ttl'}
164 |         prs = cls.get_argparser()
165 |         kwargs = dict(prs.parse_args()._get_kwargs())
166 |         for src, dest in kwarg_map.items():
167 |             kwargs[dest] = kwargs.pop(src)
168 |         return cls(**kwargs)
169 |
170 |
171 | BaseRSSItem = namedtuple('RSSItem', ['title', 'description', 'link',
172 |                                      'pubDate', 'guid'])
173 |
174 |
175 | class RSSItem(BaseRSSItem):
176 |     @classmethod
177 |     def from_email(cls, email,
178 |                    parser=DEFAULT_SUBJECT_RE,
179 |                    renderer=DEFAULT_SUBJECT_TMPL,
180 |                    excerpt=DEFAULT_EXCERPT):
181 |         body = email.get_payload()
182 |         date = email['date']
183 |         subject = email['subject']
184 |
185 |         match = parser.match(subject)
186 |         if not match:
187 |             raise ValueError("Unparseable subject")
188 |         subject_dict = match.groupdict()
189 |         title = renderer % subject_dict
190 |
191 |         match = MESSAGE_ID_RE.match(email.get('message-id', ''))
192 |         if match:
193 |             guid = match.group('id')
194 |         else:
195 |             guid = hashlib.sha224(date + subject + body).hexdigest()
196 |
197 |         desc = 'At %s, Cron ran:\n\n\t%s' % (date, subject_dict['command'])
198 |         try:
199 |             python_error_type = find_python_error_type(body)
200 |         except Exception:
201 |             python_error_type = None
202 |         if python_error_type:
203 |             desc += '\n\nPython exception: %s.' % python_error_type
204 |
205 |         if body:
206 |             desc += '\n\nCommand output:\n\n\t' + summarize(body, excerpt)
207 |
208 |         return cls(title=title, description=desc, link=None,
209 |                    pubDate=date, guid=guid)
210 |
211 |
212 | def summarize(text, length):
213 |     """
214 |     Length is the amount of text to show. It doesn't include the
215 |     length that the summarization adds back in.
216 |     """
217 |     len_diff = len(text) - length
218 |     if len_diff <= 0:
219 |         return text
220 |     elif not length:
221 |         return '(%s bytes)' % len(text)
222 |     return ''.join([text[:length // 2],
223 |                     '... (%s bytes) ...' % len_diff,
224 |                     text[-length // 2:]])
225 |
226 |
227 | def main():
228 |     cronfeeder = CronFeeder.from_args()
229 |     cronfeeder.process()
230 |
231 |
232 | if __name__ == '__main__':
233 |     main()
234 |
--------------------------------------------------------------------------------
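How one cron email becomes one feed item can be shown in miniature. This is a simplified sketch rather than the module's own code: ``SUBJECT_RE`` is an assumed, slightly stricter variant of the subject pattern (named groups ``user``, ``host`` and ``command``, matching the keys of the title template), ``describe`` is a hypothetical helper, and the Message-Id based GUID and the traceback detection are left out. It is written to run on both Python 2.7 and Python 3.

```python
import re

# Assumed pattern: group names mirror the keys used by TITLE_TMPL.
SUBJECT_RE = re.compile(
    r'Cron <(?P<user>[^@]+)@(?P<host>[^>]+)> (?P<command>.*)')
TITLE_TMPL = 'Cron: <%(user)s@%(host)s> %(command)s'


def summarize(text, length):
    """Keep the head and tail of text and note how much was cut out."""
    len_diff = len(text) - length
    if len_diff <= 0:
        return text
    if not length:
        return '(%s bytes)' % len(text)
    return ''.join([text[:length // 2],
                    '... (%s bytes) ...' % len_diff,
                    text[-length // 2:]])


def describe(subject, date, body, excerpt=16):
    """Return (title, description) for a cron email.

    Returns None when the subject does not look like cron output.
    """
    match = SUBJECT_RE.match(subject)
    if not match:
        return None
    fields = match.groupdict()
    desc = 'At %s, Cron ran:\n\n\t%s' % (date, fields['command'])
    if body:
        desc += '\n\nCommand output:\n\n\t' + summarize(body, excerpt)
    return TITLE_TMPL % fields, desc


result = describe('Cron <myuser@myhost> /usr/bin/python fetch.py',
                  'Fri, 27 Feb 2015 06:21:03 -0500',
                  'a' * 10 + 'b' * 10)
print(result[0])
print(result[1])
```

With the default excerpt of 16, a 20-character body keeps its first and last 8 characters, and the middle is replaced by a note on how many characters were dropped.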
/example_mailbox:
--------------------------------------------------------------------------------
1 | From user@example.com Fri Feb 27 06:21:03 2015
2 | Return-Path:
3 | Received: from example.com (localhost.localdomain [127.0.0.1])
4 | 	by example.com (8.14.3/8.14.3/Debian-9.4) with ESMTP id t1RBL3ih022545
5 | 	for ; Fri, 27 Feb 2015 06:21:03 -0500
6 | Received: (from user@localhost)
7 | 	by example.com (8.14.3/8.14.3/Submit) id t1RBL3RP022544
8 | 	for user; Fri, 27 Feb 2015 06:21:03 -0500
9 | Date: Fri, 27 Feb 2015 06:21:03 -0500
10 | Message-Id: <201502271121.t1RBL3RP022544@example.com>
11 | From: root@example.com (Cron Daemon)
12 | To: user@example.com
13 | Subject: Cron $PYTHON_BIN /home/user/script.py --lang da 2>&1 | tee -a /home/user/logs/da.txt
14 | Content-Type: text/plain; charset=ANSI_X3.4-1968
15 | X-Cron-Env:
16 | {u'complete': True}
17 | Some more fun da output for our precious example.
18 |
19 | From user@example.com Fri Feb 27 06:21:03 2015
20 | Return-Path:
21 | Received: from example.com (localhost.localdomain [127.0.0.1])
22 | 	by example.com (8.14.3/8.14.3/Debian-9.4) with ESMTP id t1RBL3ed022549
23 | 	for ; Fri, 27 Feb 2015 06:21:03 -0500
24 | Received: (from user@localhost)
25 | 	by example.com (8.14.3/8.14.3/Submit) id t1RBL3Uf022546
26 | 	for user; Fri, 27 Feb 2015 06:21:03 -0500
27 | Date: Fri, 27 Feb 2015 06:21:03 -0500
28 | Message-Id: <201502271121.t1RBL3Uf022546@example.com>
29 | From: root@example.com (Cron Daemon)
30 | To: user@example.com
31 | Subject: Cron $PYTHON_BIN /home/user/script.py --lang it 2>&1 | tee -a /home/user/logs/it.txt
32 | Content-Type: text/plain; charset=ANSI_X3.4-1968
33 | X-Cron-Env:
34 | {u'complete': True}
35 | Some more fun it output for our precious example.
36 |
37 | From friend@example.com Fri Feb 27 06:22:02 2015
38 | Return-Path:
39 | Received: from example.com (localhost.localdomain [127.0.0.1])
40 | 	by example.com (8.14.3/8.14.3/Debian-9.4) with ESMTP id t1RBM23c022565
41 | 	for ; Fri, 27 Feb 2015 06:22:02 -0500
42 | Received: (from user@localhost)
43 | 	by example.com (8.14.3/8.14.3/Submit) id t1RBM2RK022564
44 | 	for user; Fri, 27 Feb 2015 06:22:02 -0500
45 | Date: Fri, 27 Feb 2015 06:22:02 -0500
46 | Message-Id: <201502271122.t1RBM2RK022564@example.com>
47 | From: friend@example.com (Your friend)
48 | To: user@example.com
49 | Subject: Cron? What cron?
50 | Content-Type: text/plain; charset=ANSI_X3.4-1968
51 | Some non-cron content, because cron isn't everything
52 |
53 | From user@example.com Fri Feb 27 06:22:02 2015
54 | Return-Path:
55 | Received: from example.com (localhost.localdomain [127.0.0.1])
56 | 	by example.com (8.14.3/8.14.3/Debian-9.4) with ESMTP id t1RBM23c022565
57 | 	for ; Fri, 27 Feb 2015 06:22:02 -0500
58 | Received: (from user@localhost)
59 | 	by example.com (8.14.3/8.14.3/Submit) id t1RBM2RK022564
60 | 	for user; Fri, 27 Feb 2015 06:22:02 -0500
61 | Date: Fri, 27 Feb 2015 06:22:02 -0500
62 | Message-Id: <201502271122.t1RBM2RK022564@example.com>
63 | From: root@example.com (Cron Daemon)
64 | To: user@example.com
65 | Subject: Cron $PYTHON_BIN /home/user/script.py --lang ca 2>&1 | tee -a /home/user/logs/ca.txt
66 | Content-Type: text/plain; charset=ANSI_X3.4-1968
67 | X-Cron-Env:
68 | {u'complete': True}
69 | Some more fun ca output for our precious example.
70 |
--------------------------------------------------------------------------------
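A mailbox in this mbox format can also be inspected with Python's standard ``mailbox`` module. The sketch below is an illustration only: Cronfed itself opens the mailbox through ``boltons.mboxutils.mbox_readonlydir``, whereas this example swaps in the stdlib ``mailbox.mbox`` class, uses a plain ``'Cron <'`` prefix check instead of the subject regex, and works on ``SAMPLE_MBOX``, a hypothetical two-message mailbox written to a temporary file.

```python
import mailbox
import os
import tempfile

# Hypothetical two-message mbox: one cron email, one unrelated email.
SAMPLE_MBOX = (
    'From myuser@example.com Fri Feb 27 06:21:03 2015\n'
    'Date: Fri, 27 Feb 2015 06:21:03 -0500\n'
    'From: root@example.com (Cron Daemon)\n'
    'To: myuser@example.com\n'
    'Subject: Cron <myuser@example> /usr/bin/python fetch.py\n'
    '\n'
    'fetched 3 records\n'
    '\n'
    'From friend@example.com Fri Feb 27 06:22:02 2015\n'
    'Date: Fri, 27 Feb 2015 06:22:02 -0500\n'
    'From: friend@example.com (Your friend)\n'
    'To: myuser@example.com\n'
    'Subject: Cron? What cron?\n'
    '\n'
    'Some non-cron content\n'
)


def cron_subjects(mbox_path):
    """Return subjects of messages whose subject starts with 'Cron <'."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        subjects = [msg['subject'] or '' for msg in box]
    finally:
        box.close()
    return [s for s in subjects if s.startswith('Cron <')]


def demo():
    # Write the sample to a temporary file so the sketch is self-contained.
    handle, path = tempfile.mkstemp(suffix='.mbox')
    try:
        with os.fdopen(handle, 'w') as f:
            f.write(SAMPLE_MBOX)
        return cron_subjects(path)
    finally:
        os.remove(path)


print(demo())
```

Only the first message survives the filter; the note from a friend is skipped even though its subject begins with the word "Cron".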
/setup.py:
--------------------------------------------------------------------------------
1 | """
2 | Cronfed
3 | =======
4 |
5 | Cronfed is a tool for monitoring basic batch jobs, or any other
6 | cron-based scheduled commands. For more information see the README.
7 | """
8 | import os
9 | import sys
10 | from setuptools import setup
11 |
12 |
13 | __author__ = 'Mahmoud Hashemi and Mark Williams'
14 | __version__ = '0.2.1'
15 | __contact__ = 'mahmoudrhashemi@gmail.com'
16 | __url__ = 'https://github.com/hatnote/cronfed'
17 | __license__ = 'BSD'
18 |
19 |
20 | if sys.version_info >= (3,):
21 |     raise NotImplementedError("cronfed Python 3 support still en route to your location")
22 |
23 | CUR_PATH = os.path.dirname(os.path.abspath(__file__))
24 |
25 | setup(name='cronfed',
26 |       version=__version__,
27 |       description="Bare minimum cron job monitoring for the masses.",
28 |       long_description=open(CUR_PATH + '/README.rst').read(),
29 |       author=__author__,
30 |       author_email=__contact__,
31 |       url=__url__,
32 |       py_modules=['cronfed'],
33 |       install_requires=['boltons==0.4.1'],
34 |       zip_safe=True,
35 |       license=__license__,
36 |       platforms='any',
37 |       classifiers=[
38 |           'Intended Audience :: Developers',
39 |           'Topic :: Software Development :: Libraries',
40 |           'Programming Language :: Python :: 2.6',
41 |           'Programming Language :: Python :: 2.7', ]
42 |       )
43 |
--------------------------------------------------------------------------------