├── .gitignore
├── ANNOUNCE.md
├── README.rst
├── cronfed.py
├── example_mailbox
└── setup.py


/.gitignore:
--------------------------------------------------------------------------------
*.py[cod]
.DS_Store
*.log
venv/*

# emacs
*~
._*
.\#*
\#*\#

# C extensions
*.so

# Packages
*.egg
*.egg-info
dist
build
eggs
parts
bin
var
sdist
develop-eggs
.installed.cfg
lib
lib64

# Installer logs
pip-log.txt

# Unit test / coverage reports
.coverage
.tox
nosetests.xml

# Translations
*.mo

# Mr Developer
.mr.developer.cfg
.project
.pydevproject

--------------------------------------------------------------------------------
/ANNOUNCE.md:
--------------------------------------------------------------------------------
# Cronfed: Minimum Viable Monitoring

Every project should be monitored. It’s common sense. We all know
it. We’re all _sensible_ developers. What is constructed must
eventually fall, and without monitoring, we won’t know when or how to
fix it.

And yet thousands of projects go without logging or monitoring,
working fine until they don’t, preventably depressing countless
others.

[**Cronfed**](https://github.com/hatnote/cronfed) is a Python package
that embodies a **Minimum Viable Monitoring** mindset. Something is
better than nothing, and there are far too many cron jobs with
nothing. Whether lazy or busy, take 5 minutes and clear your
conscience.

`pip install cronfed`

`echo "*/15 * * * * /usr/bin/python -m cronfed --output /var/www/mysite/assets/cronfed.rss /var/mail/myuser 2>&1 | tee -a /home/myuser/project/logs/cronfed.txt"`

Just replace “myuser”, the “mysite” path, and the “logs” path with the
appropriate values and you’ll be ready to point your
[feedreader](http://theoldreader.com) or [IFTTT](https://ifttt.com/)
at your site and commence breathing easier. Learn more on [the Cronfed
docs](http://hatnote.github.io/cronfed/).

![simpsbird](https://31.media.tumblr.com/1ce65aa6920ac4f62c36790dab032342/tumblr_inline_nlhyytPLYW1ql4e1e.gif)

_If only all solutions were this easy and consequence-free._

We use [Cronfed](https://github.com/hatnote/cronfed) for Hatnote’s [Recent
Changes Map](http://rcmap.hatnote.com) and [Weeklypedia](http://weekly.hatnote.com)
(which just turned 1! [You should sign up](http://weekly.hatnote.com), learn
about Wikipedia, and experience the reliability
firsthand!). Predictably, everyone seems satisfied with a simple tool
that does one job well.

--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
Cronfed
=======

.. image:: https://farm9.staticflickr.com/8144/7544169948_8abb2bb2f3_m_d.jpg

**Cronfed** monitors basic batch jobs, or any other cron-based
scheduled commands, by parsing a given mailbox and turning it into an
RSS feed. The feed can in turn be monitored with your browser_,
feedreader_ or other RSS-compatible service (such as IFTTT_).

Simply add a cron job to generate the feed, pointing it at a
web-accessible location (such as a ``public_html`` directory or your
site's assets directory). Check out the example for some real-world
Cronfed usage, with an explanation of how cron and Cronfed work
together.

Cronfed is **Minimum Viable Monitoring**, aimed at providing a basic
threshold of monitoring without complex automation or dependencies.
It's targeted at smaller projects which otherwise might go without any
monitoring at all. It's so easy to set up and use on the standard
Linux/BSD machine that there's no reason to not use it from
Day 1. While Cronfed makes attempts at limiting the amount of
information externalized, it is not recommended for jobs with
extremely-sensitive information.

*"Cronfed: It's the least you could do!"*

Installation
------------

Cronfed is pure Python, has no system library dependencies, and should
work wonders on any POSIX machine with a functioning cron daemon and
local mail system::

    pip install cronfed

Run ``python -m cronfed --help`` to see options, or read on for a
usage example.

Example
-------

First, let's look at a basic cron job. This one will fetch our data
once an hour, on the hour::

    0 * * * * /usr/bin/python /home/myuser/project/fetch.py 2>&1 | tee -a /home/myuser/project/logs/fetch.txt

Notice how the output (``stdout`` + ``stderr``) is piped to a log file,
but using the ``tee`` command. This ensures that the output goes to the
file as well as ``stdout``. ``cron`` captures that ``stdout`` and puts it
into an email, which then gets sent to the user who owns the job. This
usually means the email goes to ``myuser@localhost``, which on many
distributions means that it is saved to ``/var/mail/myuser``. Do note
that if the command generates no output, then ``cron`` **will not send
an email**, so it's a good idea to emit an error message.

Once we're sure that email is being delivered, we're halfway
there.
Now we just need the actual Cronfed cronjob::

    */15 * * * * /usr/bin/python -m cronfed --output /var/www/mysite/assets/cronfed.rss /var/mail/myuser 2>&1 | tee -a /home/myuser/project/logs/cronfed.txt

In this example we have the installed ``cronfed`` module regenerating
our feed every fifteen minutes. This is a pretty quick process in most
cases, so feel free to run it more often. In this case, the output of
cronfed itself is monitored in exactly the same way as normal cron
jobs, with a logfile and email to ``myuser@localhost``.

History
-------

Cronfed was created for `Hatnote`_ to monitor the periodic data
refreshes necessary to generate `The Weeklypedia`_. See those cron
jobs and more in the `Weeklypedia crontab`_.

* Copyright: (c) 2015 by `Mark Williams`_ and `Mahmoud Hashemi`_
* License: BSD, see LICENSE for more details.


.. _browser: https://www.mozilla.org/en-US/firefox/new/
.. _feedreader: https://theoldreader.com/
.. _IFTTT: https://ifttt.com/
.. _Hatnote: http://hatnote.com
.. _The Weeklypedia: http://weekly.hatnote.com
.. _Weeklypedia crontab: https://github.com/hatnote/weeklypedia/blob/master/weeklypedia/crontab
.. _Mark Williams: https://github.com/markrwilliams/
.. _Mahmoud Hashemi: https://github.com/mahmoud/

--------------------------------------------------------------------------------
/cronfed.py:
--------------------------------------------------------------------------------

# Credit to Mark Williams for this
import re
import sys
import socket
import hashlib
import argparse
import datetime
from contextlib import closing
from collections import namedtuple
import xml.etree.cElementTree as ET

from boltons.tzutils import UTC
from boltons.mboxutils import mbox_readonlydir

GENERATOR_TEXT = 'cronfed v1.0'

DEFAULT_LINK = 'http://github.com/hatnote/cronfed'
DEFAULT_DESC = 'Fresh cron output from cronfed'
DEFAULT_TITLE = 'Cronfed on %s' % socket.gethostname()
DEFAULT_TTL = '30'  # time to live (how often to recheck in mins)
DEFAULT_GUID_URL_TMPL = None
# NOTE: would've used isPermalink=false but IFTTT does not like that

DEFAULT_EXCERPT = 16
DEFAULT_EXCLUDE_EXC = False
DEFAULT_SAVE = sys.maxint


DEFAULT_SUBJECT_RE = re.compile(
    'Cron <(?P<user>[^@].+)@(?P<host>[^>].+)> (?P<command>.*)')
DEFAULT_SUBJECT_TMPL = 'Cron: <%(user)s@%(host)s> %(command)s'
MESSAGE_ID_RE = re.compile('<(?P<id>[^@]+)@(?P<host>[^>+]+)>')


def find_python_error_type(text):
    from boltons.tbutils import ParsedTB
    try:
        tb_str = text[text.index('Traceback (most recent'):]
    except ValueError:
        raise ValueError('no traceback found')
    parsed_tb = ParsedTB.from_string(tb_str)
    return parsed_tb.exc_type


class CronFeeder(object):
    def __init__(self, mailbox_path, **kwargs):
        self.mailbox_path = mailbox_path
        self.output_path = kwargs.pop('output_path', None)
        self.save_count = kwargs.pop('save_count', DEFAULT_SAVE)
        self.feed_title = kwargs.pop('feed_title', DEFAULT_TITLE)
        self.feed_desc = kwargs.pop('feed_desc', DEFAULT_DESC)
        self.feed_link = kwargs.pop('feed_link', DEFAULT_LINK)
        self.feed_ttl = kwargs.pop('feed_ttl', DEFAULT_TTL)
        self.guid_url_tmpl = kwargs.pop('guid_url_tmpl', DEFAULT_GUID_URL_TMPL)
        if not self.guid_url_tmpl:
            self.guid_url_tmpl = self.feed_link + '/{guid}'

        self.subject_re = kwargs.pop('subject_re', DEFAULT_SUBJECT_RE)
        self.subject_tmpl = kwargs.pop('subject_tmpl', DEFAULT_SUBJECT_TMPL)

        self.excerpt_len = kwargs.pop('excerpt_len', DEFAULT_EXCERPT)
        self.exclude_exc = kwargs.pop('exclude_exc', DEFAULT_EXCLUDE_EXC)
        if kwargs:
            raise TypeError('unexpected keyword arguments: %r' % kwargs.keys())

    def process(self):
        with closing(mbox_readonlydir(self.mailbox_path)) as mbox:
            emails = self._process_emails(mbox)
            rss_items = []
            for email in emails:
                rss_items.append(RSSItem.from_email(email,
                                                    parser=self.subject_re,
                                                    renderer=self.subject_tmpl,
                                                    excerpt=self.excerpt_len))
        rendered = self._render_feed(rss_items)

        if self.output_path:
            with open(self.output_path, 'w') as f:
                f.write(rendered)
        else:
            print rendered
        return

    def _process_emails(self, mbox):
        ret = []
        for key, email in reversed(mbox.items()):
            if self.subject_re.match(email.get('subject')):
                if len(ret) < self.save_count:
                    ret.append(email)
                else:
                    del mbox[key]
        return ret

    def _render_feed(self, rss_items):
        rss = ET.Element('rss', {'version': '2.0'})
        channel = ET.SubElement(rss, 'channel')
        title_elem = ET.SubElement(channel, 'title')
        title_elem.text = self.feed_title
        desc_elem = ET.SubElement(channel, 'description')
        desc_elem.text = self.feed_desc
        link_elem = ET.SubElement(channel, 'link')
        link_elem.text = self.feed_link
        ttl_elem = ET.SubElement(channel, 'ttl')
        ttl_elem.text = self.feed_ttl

        gen_elem = ET.SubElement(channel, 'generator')
        gen_elem.text = GENERATOR_TEXT

        lbd_elem = ET.SubElement(channel, 'lastBuildDate')
        _now = datetime.datetime.now(tz=UTC)
        lbd_elem.text = _now.strftime('%a, %d %b %Y %H:%M:%S %z')

        for rss_item in rss_items:
            item = ET.SubElement(channel, 'item')
            for tag, text in rss_item._asdict().items():
                if tag == 'link' and text is None:
                    text = self.feed_link  # TODO: make this configurable?
                elif tag == 'guid':
                    text = self.guid_url_tmpl.format(guid=text)
                elem = ET.SubElement(item, tag)
                elem.text = text
        return ET.tostring(rss, encoding='UTF-8')

    @staticmethod
    def get_argparser():
        prs = argparse.ArgumentParser()
        add_arg = prs.add_argument
        add_arg('mailbox_path',
                help='path to the mailbox file to process for cron output')
        add_arg('--output', '-o', default=None,
                help='where to write the output, defaults to stdout')
        add_arg('--save', default=DEFAULT_SAVE, type=int,
                help='the number of cron emails to save, defaults to'
                ' saving all of them')
        add_arg('--title', default=DEFAULT_TITLE,
                help='top-level title for the RSS feed')
        add_arg('--desc', default=DEFAULT_DESC,
                help='top-level description for the RSS feed')
        add_arg('--link', default=DEFAULT_LINK,
                help='top-level home page URL for the RSS feed')
        add_arg('--ttl', default=DEFAULT_TTL,
                help='recommended minutes between feed reader update checks')

        add_arg('--exclude-exc', default=DEFAULT_EXCLUDE_EXC,
                action='store_true', help='whether to search for and include'
                ' Python exception types in the feed')
        add_arg('--excerpt', default=DEFAULT_EXCERPT, type=int,
                help='how much cron job output to include, defaults to a small'
                ' amount, specify 0 to disable excerpting')
        add_arg('--guid-url-tmpl', default=DEFAULT_GUID_URL_TMPL,
                help='template used to generate individual item links')
        return prs

    @classmethod
    def from_args(cls):
        kwarg_map = {'save': 'save_count',
                     'output': 'output_path',
                     'excerpt': 'excerpt_len',
                     'title': 'feed_title',
                     'desc': 'feed_desc',
                     'link': 'feed_link',
                     'ttl': 'feed_ttl'}
        prs = cls.get_argparser()
        kwargs = dict(prs.parse_args()._get_kwargs())
        for src, dest in kwarg_map.items():
            kwargs[dest] = kwargs.pop(src)
        return cls(**kwargs)


BaseRSSItem = namedtuple('RSSItem', ['title', 'description', 'link',
                                     'pubDate', 'guid'])


class RSSItem(BaseRSSItem):
    @classmethod
    def from_email(cls, email,
                   parser=DEFAULT_SUBJECT_RE,
                   renderer=DEFAULT_SUBJECT_TMPL,
                   excerpt=DEFAULT_EXCERPT):
        body = email.get_payload()
        date = email['date']
        subject = email['subject']

        match = parser.match(subject)
        if not match:
            raise ValueError("Unparseable subject")
        subject_dict = match.groupdict()
        title = renderer % subject_dict

        match = MESSAGE_ID_RE.match(email.get('message-id', ''))
        if match:
            guid = match.group('id')
        else:
            guid = hashlib.sha224(date + subject + body).hexdigest()

        desc = 'At %s, Cron ran:\n\n\t%s' % (date, subject_dict['command'])
        try:
            python_error_type = find_python_error_type(body)
        except Exception:
            # no traceback in the output, or it could not be parsed
            python_error_type = None
        if python_error_type:
            desc += '\n\nPython exception: %s.' % python_error_type

        if body:
            desc += '\n\nCommand output:\n\n\t' + summarize(body, excerpt)

        return cls(title=title, description=desc, link=None,
                   pubDate=date, guid=guid)


def summarize(text, length):
    """
    Length is the amount of text to show. It doesn't include the
    length that the summarization adds back in.
    """
    len_diff = len(text) - length
    if len_diff <= 0:
        return text
    elif not length:
        return '(%s bytes)' % len(text)
    return ''.join([text[:length/2],
                    '... (%s bytes) ...' % len_diff,
                    text[-length/2:]])


def main():
    cronfeeder = CronFeeder.from_args()
    cronfeeder.process()


if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
/example_mailbox:
--------------------------------------------------------------------------------
From user@example.com Fri Feb 27 06:21:03 2015
Return-Path:
Received: from example.com (localhost.localdomain [127.0.0.1])
    by example.com (8.14.3/8.14.3/Debian-9.4) with ESMTP id t1RBL3ih022545
    for ; Fri, 27 Feb 2015 06:21:03 -0500
Received: (from user@localhost)
    by example.com (8.14.3/8.14.3/Submit) id t1RBL3RP022544
    for user; Fri, 27 Feb 2015 06:21:03 -0500
Date: Fri, 27 Feb 2015 06:21:03 -0500
Message-Id: <201502271121.t1RBL3RP022544@example.com>
From: root@example.com (Cron Daemon)
To: user@example.com
Subject: Cron $PYTHON_BIN /home/user/script.py --lang da 2>&1 | tee -a /home/user/logs/da.txt
Content-Type: text/plain; charset=ANSI_X3.4-1968
X-Cron-Env:
{u'complete': True}
Some more fun da output for our precious example.

From user@example.com Fri Feb 27 06:21:03 2015
Return-Path:
Received: from example.com (localhost.localdomain [127.0.0.1])
    by example.com (8.14.3/8.14.3/Debian-9.4) with ESMTP id t1RBL3ed022549
    for ; Fri, 27 Feb 2015 06:21:03 -0500
Received: (from user@localhost)
    by example.com (8.14.3/8.14.3/Submit) id t1RBL3Uf022546
    for user; Fri, 27 Feb 2015 06:21:03 -0500
Date: Fri, 27 Feb 2015 06:21:03 -0500
Message-Id: <201502271121.t1RBL3Uf022546@example.com>
From: root@example.com (Cron Daemon)
To: user@example.com
Subject: Cron $PYTHON_BIN /home/user/script.py --lang it 2>&1 | tee -a /home/user/logs/it.txt
Content-Type: text/plain; charset=ANSI_X3.4-1968
X-Cron-Env:
{u'complete': True}
Some more fun it output for our precious example.

From friend@example.com Fri Feb 27 06:22:02 2015
Return-Path:
Received: from example.com (localhost.localdomain [127.0.0.1])
    by example.com (8.14.3/8.14.3/Debian-9.4) with ESMTP id t1RBM23c022565
    for ; Fri, 27 Feb 2015 06:22:02 -0500
Received: (from user@localhost)
    by example.com (8.14.3/8.14.3/Submit) id t1RBM2RK022564
    for user; Fri, 27 Feb 2015 06:22:02 -0500
Date: Fri, 27 Feb 2015 06:22:02 -0500
Message-Id: <201502271122.t1RBM2RK022564@example.com>
From: friend@example.com (Your friend)
To: user@example.com
Subject: Cron? What cron?
Content-Type: text/plain; charset=ANSI_X3.4-1968
Some non-cron content, because cron isn't everything

From user@example.com Fri Feb 27 06:22:02 2015
Return-Path:
Received: from example.com (localhost.localdomain [127.0.0.1])
    by example.com (8.14.3/8.14.3/Debian-9.4) with ESMTP id t1RBM23c022565
    for ; Fri, 27 Feb 2015 06:22:02 -0500
Received: (from user@localhost)
    by example.com (8.14.3/8.14.3/Submit) id t1RBM2RK022564
    for user; Fri, 27 Feb 2015 06:22:02 -0500
Date: Fri, 27 Feb 2015 06:22:02 -0500
Message-Id: <201502271122.t1RBM2RK022564@example.com>
From: root@example.com (Cron Daemon)
To: user@example.com
Subject: Cron $PYTHON_BIN /home/user/script.py --lang ca 2>&1 | tee -a /home/user/logs/ca.txt
Content-Type: text/plain; charset=ANSI_X3.4-1968
X-Cron-Env:
{u'complete': True}
Some more fun ca output for our precious example.

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
"""
    Cronfed
    =======

    Cronfed is a tool for monitoring basic batch jobs, or any other
    cron-based scheduled commands. For more information see the README.
"""
import os
import sys
from setuptools import setup


__author__ = 'Mahmoud Hashemi and Mark Williams'
__version__ = '0.2.1'
__contact__ = 'mahmoudrhashemi@gmail.com'
__url__ = 'https://github.com/hatnote/cronfed'
__license__ = 'BSD'


if sys.version_info >= (3,):
    raise NotImplementedError("cronfed Python 3 support still en route to your location")

CUR_PATH = os.path.dirname(os.path.abspath(__file__))

setup(name='cronfed',
      version=__version__,
      description="Bare minimum cron job monitoring for the masses.",
      long_description=open(CUR_PATH + '/README.rst').read(),
      author=__author__,
      author_email=__contact__,
      url=__url__,
      py_modules=['cronfed'],
      install_requires=['boltons==0.4.1'],
      zip_safe=True,
      license=__license__,
      platforms='any',
      classifiers=[
          'Intended Audience :: Developers',
          'Topic :: Software Development :: Libraries',
          'Programming Language :: Python :: 2.6',
          'Programming Language :: Python :: 2.7', ]
      )

--------------------------------------------------------------------------------
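
The subject matching and excerpting steps in cronfed.py can be tried in isolation, without a mailbox or boltons installed. What follows is a minimal sketch, not part of the package. It is written for Python 3, whereas cronfed itself targets Python 2. The pattern assumes the usual `Cron <user@host> command` subject layout, with group names chosen to match the keys used by DEFAULT_SUBJECT_TMPL. `parse_cron_subject` is a hypothetical helper name introduced only for illustration.

```python
import re

# Assumed subject layout: "Cron <user@host> command". The group names mirror
# the keys that DEFAULT_SUBJECT_TMPL in cronfed.py expects.
SUBJECT_RE = re.compile(
    r'Cron <(?P<user>[^@].+)@(?P<host>[^>].+)> (?P<command>.*)')
SUBJECT_TMPL = 'Cron: <%(user)s@%(host)s> %(command)s'


def parse_cron_subject(subject):
    """Hypothetical helper: return the subject's parts, or None for non-cron mail."""
    match = SUBJECT_RE.match(subject)
    return match.groupdict() if match else None


def summarize(text, length):
    """Python 3 rendition of cronfed's summarize(): keep both ends of long output."""
    len_diff = len(text) - length
    if len_diff <= 0:
        return text
    elif not length:
        return '(%s bytes)' % len(text)
    # "//" reproduces what "/" does for ints under Python 2
    return ''.join([text[:length // 2],
                    '... (%s bytes) ...' % len_diff,
                    text[-length // 2:]])


if __name__ == '__main__':
    parts = parse_cron_subject('Cron <user@example.com> /home/user/script.py --lang da')
    print(SUBJECT_TMPL % parts)
    print(parse_cron_subject('Cron? What cron?'))
    print(summarize('x' * 40, 16))
```

Mail whose subject does not match the pattern yields None, which corresponds to how `_process_emails` skips non-cron messages. Output longer than the excerpt length is reduced to its first and last halves with a byte count in between.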