├── docs ├── diagram.png ├── usecase_http-monitoring-sms.png ├── usecase_http-monitoring-email.png ├── usecase_prometheus-alerting-email.png ├── usecase_http-monitoring-page-screenshot.png ├── usecase_cloudwatch-alerting-healthy-queue.png ├── usecase_prometheus-alerting-graph-normal.png ├── usecase_cloudwatch-alerting-unhealthy-queue.png ├── usecase_prometheus-alerting-graph-unhealthy.png ├── usecase_http-monitoring-canary-checkdefinition.png ├── usecase_http-monitoring.md ├── setup_dynamodb.md ├── diagram.md ├── setup_alertmanager.md ├── setup_sns.md ├── setup_custom_integration.md ├── usecase_cloudwatch-alerting.md ├── setup_iam.md ├── setup_apigateway.md ├── usecase_prometheus-alerting.md └── setup_alertmanager-canary.md ├── bin └── build.sh ├── .github └── workflows │ └── build.yml ├── go.mod ├── turbobob.json ├── cmd └── alertmanager │ ├── lambdahandler.go │ ├── sns.go │ ├── main.go │ ├── httpmonitorscanner_test.go │ ├── alerts.go │ ├── ingest.go │ ├── deadmansswitches_test.go │ ├── deadmansswitches.go │ ├── scheduled_test.go │ ├── httpmonitorscanner.go │ ├── scheduled.go │ ├── httpmonitors.go │ └── restapi.go ├── pkg ├── amstate │ ├── types.go │ ├── utils.go │ ├── store.go │ └── store_test.go ├── alertmanagertypes │ └── types.go ├── alertmanagerclient │ └── client.go └── amdomain │ └── events.go ├── README.md ├── LICENSE └── go.sum /docs/diagram.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/diagram.png -------------------------------------------------------------------------------- /docs/usecase_http-monitoring-sms.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_http-monitoring-sms.png -------------------------------------------------------------------------------- /docs/usecase_http-monitoring-email.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_http-monitoring-email.png -------------------------------------------------------------------------------- /docs/usecase_prometheus-alerting-email.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_prometheus-alerting-email.png -------------------------------------------------------------------------------- /docs/usecase_http-monitoring-page-screenshot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_http-monitoring-page-screenshot.png -------------------------------------------------------------------------------- /docs/usecase_cloudwatch-alerting-healthy-queue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_cloudwatch-alerting-healthy-queue.png -------------------------------------------------------------------------------- /docs/usecase_prometheus-alerting-graph-normal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_prometheus-alerting-graph-normal.png -------------------------------------------------------------------------------- /docs/usecase_cloudwatch-alerting-unhealthy-queue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_cloudwatch-alerting-unhealthy-queue.png -------------------------------------------------------------------------------- /docs/usecase_prometheus-alerting-graph-unhealthy.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_prometheus-alerting-graph-unhealthy.png -------------------------------------------------------------------------------- /docs/usecase_http-monitoring-canary-checkdefinition.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/function61/lambda-alertmanager/HEAD/docs/usecase_http-monitoring-canary-checkdefinition.png -------------------------------------------------------------------------------- /bin/build.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -eu 2 | 3 | source /build-common.sh 4 | 5 | BINARY_NAME="alertmanager" 6 | COMPILE_IN_DIRECTORY="cmd/alertmanager" 7 | 8 | # aws has non-gofmt code.. 9 | GOFMT_TARGETS="cmd/ pkg/" 10 | 11 | # TODO: one deployerspec is done, we can stop overriding this from base image 12 | function packageLambdaFunction { 13 | cd rel/ 14 | cp "${BINARY_NAME}_linux-amd64" "${BINARY_NAME}" 15 | rm -f lambdafunc.zip 16 | zip lambdafunc.zip "${BINARY_NAME}" 17 | rm "${BINARY_NAME}" 18 | } 19 | 20 | standardBuildProcess 21 | 22 | packageLambdaFunction 23 | -------------------------------------------------------------------------------- /docs/usecase_http-monitoring.md: -------------------------------------------------------------------------------- 1 | Use case: HTTP monitoring 2 | ========================= 3 | 4 | You have an important web property that you want to monitor: 5 | 6 | ![](usecase_http-monitoring-page-screenshot.png) 7 | 8 | You have AlertManager-Canary installed and configured to monitor it: 9 | 10 | ![](usecase_http-monitoring-canary-checkdefinition.png) 11 | 12 | So, if Canary fails to find this text from the page: 13 | 14 | ``` 15 | Hostname: c70e24a08b3a 16 | ``` 17 | 18 | It'll send an alert (to configurable receivers), for example by 
email: 19 | 20 | ![](usecase_http-monitoring-email.png) 21 | 22 | And SMS: 23 | 24 | ![](usecase_http-monitoring-sms.png) 25 | -------------------------------------------------------------------------------- /.github/workflows/build.yml: -------------------------------------------------------------------------------- 1 | name: Build 2 | 3 | on: [push] 4 | 5 | jobs: 6 | build: 7 | runs-on: ubuntu-latest 8 | steps: 9 | - uses: actions/checkout@v2 10 | - name: Download Turbo Bob 11 | run: curl --fail --location --output bob https://dl.bintray.com/function61/dl/turbobob/20200220_1142_9c1ea959/bob_linux-amd64 && chmod +x bob 12 | - name: Build with Turbo Bob 13 | run: CI_REVISION_ID="$GITHUB_SHA" ./bob build --publish-artefacts 14 | # unfortunately there doesn't seem to be a way to "expose all secrets", so you must 15 | # list here each secret to pass on to the build 16 | env: 17 | GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} 18 | EVENTHORIZON: ${{ secrets.EVENTHORIZON }} 19 | -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module github.com/function61/lambda-alertmanager 2 | 3 | go 1.13 4 | 5 | require ( 6 | github.com/apcera/termtables v0.0.0-20170405184538-bcbc5dc54055 // indirect 7 | github.com/apex/gateway v1.1.1 8 | github.com/aws/aws-lambda-go v1.14.0 9 | github.com/aws/aws-sdk-go v1.29.0 10 | github.com/function61/eventhorizon v0.2.1-0.20200227140656-f89fe5d462ca 11 | github.com/function61/gokit v0.0.0-20200307135016-6dd948616ce0 12 | github.com/mattn/go-runewidth v0.0.8 // indirect 13 | github.com/scylladb/termtables v1.0.0 14 | github.com/spf13/cobra v0.0.6 15 | github.com/stretchr/testify v1.5.1 // indirect 16 | github.com/tj/assert v0.0.0-20190920132354-ee03d75cd160 // indirect 17 | golang.org/x/net v0.0.0-20200226121028-0de0cce0169b // indirect 18 | ) 19 | -------------------------------------------------------------------------------- 
/turbobob.json: -------------------------------------------------------------------------------- 1 | { 2 | "for_description_of_this_file_see": "https://github.com/function61/turbobob", 3 | "version_major": 1, 4 | "project_name": "lambda-alertmanager", 5 | "builders": [ 6 | { 7 | "name": "default", 8 | "uses": "docker://fn61/buildkit-golang:20200423_1235_75c6eae7", 9 | "mount_source": "", 10 | "mount_destination": "/workspace", 11 | "workdir": "/workspace", 12 | "commands": { 13 | "build": ["bin/build.sh"], 14 | "dev": ["bash"] 15 | } 16 | }, 17 | { 18 | "name": "publisher", 19 | "uses": "docker://fn61/buildkit-publisher:20200228_1755_83c203ff", 20 | "mount_destination": "/workspace", 21 | "commands": { 22 | "publish": ["publish-gh.sh", "function61/lambda-alertmanager", "rel/"], 23 | "dev": ["bash"] 24 | }, 25 | "pass_envs": [ 26 | "GITHUB_TOKEN", 27 | "EVENTHORIZON" 28 | ] 29 | } 30 | ], 31 | "os_arches": { 32 | "linux-amd64": true 33 | } 34 | } 35 | -------------------------------------------------------------------------------- /docs/setup_dynamodb.md: -------------------------------------------------------------------------------- 1 | Setting up DynamoDB 2 | =================== 3 | 4 | DynamoDB holds all the state that AlertManager keeps. The state is required to implement: 5 | 6 | - Acknowledgements: mark an alert as having been handled. 7 | - Alarm suppression: if alert is not yet acknowledged as handled, the same alert should not be sent twice. 8 | - Rate limiting: if too many alarms are triggering, do not overwhelm operations' inboxes & phone with alerts. 9 | 10 | 11 | Create table 12 | ------------ 13 | 14 | Create table: 15 | 16 | - Name = `alertmanager_alerts` 17 | - Primary key = `alert_key` (type=string) 18 | - Use default settings 19 | - `[ Create ]` 20 | 21 | 22 | Enable stream 23 | ------------- 24 | 25 | Now we need to enable stream for that table, so our Lambda function can listen to changes in this table. 
26 | 27 | From `alertmanager_alerts > Overview > Stream details > Manage stream`: 28 | 29 | - View type = `New image` 30 | - `[ Enable ]` 31 | -------------------------------------------------------------------------------- /cmd/alertmanager/lambdahandler.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "errors" 6 | 7 | "github.com/aws/aws-lambda-go/events" 8 | "github.com/aws/aws-lambda-go/lambda" 9 | "github.com/function61/gokit/aws/lambdautils" 10 | ) 11 | 12 | func lambdaHandler() { 13 | restApi := newRestApi(context.Background()) 14 | 15 | handler := func(ctx context.Context, polymorphicEvent interface{}) ([]byte, error) { 16 | switch event := polymorphicEvent.(type) { 17 | case *events.CloudWatchEvent: 18 | return nil, handleCloudwatchScheduledEvent(ctx, event.Time) 19 | case *events.SNSEvent: 20 | return nil, handleSnsIngest(ctx, *event) 21 | case *events.APIGatewayProxyRequest: 22 | return lambdautils.ServeApiGatewayProxyRequestUsingHttpHandler( 23 | ctx, 24 | event, 25 | restApi) 26 | default: 27 | return nil, errors.New("cannot identify type of request") 28 | } 29 | } 30 | 31 | lambda.StartHandler(lambdautils.NewMultiEventTypeHandler(handler)) 32 | } 33 | -------------------------------------------------------------------------------- /pkg/amstate/types.go: -------------------------------------------------------------------------------- 1 | package amstate 2 | 3 | import ( 4 | "time" 5 | ) 6 | 7 | // for snapshots 8 | type stateFormat struct { 9 | LastUnnoticedAlertsNotified time.Time `json:"last_unnoticed_alerts_notified"` 10 | ActiveAlerts map[string]Alert `json:"active_alerts"` 11 | HttpMonitors map[string]HttpMonitor `json:"http_monitors"` 12 | DeadMansSwitches map[string]DeadMansSwitch `json:"dead_mans_switches"` 13 | } 14 | 15 | type Alert struct { 16 | Id string `json:"alert_key"` // name in JSON for backwards compat 17 | Subject string `json:"subject"` // same 
type of error should always have same subject 18 | Details string `json:"details"` 19 | Timestamp time.Time `json:"timestamp"` 20 | } 21 | 22 | type HttpMonitor struct { 23 | Id string `json:"id"` 24 | Created time.Time `json:"created"` 25 | Enabled bool `json:"enabled"` 26 | Url string `json:"url"` 27 | Find string `json:"find"` 28 | } 29 | 30 | type DeadMansSwitch struct { 31 | Subject string `json:"subject"` 32 | Ttl time.Time `json:"ttl"` 33 | } 34 | -------------------------------------------------------------------------------- /docs/diagram.md: -------------------------------------------------------------------------------- 1 | [Gravizo](https://gravizo.com/) source code: 2 | 3 | ``` 4 | digraph G { 5 | Prometheus; 6 | custom [label="Custom integration"]; 7 | cloudwatch_alarms [label="Cloudwatch Alarms"]; 8 | alertmanager_canary [label="HTTP(S) monitoring%5CnLambda: AlertManager Canary"]; 9 | sns_ingest [label="SNS topic:%5CnAlertManager-ingest"]; 10 | http [label="HTTPS (API Gateway)%5Cn- POST /alerts/ingest"]; 11 | receive_alarm [label="Receive alarm%5CnLambda: AlertManager"]; 12 | alarm_already_triggering [label="Alarm already triggering?"]; 13 | Discard; 14 | rate_limit_exceeded [label="Rate limit exceeded?"]; 15 | store_alarm_dynamodb [label="Store alarm%5Cn- DynamoDB"]; 16 | dynamodb_trigger [label="DynamoDB trigger%5Cn- Row inserted: send alert"]; 17 | sns_alert [label="SNS topic:%5CnAlertManager-alert"]; 18 | sns_email [label="Email%5Cnops@example.com"]; 19 | sns_sms [label="SMS%5Cn+358 40 123 456"]; 20 | Prometheus -> http; 21 | custom -> http; 22 | cloudwatch_alarms -> sns_ingest; 23 | alertmanager_canary -> sns_ingest; 24 | sns_ingest -> receive_alarm; 25 | http -> receive_alarm; 26 | receive_alarm -> alarm_already_triggering; 27 | alarm_already_triggering -> Discard [label=" yes"]; 28 | alarm_already_triggering -> rate_limit_exceeded [label=" no"]; 29 | rate_limit_exceeded -> Discard [label=" yes"]; 30 | rate_limit_exceeded -> 
store_alarm_dynamodb; 31 | store_alarm_dynamodb -> dynamodb_trigger; 32 | dynamodb_trigger -> sns_alert; 33 | sns_alert -> sns_email; 34 | sns_alert -> sns_sms; 35 | } 36 | ``` 37 | -------------------------------------------------------------------------------- /docs/setup_alertmanager.md: -------------------------------------------------------------------------------- 1 | Setting up AlertManager 2 | ======================= 3 | 4 | 5 | Create Lambda function 6 | ---------------------- 7 | 8 | - Go to `Lambda > Create a Lambda function > Blank function`. 9 | - Do not configure any triggers at this time (just hit next). 10 | - Name: `AlertManager` 11 | - Description: `AlertManager main: ingestor & alerter` 12 | - Runtime: `Node.js 4.3` 13 | - Code entry type: `Upload a .ZIP file` 14 | - Download the latest `alertmanager.zip` from the releases page (on GitHub) 15 | to your desktop and then upload it to Lambda 16 | 17 | Env variables: 18 | 19 | - `ALERT_TOPIC` = ARN of your alert topic (mine looked like `arn:aws:sns:us-west-2:426466625513:AlertManager-alert`) 20 | 21 | Role config: 22 | 23 | - Handler: leave as is 24 | - Role: leave as is (`Choose existing role`) 25 | - Existing role: `AlertManager` 26 | 27 | Advanced config: 28 | 29 | - Memory (MB): leave as is (`128`) 30 | - Timeout: `1 min` 31 | 32 | Okay, now hit `[ Create function ]`. 33 | 34 | 35 | Add trigger for "alertmanager_alerts" DynamoDB table 36 | ---------------------------------------------------- 37 | 38 | Go to `Triggers > Add > DynamoDB`: 39 | 40 | - Table = `alertmanager_alerts` 41 | - Batch size = `1` 42 | - Starting position = `Trim horizon` 43 | 44 | 45 | Add trigger for "AlertManager-ingest" SNS topic 46 | ----------------------------------------------- 47 | 48 | Go to `Triggers > Add > SNS`: 49 | 50 | - Topic = `AlertManager-ingest` 51 | 52 | This topic allows the ingestor to receive alerts from Canary, CloudWatch & other SNS-compatible sources. 
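Once an alarm arrives through this topic (or over HTTPS), the diagram shows AlertManager checking whether an alert with the same subject is already firing, and discarding the newcomer if so. Below is a minimal sketch of that check using only the standard library. The `Alert` struct is a trimmed-down stand-in for the repository's type, and `ingest` and `findAlertWithSubject` are hypothetical helpers written for illustration, not the project's actual functions. Rate limiting and persistence are deliberately left out.

```go
package main

import (
	"fmt"
	"time"
)

// Alert is a simplified stand-in for the repository's alert type.
// Only Subject matters for deduplication in this sketch.
type Alert struct {
	Subject   string
	Details   string
	Timestamp time.Time
}

// findAlertWithSubject (hypothetical helper) returns the index of the firing
// alert with the given subject, or -1 when no such alert is firing.
func findAlertWithSubject(subject string, firing []Alert) int {
	for i, a := range firing {
		if a.Subject == subject {
			return i
		}
	}
	return -1
}

// ingest (hypothetical helper) appends candidate to firing unless an alert
// with the same subject is already firing. The bool reports acceptance.
func ingest(firing []Alert, candidate Alert) ([]Alert, bool) {
	if findAlertWithSubject(candidate.Subject, firing) != -1 {
		return firing, false // already firing: discard
	}
	return append(firing, candidate), true
}

func main() {
	firing := []Alert{}
	for _, subject := range []string{"www.example.com", "www.example.com", "Queue XYZ health"} {
		var accepted bool
		firing, accepted = ingest(firing, Alert{Subject: subject, Timestamp: time.Now()})
		fmt.Printf("%s accepted=%v\n", subject, accepted)
	}
	fmt.Println("firing alerts:", len(firing))
}
```

Keying on the subject is why the same type of error should always carry the same subject: a repeated failure then collapses into one firing alert instead of many.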
53 | -------------------------------------------------------------------------------- /docs/setup_sns.md: -------------------------------------------------------------------------------- 1 | Setting up SNS 2 | ============== 3 | 4 | SNS is basically a pub/sub solution from AWS. 5 | 6 | 7 | Create topic "AlertManager-ingest" 8 | ---------------------------------- 9 | 10 | In `SNS > Topics > Create new topic`: 11 | 12 | - Topic name = `AlertManager-ingest` 13 | - Display name = (leave blank) 14 | 15 | Write the `Topic ARN` down - you'll need this when setting up Lambda. 16 | 17 | 18 | Create topic "AlertManager-alert" 19 | --------------------------------- 20 | 21 | In `SNS > Topics > Create new topic`: 22 | 23 | - Topic name = `AlertManager-alert` 24 | - Display name = `ALERT` (this is shown in SMS message prefix etc.) 25 | 26 | Write the `Topic ARN` down (for this topic as well) - you'll need this when setting up Lambda. 27 | 28 | 29 | Add first subscriber to alert topic 30 | ----------------------------------- 31 | 32 | Now, `SNS > Topics > AlertManager-alert > Actions > Subscribe`: 33 | 34 | - Protocol: `Email` 35 | - Endpoint: `your.email@example.com` 36 | 37 | AWS just sent you an email. Open that email and confirm your subscription. 38 | This has to be done only once per subscription. 39 | 40 | You can later set up SMS delivery by adding a new subscription to the `AlertManager-alert` topic. 41 | 42 | 43 | What is the difference between "ingest" and "alert" topics? 44 | ----------------------------------------------------------- 45 | 46 | The diagram in [README](../README.md) explains this best! Look for the SNS topics. 47 | 48 | TL;DR: `ingest` processes high-bandwidth alarms and `alert` delivers filtered low-bandwidth alerts. 
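Because the `AlertManager-alert` topic can have both email and SMS subscribers, a message published to it can carry a different body per protocol. SNS supports this when a message is published with its message structure set to `json`: the body is then a JSON document with a required `default` key plus optional protocol keys such as `sms`. The sketch below builds such a document with the standard library only. `truncate` and `perProtocolMessage` are hypothetical helpers, and the limits (4 KiB for the default body, 160 minus 7 characters for SMS to leave room for the `ALERT >` prefix) mirror the values used in `cmd/alertmanager/sns.go` in this dump.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// truncate (hypothetical helper) cuts s to at most limit runes.
func truncate(s string, limit int) string {
	r := []rune(s)
	if len(r) <= limit {
		return s
	}
	return string(r[:limit])
}

// perProtocolMessage (hypothetical helper) builds the JSON body for a publish
// call that uses the "json" message structure.
func perProtocolMessage(subject, details string) (string, error) {
	text := subject + "\n\n" + details
	payload := struct {
		Default string `json:"default"` // email and any other protocol
		Sms     string `json:"sms"`
	}{
		Default: truncate(text, 4*1024),
		Sms:     truncate(text, 160-7), // leave room for the "ALERT >" prefix
	}
	out, err := json.Marshal(payload)
	return string(out), err
}

func main() {
	msg, err := perProtocolMessage("www.example.com", "I dont like the page")
	if err != nil {
		panic(err)
	}
	fmt.Println(msg)
}
```

Truncating by runes rather than bytes is a choice made for this sketch so that multi-byte characters are never split; the project's own truncation helper may behave differently.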
49 | -------------------------------------------------------------------------------- /pkg/alertmanagertypes/types.go: -------------------------------------------------------------------------------- 1 | package alertmanagertypes 2 | 3 | import ( 4 | "fmt" 5 | "time" 6 | ) 7 | 8 | type Alert struct { 9 | Key string `json:"alert_key"` 10 | Subject string `json:"subject"` // same type of error should always have same subject 11 | Details string `json:"details"` 12 | Timestamp time.Time `json:"timestamp"` 13 | } 14 | 15 | func NewAlert(subject string, details string) Alert { 16 | return Alert{ 17 | Subject: subject, 18 | Details: details, 19 | Timestamp: time.Now(), 20 | } 21 | } 22 | 23 | func (a *Alert) Equal(other Alert) bool { 24 | return a.Subject == other.Subject 25 | } 26 | 27 | type DeadMansSwitch struct { 28 | Subject string `json:"subject"` 29 | TTL time.Time `json:"ttl"` 30 | } 31 | 32 | func (d *DeadMansSwitch) AsAlert(now time.Time) Alert { 33 | return Alert{ 34 | Subject: d.Subject, 35 | Timestamp: now, 36 | Details: fmt.Sprintf("Check-in late by %s (%s)", now.Sub(d.TTL), d.TTL.Format(time.RFC3339Nano)), 37 | } 38 | } 39 | 40 | // otherwise the same but TTL in un-expanded form 41 | type DeadMansSwitchCheckinRequest struct { 42 | Subject string `json:"subject"` 43 | TTL string `json:"ttl"` 44 | } 45 | 46 | func (d *DeadMansSwitchCheckinRequest) AsAlert(details string) Alert { 47 | return Alert{ 48 | Subject: d.Subject, 49 | Details: details, 50 | Timestamp: time.Now(), 51 | } 52 | } 53 | 54 | func NewDeadMansSwitchCheckinRequest(subject string, ttl string) DeadMansSwitchCheckinRequest { 55 | return DeadMansSwitchCheckinRequest{ 56 | Subject: subject, 57 | TTL: ttl, 58 | } 59 | } 60 | -------------------------------------------------------------------------------- /cmd/alertmanager/sns.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "encoding/json" 5 | 6 | 
"github.com/aws/aws-sdk-go/aws" 7 | "github.com/aws/aws-sdk-go/aws/session" 8 | "github.com/aws/aws-sdk-go/service/sns" 9 | "github.com/function61/gokit/envvar" 10 | "github.com/function61/gokit/stringutils" 11 | "github.com/function61/lambda-alertmanager/pkg/amstate" 12 | ) 13 | 14 | func publishAlert(alert amstate.Alert) error { 15 | awsSession, err := session.NewSession() 16 | if err != nil { 17 | return err 18 | } 19 | 20 | // FIXME: harcoded region id 21 | snsSvc := sns.New(awsSession, aws.NewConfig().WithRegion("us-east-1")) 22 | 23 | messageText := alert.Subject + "\n\n" + alert.Details 24 | 25 | alertTopic, err := envvar.Required("ALERT_TOPIC") 26 | if err != nil { 27 | return err 28 | } 29 | 30 | ackLinkMaybe := "" 31 | if alert.Id != "" { 32 | ackLinkMaybe = "Ack: " + ackLink(alert) + "\n\n" 33 | } 34 | 35 | messagePerProtocol := struct { 36 | Default string `json:"default"` // email etc. 37 | Sms string `json:"sms"` 38 | }{ 39 | Default: ackLinkMaybe + stringutils.Truncate(messageText, 4*1024), 40 | Sms: stringutils.Truncate(messageText, 160-7), // -7 for "ALERT >" prefix in SMS messages 41 | } 42 | 43 | messagePerProtocolJson, err := json.Marshal(&messagePerProtocol) 44 | if err != nil { 45 | return err 46 | } 47 | 48 | _, err = snsSvc.Publish(&sns.PublishInput{ 49 | TopicArn: aws.String(alertTopic), 50 | Subject: aws.String(alert.Subject), 51 | Message: aws.String(string(messagePerProtocolJson)), 52 | MessageStructure: aws.String("json"), 53 | }) 54 | return err 55 | } 56 | -------------------------------------------------------------------------------- /pkg/alertmanagerclient/client.go: -------------------------------------------------------------------------------- 1 | package alertmanagerclient 2 | 3 | import ( 4 | "context" 5 | "os" 6 | "time" 7 | 8 | "github.com/function61/gokit/envvar" 9 | "github.com/function61/gokit/ezhttp" 10 | "github.com/function61/lambda-alertmanager/pkg/alertmanagertypes" 11 | ) 12 | 13 | const ( 14 | baseUrlEnvVarName = 
"ALERTMANAGER_BASEURL" 15 | ) 16 | 17 | type Client struct { 18 | baseUrl string 19 | } 20 | 21 | func New(baseUrl string) *Client { 22 | return &Client{baseUrl} 23 | } 24 | 25 | func (c *Client) Alert(ctx context.Context, alert alertmanagertypes.Alert) error { 26 | _, err := ezhttp.Post(ctx, c.baseUrl+"/alerts/ingest", ezhttp.SendJson(&alert)) 27 | return err 28 | } 29 | 30 | // simpler version of DeadMansSwitchCheckinCustom() 31 | func (c *Client) DeadMansSwitchCheckin( 32 | ctx context.Context, 33 | subject string, 34 | ttl time.Duration, 35 | ) error { 36 | return c.DeadMansSwitchCheckinCustom( 37 | ctx, 38 | alertmanagertypes.NewDeadMansSwitchCheckinRequest( 39 | subject, 40 | "+"+ttl.String())) 41 | } 42 | 43 | func (c *Client) DeadMansSwitchCheckinCustom( 44 | ctx context.Context, 45 | req alertmanagertypes.DeadMansSwitchCheckinRequest, 46 | ) error { 47 | _, err := ezhttp.Post(ctx, c.baseUrl+"/deadmansswitch/checkin", ezhttp.SendJson(&req)) 48 | return err 49 | } 50 | 51 | // if ALERTMANAGER_BASEURL is set, returns client 52 | func ClientFromEnvOptional() *Client { 53 | baseUrl := os.Getenv(baseUrlEnvVarName) 54 | if baseUrl == "" { 55 | return nil 56 | } 57 | 58 | return New(baseUrl) 59 | } 60 | 61 | // required version of ClientFromEnvOptional() 62 | func ClientFromEnvRequired() (*Client, error) { 63 | baseUrl, err := envvar.Required(baseUrlEnvVarName) 64 | if err != nil { 65 | return nil, err 66 | } 67 | 68 | return New(baseUrl), nil 69 | } 70 | -------------------------------------------------------------------------------- /cmd/alertmanager/main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "os" 7 | "time" 8 | 9 | "github.com/function61/eventhorizon/pkg/ehcli" 10 | "github.com/function61/eventhorizon/pkg/ehreader" 11 | "github.com/function61/gokit/aws/lambdautils" 12 | "github.com/function61/gokit/dynversion" 13 | "github.com/function61/gokit/logex" 14 
| "github.com/function61/gokit/ossignal" 15 | "github.com/function61/lambda-alertmanager/pkg/amstate" 16 | "github.com/spf13/cobra" 17 | ) 18 | 19 | func main() { 20 | // AWS Lambda doesn't support giving argv, so we use an ugly hack to detect when 21 | // we're in Lambda 22 | if lambdautils.InLambda() { 23 | lambdaHandler() 24 | return 25 | } 26 | 27 | app := &cobra.Command{ 28 | Use: os.Args[0], 29 | Short: "Alert manager", 30 | Version: dynversion.Version, 31 | } 32 | 33 | app.AddCommand(alertEntry()) 34 | 35 | app.AddCommand(deadMansSwitchEntry()) 36 | 37 | app.AddCommand(httpMonitorEntry()) 38 | 39 | app.AddCommand(ehcli.Entrypoint()) 40 | 41 | app.AddCommand(restApiCliEntry()) 42 | 43 | app.AddCommand(&cobra.Command{ 44 | Use: "lambda-scheduler", 45 | Short: "Run what Lambda would invoke in response to scheduler event", 46 | Run: func(*cobra.Command, []string) { 47 | exitIfError(handleCloudwatchScheduledEvent( 48 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 49 | time.Now())) 50 | }, 51 | }) 52 | 53 | exitIfError(app.Execute()) 54 | } 55 | 56 | func getApp(ctx context.Context) (*amstate.App, error) { 57 | tenantCtx, err := ehreader.TenantCtxWithSnapshotsFrom(ehreader.ConfigFromEnv, "am:v1") 58 | if err != nil { 59 | return nil, err 60 | } 61 | 62 | logger := logex.StandardLogger() 63 | 64 | return amstate.LoadUntilRealtime( 65 | ctx, 66 | tenantCtx, 67 | logger) 68 | } 69 | 70 | func exitIfError(err error) { 71 | if err != nil { 72 | fmt.Fprintln(os.Stderr, err) 73 | os.Exit(1) 74 | } 75 | } 76 | -------------------------------------------------------------------------------- /docs/setup_custom_integration.md: -------------------------------------------------------------------------------- 1 | Setting up custom integration over HTTPS 2 | ======================================== 3 | 4 | This API is pretty much explained in the [Setting up API gateway](./setup_apigateway.md) guide, 5 | but here's a short summary until I get time to write a better tutorial 
on this. 6 | 7 | NOTE: replace `REDACTED` with your `API ID` from API Gateway. 8 | 9 | 10 | List firing alerts 11 | ------------------ 12 | 13 | AKA un-acknowledged alerts. 14 | 15 | ``` 16 | $ curl https://REDACTED.execute-api.us-west-2.amazonaws.com/prod/alerts 17 | [ 18 | { 19 | "alert_key": "1", 20 | "subject": "www.example.com", 21 | "timestamp": "2017-01-15T12:12:04.018Z", 22 | "details": "I dont like the page" 23 | } 24 | ] 25 | ``` 26 | 27 | 28 | Submit an alarm 29 | --------------- 30 | 31 | ``` 32 | $ curl -H 'Content-Type: application/json' -X POST -d '{"subject": "www.example.com", "details": "I dont like the page"}' https://REDACTED.execute-api.us-west-2.amazonaws.com/prod/alerts/ingest 33 | "OK => alert saved to database and queued for delivery" 34 | ``` 35 | 36 | Or if the alert is already firing, you'll get back the response `This alert is already firing. Discarding the submitted alert.`. 37 | 38 | Alternate way: if your app uses the AWS SDK, you can also submit the alarm for ingestion by posting to the `AlertManager-ingest` SNS topic. 39 | 40 | 41 | Acknowledge an alert 42 | -------------------- 43 | 44 | ``` 45 | $ curl -H 'Content-Type: application/json' -X POST -d '{"alert_key": "1"}' https://REDACTED.execute-api.us-west-2.amazonaws.com/prod/alerts/acknowledge 46 | "Alert 1 deleted" 47 | ``` 48 | 49 | 50 | Receiving fired alerts via webhook 51 | ---------------------------------- 52 | 53 | Go to `SNS > Topics > AlertManager-alert > Create subscription > HTTP or HTTPS`. 54 | 55 | Firing alerts are sent to any subscribers listed in this topic. 
If I added a webhook, I would have these subscriptions: 56 | 57 | - Email: ops@example.com 58 | - SMS: +358 40 123 456 59 | - Webhook(HTTPS): https://example.com/api/alert-firing-webhook 60 | -------------------------------------------------------------------------------- /pkg/amstate/utils.go: -------------------------------------------------------------------------------- 1 | package amstate 2 | 3 | import ( 4 | "time" 5 | 6 | "github.com/function61/gokit/cryptorandombytes" 7 | ) 8 | 9 | func FindAlertWithSubject(subject string, alerts []Alert) *Alert { 10 | for _, alert := range alerts { 11 | if alert.Subject == subject { 12 | return &alert 13 | } 14 | } 15 | 16 | return nil 17 | } 18 | 19 | func HasAlertWithId(id string, alerts []Alert) bool { 20 | for _, alert := range alerts { 21 | if alert.Id == id { 22 | return true 23 | } 24 | } 25 | 26 | return false 27 | } 28 | 29 | // unnoticed = not acked within 4 hours 30 | func GetUnnoticedAlerts(alerts []Alert, now time.Time) []Alert { 31 | unnoticed := []Alert{} 32 | for _, alert := range alerts { 33 | if now.Sub(alert.Timestamp) >= 4*time.Hour { 34 | unnoticed = append(unnoticed, alert) 35 | } 36 | } 37 | 38 | return unnoticed 39 | } 40 | 41 | func FindHttpMonitorWithId(id string, monitors []HttpMonitor) *HttpMonitor { 42 | for _, monitor := range monitors { 43 | if monitor.Id == id { 44 | return &monitor 45 | } 46 | } 47 | 48 | return nil 49 | } 50 | 51 | func EnabledHttpMonitors(monitors []HttpMonitor) []HttpMonitor { 52 | enabled := []HttpMonitor{} 53 | 54 | for _, monitor := range monitors { 55 | if monitor.Enabled { 56 | enabled = append(enabled, monitor) 57 | } 58 | } 59 | 60 | return enabled 61 | } 62 | 63 | func FindDeadMansSwitchWithSubject(subject string, dmss []DeadMansSwitch) *DeadMansSwitch { 64 | for _, dms := range dmss { 65 | if dms.Subject == subject { 66 | return &dms 67 | } 68 | } 69 | 70 | return nil 71 | } 72 | 73 | func GetExpiredDeadMansSwitches(switches []DeadMansSwitch, now time.Time) 
[]DeadMansSwitch { 74 | expired := []DeadMansSwitch{} 75 | for _, sw := range switches { 76 | if !now.Before(sw.Ttl) { 77 | expired = append(expired, sw) 78 | } 79 | } 80 | 81 | return expired 82 | } 83 | 84 | func NewAlertId() string { 85 | return cryptorandombytes.Base64UrlWithoutLeadingDash(6) 86 | } 87 | 88 | func NewHttpMonitorId() string { 89 | return cryptorandombytes.Base64UrlWithoutLeadingDash(6) 90 | } 91 | -------------------------------------------------------------------------------- /docs/usecase_cloudwatch-alerting.md: -------------------------------------------------------------------------------- 1 | Use case: CloudWatch alerting 2 | ============================= 3 | 4 | NOTE: this guide applies to most AWS services - not just SQS. But we'll use SQS as an example. 5 | 6 | Let's say that you have queue workers (whether in AWS or outside of AWS) that use AWS's SQS 7 | (Simple Queue Service). A great way to detect problems is to check whether the queue is backing up. 8 | 9 | 10 | What does a healthy queue look like? 11 | ------------------------------------ 12 | 13 | A healthy queue would not have many queued work items for a prolonged amount of time. 14 | A healthy queue looks like this: 15 | 16 | ![](usecase_cloudwatch-alerting-healthy-queue.png) 17 | 18 | Observations: 19 | 20 | - Items are sent to the queue pretty constantly. 21 | - Visible messages (= messages that are not yet consumed by a worker) should be close to zero at all times. 22 | 23 | 24 | What does an unhealthy queue look like? 25 | --------------------------------------- 26 | 27 | An unhealthy queue gets messages sent to it faster than they are processed. It looks like this: 28 | 29 | ![](usecase_cloudwatch-alerting-unhealthy-queue.png) 30 | 31 | Observations: 32 | 33 | - Items are sent to the queue pretty constantly. 34 | - Visible messages ARE NOT close to zero. 
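The difference between the two graphs can be expressed as a simple rule: the queue is backing up when the number of visible messages stays at or above some threshold for a number of consecutive periods. The sketch below illustrates that rule only. It is not how CloudWatch evaluates alarms internally, `backingUp` is a hypothetical helper, and the sample datapoints and the threshold of 5 are made-up illustration values.

```go
package main

import "fmt"

// backingUp (hypothetical helper) reports whether the most recent `periods`
// datapoints are all at or above threshold. Each datapoint is the number of
// visible (not yet consumed) messages observed in one period.
func backingUp(visible []int, threshold, periods int) bool {
	if periods <= 0 || len(visible) < periods {
		return false // not enough data to decide
	}
	for _, v := range visible[len(visible)-periods:] {
		if v < threshold {
			return false
		}
	}
	return true
}

func main() {
	healthy := []int{0, 2, 1, 0, 3}     // stays close to zero
	unhealthy := []int{1, 4, 9, 17, 30} // keeps growing

	fmt.Println("healthy queue backing up:", backingUp(healthy, 5, 1))
	fmt.Println("unhealthy queue backing up:", backingUp(unhealthy, 5, 1))
}
```

Requiring more than one consecutive period trades detection speed for fewer false alarms from short bursts.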
35 | 36 | 37 | Creating a CloudWatch alarm to detect unhealthy queue 38 | ----------------------------------------------------- 39 | 40 | Go to `CloudWatch > Alarms > Create Alarm > SQS Queue Metrics`: 41 | 42 | - QueueName = your queue 43 | - Metrics = `ApproximateNumberOfMessagesVisible` 44 | - `[ Next ]` 45 | - Name = `Queue XYZ health` 46 | - Whenever ApproximateNumberOfMessagesVisible `is >= 5 for 1 consecutive periods` 47 | - Period = `5 minutes` 48 | - Action = `state = alarm => send notification to AlertManager-ingest` 49 | - `[ Create Alarm ]` 50 | 51 | Now, when an alarming condition is detected, CloudWatch uses AlertManager to dispatch the alert to you. :) 52 | 53 | NOTE: `ApproximateAgeOfOldestMessage` is probably the best metric for detecting an unhealthy queue, as it 54 | works even in high-bandwidth queues. 55 | `ApproximateNumberOfMessagesVisible` was mainly used as the easiest explanation. -------------------------------------------------------------------------------- /cmd/alertmanager/httpmonitorscanner_test.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "testing" 7 | 8 | "github.com/function61/gokit/assert" 9 | "github.com/function61/lambda-alertmanager/pkg/amstate" 10 | ) 11 | 12 | func TestOneFails(t *testing.T) { 13 | failures := scanMonitors(context.Background(), []amstate.HttpMonitor{ 14 | { 15 | Url: "http://example.com/frontpage", 16 | Find: "Welcome to", 17 | }, 18 | { 19 | Url: "http://example.com/contacts", 20 | Find: "bar@exmaple.com", 21 | }, 22 | }, &testScanner{}, nil) 23 | 24 | assert.Assert(t, len(failures) == 1) 25 | assert.EqualString( 26 | t, 27 | failures[0].err.Error(), 28 | "string-to-find `bar@exmaple.com` NOT in body: Contact us by email at foo@example.com") 29 | } 30 | 31 | func TestAllSucceed(t *testing.T) { 32 | failures := scanMonitors(context.Background(), []amstate.HttpMonitor{ 33 | { 34 | Url: "http://example.com/frontpage", 35 | 
Find: "Welcome to", 36 | }, 37 | { 38 | Url: "http://example.com/contacts", 39 | Find: "foo@example.com", 40 | }, 41 | }, &testScanner{}, nil) 42 | 43 | assert.Assert(t, len(failures) == 0) 44 | } 45 | 46 | func Test404(t *testing.T) { 47 | failures := scanMonitors(context.Background(), []amstate.HttpMonitor{ 48 | { 49 | Url: "http://notfound.net/", 50 | Find: "doesntmatter", 51 | }, 52 | }, &testScanner{}, nil) 53 | 54 | assert.Assert(t, len(failures) == 1) 55 | assert.EqualString(t, failures[0].err.Error(), "404: http://notfound.net/") 56 | assert.EqualJson(t, failures[0].monitor, `{ 57 | "id": "", 58 | "created": "0001-01-01T00:00:00Z", 59 | "enabled": false, 60 | "url": "http://notfound.net/", 61 | "find": "doesntmatter" 62 | }`) 63 | } 64 | 65 | type testScanner struct{} 66 | 67 | func (a *testScanner) Scan(ctx context.Context, monitor amstate.HttpMonitor) error { 68 | pages := map[string]string{ 69 | "http://example.com/frontpage": "Welcome to frontpage", 70 | "http://example.com/contacts": "Contact us by email at foo@example.com", 71 | } 72 | 73 | page, found := pages[monitor.Url] 74 | if !found { 75 | return fmt.Errorf("404: %s", monitor.Url) 76 | } 77 | 78 | return mustFindStringInBody(page, monitor.Find) 79 | } 80 | -------------------------------------------------------------------------------- /docs/setup_iam.md: -------------------------------------------------------------------------------- 1 | Setup IAM 2 | ========= 3 | 4 | "IAM" takes care of access management. For security we'll restrict AlertManager's access to 5 | the bare minimum it needs to operate under. 6 | 7 | 8 | Create role 9 | ----------- 10 | 11 | Go to `IAM > Roles > Create new role`: 12 | 13 | - Name = `AlertManager`. 14 | - Role type = `AWS Service Roles > AWS Lambda`. 15 | - Do not attach any policies (= just hit next) as we'll set up a super restrictive custom policy.
16 | - `[ Create role ]` 17 | 18 | 19 | Attach policy to role 20 | --------------------- 21 | 22 | Now go to `IAM > Roles > AlertManager > Inline policies > Create > Custom policy`: 23 | 24 | - Policy name = `dynamodbAlertsPlusSnsAlertAndIngest`. 25 | 26 | Content will be below, but you should copy it to a text editor first, and replace `__ACCOUNT_ID__` with your AWS account ID. It looks like `426466625513`. 27 | 28 | ``` 29 | { 30 | "Version": "2012-10-17", 31 | "Statement": [ 32 | { 33 | "Sid": "", 34 | "Effect": "Allow", 35 | "Action": [ 36 | "dynamodb:PutItem", 37 | "dynamodb:DeleteItem", 38 | "dynamodb:Scan" 39 | ], 40 | "Resource": [ 41 | "arn:aws:dynamodb:*:*:table/alertmanager_*" 42 | ] 43 | }, 44 | { 45 | "Sid": "", 46 | "Effect": "Allow", 47 | "Action": [ 48 | "dynamodb:GetRecords", 49 | "dynamodb:GetShardIterator", 50 | "dynamodb:DescribeStream", 51 | "dynamodb:ListStreams" 52 | ], 53 | "Resource": [ 54 | "arn:aws:dynamodb:*:*:table/alertmanager_alerts/stream/*" 55 | ] 56 | }, 57 | { 58 | "Sid": "", 59 | "Effect": "Allow", 60 | "Action": [ 61 | "sns:Publish" 62 | ], 63 | "Resource": [ 64 | "arn:aws:sns:*:*:AlertManager-alert", 65 | "arn:aws:sns:*:*:AlertManager-ingest" 66 | ] 67 | }, 68 | { 69 | "Sid": "", 70 | "Resource": "*", 71 | "Action": [ 72 | "logs:CreateLogGroup", 73 | "logs:CreateLogStream", 74 | "logs:PutLogEvents" 75 | ], 76 | "Effect": "Allow" 77 | } 78 | ] 79 | } 80 | ``` 81 | 82 | Are the wildcards safe? 83 | ----------------------- 84 | 85 | Yes. I used wildcards so you can just copy-paste the policy from above without needing to do region and 86 | account id replacements (the `*:*` parts). It is acceptable to have wildcards for: 87 | 88 | - Region component: gives additional access only to table with same name (alertmanager_alerts) 89 | in other regions (you won't have same table name in other regions) or SNS topics with same 90 | names in other regions (you won't have same topic names in other regions). 
91 | - Account id component: gives AlertManager additional access to resources in other accounts you have access to: **none**, 92 | as how could you give yourself access to other accounts' resources? 93 | 94 | If you're unsure of this in any capacity, feel free to plug in your region and account IDs in the resource constraints. 95 | -------------------------------------------------------------------------------- /cmd/alertmanager/alerts.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "strings" 7 | "time" 8 | 9 | "github.com/function61/eventhorizon/pkg/ehevent" 10 | "github.com/function61/gokit/ossignal" 11 | "github.com/function61/gokit/stringutils" 12 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 13 | "github.com/function61/lambda-alertmanager/pkg/amstate" 14 | "github.com/scylladb/termtables" 15 | "github.com/spf13/cobra" 16 | ) 17 | 18 | func alertEntry() *cobra.Command { 19 | cmd := &cobra.Command{ 20 | Use: "alert", 21 | Short: "Manage alerts", 22 | } 23 | 24 | cmd.AddCommand(&cobra.Command{ 25 | Use: "mk [subject] [details]", 26 | Short: "Raise an alert", 27 | Args: cobra.ExactArgs(2), 28 | Run: func(cmd *cobra.Command, args []string) { 29 | exitIfError(alertRaise( 30 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 31 | args[0], 32 | args[1])) 33 | }, 34 | }) 35 | 36 | cmd.AddCommand(&cobra.Command{ 37 | Use: "ls", 38 | Short: "List active alerts", 39 | Args: cobra.NoArgs, 40 | Run: func(cmd *cobra.Command, args []string) { 41 | exitIfError(alertList( 42 | ossignal.InterruptOrTerminateBackgroundCtx(nil))) 43 | }, 44 | }) 45 | 46 | cmd.AddCommand(&cobra.Command{ 47 | Use: "ack [id]", 48 | Short: "Acknowledge an alert", 49 | Args: cobra.ExactArgs(1), 50 | Run: func(cmd *cobra.Command, args []string) { 51 | exitIfError(alertAck( 52 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 53 | args[0])) 54 | }, 55 | }) 56 | 57 | return cmd 58 | } 59 | 60 | 
func alertRaise(ctx context.Context, subject string, details string) error { 61 | app, err := getApp(ctx) 62 | if err != nil { 63 | return err 64 | } 65 | 66 | raised := amdomain.NewAlertRaised( 67 | amstate.NewAlertId(), 68 | subject, 69 | details, 70 | ehevent.MetaSystemUser(time.Now())) 71 | 72 | return app.Reader.TransactWrite(ctx, func() error { 73 | if amstate.FindAlertWithSubject(subject, app.State.ActiveAlerts()) != nil { 74 | return fmt.Errorf("already active have alert: %s", subject) 75 | } 76 | 77 | return app.AppendAfter(ctx, app.State.Version(), raised) 78 | }) 79 | } 80 | 81 | func alertList(ctx context.Context) error { 82 | app, err := getApp(ctx) 83 | if err != nil { 84 | return err 85 | } 86 | 87 | view := termtables.CreateTable() 88 | view.AddHeaders("Id", "Raised", "Subject", "Details") 89 | 90 | for _, alert := range app.State.ActiveAlerts() { 91 | view.AddRow( 92 | alert.Id, 93 | alert.Timestamp.Format(time.RFC3339), 94 | alert.Subject, 95 | stringutils.Truncate(removeLinebreaks(alert.Details), 50)) 96 | } 97 | 98 | fmt.Println(view.Render()) 99 | 100 | return nil 101 | } 102 | 103 | func alertAck(ctx context.Context, alertId string) error { 104 | app, err := getApp(ctx) 105 | if err != nil { 106 | return err 107 | } 108 | 109 | acked := amdomain.NewAlertAcknowledged( 110 | alertId, 111 | ehevent.MetaSystemUser(time.Now())) 112 | 113 | return app.Reader.TransactWrite(ctx, func() error { 114 | if !amstate.HasAlertWithId(alertId, app.State.ActiveAlerts()) { 115 | return fmt.Errorf("no alert: %s", alertId) 116 | } 117 | 118 | return app.AppendAfter(ctx, app.State.Version(), acked) 119 | }) 120 | } 121 | 122 | func removeLinebreaks(input string) string { 123 | return strings.ReplaceAll( 124 | strings.ReplaceAll( 125 | input, 126 | "\r", 127 | `\r`), 128 | "\n", 129 | `\n`) 130 | } 131 | -------------------------------------------------------------------------------- /cmd/alertmanager/ingest.go: 
-------------------------------------------------------------------------------- 1 | package main 2 | 3 | // Ingesting is the act of taking alert from some system, doing deduplication and alert 4 | // limiting (only N amount of active alerts are allowed). We either accept or drop the alert. 5 | 6 | import ( 7 | "context" 8 | "os" 9 | "strconv" 10 | 11 | "github.com/aws/aws-lambda-go/events" 12 | "github.com/function61/eventhorizon/pkg/ehevent" 13 | "github.com/function61/gokit/logex" 14 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 15 | "github.com/function61/lambda-alertmanager/pkg/amstate" 16 | ) 17 | 18 | // invoked for "AlertManager-ingest" SNS topic 19 | func handleSnsIngest(ctx context.Context, event events.SNSEvent) error { 20 | app, err := getApp(ctx) 21 | if err != nil { 22 | return err 23 | } 24 | 25 | candidateAlerts := []amstate.Alert{} 26 | 27 | for _, msg := range event.Records { 28 | candidateAlerts = append(candidateAlerts, amstate.Alert{ 29 | Id: amstate.NewAlertId(), 30 | Subject: msg.SNS.Subject, 31 | Details: msg.SNS.Message, 32 | Timestamp: msg.SNS.Timestamp, 33 | }) 34 | } 35 | 36 | return ingestAlerts(ctx, candidateAlerts, app) 37 | } 38 | 39 | // this is somewhat of a hack to pass candidate-phase alerts as the same struct as we get 40 | // from the actual persisted State 41 | func ingestAlerts(ctx context.Context, candidateAlerts []amstate.Alert, app *amstate.App) error { 42 | _, err := ingestAlertsAndReturnCreatedFlag(ctx, candidateAlerts, app) 43 | return err 44 | } 45 | 46 | func ingestAlertsAndReturnCreatedFlag(ctx context.Context, candidateAlerts []amstate.Alert, app *amstate.App) (bool, error) { 47 | ingestedAny := false 48 | 49 | maxActiveAlerts, err := getMaxFiringAlerts() 50 | if err != nil { 51 | return false, err 52 | } 53 | 54 | // this call is free (unless we actually call Append()), so no reason to optimize by 55 | // checking for alert length 56 | if err := app.Reader.TransactWrite(ctx, func() error { 57 | 
alertEvents := []ehevent.Event{} 58 | 59 | alerts := deduplicateAndRatelimit(candidateAlerts, app.State, maxActiveAlerts) 60 | 61 | // raise alerts for failures 62 | for _, alert := range alerts { 63 | alertEvents = append(alertEvents, amdomain.NewAlertRaised( 64 | alert.Id, 65 | alert.Subject, 66 | alert.Details, 67 | ehevent.MetaSystemUser(alert.Timestamp))) 68 | } 69 | 70 | if len(alertEvents) == 0 { 71 | return nil // nothing to do 72 | } 73 | 74 | if err := app.AppendAfter(ctx, app.State.Version(), alertEvents...); err != nil { 75 | return err 76 | } 77 | 78 | ingestedAny = true 79 | 80 | for _, alert := range alerts { 81 | if err := publishAlert(alert); err != nil { 82 | logex.Levels(app.Logger).Error.Printf("publishAlert: %v", err) 83 | } 84 | } 85 | 86 | return nil 87 | }); err != nil { 88 | return ingestedAny, err 89 | } 90 | 91 | return ingestedAny, nil 92 | } 93 | 94 | func deduplicateAndRatelimit( 95 | alerts []amstate.Alert, 96 | state *amstate.Store, 97 | maxActiveAlerts int, 98 | ) []amstate.Alert { 99 | filtered := []amstate.Alert{} 100 | 101 | activeAlerts := state.ActiveAlerts() 102 | 103 | addedJustNow := func() int { return len(filtered) } 104 | 105 | for _, alert := range alerts { 106 | // no more "room"? 
107 | if (len(activeAlerts) + addedJustNow()) >= maxActiveAlerts { 108 | continue 109 | } 110 | 111 | // deduplication 112 | if amstate.FindAlertWithSubject(alert.Subject, activeAlerts) != nil { 113 | continue 114 | } 115 | 116 | filtered = append(filtered, alert) 117 | } 118 | 119 | return filtered 120 | } 121 | 122 | func getMaxFiringAlerts() (int, error) { 123 | fromEnvStr := os.Getenv("MAX_FIRING_ALERTS") 124 | if fromEnvStr == "" { 125 | return 5, nil // default 126 | } 127 | 128 | return strconv.Atoi(fromEnvStr) 129 | } 130 | -------------------------------------------------------------------------------- /cmd/alertmanager/deadmansswitches_test.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "strings" 6 | "testing" 7 | "time" 8 | 9 | "github.com/function61/eventhorizon/pkg/ehclient" 10 | "github.com/function61/eventhorizon/pkg/ehevent" 11 | "github.com/function61/eventhorizon/pkg/ehreader" 12 | "github.com/function61/eventhorizon/pkg/ehreader/ehreadertest" 13 | "github.com/function61/gokit/assert" 14 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 15 | "github.com/function61/lambda-alertmanager/pkg/amstate" 16 | ) 17 | 18 | func TestDeadmansswitchCheckin(t *testing.T) { 19 | ctx := context.Background() 20 | 21 | testStreamName := "/t-42/alertmanager" 22 | 23 | eventLog := ehreadertest.NewEventLog() 24 | // not significant (we just need an event in the log for initial read to work) 25 | // TODO: add CreateStream() on the testing log? 
26 | eventLog.AppendE( 27 | testStreamName, 28 | amdomain.NewUnnoticedAlertsNotified( 29 | []string{"dummyid"}, 30 | ehevent.MetaSystemUser(t0))) 31 | 32 | app, err := amstate.LoadUntilRealtime( 33 | ctx, 34 | ehreader.NewTenantCtxWithSnapshots( 35 | ehreader.TenantId("42"), 36 | eventLog, 37 | ehreader.NewInMemSnapshotStore()), 38 | nil) 39 | assert.Ok(t, err) 40 | 41 | alertAcked, err := deadmansswitchCheckin( 42 | ctx, 43 | "My test switch", 44 | t0.Add(1*time.Hour), 45 | app, 46 | t0) 47 | assert.Ok(t, err) 48 | assert.Assert(t, !alertAcked) 49 | 50 | dumper := newEventDumper(testStreamName, eventLog, amdomain.Types) 51 | 52 | assert.EqualString(t, dumper.Dump(), ` 53 | 2019-09-07T12:00:00.000Z UnnoticedAlertsNotified {"AlertIds":["dummyid"]} 54 | 2019-09-07T12:00:00.000Z DeadMansSwitchCreated {"Subject":"My test switch","Ttl":"2019-09-07T13:00:00Z"} 55 | 2019-09-07T12:00:00.000Z DeadMansSwitchCheckin {"Subject":"My test switch","Ttl":"2019-09-07T13:00:00Z"}`) 56 | 57 | // 30 minutes lapses, and we send another checkin 58 | 59 | alertAcked, err = deadmansswitchCheckin( 60 | ctx, 61 | "My test switch", 62 | t0.Add(90*time.Minute), 63 | app, 64 | t0.Add(30*time.Minute)) 65 | assert.Ok(t, err) 66 | assert.Assert(t, !alertAcked) 67 | 68 | assert.EqualString(t, dumper.Dump(), ` 69 | 2019-09-07T12:00:00.000Z UnnoticedAlertsNotified {"AlertIds":["dummyid"]} 70 | 2019-09-07T12:00:00.000Z DeadMansSwitchCreated {"Subject":"My test switch","Ttl":"2019-09-07T13:00:00Z"} 71 | 2019-09-07T12:00:00.000Z DeadMansSwitchCheckin {"Subject":"My test switch","Ttl":"2019-09-07T13:00:00Z"} 72 | 2019-09-07T12:30:00.000Z DeadMansSwitchCheckin {"Subject":"My test switch","Ttl":"2019-09-07T13:30:00Z"}`) 73 | } 74 | 75 | func newEventDumper(stream string, eventLog ehclient.Reader, types ehevent.Allocators) *eventDumper { 76 | d := &eventDumper{ 77 | cur: ehclient.Beginning(stream), 78 | dump: []string{""}, // to get newline at beginning 79 | types: types, 80 | } 81 | d.reader = 
ehreader.New(d, eventLog, nil) 82 | return d 83 | } 84 | 85 | // For testing, allows you to print your event log 86 | // TODO: move this to eventhorizon testing helper package? 87 | type eventDumper struct { 88 | cur ehclient.Cursor 89 | reader *ehreader.Reader 90 | dump []string 91 | types ehevent.Allocators 92 | } 93 | 94 | func (e *eventDumper) Dump() string { 95 | if err := e.reader.LoadUntilRealtime(context.Background()); err != nil { 96 | panic(err) 97 | } 98 | 99 | return strings.Join(e.dump, "\n") 100 | } 101 | 102 | func (e *eventDumper) GetEventTypes() ehevent.Allocators { 103 | return e.types 104 | } 105 | 106 | func (e *eventDumper) ProcessEvents(ctx context.Context, handle ehreader.EventProcessorHandler) error { 107 | return handle( 108 | e.cur, 109 | func(ev ehevent.Event) error { 110 | e.dump = append(e.dump, ehevent.Serialize(ev)) 111 | return nil 112 | }, 113 | func(commitCursor ehclient.Cursor) error { 114 | e.cur = commitCursor 115 | return nil 116 | }) 117 | } 118 | -------------------------------------------------------------------------------- /cmd/alertmanager/deadmansswitches.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "time" 7 | 8 | "github.com/function61/eventhorizon/pkg/ehevent" 9 | "github.com/function61/gokit/ossignal" 10 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 11 | "github.com/function61/lambda-alertmanager/pkg/amstate" 12 | "github.com/scylladb/termtables" 13 | "github.com/spf13/cobra" 14 | ) 15 | 16 | func deadMansSwitchEntry() *cobra.Command { 17 | cmd := &cobra.Command{ 18 | Use: "dms", 19 | Short: "Manage dead man´s switches", 20 | } 21 | 22 | expired := false 23 | 24 | ls := &cobra.Command{ 25 | Use: "ls", 26 | Short: "List dead man´s switches", 27 | Args: cobra.NoArgs, 28 | Run: func(cmd *cobra.Command, args []string) { 29 | exitIfError(deadmansswitchList( 30 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 
31 | expired)) 32 | }, 33 | } 34 | 35 | ls.Flags().BoolVarP(&expired, "expired", "e", expired, "List only expired") 36 | 37 | cmd.AddCommand(ls) 38 | 39 | cmd.AddCommand(&cobra.Command{ 40 | Use: "rm [id]", 41 | Short: "Remove a switch", 42 | Args: cobra.ExactArgs(1), 43 | Run: func(cmd *cobra.Command, args []string) { 44 | exitIfError(deadmansswitchRemove( 45 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 46 | args[0])) 47 | }, 48 | }) 49 | 50 | cmd.AddCommand(&cobra.Command{ 51 | Use: "checkin [subject] [ttl]", 52 | Short: "Make a checkin", 53 | Args: cobra.ExactArgs(2), 54 | Run: func(cmd *cobra.Command, args []string) { 55 | ctx := ossignal.InterruptOrTerminateBackgroundCtx(nil) 56 | 57 | ttl, err := parseTtlSpec(args[1], time.Now()) 58 | exitIfError(err) 59 | 60 | app, err := getApp(ctx) 61 | exitIfError(err) 62 | 63 | _, err = deadmansswitchCheckin( 64 | ctx, 65 | args[0], 66 | ttl, 67 | app, 68 | time.Now()) 69 | exitIfError(err) 70 | }, 71 | }) 72 | 73 | return cmd 74 | } 75 | 76 | func deadmansswitchList(ctx context.Context, expired bool) error { 77 | app, err := getApp(ctx) 78 | if err != nil { 79 | return err 80 | } 81 | 82 | dmss := app.State.DeadMansSwitches() 83 | if expired { 84 | dmss = amstate.GetExpiredDeadMansSwitches(dmss, time.Now()) 85 | } 86 | 87 | view := termtables.CreateTable() 88 | view.AddHeaders("Subject", "TTL") 89 | 90 | for _, dms := range dmss { 91 | view.AddRow(dms.Subject, dms.Ttl.Format(time.RFC3339)) 92 | } 93 | 94 | fmt.Println(view.Render()) 95 | 96 | return nil 97 | } 98 | 99 | func deadmansswitchRemove(ctx context.Context, subject string) error { 100 | app, err := getApp(ctx) 101 | if err != nil { 102 | return err 103 | } 104 | 105 | return app.Reader.TransactWrite(ctx, func() error { 106 | if amstate.FindDeadMansSwitchWithSubject(subject, app.State.DeadMansSwitches()) == nil { 107 | return fmt.Errorf("switch to delete not found: %s", subject) 108 | } 109 | 110 | return app.AppendAfter(ctx, app.State.Version(), 
amdomain.NewDeadMansSwitchDeleted( 111 | subject, 112 | ehevent.MetaSystemUser(time.Now()))) 113 | }) 114 | } 115 | 116 | func deadmansswitchCheckin( 117 | ctx context.Context, 118 | subject string, 119 | ttl time.Time, 120 | app *amstate.App, 121 | now time.Time, 122 | ) (bool, error) { 123 | alertAcked := false 124 | 125 | checkin := amdomain.NewDeadMansSwitchCheckin( 126 | subject, 127 | ttl, 128 | ehevent.MetaSystemUser(now)) 129 | 130 | if err := app.Reader.TransactWrite(ctx, func() error { 131 | events := []ehevent.Event{} 132 | 133 | // first time seeing this checkin => create said switch 134 | if amstate.FindDeadMansSwitchWithSubject(subject, app.State.DeadMansSwitches()) == nil { 135 | events = append(events, amdomain.NewDeadMansSwitchCreated( 136 | subject, 137 | ttl, 138 | ehevent.MetaSystemUser(now))) 139 | } 140 | 141 | events = append(events, checkin) 142 | 143 | if alert := amstate.FindAlertWithSubject(subject, app.State.ActiveAlerts()); alert != nil { 144 | events = append(events, amdomain.NewAlertAcknowledged( 145 | alert.Id, 146 | ehevent.MetaSystemUser(now))) 147 | 148 | alertAcked = true 149 | } 150 | 151 | return app.AppendAfter(ctx, app.State.Version(), events...) 152 | }); err != nil { 153 | return false, err 154 | } 155 | 156 | return alertAcked, nil 157 | } 158 | -------------------------------------------------------------------------------- /docs/setup_apigateway.md: -------------------------------------------------------------------------------- 1 | Setting up API gateway 2 | ====================== 3 | 4 | Foreword: Amazon API gateway is not the most elegant of products that AWS has shipped. 5 | It still isn't as easy to use with Lambda as it should be, but it has improved greatly. 
6 | 7 | 8 | Create API & configure it as Lambda proxy 9 | ----------------------------------------- 10 | 11 | Go To `API gateway > Create new API` 12 | 13 | - New API 14 | - API Name = `AlertManager` 15 | - Description = leave empty 16 | 17 | In `API Gateway > AlertManager > Resources > Actions > Create resource`: 18 | 19 | - Configure as proxy resource: `check` 20 | - Resource name: `proxy` 21 | - Resource path: `{proxy+}` 22 | - Enable API Gateway CORS: leave unchecked 23 | - `[ Create resource ]` 24 | 25 | In integration setup: 26 | 27 | - Type = `Lambda function proxy` 28 | - Lambda region = choose the region your Lambda function is in 29 | - Lambda function = `AlertManager` 30 | - `[ Save ]` 31 | - Add Permission to Lambda Function: OK 32 | 33 | Now go to `Actions > Deploy API`: 34 | 35 | - Deployment stage: `prod` (or `[New Stage]` if prod does not exist yet) 36 | - `[ Deploy ]` 37 | 38 | 39 | Testing if this works 40 | --------------------- 41 | 42 | After deploying, you should now see the `Invoke URL`. Mine was `https://REDACTED.execute-api.us-west-2.amazonaws.com/prod`. 43 | 44 | Sidenote: the `REDACTED` part is the API's ID. Currently we use that as our access control mechanism to submit alerts, 45 | so treat the API ID as secret. We will probably implement some kind of API token authentication in the future. 46 | 47 | Open that URL, you should see: `{"message":"Missing Authentication Token"}`. 48 | That is to be expected. Append `/alerts` to that URL, resulting in `https://REDACTED.execute-api.us-west-2.amazonaws.com/prod/alerts`. 49 | 50 | That URL should give you: 51 | 52 | ``` 53 | [ 54 | 55 | ] 56 | ``` 57 | 58 | Meaning that there are no triggered alerts. If you get anything else, carefully read the 59 | instructions again because something is wrong. 
60 | 61 | 62 | Submit a test alert 63 | ------------------- 64 | 65 | Now try raising a test alert via the `/alerts/ingest` endpoint: 66 | 67 | ``` 68 | $ curl -H 'Content-Type: application/json' -X POST -d '{"subject": "www.example.com", "details": "I dont like the page"}' https://REDACTED.execute-api.us-west-2.amazonaws.com/prod/alerts/ingest 69 | "OK => alert saved to database and queued for delivery" 70 | ``` 71 | 72 | Now you should hear a **bling** from your inbox, meaning that AlertManager used SNS to deliver your first alert. 73 | 74 | Now, revisit the `/alerts` endpoint from the previous heading - you should see: 75 | 76 | ``` 77 | [ 78 | { 79 | "alert_key": "1", 80 | "subject": "www.example.com", 81 | "timestamp": "2017-01-15T12:12:04.018Z", 82 | "details": "I dont like the page" 83 | } 84 | ] 85 | ``` 86 | 87 | Now, if you try to submit the same exact alert again (repeat the `$ curl ...` command), 88 | it should not be re-accepted (because it has the same `subject`): 89 | 90 | ``` 91 | $ curl -H 'Content-Type: application/json' -X POST -d '{"subject": "www.example.com", "details": "I dont like the page"}' https://REDACTED.execute-api.us-west-2.amazonaws.com/prod/alerts/ingest 92 | "This alert is already firing. Discarding the submitted alert." 93 | ``` 94 | 95 | Congrats, everything seems to be working! 96 | 97 | 98 | Acknowledging the alert 99 | ----------------------- 100 | 101 | Okay, now we've learned that AlertManager won't accept alerts with the same subject again until they are acknowledged. 102 | 103 | How do I acknowledge the alert (after I've fixed the root cause that made the alarm go off)? 104 | 105 | The most basic way to acknowledge the alert is to remove the row from 106 | `DynamoDB > alertmanager_alerts > Items > (choose the alert) > Actions > Delete`.
107 | 108 | There's also an API for doing the same: 109 | 110 | ``` 111 | $ curl -H 'Content-Type: application/json' -X POST -d '{"alert_key": "1"}' https://REDACTED.execute-api.us-west-2.amazonaws.com/prod/alerts/acknowledge 112 | "Alert 1 deleted" 113 | ``` 114 | 115 | We use the aforementioned APIs to show/acknowledge the alerts in a central dashboard. We'll probably 116 | open source that project in the future, but in the meantime you can use either the DynamoDB UI as your 117 | acknowledge tool or make your own with the APIs. 118 | -------------------------------------------------------------------------------- /docs/usecase_prometheus-alerting.md: -------------------------------------------------------------------------------- 1 | Use case: Prometheus alerting 2 | ============================= 3 | 4 | - [Prometheus](https://prometheus.io/download/#prometheus) is the main component when talking about Prometheus. 5 | It monitors your services by scraping them for metrics. It lets you define alerting rules for these metrics: 6 | "if this metric looks like something is wrong -> raise an alert". 7 | 8 | - However, Prometheus only **raises** alerts. It does not filter or transport them. They wisely made 9 | this modular and separated those concerns into another Prometheus project: 10 | [prom-alertmanager](https://prometheus.io/docs/alerting/alertmanager/). 11 | 12 | - Our lambda-alertmanager is a simple replacement for prom-alertmanager that runs entirely on AWS. 13 | 14 | 15 | Configure Prometheus to send alarms to lambda-alertmanager 16 | ---------------------------------------------------------- 17 | 18 | Edit `prometheus.conf`: 19 | 20 | ``` 21 | global: 22 | ... 
snipped 23 | 24 | # most verbose way of specifying 'https://REDACTED.execute-api.us-east-1.amazonaws.com/prod/prometheus-alertmanager' 25 | # Prometheus will do a HTTP POST to /prod/prometheus-alertmanager/api/v1/alerts 26 | alerting: 27 | alertmanagers: 28 | - scheme: 'https' 29 | path_prefix: '/prod/prometheus-alertmanager' 30 | static_configs: 31 | - targets: 32 | - 'REDACTED.execute-api.us-east-1.amazonaws.com' 33 | 34 | scrape_configs: 35 | ... snipped 36 | ``` 37 | 38 | 39 | Have a Prometheus-enabled service you want to monitor/graph 40 | ----------------------------------------------------------- 41 | 42 | In our example we have a service `http://prometheus-dummy-service`. 43 | Its Prometheus-scrapable metrics live at `http://prometheus-dummy-service/metrics`. 44 | 45 | The response looks like this: 46 | 47 | ``` 48 | # this is fictional value 49 | fictional_healthmeter 100 50 | 51 | ``` 52 | 53 | Prometheus metrics can have a much [richer data structure](https://prometheus.io/docs/concepts/data_model/) 54 | than this, but this is the simplest example. 55 | 56 | Prometheus [autodiscovers](https://prometheus.io/docs/operating/configuration/) our services, 57 | and will scrape those metrics automatically. 58 | 59 | Now we can graph that metric inside Prometheus: 60 | 61 | ![](usecase_prometheus-alerting-graph-normal.png) 62 | 63 | The metric is reporting a constant `100`, which in our fictional case means everything is OK. 64 | 65 | 66 | Configure an alert in Prometheus 67 | -------------------------------- 68 | 69 | We'll decide that the metric `fictional_healthmeter` signals an error if it dips below `50`. 
70 | Add to Prometheus' alerting rules: 71 | 72 | ``` 73 | ALERT dummy_service_down 74 | IF fictional_healthmeter{job="prometheus-dummy-service"} < 50 75 | ``` 76 | 77 | Now, when that happens (`fictional_healthmeter` dips to `20`): 78 | 79 | ![](usecase_prometheus-alerting-graph-unhealthy.png) 80 | 81 | Prometheus will submit this alarm to lambda-AlertManager - you'll get a notification via your configured transports: 82 | 83 | ![](usecase_prometheus-alerting-email.png) 84 | 85 | 86 | Why did we replace Prometheus' AlertManager? 87 | -------------------------------------------- 88 | 89 | - Prometheus' AlertManager would have to run on your own infrastructure - more stuff for you to operate and worry about. 90 | 91 | - Reliability. If AlertManager goes down, you are not going to be alerted. AlertManager is in a sense 92 | the most critical part of your infrastructure, as you have to trust it to work when shit hits the fan. 93 | You don't want your customers to call you because you yourself don't know that your servers are down. 94 | I.e. if monitoring goes down, who monitors the monitoring? I have great confidence in letting all this 95 | run on AWS' well-managed environment. 96 | 97 | 98 | But what if Prometheus goes down? 99 | --------------------------------- 100 | 101 | Okay, we learned that lambda-alertmanager is in charge of being reliable in delivering alerts. But since 102 | Prometheus is the one that **raises these alerts**, what if Prometheus itself goes down, so there's nobody 103 | to alert us that monitoring is down? 104 | 105 | For this case I advise you to make AlertManager-Canary monitor your Prometheus. Just configure an HTTP check 106 | in Canary to alert you if Prometheus goes down. That way if AWS stays up, you'll always be notified even 107 | if your entire cluster dies at the exact same moment. 
108 | -------------------------------------------------------------------------------- /cmd/alertmanager/scheduled_test.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "encoding/json" 6 | "fmt" 7 | "os" 8 | "testing" 9 | "time" 10 | 11 | "github.com/function61/eventhorizon/pkg/ehevent" 12 | "github.com/function61/eventhorizon/pkg/ehreader" 13 | "github.com/function61/eventhorizon/pkg/ehreader/ehreadertest" 14 | "github.com/function61/gokit/assert" 15 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 16 | "github.com/function61/lambda-alertmanager/pkg/amstate" 17 | ) 18 | 19 | var t0 = time.Date(2019, 9, 7, 12, 0, 0, 0, time.UTC) 20 | 21 | func TestCheckAndAlertForUnnoticedAlerts(t *testing.T) { 22 | ctx := context.Background() 23 | 24 | testStreamName := "/t-42/alertmanager" 25 | 26 | // have an alert raised at T+0 27 | eventLog := ehreadertest.NewEventLog() 28 | eventLog.AppendE( 29 | testStreamName, 30 | amdomain.NewAlertRaised( 31 | "a14308bba82f", 32 | "The building is on fire", 33 | "Fire sensor in room 456 went off", 34 | ehevent.MetaSystemUser(t0))) 35 | 36 | app, err := amstate.LoadUntilRealtime( 37 | ctx, 38 | ehreader.NewTenantCtxWithSnapshots( 39 | ehreader.TenantId("42"), 40 | eventLog, 41 | ehreader.NewInMemSnapshotStore()), 42 | nil) 43 | assert.Ok(t, err) 44 | 45 | // lame hack to get ackLink() to produce URLs while in testing 46 | os.Setenv("API_ENDPOINT", "https://alertmanager.com/api") 47 | 48 | unnoticedAlertAtT0Plus := func(plus time.Duration) string { // helper 49 | var publishedAlert *amstate.Alert 50 | 51 | // capture maybe-published alert 52 | assert.Ok(t, checkAndAlertForUnnoticedAlerts(ctx, app, func(alert amstate.Alert) error { 53 | publishedAlert = &alert 54 | 55 | return nil 56 | }, t0.Add(plus))) 57 | 58 | publishedAlertAsJson, err := json.MarshalIndent(publishedAlert, "", " ") 59 | assert.Ok(t, err) 60 | 61 | return 
string(publishedAlertAsJson) 62 | } 63 | 64 | /* we want alert at 4 hour mark when alert is un-acked, then *every* hour again: 65 | 66 | 0:00 (an alert is raised) 67 | 1:00 68 | 2:00 69 | 3:00 70 | 4:00 => first "unnoticed" alert 71 | 4:30 72 | 5:00 => second "unnoticed" alert 73 | ... 74 | */ 75 | assert.EqualString(t, unnoticedAlertAtT0Plus(0*time.Hour), "null") 76 | assert.EqualString(t, unnoticedAlertAtT0Plus(1*time.Hour), "null") 77 | assert.EqualString(t, unnoticedAlertAtT0Plus(2*time.Hour), "null") 78 | assert.EqualString(t, unnoticedAlertAtT0Plus(3*time.Hour), "null") 79 | assert.EqualString(t, unnoticedAlertAtT0Plus(4*time.Hour), `{ 80 | "alert_key": "", 81 | "subject": "Un-acked alerts", 82 | "details": "There are 1 un-acked alert(s):\n\nThe building is on fire https://alertmanager.com/api/alerts/acknowledge?id=a14308bba82f\n\nGo take care of them!", 83 | "timestamp": "2019-09-07T16:00:00Z" 84 | }`) 85 | assert.EqualString(t, unnoticedAlertAtT0Plus(4*time.Hour+30*time.Minute), "null") 86 | assert.EqualString(t, unnoticedAlertAtT0Plus(5*time.Hour), `{ 87 | "alert_key": "", 88 | "subject": "Un-acked alerts", 89 | "details": "There are 1 un-acked alert(s):\n\nThe building is on fire https://alertmanager.com/api/alerts/acknowledge?id=a14308bba82f\n\nGo take care of them!", 90 | "timestamp": "2019-09-07T17:00:00Z" 91 | }`) 92 | assert.EqualString(t, unnoticedAlertAtT0Plus(5*time.Hour+30*time.Minute), "null") 93 | assert.EqualString(t, unnoticedAlertAtT0Plus(6*time.Hour), `{ 94 | "alert_key": "", 95 | "subject": "Un-acked alerts", 96 | "details": "There are 1 un-acked alert(s):\n\nThe building is on fire https://alertmanager.com/api/alerts/acknowledge?id=a14308bba82f\n\nGo take care of them!", 97 | "timestamp": "2019-09-07T18:00:00Z" 98 | }`) 99 | } 100 | 101 | func TestParseTtlSpec(t *testing.T) { 102 | tcs := []struct { 103 | input string 104 | output string 105 | }{ 106 | { 107 | "+24h", 108 | "2019-09-08T12:00:00Z", 109 | }, 110 | { 111 | "+1h", 112 | 
"2019-09-07T13:00:00Z", 113 | }, 114 | { 115 | "+1d@18:00", 116 | "2019-09-08T18:00:00Z", 117 | }, 118 | { 119 | "+14d@10:00", 120 | "2019-09-21T10:00:00Z", 121 | }, 122 | { 123 | "2019-09-10T01:13:00Z", 124 | "2019-09-10T01:13:00Z", 125 | }, 126 | { 127 | "foobar", 128 | "error: not in RFC3339: foobar", 129 | }, 130 | { 131 | "+12x", 132 | "error: duration in bad format", 133 | }, 134 | { 135 | "+1d@18:0x", 136 | "error: duration in bad format", 137 | }, 138 | } 139 | 140 | for _, tc := range tcs { 141 | tc := tc // pin 142 | t.Run(tc.input, func(t *testing.T) { 143 | ttl, err := parseTtlSpec(tc.input, t0) 144 | var actual string 145 | if err != nil { 146 | actual = fmt.Sprintf("error: %v", err) 147 | } else { 148 | actual = ttl.Format(time.RFC3339Nano) 149 | } 150 | 151 | assert.EqualString(t, actual, tc.output) 152 | }) 153 | } 154 | } 155 | -------------------------------------------------------------------------------- /cmd/alertmanager/httpmonitorscanner.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "io/ioutil" 7 | "log" 8 | "net/http" 9 | "strings" 10 | "sync" 11 | "time" 12 | 13 | "github.com/function61/gokit/ezhttp" 14 | "github.com/function61/gokit/logex" 15 | "github.com/function61/lambda-alertmanager/pkg/amstate" 16 | ) 17 | 18 | type monitorFailure struct { 19 | err error 20 | monitor amstate.HttpMonitor 21 | } 22 | 23 | func httpMonitorScanAndAlertFailures(ctx context.Context, app *amstate.App) error { 24 | startOfScan := time.Now() 25 | 26 | failures := scanMonitors( 27 | ctx, 28 | amstate.EnabledHttpMonitors(app.State.HttpMonitors()), 29 | newRetryScanner(newScanner()), 30 | logex.Prefix("httpscanner", app.Logger)) 31 | 32 | // convert monitor failures into alerts 33 | alerts := []amstate.Alert{} 34 | for _, failure := range failures { 35 | alerts = append(alerts, amstate.Alert{ 36 | Id: amstate.NewAlertId(), 37 | Subject: failure.monitor.Url, 38 | 
Details: failure.err.Error(), 39 | Timestamp: startOfScan, 40 | }) 41 | } 42 | 43 | // ok with len(alerts) == 0 44 | return ingestAlerts(ctx, alerts, app) 45 | } 46 | 47 | // scans HTTP monitors and returns the ones that failed 48 | func scanMonitors( 49 | ctx context.Context, 50 | monitors []amstate.HttpMonitor, 51 | scanner HttpMonitorScanner, 52 | logger *log.Logger, 53 | ) []monitorFailure { 54 | logl := logex.Levels(logger) 55 | 56 | failed := []monitorFailure{} 57 | failedMu := sync.Mutex{} 58 | 59 | checkOne := func(monitor amstate.HttpMonitor) { 60 | ctx, cancel := context.WithTimeout(ctx, 30*time.Second) 61 | defer cancel() 62 | 63 | started := time.Now() 64 | 65 | err := scanner.Scan(ctx, monitor) 66 | 67 | durationMs := time.Since(started).Milliseconds() 68 | 69 | if err != nil { 70 | failedMu.Lock() 71 | defer failedMu.Unlock() 72 | 73 | failed = append(failed, monitorFailure{ 74 | err, 75 | monitor, 76 | }) 77 | 78 | logl.Error.Printf("❌ %s @ %d ms => %v", monitor.Url, durationMs, err.Error()) 79 | } else { 80 | logl.Debug.Printf("✔️ %s @ %d ms", monitor.Url, durationMs) 81 | } 82 | } 83 | 84 | work := make(chan amstate.HttpMonitor) 85 | 86 | concurrently(3, func() { 87 | for monitor := range work { 88 | checkOne(monitor) 89 | } 90 | }, func() { 91 | for _, monitor := range monitors { 92 | work <- monitor 93 | } 94 | 95 | close(work) 96 | }) 97 | 98 | return failed 99 | } 100 | 101 | type HttpMonitorScanner interface { 102 | Scan(context.Context, amstate.HttpMonitor) error 103 | } 104 | 105 | type retryScanner struct { 106 | actualScanner HttpMonitorScanner 107 | } 108 | 109 | // retries once 110 | func newRetryScanner(actual HttpMonitorScanner) HttpMonitorScanner { 111 | return &retryScanner{actual} 112 | } 113 | 114 | func (r *retryScanner) Scan(ctx context.Context, monitor amstate.HttpMonitor) error { 115 | firstTryCtx, cancel := context.WithTimeout(ctx, 15*time.Second) 116 | defer cancel() 117 | 118 | if err := r.actualScanner.Scan(firstTryCtx, 
monitor); err != nil { 119 | time.Sleep(2 * time.Second) 120 | 121 | // it'd be hard to detect if we shouldn't retry this at all, since timeouts, 122 | // HTTP gateway errors, internal server errors etc. can all be transient 123 | 124 | // now use the longer context 125 | if err2 := r.actualScanner.Scan(ctx, monitor); err2 != nil { 126 | return fmt.Errorf("first error: %v; retry error: %v", err, err2) 127 | } 128 | 129 | return nil 130 | } 131 | 132 | return nil 133 | } 134 | 135 | type scanner struct { 136 | noRedirects *http.Client 137 | } 138 | 139 | func newScanner() HttpMonitorScanner { 140 | return &scanner{ 141 | &http.Client{ 142 | CheckRedirect: func(req *http.Request, via []*http.Request) error { 143 | return http.ErrUseLastResponse // do not follow redirects 144 | }, 145 | }, 146 | } 147 | } 148 | 149 | func (s *scanner) Scan(ctx context.Context, monitor amstate.HttpMonitor) error { 150 | resp, err := ezhttp.Get( 151 | ctx, 152 | monitor.Url, 153 | ezhttp.TolerateNon2xxResponse, 154 | ezhttp.Client(s.noRedirects)) // rationale: not much, other than that this is how the previous implementation worked 155 | if err != nil { 156 | return err 157 | } 158 | defer resp.Body.Close() 159 | 160 | buf, err := ioutil.ReadAll(resp.Body) 161 | if err != nil { 162 | return err 163 | } 164 | 165 | return mustFindStringInBody(string(buf), monitor.Find) 166 | } 167 | 168 | func mustFindStringInBody(body string, find string) error { 169 | if !strings.Contains(body, find) { 170 | return fmt.Errorf("string-to-find `%s` NOT in body: %s", find, body) 171 | } 172 | 173 | return nil 174 | } 175 | 176 | func concurrently(numWorkers int, worker func(), produceWork func()) { 177 | workersDone := sync.WaitGroup{} 178 | 179 | for i := 0; i < numWorkers; i++ { 180 | workersDone.Add(1) 181 | go func() { 182 | defer workersDone.Done() 183 | 184 | worker() 185 | }() 186 | } 187 | 188 | produceWork() 189 | 190 | workersDone.Wait() 191 | } 192 | 
-------------------------------------------------------------------------------- /cmd/alertmanager/scheduled.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "errors" 6 | "fmt" 7 | "regexp" 8 | "strconv" 9 | "strings" 10 | "time" 11 | 12 | "github.com/function61/eventhorizon/pkg/ehevent" 13 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 14 | "github.com/function61/lambda-alertmanager/pkg/amstate" 15 | ) 16 | 17 | // runs every minute 18 | func handleCloudwatchScheduledEvent(ctx context.Context, now time.Time) error { 19 | app, err := getApp(ctx) 20 | if err != nil { 21 | return err 22 | } 23 | 24 | if err := checkAndAlertForUnnoticedAlerts(ctx, app, publishAlert, now); err != nil { 25 | return err 26 | } 27 | 28 | if err := alertForExpiredDeadMansSwitches(ctx, app, now); err != nil { 29 | return err 30 | } 31 | 32 | if err := httpMonitorScanAndAlertFailures(ctx, app); err != nil { 33 | return err 34 | } 35 | 36 | return nil 37 | } 38 | 39 | type alertDirectPublisherFn func(amstate.Alert) error 40 | 41 | // check for old unnoticed alerts (not acked within 4 hours) and send an alarm to notify the operator, 42 | // keep sending every hour 43 | func checkAndAlertForUnnoticedAlerts( 44 | ctx context.Context, 45 | app *amstate.App, 46 | alertDirectPublisher alertDirectPublisherFn, 47 | now time.Time, 48 | ) error { 49 | unnoticedAlerts := amstate.GetUnnoticedAlerts(app.State.ActiveAlerts(), now) 50 | 51 | unnoticedAlertSubjects := []string{} 52 | unnoticedAlertIds := []string{} 53 | 54 | for _, alert := range unnoticedAlerts { 55 | unnoticedAlertSubjects = append(unnoticedAlertSubjects, alert.Subject+" "+ackLink(alert)) 56 | unnoticedAlertIds = append(unnoticedAlertIds, alert.Id) 57 | } 58 | 59 | if len(unnoticedAlertIds) == 0 { 60 | return nil 61 | } 62 | 63 | errAlreadyNotified := errors.New("already notified") 64 | 65 | if err := app.Reader.TransactWrite(ctx, func() 
error { 66 | // only notify about unnoticed alerts once an hour 67 | if now.Sub(app.State.LastUnnoticedAlertsNotified()) < 1*time.Hour { 68 | return errAlreadyNotified 69 | } 70 | 71 | return app.AppendAfter(ctx, app.State.Version(), amdomain.NewUnnoticedAlertsNotified( 72 | unnoticedAlertIds, 73 | ehevent.MetaSystemUser(now))) 74 | }); err != nil { 75 | if err == errAlreadyNotified { 76 | return nil // not actually an error 77 | } else { 78 | return err 79 | } 80 | } 81 | 82 | details := fmt.Sprintf( 83 | "There are %d un-acked alert(s):\n\n%s\n\nGo take care of them!", 84 | len(unnoticedAlertSubjects), 85 | strings.Join(unnoticedAlertSubjects, "\n")) 86 | 87 | // skip ingestion to bypass rate limiting (this scheduled function is not invoked 88 | // too often) and deduplication. besides, we want to keep reminding the operator 89 | // to take care of this situation 90 | return alertDirectPublisher(amstate.Alert{ 91 | Subject: "Un-acked alerts", 92 | Details: details, 93 | Timestamp: now, 94 | }) 95 | } 96 | 97 | func alertForExpiredDeadMansSwitches(ctx context.Context, app *amstate.App, now time.Time) error { 98 | candidateAlerts := []amstate.Alert{} 99 | 100 | for _, dms := range amstate.GetExpiredDeadMansSwitches(app.State.DeadMansSwitches(), now) { 101 | candidateAlerts = append(candidateAlerts, deadMansSwitchToAlert(dms, now)) 102 | } 103 | 104 | // ok with len(alerts) == 0 105 | return ingestAlerts(ctx, candidateAlerts, app) 106 | } 107 | 108 | var plusDayAtStaticTimeRe = regexp.MustCompile(`^\+([0-9]+)d@([0-9]{2}):([0-9]{2})$`) 109 | 110 | func parseTtlSpec(spec string, now time.Time) (time.Time, error) { 111 | // +1d@12:00 112 | if match := plusDayAtStaticTimeRe.FindStringSubmatch(spec); match != nil { 113 | // below errors should never trigger because regexp guarantees they're in good format 114 | day, err := strconv.Atoi(match[1]) 115 | if err != nil { 116 | return time.Time{}, fmt.Errorf("bad day component: %v", err) 117 | } 118 | hour, err := 
strconv.Atoi(match[2]) 119 | if err != nil { 120 | return time.Time{}, fmt.Errorf("bad hour component: %v", err) 121 | } 122 | minute, err := strconv.Atoi(match[3]) 123 | if err != nil { 124 | return time.Time{}, fmt.Errorf("bad minute component: %v", err) 125 | } 126 | 127 | return time.Date(now.Year(), now.Month(), now.Day()+day, hour, minute, 0, 0, time.UTC), nil 128 | } else if strings.HasPrefix(spec, "+") { // +24h 129 | duration, err := time.ParseDuration(spec[1:]) 130 | if err != nil { 131 | return time.Time{}, errors.New("duration in bad format") 132 | } 133 | 134 | return now.Add(duration), nil 135 | } else { 136 | ttl, err := time.Parse(time.RFC3339Nano, spec) 137 | if err != nil { 138 | err = fmt.Errorf("not in RFC3339: %s", spec) 139 | } 140 | return ttl, err 141 | } 142 | } 143 | 144 | func deadMansSwitchToAlert(dms amstate.DeadMansSwitch, now time.Time) amstate.Alert { 145 | return amstate.Alert{ 146 | Id: amstate.NewAlertId(), 147 | Subject: dms.Subject, 148 | Details: fmt.Sprintf("Check-in late by %s (%s)", now.Sub(dms.Ttl), dms.Ttl.Format(time.RFC3339Nano)), 149 | Timestamp: now, 150 | } 151 | } 152 | -------------------------------------------------------------------------------- /docs/setup_alertmanager-canary.md: -------------------------------------------------------------------------------- 1 | Setting up AlertManager-Canary 2 | ============================== 3 | 4 | Canary pokes at your web properties over http/https to see if they are online. If they are not, an alarm will be raised. 5 | 6 | 7 | Create the Lambda function 8 | -------------------------- 9 | 10 | - Go to `Lambda > Create a Lambda function > Blank function`. 11 | - Do not configure any triggers at this time (just hit next). 
12 | - Name: `AlertManager-Canary` 13 | - Description: `Checks that important web properties are working.` 14 | - Runtime: `Node.js 4.3` (or higher) 15 | - Code entry type: `Upload a .ZIP file` 16 | - Download the latest `alertmanager-canary.zip` from the releases page (on GitHub) 17 | to your desktop and then upload it to Lambda 18 | 19 | Now, for each property that you want to monitor, add a check as a separate ENV variable. Example: 20 | 21 | - `CHECK1` = `{"url":"https://example.com/"§"find":"This domain is established to be used for illustrative examples in documents."}` 22 | - `INGEST_TOPIC` = ARN of your ingest topic (mine looked like `arn:aws:sns:us-west-2:426466625513:AlertManager-ingest`) 23 | 24 | (NOTE: `,` chars in `CHECK...` JSON are replaced with `§` because the geniuses that implemented ENV variables 25 | in Lambda probably serialize the ENV list as a `,`-separated string: currently 26 | [ENV var values cannot contain `,`](https://forums.aws.amazon.com/thread.jspa?messageID=753580)) 27 | 28 | (NOTE: there can be gaps in the check numbers; the numbers only have to be unique, and luckily Lambda checks this) 29 | 30 | Role config: 31 | 32 | - Handler: leave as is 33 | - Role: leave as is (`Choose existing role`) 34 | - Existing role: `AlertManager` 35 | 36 | Advanced config: 37 | 38 | - Memory (MB): leave as is (`128`) 39 | - Timeout: 1 min 40 | 41 | Okay, now hit `[ Create function ]`. 42 | 43 | 44 | Test that the function works 45 | ---------------------------- 46 | 47 | Now hit `[ Test ]` so we can see that it is working. It'll ask you for a test event, but the content does not matter 48 | (since our events will be schedule-based) so just accept the dummy event offered by Lambda. 49 | 50 | You should get this log output from the test run: 51 | 52 | ``` 53 | START RequestId: ff8ffe53-db1d-11e6-8fda-35d4c5ac1dd6 Version: $LATEST 54 | 2017-01-15T12:27:37.417Z ff8ffe53-db1d-11e6-8fda-35d4c5ac1dd6 Starting Canary. 
Check count: 1 55 | 2017-01-15T12:27:37.838Z ff8ffe53-db1d-11e6-8fda-35d4c5ac1dd6 ✓ https://example.com/ duration=419 56 | 2017-01-15T12:27:37.838Z ff8ffe53-db1d-11e6-8fda-35d4c5ac1dd6 => All passed. Awesome! 57 | END RequestId: ff8ffe53-db1d-11e6-8fda-35d4c5ac1dd6 58 | ``` 59 | 60 | Now edit the check definition (`CHECK1`) to look like this: 61 | 62 | ``` 63 | {"url":"https://example.com/"§"find":"THIS TEXT WILL NOT BE FOUND"} 64 | ``` 65 | 66 | - `[ Save ]` 67 | - `[ Test ]` 68 | 69 | Your log output should now be: 70 | 71 | ``` 72 | START RequestId: 586059c3-db1e-11e6-ab0a-a37bca6277f6 Version: $LATEST 73 | 2017-01-15T12:30:06.403Z 586059c3-db1e-11e6-ab0a-a37bca6277f6 Starting Canary. Check count: 1 74 | 2017-01-15T12:30:06.837Z 586059c3-db1e-11e6-ab0a-a37bca6277f6 https://example.com/ failed once - re-trying (only once) 75 | 2017-01-15T12:30:06.948Z 586059c3-db1e-11e6-ab0a-a37bca6277f6 ✗ https://example.com/ => find="THIS TEXT WILL NOT BE FOUND" NOT in body: FAIL (0/1) succeeded 77 | END RequestId: 586059c3-db1e-11e6-ab0a-a37bca6277f6 78 | ``` 79 | 80 | AlertManager-Canary just posted this alarm to AlertManager for ingestion via SNS topic `AlertManager-ingest`. 81 | 82 | You should've received the alert via email. Now if you hit `[ Test ]` again, Canary will submit the alarm again for ingestion, 83 | but this time it will be discarded because the previous alarm for the same URL has not been acknowledged yet. 84 | 85 | You can now acknowledge the alert you just triggered (read the API gateway docs again if you are not sure how to do this), 86 | and add actual websites to monitor to your Canary. Please don't leave the example.com check there, as it's not your website to hammer. 87 | 88 | 89 | Add scheduled trigger 90 | --------------------- 91 | 92 | We want this Canary to be run automatically every minute (or at any rate you want). 
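
Each scheduled invocation amounts to one pass over all configured checks. As a rough illustration only, such a pass could be sketched in Go as below, with the checks run in parallel and one retry per failed check. This is not the Canary's actual code (the Canary is uploaded as a prebuilt zip); the names `check`, `fetchFn`, `httpFetch`, `runOnce` and `runPass` are hypothetical, and the local test server merely stands in for a monitored site so that the sketch runs offline.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
	"time"
)

// check mirrors a CHECK... definition: a URL plus a string that must appear
// in the response body. The type name is hypothetical.
type check struct {
	URL  string
	Find string
}

// fetchFn returns the response body for a URL. It is injectable so that a
// pass can be exercised without touching the network.
type fetchFn func(url string) (string, error)

// httpFetch fetches a URL with a client-side timeout.
func httpFetch(url string) (string, error) {
	client := &http.Client{Timeout: 15 * time.Second}

	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// runOnce makes a single attempt at one check.
func runOnce(c check, fetch fetchFn) error {
	body, err := fetch(c.URL)
	if err != nil {
		return err
	}
	if !strings.Contains(body, c.Find) {
		return fmt.Errorf("find=%q NOT in body", c.Find)
	}
	return nil
}

// runPass runs all checks in parallel, retries each failed check once, and
// returns the checks that failed both attempts, keyed by URL.
func runPass(checks []check, fetch fetchFn) map[string]error {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	failures := map[string]error{}

	for _, c := range checks {
		wg.Add(1)
		go func(c check) {
			defer wg.Done()

			err := runOnce(c, fetch)
			if err != nil {
				err = runOnce(c, fetch) // re-try (only once)
			}
			if err != nil {
				mu.Lock()
				defer mu.Unlock()
				failures[c.URL] = err
			}
		}(c)
	}

	wg.Wait()
	return failures
}

func main() {
	// local stand-in for a monitored site, so the sketch runs offline
	site := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "Welcome to my site")
	}))
	defer site.Close()

	failures := runPass([]check{
		{URL: site.URL + "/ok", Find: "Welcome"},
		{URL: site.URL + "/bad", Find: "THIS TEXT WILL NOT BE FOUND"},
	}, httpFetch)

	for url, err := range failures {
		fmt.Printf("✗ %s => %v\n", url, err)
	}
}
```

Passing the fetcher in as a function keeps the retry logic testable without network access. Whatever a pass reports as failed is what would be submitted for ingestion as alarms.
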
93 | 94 | Go to `CloudWatch > Events > Rules > Create`: 95 | 96 | - Event source = `Schedule` 97 | - Fixed rate = `1 minutes` 98 | 99 | In `Targets > Add target`: 100 | 101 | - Lambda function = `AlertManager-Canary` 102 | 103 | Hit `[ Configure details ]` ("next"): 104 | 105 | - Name = `AlertManager-Canary` 106 | - Description = leave empty 107 | - State = `enabled` 108 | - `[ Create rule ]` 109 | 110 | Canary will now be run automatically - every minute. You can verify it works either by: 111 | 112 | - Looking at the logs in `Lambda > AlertManager-Canary > Monitoring > Logs` or 113 | - Tweaking a check definition so that it triggers an alarm, then waiting a minute 114 | to receive the alarm so you know it's working. Just remember to tweak the check back to 115 | how it should be, and acknowledge the alert! 116 | -------------------------------------------------------------------------------- /cmd/alertmanager/httpmonitors.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "time" 7 | 8 | "github.com/function61/eventhorizon/pkg/ehevent" 9 | "github.com/function61/gokit/ossignal" 10 | "github.com/function61/gokit/stringutils" 11 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 12 | "github.com/function61/lambda-alertmanager/pkg/amstate" 13 | "github.com/scylladb/termtables" 14 | "github.com/spf13/cobra" 15 | ) 16 | 17 | func httpMonitorEntry() *cobra.Command { 18 | cmd := &cobra.Command{ 19 | Use: "hm", 20 | Short: "Manage HTTP monitors", 21 | } 22 | 23 | cmd.AddCommand(&cobra.Command{ 24 | Use: "ls", 25 | Short: "List monitors", 26 | Args: cobra.NoArgs, 27 | Run: func(cmd *cobra.Command, args []string) { 28 | exitIfError(httpMonitorList( 29 | ossignal.InterruptOrTerminateBackgroundCtx(nil))) 30 | }, 31 | }) 32 | 33 | cmd.AddCommand(&cobra.Command{ 34 | Use: "rm [id]", 35 | Short: "Remove HTTP monitor", 36 | Args: cobra.ExactArgs(1), 37 | Run: func(cmd 
*cobra.Command, args []string) { 38 | exitIfError(httpMonitorDelete( 39 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 40 | args[0])) 41 | }, 42 | }) 43 | 44 | cmd.AddCommand(&cobra.Command{ 45 | Use: "enable [id]", 46 | Short: "Enable disabled HTTP monitor", 47 | Args: cobra.ExactArgs(1), 48 | Run: func(cmd *cobra.Command, args []string) { 49 | exitIfError(httpMonitorEnableOrDisable( 50 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 51 | args[0], 52 | true)) 53 | }, 54 | }) 55 | 56 | cmd.AddCommand(&cobra.Command{ 57 | Use: "disable [id]", 58 | Short: "Temporarily disable a HTTP monitor", 59 | Args: cobra.ExactArgs(1), 60 | Run: func(cmd *cobra.Command, args []string) { 61 | exitIfError(httpMonitorEnableOrDisable( 62 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 63 | args[0], 64 | false)) 65 | }, 66 | }) 67 | 68 | cmd.AddCommand(&cobra.Command{ 69 | Use: "mk [url] [find]", 70 | Short: "Create HTTP monitor", 71 | Args: cobra.ExactArgs(2), 72 | Run: func(cmd *cobra.Command, args []string) { 73 | exitIfError(httpMonitorCreate( 74 | ossignal.InterruptOrTerminateBackgroundCtx(nil), 75 | args[0], 76 | args[1])) 77 | }, 78 | }) 79 | 80 | cmd.AddCommand(&cobra.Command{ 81 | Use: "scan", 82 | Short: "Runs all enabled monitors and raises alerts if appropriate", 83 | Args: cobra.NoArgs, 84 | Run: func(cmd *cobra.Command, args []string) { 85 | ctx := ossignal.InterruptOrTerminateBackgroundCtx(nil) 86 | 87 | app, err := getApp(ctx) 88 | exitIfError(err) 89 | 90 | exitIfError(httpMonitorScanAndAlertFailures(ctx, app)) 91 | }, 92 | }) 93 | 94 | return cmd 95 | } 96 | 97 | func httpMonitorList(ctx context.Context) error { 98 | app, err := getApp(ctx) 99 | if err != nil { 100 | return err 101 | } 102 | 103 | view := termtables.CreateTable() 104 | view.AddHeaders("Id", "Enabled", "Url", "Find") 105 | 106 | for _, alert := range app.State.HttpMonitors() { 107 | view.AddRow( 108 | alert.Id, 109 | boolToCheckmark(alert.Enabled), 110 | stringutils.Truncate(alert.Url, 44), 
111 | alert.Find) 112 | } 113 | 114 | fmt.Println(view.Render()) 115 | 116 | return nil 117 | } 118 | 119 | func httpMonitorCreate(ctx context.Context, url string, find string) error { 120 | app, err := getApp(ctx) 121 | if err != nil { 122 | return err 123 | } 124 | 125 | monitorCreated := amdomain.NewHttpMonitorCreated( 126 | amstate.NewHttpMonitorId(), 127 | true, 128 | url, 129 | find, 130 | ehevent.MetaSystemUser(time.Now())) 131 | 132 | ver := app.State.Version() 133 | 134 | _, err = app.Writer.Append(ctx, ver.Stream(), []string{ 135 | ehevent.Serialize(monitorCreated), 136 | }) 137 | return err 138 | } 139 | 140 | func httpMonitorDelete(ctx context.Context, id string) error { 141 | app, err := getApp(ctx) 142 | if err != nil { 143 | return err 144 | } 145 | 146 | return app.Reader.TransactWrite(ctx, func() error { 147 | if amstate.FindHttpMonitorWithId(id, app.State.HttpMonitors()) == nil { 148 | return fmt.Errorf("monitor to delete not found: %s", id) 149 | } 150 | 151 | return app.AppendAfter(ctx, app.State.Version(), amdomain.NewHttpMonitorDeleted( 152 | id, 153 | ehevent.MetaSystemUser(time.Now()))) 154 | }) 155 | } 156 | 157 | func httpMonitorEnableOrDisable(ctx context.Context, id string, newState bool) error { 158 | app, err := getApp(ctx) 159 | if err != nil { 160 | return err 161 | } 162 | 163 | return app.Reader.TransactWrite(ctx, func() error { 164 | monitorToEdit := amstate.FindHttpMonitorWithId(id, app.State.HttpMonitors()) 165 | if monitorToEdit == nil { 166 | return fmt.Errorf("monitor not found: %s", id) 167 | } 168 | 169 | if monitorToEdit.Enabled == newState { 170 | return fmt.Errorf("monitor left unchanged: %s", id) 171 | } 172 | 173 | return app.AppendAfter(ctx, app.State.Version(), amdomain.NewHttpMonitorEnabledUpdated( 174 | id, 175 | newState, 176 | ehevent.MetaSystemUser(time.Now()))) 177 | }) 178 | } 179 | 180 | func boolToCheckmark(input bool) string { 181 | if input { 182 | return "✓" 183 | } else { 184 | return "✗" 185 | } 186 | } 
187 | -------------------------------------------------------------------------------- /cmd/alertmanager/restapi.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "encoding/json" 6 | "fmt" 7 | "log" 8 | "net/http" 9 | "os" 10 | "time" 11 | 12 | "github.com/function61/gokit/httputils" 13 | "github.com/function61/gokit/jsonfile" 14 | "github.com/function61/gokit/logex" 15 | "github.com/function61/gokit/ossignal" 16 | "github.com/function61/gokit/taskrunner" 17 | "github.com/function61/lambda-alertmanager/pkg/alertmanagertypes" 18 | "github.com/function61/lambda-alertmanager/pkg/amstate" 19 | "github.com/spf13/cobra" 20 | ) 21 | 22 | func newRestApi(ctx context.Context) http.Handler { 23 | app, err := getApp(ctx) 24 | if err != nil { 25 | return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { 26 | http.Error(w, err.Error(), http.StatusInternalServerError) 27 | }) 28 | } 29 | 30 | mux := httputils.NewMethodMux() 31 | 32 | mux.GET.HandleFunc("/alerts", func(w http.ResponseWriter, r *http.Request) { 33 | noCacheHeaders(w) 34 | 35 | handleJsonOutput(w, app.State.ActiveAlerts()) 36 | }) 37 | 38 | mux.POST.HandleFunc("/alerts/ingest", func(w http.ResponseWriter, r *http.Request) { 39 | alert := amstate.Alert{} 40 | if err := jsonfile.Unmarshal(r.Body, &alert, true); err != nil { 41 | http.Error(w, err.Error(), http.StatusBadRequest) 42 | return 43 | } 44 | alert.Id = amstate.NewAlertId() // FIXME: bad design 45 | 46 | created, err := ingestAlertsAndReturnCreatedFlag(r.Context(), []amstate.Alert{alert}, app) 47 | if err != nil { 48 | http.Error(w, err.Error(), http.StatusInternalServerError) 49 | return 50 | } 51 | 52 | if created { 53 | w.WriteHeader(http.StatusCreated) 54 | } else { 55 | w.WriteHeader(http.StatusNoContent) 56 | } 57 | }) 58 | 59 | mux.GET.HandleFunc("/alerts/acknowledge", func(w http.ResponseWriter, r *http.Request) { 60 | id := r.URL.Query().Get("id") 61 
| 62 | noCacheHeaders(w) 63 | 64 | if err := alertAck(r.Context(), id); err != nil { 65 | http.Error(w, err.Error(), http.StatusInternalServerError) 66 | return 67 | } 68 | 69 | fmt.Fprintf(w, "Ack ok for %s", id) 70 | }) 71 | 72 | mux.GET.HandleFunc("/deadmansswitches", func(w http.ResponseWriter, r *http.Request) { 73 | noCacheHeaders(w) 74 | 75 | handleJsonOutput(w, app.State.DeadMansSwitches()) 76 | }) 77 | 78 | // /deadmansswitch/checkin?subject=ubackup_done&ttl=%2B24h30m (the leading "+" of a relative TTL must be URL-encoded) 79 | mux.GET.HandleFunc("/deadmansswitch/checkin", func(w http.ResponseWriter, r *http.Request) { 80 | // same semantic hack here as acknowledge endpoint 81 | 82 | noCacheHeaders(w) 83 | 84 | // handles validation 85 | handleDeadMansSwitchCheckin(w, r, alertmanagertypes.DeadMansSwitchCheckinRequest{ 86 | Subject: r.URL.Query().Get("subject"), 87 | TTL: r.URL.Query().Get("ttl"), 88 | }, app) 89 | }) 90 | 91 | mux.POST.HandleFunc("/deadmansswitch/checkin", func(w http.ResponseWriter, r *http.Request) { 92 | checkin := alertmanagertypes.DeadMansSwitchCheckinRequest{} 93 | if err := jsonfile.Unmarshal(r.Body, &checkin, true); err != nil { 94 | http.Error(w, err.Error(), http.StatusBadRequest) 95 | return 96 | } 97 | 98 | // handles validation 99 | handleDeadMansSwitchCheckin(w, r, checkin, app) 100 | }) 101 | 102 | mux.POST.HandleFunc("/prometheus-alertmanager/api/v1/alerts", func(w http.ResponseWriter, r *http.Request) { 103 | http.Error(w, "not implemented yet", http.StatusNotImplemented) 104 | }) 105 | 106 | return mux 107 | } 108 | 109 | func handleDeadMansSwitchCheckin( 110 | w http.ResponseWriter, 111 | r *http.Request, 112 | raw alertmanagertypes.DeadMansSwitchCheckinRequest, 113 | app *amstate.App, 114 | ) { 115 | if raw.Subject == "" || raw.TTL == "" { 116 | http.Error(w, "subject or ttl empty", http.StatusBadRequest) 117 | return 118 | } 119 | 120 | now := time.Now() 121 | 122 | ttl, err := parseTtlSpec(raw.TTL, now) 123 | if err != nil { 124 | http.Error(w, err.Error(), 
http.StatusBadRequest) 125 | return 126 | } 127 | 128 | alertAcked, err := deadmansswitchCheckin(r.Context(), raw.Subject, ttl, app, now) 129 | if err != nil { 130 | http.Error(w, err.Error(), http.StatusInternalServerError) 131 | return 132 | } 133 | 134 | if alertAcked { 135 | fmt.Fprintln(w, "Check-in noted; alert that was firing for this dead man's switch was acked") 136 | } else { 137 | fmt.Fprintln(w, "Check-in noted") 138 | } 139 | } 140 | 141 | func handleJsonOutput(w http.ResponseWriter, output interface{}) { 142 | w.Header().Set("Content-Type", "application/json") 143 | 144 | if err := json.NewEncoder(w).Encode(output); err != nil { 145 | panic(err) 146 | } 147 | } 148 | 149 | func restApiCliEntry() *cobra.Command { 150 | return &cobra.Command{ 151 | Use: "restapi", 152 | Short: "Start REST API (used mainly for dev/testing)", 153 | Args: cobra.NoArgs, 154 | Run: func(cmd *cobra.Command, args []string) { 155 | logger := logex.StandardLogger() 156 | 157 | exitIfError(runStandaloneRestApi( 158 | ossignal.InterruptOrTerminateBackgroundCtx(logger), 159 | logger)) 160 | }, 161 | } 162 | } 163 | 164 | func runStandaloneRestApi(ctx context.Context, logger *log.Logger) error { 165 | srv := &http.Server{ 166 | Addr: ":80", 167 | Handler: newRestApi(ctx), 168 | } 169 | 170 | tasks := taskrunner.New(ctx, logger) 171 | 172 | tasks.Start("listener "+srv.Addr, func(_ context.Context, _ string) error { 173 | return httputils.RemoveGracefulServerClosedError(srv.ListenAndServe()) 174 | }) 175 | 176 | tasks.Start("listenershutdowner", httputils.ServerShutdownTask(srv)) 177 | 178 | return tasks.Wait() 179 | } 180 | 181 | func ackLink(alert amstate.Alert) string { 182 | return os.Getenv("API_ENDPOINT") + "/alerts/acknowledge?id=" + alert.Id 183 | } 184 | 185 | func noCacheHeaders(w http.ResponseWriter) { 186 | w.Header().Set("Cache-Control", "no-store, must-revalidate") 187 | } 188 | -------------------------------------------------------------------------------- 
/README.md: -------------------------------------------------------------------------------- 1 | ![Build](https://github.com/function61/lambda-alertmanager/workflows/Build/badge.svg) 2 | 3 | 4 | **NOTE: some documentation is outdated, since there was a major refactoring** 5 | 6 | Requires these ENV vars: 7 | 8 | - `ALERT_TOPIC`=arn:aws:sns:us-east-1:123456789012:AlertManager-alert 9 | - `API_ENDPOINT`=https://foobar.execute-api.us-east-1.amazonaws.com/prod 10 | - `AWS_ACCESS_KEY_ID`=AKIA... 11 | - `AWS_SECRET_ACCESS_KEY`=brKsU... 12 | - `EVENTHORIZON`=prod:1:::eu-central-1 13 | 14 | 15 | lambda-alertmanager? 16 | -------------------- 17 | 18 | - Provides simple & reliable alerting for your infrastructure. 19 | - Uses so few resources that it is practically free to run. 20 | - [Monitors your web properties for being up](docs/usecase_http-monitoring.md), 21 | [receives alerts from Prometheus](docs/usecase_prometheus-alerting.md), 22 | [Amazon CloudWatch alarms](docs/usecase_cloudwatch-alerting.md), alarms via SNS topic or 23 | [any custom HTTP integration (as JSON)](docs/setup_custom_integration.md). 24 | - Runs **entirely** on AWS' reliable infrastructure (after setup there is nothing for you to manage or fix). The compute part is Lambda, 25 | but we also use DynamoDB + streams (for state), IAM (for sandboxing AlertManager), API Gateway (for inbound https integrations), 26 | CloudWatch Events (for scheduling) and SNS (inbound alarm receiving, outbound alert delivery). 27 | - Acknowledge model: each separate alarm is alerted only once until it is acknowledged from the UI, 28 | even if the same alarm is submitted again. For example, Prometheus sends the same alert continuously 29 | until the issue is resolved, but of course you want to receive the alert only once. 30 | - Rate limiting: if shit hits the fan and hundreds of your alarms trigger all at once, you only get alerts 31 | for the first, say, 10 alarms. The rate limit is configurable. 
32 | - Supports dead man's switches: a service has to periodically make a check-in. If the 33 | check-ins stop coming, we raise an alert. 34 | 35 | 36 | Can send alerts to you (or many people) via: 37 | -------------------------------------------- 38 | 39 | - SMS ([free: <= 100 alerts/month](https://aws.amazon.com/sns/sms-pricing/)) 40 | - Email 41 | - Webhook 42 | - Push to mobile device (though SMS is better in cases when you are travelling or otherwise not reachable via mobile data) 43 | - Any combination of these (I use SMS + Email) 44 | - Or [anything that SNS supports](https://aws.amazon.com/sns/details/) (the above are just SNS transports) 45 | 46 | 47 | Can directly monitor: 48 | --------------------- 49 | 50 | - http/https checks via AlertManager-Canary component (included but optional): 51 | checks that your web properties are up - triggers an alert if not. Can even check all your properties 52 | at 1 minute intervals, and runs efficiently because all the checks are executed in parallel. Tries to minimize 53 | false positives by retrying each failed check once before generating an alarm. 54 | 55 | 56 | Integrates with: 57 | ---------------- 58 | 59 | - Supports receiving alerts from [Prometheus](https://prometheus.io/). 60 | - Supports receiving alerts via SNS (= directly plugs into Amazon CloudWatch Alerts) 61 | or any other SNS-publishing source. For example we receive alerts from CloudWatch -> AlertManager if our 62 | queue processors stop processing work. 63 | - Supports receiving alerts over https as JSON. 64 | 65 | 66 | How to install & other docs 67 | --------------------------- 68 | 69 | Take note of your AWS region. These docs assume you are in the `us-west-2` region. 70 | If not, substitute your region code everywhere in these docs! 71 | 72 | Follow these steps precisely, and you've got yourself a working installation: 73 | 74 | 1. [Set up SNS topics](docs/setup_sns.md) 75 | 2. [Set up DynamoDB](docs/setup_dynamodb.md) 76 | 3. 
[Set up IAM](docs/setup_iam.md) 77 | 4. [Set up AlertManager](docs/setup_alertmanager.md) 78 | 5. [Set up API Gateway](docs/setup_apigateway.md) (also includes: testing that this works) 79 | 6. (recommended) [Set up AlertManager-canary](docs/setup_alertmanager-canary.md) 80 | 7. (optional) [Set up Prometheus integration](docs/usecase_prometheus-alerting.md) 81 | 8. (optional) [Set up custom integration](docs/setup_custom_integration.md) 82 | 9. (optional) [Set up CloudWatch integration](docs/usecase_cloudwatch-alerting.md) 83 | 84 | 85 | Diagram 86 | ------- 87 | 88 | [![Graph](docs/diagram.png)](docs/diagram.md) 89 | 90 | 91 | FAQ 92 | --- 93 | 94 | Q: Why use this when [uptimerobot.com](https://uptimerobot.com/) is free? 95 | 96 | A: uptimerobot.com is good, but: 97 | 98 | - The free option only supports 5 minute rates while lambda-alertmanager supports 1 minute rates. 99 | - I don't trust the quality of it for my production usage: I had an issue where a failed check 100 | stayed failed for more than 12 hours after the fix, even though I manually checked that the 101 | endpoint worked. I had to pause and then resume the check, right after which UptimeRobot 102 | reported the check as OK. 103 | - It mainly does HTTP/HTTPS checks, while lambda-alertmanager integrates with Prometheus, Amazon CloudWatch & others as well. 104 | - It supports free SMS messages (no delivery guarantees), and has a non-free "pro SMS" option (better delivery). 105 | lambda-alertmanager SMSes are all "pro SMS" and free up to a certain limit. 106 | - lambda-alertmanager is simple, free, open source, runs "on premises" (in your AWS account) and should run forever 107 | (AWS is not going anywhere). 108 | - That being said, lambda-alertmanager is not "dead simple" to set up and you need an AWS account. If your use 109 | case does not require lambda-alertmanager, you should probably choose uptimerobot. 
:) 110 | 111 | 112 | Support / contact 113 | ----------------- 114 | 115 | Basic support (no guarantees) for issues / feature requests via GitHub issues. 116 | 117 | Paid support is available via [function61.com/consulting](https://function61.com/consulting/) 118 | 119 | Contact options (email, Twitter etc.) at [function61.com](https://function61.com/) 120 | -------------------------------------------------------------------------------- /pkg/amdomain/events.go: -------------------------------------------------------------------------------- 1 | // Structure of data for all state changes 2 | package amdomain 3 | 4 | import ( 5 | "time" 6 | 7 | "github.com/function61/eventhorizon/pkg/ehevent" 8 | ) 9 | 10 | var Types = ehevent.Allocators{ 11 | "AlertRaised": func() ehevent.Event { return &AlertRaised{} }, 12 | "AlertAcknowledged": func() ehevent.Event { return &AlertAcknowledged{} }, 13 | "UnnoticedAlertsNotified": func() ehevent.Event { return &UnnoticedAlertsNotified{} }, 14 | "HttpMonitorCreated": func() ehevent.Event { return &HttpMonitorCreated{} }, 15 | "HttpMonitorEnabledUpdated": func() ehevent.Event { return &HttpMonitorEnabledUpdated{} }, 16 | "HttpMonitorDeleted": func() ehevent.Event { return &HttpMonitorDeleted{} }, 17 | "DeadMansSwitchCreated": func() ehevent.Event { return &DeadMansSwitchCreated{} }, 18 | "DeadMansSwitchCheckin": func() ehevent.Event { return &DeadMansSwitchCheckin{} }, 19 | "DeadMansSwitchDeleted": func() ehevent.Event { return &DeadMansSwitchDeleted{} }, 20 | } 21 | 22 | // ------ 23 | 24 | type AlertRaised struct { 25 | meta ehevent.EventMeta 26 | Id string 27 | Subject string 28 | Details string 29 | } 30 | 31 | func (e *AlertRaised) MetaType() string { return "AlertRaised" } 32 | func (e *AlertRaised) Meta() *ehevent.EventMeta { return &e.meta } 33 | 34 | func NewAlertRaised( 35 | id string, 36 | subject string, 37 | details string, 38 | meta ehevent.EventMeta, 39 | ) *AlertRaised { 40 | return &AlertRaised{ 41 | meta: meta, 42 | 
Id: id, 43 | Subject: subject, 44 | Details: details, 45 | } 46 | } 47 | 48 | // ------ 49 | 50 | type AlertAcknowledged struct { 51 | meta ehevent.EventMeta 52 | Id string 53 | } 54 | 55 | func (e *AlertAcknowledged) MetaType() string { return "AlertAcknowledged" } 56 | func (e *AlertAcknowledged) Meta() *ehevent.EventMeta { return &e.meta } 57 | 58 | func NewAlertAcknowledged( 59 | id string, 60 | meta ehevent.EventMeta, 61 | ) *AlertAcknowledged { 62 | return &AlertAcknowledged{ 63 | meta: meta, 64 | Id: id, 65 | } 66 | } 67 | 68 | // ------ 69 | 70 | type UnnoticedAlertsNotified struct { 71 | meta ehevent.EventMeta 72 | AlertIds []string 73 | } 74 | 75 | func (e *UnnoticedAlertsNotified) MetaType() string { return "UnnoticedAlertsNotified" } 76 | func (e *UnnoticedAlertsNotified) Meta() *ehevent.EventMeta { return &e.meta } 77 | 78 | func NewUnnoticedAlertsNotified( 79 | alertIds []string, 80 | meta ehevent.EventMeta, 81 | ) *UnnoticedAlertsNotified { 82 | return &UnnoticedAlertsNotified{ 83 | meta: meta, 84 | AlertIds: alertIds, 85 | } 86 | } 87 | 88 | // ------ 89 | 90 | type HttpMonitorCreated struct { 91 | meta ehevent.EventMeta 92 | Id string 93 | Enabled bool 94 | Url string 95 | Find string 96 | } 97 | 98 | func (e *HttpMonitorCreated) MetaType() string { return "HttpMonitorCreated" } 99 | func (e *HttpMonitorCreated) Meta() *ehevent.EventMeta { return &e.meta } 100 | 101 | func NewHttpMonitorCreated( 102 | id string, 103 | enabled bool, 104 | url string, 105 | find string, 106 | meta ehevent.EventMeta, 107 | ) *HttpMonitorCreated { 108 | return &HttpMonitorCreated{ 109 | meta: meta, 110 | Id: id, 111 | Enabled: enabled, 112 | Url: url, 113 | Find: find, 114 | } 115 | } 116 | 117 | // ------ 118 | 119 | type HttpMonitorEnabledUpdated struct { 120 | meta ehevent.EventMeta 121 | Id string 122 | Enabled bool 123 | } 124 | 125 | func (e *HttpMonitorEnabledUpdated) MetaType() string { return "HttpMonitorEnabledUpdated" } 126 | func (e 
*HttpMonitorEnabledUpdated) Meta() *ehevent.EventMeta { return &e.meta } 127 | 128 | func NewHttpMonitorEnabledUpdated( 129 | id string, 130 | enabled bool, 131 | meta ehevent.EventMeta, 132 | ) *HttpMonitorEnabledUpdated { 133 | return &HttpMonitorEnabledUpdated{ 134 | meta: meta, 135 | Id: id, 136 | Enabled: enabled, 137 | } 138 | } 139 | 140 | // ------ 141 | 142 | type HttpMonitorDeleted struct { 143 | meta ehevent.EventMeta 144 | Id string 145 | } 146 | 147 | func (e *HttpMonitorDeleted) MetaType() string { return "HttpMonitorDeleted" } 148 | func (e *HttpMonitorDeleted) Meta() *ehevent.EventMeta { return &e.meta } 149 | 150 | func NewHttpMonitorDeleted( 151 | id string, 152 | meta ehevent.EventMeta, 153 | ) *HttpMonitorDeleted { 154 | return &HttpMonitorDeleted{ 155 | meta: meta, 156 | Id: id, 157 | } 158 | } 159 | 160 | // ------ 161 | 162 | type DeadMansSwitchCreated struct { 163 | meta ehevent.EventMeta 164 | Subject string 165 | Ttl time.Time 166 | } 167 | 168 | func (e *DeadMansSwitchCreated) MetaType() string { return "DeadMansSwitchCreated" } 169 | func (e *DeadMansSwitchCreated) Meta() *ehevent.EventMeta { return &e.meta } 170 | 171 | func NewDeadMansSwitchCreated( 172 | subject string, 173 | ttl time.Time, 174 | meta ehevent.EventMeta, 175 | ) *DeadMansSwitchCreated { 176 | return &DeadMansSwitchCreated{ 177 | meta: meta, 178 | Subject: subject, 179 | Ttl: ttl, 180 | } 181 | } 182 | 183 | // ------ 184 | 185 | type DeadMansSwitchCheckin struct { 186 | meta ehevent.EventMeta 187 | Subject string 188 | Ttl time.Time 189 | } 190 | 191 | func (e *DeadMansSwitchCheckin) MetaType() string { return "DeadMansSwitchCheckin" } 192 | func (e *DeadMansSwitchCheckin) Meta() *ehevent.EventMeta { return &e.meta } 193 | 194 | func NewDeadMansSwitchCheckin( 195 | subject string, 196 | ttl time.Time, 197 | meta ehevent.EventMeta, 198 | ) *DeadMansSwitchCheckin { 199 | return &DeadMansSwitchCheckin{ 200 | meta: meta, 201 | Subject: subject, 202 | Ttl: ttl, 203 | } 204 
| } 205 | 206 | // ------ 207 | 208 | type DeadMansSwitchDeleted struct { 209 | meta ehevent.EventMeta 210 | Subject string 211 | } 212 | 213 | func (e *DeadMansSwitchDeleted) MetaType() string { return "DeadMansSwitchDeleted" } 214 | func (e *DeadMansSwitchDeleted) Meta() *ehevent.EventMeta { return &e.meta } 215 | 216 | func NewDeadMansSwitchDeleted( 217 | subject string, 218 | meta ehevent.EventMeta, 219 | ) *DeadMansSwitchDeleted { 220 | return &DeadMansSwitchDeleted{ 221 | meta: meta, 222 | Subject: subject, 223 | } 224 | } 225 | -------------------------------------------------------------------------------- /pkg/amstate/store.go: -------------------------------------------------------------------------------- 1 | package amstate 2 | 3 | import ( 4 | "context" 5 | "encoding/json" 6 | "log" 7 | "sort" 8 | "sync" 9 | "time" 10 | 11 | "github.com/function61/eventhorizon/pkg/ehclient" 12 | "github.com/function61/eventhorizon/pkg/ehevent" 13 | "github.com/function61/eventhorizon/pkg/ehreader" 14 | "github.com/function61/gokit/logex" 15 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 16 | ) 17 | 18 | const ( 19 | Stream = "/alertmanager" 20 | ) 21 | 22 | func newStateFormat() stateFormat { 23 | return stateFormat{ 24 | ActiveAlerts: map[string]Alert{}, 25 | HttpMonitors: map[string]HttpMonitor{}, 26 | DeadMansSwitches: map[string]DeadMansSwitch{}, 27 | } 28 | } 29 | 30 | type Store struct { 31 | version ehclient.Cursor 32 | mu sync.Mutex 33 | state stateFormat // for easy snapshotting 34 | logl *logex.Leveled 35 | } 36 | 37 | func New(tenant ehreader.Tenant, logger *log.Logger) *Store { 38 | return &Store{ 39 | version: ehclient.Beginning(tenant.Stream(Stream)), 40 | state: newStateFormat(), 41 | logl: logex.Levels(logger), 42 | } 43 | } 44 | 45 | func (s *Store) Version() ehclient.Cursor { 46 | s.mu.Lock() 47 | defer s.mu.Unlock() 48 | 49 | return s.version 50 | } 51 | 52 | func (s *Store) InstallSnapshot(snap *ehreader.Snapshot) error { 53 | 
s.mu.Lock() 54 | defer s.mu.Unlock() 55 | 56 | s.version = snap.Cursor 57 | s.state = stateFormat{} 58 | 59 | return json.Unmarshal(snap.Data, &s.state) 60 | } 61 | 62 | func (s *Store) Snapshot() (*ehreader.Snapshot, error) { 63 | s.mu.Lock() 64 | defer s.mu.Unlock() 65 | 66 | data, err := json.MarshalIndent(s.state, "", "\t") 67 | if err != nil { 68 | return nil, err 69 | } 70 | 71 | return ehreader.NewSnapshot(s.version, data), nil 72 | } 73 | 74 | func (s *Store) ActiveAlerts() []Alert { 75 | s.mu.Lock() 76 | defer s.mu.Unlock() 77 | 78 | alerts := []Alert{} 79 | for _, alert := range s.state.ActiveAlerts { 80 | alerts = append(alerts, alert) 81 | } 82 | 83 | sort.Slice(alerts, func(i, j int) bool { return alerts[i].Timestamp.Before(alerts[j].Timestamp) }) 84 | 85 | return alerts 86 | } 87 | 88 | func (s *Store) HttpMonitors() []HttpMonitor { 89 | s.mu.Lock() 90 | defer s.mu.Unlock() 91 | 92 | monitors := []HttpMonitor{} 93 | for _, monitor := range s.state.HttpMonitors { 94 | monitors = append(monitors, monitor) 95 | } 96 | 97 | sort.Slice(monitors, func(i, j int) bool { return monitors[i].Created.Before(monitors[j].Created) }) 98 | 99 | return monitors 100 | } 101 | 102 | func (s *Store) DeadMansSwitches() []DeadMansSwitch { 103 | s.mu.Lock() 104 | defer s.mu.Unlock() 105 | 106 | deadMansSwitches := []DeadMansSwitch{} 107 | for _, dms := range s.state.DeadMansSwitches { 108 | deadMansSwitches = append(deadMansSwitches, dms) 109 | } 110 | 111 | sort.Slice(deadMansSwitches, func(i, j int) bool { 112 | return deadMansSwitches[i].Subject < deadMansSwitches[j].Subject 113 | }) 114 | 115 | return deadMansSwitches 116 | } 117 | 118 | func (s *Store) LastUnnoticedAlertsNotified() time.Time { 119 | s.mu.Lock() 120 | defer s.mu.Unlock() 121 | 122 | return s.state.LastUnnoticedAlertsNotified 123 | } 124 | 125 | func (s *Store) GetEventTypes() ehevent.Allocators { 126 | return amdomain.Types 127 | } 128 | 129 | func (s *Store) ProcessEvents(_ context.Context, 
processAndCommit ehreader.EventProcessorHandler) error { 130 | s.mu.Lock() 131 | defer s.mu.Unlock() 132 | 133 | return processAndCommit( 134 | s.version, 135 | func(ev ehevent.Event) error { return s.processEvent(ev) }, 136 | func(version ehclient.Cursor) error { 137 | s.version = version 138 | return nil 139 | }) 140 | } 141 | 142 | func (s *Store) processEvent(ev ehevent.Event) error { 143 | s.logl.Debug.Println(ev.MetaType()) 144 | 145 | switch e := ev.(type) { 146 | case *amdomain.AlertRaised: 147 | s.state.ActiveAlerts[e.Id] = Alert{ 148 | Id: e.Id, 149 | Subject: e.Subject, 150 | Details: e.Details, 151 | Timestamp: e.Meta().Timestamp, 152 | } 153 | case *amdomain.AlertAcknowledged: 154 | delete(s.state.ActiveAlerts, e.Id) 155 | case *amdomain.HttpMonitorCreated: 156 | s.state.HttpMonitors[e.Id] = HttpMonitor{ 157 | Id: e.Id, 158 | Created: e.Meta().Timestamp, 159 | Enabled: e.Enabled, 160 | Url: e.Url, 161 | Find: e.Find, 162 | } 163 | case *amdomain.HttpMonitorEnabledUpdated: 164 | mon := s.state.HttpMonitors[e.Id] 165 | mon.Enabled = e.Enabled 166 | s.state.HttpMonitors[e.Id] = mon 167 | case *amdomain.HttpMonitorDeleted: 168 | delete(s.state.HttpMonitors, e.Id) 169 | case *amdomain.DeadMansSwitchCreated: 170 | s.state.DeadMansSwitches[e.Subject] = DeadMansSwitch{ 171 | Subject: e.Subject, 172 | Ttl: e.Ttl, 173 | } 174 | case *amdomain.DeadMansSwitchCheckin: 175 | dms := s.state.DeadMansSwitches[e.Subject] 176 | dms.Ttl = e.Ttl 177 | s.state.DeadMansSwitches[e.Subject] = dms 178 | case *amdomain.DeadMansSwitchDeleted: 179 | delete(s.state.DeadMansSwitches, e.Subject) 180 | case *amdomain.UnnoticedAlertsNotified: 181 | s.state.LastUnnoticedAlertsNotified = e.Meta().Timestamp 182 | default: 183 | return ehreader.UnsupportedEventTypeErr(ev) 184 | } 185 | 186 | return nil 187 | } 188 | 189 | type App struct { 190 | State *Store 191 | Reader *ehreader.Reader 192 | Writer ehclient.Writer 193 | Logger *log.Logger 194 | } 195 | 196 | // helper 197 | func (a *App) 
AppendAfter(ctx context.Context, cur ehclient.Cursor, events ...ehevent.Event) error { 198 | serialized := []string{} 199 | for _, event := range events { 200 | serialized = append(serialized, ehevent.Serialize(event)) 201 | } 202 | 203 | // helper mainly written b/c we don't care for returned cursor 204 | _, err := a.Writer.AppendAfter(ctx, cur, serialized) 205 | return err 206 | } 207 | 208 | func LoadUntilRealtime( 209 | ctx context.Context, 210 | tenantCtx *ehreader.TenantCtxWithSnapshots, 211 | logger *log.Logger, 212 | ) (*App, error) { 213 | store := New(tenantCtx.Tenant, logger) 214 | 215 | a := &App{ 216 | store, 217 | ehreader.NewWithSnapshots( 218 | store, 219 | tenantCtx.Client, 220 | tenantCtx.SnapshotStore, 221 | logger), 222 | tenantCtx.Client, 223 | logger} 224 | 225 | if err := a.Reader.LoadUntilRealtime(ctx); err != nil { 226 | return nil, err 227 | } 228 | 229 | return a, nil 230 | } 231 | -------------------------------------------------------------------------------- /pkg/amstate/store_test.go: -------------------------------------------------------------------------------- 1 | package amstate 2 | 3 | import ( 4 | "context" 5 | "testing" 6 | "time" 7 | 8 | "github.com/function61/eventhorizon/pkg/ehevent" 9 | "github.com/function61/eventhorizon/pkg/ehreader" 10 | "github.com/function61/eventhorizon/pkg/ehreader/ehreadertest" 11 | "github.com/function61/gokit/assert" 12 | "github.com/function61/lambda-alertmanager/pkg/amdomain" 13 | ) 14 | 15 | const ( 16 | testStreamName = "/t-42/alertmanager" 17 | ) 18 | 19 | var ( 20 | t0 = time.Date(2020, 2, 20, 14, 2, 0, 0, time.UTC) 21 | ) 22 | 23 | func TestAlerts(t *testing.T) { 24 | ctx := context.Background() 25 | 26 | eventLog := ehreadertest.NewEventLog() 27 | eventLog.AppendE( 28 | testStreamName, 29 | amdomain.NewAlertRaised( 30 | "a14308bba82f", 31 | "The building is on fire", 32 | "Fire sensor in room 456 went off", 33 | ehevent.MetaSystemUser(t0))) 34 | eventLog.AppendE( 35 | testStreamName, 36 | 
amdomain.NewAlertRaised( 37 | "1a33032c9081", 38 | "Water damage detected", 39 | "Water leak sensor in room 456 went off", 40 | ehevent.MetaSystemUser(t0.Add(2*time.Minute)))) 41 | 42 | app, err := LoadUntilRealtime(ctx, ehreader.NewTenantCtxWithSnapshots(ehreader.TenantId("42"), eventLog, ehreader.NewInMemSnapshotStore()), nil) 43 | assert.Ok(t, err) 44 | 45 | assert.Assert(t, FindAlertWithSubject("The building is on fire", app.State.ActiveAlerts()) != nil) 46 | assert.Assert(t, FindAlertWithSubject("Everything is calm", app.State.ActiveAlerts()) == nil) 47 | 48 | alerts := app.State.ActiveAlerts() 49 | 50 | assert.EqualJson(t, alerts, `[ 51 | { 52 | "alert_key": "a14308bba82f", 53 | "subject": "The building is on fire", 54 | "details": "Fire sensor in room 456 went off", 55 | "timestamp": "2020-02-20T14:02:00Z" 56 | }, 57 | { 58 | "alert_key": "1a33032c9081", 59 | "subject": "Water damage detected", 60 | "details": "Water leak sensor in room 456 went off", 61 | "timestamp": "2020-02-20T14:04:00Z" 62 | } 63 | ]`) 64 | 65 | eventLog.AppendE( 66 | testStreamName, 67 | amdomain.NewAlertAcknowledged( 68 | "a14308bba82f", 69 | ehevent.MetaSystemUser(t0.Add(1*time.Hour)))) 70 | 71 | assert.Ok(t, app.Reader.LoadUntilRealtime(ctx)) 72 | 73 | assert.Assert(t, len(app.State.ActiveAlerts()) == 1) 74 | } 75 | 76 | func TestHttpMonitors(t *testing.T) { 77 | ctx := context.Background() 78 | 79 | eventLog := ehreadertest.NewEventLog() 80 | eventLog.AppendE( 81 | testStreamName, 82 | amdomain.NewHttpMonitorCreated( 83 | "49365a17244e", 84 | true, 85 | "https://function61.com/", 86 | "Welcome to the best page in the universe", 87 | ehevent.MetaSystemUser(t0))) 88 | 89 | app, err := LoadUntilRealtime(ctx, ehreader.NewTenantCtxWithSnapshots(ehreader.TenantId("42"), eventLog, ehreader.NewInMemSnapshotStore()), nil) 90 | assert.Ok(t, err) 91 | 92 | assert.EqualJson(t, app.State.HttpMonitors()[0], `{ 93 | "id": "49365a17244e", 94 | "created": "2020-02-20T14:02:00Z", 95 | "enabled": 
true, 96 | "url": "https://function61.com/", 97 | "find": "Welcome to the best page in the universe" 98 | }`) 99 | 100 | eventLog.AppendE( 101 | testStreamName, 102 | amdomain.NewHttpMonitorEnabledUpdated( 103 | "49365a17244e", 104 | false, 105 | ehevent.MetaSystemUser(t0))) 106 | 107 | assert.Ok(t, app.Reader.LoadUntilRealtime(ctx)) 108 | 109 | assert.Assert(t, !app.State.HttpMonitors()[0].Enabled) 110 | 111 | eventLog.AppendE( 112 | testStreamName, 113 | amdomain.NewHttpMonitorDeleted( 114 | "49365a17244e", 115 | ehevent.MetaSystemUser(t0))) 116 | 117 | assert.Ok(t, app.Reader.LoadUntilRealtime(ctx)) 118 | 119 | assert.Assert(t, len(app.State.HttpMonitors()) == 0) 120 | } 121 | 122 | func TestDeadMansSwitches(t *testing.T) { 123 | ctx := context.Background() 124 | 125 | eventLog := ehreadertest.NewEventLog() 126 | eventLog.AppendE( 127 | testStreamName, 128 | amdomain.NewDeadMansSwitchCreated( 129 | "Joonas checkins", 130 | t0.Add(2*time.Hour), 131 | ehevent.MetaSystemUser(t0))) 132 | 133 | app, err := LoadUntilRealtime(ctx, ehreader.NewTenantCtxWithSnapshots(ehreader.TenantId("42"), eventLog, ehreader.NewInMemSnapshotStore()), nil) 134 | assert.Ok(t, err) 135 | 136 | assert.EqualJson(t, app.State.DeadMansSwitches(), `[ 137 | { 138 | "subject": "Joonas checkins", 139 | "ttl": "2020-02-20T16:02:00Z" 140 | } 141 | ]`) 142 | 143 | eventLog.AppendE( 144 | testStreamName, 145 | amdomain.NewDeadMansSwitchCheckin( 146 | "Joonas checkins", 147 | t0.Add(3*time.Hour), 148 | ehevent.MetaSystemUser(t0))) 149 | 150 | assert.Ok(t, app.Reader.LoadUntilRealtime(ctx)) 151 | 152 | assert.EqualJson(t, app.State.DeadMansSwitches(), `[ 153 | { 154 | "subject": "Joonas checkins", 155 | "ttl": "2020-02-20T17:02:00Z" 156 | } 157 | ]`) 158 | 159 | switches := app.State.DeadMansSwitches() 160 | 161 | assert.Assert(t, len(GetExpiredDeadMansSwitches(switches, t0.Add(1*time.Hour))) == 0) 162 | assert.Assert(t, len(GetExpiredDeadMansSwitches(switches, t0.Add(2*time.Hour))) == 0) 163 | 
assert.Assert(t, len(GetExpiredDeadMansSwitches(switches, t0.Add(3*time.Hour))) == 1) 164 | 165 | eventLog.AppendE( 166 | testStreamName, 167 | amdomain.NewDeadMansSwitchDeleted( 168 | "Joonas checkins", 169 | ehevent.MetaSystemUser(t0))) 170 | 171 | assert.Ok(t, app.Reader.LoadUntilRealtime(ctx)) 172 | 173 | assert.EqualJson(t, app.State.DeadMansSwitches(), `[]`) 174 | } 175 | 176 | func TestGetUnnoticedAlerts(t *testing.T) { 177 | ctx := context.Background() 178 | 179 | eventLog := ehreadertest.NewEventLog() 180 | eventLog.AppendE( 181 | testStreamName, 182 | amdomain.NewAlertRaised( 183 | "a14308bba82f", 184 | "The building is on fire", 185 | "Fire sensor in room 456 went off", 186 | ehevent.MetaSystemUser(t0))) 187 | 188 | app, err := LoadUntilRealtime( 189 | ctx, 190 | ehreader.NewTenantCtxWithSnapshots( 191 | ehreader.TenantId("42"), 192 | eventLog, 193 | ehreader.NewInMemSnapshotStore()), 194 | nil) 195 | assert.Ok(t, err) 196 | 197 | unnoticedAlertCountT0Plus := func(plus time.Duration) int { 198 | return len(GetUnnoticedAlerts(app.State.ActiveAlerts(), t0.Add(plus))) 199 | } 200 | 201 | assert.Assert(t, unnoticedAlertCountT0Plus(0*time.Hour) == 0) 202 | assert.Assert(t, unnoticedAlertCountT0Plus(3*time.Hour) == 0) 203 | assert.Assert(t, unnoticedAlertCountT0Plus(4*time.Hour) == 1) 204 | 205 | assert.EqualJson(t, app.State.LastUnnoticedAlertsNotified(), `"0001-01-01T00:00:00Z"`) 206 | 207 | eventLog.AppendE(testStreamName, amdomain.NewUnnoticedAlertsNotified( 208 | []string{"a14308bba82f"}, 209 | ehevent.MetaSystemUser(t0))) 210 | 211 | assert.Ok(t, app.Reader.LoadUntilRealtime(ctx)) 212 | 213 | assert.EqualJson(t, app.State.LastUnnoticedAlertsNotified(), `"2020-02-20T14:02:00Z"`) 214 | } 215 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | 
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. 
For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright {yyyy} {name of copyright owner} 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /go.sum: -------------------------------------------------------------------------------- 1 | cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw= 2 | github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU= 3 | github.com/OneOfOne/xxhash v1.2.2/go.mod h1:HSdplMjZKSmBqAxg5vPj2TmRDmfkzw+cTzAElWljhcU= 4 | github.com/alecthomas/template v0.0.0-20160405071501-a0175ee3bccc/go.mod h1:LOuyumcjzFXgccqObfd/Ljyb9UuFJ6TxHnclSeseNhc= 5 | github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751/go.mod h1:LOuyumcjzFXgccqObfd/Ljyb9UuFJ6TxHnclSeseNhc= 6 | github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0= 7 | github.com/alecthomas/units v0.0.0-20190717042225-c3de453c63f4/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0= 8 | github.com/apcera/termtables v0.0.0-20170405184538-bcbc5dc54055 h1:IkPAzP+QjchKXXFX6LCcpDKa89b/e/0gPCUbQGWtUUY= 9 | github.com/apcera/termtables v0.0.0-20170405184538-bcbc5dc54055/go.mod 
h1:8mHYHlOef9UC51cK1/WRvE/iQVM8O8QlYFa8eh8r5I8= 10 | github.com/apex/gateway v1.1.1 h1:dPE3y2LQ/fSJuZikCOvekqXLyn/Wrbgt10MSECobH/Q= 11 | github.com/apex/gateway v1.1.1/go.mod h1:x7iPY22zu9D8sfrynawEwh1wZEO/kQTRaOM5ye02tWU= 12 | github.com/armon/consul-api v0.0.0-20180202201655-eb2c6b5be1b6/go.mod h1:grANhF5doyWs3UAsr3K4I6qtAmlQcZDesFNEHPZAzj8= 13 | github.com/aws/aws-lambda-go v1.13.2/go.mod h1:4UKl9IzQMoD+QF79YdCuzCwp8VbmG4VAQwij/eHl5CU= 14 | github.com/aws/aws-lambda-go v1.14.0 h1:kTr1VPabIgJsMVzHuZpNhs/5RR46LU6wyWUiHxtb3ag= 15 | github.com/aws/aws-lambda-go v1.14.0/go.mod h1:4UKl9IzQMoD+QF79YdCuzCwp8VbmG4VAQwij/eHl5CU= 16 | github.com/aws/aws-sdk-go v1.16.15/go.mod h1:KmX6BPdI08NWTb3/sm4ZGu5ShLoqVDhKgpiN924inxo= 17 | github.com/aws/aws-sdk-go v1.29.0 h1:UFxrMQhDyLak6kVtOcr4PZxNRQV0s7pY/vKAyzRvi8c= 18 | github.com/aws/aws-sdk-go v1.29.0/go.mod h1:1KvfttTE3SPKMpo8g2c6jL3ZKfXtFvKscTgahTma5Xg= 19 | github.com/beorn7/perks v0.0.0-20180321164747-3a771d992973/go.mod h1:Dwedo/Wpr24TaqPxmxbtue+5NUziq4I4S80YR8gNf3Q= 20 | github.com/beorn7/perks v1.0.0/go.mod h1:KWe93zE9D1o94FZ5RNwFwVgaQK1VOXiVxmqh+CedLV8= 21 | github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw= 22 | github.com/boltdb/bolt v1.3.1/go.mod h1:clJnj/oiGkjum5o1McbSZDSLxVThjynRyGBgiAx27Ps= 23 | github.com/cespare/xxhash v1.1.0/go.mod h1:XrSqR1VqqWfGrhpAt58auRo0WTKS1nRRg3ghfAqPWnc= 24 | github.com/cespare/xxhash/v2 v2.1.1/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs= 25 | github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw= 26 | github.com/coreos/bbolt v1.3.2/go.mod h1:iRUV2dpdMOn7Bo10OQBFzIJO9kkE559Wcmn+qkEiiKk= 27 | github.com/coreos/etcd v3.3.10+incompatible/go.mod h1:uF7uidLiAD3TWHmW31ZFd/JWoc32PjwdhPthX9715RE= 28 | github.com/coreos/go-etcd v2.0.0+incompatible/go.mod h1:Jez6KQU2B/sWsbdaef3ED8NzMklzPG4d5KIOhIy30Tk= 29 | github.com/coreos/go-semver v0.2.0/go.mod h1:nnelYz7RCh+5ahJtPPxZlU+153eP4D4r3EedlOD2RNk= 30 | 
github.com/coreos/go-systemd v0.0.0-20190321100706-95778dfbb74e/go.mod h1:F5haX7vjVVG0kc13fIWeqUViNPyEJxv/OmvnBo0Yme4= 31 | github.com/coreos/pkg v0.0.0-20180928190104-399ea9e2e55f/go.mod h1:E3G3o1h8I7cfcXa63jLwjI0eiQQMgzzUDFVpN/nH/eA= 32 | github.com/cpuguy83/go-md2man v1.0.10/go.mod h1:SmD6nW6nTyfqj6ABTjUi3V3JVMnlJmwcJI5acqYI6dE= 33 | github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= 34 | github.com/cpuguy83/go-md2man/v2 v2.0.0/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= 35 | github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 36 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 37 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 38 | github.com/dgrijalva/jwt-go v3.2.0+incompatible/go.mod h1:E3ru+11k8xSBh+hMPgOLZmtrrCbhqsmaPHjLKYnJCaQ= 39 | github.com/dgryski/go-sip13 v0.0.0-20181026042036-e10d5fee7954/go.mod h1:vAd38F8PWV+bWy6jNmig1y/TA+kYO4g3RSRF0IAv0no= 40 | github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo= 41 | github.com/function61/eventhorizon v0.2.1-0.20200227140656-f89fe5d462ca h1:FY7y7ZvUjrprByPMWhWMBmyr4htvTkQ5Jh8psrhrl/w= 42 | github.com/function61/eventhorizon v0.2.1-0.20200227140656-f89fe5d462ca/go.mod h1:SztwDAaqWnLPSFyVmV0zbBgRsvExr0GzrlOj9sWTC+Y= 43 | github.com/function61/gokit v0.0.0-20200226141201-fe205250686d h1:sbcDaH7t/trZpAv77EU9hl/rjjse+Mc70MO8MAhILnM= 44 | github.com/function61/gokit v0.0.0-20200226141201-fe205250686d/go.mod h1:jAkW2pwxa4N3qxEUDA3zDhfiJInBfF4BjsbkWAU13aU= 45 | github.com/function61/gokit v0.0.0-20200229115114-3eca8c87d0cc h1:FWizzRbJISAu9y2vKheno6BpHFkMnBl4mFl1nV9wKBI= 46 | github.com/function61/gokit v0.0.0-20200229115114-3eca8c87d0cc/go.mod h1:jAkW2pwxa4N3qxEUDA3zDhfiJInBfF4BjsbkWAU13aU= 47 | github.com/function61/gokit v0.0.0-20200307135016-6dd948616ce0 
h1:gWaoRNeHdRhJ8ELIwvI1xZcxdhaOTuRHr5Y9E7djSA4= 48 | github.com/function61/gokit v0.0.0-20200307135016-6dd948616ce0/go.mod h1:f6JhYQwMbwfAX472K/2A46CKet7BrpugMzuCobtqPQo= 49 | github.com/ghodss/yaml v1.0.0/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04= 50 | github.com/go-kit/kit v0.8.0/go.mod h1:xBxKIO96dXMWWy0MnWVtmwkA9/13aqxPnvrjFYMA2as= 51 | github.com/go-kit/kit v0.9.0/go.mod h1:xBxKIO96dXMWWy0MnWVtmwkA9/13aqxPnvrjFYMA2as= 52 | github.com/go-logfmt/logfmt v0.3.0/go.mod h1:Qt1PoO58o5twSAckw1HlFXLmHsOX5/0LbT9GBnD5lWE= 53 | github.com/go-logfmt/logfmt v0.4.0/go.mod h1:3RMwSq7FuexP4Kalkev3ejPJsZTpXXBr9+V4qmtdjCk= 54 | github.com/go-sql-driver/mysql v1.5.0/go.mod h1:DCzpHaOWr8IXmIStZouvnhqoel9Qv2LBy8hT2VhHyBg= 55 | github.com/go-stack/stack v1.8.0/go.mod h1:v0f6uXyyMGvRgIKkXu+yp6POWl0qKG85gN/melR3HDY= 56 | github.com/gogo/protobuf v1.1.1/go.mod h1:r8qH/GZQm5c6nD/R0oafs1akxWv10x8SbQlK7atdtwQ= 57 | github.com/gogo/protobuf v1.2.1/go.mod h1:hp+jE20tsWTFYpLwKvXlhS1hjn+gTNwPg2I6zVXpSg4= 58 | github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b/go.mod h1:SBH7ygxi8pfUlaOkMMuAQtPIUF8ecWP5IEl/CR7VP2Q= 59 | github.com/golang/groupcache v0.0.0-20190129154638-5b532d6fd5ef/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc= 60 | github.com/golang/mock v1.1.1/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A= 61 | github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= 62 | github.com/golang/protobuf v1.3.1/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= 63 | github.com/golang/protobuf v1.3.2/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= 64 | github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ= 65 | github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M= 66 | github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU= 67 | github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU= 68 | 
github.com/google/go-cmp v0.4.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= 69 | github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= 70 | github.com/gorilla/websocket v1.4.0/go.mod h1:E7qHFY5m1UJ88s3WnNqhKjPHQ0heANvMoAMk2YaljkQ= 71 | github.com/grpc-ecosystem/go-grpc-middleware v1.0.0/go.mod h1:FiyG127CGDf3tlThmgyCl78X/SZQqEOJBCDaAfeWzPs= 72 | github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0/go.mod h1:8NvIoxWQoOIhqOTXgfV/d3M/q6VIi02HzZEHgUlZvzk= 73 | github.com/grpc-ecosystem/grpc-gateway v1.9.0/go.mod h1:vNeuVxBJEsws4ogUvrchl83t/GYV9WGTSLVdBhOQFDY= 74 | github.com/hashicorp/hcl v1.0.0/go.mod h1:E5yfLk+7swimpb2L/Alb/PJmXilQ/rhwaUYs4T20WEQ= 75 | github.com/inconshreveable/mousetrap v1.0.0 h1:Z8tu5sraLXCXIcARxBp/8cbvlwVa7Z1NHg9XEKhtSvM= 76 | github.com/inconshreveable/mousetrap v1.0.0/go.mod h1:PxqpIevigyE2G7u3NXJIT2ANytuPF1OarO4DADm73n8= 77 | github.com/jmespath/go-jmespath v0.0.0-20180206201540-c2b33e8439af h1:pmfjZENx5imkbgOkpRUYLnmbU7UEFbjtDA2hxJ1ichM= 78 | github.com/jmespath/go-jmespath v0.0.0-20180206201540-c2b33e8439af/go.mod h1:Nht3zPeWKUH0NzdCt2Blrr5ys8VGpn0CEB0cQHVjt7k= 79 | github.com/jonboulle/clockwork v0.1.0/go.mod h1:Ii8DK3G1RaLaWxj9trq07+26W01tbo22gdxWY5EU2bo= 80 | github.com/jpillora/backoff v1.0.0/go.mod h1:J/6gKK9jxlEcS3zixgDgUAsiuZ7yrSoa/FX5e0EB2j4= 81 | github.com/json-iterator/go v1.1.6/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCVDaaPEHmU= 82 | github.com/json-iterator/go v1.1.7/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4= 83 | github.com/json-iterator/go v1.1.9/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4= 84 | github.com/julienschmidt/httprouter v1.2.0/go.mod h1:SYymIcj16QtmaHHD7aYtjjsJG7VTCxuUUipMqKk8s4w= 85 | github.com/kisielk/errcheck v1.1.0/go.mod h1:EZBBE59ingxPouuu3KfxchcWSUPOHkagtvWXihfKN4Q= 86 | github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck= 87 | github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod 
h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= 88 | github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFBFZlji/RkVcI2GknAs/DXo4wKdlNEc= 89 | github.com/kr/pretty v0.1.0 h1:L/CwN0zerZDmRFUapSPitk6f+Q3+0za1rQkzVuMiMFI= 90 | github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo= 91 | github.com/kr/pretty v0.2.0 h1:s5hAObm+yFO5uHYt5dYjxi2rXrsnmRpJx4OYvIWUaQs= 92 | github.com/kr/pretty v0.2.0/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI= 93 | github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ= 94 | github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE= 95 | github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI= 96 | github.com/magiconair/properties v1.8.0/go.mod h1:PppfXfuXeibc/6YijjN8zIbojt8czPbwD3XqdrwzmxQ= 97 | github.com/mattn/go-runewidth v0.0.8 h1:3tS41NlGYSmhhe/8fhGRzc+z3AYCw1Fe1WAyLuujKs0= 98 | github.com/mattn/go-runewidth v0.0.8/go.mod h1:H031xJmbD/WCDINGzjvQ9THkh0rPKHF+m2gUSrubnMI= 99 | github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0= 100 | github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0= 101 | github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y= 102 | github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 103 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 104 | github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0= 105 | github.com/modern-go/reflect2 v1.0.1/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0= 106 | github.com/mwitkow/go-conntrack v0.0.0-20161129095857-cc309e4a2223/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U= 107 | github.com/oklog/ulid 
v1.3.1/go.mod h1:CirwcVhetQ6Lv90oh/F+FBtV6XMibvdAFo93nm5qn4U= 108 | github.com/patrickmn/go-cache v2.1.0+incompatible/go.mod h1:3Qf8kWWT7OJRJbdiICTKqZju1ZixQ/KpMGzzAfe6+WQ= 109 | github.com/pelletier/go-toml v1.2.0/go.mod h1:5z9KED0ma1S8pY6P1sdut58dfprrGBbd/94hg7ilaic= 110 | github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= 111 | github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= 112 | github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4= 113 | github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= 114 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 115 | github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 116 | github.com/prometheus/client_golang v0.9.1/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw= 117 | github.com/prometheus/client_golang v0.9.3/go.mod h1:/TN21ttK/J9q6uSwhBd54HahCDft0ttaMvbicHlPoso= 118 | github.com/prometheus/client_golang v1.0.0/go.mod h1:db9x61etRT2tGnBNRi70OPL5FsnadC4Ky3P0J6CfImo= 119 | github.com/prometheus/client_golang v1.1.0/go.mod h1:I1FGZT9+L76gKKOs5djB6ezCbFQP1xR9D75/vuwEF3g= 120 | github.com/prometheus/client_golang v1.4.1/go.mod h1:e9GMxYsXl05ICDXkRhurwBS4Q3OK1iX/F2sw+iXX5zU= 121 | github.com/prometheus/client_model v0.0.0-20180712105110-5c3871d89910/go.mod h1:MbSGuTsp3dbXC40dX6PRTWyKYBIrTGTE9sqQNg2J8bo= 122 | github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA= 123 | github.com/prometheus/client_model v0.2.0/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA= 124 | github.com/prometheus/common v0.0.0-20181113130724-41aa239b4cce/go.mod h1:daVV7qP5qjZbuso7PdcryaAu0sAZbrN9i7WWcTMWvro= 125 | github.com/prometheus/common v0.4.0/go.mod h1:TNfzLD0ON7rHzMJeJkieUDPYmFC7Snx/y86RQel1bk4= 126 | github.com/prometheus/common v0.4.1/go.mod 
h1:TNfzLD0ON7rHzMJeJkieUDPYmFC7Snx/y86RQel1bk4= 127 | github.com/prometheus/common v0.6.0/go.mod h1:eBmuwkDJBwy6iBfxCBob6t6dR6ENT/y+J+Zk0j9GMYc= 128 | github.com/prometheus/common v0.9.1/go.mod h1:yhUN8i9wzaXS3w1O07YhxHEBxD+W35wd8bs7vj7HSQ4= 129 | github.com/prometheus/procfs v0.0.0-20181005140218-185b4288413d/go.mod h1:c3At6R/oaqEKCNdg8wHV1ftS6bRYblBhIjjI8uT2IGk= 130 | github.com/prometheus/procfs v0.0.0-20190507164030-5867b95ac084/go.mod h1:TjEm7ze935MbeOT/UhFTIMYKhuLP4wbCsTZCD3I8kEA= 131 | github.com/prometheus/procfs v0.0.2/go.mod h1:TjEm7ze935MbeOT/UhFTIMYKhuLP4wbCsTZCD3I8kEA= 132 | github.com/prometheus/procfs v0.0.3/go.mod h1:4A/X28fw3Fc593LaREMrKMqOKvUAntwMDaekg4FpcdQ= 133 | github.com/prometheus/procfs v0.0.8/go.mod h1:7Qr8sr6344vo1JqZ6HhLceV9o3AJ1Ff+GxbHq6oeK9A= 134 | github.com/prometheus/tsdb v0.7.1/go.mod h1:qhTCs0VvXwvX/y3TZrWD7rabWM+ijKTux40TwIPHuXU= 135 | github.com/rogpeppe/fastuuid v0.0.0-20150106093220-6724a57986af/go.mod h1:XWv6SoW27p1b0cqNHllgS5HIMJraePCO15w5zCzIWYg= 136 | github.com/russross/blackfriday v1.5.2/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR/rfWxYHBV53g= 137 | github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= 138 | github.com/scylladb/termtables v1.0.0 h1:uUnesUY4V1VPCotpOQLb1LjTXVvzwy7Ramx8K8+w+8U= 139 | github.com/scylladb/termtables v1.0.0/go.mod h1:C1a7PQSMz9NShzorzCiG2fk9+xuCgLkPeCvMHYR2OWg= 140 | github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc= 141 | github.com/sirupsen/logrus v1.2.0/go.mod h1:LxeOpSwHxABJmUn/MG1IvRgCAasNZTLOkJPxbbu5VWo= 142 | github.com/sirupsen/logrus v1.4.2/go.mod h1:tLMulIdttU9McNUspp0xgXVQah82FyeX6MwdIuYE2rE= 143 | github.com/soheilhy/cmux v0.1.4/go.mod h1:IM3LyeVVIOuxMH7sFAkER9+bJ4dT7Ms6E4xg4kGIyLM= 144 | github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA= 145 | github.com/spf13/afero v1.1.2/go.mod 
h1:j4pytiNVoe2o6bmDsKpLACNPDBIoEAkihy7loJ1B0CQ= 146 | github.com/spf13/cast v1.3.0/go.mod h1:Qx5cxh0v+4UWYiBimWS+eyWzqEqokIECu5etghLkUJE= 147 | github.com/spf13/cobra v0.0.5/go.mod h1:3K3wKZymM7VvHMDS9+Akkh4K60UwM26emMESw8tLCHU= 148 | github.com/spf13/cobra v0.0.6 h1:breEStsVwemnKh2/s6gMvSdMEkwW0sK8vGStnlVBMCs= 149 | github.com/spf13/cobra v0.0.6/go.mod h1:/6GTrnGXV9HjY+aR4k0oJ5tcvakLuG6EuKReYlHNrgE= 150 | github.com/spf13/jwalterweatherman v1.0.0/go.mod h1:cQK4TGJAtQXfYWX+Ddv3mKDzgVb68N+wFjFa4jdeBTo= 151 | github.com/spf13/pflag v1.0.3 h1:zPAT6CGy6wXeQ7NtTnaTerfKOsV6V6F8agHXFiazDkg= 152 | github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= 153 | github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA= 154 | github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg= 155 | github.com/spf13/viper v1.3.2/go.mod h1:ZiWeW+zYFKm7srdB9IoDzzZXaJaI5eL9QjNiN/DMA2s= 156 | github.com/spf13/viper v1.4.0/go.mod h1:PTJ7Z/lr49W6bUbkmS1V3by4uWynFiR9p7+dSq/yZzE= 157 | github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 158 | github.com/stretchr/objx v0.1.1/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 159 | github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs= 160 | github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= 161 | github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4= 162 | github.com/stretchr/testify v1.5.1 h1:nOGnQDM7FYENwehXlg/kFVnos3rEvtKTjRvOWSzb6H4= 163 | github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA= 164 | github.com/tj/assert v0.0.0-20190920132354-ee03d75cd160 h1:NSWpaDaurcAJY7PkL8Xt0PhZE7qpvbZl5ljd8r6U0bI= 165 | github.com/tj/assert v0.0.0-20190920132354-ee03d75cd160/go.mod h1:mZ9/Rh9oLWpLLDRpvE+3b7gP/C2YyLFYxNmcLnPTMe0= 166 | github.com/tmc/grpc-websocket-proxy 
v0.0.0-20190109142713-0ad062ec5ee5/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U= 167 | github.com/ugorji/go v1.1.4/go.mod h1:uQMGLiO92mf5W77hV/PUCpI3pbzQx3CRekS0kk+RGrc= 168 | github.com/ugorji/go/codec v0.0.0-20181204163529-d75b2dcb6bc8/go.mod h1:VFNgLljTbGfSG7qAOspJ7OScBnGdDN/yBr0sguwnwf0= 169 | github.com/urfave/cli v1.22.1/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0= 170 | github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU= 171 | github.com/xordataexchange/crypt v0.0.3-0.20170626215501-b2862e3d0a77/go.mod h1:aYKd//L2LvnjZzWKhF00oedf4jCCReLcmhLdhm1A27Q= 172 | go.etcd.io/bbolt v1.3.2/go.mod h1:IbVyRI1SCnLcuJnV2u8VeU0CEYM7e686BmAb1XKL+uU= 173 | go.uber.org/atomic v1.4.0/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE= 174 | go.uber.org/multierr v1.1.0/go.mod h1:wR5kodmAFQ0UK8QlbwjlSNy0Z68gJhDJUG5sjR94q/0= 175 | go.uber.org/zap v1.10.0/go.mod h1:vwi/ZaCAaUcBkycHslxD9B2zi4UTXhF60s6SWpuDF0Q= 176 | golang.org/x/crypto v0.0.0-20180904163835-0709b304e793/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4= 177 | golang.org/x/crypto v0.0.0-20181203042331-505ab145d0a9/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4= 178 | golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w= 179 | golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= 180 | golang.org/x/lint v0.0.0-20190313153728-d0100b6bd8b3/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc= 181 | golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= 182 | golang.org/x/net v0.0.0-20181114220301-adae6a3d119a/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= 183 | golang.org/x/net v0.0.0-20181220203305-927f97764cc3/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= 184 | golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod 
h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= 185 | golang.org/x/net v0.0.0-20190522155817-f3200d17e092/go.mod h1:HSz+uSET+XFnRR8LxR5pz3Of3rY3CfYBVs4xY44aLks= 186 | golang.org/x/net v0.0.0-20190613194153-d28f0bde5980/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= 187 | golang.org/x/net v0.0.0-20200202094626-16171245cfb2/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= 188 | golang.org/x/net v0.0.0-20200226121028-0de0cce0169b h1:0mm1VjtFUOIlE1SbDlwjYaDxZVDP2S5ou6y0gSgXHu8= 189 | golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= 190 | golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= 191 | golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 192 | golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 193 | golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 194 | golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 195 | golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 196 | golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 197 | golang.org/x/sys v0.0.0-20181107165924-66b7b1311ac8/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 198 | golang.org/x/sys v0.0.0-20181116152217-5ac8a444bdc5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 199 | golang.org/x/sys v0.0.0-20181205085412-a5c9d58dba9a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 200 | golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 201 | golang.org/x/sys v0.0.0-20190422165155-953cdadca894/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 202 | golang.org/x/sys 
v0.0.0-20190801041406-cbf593c0f2f3/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 203 | golang.org/x/sys v0.0.0-20200121082415-34d275377bf9/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 204 | golang.org/x/sys v0.0.0-20200122134326-e047566fdf82/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 205 | golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg= 206 | golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= 207 | golang.org/x/text v0.3.2 h1:tW2bmiBqwgJj/UpqtC8EpXEZVYOwU0yG4iWbprSVAcs= 208 | golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk= 209 | golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= 210 | golang.org/x/tools v0.0.0-20180221164845-07fd8470d635/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 211 | golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 212 | golang.org/x/tools v0.0.0-20190114222345-bf090417da8b/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 213 | golang.org/x/tools v0.0.0-20190311212946-11955173bddd/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs= 214 | golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 215 | google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM= 216 | google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc= 217 | google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c= 218 | google.golang.org/grpc v1.21.0/go.mod h1:oYelfM1adQP15Ek0mdvEgi9Df8B9CZIaU1084ijfRaM= 219 | gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw= 220 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 221 | gopkg.in/check.v1 
v1.0.0-20180628173108-788fd7840127 h1:qIbj1fsPNlZgppZ+VLlY7N33q108Sa+fhmuc+sWQYwY= 222 | gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 223 | gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 h1:YR8cESwS4TdDjEe65xsg0ogRM/Nc3DYOhEAlW+xobZo= 224 | gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 225 | gopkg.in/resty.v1 v1.12.0/go.mod h1:mDo4pnntr5jdWRML875a/NmxYqAlA73dVijT2AXvQQo= 226 | gopkg.in/yaml.v2 v2.0.0-20170812160011-eb3733d160e7/go.mod h1:JAlM8MvJe8wmxCU4Bli9HhUf9+ttbYbLASfIpnQbh74= 227 | gopkg.in/yaml.v2 v2.2.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 228 | gopkg.in/yaml.v2 v2.2.2 h1:ZCJp+EgiOT7lHqUV2J862kp8Qj64Jo6az82+3Td9dZw= 229 | gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 230 | gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 231 | gopkg.in/yaml.v2 v2.2.5/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 232 | gopkg.in/yaml.v2 v2.2.7 h1:VUgggvou5XRW9mHwD/yXxIYSMtY0zoKQf/v226p2nyo= 233 | gopkg.in/yaml.v2 v2.2.7/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 234 | honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= 235 | --------------------------------------------------------------------------------