├── .dockerignore ├── requirements.txt ├── .gitignore ├── images ├── twittergraph.png └── twittergraph_new.png ├── grafana-dashboard-provider.yml ├── BUILDING.md ├── influxdb-datasource.yml ├── app.py ├── twittergraph-dashboard.json └── README.md /.dockerignore: -------------------------------------------------------------------------------- 1 | *.yml 2 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | tweepy 2 | influxdb 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | source_me 2 | secrets.yml 3 | __pycache__/ 4 | *.key 5 | *.crt 6 | *.csr 7 | *.pem 8 | -------------------------------------------------------------------------------- /images/twittergraph.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/clcollins/twitterGraph/HEAD/images/twittergraph.png -------------------------------------------------------------------------------- /images/twittergraph_new.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/clcollins/twitterGraph/HEAD/images/twittergraph_new.png -------------------------------------------------------------------------------- /grafana-dashboard-provider.yml: -------------------------------------------------------------------------------- 1 | apiVersion: 1 2 | 3 | providers: 4 | - name: 'default' 5 | orgId: 1 6 | folder: '' 7 | type: file 8 | disableDeletion: false 9 | updateIntervalSeconds: 10 #how often Grafana will scan for changed dashboards 10 | options: 11 | path: /var/lib/grafana/dashboards 12 | -------------------------------------------------------------------------------- /BUILDING.md: 
-------------------------------------------------------------------------------- 1 | # Building the TwitterGraph Docker Image 2 | 3 | A part of this example project is a Docker image for the TwitterGraph Python script, run as a cron job to check Twitter stats. The image can be pulled from Docker Hub: `docker pull clcollins/twittergraph:1.0`, or it can be built using [Source to Image](https://github.com/openshift/source-to-image). 4 | 5 | ## Building with Source to Image 6 | 7 | The `clcollins/twittergraph:1.0` Docker image is built using the [CentOS Python 3.6 S2I image](https://github.com/sclorg/s2i-python-container/tree/master/3.6), which contains the scripts necessary to set up Python apps during the build. (Read more about Source to Image for details). 8 | 9 | To build your own image with Source to Image, install Source to Image and run `s2i build https://github.com/clcollins/twittergraph centos/python-36-centos7 twittergraph:1.0`. 10 | -------------------------------------------------------------------------------- /influxdb-datasource.yml: -------------------------------------------------------------------------------- 1 | # config file version 2 | apiVersion: 1 3 | 4 | # list of datasources to insert/update depending 5 | # what's available in the database 6 | datasources: 7 | # name of the datasource. Required 8 | - name: influxdb 9 | # datasource type. Required 10 | type: influxdb 11 | # access mode. proxy or direct (Server or Browser in the UI). Required 12 | access: proxy 13 | # org id. will default to orgId 1 if not specified 14 | orgId: 1 15 | # url 16 | url: http://influxdb:8086 17 | # database password, if used 18 | password: root 19 | # database user, if used 20 | user: root 21 | # database name, if used 22 | database: twittergraph 23 | # version 24 | version: 1 25 | # allow users to edit datasources from the UI. 
26 | editable: false 27 | 28 | -------------------------------------------------------------------------------- /app.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | import os 4 | import sys 5 | import tweepy 6 | from datetime import datetime 7 | from influxdb import InfluxDBClient 8 | 9 | 10 | def parseConfig(): 11 | """Parse the environment variables and return them as a dictionary.""" 12 | twitter_auth = ['TWITTER_API_KEY', 13 | 'TWITTER_API_SECRET', 14 | 'TWITTER_ACCESS_TOKEN', 15 | 'TWITTER_ACCESS_SECRET'] 16 | 17 | twitter_user = ['TWITTER_USER'] 18 | 19 | influx_auth = ['INFLUXDB_HOST', 20 | 'INFLUXDB_DATABASE', 21 | 'INFLUXDB_USERNAME', 22 | 'INFLUXDB_PASSWORD'] 23 | 24 | data = {} 25 | 26 | for i in twitter_auth, twitter_user, influx_auth: 27 | for k in i: 28 | if k not in os.environ: 29 | raise Exception('{} not found in environment'.format(k)) 30 | else: 31 | data[k] = os.environ[k] 32 | 33 | return(data) 34 | 35 | 36 | def twitterApi(api_key, api_secret, access_token, access_secret): 37 | """Authenticate and create a Twitter session.""" 38 | 39 | auth = tweepy.OAuthHandler(api_key, api_secret) 40 | auth.set_access_token(access_token, access_secret) 41 | 42 | return tweepy.API(auth) 43 | 44 | 45 | def getUser(twitter_api, user): 46 | """Query the Twitter API for the user's stats.""" 47 | return twitter_api.get_user(user) 48 | 49 | 50 | def createInfluxDB(client, db_name): 51 | """Create the database if it doesn't exist.""" 52 | dbs = client.get_list_database() 53 | if not any(db['name'] == db_name for db in dbs): 54 | client.create_database(db_name) 55 | client.switch_database(db_name) 56 | 57 | 58 | def initDBClient(host, db, user, password): 59 | """Create an InfluxDB client connection.""" 60 | 61 | client = InfluxDBClient(host, 8086, user, password, db) 62 | 63 | return(client) 64 | 65 | 66 | def createPoint(username, measurement, value, time): 67 | """Create a datapoint.""" 68 | 
json_body = { 69 | "measurement": measurement, 70 | "tags": { 71 | "user": username 72 | }, 73 | "time": time, 74 | "fields": { 75 | "value": value 76 | } 77 | } 78 | 79 | return json_body 80 | 81 | 82 | def main(): 83 | """Do the main.""" 84 | data = parseConfig() 85 | time = datetime.utcnow().strftime('%Y-%m-%dT%H:%M:%SZ') 86 | 87 | twitter = twitterApi(data['TWITTER_API_KEY'], 88 | data['TWITTER_API_SECRET'], 89 | data['TWITTER_ACCESS_TOKEN'], 90 | data['TWITTER_ACCESS_SECRET']) 91 | 92 | userdata = getUser(twitter, data['TWITTER_USER']) 93 | 94 | client = initDBClient(data['INFLUXDB_HOST'], 95 | data['INFLUXDB_DATABASE'], 96 | data['INFLUXDB_USERNAME'], 97 | data['INFLUXDB_PASSWORD']) 98 | 99 | createInfluxDB(client, data['INFLUXDB_DATABASE']) 100 | 101 | json_body = [] 102 | 103 | data_points = { 104 | "followers_count": userdata.followers_count, 105 | "friends_count": userdata.friends_count, 106 | "listed_count": userdata.listed_count, 107 | "favourites_count": userdata.favourites_count, 108 | "statuses_count": userdata.statuses_count 109 | } 110 | 111 | for key, value in data_points.items(): 112 | json_body.append(createPoint(data['TWITTER_USER'], 113 | key, 114 | value, 115 | time)) 116 | 117 | client.write_points(json_body) 118 | 119 | 120 | if __name__ == "__main__": 121 | sys.exit(main()) 122 | -------------------------------------------------------------------------------- /twittergraph-dashboard.json: -------------------------------------------------------------------------------- 1 | { 2 | "__requires": [ 3 | { 4 | "type": "grafana", 5 | "id": "grafana", 6 | "name": "Grafana", 7 | "version": "5.3.2" 8 | }, 9 | { 10 | "type": "panel", 11 | "id": "graph", 12 | "name": "Graph", 13 | "version": "5.0.0" 14 | }, 15 | { 16 | "type": "datasource", 17 | "id": "influxdb", 18 | "name": "InfluxDB", 19 | "version": "5.0.0" 20 | }, 21 | { 22 | "type": "panel", 23 | "id": "singlestat", 24 | "name": "Singlestat", 25 | "version": "5.0.0" 26 | } 27 | ], 28 | 
"annotations": { 29 | "list": [ 30 | { 31 | "builtIn": 1, 32 | "datasource": "-- Grafana --", 33 | "enable": true, 34 | "hide": true, 35 | "iconColor": "rgba(0, 211, 255, 1)", 36 | "name": "Annotations & Alerts", 37 | "type": "dashboard" 38 | } 39 | ] 40 | }, 41 | "editable": true, 42 | "gnetId": null, 43 | "graphTooltip": 0, 44 | "id": null, 45 | "links": [], 46 | "panels": [ 47 | { 48 | "cacheTimeout": null, 49 | "colorBackground": false, 50 | "colorValue": false, 51 | "colors": [ 52 | "#299c46", 53 | "rgba(237, 129, 40, 0.89)", 54 | "#d44a3a" 55 | ], 56 | "datasource": "influxdb", 57 | "format": "none", 58 | "gauge": { 59 | "maxValue": 100, 60 | "minValue": 0, 61 | "show": false, 62 | "thresholdLabels": false, 63 | "thresholdMarkers": true 64 | }, 65 | "gridPos": { 66 | "h": 5, 67 | "w": 5, 68 | "x": 0, 69 | "y": 0 70 | }, 71 | "id": 2, 72 | "interval": "", 73 | "links": [], 74 | "mappingType": 1, 75 | "mappingTypes": [ 76 | { 77 | "name": "value to text", 78 | "value": 1 79 | }, 80 | { 81 | "name": "range to text", 82 | "value": 2 83 | } 84 | ], 85 | "maxDataPoints": 100, 86 | "nullPointMode": "connected", 87 | "nullText": null, 88 | "postfix": "", 89 | "postfixFontSize": "50%", 90 | "prefix": "", 91 | "prefixFontSize": "50%", 92 | "rangeMaps": [ 93 | { 94 | "from": "null", 95 | "text": "N/A", 96 | "to": "null" 97 | } 98 | ], 99 | "sparkline": { 100 | "fillColor": "rgba(31, 118, 189, 0.18)", 101 | "full": false, 102 | "lineColor": "rgb(31, 120, 193)", 103 | "show": false 104 | }, 105 | "tableColumn": "", 106 | "targets": [ 107 | { 108 | "groupBy": [ 109 | { 110 | "params": [ 111 | "$__interval" 112 | ], 113 | "type": "time" 114 | }, 115 | { 116 | "params": [ 117 | "null" 118 | ], 119 | "type": "fill" 120 | } 121 | ], 122 | "measurement": "followers_count", 123 | "orderByTime": "ASC", 124 | "policy": "default", 125 | "refId": "A", 126 | "resultFormat": "time_series", 127 | "select": [ 128 | [ 129 | { 130 | "params": [ 131 | "value" 132 | ], 133 | "type": "field" 
134 | }, 135 | { 136 | "params": [], 137 | "type": "last" 138 | } 139 | ] 140 | ], 141 | "tags": [] 142 | } 143 | ], 144 | "thresholds": "", 145 | "title": "Current Followers", 146 | "type": "singlestat", 147 | "valueFontSize": "200%", 148 | "valueMaps": [ 149 | { 150 | "op": "=", 151 | "text": "N/A", 152 | "value": "null" 153 | } 154 | ], 155 | "valueName": "current" 156 | }, 157 | { 158 | "aliasColors": {}, 159 | "bars": false, 160 | "dashLength": 10, 161 | "dashes": false, 162 | "datasource": "influxdb", 163 | "fill": 1, 164 | "gridPos": { 165 | "h": 5, 166 | "w": 19, 167 | "x": 5, 168 | "y": 0 169 | }, 170 | "id": 4, 171 | "legend": { 172 | "avg": false, 173 | "current": false, 174 | "max": false, 175 | "min": false, 176 | "show": true, 177 | "total": false, 178 | "values": false 179 | }, 180 | "lines": true, 181 | "linewidth": 1, 182 | "links": [], 183 | "nullPointMode": "null", 184 | "percentage": false, 185 | "pointradius": 5, 186 | "points": false, 187 | "renderer": "flot", 188 | "seriesOverrides": [], 189 | "spaceLength": 10, 190 | "stack": false, 191 | "steppedLine": false, 192 | "targets": [ 193 | { 194 | "groupBy": [ 195 | { 196 | "params": [ 197 | "$__interval" 198 | ], 199 | "type": "time" 200 | }, 201 | { 202 | "params": [ 203 | "null" 204 | ], 205 | "type": "fill" 206 | } 207 | ], 208 | "measurement": "followers_count", 209 | "orderByTime": "ASC", 210 | "policy": "default", 211 | "refId": "A", 212 | "resultFormat": "time_series", 213 | "select": [ 214 | [ 215 | { 216 | "params": [ 217 | "value" 218 | ], 219 | "type": "field" 220 | }, 221 | { 222 | "params": [], 223 | "type": "distinct" 224 | } 225 | ] 226 | ], 227 | "tags": [] 228 | } 229 | ], 230 | "thresholds": [], 231 | "timeFrom": null, 232 | "timeShift": null, 233 | "title": "Followers Over Time", 234 | "tooltip": { 235 | "shared": true, 236 | "sort": 0, 237 | "value_type": "individual" 238 | }, 239 | "type": "graph", 240 | "xaxis": { 241 | "buckets": null, 242 | "mode": "time", 243 | "name": 
null, 244 | "show": true, 245 | "values": [] 246 | }, 247 | "yaxes": [ 248 | { 249 | "format": "none", 250 | "label": null, 251 | "logBase": 1, 252 | "max": null, 253 | "min": null, 254 | "show": true 255 | }, 256 | { 257 | "format": "none", 258 | "label": null, 259 | "logBase": 1, 260 | "max": null, 261 | "min": null, 262 | "show": true 263 | } 264 | ], 265 | "yaxis": { 266 | "align": false, 267 | "alignLevel": null 268 | } 269 | }, 270 | { 271 | "cacheTimeout": null, 272 | "colorBackground": false, 273 | "colorValue": false, 274 | "colors": [ 275 | "#299c46", 276 | "rgba(237, 129, 40, 0.89)", 277 | "#d44a3a" 278 | ], 279 | "datasource": "influxdb", 280 | "format": "none", 281 | "gauge": { 282 | "maxValue": 100, 283 | "minValue": 0, 284 | "show": false, 285 | "thresholdLabels": false, 286 | "thresholdMarkers": true 287 | }, 288 | "gridPos": { 289 | "h": 5, 290 | "w": 5, 291 | "x": 0, 292 | "y": 5 293 | }, 294 | "id": 8, 295 | "interval": null, 296 | "links": [], 297 | "mappingType": 1, 298 | "mappingTypes": [ 299 | { 300 | "name": "value to text", 301 | "value": 1 302 | }, 303 | { 304 | "name": "range to text", 305 | "value": 2 306 | } 307 | ], 308 | "maxDataPoints": 100, 309 | "nullPointMode": "connected", 310 | "nullText": null, 311 | "postfix": "", 312 | "postfixFontSize": "50%", 313 | "prefix": "", 314 | "prefixFontSize": "50%", 315 | "rangeMaps": [ 316 | { 317 | "from": "null", 318 | "text": "N/A", 319 | "to": "null" 320 | } 321 | ], 322 | "sparkline": { 323 | "fillColor": "rgba(31, 118, 189, 0.18)", 324 | "full": false, 325 | "lineColor": "rgb(31, 120, 193)", 326 | "show": false 327 | }, 328 | "tableColumn": "", 329 | "targets": [ 330 | { 331 | "groupBy": [ 332 | { 333 | "params": [ 334 | "$__interval" 335 | ], 336 | "type": "time" 337 | }, 338 | { 339 | "params": [ 340 | "null" 341 | ], 342 | "type": "fill" 343 | } 344 | ], 345 | "measurement": "friends_count", 346 | "orderByTime": "ASC", 347 | "policy": "default", 348 | "refId": "A", 349 | "resultFormat": 
"time_series", 350 | "select": [ 351 | [ 352 | { 353 | "params": [ 354 | "value" 355 | ], 356 | "type": "field" 357 | }, 358 | { 359 | "params": [], 360 | "type": "distinct" 361 | } 362 | ] 363 | ], 364 | "tags": [] 365 | } 366 | ], 367 | "thresholds": "", 368 | "title": "Current Friends", 369 | "type": "singlestat", 370 | "valueFontSize": "200%", 371 | "valueMaps": [ 372 | { 373 | "op": "=", 374 | "text": "N/A", 375 | "value": "null" 376 | } 377 | ], 378 | "valueName": "avg" 379 | }, 380 | { 381 | "aliasColors": {}, 382 | "bars": false, 383 | "dashLength": 10, 384 | "dashes": false, 385 | "datasource": "influxdb", 386 | "fill": 1, 387 | "gridPos": { 388 | "h": 5, 389 | "w": 19, 390 | "x": 5, 391 | "y": 5 392 | }, 393 | "id": 6, 394 | "legend": { 395 | "avg": false, 396 | "current": false, 397 | "max": false, 398 | "min": false, 399 | "show": true, 400 | "total": false, 401 | "values": false 402 | }, 403 | "lines": true, 404 | "linewidth": 1, 405 | "links": [], 406 | "nullPointMode": "null", 407 | "percentage": false, 408 | "pointradius": 5, 409 | "points": false, 410 | "renderer": "flot", 411 | "seriesOverrides": [], 412 | "spaceLength": 10, 413 | "stack": false, 414 | "steppedLine": false, 415 | "targets": [ 416 | { 417 | "groupBy": [ 418 | { 419 | "params": [ 420 | "$__interval" 421 | ], 422 | "type": "time" 423 | }, 424 | { 425 | "params": [ 426 | "null" 427 | ], 428 | "type": "fill" 429 | } 430 | ], 431 | "measurement": "friends_count", 432 | "orderByTime": "ASC", 433 | "policy": "default", 434 | "refId": "A", 435 | "resultFormat": "time_series", 436 | "select": [ 437 | [ 438 | { 439 | "params": [ 440 | "value" 441 | ], 442 | "type": "field" 443 | }, 444 | { 445 | "params": [], 446 | "type": "distinct" 447 | } 448 | ] 449 | ], 450 | "tags": [] 451 | } 452 | ], 453 | "thresholds": [], 454 | "timeFrom": null, 455 | "timeShift": null, 456 | "title": "Friends Count over Time", 457 | "tooltip": { 458 | "shared": true, 459 | "sort": 0, 460 | "value_type": 
"individual" 461 | }, 462 | "type": "graph", 463 | "xaxis": { 464 | "buckets": null, 465 | "mode": "time", 466 | "name": null, 467 | "show": true, 468 | "values": [] 469 | }, 470 | "yaxes": [ 471 | { 472 | "format": "none", 473 | "label": null, 474 | "logBase": 1, 475 | "max": null, 476 | "min": null, 477 | "show": true 478 | }, 479 | { 480 | "format": "none", 481 | "label": null, 482 | "logBase": 1, 483 | "max": null, 484 | "min": null, 485 | "show": true 486 | } 487 | ], 488 | "yaxis": { 489 | "align": false, 490 | "alignLevel": null 491 | } 492 | }, 493 | { 494 | "cacheTimeout": null, 495 | "colorBackground": false, 496 | "colorValue": false, 497 | "colors": [ 498 | "#299c46", 499 | "rgba(237, 129, 40, 0.89)", 500 | "#d44a3a" 501 | ], 502 | "datasource": "influxdb", 503 | "format": "none", 504 | "gauge": { 505 | "maxValue": 100, 506 | "minValue": 0, 507 | "show": false, 508 | "thresholdLabels": false, 509 | "thresholdMarkers": true 510 | }, 511 | "gridPos": { 512 | "h": 5, 513 | "w": 5, 514 | "x": 0, 515 | "y": 10 516 | }, 517 | "id": 9, 518 | "interval": "", 519 | "links": [], 520 | "mappingType": 1, 521 | "mappingTypes": [ 522 | { 523 | "name": "value to text", 524 | "value": 1 525 | }, 526 | { 527 | "name": "range to text", 528 | "value": 2 529 | } 530 | ], 531 | "maxDataPoints": 100, 532 | "nullPointMode": "connected", 533 | "nullText": null, 534 | "postfix": "", 535 | "postfixFontSize": "50%", 536 | "prefix": "", 537 | "prefixFontSize": "50%", 538 | "rangeMaps": [ 539 | { 540 | "from": "null", 541 | "text": "N/A", 542 | "to": "null" 543 | } 544 | ], 545 | "sparkline": { 546 | "fillColor": "rgba(31, 118, 189, 0.18)", 547 | "full": false, 548 | "lineColor": "rgb(31, 120, 193)", 549 | "show": false 550 | }, 551 | "tableColumn": "", 552 | "targets": [ 553 | { 554 | "groupBy": [ 555 | { 556 | "params": [ 557 | "$__interval" 558 | ], 559 | "type": "time" 560 | }, 561 | { 562 | "params": [ 563 | "null" 564 | ], 565 | "type": "fill" 566 | } 567 | ], 568 | 
"measurement": "statuses_count", 569 | "orderByTime": "ASC", 570 | "policy": "default", 571 | "refId": "A", 572 | "resultFormat": "time_series", 573 | "select": [ 574 | [ 575 | { 576 | "params": [ 577 | "value" 578 | ], 579 | "type": "field" 580 | }, 581 | { 582 | "params": [], 583 | "type": "last" 584 | } 585 | ] 586 | ], 587 | "tags": [] 588 | } 589 | ], 590 | "thresholds": "", 591 | "title": "Current Status Count", 592 | "type": "singlestat", 593 | "valueFontSize": "200%", 594 | "valueMaps": [ 595 | { 596 | "op": "=", 597 | "text": "N/A", 598 | "value": "null" 599 | } 600 | ], 601 | "valueName": "current" 602 | }, 603 | { 604 | "aliasColors": {}, 605 | "bars": false, 606 | "dashLength": 10, 607 | "dashes": false, 608 | "datasource": "influxdb", 609 | "fill": 1, 610 | "gridPos": { 611 | "h": 5, 612 | "w": 19, 613 | "x": 5, 614 | "y": 10 615 | }, 616 | "id": 10, 617 | "legend": { 618 | "avg": false, 619 | "current": false, 620 | "max": false, 621 | "min": false, 622 | "show": true, 623 | "total": false, 624 | "values": false 625 | }, 626 | "lines": true, 627 | "linewidth": 1, 628 | "links": [], 629 | "nullPointMode": "null", 630 | "percentage": false, 631 | "pointradius": 5, 632 | "points": false, 633 | "renderer": "flot", 634 | "seriesOverrides": [], 635 | "spaceLength": 10, 636 | "stack": false, 637 | "steppedLine": false, 638 | "targets": [ 639 | { 640 | "groupBy": [ 641 | { 642 | "params": [ 643 | "$__interval" 644 | ], 645 | "type": "time" 646 | }, 647 | { 648 | "params": [ 649 | "null" 650 | ], 651 | "type": "fill" 652 | } 653 | ], 654 | "measurement": "statuses_count", 655 | "orderByTime": "ASC", 656 | "policy": "default", 657 | "refId": "A", 658 | "resultFormat": "time_series", 659 | "select": [ 660 | [ 661 | { 662 | "params": [ 663 | "value" 664 | ], 665 | "type": "field" 666 | }, 667 | { 668 | "params": [], 669 | "type": "distinct" 670 | } 671 | ] 672 | ], 673 | "tags": [] 674 | } 675 | ], 676 | "thresholds": [], 677 | "timeFrom": null, 678 | 
"timeShift": null, 679 | "title": "Status Count Over Time", 680 | "tooltip": { 681 | "shared": true, 682 | "sort": 0, 683 | "value_type": "individual" 684 | }, 685 | "type": "graph", 686 | "xaxis": { 687 | "buckets": null, 688 | "mode": "time", 689 | "name": null, 690 | "show": true, 691 | "values": [] 692 | }, 693 | "yaxes": [ 694 | { 695 | "format": "none", 696 | "label": null, 697 | "logBase": 1, 698 | "max": null, 699 | "min": null, 700 | "show": true 701 | }, 702 | { 703 | "format": "none", 704 | "label": null, 705 | "logBase": 1, 706 | "max": null, 707 | "min": null, 708 | "show": true 709 | } 710 | ], 711 | "yaxis": { 712 | "align": false, 713 | "alignLevel": null 714 | } 715 | }, 716 | { 717 | "cacheTimeout": null, 718 | "colorBackground": false, 719 | "colorValue": false, 720 | "colors": [ 721 | "#299c46", 722 | "rgba(237, 129, 40, 0.89)", 723 | "#d44a3a" 724 | ], 725 | "datasource": "influxdb", 726 | "format": "none", 727 | "gauge": { 728 | "maxValue": 100, 729 | "minValue": 0, 730 | "show": false, 731 | "thresholdLabels": false, 732 | "thresholdMarkers": true 733 | }, 734 | "gridPos": { 735 | "h": 5, 736 | "w": 5, 737 | "x": 0, 738 | "y": 15 739 | }, 740 | "id": 11, 741 | "interval": "", 742 | "links": [], 743 | "mappingType": 1, 744 | "mappingTypes": [ 745 | { 746 | "name": "value to text", 747 | "value": 1 748 | }, 749 | { 750 | "name": "range to text", 751 | "value": 2 752 | } 753 | ], 754 | "maxDataPoints": 100, 755 | "nullPointMode": "connected", 756 | "nullText": null, 757 | "postfix": "", 758 | "postfixFontSize": "50%", 759 | "prefix": "", 760 | "prefixFontSize": "50%", 761 | "rangeMaps": [ 762 | { 763 | "from": "null", 764 | "text": "N/A", 765 | "to": "null" 766 | } 767 | ], 768 | "sparkline": { 769 | "fillColor": "rgba(31, 118, 189, 0.18)", 770 | "full": false, 771 | "lineColor": "rgb(31, 120, 193)", 772 | "show": false 773 | }, 774 | "tableColumn": "", 775 | "targets": [ 776 | { 777 | "groupBy": [ 778 | { 779 | "params": [ 780 | "$__interval" 
781 | ], 782 | "type": "time" 783 | }, 784 | { 785 | "params": [ 786 | "null" 787 | ], 788 | "type": "fill" 789 | } 790 | ], 791 | "measurement": "favourites_count", 792 | "orderByTime": "ASC", 793 | "policy": "default", 794 | "refId": "A", 795 | "resultFormat": "time_series", 796 | "select": [ 797 | [ 798 | { 799 | "params": [ 800 | "value" 801 | ], 802 | "type": "field" 803 | }, 804 | { 805 | "params": [], 806 | "type": "last" 807 | } 808 | ] 809 | ], 810 | "tags": [] 811 | } 812 | ], 813 | "thresholds": "", 814 | "title": "Current Favourites Count", 815 | "type": "singlestat", 816 | "valueFontSize": "200%", 817 | "valueMaps": [ 818 | { 819 | "op": "=", 820 | "text": "N/A", 821 | "value": "null" 822 | } 823 | ], 824 | "valueName": "current" 825 | }, 826 | { 827 | "aliasColors": {}, 828 | "bars": false, 829 | "dashLength": 10, 830 | "dashes": false, 831 | "datasource": "influxdb", 832 | "fill": 1, 833 | "gridPos": { 834 | "h": 5, 835 | "w": 19, 836 | "x": 5, 837 | "y": 15 838 | }, 839 | "id": 12, 840 | "legend": { 841 | "avg": false, 842 | "current": false, 843 | "max": false, 844 | "min": false, 845 | "show": true, 846 | "total": false, 847 | "values": false 848 | }, 849 | "lines": true, 850 | "linewidth": 1, 851 | "links": [], 852 | "nullPointMode": "null", 853 | "percentage": false, 854 | "pointradius": 5, 855 | "points": false, 856 | "renderer": "flot", 857 | "seriesOverrides": [], 858 | "spaceLength": 10, 859 | "stack": false, 860 | "steppedLine": false, 861 | "targets": [ 862 | { 863 | "groupBy": [ 864 | { 865 | "params": [ 866 | "$__interval" 867 | ], 868 | "type": "time" 869 | }, 870 | { 871 | "params": [ 872 | "null" 873 | ], 874 | "type": "fill" 875 | } 876 | ], 877 | "measurement": "favourites_count", 878 | "orderByTime": "ASC", 879 | "policy": "default", 880 | "refId": "A", 881 | "resultFormat": "time_series", 882 | "select": [ 883 | [ 884 | { 885 | "params": [ 886 | "value" 887 | ], 888 | "type": "field" 889 | }, 890 | { 891 | "params": [], 892 | 
"type": "distinct" 893 | } 894 | ] 895 | ], 896 | "tags": [] 897 | } 898 | ], 899 | "thresholds": [], 900 | "timeFrom": null, 901 | "timeShift": null, 902 | "title": "Favourites Count Over Time", 903 | "tooltip": { 904 | "shared": true, 905 | "sort": 0, 906 | "value_type": "individual" 907 | }, 908 | "type": "graph", 909 | "xaxis": { 910 | "buckets": null, 911 | "mode": "time", 912 | "name": null, 913 | "show": true, 914 | "values": [] 915 | }, 916 | "yaxes": [ 917 | { 918 | "format": "none", 919 | "label": null, 920 | "logBase": 1, 921 | "max": null, 922 | "min": null, 923 | "show": true 924 | }, 925 | { 926 | "format": "none", 927 | "label": null, 928 | "logBase": 1, 929 | "max": null, 930 | "min": null, 931 | "show": true 932 | } 933 | ], 934 | "yaxis": { 935 | "align": false, 936 | "alignLevel": null 937 | } 938 | }, 939 | { 940 | "cacheTimeout": null, 941 | "colorBackground": false, 942 | "colorValue": false, 943 | "colors": [ 944 | "#299c46", 945 | "rgba(237, 129, 40, 0.89)", 946 | "#d44a3a" 947 | ], 948 | "datasource": "influxdb", 949 | "format": "none", 950 | "gauge": { 951 | "maxValue": 100, 952 | "minValue": 0, 953 | "show": false, 954 | "thresholdLabels": false, 955 | "thresholdMarkers": true 956 | }, 957 | "gridPos": { 958 | "h": 5, 959 | "w": 5, 960 | "x": 0, 961 | "y": 20 962 | }, 963 | "id": 13, 964 | "interval": "", 965 | "links": [], 966 | "mappingType": 1, 967 | "mappingTypes": [ 968 | { 969 | "name": "value to text", 970 | "value": 1 971 | }, 972 | { 973 | "name": "range to text", 974 | "value": 2 975 | } 976 | ], 977 | "maxDataPoints": 100, 978 | "nullPointMode": "connected", 979 | "nullText": null, 980 | "postfix": "", 981 | "postfixFontSize": "50%", 982 | "prefix": "", 983 | "prefixFontSize": "50%", 984 | "rangeMaps": [ 985 | { 986 | "from": "null", 987 | "text": "N/A", 988 | "to": "null" 989 | } 990 | ], 991 | "sparkline": { 992 | "fillColor": "rgba(31, 118, 189, 0.18)", 993 | "full": false, 994 | "lineColor": "rgb(31, 120, 193)", 995 | 
"show": false 996 | }, 997 | "tableColumn": "", 998 | "targets": [ 999 | { 1000 | "groupBy": [ 1001 | { 1002 | "params": [ 1003 | "$__interval" 1004 | ], 1005 | "type": "time" 1006 | }, 1007 | { 1008 | "params": [ 1009 | "null" 1010 | ], 1011 | "type": "fill" 1012 | } 1013 | ], 1014 | "measurement": "listed_count", 1015 | "orderByTime": "ASC", 1016 | "policy": "default", 1017 | "refId": "A", 1018 | "resultFormat": "time_series", 1019 | "select": [ 1020 | [ 1021 | { 1022 | "params": [ 1023 | "value" 1024 | ], 1025 | "type": "field" 1026 | }, 1027 | { 1028 | "params": [], 1029 | "type": "last" 1030 | } 1031 | ] 1032 | ], 1033 | "tags": [] 1034 | } 1035 | ], 1036 | "thresholds": "", 1037 | "title": "Current Listed Count", 1038 | "type": "singlestat", 1039 | "valueFontSize": "200%", 1040 | "valueMaps": [ 1041 | { 1042 | "op": "=", 1043 | "text": "N/A", 1044 | "value": "null" 1045 | } 1046 | ], 1047 | "valueName": "current" 1048 | }, 1049 | { 1050 | "aliasColors": {}, 1051 | "bars": false, 1052 | "dashLength": 10, 1053 | "dashes": false, 1054 | "datasource": "influxdb", 1055 | "fill": 1, 1056 | "gridPos": { 1057 | "h": 5, 1058 | "w": 19, 1059 | "x": 5, 1060 | "y": 20 1061 | }, 1062 | "id": 14, 1063 | "legend": { 1064 | "avg": false, 1065 | "current": false, 1066 | "max": false, 1067 | "min": false, 1068 | "show": true, 1069 | "total": false, 1070 | "values": false 1071 | }, 1072 | "lines": true, 1073 | "linewidth": 1, 1074 | "links": [], 1075 | "nullPointMode": "null", 1076 | "percentage": false, 1077 | "pointradius": 5, 1078 | "points": false, 1079 | "renderer": "flot", 1080 | "seriesOverrides": [], 1081 | "spaceLength": 10, 1082 | "stack": false, 1083 | "steppedLine": false, 1084 | "targets": [ 1085 | { 1086 | "groupBy": [ 1087 | { 1088 | "params": [ 1089 | "$__interval" 1090 | ], 1091 | "type": "time" 1092 | }, 1093 | { 1094 | "params": [ 1095 | "null" 1096 | ], 1097 | "type": "fill" 1098 | } 1099 | ], 1100 | "measurement": "listed_count", 1101 | "orderByTime": 
"ASC", 1102 | "policy": "default", 1103 | "refId": "A", 1104 | "resultFormat": "time_series", 1105 | "select": [ 1106 | [ 1107 | { 1108 | "params": [ 1109 | "value" 1110 | ], 1111 | "type": "field" 1112 | }, 1113 | { 1114 | "params": [], 1115 | "type": "distinct" 1116 | } 1117 | ] 1118 | ], 1119 | "tags": [] 1120 | } 1121 | ], 1122 | "thresholds": [], 1123 | "timeFrom": null, 1124 | "timeShift": null, 1125 | "title": "Listed Count Over Time", 1126 | "tooltip": { 1127 | "shared": true, 1128 | "sort": 0, 1129 | "value_type": "individual" 1130 | }, 1131 | "type": "graph", 1132 | "xaxis": { 1133 | "buckets": null, 1134 | "mode": "time", 1135 | "name": null, 1136 | "show": true, 1137 | "values": [] 1138 | }, 1139 | "yaxes": [ 1140 | { 1141 | "format": "none", 1142 | "label": null, 1143 | "logBase": 1, 1144 | "max": null, 1145 | "min": null, 1146 | "show": true 1147 | }, 1148 | { 1149 | "format": "none", 1150 | "label": null, 1151 | "logBase": 1, 1152 | "max": null, 1153 | "min": null, 1154 | "show": true 1155 | } 1156 | ], 1157 | "yaxis": { 1158 | "align": false, 1159 | "alignLevel": null 1160 | } 1161 | } 1162 | ], 1163 | "refresh": false, 1164 | "schemaVersion": 16, 1165 | "style": "dark", 1166 | "tags": [], 1167 | "templating": { 1168 | "list": [] 1169 | }, 1170 | "time": { 1171 | "from": "now-15m", 1172 | "to": "now" 1173 | }, 1174 | "timepicker": { 1175 | "refresh_intervals": [ 1176 | "5s", 1177 | "10s", 1178 | "30s", 1179 | "1m", 1180 | "5m", 1181 | "15m", 1182 | "30m", 1183 | "1h", 1184 | "2h", 1185 | "1d" 1186 | ], 1187 | "time_options": [ 1188 | "5m", 1189 | "15m", 1190 | "1h", 1191 | "6h", 1192 | "12h", 1193 | "24h", 1194 | "2d", 1195 | "7d", 1196 | "30d" 1197 | ] 1198 | }, 1199 | "timezone": "", 1200 | "title": "TwitterGraph", 1201 | "uid": "b6TPTLaiz", 1202 | "version": 9 1203 | } 1204 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 
1 | # This repository is archived and will no longer receive updates. 2 | 3 | # Monitor Your Twitter Stats with a Python script, InfluxDB and Grafana running in Kubernetes or OKD 4 | 5 | ![TwitterGraph data in Grafana](images/twittergraph.png "TwitterGraph data in Grafana") 6 | 7 | Kubernetes is the de facto leader in Container Orchestration on the market right now, and it is an amazingly configurable and powerful orchestration tool, at that. As with many powerful tools, though, it can be somewhat confusing when first approached. This walkthrough will cover the basics of creating multiple pods, configuring them with secret credentials and configuration files, and exposing the services to the world by creating an InfluxDB and Grafana deployment and Kubernetes Cron Job to gather statistics about your Twitter account from the Twitter Developer API, all deployed on Kubernetes or OKD (formerly OpenShift Origin). 8 | 9 | ## Requirements 10 | 11 | * A Twitter account to monitor 12 | * A Twitter Developer API Account for gathering stats 13 | * A Kubernetes or OKD cluster (or, MiniKube or MiniShift) 14 | * The `kubectl` or `oc` cli tools installed 15 | 16 | ## What you'll learn 17 | 18 | This walkthrough will introduce you to a variety of Kubernetes concepts. You'll learn about Kubernetes CronJobs, ConfigMaps, Secrets, Deployments, Services and Ingress. 19 | 20 | If you choose to dive in further, the included files can serve as an introduction to [Tweepy](http://www.tweepy.org/), an "easy-to-use Python module for accessing the Twitter API", InfluxDB configuration, and automated Grafana [Dashboard Providers](http://docs.grafana.org/v5.0/administration/provisioning/#dashboards). 21 | 22 | 23 | ## Architecture 24 | 25 | This app consists of a Python script which polls the Twitter Developer API on a schedule for stats about your Twitter account, and stores them in InfluxDB as time series data. 
Grafana displays the data in human-friendly formats (counts and graphs) on customizable dashboards. 26 | 27 | All of these components run in Kubernetes- or OKD-managed containers. 28 | 29 | 30 | ## Prerequisite: Get a Twitter Developer API Account 31 | 32 | Follow the [Twitter instructions to sign up for a Developer account](https://developer.twitter.com/en/apply/user), allowing access to the Twitter API. Record your `API_KEY`, `API_SECRET`, `ACCESS_TOKEN` and `ACCESS_SECRET` to use later. 33 | 34 | 35 | ## Prerequisite: Clone the TwitterGraph Repo 36 | 37 | The [TwitterGraph GitHub Repo](https://github.com/clcollins/twitterGraph/) contains all the files needed for this project, as well as a few to make life easier if you wanted to do it all over again. 38 | 39 | 40 | ## Set up InfluxDB 41 | 42 | [InfluxDB](https://www.influxdata.com/time-series-platform/influxdb/) is an open source data store designed specifically for time series data. Since this project will be polling Twitter on a schedule using a Kubernetes CronJob, InfluxDB is perfect for storing the data. 43 | 44 | The [Docker-maintained InfluxDB Image on DockerHub](https://hub.docker.com/_/influxdb) will work fine for this project. It works out-of-the-box with both Kubernetes and OKD ([see OKD Considerations below](#okd_considerations)). 45 | 46 | 47 | ### Create a Deployment 48 | 49 | A [Kubernetes Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment) describes the _desired state_ of a resource. For InfluxDB, this is a single container in a pod running an instance of the InfluxDB image. 
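
The desired state described above can also be written down as a manifest. The following is a minimal Python sketch, not part of the repository, that builds a Deployment manifest for a single-replica InfluxDB pod and prints it as JSON. The helper name `build_deployment` and the choice of `app` as the label key are assumptions made for illustration; the image is the `docker.io/influxdb:1.6.4` image used in this walkthrough.

```python
import json


def build_deployment(name, image, replicas=1):
    """Return a dict describing a Deployment (hypothetical helper, for illustration)."""
    # The same labels are used on the Deployment, its selector, and the pod
    # template, so the Deployment can find the pods it manages.
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


manifest = build_deployment("influxdb", "docker.io/influxdb:1.6.4")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to `kubectl apply -f`, since `kubectl` accepts JSON as well as YAML manifests.
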
50 | 51 | A bare-bones InfluxDB deployment can be created with the `kubectl create deployment` command: 52 | 53 | ``` 54 | kubectl create deployment influxdb --image=docker.io/influxdb:1.6.4 55 | ``` 56 | 57 | The newly created deployment can be seen with the `kubectl get deployment` command: 58 | 59 | ``` 60 | kubectl get deployments 61 | NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE 62 | influxdb 1 1 1 1 7m40s 63 | ``` 64 | 65 | Specific details of the deployment can be viewed with the `kubectl describe deployment` command: 66 | 67 | ``` 68 | kubectl describe deployment influxdb 69 | Name: influxdb 70 | Namespace: twittergraph 71 | CreationTimestamp: Mon, 14 Jan 2019 11:31:12 -0500 72 | Labels: app=influxdb 73 | Annotations: deployment.kubernetes.io/revision=1 74 | Selector: app=influxdb 75 | Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable 76 | StrategyType: RollingUpdate 77 | MinReadySeconds: 0 78 | RollingUpdateStrategy: 25% max unavailable, 25% max surge 79 | Pod Template: 80 | Labels: app=influxdb 81 | Containers: 82 | influxdb: 83 | Image: docker.io/influxdb:1.6.4 84 | Port: 85 | Host Port: 86 | Environment: 87 | Mounts: 88 | Volumes: 89 | Conditions: 90 | Type Status Reason 91 | ---- ------ ------ 92 | Available True MinimumReplicasAvailable 93 | Progressing True NewReplicaSetAvailable 94 | OldReplicaSets: 95 | NewReplicaSet: influxdb-85f7b44c44 (1/1 replicas created) 96 | Events: 97 | Type Reason Age From Message 98 | ---- ------ ---- ---- ------- 99 | Normal ScalingReplicaSet 8m deployment-controller Scaled up replica set influxdb-85f7b44c44 to 1 100 | ``` 101 | 102 | ### Configure InfluxDB Credentials using Secrets 103 | 104 | At the moment, Kubernetes is running an InfluxDB container with the default configuration from the docker.io/influxdb:1.6.4 image, but for a database server, that is not necessarily very helpful. 
The database needs to be configured to use a specific set of credentials, and to store the database data between restarts. 105 | 106 | [Kubernetes Secrets](https://kubernetes.io/docs/concepts/configuration/secret/) are a way to store sensitive information, such as passwords, and inject them into running containers as either environment variables or mounted volumes. This is perfect for storing the database credentials and connection information, both to configure InfluxDB and to tell Grafana and the Python CronJob how to connect to it. 107 | 108 | To accomplish both tasks, we need four bits of information: 109 | 110 | 1. INFLUXDB_DATABASE - the name of the database to use 111 | 2. INFLUXDB_HOST - the hostname where the database server is running 112 | 3. INFLUXDB_USERNAME - the username to log in with 113 | 4. INFLUXDB_PASSWORD - the password to log in with 114 | 115 | Create a secret using the `kubectl create secret` command, and some basic credentials: 116 | 117 | ``` 118 | kubectl create secret generic influxdb-creds \ 119 | --from-literal=INFLUXDB_DATABASE=twittergraph \ 120 | --from-literal=INFLUXDB_USERNAME=root \ 121 | --from-literal=INFLUXDB_PASSWORD=root \ 122 | --from-literal=INFLUXDB_HOST=influxdb 123 | ``` 124 | 125 | The command above creates a "generic-type" secret (as opposed to "tls-" or "docker-registry-type" secrets) named "influxdb-creds", populated with some default credentials. Secrets use key/value pairs to store data, and this is perfect for use as environment variables within a container. 126 | 127 | As with the examples above, the secret created can be seen with the `kubectl get secret` command: 128 | 129 | ``` 130 | kubectl get secret influxdb-creds 131 | NAME TYPE DATA AGE 132 | influxdb-creds Opaque 4 11s 133 | ``` 134 | 135 | The keys contained within the secret (but not the values) can be seen using the `kubectl describe secret` command.
In this case, the INFLUXDB_* keys are listed in the "influxdb-creds" secret: 136 | 137 | ``` 138 | kubectl describe secret influxdb-creds 139 | Name: influxdb-creds 140 | Namespace: twittergraph 141 | Labels: 142 | Annotations: 143 | 144 | Type: Opaque 145 | 146 | Data 147 | ==== 148 | INFLUXDB_DATABASE: 12 bytes 149 | INFLUXDB_HOST: 8 bytes 150 | INFLUXDB_PASSWORD: 4 bytes 151 | INFLUXDB_USERNAME: 4 bytes 152 | ``` 153 | 154 | Now that the secret has been created, its values can be shared with the InfluxDB pod running the database as [environment variables](https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets-as-environment-variables). 155 | 156 | To share the secret with the InfluxDB pod, its keys need to be referenced as environment variables in the deployment created earlier. The existing deployment can be edited with the `kubectl edit deployment` command, which will open the deployment object in the default editor set for your system. When the file is saved, Kubernetes will apply the changes to the deployment. 157 | 158 | To add environment variables for each of the secrets, the pod spec contained in the deployment needs to be modified. Specifically, the `.spec.template.spec.containers` array needs to be modified to include an `envFrom` section. 159 | 160 | Using the command `kubectl edit deployment influxdb`, find that section in the deployment (example here is truncated): 161 | 162 | ``` 163 | spec: 164 | template: 165 | spec: 166 | containers: 167 | - image: docker.io/influxdb:1.6.4 168 | imagePullPolicy: IfNotPresent 169 | name: influxdb 170 | ``` 171 | 172 | This is the section describing a very basic InfluxDB container. Secrets can be added to the container with an `env` array for each key/value to be mapped in.
Alternatively, though, `envFrom` can be used to map _all_ the key/value pairs into the container, using the key names as the variables. 173 | 174 | For the values in the "influxdb-creds" secret, the container spec would look as follows: 175 | 176 | ``` 177 | spec: 178 | containers: 179 | - name: influxdb 180 | envFrom: 181 | - secretRef: 182 | name: influxdb-creds 183 | ``` 184 | 185 | After editing the deployment, Kubernetes will destroy the running pod and create a new one with the mapped environment variables. Remember, the deployment describes the _desired state_, so Kubernetes replaces the old pod with a new one matching that state. 186 | 187 | You can validate the environment variables are included in your deployment with `kubectl describe deployment influxdb`: 188 | 189 | ``` 190 | Environment Variables from: 191 | influxdb-creds Secret Optional: false 192 | ``` 193 | 194 | ### Configure persistent storage for InfluxDB 195 | 196 | A database is not very useful if all of its data is destroyed each time the service is restarted. In the current InfluxDB deployment, the data is all stored in the container itself, and is lost when Kubernetes destroys and recreates pods. A [PersistentVolume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) is needed to store data permanently. 197 | 198 | In order to get persistent storage in a Kubernetes cluster, a [PersistentVolumeClaim](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) (PVC) is created describing the type and details of the volume needed, and Kubernetes will find a previously created volume that fits the request (or create one with a dynamic volume provisioner, if there is one).
199 | 200 | Unfortunately, the `kubectl` cli tool does not have the ability to create PersistentVolumeClaims directly, but a PVC can be specified as a yaml file and created with `kubectl create -f `: 201 | 202 | Create a file named pvc.yaml with a generic 2Gi claim: 203 | 204 | ``` 205 | apiVersion: v1 206 | kind: PersistentVolumeClaim 207 | metadata: 208 | labels: 209 | app: influxdb 210 | project: twittergraph 211 | name: influxdb 212 | spec: 213 | accessModes: 214 | - ReadWriteOnce 215 | resources: 216 | requests: 217 | storage: 2Gi 218 | ``` 219 | 220 | Then, create the PVC: 221 | 222 | ``` 223 | kubectl create -f pvc.yaml 224 | ``` 225 | 226 | You can validate that the PVC was created and bound to a PersistentVolume with `kubectl get pvc`: 227 | 228 | ``` 229 | kubectl get pvc 230 | NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE 231 | influxdb Bound pvc-27c7b0a7-1828-11e9-831a-0800277ca5a7 2Gi RWO standard 173m 232 | ``` 233 | 234 | From the output above, you can see the PVC "influxdb" was matched to a PV (or Volume) named "pvc-27c7b0a7-1828-11e9-831a-0800277ca5a7" (your name will vary) and bound (STATUS: Bound). 235 | 236 | If your PVC does not have a volume, or the status is something other than Bound, you may need to talk to your cluster administrator. (This process should work fine with MiniKube, MiniShift, or any cluster with dynamically provisioned volumes, though.) 237 | 238 | Once a PersistentVolume has been assigned to the PersistentVolumeClaim, the volume can be mounted into the container to provide persistent storage. Once again, this entails editing the deployment, first to add a volume object, and secondly to reference that volume within the container spec as a "volumeMount".
239 | 240 | Edit the deployment with `kubectl edit deployment influxdb` and add a ".spec.template.spec.volumes" section below the containers section (example below truncated for brevity): 241 | 242 | ``` 243 | spec: 244 | template: 245 | spec: 246 | volumes: 247 | - name: var-lib-influxdb 248 | persistentVolumeClaim: 249 | claimName: influxdb 250 | ``` 251 | 252 | In the example above, a volume named "var-lib-influxdb" is added to the deployment, which references the PVC "influxdb" created earlier. 253 | 254 | Now, add a "volumeMount" to the container spec. The volume mount references the volume added earlier (_name: var-lib-influxdb_) and mounts the volume to the InfluxDB data directory, "/var/lib/influxdb": 255 | 256 | ``` 257 | spec: 258 | template: 259 | spec: 260 | containers: 261 | volumeMounts: 262 | - mountPath: /var/lib/influxdb 263 | name: var-lib-influxdb 264 | ``` 265 | 266 | ### The InfluxDB Deployment 267 | 268 | After the above, you should have a deployment for InfluxDB that looks something like: 269 | 270 | ``` 271 | apiVersion: extensions/v1beta1 272 | kind: Deployment 273 | metadata: 274 | annotations: 275 | deployment.kubernetes.io/revision: "3" 276 | creationTimestamp: null 277 | generation: 1 278 | labels: 279 | app: influxdb 280 | project: twittergraph 281 | name: influxdb 282 | selfLink: /apis/extensions/v1beta1/namespaces/twittergraph/deployments/influxdb 283 | spec: 284 | progressDeadlineSeconds: 600 285 | replicas: 1 286 | revisionHistoryLimit: 10 287 | selector: 288 | matchLabels: 289 | app: influxdb 290 | strategy: 291 | rollingUpdate: 292 | maxSurge: 25% 293 | maxUnavailable: 25% 294 | type: RollingUpdate 295 | template: 296 | metadata: 297 | creationTimestamp: null 298 | labels: 299 | app: influxdb 300 | spec: 301 | containers: 302 | - envFrom: 303 | - secretRef: 304 | name: influxdb-creds 305 | image: docker.io/influxdb:1.6.4 306 | imagePullPolicy: IfNotPresent 307 | name: influxdb 308 | resources: {} 309 | terminationMessagePath: 
/dev/termination-log 310 | terminationMessagePolicy: File 311 | volumeMounts: 312 | - mountPath: /var/lib/influxdb 313 | name: var-lib-influxdb 314 | dnsPolicy: ClusterFirst 315 | restartPolicy: Always 316 | schedulerName: default-scheduler 317 | securityContext: {} 318 | terminationGracePeriodSeconds: 30 319 | volumes: 320 | - name: var-lib-influxdb 321 | persistentVolumeClaim: 322 | claimName: influxdb 323 | status: {} 324 | ``` 325 | 326 | ### Expose InfluxDB (to the cluster only) with a Service 327 | 328 | By default, pods in this project are unable to talk to one another. A [Kubernetes Service]() is required to "expose" the pod to the cluster, or to the public. In the case of InfluxDB, the pod needs to be able to accept traffic on TCP port 8086 from the Grafana and Cron Job pods (that will be created later). To do this, we expose (create a service for) the pod, using a Cluster IP. Cluster IPs are only available to other pods in the cluster. We do this with the `kubectl expose` command: 329 | 330 | ``` 331 | kubectl expose deployment influxdb --port=8086 --target-port=8086 --protocol=TCP --type=ClusterIP 332 | ``` 333 | 334 | The newly-created service can be verified with the `kubectl describe service` command: 335 | 336 | ``` 337 | kubectl describe service influxdb 338 | Name: influxdb 339 | Namespace: twittergraph 340 | Labels: app=influxdb 341 | project=twittergraph 342 | Annotations: 343 | Selector: app=influxdb 344 | Type: ClusterIP 345 | IP: 10.108.196.112 346 | Port: 8086/TCP 347 | TargetPort: 8086/TCP 348 | Endpoints: 172.17.0.5:8086 349 | Session Affinity: None 350 | Events: 351 | ``` 352 | 353 | Some of the details (specifically the IP addresses) will vary from the example. The "IP" is an IP address internal to your cluster that's been assigned to the service, through which other pods can communicate with InfluxDB. The "Endpoints" is the IP and port of the container itself, listening for connections.
The service will route traffic sent to the internal cluster IP to the container itself. 354 | 355 | Now that InfluxDB is set up, we can move on to Grafana. 356 | 357 | 358 | ## Setup Grafana 359 | 360 | [Grafana](https://grafana.com/) is an open source project for visualizing time series data (think: pretty, pretty graphs). 361 | 362 | As with InfluxDB, [The Official Grafana image on DockerHub, maintained by Grafana](https://hub.docker.com/r/grafana/grafana/) works out-of-the-box for this project, both with Kubernetes and OKD. 363 | 364 | 365 | ### Create a Deployment 366 | 367 | So, just as we did before, create a deployment based on the Official Grafana image: 368 | 369 | ``` 370 | kubectl create deployment grafana --image=docker.io/grafana/grafana:5.3.2 371 | ``` 372 | 373 | There should now be a "grafana" deployment alongside the "influxdb" deployment: 374 | 375 | ``` 376 | kubectl get deployments 377 | NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE 378 | grafana 1 1 1 1 7s 379 | influxdb 1 1 1 1 5h12m 380 | ``` 381 | 382 | ### Setup Grafana credentials and config files with Secrets and ConfigMaps 383 | 384 | Building on what you've already learned, configuring Grafana should be both similar, and easier. Grafana doesn't require persistent storage, since it's reading its data out of the InfluxDB database. It does, however, need several pieces of configuration: a file to set up a [Dashboard Provider](http://docs.grafana.org/v5.0/administration/provisioning/#dashboards) that loads dashboards dynamically from files, the dashboard file itself, a third file to connect it to InfluxDB as a datasource, and finally a secret to store default login credentials. 385 | 386 | The credentials secret works the same as the "influxdb-creds" secret created already. By default, the Grafana image looks for environment variables named "GF_SECURITY_ADMIN_USER" and "GF_SECURITY_ADMIN_PASSWORD" to set the admin username and password on startup.
These can be whatever you like, but remember them so you can use them to login to Grafana when we have it configured. 387 | 388 | Create a secret named "grafana-creds" for the Grafana credentials with the `kubectl create secret` command: 389 | 390 | ``` 391 | kubectl create secret generic grafana-creds \ 392 | --from-literal=GF_SECURITY_ADMIN_USER=admin \ 393 | --from-literal=GF_SECURITY_ADMIN_PASSWORD=graphsRcool 394 | ``` 395 | 396 | Share this secret as environment variables using `envFrom`, this time in the Grafana deployment. Edit the deployment with `kubectl edit deployment grafana` and add the environment variables to the container spec: 397 | 398 | ``` 399 | spec: 400 | containers: 401 | - name: grafana 402 | envFrom: 403 | - secretRef: 404 | name: grafana-creds 405 | ``` 406 | 407 | And validate the environment variables have been added to the deployment with `kubectl describe deployment grafana`: 408 | 409 | ``` 410 | Environment Variables from: 411 | grafana-creds Secret Optional: false 412 | ``` 413 | 414 | That's all that's _required_ to start using Grafana. The rest of the configuration can be done in the web interface if desired, but with just a few config files, Grafana can be fully configured when it starts. 415 | 416 | [Kubernetes ConfigMaps](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/) are similar to secrets and can be consumed the same way by a pod, but do not store the information obfuscated within Kubernetes. Config maps are useful for adding configuration files or variables into the containers in a pod. 
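To make the "obfuscated" distinction concrete: Kubernetes stores Secret values base64-encoded, which is an encoding rather than encryption. The following is a minimal sketch using only the Python standard library and the `twittergraph` database name from the "influxdb-creds" secret; the helper names are hypothetical and are not part of the TwitterGraph repo.

```python
import base64


def encode_secret_value(value: str) -> str:
    """Encode a value the way it appears in a Secret's `data` field (base64)."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; no key is required, so this is not encryption."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    encoded = encode_secret_value("twittergraph")
    print(encoded)                       # the obfuscated form
    print(decode_secret_value(encoded))  # the original value again
```

Because anyone who can read the Secret object can decode it this way, access to Secrets should still be restricted within the cluster.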
417 | 418 | The Grafana instance in this project has three config files that need to be written into the running container: 419 | 420 | * influxdb-datasource.yml - tells Grafana how to talk to the InfluxDB database 421 | * grafana-dashboard-provider.yml - tells Grafana where to look for JSON files describing dashboards 422 | * twittergraph-dashboard.json - describes the dashboard for displaying the Twitter data we collect 423 | 424 | Kubernetes makes adding all these files easy: they can all be added to the same config map at once, and they can be mounted to different locations on the filesystem despite being in the same config map. 425 | 426 | If you have not done so already, clone the [TwitterGraph Github Repo](https://github.com/clcollins/twitterGraph/). These files are really specific to this particular project, so the easiest way to consume them is directly from the repo (though, they could certainly be written manually). 427 | 428 | From the directory with the contents of the repo, create a config map named grafana-config using the `kubectl create configmap` command: 429 | 430 | ``` 431 | kubectl create configmap grafana-config \ 432 | --from-file=influxdb-datasource.yml=influxdb-datasource.yml \ 433 | --from-file=grafana-dashboard-provider.yml=grafana-dashboard-provider.yml \ 434 | --from-file=twittergraph-dashboard.json=twittergraph-dashboard.json 435 | ``` 436 | 437 | 438 | The `kubectl create configmap` command above creates a config map named grafana-config, and stores the contents as the value for the key specified. The `--from-file` argument follows the form `--from-file==`, so in this case, the filename is being used as the key, for future clarity. 439 | 440 | Like secrets, details of a config map can be seen with `kubectl describe configmap`. Unlike secrets, the contents of the config map are visible in the output. 
Use the `kubectl describe configmap grafana-config` to see the three files stored as keys in the config map (results here truncated - because they're looooooong): 441 | 442 | ``` 443 | kubectl describe configmap grafana-config 444 | kubectl describe cm grafana-config 445 | Name: grafana-config 446 | Namespace: twittergraph 447 | Labels: 448 | Annotations: 449 | 450 | Data 451 | ==== 452 | grafana-dashboard-provider.yml: 453 | ---- 454 | apiVersion: 1 455 | 456 | providers: 457 | - name: 'default' 458 | orgId: 1 459 | folder: '' 460 | type: file 461 | 462 | ``` 463 | 464 | Each of the filenames should be stored as keys, and their contents as the values (such as the "grafana-dashboard-provider.yml", above). 465 | 466 | While config maps can be shared as environment variables, the way the credential secrets were above, the contents of this config map need to be mounted into the container as files. To do this, a volume can be created from config map in the "grafana" deployment. Similar to the persistent volume, use `kubectl edit deployment grafana` to add volume `.spec.template.spec.volumes` like so: 467 | 468 | ``` 469 | spec: 470 | template: 471 | spec: 472 | volumes: 473 | - configMap: 474 | name: grafana-config 475 | name: grafana-config 476 | ``` 477 | 478 | Then, edit the container spec to mount each of the keys stored in the config map as files in their respective locations in the Grafana container. 
Under `.spec.template.spec.containers`, add a `volumeMounts` section for the volumes: 479 | 480 | ``` 481 | spec: 482 | template: 483 | spec: 484 | containers: 485 | - name: grafana 486 | volumeMounts: 487 | - mountPath: /etc/grafana/provisioning/datasources/influxdb-datasource.yml 488 | name: grafana-config 489 | readOnly: true 490 | subPath: influxdb-datasource.yml 491 | - mountPath: /etc/grafana/provisioning/dashboards/grafana-dashboard-provider.yml 492 | name: grafana-config 493 | readOnly: true 494 | subPath: grafana-dashboard-provider.yml 495 | - mountPath: /var/lib/grafana/dashboards/twittergraph-dashboard.json 496 | name: grafana-config 497 | readOnly: true 498 | subPath: twittergraph-dashboard.json 499 | ``` 500 | 501 | The `name` section references the name of the config map "volume", and the addition of the `subPath` items allows Kubernetes to mount each file without overwriting the rest of the contents of that directory. Without it, "/etc/grafana/provisioning/datasources/influxdb-datasource.yml" for example, would be the only file in "/etc/grafana/provisioning/datasources". 502 | 503 | Each of the files can be verified by looking at them within the running container using the `kubectl exec` command. First find the Grafana pod's current name.
The pod will have a randomized name similar to `grafana-586775fcc4-s7r2z`, and should be visible when running the command `kubectl get pods`: 504 | 505 | ``` 506 | kubectl get pods 507 | NAME READY STATUS RESTARTS AGE 508 | grafana-586775fcc4-s7r2z 1/1 Running 0 93s 509 | influxdb-595487b7f9-zgtvx 1/1 Running 0 18h 510 | ``` 511 | 512 | Substituting the name of your Grafana pod, you can verify the contents of the influxdb-datasource.yml file, for example (truncated for brevity): 513 | 514 | ``` 515 | kubectl exec -it grafana-586775fcc4-s7r2z cat /etc/grafana/provisioning/datasources/influxdb-datasource.yml 516 | # config file version 517 | apiVersion: 1 518 | 519 | # list of datasources to insert/update depending 520 | # what's available in the database 521 | datasources: 522 | # name of the datasource. Required 523 | - name: influxdb 524 | ``` 525 | 526 | ### Expose the Grafana service 527 | 528 | Now that it's configured, expose the Grafana service so it can be viewed in a browser. Because Grafana should be visible from outside the cluster, the "LoadBalancer" service type will be used rather than the internal-only "ClusterIP" type. 529 | 530 | For production clusters or cloud environments that support LoadBalancer services, an external IP is dynamically provisioned when the service is created. For MiniKube or MiniShift, LoadBalancer services are available via the `minikube service` command, which opens your default browser to a URL and port where the service is available on your host VM. 531 | 532 | The Grafana deployment is listening on port 3000 for HTTP traffic. 
Expose it as a LoadBalancer-type service with the `kubectl expose` command: 533 | 534 | ``` 535 | kubectl expose deployment grafana --type=LoadBalancer --port=80 --target-port=3000 --protocol=TCP 536 | service/grafana exposed 537 | ``` 538 | 539 | After the service is exposed, you can validate the configuration with `kubectl get service grafana`: 540 | 541 | ``` 542 | kubectl get service grafana 543 | NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE 544 | grafana LoadBalancer 10.101.113.249 80:31235/TCP 9m35s 545 | ``` 546 | 547 | As mentioned above, MiniKube and MiniShift deployments will not automatically assign an EXTERNAL-IP, and it will be listed as "". Running `minikube service grafana` (or `minikube service grafana --namespace ` if you created your deployments in a namespace other than "Default") will open your default browser to the IP and Port combo where Grafana is exposed on your host VM. 548 | 549 | At this point, Grafana is configured to talk to InfluxDB, and has an automatically provisioned dashboard to display the Twitter stats. Now it's time to get some actual stats and put them into the database. 550 | 551 | 552 | ## Create the CronJob 553 | 554 | A [Kubernetes Cron Job](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/) is, like its namesake [Cron](https://en.wikipedia.org/wiki/Cron), a way to run a job on a particular schedule. In the case of Kubernetes, the job is a task running in a container: a [Kubernetes Job](https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/) scheduled and tracked by Kubernetes to ensure its completion. 555 | 556 | For this project, the Cron Job is a single container running [a Python script to gather Twitter stats](https://github.com/clcollins/twitterGraph/blob/master/app.py).
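A script run this way typically receives its settings through environment variables. The sketch below is not the repository's app.py; it is a hypothetical helper, assuming the `TWITTER_*` and `INFLUXDB_*` key names used by the secrets in this walkthrough, that gathers those settings and fails early when one is missing.

```python
import os

# Key names match the secrets in this walkthrough; whether app.py reads
# exactly these names is an assumption.
REQUIRED_VARS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
    "TWITTER_USER",
    "INFLUXDB_HOST",
    "INFLUXDB_DATABASE",
    "INFLUXDB_USERNAME",
    "INFLUXDB_PASSWORD",
)


def load_config(environ=None):
    """Return the required settings as a dict, or raise if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Failing at startup makes a misconfigured secret show up as a clearly failed Job rather than as silently missing data in the dashboard.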
557 | 558 | ### Create a Secret for the Twitter API credentials 559 | 560 | The cron job uses your Twitter API credentials to connect to the API and pull the stats, pulling them from environment variables inside the container. Create a secret to store the Twitter API credentials and the name of the account to gather the stats from, substituting your own credentials and account name: 561 | 562 | ``` 563 | kubectl create secret generic twitter-creds \ 564 | --from-literal=TWITTER_ACCESS_SECRET= \ 565 | --from-literal=TWITTER_ACCESS_TOKEN= \ 566 | --from-literal=TWITTER_API_KEY= \ 567 | --from-literal=TWITTER_API_SECRET= \ 568 | --from-literal=TWITTER_USER= 569 | ``` 570 | 571 | ### Create a Cron Job 572 | 573 | Finally, it is time to create the cron job to gather statistics. Unfortunately, `kubectl` doesn't have a way to create a cron job directly, so once again the object must be described in a YAML file, and loaded with `kubectl create -f `. 574 | 575 | Create a file named "cronjob.yml" describing the job to run: 576 | 577 | ``` 578 | apiVersion: batch/v1beta1 579 | kind: CronJob 580 | metadata: 581 | labels: 582 | app: twittergraph 583 | name: twittergraph 584 | spec: 585 | concurrencyPolicy: Replace 586 | failedJobsHistoryLimit: 3 587 | jobTemplate: 588 | metadata: 589 | spec: 590 | template: 591 | metadata: 592 | spec: 593 | containers: 594 | - envFrom: 595 | - secretRef: 596 | name: twitter-creds 597 | - secretRef: 598 | name: influxdb-creds 599 | image: docker.io/clcollins/twittergraph:1.0 600 | imagePullPolicy: Always 601 | name: twittergraph 602 | restartPolicy: Never 603 | schedule: '*/3 * * * *' 604 | successfulJobsHistoryLimit: 3 605 | ``` 606 | 607 | Looking over this file, the key pieces of a Kubernetes Cron Job are evident. The Cron Job spec actually contains a "jobTemplate", describing the actual Kubernetes Job to run. 
In this case, the job consists of a single container, with the twitter credentials and influxdb credentials secrets shared as environment variables using the `envFrom` that was used above in the deployments. 608 | 609 | This job uses a custom image from Docker Hub, `clcollins/twittergraph:1.0`. This image is just Python 3.6 and contains [the app.py Python script for TwitterGraph](https://github.com/clcollins/twitterGraph/blob/master/app.py). (If you'd rather build the image yourself, you can follow the instructions in [BUILDING.md](https://github.com/clcollins/twitterGraph/blob/master/BUILDING.md) in the Github repo, to build the image with [Source To Image](https://github.com/openshift/source-to-image).) 610 | 611 | Wrapping the Job template spec are the Cron Job spec options. Arguably the most important part, outside of the job itself, is the `schedule`, in this case set to run every 3 minutes, forever. The other important bit is the `concurrencyPolicy`. In this case, the concurrency policy is set to "Replace", so if the previous job is still running when it's time to start a new one, the pod running the old job is destroyed and replaced with a new pod. 612 | 613 | Use the `kubectl create -f cronjob.yml` command to create the cron job: 614 | 615 | ``` 616 | kubectl create -f cronjob.yml 617 | cronjob.batch/twittergraph created 618 | ``` 619 | 620 | The cron job can then be validated with `kubectl describe cronjob twittergraph` (example truncated for brevity): 621 | 622 | ``` 623 | kubectl describe cronjob twittergraph 624 | Name: twittergraph 625 | Namespace: twittergraph 626 | Labels: app=twittergraph 627 | Annotations: 628 | Schedule: */3 * * * * 629 | Concurrency Policy: Replace 630 | Suspend: False 631 | Starting Deadline Seconds: 632 | ``` 633 | 634 | _Note:_ with a schedule set to `*/3 * * * *`, Kubernetes won't immediately start the new job. It will wait for the next scheduled time, which can be up to 3 minutes away.
If you'd like to see immediate results, you can edit the cron job with `kubectl edit cronjob twittergraph`, and change the schedule to `* * * * *` temporarily, to run every minute. Just don't forget to change it back when you're done. 635 | 636 | ## Success! 637 | 638 | That should be it! If you've followed all the steps correctly, you will be left with an InfluxDB database, a Cron Job collecting stats from your Twitter account, and a Grafana deployment to view the data. For production clusters or cloud deployments of Kubernetes or OpenShift, visit the LoadBalancer IP to log in to Grafana using the `GF_SECURITY_ADMIN_USER` and `GF_SECURITY_ADMIN_PASSWORD` credentials set earlier. After logging in, select the TwitterGraph dashboard from the "Home" dropdown at the top-left of the screen. You should see something like the image below, with the current counts for your followers, folks you are following, status updates, likes and lists. It's probably a bit boring at first, but if you leave it running, over time and with more data collection, the graphs will start to look more interesting and provide more useful data! 639 | 640 | ![A new TwitterGraph deployment, with just a little data](images/twittergraph_new.png "A new TwitterGraph deployment, with just a little data") 641 | 642 | ## Where to go from here 643 | 644 | The data collected by the TwitterGraph script is relatively simplistic. The stats that are collected are described in the ["data_points" dictionary in the app.py script](https://github.com/clcollins/twitterGraph/blob/master/app.py#L103-L109), but there's [a ton of data available](https://tweepy.readthedocs.io/en/v3.5.0/api.html#tweepy-api-twitter-api-wrapper). Adding a new Cron Job that runs daily to collect the day's activity (number of posts, number of follows, etc.) would be a natural extension of the data.
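As a rough illustration of such a daily extension, the sketch below computes the change in each counter between two snapshots. The field names (`followers`, `following`, `statuses`) and the function name are hypothetical and are not taken from app.py.

```python
def daily_deltas(previous, current):
    """Return current minus previous for every counter present in both snapshots."""
    return {key: current[key] - previous[key] for key in current if key in previous}


if __name__ == "__main__":
    # Made-up example snapshots, one per day.
    yesterday = {"followers": 120, "following": 80, "statuses": 340}
    today = {"followers": 123, "following": 79, "statuses": 345}
    print(daily_deltas(yesterday, today))
```

Counters that appear in only one snapshot are skipped, so adding a new stat later does not break the comparison against older data.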
645 | 646 | More interesting, probably, would be correlating the daily data: how many followers were gained or lost based on the number of posts that day, etc. 647 | --------------------------------------------------------------------------------