├── README.md ├── analyze_and_plot.py ├── make_great_again_commits.csv ├── make_great_again_count.sql ├── new_commit_count.sql ├── old_commit_count.sql ├── plot.png ├── plot_great_again.png └── plot_without_borders.png /README.md: -------------------------------------------------------------------------------- 1 | # Make github great again 2 | ## An analysis of github commit messages 3 | 4 | Having noticed it myself recently, I wanted to see if the heavy media coverage of Donald Trump's campaign had an influence on how people write commit messages. My assumption was that one would be able to see a spike in commit subjects that follow the pattern `make ... again` (as a regex: `^make .* again$`) after the middle of 2015, when Trump became more and more popular. 5 | 6 | I used the data published by GitHub on Google BigQuery ([blogpost](https://github.com/blog/2201-making-open-source-data-more-available)) to analyze all public commit messages. 7 | 8 | The main script that I used for querying is [`make_great_again_count.sql`](/make_great_again_count.sql). It filters the commit messages by subject and then gets the daily count. Looking at the resulting data, one can see that there is indeed an increase in messages following the pattern, but I assumed that the same was true for **all** commits, as GitHub itself just became more and more popular. 9 | 10 | So I also queried the count of all commit messages ([`new_commit_count.sql`](/new_commit_count.sql) and [`old_commit_count.sql`](/old_commit_count.sql)) and joined those later. I did it in two steps because doing it all in one was too much for BigQuery to export as a single CSV and I did not want to set up Google Cloud Storage. 
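
The "query in two halves, join later" step can be sketched with pandas. This is only a minimal illustration on made-up numbers, not the script from this repository: the three small frames are hypothetical stand-ins for the exported CSV files, and the left join plus the "ratio after resampling" order are choices made for this sketch.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV exports of daily commit counts
# (before and from 2010-01-01). The numbers are made up for illustration.
old = pd.DataFrame(
    {"all_commits": [10, 20]},
    index=pd.to_datetime(["2009-12-30", "2009-12-31"]),
)
new = pd.DataFrame(
    {"all_commits": [30, 40]},
    index=pd.to_datetime(["2010-01-01", "2010-01-02"]),
)

# Hypothetical daily counts of commits that match the pattern.
pattern = pd.DataFrame(
    {"make_great_again_commits": [1, 2]},
    index=pd.to_datetime(["2009-12-31", "2010-01-02"]),
)

# Stack the two halves into one time series of all commits.
all_commits = pd.concat([old, new]).sort_index()

# Left join so that days without a matching commit count as zero
# (an assumption of this sketch).
joined = all_commits.join(pattern, how="left").fillna(0)

# Resample to weekly sums first, then compute the share of matching commits.
weekly = joined.resample("W").sum()
weekly["ratio"] = weekly["make_great_again_commits"] / weekly["all_commits"]
```

Computing the ratio after resampling gives the weekly share of matching commits rather than a sum of daily shares, which is why this sketch does it in that order.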
11 | 12 | Finally I wrote a small IPython notebook script ([`analyze_and_plot.py`](/analyze_and_plot.py)) that combines all three CSV files (old commits, new commits, trump commits) and plots the data: 13 | 14 | ![Plot of the data](/plot.png?raw=true) 15 | 16 | As you can see I decided to only plot data after 2006. This is an arbitrary choice; I just did not want to go back too far in history. I also resampled to one week to get smoother results. 17 | 18 | Looking at the plot, we come to the conclusion: Trump has not brainwashed people as much as I thought. At least they do not seem to think of him any more than before when writing commit messages. 19 | 20 | 21 | ### Update 22 | When explicitly looking for the regex `^make .* great again$` an interesting observation can be made: As one would expect there are far fewer results (only 57 in total) but they do seem to correlate with Trump getting popular. The very first commit message that follows the pattern (being `Make "arc land" great again`) appears on 2015-10-28 with all others coming after that point in time. So maybe Trump did have some influence after all. 23 | 24 | ![Plot of the make great again data](/plot_great_again.png?raw=true) 25 | 26 | Thanks to [@derhuerst](https://github.com/derhuerst) for the suggestion! 
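
To make the difference between the broad pattern `^make .* again$` and the narrower `^make .* great again$` concrete, here is a small sketch using Python's `re` module. It is not how the data was queried (that happened in BigQuery). Case-insensitive matching for the narrow pattern is an assumption carried over from the `(?i)` flag in `make_great_again_count.sql`, and apart from the "arc land" message the sample subjects are made up.

```python
import re

# Broad pattern from the main query, and the narrower one from the update.
# re.IGNORECASE mirrors the (?i) flag of the SQL query (assumed for both).
BROAD = re.compile(r"^make .* again$", re.IGNORECASE)
NARROW = re.compile(r"^make .* great again$", re.IGNORECASE)

# Sample commit subjects; only the first one is taken from the data set.
subjects = [
    'Make "arc land" great again',
    "make tests pass again",
    "Let's Make Search Great Again!",
    "fix typo in README",
]

# Keep every subject that the respective pattern matches.
broad_hits = [s for s in subjects if BROAD.search(s)]
narrow_hits = [s for s in subjects if NARROW.search(s)]
```

The third subject is matched by neither pattern because both are anchored with `^` and `$`: the subject neither starts with `make` nor ends with `again`.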
27 | 28 | ### Update II 29 | If we change the regex to not include border checks (so change it to `make .* great again`, without the `^` and `$` anchors) we get a total of 136 commits and can clearly see that this is a trend that was introduced by Trump: 30 | 31 | ![Plot of the make great again data without border conditions](/plot_without_borders.png?raw=true) 32 | -------------------------------------------------------------------------------- /analyze_and_plot.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import matplotlib 3 | matplotlib.style.use('ggplot') 4 | %matplotlib inline 5 | 6 | # taken from http://stackoverflow.com/a/16808448 7 | def _sum(x): 8 | if len(x) == 0: 9 | return 0 10 | return sum(x) 11 | 12 | count_1 = pd.read_csv("/home/aaltje/Downloads/results-20160713-110409.csv", 13 | index_col='the_date', 14 | parse_dates=True) 15 | count_2 = pd.read_csv("/home/aaltje/Downloads/results-20160713-110423.csv", 16 | index_col='the_date', 17 | parse_dates=True) 18 | 19 | combined = pd.concat([count_1, count_2]).sort_index()['1970-01-02':]  # DataFrame.append was removed in pandas 2.0 20 | 21 | make_great_again = pd.read_csv("/home/aaltje/Downloads/results-20160713-104258.csv", 22 | index_col='the_date', 23 | parse_dates=True) 24 | all_data = pd.merge(make_great_again, combined, left_index=True, right_index=True) 25 | all_data['ratio'] = all_data.make_great_again_commits / all_data.all_commits 26 | all_data = all_data.resample('1W').apply(_sum) 27 | 28 | all_data['2006-01-01':].plot(figsize=(14,10), subplots=True) 29 | -------------------------------------------------------------------------------- /make_great_again_commits.csv: -------------------------------------------------------------------------------- 1 | the_date,make_great_again_commits 2 | 2015-10-28,"Make ""arc land"" great again" 3 | 2015-11-03,make the bot great again 4 | 2015-11-16,Make migration #2 great again; 5 | 2015-12-16,make tests GREAT again 6 | 2015-12-16,MAKE TESTING GREAT AGAIN! 
(tm) 7 | 2015-12-23,make the controller great again 8 | 2015-12-27,make classnames great again 9 | 2016-01-01,Make the testing coverage great again 10 | 2016-01-12,Make tests great again 11 | 2016-01-17,make tests great again. Using same /status than on app server 12 | 2016-01-21,Make XORFastHash great again 13 | 2016-01-22,Make things work great again(tm) 14 | 2016-01-22,Make serving Swift apps great again! 15 | 2016-01-22,Make templates great again 16 | 2016-01-23,make heroku great again 17 | 2016-01-23,make tests great again 18 | 2016-01-23,make heroku great again 19 | 2016-02-04,make pickups great again 20 | 2016-02-10,Make this sentence great again 21 | 2016-02-12,Make keywords great again. 22 | 2016-02-12,"Make build system great again, yet again" 23 | 2016-02-13,[Article] Make Sci-fi Great Again. 24 | 2016-02-14,Make the TagControl great again! 25 | 2016-02-17,Make 4.3 kernel great again! 26 | 2016-02-17,Make pre-publish great again 27 | 2016-02-21,nms: Make NMS great again 28 | 2016-02-24,make goify great againg 29 | 2016-02-26,make add new form great again 30 | 2016-02-28,Make eslint great again. 31 | 2016-02-28,"cap: implicitly enable cap-notify on CAP LS 302, to MAKE IRC GREAT AGAIN!!!!oneoneone" 32 | 2016-03-02,Make testConfig great again! 33 | 2016-03-05,Make Packet Streaming Great Again 34 | 2016-03-05,Let's Make Search Great Again! 35 | 2016-03-07,Make HistoGlobe Great Again 36 | 2016-03-07,Make HistoGlobe Great Again 37 | 2016-03-07,MAKE HISTOGLOBE GREAT AGAIN! 38 | 2016-03-07,Make HistoGlobe Great Again 39 | 2016-03-07,MAKE HISTOGLOBE GREAT AGAIN! 40 | 2016-03-07,MAKE HISTOGLOBE GREAT AGAIN! 41 | 2016-03-10,Make America great again. 42 | 2016-03-12,make scripting great again 43 | 2016-03-12,refactor: make tmuxline great again 44 | 2016-03-12,make tests great again 45 | 2016-03-17,Make Scrolling Great Again 46 | 2016-03-18,Make MazePredicateMany great again. 
47 | 2016-03-23,Add navbar item - Rename social networks to media - Add titles to media - Add Twitch & Osu! to media - Make email great again - Change sizes in media & contact 48 | 2016-03-23,Make intersections great again! 49 | 2016-03-23,make tests great again again 50 | 2016-03-24,make bling great again 51 | 2016-03-24,make vendor great again 52 | 2016-04-01,Make trickle() great again. 53 | 2016-04-01,S72.00 Make SonarQube Great Again 54 | 2016-04-02,Make DB Great Again 55 | 2016-04-04,make tests great again 56 | 2016-04-04,Make Tahoma great again 57 | 2016-04-05,Make VPN Great Again 58 | 2016-04-05,Make VPN Great Again 59 | 2016-04-06,Make Moderate Comment Screen Great Again by showing links 60 | 2016-04-06,Make Moderate Comment Screen Great Again by showing links 61 | 2016-04-06,Make Moderate Comment Screen Great Again by showing links 62 | 2016-04-06,Make Moderate Comment Screen Great Again by showing links 63 | 2016-04-07,Let's make thirst great again! 64 | 2016-04-11,Make Soft CSS great again !! 65 | 2016-04-11,Make NameLayer great again 66 | 2016-04-17,make packet.c great again 67 | 2016-04-18,Make ruby completion great again 68 | 2016-04-21,Make America (aka submitOnEnter) great again (#3181) 69 | 2016-04-21,make backtraces great again 70 | 2016-04-22,Bugfix: Make recompilation great again 71 | 2016-04-28,make fork aux argument passing great again 72 | 2016-04-28,Make TestDeleteFile and TestDeleteDir great again. 73 | 2016-04-29,"Make command line args great again, bump version" 74 | 2016-04-29,"Make command line args great again, bump version" 75 | 2016-05-02,Make docfx great again! 76 | 2016-05-04,Make Big City Great Again 77 | 2016-05-05,Make mobile great again (#151) 78 | 2016-05-06,Make iFrames great again 79 | 2016-05-09,Make NavBar great again! 
80 | 2016-05-10,Make EditorUtils great again (fix it so it actually compiles when building) 81 | 2016-05-11,Make carpool great again 82 | 2016-05-11,Make carpool great again 83 | 2016-05-12,Make Pulivari Great Again :trumpet: 84 | 2016-05-12,Let's make YAHRP great again! 85 | 2016-05-14,Make ols docstring great again 86 | 2016-05-15,added Commands class and tryed to make geekbot great again 87 | 2016-05-15,fix: make the script great again 88 | 2016-05-16,make folds great again 89 | 2016-05-16,make i3 great again 90 | 2016-05-20,make debian package build great again 91 | 2016-05-22,"Big UI overhaul, 'Make the UI Great Again'" 92 | 2016-05-24,Make Flow great again 93 | 2016-05-24,docs: make docs great again (#1104) 94 | 2016-05-27,Make quick edits great again. 95 | 2016-05-27,Make words great again 96 | 2016-05-28,LETS MAKE HTTP AUTH GREAT AGAIN 97 | 2016-05-30,Make autocompletion great again 98 | 2016-05-30,make retries great again 99 | 2016-05-30,make the priors panel look great again by setting label widths #492 100 | 2016-06-01,Make FFS great again! - Chisel support back in - WAILA support back in 101 | 2016-06-07,Make dotfiles great again 102 | 2016-06-08,Make short urls great again 103 | 2016-06-08,Make the Wall great again (fixes #108) 104 | 2016-06-09,Make `get` great again! 105 | 2016-06-11,Make HTML great again 106 | 2016-06-13,Make KeyboardLayoutEngine great again! 107 | 2016-06-15,Make Cherry great again! (it now compile!) 108 | 2016-06-15,Make lifetimeportlet great again 109 | 2016-06-16,Make code great again 110 | 2016-06-16,make html emails great again fixes #258 111 | 2016-06-19,Make GeoIP Great Again! 112 | 2016-06-19,Make the tests great again! 113 | 2016-06-20,Run full tests again. Make travis great again. 
114 | 2016-06-21,"Merge ""Make pep8 job great again""" 115 | 2016-06-21,Make pep8 job great again 116 | 2016-06-22,Make Profile section great again 117 | 2016-06-23,Make swift great again (#2033) 118 | 2016-06-23,Make Colours Great Again 119 | 2016-06-25,"Make Ameri… erm, koel-focus great again" 120 | 2016-06-25,make cancel button great again 121 | 2016-06-25,Fix #5160 - make afbj great again 122 | 2016-06-26,make the player great again 123 | 2016-06-27,Make tox great again 124 | 2016-06-27,Remove hacks for syrups and juices; make backups great again 125 | 2016-06-29,make autism great again 126 | 2016-06-29,make emissive intensity great again 127 | 2016-06-30,Make the support identifier great again. 128 | 2016-07-01,Refactor workqueues a little to make them great again 129 | 2016-07-03,changes to make america great again 130 | 2016-07-03,Make travis r-cmd check great again 131 | 2016-07-04,Make header margin great again. 132 | 2016-07-04,Make white backgrounds great again. 133 | 2016-07-04,make font sizing great again (ems not px) 134 | 2016-07-07,make --abs-val great again 135 | 2016-07-08,Make Silk Great Again and Upgrade Dev Project (#122) 136 | 2016-07-15,"cmd/goimports, imports: make goimports great again" 137 | -------------------------------------------------------------------------------- /make_great_again_count.sql: -------------------------------------------------------------------------------- 1 | SELECT DATE(author.date) as the_date, count(*) as make_great_again_commits 2 | FROM ([bigquery-public-data:github_repos.commits]) 3 | where REGEXP_MATCH(subject, r'(?i)^make .* again$') 4 | GROUP BY the_date 5 | ORDER BY the_date; 6 | -------------------------------------------------------------------------------- /new_commit_count.sql: -------------------------------------------------------------------------------- 1 | SELECT DATE(author.date) as the_date, count(*) as all_commits 2 | FROM ([bigquery-public-data:github_repos.commits]) 3 | WHERE 
DATE(author.date) >= DATE('2010-01-01') 4 | GROUP BY the_date 5 | ORDER BY the_date; 6 | -------------------------------------------------------------------------------- /old_commit_count.sql: -------------------------------------------------------------------------------- 1 | SELECT DATE(author.date) as the_date, count(*) as all_commits 2 | FROM ([bigquery-public-data:github_repos.commits]) 3 | WHERE DATE(author.date) < DATE('2010-01-01') 4 | GROUP BY the_date 5 | ORDER BY the_date; 6 | -------------------------------------------------------------------------------- /plot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/k-nut/make_github_great_again/34308243f797fd2b9da69d0d3efb0a5198dfc27d/plot.png -------------------------------------------------------------------------------- /plot_great_again.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/k-nut/make_github_great_again/34308243f797fd2b9da69d0d3efb0a5198dfc27d/plot_great_again.png -------------------------------------------------------------------------------- /plot_without_borders.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/k-nut/make_github_great_again/34308243f797fd2b9da69d0d3efb0a5198dfc27d/plot_without_borders.png --------------------------------------------------------------------------------