├── .babelrc ├── .gitignore ├── LICENSE ├── README.md ├── docs ├── error-001.png ├── error-002.png ├── gitlogg-icon-github.png └── success.png ├── package.json └── scripts ├── colors.sh ├── gitlogg-generate-log.sh ├── gitlogg-parse-json.js └── gitlogg.sh /.babelrc: -------------------------------------------------------------------------------- 1 | { 2 | presets: ['es2015'] 3 | } 4 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.DS_Store 2 | npm-debug.log 3 | 4 | node_modules/ 5 | assets/ 6 | _repos/ 7 | _output/ 8 | _tmp/ 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2016 Wallace Sidhrée 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ![Gitlogg](https://raw.githubusercontent.com/dreamyguy/gitlogg/master/docs/gitlogg-icon-github.png "Parse the 'git log' of one or several 'git' repositories into a sanitised and distributable 'JSON' file") 2 | 3 | > _Parse the 'git log' of one or several 'git' repositories into a sanitised and distributable 'JSON' file._ 4 | 5 | [![MIT Licence](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/dreamyguy/gitlogg/blob/master/LICENSE) [![Data served by Gitlogg API](https://img.shields.io/badge/data_can_be_served_by-gitlogg--api-89336e.svg)](https://github.com/dreamyguy/gitlogg-api) [![Data served by Gitlogg API](https://img.shields.io/badge/data_can_be_rendered_by-gitinsight-89336e.svg)](https://github.com/dreamyguy/gitinsight) 6 | 7 | ## Why? 8 | 9 | `git log` is a wonderful tool. However its output can be not only surprisingly inconsistent, but also long, difficult to scan and to distribute. 10 | 11 | **Gitlogg** sanitises the `git log` and outputs it to `JSON`, a format that can easily be consumed by other applications. As long as the repositories being scanned are kept up to date, **Gitlogg** will return fresh data every time it runs. 12 | 13 | #### **Gitlogg** addresses the following challenges: 14 | 15 | * `git log` can only be used on a repository at a time. 16 | * `git log` can't be easily consumed by other applications in its original format. 17 | * `git log` doesn't return **impact**, which is the cumulative change brought by a single commit. 
Very interesting graphs can be built with that data, as shown on [sidhree.com][1]. 18 | * Fields that allow user input, like `subject`, need to be sanitised to be consumed. 19 | * File changes shown under `--stat` or `--shortstat` are currently not available as placeholders under `--pretty=format:`, and it is cumbersome to get commit logs to output neatly in single lines - with stats. 20 | * It is hard to retrieve commits made at a specific but generic moment, like "11pm"; at the "27th minute" of an hour; on a "Sunday"; in "March"; in "GMT -5"; on the "53rd second of a minute". 21 | * Some commits don't have stats, and that can cause the structure of the output to break, making it harder to distribute. 22 | 23 | #### Script execution feedback 24 | 25 | **Gitlogg** is not a very complex application, but I still made an effort to provide some feedback on what is happening under the hood. Below are some screenshots of dialogs one can expect to see while executing it: 26 | 27 | ![Error 001](https://raw.githubusercontent.com/dreamyguy/gitlogg/master/docs/error-001.png "'Error 001' message as on release v0.1.3") 28 | > **Øh nøes!** The path to the folder containing all repositories *does not exist!* 29 | 30 | ![Error 002](https://raw.githubusercontent.com/dreamyguy/gitlogg/master/docs/error-002.png "'Error 002' message as on release v0.1.3") 31 | > **Øh nøes!** The path to the folder containing all repositories *exists, but is empty!* 32 | 33 | ![Success!](https://raw.githubusercontent.com/dreamyguy/gitlogg/master/docs/success.png "Success messages as on release v0.1.6") 34 | > **Success!** `JSON` parsed, based on **9** different repositories with a total of **25,537** commits. 35 | 36 | Note that I've included two huge repos _(*react* & *react-native*, which had 7,813 & 10,065 commits respectively at the time of this writing)_ for the sake of demonstration. The resulting parsed `JSON` file has 715,040 lines. All of that was done in less than 25 seconds. 
37 | 38 | _I have successfully compiled **`470`** repositories at once_ (all repos under the organization I work for). Then I got these specs: 39 | 40 | * `gitlogg.tmp` generated in `154s` (`~2.57mins`) 41 | * `JSON` output parsed in `2792ms` 42 | * `JSON` file size: `121,5MB` 43 | * Commits processed: `118,117` 44 | * Parsed `JSON` file, lines: `3,307,280` 45 | 46 | ## Getting started 47 | 48 | **Gitlogg** requires [NodeJS][2] and [BabelJS][3]. 49 | 50 | 1. Install `NodeJS` (visit [their page][2] to find the right install for your system). 51 | 2. Run `npm run setup`. That will: 52 | 53 | * Install `BabelJS` globally by running `npm install babel-cli -g`. 54 | * Install all the local dependencies, through `npm install`. 55 | * Create the directory in which all repos to be parsed to `JSON` will be at (only on **Simple Mode**). 56 | * Create the directories expected by the scripts that output files. 57 | 58 | ## The `JSON` output 59 | 60 | The output will look like this (first commit for **Font Awesome**): 61 | 62 | [ 63 | { 64 | "repository": "Font-Awesome", 65 | "commit_nr": 1, 66 | "commit_hash": "7ed221e28df1745a20009329033ac690ef000575", 67 | "author_name": "Dave Gandy", 68 | "author_email": "dave@davegandy.com", 69 | "author_date": "Fri Feb 17 09:27:26 2012 -0500", 70 | "author_date_relative": "4 years, 3 months ago", 71 | "author_date_unix_timestamp": "1329488846", 72 | "author_date_iso_8601": "2012-02-17 09:27:26 -0500", 73 | "subject": "first commit", 74 | "subject_sanitized": "first-commit", 75 | "stats": " 1 file changed, 0 insertions(+), 0 deletions(-)", 76 | "time_hour": 9, 77 | "time_minutes": 27, 78 | "time_seconds": 26, 79 | "time_gmt": "-0500", 80 | "date_day_week": "Fri", 81 | "date_month_day": 17, 82 | "date_month_name": "Feb", 83 | "date_month_number": 2, 84 | "date_year": "2012", 85 | "date_iso_8601": "2012-02-17", 86 | "files_changed": 1, 87 | "insertions": 0, 88 | "deletions": 0, 89 | "impact": 0 90 | }, 91 | { 92 | (...) 
93 | }, 94 | { 95 | (...) 96 | } 97 | ] 98 | 99 | Note that many `git log` fields were not printed here, but that's only because I've commented out some of them in the **gitlogg-parse-json.js** script. All the fields below are available. Fields marked with a `*` are either non-standard or not available as placeholders on `--pretty=format:`: 100 | 101 | * repository 102 | * commit_nr 103 | commit_hash 104 | commit_hash_abbreviated 105 | tree_hash 106 | tree_hash_abbreviated 107 | parent_hashes 108 | parent_hashes_abbreviated 109 | author_name 110 | author_name_mailmap 111 | author_email 112 | author_email_mailmap 113 | author_date 114 | author_date_RFC2822 115 | author_date_relative 116 | author_date_unix_timestamp 117 | author_date_iso_8601 118 | author_date_iso_8601_strict 119 | committer_name 120 | committer_name_mailmap 121 | committer_email 122 | committer_email_mailmap 123 | committer_date 124 | committer_date_RFC2822 125 | committer_date_relative 126 | committer_date_unix_timestamp 127 | committer_date_iso_8601 128 | committer_date_iso_8601_strict 129 | ref_names 130 | ref_names_no_wrapping 131 | encoding 132 | subject 133 | subject_sanitized 134 | commit_notes 135 | * stats 136 | * time_hour 137 | * time_minutes 138 | * time_seconds 139 | * time_gmt 140 | * date_day_week 141 | * date_month_day 142 | * date_month_name 143 | * date_month_number 144 | * date_year 145 | * date_iso_8601 146 | * files_changed 147 | * insertions 148 | * deletions 149 | * impact 150 | 151 | ## Creating the `JSON` file 152 | 153 | There are two modes and they are basically the same, except that the **Simple Mode** doesn't require configuration. The **Advanced Mode** requires one to set the absolute path to the directory containing all the repositories you'd like to parse to a single `JSON` file. 
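
Whichever mode is used, the result is a single `JSON` array of commit objects carrying the fields listed above. As a minimal sketch of what that makes possible, a question like "which commits were made on a Sunday at 11pm?" becomes a plain filter. The helper name `commitsAt` is hypothetical and not part of **Gitlogg**; it only assumes that `date_day_week` is a short day name and that `time_hour` is a number, as in the sample output.

```javascript
// Hypothetical helper, not part of gitlogg.
// Assumes each commit object carries `date_day_week` (e.g. "Fri") and a
// numeric `time_hour`, as shown in the sample output.
function commitsAt(commits, dayOfWeek, hour) {
  return commits.filter(
    (commit) => commit.date_day_week === dayOfWeek && commit.time_hour === hour
  );
}

// Possible usage against the generated file:
// const fs = require('fs');
// const commits = JSON.parse(fs.readFileSync('_output/gitlogg.json', 'utf8'));
// console.log(commitsAt(commits, 'Sun', 23).length);
```

Reading the whole file with `JSON.parse` is fine for small outputs. For very large ones a streaming parser, such as the `JSONStream` dependency the parser script already uses, is probably a better fit.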
154 | 155 | #### Simple Mode 156 | 157 | To simplify the generation process to the point that no configuration is required, follow this directory structure: 158 | 159 | gitlogg/ <== This repository's root 160 | ├── scripts/ 161 | │   ├── colors.sh 162 | │   ├── gitlogg-generate-log.sh 163 | │   ├── gitlogg-parse-json.js 164 | │   └── gitlogg.sh 165 | └── _repos/ <== Copy/place/keep your repositories under the folder "_repos/" 166 | ├── repo1 167 | ├── repo2 168 | ├── repo3 169 | └── repo4 170 | 171 | 1. Copy all the repositories you wish to parse to `JSON` to the `_repos/` folder, as shown above. 172 | 173 | 2. Provided that you are within the `gitlogg` folder (this repo's root), run: 174 | 175 | $ npm run gitlogg 176 | 177 | #### Advanced Mode 178 | 179 | To generate the `JSON` file based on repositories in any other location, you'll have to define the path to the folder that contains all your repositories. 180 | 181 | 1. Open [`gitlogg-generate-log.sh`](https://github.com/dreamyguy/gitlogg/blob/master/scripts/gitlogg-generate-log.sh#L4) with an editor of your choice and edit the `yourpath` variable: 182 | 183 | # define the absolute path to the directory that contains all your repositories 184 | yourpath=/absolute/system/path/to/directory/that/contains/all/your/repositories/ 185 | 186 | _**Tip:** drag the folder that contains your repositories to a terminal window, and you'll get the absolute system path to that folder._ 187 | 188 | 2. Provided that you are within the `gitlogg` folder (this repo's root), run: 189 | 190 | $ npm run gitlogg 191 | 192 | #### Parallel Processing 193 | 194 | The parallel processing that was released in [v0.1.8](https://github.com/dreamyguy/gitlogg/tree/v0.1.8) had problems with `xargs` and was temporarily removed. The issue is being dealt with through [pull-request #16](https://github.com/dreamyguy/gitlogg/pull/16). 
195 | 196 | ## The parsed `JSON` file 197 | 198 | > Two files will be generated when running `npm run gitlogg`: **`_tmp/gitlogg.tmp`** and **`_output/gitlogg.json`**. 199 | 200 | gitlogg/ <== This repository's root 201 | ├── scripts/ 202 | │   ├── colors.sh 203 | │   ├── gitlogg-generate-log.sh 204 | │   ├── gitlogg-parse-json.js 205 | │   └── gitlogg.sh 206 | ├── _output/ 207 | │   └── gitlogg.json <== The parsed 'JSON', what we're all after. It's parsed from 'gitlogg.tmp' 208 | └── _tmp/ 209 | └── gitlogg.tmp <== The processed 'git log' 210 | 211 | Two files are necessary because of the nature of the script, which loops through all subdirectories and outputs the `git log` for all valid `git` repositories. Once that loop is done, a valid `JSON` file (`gitlogg.json`) is generated out of `gitlogg.tmp`. 212 | 213 | `gitlogg.tmp` is just a temporary file on which `gitlogg.json` is based. In case the parsing fails, `gitlogg.tmp` can come in handy for debugging. 214 | 215 | ## Further Notes 216 | 217 | #### Debugging 218 | 219 | I've created error messages with suggested solutions, to help you get past the most common issues. 220 | 221 | However, `git log`'s output can break while it's being processed. That's most certainly caused by fields that allow user input, like _commit messages_. These fields may contain characters (like `\r`) that clash with those reserved for the generation of `gitlogg.tmp`, namely `\n`. 222 | 223 | Efforts have been made to mitigate errors by sanitizing characters that have caused errors before, but it might still happen in some edge cases. If it does happen, have a look at the generated `gitlogg.tmp` and see where the expected structure (which is obvious) breaks. Once you have identified the line, have a closer look at the commit and look for an unusual character. 224 | 225 | Post an issue with a link to a _gist_ containing your broken `gitlogg.tmp` and I will try to reproduce the error. 
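
Scanning a large `gitlogg.tmp` by eye can be tedious, so here is a minimal sketch of how the broken lines could be located. It is not part of **Gitlogg**: the helper name `findBrokenLines` and the choice of markers are assumptions, based only on the layout `gitlogg-generate-log.sh` writes, where each line starts with a line number followed by `\tcommits\trepository\t` and the `\t` delimiter is a literal backslash plus `t`, not a tab character.

```javascript
// Hypothetical helper, not part of gitlogg: flags lines of `gitlogg.tmp`
// that do not follow the layout written by `gitlogg-generate-log.sh`.

// A healthy line starts with "<line number>\tcommits\trepository\t", where
// "\t" is a literal backslash followed by "t" (not a tab character).
const LINE_START = /^\d+\\tcommits\\trepository\\t/;

// A few field markers every line is expected to contain (an assumed subset).
const REQUIRED_MARKERS = ['\\tcommit_hash\\t', '\\tsubject\\t', '\\tstats\\t'];

function findBrokenLines(text) {
  const broken = [];
  text.split('\n').forEach((line, index) => {
    if (line === '') return; // skip blank lines, e.g. after the last newline
    const ok =
      LINE_START.test(line) &&
      REQUIRED_MARKERS.every((marker) => line.includes(marker));
    if (!ok) {
      broken.push({ lineNumber: index + 1, preview: line.slice(0, 80) });
    }
  });
  return broken;
}

// Possible usage:
// const fs = require('fs');
// console.log(findBrokenLines(fs.readFileSync('_tmp/gitlogg.tmp', 'utf8')));
```

A line that fails the check is usually the continuation of a commit whose user-supplied text contained a stray line-break, so the commit to inspect is most likely the one on the line just before it.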
226 | 227 | #### Documentation 228 | 229 | Documentation is provided through: 230 | 231 | * Commit messages, 232 | * Commit comments, 233 | * Code comments, 234 | * `README.md` files, like this one. 235 | 236 | Some of the initial commits were done deliberately to show what one gets with short commands like `$ git log`. From that initial state, commits keep on introducing simplicity or complexity to the code, depending on the workflow. That in itself is a form of documentation. In other words, if you're really that interested in details, there are plenty to be had in the code itself and in its own progressive enhancement. 237 | 238 | #### License 239 | 240 | [MIT](LICENSE) 241 | 242 | #### Disclaimer 243 | 244 | This project is by no means the smartest way to parse a `git log` to `JSON`, nor does it aim at becoming so. It is simply a _learn-by-doing_ project in which I experiment with commands available on OSX's Terminal and whatever else I find along the way. 245 | 246 | **Gitlogg** was built and tested on OSX. Though an effort has been made to make it cross-platform, there could be errors on other systems. 247 | 248 | It's certainly not harmful to your repositories and it won't change any data in them. Having said that, it's served _raw_ and _'as is'_. You may get support, but don't expect it nor take it for granted. 249 | 250 | #### Known Issues 251 | 252 | There are _no known issues_ at this point. The parallelization that was introduced in [v0.1.8](https://github.com/dreamyguy/gitlogg/tree/v0.1.8) had issues with `xargs`, so its introduction was temporarily reverted until the problem has been dealt with through [pull-request #16](https://github.com/dreamyguy/gitlogg/pull/16). [v0.1.9](https://github.com/dreamyguy/gitlogg/tree/v0.1.9) was released to revert those changes. 253 | 254 | The [javascript](https://github.com/dreamyguy/gitlogg/tree/javascript) branch is a very fine piece of programming; you should definitely check it out. 
I haven't tested it extensively, but found a few issues, which are reported in the [issue tracker](https://github.com/dreamyguy/gitlogg/issues). 255 | 256 | The current version [v0.2.1](https://github.com/dreamyguy/gitlogg/tree/v0.2.1) is still quite stable after all these years, with no known issues. Try it! :sparkles: 257 | 258 | #### Release History 259 | 260 | * 2018-07-12 [v0.2.1](https://github.com/dreamyguy/gitlogg/tree/v0.2.1) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.2.0...v0.2.1) 261 | * Use `ȝ` instead of `\0` when replacing `\n` during the extraction of `git log`. `\0` is not as reliable as it seemed. 262 | * The main idea here is to use a character that occurs as seldom as possible - preferably never in `git` context. 263 | * `ȝ` (Yogh) is an old English character. If that gives problems, I'll try `ƿ` (Wynn), another abandoned English char. 264 | * 2018-07-11 [v0.2.0](https://github.com/dreamyguy/gitlogg/tree/v0.2.0) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.9...v0.2.0) 265 | * Improve console output readability 266 | * Simplify `JSON` format. 267 | * Reduce filesize of output `JSON`, in some scenarios quite dramatically 268 | * Make it importable into `MongoDB`, which is what is being used on **gitlogg-api** 269 | * Use `\0` instead of `ò` when replacing `\n` during the extraction of `git log`. 270 | * The main idea here is to use a character that occurs as seldom as possible - preferably never in `git` context. 271 | * 2016-12-15 [v0.1.9](https://github.com/dreamyguy/gitlogg/tree/v0.1.9) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.8...v0.1.9) 272 | * Remove parallelization of processes until the problem with `xargs` has been dealt with. 
273 | * 2016-12-14 [v0.1.8](https://github.com/dreamyguy/gitlogg/tree/v0.1.8) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.7...v0.1.8) 274 | * Parse `JSON` through a read/write stream, so we get around the 268MB `Node`'s buffer limitation. 275 | * This limited the whole operation to a number between 173,500 and 174,000 commits. 276 | * Parallelize the generation of `git log` for multiple repos, optionally passing number of processes as a CLI argument. 277 | * Mitigate encoding problems caused by `ISO-8859-1` characters not being properly encoded to `UTF-8`. 278 | * 2016-11-21 [v0.1.7](https://github.com/dreamyguy/gitlogg/tree/v0.1.7) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.6...v0.1.7) 279 | * Better readability for 'Release History' 280 | * Correct url to logo, so it also renders outside Github 281 | * Rename sub-folder 'gitlogg' to 'scripts' to avoid confusion 282 | * Simplify initial setup and running of 'gitlogg' 283 | * Set vars instead of hardcoding values 284 | * Separate scripts from output files 285 | * Introduce 'Debugging' as a 'Further Notes' item 286 | * Tip on how to get the absolute system path to a directory 287 | * Introduce 'View Changes' links under 'Release History' 288 | * 2016-11-19 [v0.1.6](https://github.com/dreamyguy/gitlogg/tree/v0.1.6) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.5...v0.1.6) 289 | * Introduce `commit_nr`, a commit count within each repo 290 | * Show how many repos are about to be processed on console 291 | * Show what repo is being processed on console 292 | * Replace carriage return with space 293 | * 2016-06-12 [v0.1.5](https://github.com/dreamyguy/gitlogg/tree/v0.1.5) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.4...v0.1.5) 294 | * Introduce logo 295 | * Correct wrong reference to 'yourpath' 296 | * Output numbers instead of strings 297 | * 2016-05-23 [v0.1.4](https://github.com/dreamyguy/gitlogg/tree/v0.1.4) - [View 
Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.3...v0.1.4) 298 | * Fix a bug that would break the output in some rare cases 299 | * 2016-05-21 [v0.1.3](https://github.com/dreamyguy/gitlogg/tree/v0.1.3) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.2...v0.1.3) 300 | * Even better error handling 301 | * 2016-05-21 [v0.1.2](https://github.com/dreamyguy/gitlogg/tree/v0.1.2) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.1...v0.1.2) 302 | * Better error handling 303 | * 2016-05-21 [v0.1.1](https://github.com/dreamyguy/gitlogg/tree/v0.1.1) - [View Changes](https://github.com/dreamyguy/gitlogg/compare/v0.1.0...v0.1.1) 304 | * The 'gitlogg' release, the node-based JSON generation 305 | * 2016-05-20 [v0.1.0](https://github.com/dreamyguy/gitlogg/tree/v0.1.0) 306 | * The 'git-log-to-json' release, now considered legacy 307 | 308 | ------------- 309 | 310 | > _Brought to you by [Wallace Sidhrée][1]._ 311 | 312 | [1]: http://sidhree.com/ "Wallace Sidhrée" 313 | [2]: https://nodejs.org/en/ "NodeJS" 314 | [3]: https://babeljs.io/ "BabelJS" 315 | -------------------------------------------------------------------------------- /docs/error-001.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dreamyguy/gitlogg/77419e4d7f8ed2efec485b7271900024e0c49d69/docs/error-001.png -------------------------------------------------------------------------------- /docs/error-002.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dreamyguy/gitlogg/77419e4d7f8ed2efec485b7271900024e0c49d69/docs/error-002.png -------------------------------------------------------------------------------- /docs/gitlogg-icon-github.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/dreamyguy/gitlogg/77419e4d7f8ed2efec485b7271900024e0c49d69/docs/gitlogg-icon-github.png -------------------------------------------------------------------------------- /docs/success.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dreamyguy/gitlogg/77419e4d7f8ed2efec485b7271900024e0c49d69/docs/success.png -------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "gitlogg", 3 | "version": "0.2.1", 4 | "description": "Parse the 'git log' of one or several 'git' repositories into a sanitised and distributable 'JSON' file", 5 | "keywords": [ 6 | "gitlog", 7 | "gitlogg", 8 | "git log", 9 | "git stats", 10 | "json", 11 | "git log json" 12 | ], 13 | "homepage": "https://github.com/dreamyguy/gitlogg", 14 | "repository": { 15 | "type": "git", 16 | "url": "https://github.com/dreamyguy/gitlogg" 17 | }, 18 | "author": { 19 | "name": "Wallace Sidhrée", 20 | "email": "i@dreamyguy.com", 21 | "url": "http://sidhree.com/" 22 | }, 23 | "copyright": "Copyright (c) Wallace Sidhrée - All rights reserved.", 24 | "license": "MIT", 25 | "devDependencies": { 26 | "babel-preset-es2015": "^6.9.0", 27 | "chalk": "^1.1.3" 28 | }, 29 | "scripts": { 30 | "setup": "npm install babel-cli -g && npm install && mkdir -p _output && mkdir -p _repos && mkdir -p _tmp", 31 | "gitlogg": "./scripts/gitlogg.sh" 32 | }, 33 | "engines": { 34 | "node": ">=4.3.0" 35 | }, 36 | "dependencies": { 37 | "JSONStream": "^1.2.1", 38 | "byline": "4.2.2" 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /scripts/colors.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/bash 2 | 3 | # Extracted command line colors for nicer output 4 | 5 | # Text Reset 6 | RCol='\033[0m' 7 | 8 | # Regular 
Bold Underline High Intensity BoldHigh Intens Background High Intensity Backgrounds 9 | Bla='\033[0;30m'; BBla='\033[1;30m'; UBla='\033[4;30m'; IBla='\033[0;90m'; BIBla='\033[1;90m'; On_Bla='\033[40m'; On_IBla='\033[0;100m'; 10 | Red='\033[0;31m'; BRed='\033[1;31m'; URed='\033[4;31m'; IRed='\033[0;91m'; BIRed='\033[1;91m'; On_Red='\033[41m'; On_IRed='\033[0;101m'; 11 | Gre='\033[0;32m'; BGre='\033[1;32m'; UGre='\033[4;32m'; IGre='\033[0;92m'; BIGre='\033[1;92m'; On_Gre='\033[42m'; On_IGre='\033[0;102m'; 12 | Yel='\033[0;33m'; BYel='\033[1;33m'; UYel='\033[4;33m'; IYel='\033[0;93m'; BIYel='\033[1;93m'; On_Yel='\033[43m'; On_IYel='\033[0;103m'; 13 | Blu='\033[0;34m'; BBlu='\033[1;34m'; UBlu='\033[4;34m'; IBlu='\033[0;94m'; BIBlu='\033[1;94m'; On_Blu='\033[44m'; On_IBlu='\033[0;104m'; 14 | Pur='\033[0;35m'; BPur='\033[1;35m'; UPur='\033[4;35m'; IPur='\033[0;95m'; BIPur='\033[1;95m'; On_Pur='\033[45m'; On_IPur='\033[0;105m'; 15 | Cya='\033[0;36m'; BCya='\033[1;36m'; UCya='\033[4;36m'; ICya='\033[0;96m'; BICya='\033[1;96m'; On_Cya='\033[46m'; On_ICya='\033[0;106m'; 16 | Whi='\033[0;37m'; BWhi='\033[1;37m'; UWhi='\033[4;37m'; IWhi='\033[0;97m'; BIWhi='\033[1;97m'; On_Whi='\033[47m'; On_IWhi='\033[0;107m'; 17 | -------------------------------------------------------------------------------- /scripts/gitlogg-generate-log.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | my_dir="$(dirname "$0")" 4 | cd $my_dir 5 | 6 | source "colors.sh" 7 | 8 | cd .. 9 | 10 | # define the absolute path to the directory that contains all your repositories. 
11 | yourpath='./_repos/' 12 | 13 | # define temporary 'git log' output file that will be parsed to 'json' 14 | tempOutputFile='_tmp/gitlogg.tmp' 15 | 16 | # ensure file exists 17 | mkdir -p ${tempOutputFile%%.*} 18 | touch $tempOutputFile 19 | 20 | # name and path to this very script, for output message purposes 21 | thisFile='./scripts/gitlogg-generate-log.sh' 22 | 23 | # define path to 'json' parser 24 | jsonParser='./scripts/gitlogg-parse-json.js' 25 | 26 | # initial message 27 | initialMessage="⚡ ${Pur}~ GITLOGG ~${Yel} ⚡\n\n" 28 | 29 | # ensure there's always a '/' at the end of the 'yourpath' variable, since its value can be changed by user. 30 | case "$yourpath" in 31 | */) 32 | yourpathSanitized="${yourpath}" # no changes if there's already a slash at the end - syntax sugar 33 | ;; 34 | *) 35 | yourpathSanitized="${yourpath}/" # add a slash at the end if there isn't already one 36 | ;; 37 | esac 38 | 39 | # 'thepath' sets the path to each repository under 'yourpath' (the trailing asterix [*/] represents all the repository folders). 
40 | thepath="${yourpathSanitized}*/" 41 | 42 | 43 | # function to trim whitespace 44 | trim() { 45 | local var="$*" 46 | var="${var#'${var%%[![:space:]]*}'}" # remove leading whitespace characters 47 | var="${var%'${var##*[![:space:]]}'}" # remove trailing whitespace characters 48 | echo -n "$var" 49 | } 50 | 51 | # number of directories (repos) under 'thepath' 52 | DIRCOUNT="$(find $thepath -maxdepth 0 -type d | wc -l)" 53 | 54 | # trim whitespace from DIRCOUNT 55 | DIRNR="$(trim $DIRCOUNT)" 56 | 57 | # determine if we're dealing with a singular repo or multiple 58 | if [ "${DIRNR}" -gt "1" ]; then 59 | reporef="all ${Red}${DIRNR}${Yel} repositories" 60 | elif [ "${DIRNR}" -eq "1" ]; then 61 | reporef="the one repository" 62 | fi 63 | 64 | # start counting seconds elapsed 65 | SECONDS=0 66 | 67 | # if the path exists and is not empty 68 | if [ -d "${yourpathSanitized}" ] && [ "$(ls $yourpathSanitized)" ]; then 69 | echo -e "${initialMessage} Generating ${Pur}git log ${Yel}for ${reporef} located at ${Red}'${thepath}'${Yel}. 
${Blu}This might take a while!${RCol}\n" 70 | for dir in $thepath 71 | do 72 | (cd $dir && 73 | echo -e " ${Whi}Outputting ${Pur}${PWD##*/}${RCol}" >&2 && 74 | git log --all --no-merges --shortstat --reverse --pretty=format:'commits\trepository\t'"${PWD##*/}"'\tcommit_hash\t%H\tcommit_hash_abbreviated\t%h\ttree_hash\t%T\ttree_hash_abbreviated\t%t\tparent_hashes\t%P\tparent_hashes_abbreviated\t%p\tauthor_name\t%an\tauthor_name_mailmap\t%aN\tauthor_email\t%ae\tauthor_email_mailmap\t%aE\tauthor_date\t%ad\tauthor_date_RFC2822\t%aD\tauthor_date_relative\t%ar\tauthor_date_unix_timestamp\t%at\tauthor_date_iso_8601\t%ai\tauthor_date_iso_8601_strict\t%aI\tcommitter_name\t%cn\tcommitter_name_mailmap\t%cN\tcommitter_email\t%ce\tcommitter_email_mailmap\t%cE\tcommitter_date\t%cd\tcommitter_date_RFC2822\t%cD\tcommitter_date_relative\t%cr\tcommitter_date_unix_timestamp\t%ct\tcommitter_date_iso_8601\t%ci\tcommitter_date_iso_8601_strict\t%cI\tref_names\t%d\tref_names_no_wrapping\t%D\tencoding\t%e\tsubject\t%s\tsubject_sanitized\t%f\tcommit_notes\t%N\tstats\t' | 75 | iconv -f ISO-8859-1 -t UTF-8 | # convert ISO-8859-1 encoding to UTF-8 76 | sed '/^[ \t]*$/d' | # remove all newlines/line-breaks, including those with empty spaces 77 | tr '\n' 'ȝ' | # convert newlines/line-breaks to a character, so we can manipulate it without much trouble 78 | tr '\r' ' ' | # replace carriage returns with a space, so we avoid new lines popping from placeholders that allow user input 79 | sed 's/tȝcommits/tȝȝcommits/g' | # because some commits have no stats, we have to create an extra line-break to make `paste -d ' ' - -` consistent 80 | tr 'ȝ' '\n' | # bring back all line-breaks 81 | sed '{ 82 | N 83 | s/[)]\n\ncommits/)\ 84 | commits/g 85 | }' | # some rogue mystical line-breaks need to go down to their knees and beg for mercy, which they're not getting 86 | paste -d ' ' - - | # collapse lines so that the `shortstat` is merged with the rest of the commit data, on a single line 87 | awk '{print 
NR"\\t",$0}' | # print line number in front of each line, along with the `\t` delimiter 88 | sed 's/\\t\ commits\\trepo/\\t\commits\\trepo/g' # get rid of the one space that shouldn't be there 89 | ) 90 | done > "${tempOutputFile}" 91 | echo -e "\n ${Gre}The file ${Blu}${tempOutputFile} ${Gre}generated in${RCol}: ${SECONDS}s" && 92 | babel "${jsonParser}" | node # only parse JSON if we have a source to parse it from 93 | # if the path exists but is empty 94 | elif [ -d "${yourpathSanitized}" ] && [ ! "$(ls $yourpathSanitized)" ]; then 95 | echo -e "\n ${Whi}[ERROR 002]: ${Yel}The path to the local repositories ${Red}'${yourpath}'${Yel}, which is set on the file ${Blu}'${thisFile}' ${UYel}exists, but is empty!${RCol}" 96 | echo -e " ${Yel}Please move the repos to ${Red}'${yourpath}'${Yel} or update the variable ${Pur}'yourpath'${Yel} to reflect the absolute path to the directory where the repos are located.${RCol}" 97 | # if the path does not exists 98 | elif [ ! -d "${yourpathSanitized}" ]; then 99 | echo -e "\n ${Whi}[ERROR 001]: ${Yel}The path to the local repositories ${Red}'${yourpath}'${Yel}, which is set on the file ${Blu}'${thisFile}' ${UYel}does not exist!${RCol}" 100 | echo -e " ${Yel}Please create ${Red}'${yourpath}'${Yel} and move the repos under it, or update the variable ${Pur}'yourpath'${Yel} to reflect the absolute path to the directory where the repos are located.${RCol}" 101 | fi 102 | -------------------------------------------------------------------------------- /scripts/gitlogg-parse-json.js: -------------------------------------------------------------------------------- 1 | var fs = require('fs'), 2 | path = require('path'), 3 | chalk = require('chalk'), 4 | byline = require('byline'), 5 | Transform = require('stream').Transform, 6 | JSONStream = require('JSONStream'), 7 | output_file_temp = '_tmp/gitlogg.tmp', 8 | output_file = '_output/gitlogg.json'; 9 | 10 | console.log(chalk.yellow('\n Parsing JSON output...\n')); 11 | 12 | // initialise 
timer 13 | console.time(chalk.green(' JSON output parsed in')); 14 | 15 | // create the streams 16 | var stream = fs.createReadStream(output_file_temp, 'utf8'); 17 | var output = fs.createWriteStream(output_file, 'utf8'); 18 | // handle errors 19 | stream.on('error', function() { 20 | console.log(chalk.red(' Could not read from ' + output_file_temp)); 21 | }); 22 | output.on('error', function() { 23 | console.log(chalk.red(' Something went wrong, ' + output_file + ' could not be written / saved')); 24 | }); 25 | // handle completion callback 26 | output.on('finish', function() { 27 | console.timeEnd(chalk.green(' JSON output parsed in')); 28 | console.log(chalk.green(' The file ' + chalk.blue(output_file) + ' was saved. ' + chalk.yellow('Done! ✨\n'))); 29 | }); 30 | 31 | // stream the stream line by line 32 | stream = byline.createStream(stream); 33 | // create a transform stream 34 | var parser = new Transform({ objectMode: true }); 35 | // use a JSONStream: JSONStream.stringify(open, sep, close) 36 | var jsonToStrings = JSONStream.stringify('[\n ', ',\n ','\n]\n'); 37 | 38 | // output stats according to mode; the first capture group holds the number, so '$1' keeps only that 39 | const getStats = ({ 40 | stats, 41 | mode, // 'files' | 'insertions' | 'deletions' 42 | }) => { 43 | let output = 0; 44 | let rgx = /(.*)/gi; 45 | let match = ''; 46 | if (stats) { 47 | if (mode === 'files') { 48 | rgx = /([0-9]*)(\s)(files?\schanged)/gi; 49 | match = stats.match(rgx); 50 | output = match && match[0] ? match[0].replace(rgx, '$1') : 0; 51 | } 52 | if (mode === 'insertions') { 53 | rgx = /([0-9]*)(\s)(insertions?\(\+\))/gi; 54 | match = stats.match(rgx); 55 | output = match && match[0] ? match[0].replace(rgx, '$1') : 0; 56 | } 57 | if (mode === 'deletions') { 58 | rgx = /([0-9]*)(\s)(deletions?\(\-\))/gi; 59 | match = stats.match(rgx); 60 | output = match && match[0] ? match[0].replace(rgx, '$1') : 0; 61 | } 62 | } 63 | return output ? 
parseInt(output, 10) : 0; 64 | }; 65 | 66 | // decode UTF-8-ized Latin-1/ISO-8859-1 to UTF-8 67 | var decode = function(str) { 68 | var s; 69 | try { 70 | // if the string is UTF-8, this will work and not throw an error. 71 | s = decodeURIComponent(escape(str)); 72 | } catch(e) { 73 | // if it isn't, an error will be thrown, and we can asume that we have an ISO string. 74 | s = str; 75 | } 76 | return s; 77 | }; 78 | 79 | // replace double quotes with single ones 80 | var unquote = function(str) { 81 | if (str === undefined) { 82 | return ''; 83 | } else if (str != '') { 84 | return str.replace(/"/g, "'"); 85 | } else { 86 | return str; 87 | } 88 | }; 89 | 90 | // slice the string as long as it's not empty 91 | var sliceit = function(str) { 92 | if (str === undefined) { 93 | return ''; 94 | } else if (str != '') { 95 | return str.slice(1); 96 | } else { 97 | return str; 98 | } 99 | } 100 | 101 | // Util to extract content within a 'start' and an 'end' string 102 | var extractContent = ({ content, start = '', end = '', sanitized }) => { 103 | var regex = new RegExp(`(${start})([\\s\\S]*?)(${end})`, "gim"); 104 | var extractedFullString = content.match(regex)[0]; 105 | var extractedMidContent = extractedFullString.replace(regex, '$2'); 106 | var extractedMidContentSanitized = extractedMidContent.replace('\\t', '-t'); 107 | var extractedFullStringSanitized = `${extractedFullString.replace(regex, '$1')}${extractedMidContentSanitized}${extractedFullString.replace(regex, '$3')}`; 108 | return sanitized ? 
extractedFullStringSanitized : extractedFullString; 109 | }; 110 | 111 | // Sometimes the separator can appear within these definitions, which are the ones that allow for 'free text' strings 112 | var cleanupSeparators = function(content) { 113 | var match1 = extractContent({ content, start: "author_name\\\\t", end: "\\\\tauthor_name_mailmap" }); 114 | var match2 = extractContent({ content, start: "author_name_mailmap\\\\t", end: "\\\\tauthor_email" }); 115 | var match3 = extractContent({ content, start: "committer_name\\\\t", end: "\\\\tcommitter_name_mailmap" }); 116 | var match4 = extractContent({ content, start: "committer_name_mailmap\\\\t", end: "\\\\tcommitter_email" }); 117 | var match5 = extractContent({ content, start: "subject\\\\t", end: "\\\\tsubject_sanitized" }); 118 | var match1Sanitized = extractContent({ content, start: "author_name\\\\t", end: "\\\\tauthor_name_mailmap", sanitized: true }); 119 | var match2Sanitized = extractContent({ content, start: "author_name_mailmap\\\\t", end: "\\\\tauthor_email", sanitized: true }); 120 | var match3Sanitized = extractContent({ content, start: "committer_name\\\\t", end: "\\\\tcommitter_name_mailmap", sanitized: true }); 121 | var match4Sanitized = extractContent({ content, start: "committer_name_mailmap\\\\t", end: "\\\\tcommitter_email", sanitized: true }); 122 | var match5Sanitized = extractContent({ content, start: "subject\\\\t", end: "\\\\tsubject_sanitized", sanitized: true }); 123 | // console.log('match1', match1); 124 | // console.log('match1Sanitized', match1Sanitized); 125 | var contentSanitized = content.replace(match1, match1Sanitized) 126 | .replace(match2, match2Sanitized) 127 | .replace(match3, match3Sanitized) 128 | .replace(match4, match4Sanitized) 129 | .replace(match5, match5Sanitized); 130 | // console.log('content', contentSanitized); 131 | return contentSanitized; 132 | }; 133 | 134 | // do the transformations, through the transform stream 135 | parser._transform = function(data, 
encoding, done) { 136 | var separator = /\\t/; 137 | var dataDecoded = decode(data); 138 | var dataDecodedClean = cleanupSeparators(dataDecoded); 139 | var c = dataDecodedClean.trim().split(separator); 140 | // console.log(c); 141 | // vars based on sequential values ( sanitise " to ' on fields that accept user input ) 142 | var repository = c[3], // color-consolidator 143 | commit_nr = parseInt(c[0], 10), // 3 144 | commit_hash = c[5], // 5109ad5a394a4873290ff7f7a38b7ca2e1b3b8e1 145 | commit_hash_abbreviated = c[7], // 5109ad5 146 | tree_hash = c[9], // a1606ea8d6e24e1c832b52cb9c04ae1df2242ed4 147 | tree_hash_abbreviated = c[11], // a1606ea 148 | parent_hashes = c[13], // 7082fa621bf93503fe173d06ada3c6111054a62b 149 | parent_hashes_abbreviated = c[15], // 7082fa6 150 | author_name = unquote(c[17]), // Wallace Sidhrée 151 | author_name_mailmap = unquote(c[19]), // Wallace Sidhrée 152 | author_email = c[21], // i@dreamyguy.com 153 | author_email_mailmap = c[23], // i@dreamyguy.com 154 | author_date = c[25], // Fri Jan 3 14:16:56 2014 +0100 155 | author_date_RFC2822 = c[27], // Fri, 3 Jan 2014 14:16:56 +0100 156 | author_date_relative = c[29], // 2 years, 5 months ago 157 | author_date_unix_timestamp = c[31], // 1388755016 158 | author_date_iso_8601 = c[33], // 2014-01-03 14:16:56 +0100 159 | author_date_iso_8601_strict = c[35], // 2014-01-03T14:16:56+01:00 160 | committer_name = unquote(c[37]), // Wallace Sidhrée 161 | committer_name_mailmap = unquote(c[39]), // Wallace Sidhrée 162 | committer_email = c[41], // i@dreamyguy.com 163 | committer_email_mailmap = c[43], // i@dreamyguy.com 164 | committer_date = c[45], // Fri Jan 3 14:16:56 2014 +0100 165 | committer_date_RFC2822 = c[47], // Fri, 3 Jan 2014 14:16:56 +0100 166 | committer_date_relative = c[49], // 2 years, 5 months ago 167 | committer_date_unix_timestamp = c[51], // 1388755016 168 | committer_date_iso_8601 = c[53], // 2014-01-03 14:16:56 +0100 169 | committer_date_iso_8601_strict = c[55], // 
2014-01-03T14:16:56+01:00 170 | ref_names = unquote(c[57]), // "" 171 | ref_names_no_wrapping = unquote(c[59]), // "" 172 | encoding = c[61], // "" 173 | subject = unquote(c[63]), // Upgrade FontAwesome from 3.2.1 to 4.0.3" 174 | subject_sanitized = c[65], // Upgrade-FontAwesome-from-3.2.1-to-4.0.3" 175 | commit_notes = unquote(c[67]), // "" 176 | stats = sliceit(c[69]); // ` 9 files changed, 507 insertions(+), 2102 deletions(-)` 177 | // vars that require manipulation 178 | var time_array = author_date.split(' '), // Fri Jan 3 14:16:56 2014 +0100 => [Fri, Jan, 3, 14:16:56, 2014, +0100] 179 | time_array_clock = time_array[3].split(':'), // 14:16:56 => [14, 16, 56] 180 | time_hour = parseInt(time_array_clock[0], 10), // [14, 16, 56] => 14 181 | time_minutes = parseInt(time_array_clock[1], 10), // [14, 16, 56] => 16 182 | time_seconds = parseInt(time_array_clock[2], 10), // [14, 16, 56] => 56 183 | time_gmt = time_array[5], // [Fri, Jan, 3, 14:16:56, 2014, +0100] => +0100 184 | date_array = author_date_iso_8601.split(' ')[0], // 2014-01-03 14:16:56 +0100 => 2014-01-03 185 | date_day_week = time_array[0], // [Fri, Jan, 3, 14:16:56, 2014, +0100] => Fri 186 | date_iso_8601 = date_array, // 2014-01-03 187 | date_month_day = parseInt(date_array.split('-')[2], 10), // 2014-01-03 => [2014, 01, 03] => 03 188 | date_month_name = time_array[1], // [Fri, Jan, 3, 14:16:56, 2014, +0100] => Jan 189 | date_month_number = parseInt(date_array.split('-')[1], 10), // 2014-01-03 => [2014, 01, 03] => 01 190 | date_year = time_array[4], // [Fri, Jan, 3, 14:16:56, 2014, +0100] => 2014 191 | files_changed = getStats({ stats, mode: 'files' }), // ` 9 files changed, 507 insertions(+), 2102 deletions(-)` => 9 192 | insertions = getStats({ stats, mode: 'insertions' }), // ` 9 files changed, 507 insertions(+), 2102 deletions(-)` => 507 193 | deletions = getStats({ stats, mode: 'deletions' }), // ` 9 files changed, 507 insertions(+), 2102 deletions(-)` => 2102 194 | impact = (insertions - 
deletions); // 507 - 2102 => -1595 195 | // create the object 196 | var obj = { 197 | repository: repository, 198 | commit_nr: commit_nr, 199 | commit_hash: commit_hash, 200 | // commit_hash_abbreviated: commit_hash_abbreviated, 201 | // tree_hash: tree_hash, 202 | // tree_hash_abbreviated: tree_hash_abbreviated, 203 | // parent_hashes: parent_hashes, 204 | // parent_hashes_abbreviated: parent_hashes_abbreviated, 205 | author_name: author_name, 206 | // author_name_mailmap: author_name_mailmap, 207 | author_email: author_email, 208 | // author_email_mailmap: author_email_mailmap, 209 | author_date: author_date, 210 | // author_date_RFC2822: author_date_RFC2822, 211 | author_date_relative: author_date_relative, 212 | author_date_unix_timestamp: author_date_unix_timestamp, 213 | author_date_iso_8601: author_date_iso_8601, 214 | // author_date_iso_8601_strict: author_date_iso_8601_strict, 215 | // committer_name: committer_name, 216 | // committer_name_mailmap: committer_name_mailmap, 217 | // committer_email: committer_email, 218 | // committer_email_mailmap: committer_email_mailmap, 219 | // committer_date: committer_date, 220 | // committer_date_RFC2822: committer_date_RFC2822, 221 | // committer_date_relative: committer_date_relative, 222 | // committer_date_unix_timestamp: committer_date_unix_timestamp, 223 | // committer_date_iso_8601: committer_date_iso_8601, 224 | // committer_date_iso_8601_strict: committer_date_iso_8601_strict, 225 | // ref_names: ref_names, 226 | // ref_names_no_wrapping: ref_names_no_wrapping, 227 | // encoding: encoding, 228 | subject: subject, 229 | subject_sanitized: subject_sanitized, 230 | // commit_notes: commit_notes, 231 | stats: stats, 232 | time_hour: time_hour, 233 | time_minutes: time_minutes, 234 | time_seconds: time_seconds, 235 | time_gmt: time_gmt, 236 | date_day_week: date_day_week, 237 | date_month_day: date_month_day, 238 | date_month_name: date_month_name, 239 | date_month_number: date_month_number, 240 | date_year: 
date_year, 241 | date_iso_8601: date_iso_8601, 242 | files_changed: files_changed, 243 | insertions: insertions, 244 | deletions: deletions, 245 | impact: impact 246 | }; 247 | this.push(obj); 248 | done(); 249 | }; 250 | 251 | // initialise stream 252 | stream 253 | .pipe(parser) 254 | .pipe(jsonToStrings) 255 | .pipe(output); 256 | -------------------------------------------------------------------------------- /scripts/gitlogg.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | bash ./scripts/gitlogg-generate-log.sh 4 | --------------------------------------------------------------------------------
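The `getStats` logic in `scripts/gitlogg-parse-json.js` turns a stats summary such as ` 9 files changed, 507 insertions(+), 2102 deletions(-)` into numbers, and `impact` is then computed as insertions minus deletions. The sketch below illustrates that idea in isolation. `parseShortstat` and `pick` are hypothetical names that do not exist in the repository, and the sketch assumes the summary line always uses the wording shown in the source comments.

```javascript
// Hypothetical helper, not part of gitlogg: parse a git stats summary line
// into the same four numeric fields the parser exposes.
function parseShortstat(stats) {
  // Return the integer captured by `pattern`, or 0 when the pattern is absent
  // (git omits the insertions or deletions part when the count is zero).
  const pick = (pattern) => {
    const match = pattern.exec(stats || '');
    return match ? parseInt(match[1], 10) : 0;
  };
  const files_changed = pick(/(\d+) files? changed/);
  const insertions = pick(/(\d+) insertions?\(\+\)/);
  const deletions = pick(/(\d+) deletions?\(-\)/);
  // impact is the cumulative change brought by a single commit
  return { files_changed, insertions, deletions, impact: insertions - deletions };
}

console.log(parseShortstat(' 9 files changed, 507 insertions(+), 2102 deletions(-)'));
// prints { files_changed: 9, insertions: 507, deletions: 2102, impact: -1595 }
```

Using `RegExp.prototype.exec` with a single capture group avoids the match-then-replace round trip, at the cost of diverging from the structure of the original function.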