├── .gitignore
├── LICENSE
├── README.md
├── img
    ├── hist_sample.png
    ├── pie_pg.png
    ├── plot_func.png
    ├── scatter_sample.png
    └── scatter_temperature.png
├── matplotcli.py
├── sample.json
└── setup.py


/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2022 Daniel Moura

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# MatplotCLI

## Create matplotlib visualizations from the command-line

MatplotCLI is a simple utility to quickly create plots from the command-line, leveraging [Matplotlib](https://github.com/matplotlib/matplotlib).

```sh
plt "scatter(x,y,5,alpha=0.05); axis('scaled')" < sample.json
```

![](https://github.com/dcmoura/matplotcli/raw/main/img/scatter_sample.png)

```sh
plt "hist(x,30)" < sample.json
```

![](https://github.com/dcmoura/matplotcli/raw/main/img/hist_sample.png)

MatplotCLI accepts both [JSON lines](https://jsonlines.org) and arrays of JSON objects as input.
Look at the recipes section to learn how to handle other formats like CSV.

MatplotCLI executes python code (passed as argument) where some handy imports are already done (e.g. `from matplotlib.pyplot import *`) and where the input JSON data is already parsed and available in variables, making plotting easy. Please refer to `matplotlib.pyplot`'s [reference](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html#module-matplotlib.pyplot) and [tutorial](https://matplotlib.org/stable/tutorials/introductory/pyplot.html) for comprehensive documentation about the plotting commands.

Data from the input JSON is made available in the following way.
Given the input `myfile.json`:

```json
{"a": 1, "b": 2}
{"a": 10, "b": 20}
{"a": 30, "c$d": 40}
```

The following variables are made available:

```python
data = {
    "a": [1, 10, 30],
    "b": [2, 20, None],
    "c_d": [None, None, 40]
}

a = [1, 10, 30]
b = [2, 20, None]
c_d = [None, None, 40]

col_names = ("a", "b", "c_d")
```

So, for a scatter plot `a vs b`, you could simply do:

```sh
plt "scatter(a,b); title('a vs b')" < myfile.json
```

Notice that the names of JSON properties are converted into valid Python identifiers whenever they are not (e.g. `c$d` was converted into `c_d`).

## Execution flow

1. Import matplotlib and other libs;
2. Read JSON data from standard input;
3. Execute user code;
4. Show the plot.

All steps except step 3 can be skipped through command-line options (`--no-import`, `--no-input` and `--no-show`, respectively).


## Installation

The easiest way to install MatplotCLI is with `pip`:

```sh
pip install matplotcli
```



## Recipes and Examples

### Plotting JSON data

MatplotCLI natively supports JSON lines:

```sh
echo '
{"a":0, "b":1}
{"a":1, "b":0}
{"a":3, "b":3}' |
plt "plot(a,b)"
```

and arrays of JSON objects:


```sh
echo '[
{"a":0, "b":1},
{"a":1, "b":0},
{"a":3, "b":3}]' |
plt "plot(a,b)"
```


### Plotting from a CSV

[SPyQL](https://github.com/dcmoura/spyql) is a data querying tool that allows running SQL queries with Python expressions on top of different data formats. Here, SPyQL reads a CSV file, automatically detects whether there is a header row, the dialect and the data type of each column, and converts the output to JSON lines before handing it over to MatplotCLI.
```sh
cat my.csv | spyql "SELECT * FROM csv TO json" | plt "plot(x,y)"
```

### Plotting from YAML/XML/TOML

[yq](https://kislyuk.github.io/yq/#) converts YAML, XML and TOML files to JSON, making it easy to plot any of these with MatplotCLI.

```sh
cat file.yaml | yq -c | plt "plot(x,y)"
```
```sh
cat file.xml | xq -c | plt "plot(x,y)"
```
```sh
cat file.toml | tomlq -c | plt "plot(x,y)"
```

### Plotting from a parquet file

`parquet-tools` allows dumping a parquet file to JSON format. [`jq -c`](https://stedolan.github.io/jq/) makes sure that the output has 1 JSON object per line before handing over to MatplotCLI.

```sh
parquet-tools cat --json my.parquet | jq -c | plt "plot(x,y)"
```

### Plotting from a database

Database CLIs typically have an option to output query results in CSV format (e.g. `psql --csv -c query` for PostgreSQL, `sqlite3 -csv -header file.db query` for SQLite).

Here we are visualizing how much space each namespace is taking in a PostgreSQL database.
[SPyQL](https://github.com/dcmoura/spyql) converts the CSV output of the psql client to JSON lines, and makes sure there are no more than 10 items by aggregating the smaller namespaces into an `All others` category.
Finally, MatplotCLI makes a pie chart based on the space each namespace is taking.
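The "no more than 10 items" step can be sketched in plain Python. This only approximates the SPyQL query used in the pipeline, under the assumption that rows arrive sorted by size in descending order; `top_n_with_others` is a hypothetical helper, not part of SPyQL or MatplotCLI.

```python
def top_n_with_others(rows, n=10, other_label="All others"):
    """Keep the first n-1 (name, size) pairs and merge the rest into one.

    Assumes `rows` is already sorted by size, largest first.
    """
    head = list(rows[: n - 1])
    tail = rows[n - 1 :]
    if tail:
        head.append((other_label, sum(size for _, size in tail)))
    return head


# Example with n=3: the two largest namespaces survive, the rest are merged
sizes = [("public", 5), ("audit", 3), ("tmp", 2), ("misc", 1)]
print(top_n_with_others(sizes, n=3))
# [('public', 5), ('audit', 3), ('All others', 3)]
```

Capping the number of slices keeps the pie chart labels readable. The full pipeline, using `psql`, SPyQL and MatplotCLI: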
```sh
psql -U myuser mydb --csv -c '
    SELECT
        N.nspname,
        sum(pg_relation_size(C.oid))*1e-6 AS size_mb
    FROM pg_class C
    LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
    GROUP BY 1
    ORDER BY 2 DESC' |
spyql "
    SELECT
        nspname if row_number < 10 else 'All others' as name,
        sum_agg(size_mb) AS size_mb
    FROM csv
    GROUP BY 1
    TO json" |
plt "
nice_labels = ['{0}\n{1:,.0f} MB'.format(n,s) for n,s in zip(name,size_mb)];
pie(size_mb, labels=nice_labels, autopct='%1.f%%', pctdistance=0.8, rotatelabels=True)"
```

![](https://github.com/dcmoura/matplotcli/raw/main/img/pie_pg.png)


### Plotting a function

Disable reading from stdin and generate the data with `numpy`:

```sh
plt --no-input "
x = np.linspace(-1,1,2000);
y = x*np.sin(1/x);
plot(x,y);
axis('scaled');
grid(True)"
```
![](https://github.com/dcmoura/matplotcli/raw/main/img/plot_func.png)


### Saving the plot to an image

Save the output without showing the interactive window:

```sh
cat sample.json |
plt --no-show "
hist(x,30);
savefig('myimage.png', bbox_inches='tight')"
```

### Plot of the global temperature

Here's a complete pipeline from getting the data to transforming and plotting it:

1. Downloading a CSV file with `curl`;
2. Skipping the first row with `sed`;
3. Grabbing the year column and 12 columns with monthly temperatures to an array and converting to JSON lines format using [SPyQL](https://github.com/dcmoura/spyql);
4. Exploding the monthly array with SPyQL (resulting in 12 rows per year) while removing invalid monthly measurements;
5. Plotting with MatplotCLI.
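Step 4 is the least obvious one, so here is a rough Python equivalent of the explode-and-filter logic. It is a sketch under assumptions, not SPyQL's implementation: `explode_monthly` is a hypothetical helper, each input row is assumed to look like `{"Year": ..., "temps": [12 values]}`, and invalid measurements are assumed to arrive as `None`.

```python
def explode_monthly(rows):
    """Expand one row per year into one row per valid monthly measurement.

    The month index (0..11) is turned into a fraction of a year so that
    points spread evenly along the x axis.
    """
    out = []
    for row in rows:
        for month, temp in enumerate(row["temps"]):
            if temp is None:  # assumed marker for an invalid measurement
                continue
            out.append({"year": row["Year"] + month / 12, "temp": temp})
    return out


# Example: a year with only January and March available
sample = [{"Year": 1880, "temps": [-0.18, None, -0.09] + [None] * 9}]
for point in explode_monthly(sample):
    print(point)
```

Here January maps to `1880.0` and March to `1880 + 2/12`, which mirrors the `Year + ((row_number-1)%12)/12` expression in the second SPyQL query. The actual pipeline does all of this on the command line: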
```sh
curl https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts+dSST.csv |
sed 1d |
spyql "
    SELECT Year, cols[1:13] AS temps
    FROM csv
    TO json" |
spyql "
    SELECT
        json->Year + ((row_number-1)%12)/12 AS year,
        json->temps AS temp
    FROM json
    EXPLODE json->temps
    WHERE json->temps is not Null
    TO json" |
plt "
scatter(year, temp, 2, temp);
xlabel('Year');
ylabel('Temperature anomaly w.r.t. 1951-80 (°C)');
title('Global surface temperature (land and ocean)')"
```

![](https://github.com/dcmoura/matplotcli/raw/main/img/scatter_temperature.png)

--------------------------------------------------------------------------------
/img/hist_sample.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dcmoura/matplotcli/f291f3d700094679b60db7318d40b7bb3a37a585/img/hist_sample.png

--------------------------------------------------------------------------------
/img/pie_pg.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dcmoura/matplotcli/f291f3d700094679b60db7318d40b7bb3a37a585/img/pie_pg.png

--------------------------------------------------------------------------------
/img/plot_func.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dcmoura/matplotcli/f291f3d700094679b60db7318d40b7bb3a37a585/img/plot_func.png

--------------------------------------------------------------------------------
/img/scatter_sample.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dcmoura/matplotcli/f291f3d700094679b60db7318d40b7bb3a37a585/img/scatter_sample.png

--------------------------------------------------------------------------------
/img/scatter_temperature.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dcmoura/matplotcli/f291f3d700094679b60db7318d40b7bb3a37a585/img/scatter_temperature.png

--------------------------------------------------------------------------------
/matplotcli.py:
--------------------------------------------------------------------------------
import sys
import json
import re
import argparse


def make_imports(vars):
    """initializes dict of variables for user code with imports"""
    exec(
        """
from matplotlib.pyplot import *
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import cm
from math import *
import numpy as np
""",
        {},
        vars,
    )


def make_str_valid_varname(s):
    """makes a string a valid python identifier"""
    # replace invalid characters with spaces, then strip leading/trailing spaces
    s = re.sub(r"[^0-9a-zA-Z_\s]", " ", s).strip()

    # replace spaces by underscores (instead of dropping spaces) for readability
    s = re.sub(r"\s+", "_", s)

    # if first char is not letter or underscore then add underscore to make it valid
    if not re.match("^[a-zA-Z_]", s):
        s = "_" + s

    return s


def read_data(vars):
    """reads json lines or an array of jsons from stdin"""
    input_str = sys.stdin.read().lstrip()  # assuming we won't be loading GB files...
    lines = []
    if input_str:
        first_char = input_str[0]
        if first_char == "{":
            sys.stderr.write("plt input is JSON lines\n")
            # skip blank lines so that stray empty lines do not break parsing
            lines = [
                json.loads(line) for line in input_str.splitlines() if line.strip()
            ]
        elif first_char == "[":
            sys.stderr.write("plt input is a JSON array\n")
            lines = json.loads(input_str)
        else:
            raise Exception(
                "plt unable to determine input format, unexpected char '{}'.\n".format(
                    first_char
                )
            )

    # dict.fromkeys keeps first-seen column order (Python 3.7+), unlike a set
    col_names = list(dict.fromkeys(key for line in lines for key in line))
    data = {
        make_str_valid_varname(col): [line.get(col) for line in lines]
        for col in col_names
    }
    vars.update({"col_names": tuple(data.keys()), "data": data})
    vars.update(data)  # for not having to go through `data` to access input columns


def parse_args():
    """defines and parses command-line arguments"""
    parser = argparse.ArgumentParser(
        description="""plt reads JSON lines or an array of JSON objects from
        the stdin and executes python code with calls to matplotlib.pyplot
        functions to generate plots. For more information and examples visit:
        https://github.com/dcmoura/matplotcli"""
    )
    parser.add_argument(
        "code",
        help="""
        python code with calls to matplotlib.pyplot to generate plots.
        Example: plt "plot(x,y); xlabel('time (s)')" < data.json
        """,
    )
    parser.add_argument(
        "--no-import",
        action="store_true",
        help="do not import (matplotlib) modules before executing the code",
    )
    parser.add_argument(
        "--no-input",
        action="store_true",
        help="do not read input data from stdin before executing the code",
    )
    parser.add_argument(
        "--no-show",
        action="store_true",
        help="do not call matplotlib.pyplot.show() after executing the code",
    )
    return parser.parse_args()


def main():
    sys.tracebacklimit = 1  # keep error traces short
    args = parse_args()  # parses cmd-line arguments
    vars = dict()  # variables to execute the user code
    if not args.no_import:
        make_imports(vars)  # add imports to the variables
    if not args.no_input:
        read_data(vars)  # add the input data to the variables
    exec(args.code, vars, vars)  # execute user code
    if not args.no_show:
        # tries showing the plot (assuming default/typical imports)
        try:
            exec("show()", vars, vars)
        except NameError:
            try:
                sys.stderr.write("Could not find show(), trying plt.show()\n")
                exec("plt.show()", vars, vars)
            except NameError:
                sys.stderr.write(
                    "Could not find plt.show(), skipping...\n",
                )


if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python

from setuptools import setup

with open("README.md", encoding="utf-8") as readme_file:
    readme = readme_file.read()

setup(
    author="Daniel C. Moura",
    author_email="daniel.c.moura@gmail.com",
    python_requires=">=2.7",
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Environment :: Console",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: MIT License",
        "Natural Language :: English",
        "Programming Language :: Python",
        "Programming Language :: Python :: 2",
        "Programming Language :: Python :: 2.7",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Topic :: Scientific/Engineering :: Visualization",
        "Topic :: Utilities",
    ],
    description="MatplotCLI: create matplotlib visualizations from the command-line",
    py_modules=['matplotcli'],
    entry_points={
        "console_scripts": [
            "plt=matplotcli:main",
        ],
    },
    install_requires=["matplotlib", "numpy"],
    license="MIT license",
    long_description=readme,
    long_description_content_type="text/markdown",
    keywords="visualization plots json",
    name="matplotcli",
    url="https://github.com/dcmoura/matplotcli",
    project_urls={
        "Source": "https://github.com/dcmoura/matplotcli",
    },
    version="0.2.0-1",
)

--------------------------------------------------------------------------------