├── .gitignore ├── LICENSE ├── README.md ├── docker-compose.yml.sample ├── pg_extras ├── __init__.py └── queries │ ├── all_locks.sql │ ├── bloat.sql │ ├── blocking.sql │ ├── cache_hit.sql │ ├── calls.sql │ ├── extensions.sql │ ├── index_cache_hit.sql │ ├── index_size.sql │ ├── index_usage.sql │ ├── kill_all.sql │ ├── locks.sql │ ├── long_running_queries.sql │ ├── mandelbrot.sql │ ├── null_indexes.sql │ ├── outliers.sql │ ├── records_rank.sql │ ├── seq_scans.sql │ ├── table_cache_hit.sql │ ├── table_indexes_size.sql │ ├── table_size.sql │ ├── total_index_size.sql │ ├── total_table_size.sql │ ├── unused_indexes.sql │ └── vacuum_stats.sql └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | build/ 2 | dist/ 3 | *.egg-info 4 | *.swp 5 | *.pyc 6 | __pycache__ 7 | docker-compose.yml 8 | 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright © Paweł Urbanek 2020 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining 6 | a copy of this software and associated documentation files (the 7 | "Software"), to deal in the Software without restriction, including 8 | without limitation the rights to use, copy, modify, merge, publish, 9 | distribute, sublicense, and/or sell copies of the Software, and to 10 | permit persons to whom the Software is furnished to do so, subject to 11 | the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be 14 | included in all copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 19 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 20 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 21 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 22 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Python PG Extras [![PyPI version](https://badge.fury.io/py/pg-extras.svg)](https://badge.fury.io/py/pg-extras) 2 | 3 | Python port of [Heroku PG Extras](https://github.com/heroku/heroku-pg-extras) with several additions and improvements. The goal of this project is to provide powerful insights into the PostgreSQL database for Python apps that are not using the Heroku PostgreSQL plugin. 4 | 5 | Queries can be used to obtain information about a Postgres instance that may be useful when analyzing performance issues. This includes information about locks, index usage, buffer cache hit ratios and vacuum statistics. The Python API enables developers to easily integrate the tool into, e.g., automatic monitoring tasks. 6 | 7 | You can check out this blog post for a detailed, step-by-step tutorial on how to [optimize PostgreSQL using the PG Extras library](https://pawelurbanek.com/postgresql-fix-performance).
8 | 9 | Alternative versions: 10 | 11 | - [Ruby](https://github.com/pawurb/ruby-pg-extras) 12 | 13 | - [Ruby on Rails](https://github.com/pawurb/rails-pg-extras) 14 | 15 | - [NodeJS](https://github.com/pawurb/node-postgres-extras) 16 | 17 | - [Elixir](https://github.com/pawurb/ecto_psql_extras) 18 | 19 | - [Haskell](https://github.com/pawurb/haskell-pg-extras) 20 | 21 | ## Installation 22 | 23 | ```bash 24 | pip install pg-extras 25 | ``` 26 | 27 | Some of the queries (e.g., `calls` and `outliers`) require the [pg_stat_statements](https://www.postgresql.org/docs/current/pgstatstatements.html) extension to be enabled. 28 | 29 | You can check if it is enabled in your database by running: 30 | 31 | ```python 32 | PGExtras.query('extensions') 33 | ``` 34 | You should see a similar line in the output: 35 | 36 | ```bash 37 | | pg_stat_statements | 1.7 | 1.7 | track execution statistics of all SQL statements executed | 38 | ``` 39 | 40 | ## Usage 41 | 42 | The library expects the `os.environ['DATABASE_URL']` value to be in the following format: 43 | 44 | ```python 45 | "postgresql://postgres:secret@localhost:5432/database_name" 46 | ``` 47 | 48 | Alternatively, you can pass it directly to the method: 49 | 50 | ```python 51 | PGExtras.query('cache_hit', database_url="postgresql://postgres:secret@localhost:5432/database_name") 52 | ``` 53 | 54 | You can run queries using a simple Python API: 55 | 56 | ```python 57 | from pg_extras import PGExtras 58 | 59 | PGExtras.query('cache_hit') 60 | ``` 61 | ```bash 62 | +----------------+------------------------+ 63 | | Index and table hit rate | 64 | +----------------+------------------------+ 65 | | name | ratio | 66 | +----------------+------------------------+ 67 | | index hit rate | 0.97796610169491525424 | 68 | | table hit rate | 0.96724294813466787989 | 69 | +----------------+------------------------+ 70 | ``` 71 | 72 | By default, an ASCII table is displayed. Alternatively, you can return the raw query object: 73 | 74 | ```python 75 | result = PGExtras.query('cache_hit', output='raw') 76 | 77 | type(result) # => 78 | result.keys() # => ['name', 'ratio'] 79 | result.fetchall() # => [('index hit rate', Decimal('0.939...')), ('table hit rate', Decimal('0.986...'))] 80 | 81 | ``` 82 | 83 | ## Available methods 84 | 85 | ### `cache_hit` 86 | 87 | ```python 88 | 89 | PGExtras.query('cache_hit') 90 | 91 | name | ratio 92 | ----------------+------------------------ 93 | index hit rate | 0.99957765013541945832 94 | table hit rate | 1.00 95 | (2 rows) 96 | ``` 97 | 98 | This command provides information on the efficiency of the buffer cache, for both index reads (`index hit rate`) as well as table reads (`table hit rate`). A low buffer cache hit ratio can be a sign that the Postgres instance is too small for the workload. 99 | 100 | [More info](https://pawelurbanek.com/postgresql-fix-performance#cache-hit) 101 | 102 | ### `index_cache_hit` 103 | 104 | ```python 105 | 106 | PGExtras.query('index_cache_hit') 107 | 108 | | name | buffer_hits | block_reads | total_read | ratio | 109 | +-----------------------+-------------+-------------+------------+-------------------+ 110 | | teams | 187665 | 109 | 187774 | 0.999419514948821 | 111 | | subscriptions | 5160 | 6 | 5166 | 0.99883855981417 | 112 | | plans | 5718 | 9 | 5727 | 0.998428496595076 | 113 | (truncated results for brevity) 114 | ``` 115 | 116 | The same as `cache_hit`, with each table's index cache hit info displayed separately.
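Cache hit ratios like these are a natural fit for the automated monitoring mentioned in the Usage section. Below is a minimal sketch of such a check, built only on the raw output API documented above; the `0.95` threshold and the `alert()` helper are illustrative assumptions, not part of the library:

```python
from decimal import Decimal

from pg_extras import PGExtras


def alert(message):
    # Stand-in for a real notification channel (log aggregator, Slack, pager, etc.)
    print(f"WARNING: {message}")


def check_cache_hit(threshold=Decimal("0.95")):
    # output='raw' returns the SQLAlchemy result instead of printing an ASCII table
    result = PGExtras.query('cache_hit', output='raw')
    for name, ratio in result.fetchall():
        # ratio can be None on a fresh instance, before any blocks have been read
        if ratio is not None and ratio < threshold:
            alert(f"{name} dropped below {threshold}: {ratio}")


check_cache_hit()
```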
117 | 118 | [More info](https://pawelurbanek.com/postgresql-fix-performance#cache-hit) 119 | 120 | ### `table_cache_hit` 121 | 122 | ```python 123 | 124 | PGExtras.query('table_cache_hit') 125 | 126 | | name | buffer_hits | block_reads | total_read | ratio | 127 | +-----------------------+-------------+-------------+------------+-------------------+ 128 | | plans | 32123 | 2 | 32125 | 0.999937743190662 | 129 | | subscriptions | 95021 | 8 | 95029 | 0.999915815172211 | 130 | | teams | 171637 | 200 | 171837 | 0.99883610631005 | 131 | (truncated results for brevity) 132 | ``` 133 | 134 | The same as `cache_hit`, with each table's cache hit info displayed separately. 135 | 136 | [More info](https://pawelurbanek.com/postgresql-fix-performance#cache-hit) 137 | 138 | ### `index_usage` 139 | 140 | ```python 141 | 142 | PGExtras.query('index_usage') 143 | 144 | relname | percent_of_times_index_used | rows_in_table 145 | ---------------------+-----------------------------+--------------- 146 | events | 65 | 1217347 147 | app_infos | 74 | 314057 148 | app_infos_user_info | 0 | 198848 149 | user_info | 5 | 94545 150 | delayed_jobs | 27 | 0 151 | (5 rows) 152 | ``` 153 | 154 | This command provides information on the efficiency of indexes, represented as what percentage of total scans were index scans. A low percentage can indicate under-indexing, or the wrong data being indexed. 155 | 156 | ### `locks` 157 | 158 | ```python 159 | 160 | PGExtras.query('locks') 161 | 162 | procpid | relname | transactionid | granted | query_snippet | mode | age 163 | ---------+---------+---------------+---------+-----------------------+------------------------------------- 164 | 31776 | | | t | in transaction | ExclusiveLock | 00:19:29.837898 165 | 31776 | | 1294 | t | in transaction | RowExclusiveLock | 00:19:29.837898 166 | 31912 | | | t | select * from hello; | ExclusiveLock | 00:19:17.94259 167 | 3443 | | | t | +| ExclusiveLock | 00:00:00 168 | | | | | select +| | 169 | | | | | pg_stat_activi | | 170 | (4 rows) 171 | ``` 172 | 173 | This command displays queries that have taken out an exclusive lock on a relation. Exclusive locks typically prevent other operations on that relation from taking place, and can be a cause of "hung" queries that are waiting for a lock to be granted. 174 | 175 | [More info](https://pawelurbanek.com/postgresql-fix-performance#deadlocks) 176 | 177 | ### `all_locks` 178 | 179 | ```python 180 | 181 | PGExtras.query('all_locks') 182 | 183 | ``` 184 | 185 | This command displays all the current locks, regardless of their type. 186 | 187 | ### `outliers` 188 | 189 | ```python 190 | 191 | PGExtras.query('outliers') 192 | 193 | qry | exec_time | prop_exec_time | ncalls | sync_io_time 194 | -----------------------------------------+------------------+----------------+-------------+-------------- 195 | SELECT * FROM archivable_usage_events.. | 154:39:26.431466 | 72.2% | 34,211,877 | 00:00:00 196 | COPY public.archivable_usage_events (.. | 50:38:33.198418 | 23.6% | 13 | 13:34:21.00108 197 | COPY public.usage_events (id, reporte.. | 02:32:16.335233 | 1.2% | 13 | 00:34:19.784318 198 | INSERT INTO usage_events (id, retaine.. | 01:42:59.436532 | 0.8% | 12,328,187 | 00:00:00 199 | SELECT * FROM usage_events WHERE (alp.. | 01:18:10.754354 | 0.6% | 102,114,301 | 00:00:00 200 | UPDATE usage_events SET reporter_id =.. | 00:52:35.683254 | 0.4% | 23,786,348 | 00:00:00 201 | INSERT INTO usage_events (id, retaine..
| 00:49:24.952561 | 0.4% | 21,988,201 | 00:00:00 202 | (truncated results for brevity) 203 | ``` 204 | 205 | This command displays statements, obtained from `pg_stat_statements`, ordered by the amount of time to execute in aggregate. This includes the statement itself, the total execution time for that statement, the proportion of total execution time for all statements that statement has taken up, the number of times that statement has been called, and the amount of time that statement spent on synchronous I/O (reading/writing from the file system). 206 | 207 | Typically, an efficient query will have an appropriate ratio of calls to total execution time, with as little time spent on I/O as possible. Queries that have a high total execution time but low call count should be investigated to improve their performance. Queries that have a high proportion of execution time being spent on synchronous I/O should also be investigated. 208 | 209 | [More info](https://pawelurbanek.com/postgresql-fix-performance#missing-indexes) 210 | 211 | ### `calls` 212 | 213 | ```python 214 | 215 | PGExtras.query('calls') 216 | 217 | qry | exec_time | prop_exec_time | ncalls | sync_io_time 218 | -----------------------------------------+------------------+----------------+-------------+-------------- 219 | SELECT * FROM usage_events WHERE (alp.. | 01:18:11.073333 | 0.6% | 102,120,780 | 00:00:00 220 | BEGIN | 00:00:51.285988 | 0.0% | 47,288,662 | 00:00:00 221 | COMMIT | 00:00:52.31724 | 0.0% | 47,288,615 | 00:00:00 222 | SELECT * FROM archivable_usage_event.. | 154:39:26.431466 | 72.2% | 34,211,877 | 00:00:00 223 | UPDATE usage_events SET reporter_id =.. | 00:52:35.986167 | 0.4% | 23,788,388 | 00:00:00 224 | INSERT INTO usage_events (id, retaine.. | 00:49:25.260245 | 0.4% | 21,990,326 | 00:00:00 225 | INSERT INTO usage_events (id, retaine.. | 01:42:59.436532 | 0.8% | 12,328,187 | 00:00:00 226 | (truncated results for brevity) 227 | ``` 228 | 229 | This command is much like `outliers`, but ordered by the number of times a statement has been called. 230 | 231 | [More info](https://pawelurbanek.com/postgresql-fix-performance#missing-indexes) 232 | 233 | ### `blocking` 234 | 235 | ```python 236 | 237 | PGExtras.query('blocking') 238 | 239 | blocked_pid | blocking_statement | blocking_duration | blocking_pid | blocked_statement | blocked_duration 240 | -------------+--------------------------+-------------------+--------------+------------------------------------------------------------------------------------+------------------ 241 | 461 | select count(*) from app | 00:00:03.838314 | 15682 | UPDATE "app" SET "updated_at" = '2013-03-04 15:07:04.746688' WHERE "id" = 12823149 | 00:00:03.821826 242 | (1 row) 243 | ``` 244 | 245 | This command displays statements that are currently holding locks that other statements are waiting on to be released. This can be used in conjunction with `locks` to determine which statements need to be terminated in order to resolve lock contention. 246 | 247 | [More info](https://pawelurbanek.com/postgresql-fix-performance#deadlocks) 248 | 249 | ### `total_index_size` 250 | 251 | ```python 252 | 253 | PGExtras.query('total_index_size') 254 | 255 | size 256 | ------- 257 | 28194 MB 258 | (1 row) 259 | ``` 260 | 261 | This command displays the total size of all indexes on the database, in MB. It is calculated by taking the number of pages (reported in `relpages`) and multiplying it by the page size (8192 bytes).
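To make the page arithmetic concrete, here is a small sketch; the `relpages` total is a hypothetical value, chosen only so that it reproduces the 28194 MB figure shown above:

```python
# total_index_size and index_size report: relpages * block size (8192 bytes by default)
relpages_total = 3_608_832              # hypothetical sum of relpages over all indexes
size_bytes = relpages_total * 8192
size_mb = size_bytes / (1024 * 1024)    # pg_size_pretty's "MB" is binary (mebibytes)
print(f"{size_mb:.0f} MB")              # -> 28194 MB
```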
262 | 263 | ### `index_size` 264 | 265 | ```python 266 | 267 | PGExtras.query('index_size') 268 | 269 | name | size 270 | ---------------------------------------------------------------+--------- 271 | idx_activity_attemptable_and_type_lesson_enrollment | 5196 MB 272 | index_enrollment_attemptables_by_attempt_and_last_in_group | 4045 MB 273 | index_attempts_on_student_id | 2611 MB 274 | enrollment_activity_attemptables_pkey | 2513 MB 275 | index_attempts_on_student_id_final_attemptable_type | 2466 MB 276 | attempts_pkey | 2466 MB 277 | index_attempts_on_response_id | 2404 MB 278 | index_attempts_on_enrollment_id | 1957 MB 279 | index_enrollment_attemptables_by_enrollment_activity_id | 1789 MB 280 | enrollment_activities_pkey | 458 MB 281 | (truncated results for brevity) 282 | ``` 283 | 284 | This command displays the size of each index in the database, in MB. It is calculated by taking the number of pages (reported in `relpages`) and multiplying it by the page size (8192 bytes). 285 | 286 | ### `table_size` 287 | 288 | ```python 289 | 290 | PGExtras.query('table_size') 291 | 292 | name | size 293 | ---------------------------------------------------------------+--------- 294 | learning_coaches | 196 MB 295 | states | 145 MB 296 | grade_levels | 111 MB 297 | charities_customers | 73 MB 298 | charities | 66 MB 299 | (truncated results for brevity) 300 | ``` 301 | 302 | This command displays the size of each table and materialized view in the database, in MB. It is calculated by using the system administration function `pg_table_size()`, which includes the size of the main data fork, free space map, visibility map and TOAST data. 303 | 304 | ### `table_indexes_size` 305 | 306 | ```python 307 | 308 | PGExtras.query('table_indexes_size') 309 | 310 | table | indexes_size 311 | ---------------------------------------------------------------+-------------- 312 | learning_coaches | 153 MB 313 | states | 125 MB 314 | charities_customers | 93 MB 315 | charities | 16 MB 316 | grade_levels | 11 MB 317 | (truncated results for brevity) 318 | ``` 319 | 320 | This command displays the total size of indexes for each table and materialized view, in MB. It is calculated by using the system administration function `pg_indexes_size()`. 321 | 322 | ### `total_table_size` 323 | 324 | ```python 325 | 326 | PGExtras.query('total_table_size') 327 | 328 | name | size 329 | ---------------------------------------------------------------+--------- 330 | learning_coaches | 349 MB 331 | states | 270 MB 332 | charities_customers | 166 MB 333 | grade_levels | 122 MB 334 | charities | 82 MB 335 | (truncated results for brevity) 336 | ``` 337 | 338 | This command displays the total size of each table and materialized view in the database, in MB. It is calculated by using the system administration function `pg_total_relation_size()`, which includes table size, total index size and TOAST data.
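Since `pg_total_relation_size()` is the sum of `pg_table_size()` and `pg_indexes_size()`, the three size reports line up: for `learning_coaches`, 196 MB + 153 MB = 349 MB. A minimal sketch that collects all three via the raw output API for a side-by-side view (the `sizes()` helper is illustrative, not part of the library):

```python
from pg_extras import PGExtras


def sizes(query_name):
    # Each row is (name, pretty_size), e.g. ('learning_coaches', '196 MB')
    return dict(PGExtras.query(query_name, output='raw').fetchall())


table = sizes('table_size')
indexes = sizes('table_indexes_size')
total = sizes('total_table_size')

for name, total_size in total.items():
    print(f"{name}: table={table.get(name)} indexes={indexes.get(name)} total={total_size}")
```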
339 | 340 | ### `unused_indexes` 341 | 342 | ```python 343 | 344 | PGExtras.query('unused_indexes') 345 | 346 | table | index | index_size | index_scans 347 | ---------------------+--------------------------------------------+------------+------------- 348 | public.grade_levels | index_placement_attempts_on_grade_level_id | 97 MB | 0 349 | public.observations | observations_attrs_grade_resources | 33 MB | 0 350 | public.messages | user_resource_id_idx | 12 MB | 0 351 | (3 rows) 352 | ``` 353 | 354 | This command displays indexes that have < 50 scans recorded against them, and are greater than 5 pages in size, ordered by size relative to the number of index scans. This command is generally useful for eliminating indexes that are unused, which can impact write performance, as well as read performance should they occupy space in memory. 355 | 356 | [More info](https://pawelurbanek.com/postgresql-fix-performance#unused-indexes) 357 | 358 | ### `null_indexes` 359 | 360 | ```python 361 | 362 | PGExtras.query('null_indexes') 363 | 364 | oid | index | index_size | unique | indexed_column | null_frac | expected_saving 365 | ---------+--------------------+------------+--------+----------------+-----------+----------------- 366 | 183764 | users_reset_token | 1445 MB | t | reset_token | 97.00% | 1401 MB 367 | 88732 | plan_cancelled_at | 539 MB | f | cancelled_at | 8.30% | 44 MB 368 | 9827345 | users_email | 18 MB | t | email | 28.67% | 5160 kB 369 | 370 | ``` 371 | 372 | This command displays indexes that contain `NULL` values. A high ratio of `NULL` values means that creating a partial index which excludes them will be beneficial, provided the `NULL` values are not used for searching. 373 | 374 | [More info](https://pawelurbanek.com/postgresql-fix-performance#null-indexes) 375 | 376 | ### `seq_scans` 377 | 378 | ```python 379 | 380 | PGExtras.query('seq_scans') 381 | 382 | 383 | name | count 384 | -----------------------------------+---------- 385 | learning_coaches | 44820063 386 | states | 36794975 387 | grade_levels | 13972293 388 | charities_customers | 8615277 389 | charities | 4316276 390 | messages | 3922247 391 | contests_customers | 2915972 392 | classroom_goals | 2142014 393 | (truncated results for brevity) 394 | ``` 395 | 396 | This command displays the number of sequential scans recorded against all tables, descending by count of sequential scans. Tables that have very high numbers of sequential scans may be under-indexed, and it may be worth investigating queries that read from these tables. 397 | 398 | [More info](https://pawelurbanek.com/postgresql-fix-performance#missing-indexes) 399 | 400 | ### `long_running_queries` 401 | 402 | ```python 403 | 404 | PGExtras.query('long_running_queries') 405 | 406 | 407 | pid | duration | query 408 | -------+-----------------+--------------------------------------------------------------------------------------- 409 | 19578 | 02:29:11.200129 | EXPLAIN SELECT "students".* FROM "students" WHERE "students"."id" = 1450645 LIMIT 1 410 | 19465 | 02:26:05.542653 | EXPLAIN SELECT "students".* FROM "students" WHERE "students"."id" = 1889881 LIMIT 1 411 | 19632 | 02:24:46.962818 | EXPLAIN SELECT "students".* FROM "students" WHERE "students"."id" = 1581884 LIMIT 1 412 | (truncated results for brevity) 413 | ``` 414 | 415 | This command displays currently running queries that have been running for longer than 5 minutes, descending by duration.
Very long running queries can be a source of multiple issues, such as preventing DDL statements from completing or vacuum being unable to update `relfrozenxid`. 416 | 417 | ### `records_rank` 418 | 419 | ```python 420 | 421 | PGExtras.query('records_rank') 422 | 423 | name | estimated_count 424 | -----------------------------------+----------------- 425 | tastypie_apiaccess | 568891 426 | notifications_event | 381227 427 | core_todo | 178614 428 | core_comment | 123969 429 | notifications_notification | 102101 430 | django_session | 68078 431 | (truncated results for brevity) 432 | ``` 433 | 434 | This command displays an estimated count of rows per table, descending by estimated count. The estimated count is derived from `n_live_tup`, which is updated by vacuum operations. Due to the way `n_live_tup` is populated, sparse vs. dense pages can result in estimations that are significantly off from the real count of rows. 435 | 436 | ### `bloat` 437 | 438 | ```python 439 | 440 | PGExtras.query('bloat') 441 | 442 | 443 | type | schemaname | object_name | bloat | waste 444 | -------+------------+-------------------------------+-------+---------- 445 | table | public | bloated_table | 1.1 | 98 MB 446 | table | public | other_bloated_table | 1.1 | 58 MB 447 | index | public | bloated_table::bloated_index | 3.7 | 34 MB 448 | table | public | clean_table | 0.2 | 3808 kB 449 | table | public | other_clean_table | 0.3 | 1576 kB 450 | (truncated results for brevity) 451 | ``` 452 | 453 | This command displays an estimation of table "bloat" – space allocated to a relation that is full of dead tuples and has yet to be reclaimed. Tables that have a high bloat ratio, typically 10 or greater, should be investigated to see if vacuuming is aggressive enough, and can be a sign of high table churn. 454 | 455 | [More info](https://pawelurbanek.com/postgresql-fix-performance#bloat) 456 | 457 | ### `vacuum_stats` 458 | 459 | ```python 460 | 461 | PGExtras.query('vacuum_stats') 462 | 463 | schema | table | last_vacuum | last_autovacuum | rowcount | dead_rowcount | autovacuum_threshold | expect_autovacuum 464 | --------+-----------------------+-------------+------------------+----------------+----------------+----------------------+------------------- 465 | public | log_table | | 2013-04-26 17:37 | 18,030 | 0 | 3,656 | 466 | public | data_table | | 2013-04-26 13:09 | 79 | 28 | 66 | 467 | public | other_table | | 2013-04-26 11:41 | 41 | 47 | 58 | 468 | public | queue_table | | 2013-04-26 17:39 | 12 | 8,228 | 52 | yes 469 | public | picnic_table | | | 13 | 0 | 53 | 470 | (truncated results for brevity) 471 | ``` 472 | 473 | This command displays statistics related to vacuum operations for each table, including an estimation of dead rows, last autovacuum and the current autovacuum threshold. This command can be useful when determining if current vacuum thresholds require adjustment, and to determine when the table was last vacuumed. 474 | 475 | ### `kill_all` 476 | 477 | ```python 478 | 479 | PGExtras.query('kill_all') 480 | 481 | ``` 482 | 483 | This command kills all the currently active connections to the database. It can be useful as a last resort when your database is stuck in a deadlock. 484 | 485 | ### `extensions` 486 | 487 | ```python 488 | 489 | PGExtras.query('extensions') 490 | 491 | ``` 492 | 493 | This command lists all the currently installed and available PostgreSQL extensions.
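This pairs naturally with the `pg_stat_statements` requirement from the Installation section. A minimal sketch that checks for it programmatically via the raw output, assuming the standard `pg_available_extensions` columns (name, default_version, installed_version, comment):

```python
from pg_extras import PGExtras

# Rows come from pg_available_extensions: (name, default_version, installed_version, comment)
result = PGExtras.query('extensions', output='raw')
installed = {row[0] for row in result.fetchall() if row[2] is not None}

if 'pg_stat_statements' not in installed:
    print("pg_stat_statements is not installed; 'calls' and 'outliers' will not work")
```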
494 | 495 | ### `mandelbrot` 496 | 497 | ```python 498 | 499 | PGExtras.query('mandelbrot') 500 | 501 | ``` 502 | 503 | This command outputs the Mandelbrot set, calculated through SQL. 504 | -------------------------------------------------------------------------------- /docker-compose.yml.sample: -------------------------------------------------------------------------------- 1 | version: '3' 2 | 3 | services: 4 | postgres: 5 | image: postgres:11.5-alpine 6 | environment: 7 | POSTGRES_USER: postgres 8 | POSTGRES_DB: python-pg-extras-test 9 | POSTGRES_PASSWORD: secret 10 | ports: 11 | - '5432:5432' 12 | -------------------------------------------------------------------------------- /pg_extras/__init__.py: -------------------------------------------------------------------------------- 1 | from sqlalchemy.sql import text 2 | import pkg_resources 3 | from sqlalchemy import create_engine 4 | from tabulate import tabulate 5 | import os 6 | 7 | class PGExtras: 8 | def query(query_name, output="ascii", database_url=None): 9 | database_url_val = database_url or os.environ['DATABASE_URL'] 10 | resource_path = '/'.join(('queries', query_name + '.sql')) 11 | query_sql = pkg_resources.resource_string('pg_extras', resource_path) 12 | 13 | db = create_engine(database_url_val, echo=True) 14 | result = db.engine.execute(text(query_sql.decode('utf-8'))) 15 | 16 | if output == "ascii": 17 | print(tabulate([row for row in result], headers=result.keys(), tablefmt="grid")) 18 | elif output == "raw": 19 | return result 20 | else: 21 | print("Invalid 'output' parameter") 22 | -------------------------------------------------------------------------------- /pg_extras/queries/all_locks.sql: -------------------------------------------------------------------------------- 1 | /* Queries with active locks */ 2 | 3 | SELECT 4 | pg_stat_activity.pid, 5 | pg_class.relname, 6 | pg_locks.transactionid, 7 | pg_locks.granted, 8 | pg_locks.mode, 9 | pg_stat_activity.query AS query_snippet, 10 | age(now(),pg_stat_activity.query_start) AS "age" 11 | FROM pg_stat_activity,pg_locks left 12 | OUTER JOIN pg_class 13 | ON (pg_locks.relation = pg_class.oid) 14 | WHERE pg_stat_activity.query <> '' 15 | AND pg_locks.pid = pg_stat_activity.pid 16 | AND pg_stat_activity.pid <> pg_backend_pid() order by query_start; 17 | -------------------------------------------------------------------------------- /pg_extras/queries/bloat.sql: -------------------------------------------------------------------------------- 1 | /* Table and index bloat in your database ordered by most wasteful */ 2 | 3 | WITH constants AS ( 4 | SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 4 AS ma 5 | ), bloat_info AS ( 6 | SELECT 7 | ma,bs,schemaname,tablename, 8 | (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr, 9 | (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2 10 | FROM ( 11 | SELECT 12 | schemaname, tablename, hdr, ma, bs, 13 | SUM((1-null_frac)*avg_width) AS datawidth, 14 | MAX(null_frac) AS maxfracsum, 15 | hdr+( 16 | SELECT 1+count(*)/8 17 | FROM pg_stats s2 18 | WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename 19 | ) AS nullhdr 20 | FROM pg_stats s, constants 21 | GROUP BY 1,2,3,4,5 22 | ) AS foo 23 | ), table_bloat AS ( 24 | SELECT 25 | schemaname, tablename, cc.relpages, bs, 26 | CEIL((cc.reltuples*((datahdr+ma- 27 | (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta 28 | FROM 
bloat_info 29 | JOIN pg_class cc ON cc.relname = bloat_info.tablename 30 | JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = bloat_info.schemaname AND nn.nspname <> 'information_schema' 31 | ), index_bloat AS ( 32 | SELECT 33 | schemaname, tablename, bs, 34 | COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages, 35 | COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols 36 | FROM bloat_info 37 | JOIN pg_class cc ON cc.relname = bloat_info.tablename 38 | JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = bloat_info.schemaname AND nn.nspname <> 'information_schema' 39 | JOIN pg_index i ON indrelid = cc.oid 40 | JOIN pg_class c2 ON c2.oid = i.indexrelid 41 | ) 42 | SELECT 43 | type, schemaname, object_name, bloat, pg_size_pretty(raw_waste) as waste 44 | FROM 45 | (SELECT 46 | 'table' as type, 47 | schemaname, 48 | tablename as object_name, 49 | ROUND(CASE WHEN otta=0 THEN 0.0 ELSE table_bloat.relpages/otta::numeric END,1) AS bloat, 50 | CASE WHEN relpages < otta THEN '0' ELSE (bs*(table_bloat.relpages-otta)::bigint)::bigint END AS raw_waste 51 | FROM 52 | table_bloat 53 | UNION 54 | SELECT 55 | 'index' as type, 56 | schemaname, 57 | tablename || '::' || iname as object_name, 58 | ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::numeric END,1) AS bloat, 59 | CASE WHEN ipages < iotta THEN '0' ELSE (bs*(ipages-iotta))::bigint END AS raw_waste 60 | FROM 61 | index_bloat) bloat_summary 62 | ORDER BY raw_waste DESC, bloat DESC; 63 | 64 | -------------------------------------------------------------------------------- /pg_extras/queries/blocking.sql: -------------------------------------------------------------------------------- 1 | /* Queries holding locks other queries are waiting to be released */ 2 | 3 | SELECT bl.pid AS blocked_pid, 4 | ka.query AS blocking_statement, 5 | now() - ka.query_start AS blocking_duration, 6 | kl.pid AS blocking_pid, 7 | a.query AS blocked_statement, 8 | now() - a.query_start AS blocked_duration 9 | FROM pg_catalog.pg_locks bl 10 | JOIN pg_catalog.pg_stat_activity a 11 | ON bl.pid = a.pid 12 | JOIN pg_catalog.pg_locks kl 13 | JOIN pg_catalog.pg_stat_activity ka 14 | ON kl.pid = ka.pid 15 | ON bl.transactionid = kl.transactionid AND bl.pid != kl.pid 16 | WHERE NOT bl.granted; 17 | -------------------------------------------------------------------------------- /pg_extras/queries/cache_hit.sql: -------------------------------------------------------------------------------- 1 | /* Index and table hit rate */ 2 | 3 | SELECT 4 | 'index hit rate' AS name, 5 | (sum(idx_blks_hit)) / nullif(sum(idx_blks_hit + idx_blks_read),0) AS ratio 6 | FROM pg_statio_user_indexes 7 | UNION ALL 8 | SELECT 9 | 'table hit rate' AS name, 10 | sum(heap_blks_hit) / nullif(sum(heap_blks_hit) + sum(heap_blks_read),0) AS ratio 11 | FROM pg_statio_user_tables; 12 | -------------------------------------------------------------------------------- /pg_extras/queries/calls.sql: -------------------------------------------------------------------------------- 1 | /* 10 queries that have highest frequency of execution */ 2 | 3 | SELECT query AS qry, 4 | interval '1 millisecond' * total_time AS exec_time, 5 | to_char((total_time/sum(total_time) OVER()) * 100, 'FM90D0') || '%' AS prop_exec_time, 6 | to_char(calls, 'FM999G999G990') AS ncalls, 7 | interval '1 millisecond' * (blk_read_time + blk_write_time) AS sync_io_time 8 | FROM pg_stat_statements 
WHERE userid = (SELECT usesysid FROM pg_user WHERE usename = current_user LIMIT 1) 9 | ORDER BY calls DESC LIMIT 10; 10 | -------------------------------------------------------------------------------- /pg_extras/queries/extensions.sql: -------------------------------------------------------------------------------- 1 | /* Available and installed extensions */ 2 | 3 | SELECT * FROM pg_available_extensions ORDER BY installed_version; 4 | 5 | -------------------------------------------------------------------------------- /pg_extras/queries/index_cache_hit.sql: -------------------------------------------------------------------------------- 1 | /* Calculates your cache hit rate for reading indexes */ 2 | 3 | SELECT 4 | relname AS name, 5 | idx_blks_hit AS buffer_hits, 6 | idx_blks_read AS block_reads, 7 | idx_blks_hit + idx_blks_read AS total_read, 8 | CASE (idx_blks_hit + idx_blks_read)::float 9 | WHEN 0 THEN 'Insufficient data' 10 | ELSE (idx_blks_hit / (idx_blks_hit + idx_blks_read)::float)::text 11 | END ratio 12 | FROM 13 | pg_statio_user_tables 14 | ORDER BY 15 | idx_blks_hit / (idx_blks_hit + idx_blks_read + 1)::float DESC; 16 | -------------------------------------------------------------------------------- /pg_extras/queries/index_size.sql: -------------------------------------------------------------------------------- 1 | /* The size of indexes, descending by size */ 2 | 3 | SELECT c.relname AS name, 4 | pg_size_pretty(sum(c.relpages::bigint*8192)::bigint) AS size 5 | FROM pg_class c 6 | LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace) 7 | WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') 8 | AND n.nspname !~ '^pg_toast' 9 | AND c.relkind='i' 10 | GROUP BY c.relname 11 | ORDER BY sum(c.relpages) DESC; 12 | -------------------------------------------------------------------------------- /pg_extras/queries/index_usage.sql: -------------------------------------------------------------------------------- 1 | /* Index hit rate (effective databases are at 99% and up) */ 2 | 3 | SELECT relname, 4 | CASE idx_scan 5 | WHEN 0 THEN 'Insufficient data' 6 | ELSE (100 * idx_scan / (seq_scan + idx_scan))::text 7 | END percent_of_times_index_used, 8 | n_live_tup rows_in_table 9 | FROM 10 | pg_stat_user_tables 11 | ORDER BY 12 | n_live_tup DESC; 13 | -------------------------------------------------------------------------------- /pg_extras/queries/kill_all.sql: -------------------------------------------------------------------------------- 1 | /* Kill all the active database connections */ 2 | 3 | SELECT pg_terminate_backend(pid) FROM pg_stat_activity 4 | WHERE pid <> pg_backend_pid() 5 | AND query <> '' 6 | AND datname = current_database(); 7 | -------------------------------------------------------------------------------- /pg_extras/queries/locks.sql: -------------------------------------------------------------------------------- 1 | /* Queries with active exclusive locks */ 2 | 3 | SELECT 4 | pg_stat_activity.pid, 5 | pg_class.relname, 6 | pg_locks.transactionid, 7 | pg_locks.granted, 8 | pg_locks.mode, 9 | pg_stat_activity.query AS query_snippet, 10 | age(now(),pg_stat_activity.query_start) AS "age" 11 | FROM pg_stat_activity,pg_locks left 12 | OUTER JOIN pg_class 13 | ON (pg_locks.relation = pg_class.oid) 14 | WHERE pg_stat_activity.query <> '' 15 | AND pg_locks.pid = pg_stat_activity.pid 16 | AND pg_locks.mode IN ('ExclusiveLock', 'AccessExclusiveLock', 'RowExclusiveLock') 17 | AND pg_stat_activity.pid <> pg_backend_pid() order by query_start; 18 | 
-------------------------------------------------------------------------------- /pg_extras/queries/long_running_queries.sql: -------------------------------------------------------------------------------- 1 | /* All queries longer than five minutes by descending duration */ 2 | 3 | SELECT 4 | pid, 5 | now() - pg_stat_activity.query_start AS duration, 6 | query AS query 7 | FROM 8 | pg_stat_activity 9 | WHERE 10 | pg_stat_activity.query <> ''::text 11 | AND state <> 'idle' 12 | AND now() - pg_stat_activity.query_start > interval '5 minutes' 13 | ORDER BY 14 | now() - pg_stat_activity.query_start DESC; 15 | -------------------------------------------------------------------------------- /pg_extras/queries/mandelbrot.sql: -------------------------------------------------------------------------------- 1 | /* The mandelbrot set */ 2 | 3 | WITH RECURSIVE Z(IX, IY, CX, CY, X, Y, I) AS ( 4 | SELECT IX, IY, X::float, Y::float, X::float, Y::float, 0 5 | FROM (select -2.2 + 0.031 * i, i from generate_series(0,101) as i) as xgen(x,ix), 6 | (select -1.5 + 0.031 * i, i from generate_series(0,101) as i) as ygen(y,iy) 7 | UNION ALL 8 | SELECT IX, IY, CX, CY, X * X - Y * Y + CX AS X, Y * X * 2 + CY, I + 1 9 | FROM Z 10 | WHERE X * X + Y * Y < 16::float 11 | AND I < 100 12 | ) 13 | SELECT array_to_string(array_agg(SUBSTRING(' .,,,-----++++%%%%@@@@#### ', LEAST(GREATEST(I,1),27), 1)),'') 14 | FROM ( 15 | SELECT IX, IY, MAX(I) AS I 16 | FROM Z 17 | GROUP BY IY, IX 18 | ORDER BY IY, IX 19 | ) AS ZT 20 | GROUP BY IY 21 | ORDER BY IY; 22 | -------------------------------------------------------------------------------- /pg_extras/queries/null_indexes.sql: -------------------------------------------------------------------------------- 1 | /* Find indexes with a high ratio of NULL values */ 2 | 3 | SELECT 4 | c.oid, 5 | c.relname AS index, 6 | pg_size_pretty(pg_relation_size(c.oid)) AS index_size, 7 | i.indisunique AS unique, 8 | a.attname AS indexed_column, 9 | CASE s.null_frac 10 | WHEN 0 THEN '' 11 | ELSE to_char(s.null_frac * 100, '999.00%') 12 | END AS null_frac, 13 | pg_size_pretty((pg_relation_size(c.oid) * s.null_frac)::bigint) AS expected_saving 14 | FROM 15 | pg_class c 16 | JOIN pg_index i ON i.indexrelid = c.oid 17 | JOIN pg_attribute a ON a.attrelid = c.oid 18 | JOIN pg_class c_table ON c_table.oid = i.indrelid 19 | JOIN pg_indexes ixs ON c.relname = ixs.indexname 20 | LEFT JOIN pg_stats s ON s.tablename = c_table.relname AND a.attname = s.attname 21 | WHERE 22 | -- Primary key cannot be partial 23 | NOT i.indisprimary 24 | -- Exclude already partial indexes 25 | AND i.indpred IS NULL 26 | -- Exclude composite indexes 27 | AND array_length(i.indkey, 1) = 1 28 | -- Exclude indexes without null_frac ratio 29 | AND coalesce(s.null_frac, 0) != 0 30 | -- Larger than threshold 31 | AND pg_relation_size(c.oid) > 10 * 1024 ^ 2 32 | ORDER BY 33 | pg_relation_size(c.oid) * s.null_frac DESC; 34 | -------------------------------------------------------------------------------- /pg_extras/queries/outliers.sql: -------------------------------------------------------------------------------- 1 | /* 10 queries that have longest execution time in aggregate */ 2 | 3 | SELECT interval '1 millisecond' * total_time AS total_exec_time, 4 | to_char((total_time/sum(total_time) OVER()) * 100, 'FM90D0') || '%' AS prop_exec_time, 5 | to_char(calls, 'FM999G999G999G990') AS ncalls, 6 | interval '1 millisecond' * (blk_read_time + blk_write_time) AS sync_io_time, 7 | query AS query 8 | FROM pg_stat_statements WHERE userid = 
(SELECT usesysid FROM pg_user WHERE usename = current_user LIMIT 1) 9 | ORDER BY total_time DESC 10 | LIMIT 10; 11 | -------------------------------------------------------------------------------- /pg_extras/queries/records_rank.sql: -------------------------------------------------------------------------------- 1 | /* All tables and the number of rows in each ordered by number of rows descending */ 2 | 3 | SELECT 4 | relname AS name, 5 | n_live_tup AS estimated_count 6 | FROM 7 | pg_stat_user_tables 8 | ORDER BY 9 | n_live_tup DESC; 10 | -------------------------------------------------------------------------------- /pg_extras/queries/seq_scans.sql: -------------------------------------------------------------------------------- 1 | /* Count of sequential scans by table descending by order */ 2 | 3 | SELECT relname AS name, 4 | seq_scan as count 5 | FROM 6 | pg_stat_user_tables 7 | ORDER BY seq_scan DESC; 8 | -------------------------------------------------------------------------------- /pg_extras/queries/table_cache_hit.sql: -------------------------------------------------------------------------------- 1 | /* Calculates your cache hit rate for reading tables */ 2 | 3 | SELECT 4 | relname AS name, 5 | heap_blks_hit AS buffer_hits, 6 | heap_blks_read AS block_reads, 7 | heap_blks_hit + heap_blks_read AS total_read, 8 | CASE (heap_blks_hit + heap_blks_read)::float 9 | WHEN 0 THEN 'Insufficient data' 10 | ELSE (heap_blks_hit / (heap_blks_hit + heap_blks_read)::float)::text 11 | END ratio 12 | FROM 13 | pg_statio_user_tables 14 | ORDER BY 15 | heap_blks_hit / (heap_blks_hit + heap_blks_read + 1)::float DESC; 16 | -------------------------------------------------------------------------------- /pg_extras/queries/table_indexes_size.sql: -------------------------------------------------------------------------------- 1 | /* Total size of all the indexes on each table, descending by size */ 2 | 3 | SELECT c.relname AS table, 4 | pg_size_pretty(pg_indexes_size(c.oid)) AS index_size 5 | FROM pg_class c 6 | LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace) 7 | WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') 8 | AND n.nspname !~ '^pg_toast' 9 | AND c.relkind IN ('r', 'm') 10 | ORDER BY pg_indexes_size(c.oid) DESC; 11 | -------------------------------------------------------------------------------- /pg_extras/queries/table_size.sql: -------------------------------------------------------------------------------- 1 | /* Size of the tables (excluding indexes), descending by size */ 2 | 3 | SELECT c.relname AS name, 4 | pg_size_pretty(pg_table_size(c.oid)) AS size 5 | FROM pg_class c 6 | LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace) 7 | WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') 8 | AND n.nspname !~ '^pg_toast' 9 | AND c.relkind IN ('r', 'm') 10 | ORDER BY pg_table_size(c.oid) DESC; 11 | -------------------------------------------------------------------------------- /pg_extras/queries/total_index_size.sql: -------------------------------------------------------------------------------- 1 | /* Total size of all indexes in MB */ 2 | 3 | SELECT pg_size_pretty(sum(c.relpages::bigint*8192)::bigint) AS size 4 | FROM pg_class c 5 | LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace) 6 | WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') 7 | AND n.nspname !~ '^pg_toast' 8 | AND c.relkind='i'; 9 | -------------------------------------------------------------------------------- /pg_extras/queries/total_table_size.sql: 
-------------------------------------------------------------------------------- 1 | /* Size of the tables (including indexes), descending by size */ 2 | 3 | SELECT c.relname AS name, 4 | pg_size_pretty(pg_total_relation_size(c.oid)) AS size 5 | FROM pg_class c 6 | LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace) 7 | WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') 8 | AND n.nspname !~ '^pg_toast' 9 | AND c.relkind IN ('r', 'm') 10 | ORDER BY pg_total_relation_size(c.oid) DESC; 11 | -------------------------------------------------------------------------------- /pg_extras/queries/unused_indexes.sql: -------------------------------------------------------------------------------- 1 | /* Unused and almost unused indexes */ 2 | /* Ordered by their size relative to the number of index scans. 3 | Exclude indexes of very small tables (less than 5 pages), 4 | where the planner will almost invariably select a sequential scan, 5 | but may not in the future as the table grows */ 6 | 7 | SELECT 8 | schemaname || '.' || relname AS table, 9 | indexrelname AS index, 10 | pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size, 11 | idx_scan as index_scans 12 | FROM pg_stat_user_indexes ui 13 | JOIN pg_index i ON ui.indexrelid = i.indexrelid 14 | WHERE NOT indisunique AND idx_scan < 50 AND pg_relation_size(relid) > 5 * 8192 15 | ORDER BY pg_relation_size(i.indexrelid) / nullif(idx_scan, 0) DESC NULLS FIRST, 16 | pg_relation_size(i.indexrelid) DESC; 17 | -------------------------------------------------------------------------------- /pg_extras/queries/vacuum_stats.sql: -------------------------------------------------------------------------------- 1 | /* Dead rows and whether an automatic vacuum is expected to be triggered */ 2 | 3 | WITH table_opts AS ( 4 | SELECT 5 | pg_class.oid, relname, nspname, array_to_string(reloptions, '') AS relopts 6 | FROM 7 | pg_class INNER JOIN pg_namespace ns ON relnamespace = ns.oid 8 | ), vacuum_settings AS ( 9 | SELECT 10 | oid, relname, nspname, 11 | CASE 12 | WHEN relopts LIKE '%autovacuum_vacuum_threshold%' 13 | THEN substring(relopts, '.*autovacuum_vacuum_threshold=([0-9.]+).*')::integer 14 | ELSE current_setting('autovacuum_vacuum_threshold')::integer 15 | END AS autovacuum_vacuum_threshold, 16 | CASE 17 | WHEN relopts LIKE '%autovacuum_vacuum_scale_factor%' 18 | THEN substring(relopts, '.*autovacuum_vacuum_scale_factor=([0-9.]+).*')::real 19 | ELSE current_setting('autovacuum_vacuum_scale_factor')::real 20 | END AS autovacuum_vacuum_scale_factor 21 | FROM 22 | table_opts 23 | ) 24 | SELECT 25 | vacuum_settings.nspname AS schema, 26 | vacuum_settings.relname AS table, 27 | to_char(psut.last_vacuum, 'YYYY-MM-DD HH24:MI') AS last_vacuum, 28 | to_char(psut.last_autovacuum, 'YYYY-MM-DD HH24:MI') AS last_autovacuum, 29 | to_char(pg_class.reltuples, '9G999G999G999') AS rowcount, 30 | to_char(psut.n_dead_tup, '9G999G999G999') AS dead_rowcount, 31 | to_char(autovacuum_vacuum_threshold 32 | + (autovacuum_vacuum_scale_factor::numeric * pg_class.reltuples), '9G999G999G999') AS autovacuum_threshold, 33 | CASE 34 | WHEN autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor::numeric * pg_class.reltuples) < psut.n_dead_tup 35 | THEN 'yes' 36 | END AS expect_autovacuum 37 | FROM 38 | pg_stat_user_tables psut INNER JOIN pg_class ON psut.relid = pg_class.oid 39 | INNER JOIN vacuum_settings ON pg_class.oid = vacuum_settings.oid 40 | ORDER BY 1; 41 | -------------------------------------------------------------------------------- /setup.py: 
-------------------------------------------------------------------------------- 1 | import setuptools 2 | 3 | with open("README.md", "r") as fh: 4 | long_description = fh.read() 5 | 6 | setuptools.setup( 7 | name="pg-extras", 8 | version="0.1.4", 9 | author="Pawel Urbanek", 10 | author_email="contact@pawelurbanek.com", 11 | description="Python PostgreSQL database performance insights. Locks, index usage, buffer cache hit ratios, vacuum stats and more.", 12 | long_description=long_description, 13 | long_description_content_type="text/markdown", 14 | url="https://github.com/pawurb/python-pg-extras", 15 | packages=['pg_extras'], 16 | classifiers=[ 17 | "Programming Language :: Python :: 3", 18 | "License :: OSI Approved :: MIT License", 19 | "Operating System :: OS Independent", 20 | "Topic :: Software Development :: Libraries :: Python Modules", 21 | "Topic :: Database" 22 | ], 23 | python_requires='>=3.6', 24 | install_requires=[ 25 | "SQLAlchemy>=1.3", 26 | "setuptools", 27 | "tabulate", 28 | "psycopg2" 29 | ] 30 | ) 31 | --------------------------------------------------------------------------------