├── LICENSE ├── README.md └── explain-running-query.py /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright 2020 Starfish Storage Corporation 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # explain-running-query 2 | The script prints the execution plan of a running query in PostgreSQL. 3 | 4 | It comes in handy if PostgreSQL gets stuck running a query. 5 | Without the script, the best you can do is run [`EXPLAIN`](https://www.postgresql.org/docs/12/sql-explain.html) with the same query text. 6 | Still, the plan doesn't have to be the same as the original, because PostgreSQL relies heavily on table statistics when planning a query and those might have changed since the original query started. 7 | 8 | ## Usage 9 | 10 | ```bash 11 | sudo python3 explain_running_query.py PG_BACKEND_PID 12 | ``` 13 | 14 | `PG_BACKEND_PID` is a PID of PostgreSQL backend process executing the query. It can be obtained from `pid` column from [pg_stat_activity view](https://www.postgresql.org/docs/12/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW) or from [pg_activity](https://pypi.org/project/pg_activity/) tool. 15 | 16 | Root privileges are required to attach gdb to the running process. 17 | 18 | 19 | ## Example 20 | 21 | ```bash 22 | $ psql -c "SELECT count(*) FROM generate_series(1, 200000000)" & 23 | $ psql -p 5433 -c "SELECT pid, query FROM pg_stat_activity" 24 | pid | query 25 | ------+---------------------------------------------------- 26 | 3587 | SELECT count(*) FROM generate_series(1, 200000000) 27 | 3616 | SELECT pid, query FROM pg_stat_activity 28 | (2 rows) 29 | 30 | $ sudo python3 explain_running_query.py 3587 31 | INFO: Waiting for low-level functions to finish. This may take a while. 32 | Query Text: SELECT count(*) FROM generate_series(1, 200000000) 33 | Plan: 34 | Aggregate (cost=2100000.00..2100000.01 rows=1 width=8) 35 | -> Function Scan on generate_series (cost=0.00..2000000.00 rows=200000000 width=0) 36 | Detaching from program: /usr/lib/postgresql/12/bin/postgres, process 3587 37 | [Inferior 1 (process 3587) detached] 38 | ``` 39 | 40 | ## Requirements 41 | 42 | - Python 3 43 | - gdb >= 7.4 44 | - PostgreSQL debug symbols (e.g. `yum install postgresql-12-debuginfo` on CentOS 7, `apt-get install postgresql-12-dbgsym` on Ubuntu Bionic) 45 | 46 | Installing PostgreSQL debug symbols won't affect PostgreSQL performance. 47 | 48 | 49 | ## How it works 50 | 51 | The script is based on existing [auto_explain](https://www.postgresql.org/docs/12/auto-explain.html) extension to PostgreSQL. 52 | Technically, it attaches `gdb` (GNU's debugger) to a PostgreSQL backend process, waits for low-level functions to complete, and runs [internal PostgreSQL functions](https://github.com/postgres/postgres/blob/7559d8ebfa11d98728e816f6b655582ce41150f3/contrib/auto_explain/auto_explain.c#L382) to obtain the query plan. 53 | 54 | ## License 55 | 56 | The script is distributed under [the MIT license](LICENSE). 57 | -------------------------------------------------------------------------------- /explain-running-query.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # Any Python 3 interpreter should suffice to run this script as it can be run in production environment. 3 | 4 | # MIT License 5 | # 6 | # Copyright 2020 Starfish Storage Corporation 7 | # 8 | # Permission is hereby granted, free of charge, to any person obtaining a copy 9 | # of this software and associated documentation files (the "Software"), to deal 10 | # in the Software without restriction, including without limitation the rights 11 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 12 | # copies of the Software, and to permit persons to whom the Software is 13 | # furnished to do so, subject to the following conditions: 14 | # 15 | # The above copyright notice and this permission notice shall be included in all 16 | # copies or substantial portions of the Software. 17 | # 18 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 20 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 21 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 22 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 23 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 24 | # SOFTWARE 25 | 26 | import errno 27 | import re 28 | import subprocess 29 | import sys 30 | import logging 31 | from argparse import ArgumentParser 32 | from contextlib import contextmanager 33 | 34 | 35 | def get_backtrace(gdb): 36 | main_func_regexp = re.compile("#[0-9]+ +0x[0-9a-f]+ in main ") 37 | 38 | backtrace = [] 39 | gdb.cmd("backtrace") 40 | line = gdb.stdout.readline() 41 | while line: 42 | if "No stack." in line: 43 | return [] 44 | backtrace.append(line) 45 | if main_func_regexp.search(line): 46 | # one more read and the process gets blocked 47 | break 48 | line = gdb.stdout.readline() 49 | 50 | return backtrace 51 | 52 | 53 | def get_depth_of_exec_function(backtrace): 54 | """ 55 | >>> get_depth_of_exec_function(["#1 0x00007f29e6eb7df5 in standard_ExecutorRun (queryDesc=0x562aad346d38,"]) 56 | 1 57 | >>> get_depth_of_exec_function(["#27 0x00007f29e6eb7df5 in pgss_ExecutorRun (queryDesc=0x562aad346d38,"]) 58 | 27 59 | >>> get_depth_of_exec_function(["#13 0x00007f29e6eb7df5 in explain_ExecutorRun (queryDesc=0x562aad346d38,"]) 60 | 13 61 | >>> get_depth_of_exec_function(["#4 0x00007f29e6eb7df5 in ExecEvalNot (notclause=,"]) 62 | 4 63 | >>> get_depth_of_exec_function(["#5 0x00007f29e6eb7df5 in ExecProcNode (node=node@entry=0x562aad157358,)"]) 64 | 5 65 | >>> get_depth_of_exec_function(["#12 0x00007f29e6eb7df5 in ExecutePlan (dest=0x562aad15e290,"]) 66 | 12 67 | >>> get_depth_of_exec_function(["#21 standard_ExecutorRun (queryDesc=0x562aad0b46f8, direction=,"]) 68 | 21 69 | >>> bt = ["#0 palloc0 (size=size@entry=328)", \ 70 | "#1 0x0000562aac6c9970 in InstrAlloc (n=n@entry=1, instrument_options=4)", \ 71 | "#2 0x0000562aac6bdddb in ExecInitNode (node=node@entry=0x562aad49e818,"] 72 | >>> get_depth_of_exec_function(bt) 73 | 2 74 | """ 75 | exec_regexp = re.compile(r"#([0-9]+) .*Exec[a-zA-Z]+ \(") 76 | for frame in backtrace: 77 | m = exec_regexp.search(frame) 78 | if m: 79 | return int(m.group(1)) 80 | return None 81 | 82 | 83 | def run_until_returns_from_depth_functions(gdb, depth): 84 | for _ in range(depth): 85 | gdb.cmd("finish") 86 | 87 | 88 | def get_query_desc_ptr(backtrace): 89 | """ 90 | >>> get_query_desc_ptr(["#21 standard_ExecutorRun (queryDesc=0x562aad0b46f8, direction=,"]) 91 | '0x562aad0b46f8' 92 | >>> get_query_desc_ptr(["#1 pgss_ExecutorRun (queryDesc=0x562aad0b46f8, direction=,"]) 93 | '0x562aad0b46f8' 94 | """ 95 | query_desc_ptr_regexp = re.compile(r"ExecutorRun \(queryDesc=(0x[0-9a-f]+)[,\)]") 96 | # ExecutorRun can appear more than once in backtrace when running PL/pgSQL code. Return the deepest one. 97 | for frame in backtrace: 98 | m = query_desc_ptr_regexp.search(frame) 99 | if m: 100 | return m.group(1) 101 | return None 102 | 103 | 104 | def create_explain_state_ptr(gdb): 105 | explain_state_ptr_regexp = re.compile(r"\(ExplainState \*\) (0x[0-9a-f]+)") 106 | 107 | gdb.cmd("call NewExplainState()") 108 | line = gdb.stdout.readline() 109 | while line: 110 | m = explain_state_ptr_regexp.search(line) 111 | if m: 112 | return m.group(1) 113 | line = gdb.stdout.readline() 114 | 115 | 116 | class NotSupportedGdbVersion(Exception): 117 | pass 118 | 119 | 120 | class Gdb(subprocess.Popen): 121 | GDB_PATH = "gdb" 122 | 123 | @staticmethod 124 | def get_gdb_version(): 125 | """ Example outputs of gdb --version: 126 | GNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6) 127 | GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7 128 | GNU gdb (Ubuntu 8.1-0ubuntu3.1) 8.1.0.20180409-git 129 | GNU gdb (Ubuntu 8.2-0ubuntu1~18.04) 8.2 130 | """ 131 | version_regexp = re.compile(r"([0-9]+)\.([0-9]+)[^0-9]") 132 | gdb = subprocess.Popen([Gdb.GDB_PATH, '--version'], stdout=subprocess.PIPE, universal_newlines=True) 133 | stdout, _stderr = gdb.communicate() 134 | first_line = stdout.splitlines()[0] 135 | m = version_regexp.search(first_line) 136 | if m: 137 | major = int(m.group(1)) 138 | minor = int(m.group(2)) 139 | return major, minor 140 | logging.warning("Couldn't detect gdb version from: \"%s\", assuming it's at least 7.4", first_line) 141 | return 7, 4 # if it's older the script will fail anyway 142 | 143 | def __init__(self, pid): 144 | major, minor = self.get_gdb_version() 145 | if (major, minor) < (7, 4): 146 | raise NotSupportedGdbVersion( 147 | "gdb < 7.4 is not supported, because it fails with \"Hangup detected on fd 0\" " 148 | "when stdin is redirected to a pipe.\n" 149 | "See https://sourceware.org/bugzilla/show_bug.cgi?id=8677 for more details." 150 | ) 151 | super().__init__( 152 | [Gdb.GDB_PATH, '--silent', '--pid', str(pid)], 153 | stdin=subprocess.PIPE, 154 | stdout=subprocess.PIPE, 155 | stderr=subprocess.STDOUT, 156 | universal_newlines=True, 157 | ) 158 | 159 | def cmd(self, command): 160 | self.stdin.write(command + "\n") 161 | self.stdin.flush() 162 | 163 | 164 | @contextmanager 165 | def gdb_session(pid): 166 | try: 167 | gdb = Gdb(pid) 168 | except EnvironmentError as exc: 169 | if exc.errno == errno.ENOENT: 170 | logging.error("gdb not found") 171 | sys.exit(1) 172 | except NotSupportedGdbVersion as exc: 173 | logging.error(exc) 174 | sys.exit(1) 175 | 176 | try: 177 | yield gdb 178 | except KeyboardInterrupt: 179 | logging.error("Interrupted by user") 180 | finally: 181 | if gdb.poll() is None: 182 | logging.warning("Terminating gdb process (PID: {pid})".format(pid=gdb.pid)) 183 | gdb.terminate() 184 | stdout, stderr = gdb.communicate(None, timeout=15) 185 | if gdb.poll() != 0: 186 | logging.error( 187 | "gdb process (PID: {pid}) exited with {code}\nSTDOUT: {out}\nSTDERR: {err}".format( 188 | pid=gdb.pid, code=gdb.poll(), out=stdout, err=stderr 189 | ) 190 | ) 191 | sys.exit(1) 192 | 193 | 194 | def main(): 195 | logging.basicConfig(stream=sys.stderr, format="%(levelname)s: %(message)s", level=logging.INFO) 196 | parser = ArgumentParser( 197 | description="Print the plan of a running PostgreSQL query. " 198 | "gdb >= 7.4 and postgresql debuginfo package need to be installed." 199 | ) 200 | parser.add_argument('pid', type=int, help="PID of PostgreSQL backend process that is running a query") 201 | args = parser.parse_args() 202 | with gdb_session(args.pid) as gdb: 203 | gdb.cmd("set prompt") 204 | gdb.cmd("set pagination off") # traceback can be long 205 | gdb.cmd("set width unlimited") # each stack frame in a single line 206 | gdb.cmd("set confirm off") # no confirmation logs on quit 207 | backtrace = get_backtrace(gdb) 208 | if not backtrace: 209 | logging.error( 210 | "No stack found. Is the process running and you have permissions to run gdb on it? " 211 | "Try running this script as root." 212 | ) 213 | sys.exit(1) 214 | query_desc_ptr = get_query_desc_ptr(backtrace) 215 | if not query_desc_ptr: 216 | logging.error( 217 | """It seems that process {pid} is not executing a query plan. 218 | Possible reasons: 219 | - postgresql debuginfo package is not installed 220 | - query is finishing 221 | - query has just started and the plan is not ready yet.""".format( 222 | pid=args.pid 223 | ) 224 | ) 225 | sys.exit(1) 226 | 227 | exec_depth = get_depth_of_exec_function(backtrace) 228 | assert exec_depth is not None, "Found queryDesc in ExecutorRun but no Exec function, backtrace:\n{bt}".format( 229 | bt="".join(backtrace) 230 | ) 231 | 232 | # PostgreSQL may be in the middle of a low-level function. Calling functions to explain query may break its 233 | # internal state and cause a segfault. 234 | logging.info("Waiting for low-level functions to finish. This may take a while." "") 235 | run_until_returns_from_depth_functions(gdb, exec_depth) 236 | 237 | explain_state_ptr = create_explain_state_ptr(gdb) 238 | # based on https://github.com/postgres/postgres/blob/bd29bc417e7130/contrib/auto_explain/auto_explain.c#L331 239 | gdb.cmd("call ExplainBeginOutput({es_ptr})".format(es_ptr=explain_state_ptr)) 240 | gdb.cmd("call ExplainQueryText({es_ptr}, {qd_ptr})".format(es_ptr=explain_state_ptr, qd_ptr=query_desc_ptr)) 241 | gdb.cmd("call ExplainPropertyText(\"Plan\", \"\", {es_ptr})".format(es_ptr=explain_state_ptr)) 242 | gdb.cmd("call ExplainPrintPlan({es_ptr}, {qd_ptr})".format(es_ptr=explain_state_ptr, qd_ptr=query_desc_ptr)) 243 | gdb.cmd("call ExplainEndOutput({es_ptr})".format(es_ptr=explain_state_ptr)) 244 | gdb.cmd("printf \"%s\", (*((ExplainState *){es_ptr})->str)->data".format(es_ptr=explain_state_ptr)) 245 | stdout, stderr = gdb.communicate("quit", timeout=3) 246 | print(stdout, end='') 247 | if stderr: 248 | logging.error(stderr) 249 | return 0 250 | 251 | 252 | if __name__ == '__main__': 253 | sys.exit(main()) 254 | --------------------------------------------------------------------------------