├── firebase-sample ├── requirements.txt ├── app.py └── build-and-run.sh ├── .style.yapf ├── requirements.txt ├── BUILD ├── requirements_dev.txt ├── .gitignore ├── src ├── third_party │ ├── BUILD │ └── pylinetable.h ├── googleclouddebugger │ ├── version.py │ ├── __main__.py │ ├── native_module.h │ ├── error_data_visibility_policy.py │ ├── appengine_pretty_printers.py │ ├── labels.py │ ├── python_callback.h │ ├── backoff.py │ ├── rate_limit.h │ ├── python_callback.cc │ ├── BUILD │ ├── nullable.h │ ├── application_info.py │ ├── glob_data_visibility_policy.py │ ├── conditional_breakpoint.h │ ├── module_utils.py │ ├── module_search.py │ ├── leaky_bucket.cc │ ├── rate_limit.cc │ ├── common.h │ ├── conditional_breakpoint.cc │ ├── yaml_data_visibility_config_reader.py │ ├── leaky_bucket.h │ ├── uniquifier_computer.py │ ├── __init__.py │ ├── breakpoints_manager.py │ ├── bytecode_manipulator.h │ ├── immutability_tracer.h │ ├── bytecode_breakpoint.h │ ├── module_explorer.py │ ├── python_util.cc │ └── python_util.h ├── build-wheels.sh ├── build.sh └── setup.py ├── tests ├── py │ ├── integration_test_helper.py │ ├── error_data_visibility_policy_test.py │ ├── backoff_test.py │ ├── labels_test.py │ ├── glob_data_visibility_policy_test.py │ ├── application_info_test.py │ ├── yaml_data_visibility_config_reader_test.py │ ├── uniquifier_computer_test.py │ ├── module_search_test.py │ ├── python_test_util.py │ ├── module_utils_test.py │ ├── breakpoints_manager_test.py │ └── native_module_test.py └── cpp │ └── BUILD ├── CONTRIBUTING.md ├── alpine └── Dockerfile ├── WORKSPACE └── README.md /firebase-sample/requirements.txt: -------------------------------------------------------------------------------- 1 | flask 2 | -------------------------------------------------------------------------------- /.style.yapf: -------------------------------------------------------------------------------- 1 | [style] 2 | based_on_style = yapf 3 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | firebase_admin>=5.3.0 2 | pyyaml 3 | -------------------------------------------------------------------------------- /BUILD: -------------------------------------------------------------------------------- 1 | package(default_visibility = ["//visibility:public"]) 2 | 3 | -------------------------------------------------------------------------------- /requirements_dev.txt: -------------------------------------------------------------------------------- 1 | -r requirements.txt 2 | absl-py 3 | pytest 4 | requests-mock 5 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /dist/ 2 | /src/build/ 3 | /src/dist/ 4 | /src/setup.cfg 5 | __pycache__/ 6 | *.egg-info/ 7 | .coverage 8 | /bazel-* 9 | -------------------------------------------------------------------------------- /src/third_party/BUILD: -------------------------------------------------------------------------------- 1 | package(default_visibility = ["//visibility:public"]) 2 | 3 | cc_library( 4 | name = "pylinetable", 5 | hdrs = ["pylinetable.h"], 6 | ) 7 | 8 | -------------------------------------------------------------------------------- /tests/py/integration_test_helper.py: -------------------------------------------------------------------------------- 1 | """Helper module for integration test to validate deferred 
breakpoints.""" 2 | 3 | 4 | def Trigger(): 5 | print('bp trigger') # BPTAG: DEFERRED 6 | -------------------------------------------------------------------------------- /firebase-sample/app.py: -------------------------------------------------------------------------------- 1 | import googleclouddebugger 2 | 3 | googleclouddebugger.enable(use_firebase=True) 4 | 5 | from flask import Flask 6 | 7 | app = Flask(__name__) 8 | 9 | 10 | @app.route("/") 11 | def hello_world(): 12 | return "
Hello World!
" 13 | -------------------------------------------------------------------------------- /tests/cpp/BUILD: -------------------------------------------------------------------------------- 1 | package(default_visibility = ["//visibility:public"]) 2 | 3 | cc_test( 4 | name = "bytecode_manipulator_test", 5 | srcs = ["bytecode_manipulator_test.cc"], 6 | deps = [ 7 | "//src/googleclouddebugger:bytecode_manipulator", 8 | "@com_google_googletest//:gtest_main", 9 | ], 10 | ) 11 | -------------------------------------------------------------------------------- /src/googleclouddebugger/version.py: -------------------------------------------------------------------------------- 1 | """Version of the Google Python Cloud Debugger.""" 2 | 3 | # Versioning scheme: MAJOR.MINOR 4 | # The major version should only change on breaking changes. Minor version 5 | # changes go between regular updates. Instances running debuggers with 6 | # different major versions will show up as two different debuggees. 7 | __version__ = '4.1' 8 | -------------------------------------------------------------------------------- /firebase-sample/build-and-run.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | 3 | SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) 4 | cd "${SCRIPT_DIR}/.." 5 | 6 | cd src 7 | ./build.sh 8 | cd .. 9 | 10 | python3 -m venv /tmp/cdbg-venv 11 | source /tmp/cdbg-venv/bin/activate 12 | pip3 install -r requirements.txt 13 | pip3 install src/dist/* --force-reinstall 14 | 15 | cd firebase-sample 16 | pip3 install -r requirements.txt 17 | python3 -m flask run 18 | cd .. 19 | 20 | deactivate 21 | -------------------------------------------------------------------------------- /tests/py/error_data_visibility_policy_test.py: -------------------------------------------------------------------------------- 1 | """Tests for googleclouddebugger.error_data_visibility_policy.""" 2 | 3 | from absl.testing import absltest 4 | from googleclouddebugger import error_data_visibility_policy 5 | 6 | 7 | class ErrorDataVisibilityPolicyTest(absltest.TestCase): 8 | 9 | def testIsDataVisible(self): 10 | policy = error_data_visibility_policy.ErrorDataVisibilityPolicy( 11 | 'An error message.') 12 | 13 | self.assertEqual((False, 'An error message.'), policy.IsDataVisible('foo')) 14 | 15 | 16 | if __name__ == '__main__': 17 | absltest.main() 18 | -------------------------------------------------------------------------------- /src/googleclouddebugger/__main__.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | """Entry point for Python Cloud Debugger.""" # pylint: disable=invalid-name 15 | 16 | if __name__ == '__main__': 17 | import googleclouddebugger 18 | googleclouddebugger._DebuggerMain() 19 | -------------------------------------------------------------------------------- /tests/py/backoff_test.py: -------------------------------------------------------------------------------- 1 | """Unit test for backoff module.""" 2 | 3 | from absl.testing import absltest 4 | 5 | from googleclouddebugger import backoff 6 | 7 | 8 | class BackoffTest(absltest.TestCase): 9 | """Unit test for backoff module.""" 10 | 11 | def setUp(self): 12 | self._backoff = backoff.Backoff(10, 100, 1.5) 13 | 14 | def testInitial(self): 15 | self.assertEqual(10, self._backoff.Failed()) 16 | 17 | def testIncrease(self): 18 | self._backoff.Failed() 19 | self.assertEqual(15, self._backoff.Failed()) 20 | 21 | def testMaximum(self): 22 | for _ in range(100): 23 | self._backoff.Failed() 24 | 25 | self.assertEqual(100, self._backoff.Failed()) 26 | 27 | def testResetOnSuccess(self): 28 | for _ in range(4): 29 | self._backoff.Failed() 30 | self._backoff.Succeeded() 31 | self.assertEqual(10, self._backoff.Failed()) 32 | 33 | 34 | if __name__ == '__main__': 35 | absltest.main() 36 | -------------------------------------------------------------------------------- /tests/py/labels_test.py: -------------------------------------------------------------------------------- 1 | """Tests for googleclouddebugger.labels""" 2 | 3 | from absl.testing import absltest 4 | from googleclouddebugger import labels 5 | 6 | 7 | class LabelsTest(absltest.TestCase): 8 | 9 | def testDefinesLabelsCorrectly(self): 10 | self.assertEqual(labels.Breakpoint.REQUEST_LOG_ID, 'requestlogid') 11 | 12 | self.assertEqual(labels.Debuggee.DOMAIN, 'domain') 13 | self.assertEqual(labels.Debuggee.PROJECT_ID, 'projectid') 14 | self.assertEqual(labels.Debuggee.MODULE, 'module') 15 | self.assertEqual(labels.Debuggee.VERSION, 'version') 16 | self.assertEqual(labels.Debuggee.MINOR_VERSION, 'minorversion') 17 | self.assertEqual(labels.Debuggee.PLATFORM, 'platform') 18 | self.assertEqual(labels.Debuggee.REGION, 'region') 19 | 20 | def testProvidesAllLabelsSet(self): 21 | self.assertIsNotNone(labels.Breakpoint.SET_ALL) 22 | self.assertLen(labels.Breakpoint.SET_ALL, 1) 23 | 24 | self.assertIsNotNone(labels.Debuggee.SET_ALL) 25 | self.assertLen(labels.Debuggee.SET_ALL, 7) 26 | 27 | 28 | if __name__ == '__main__': 29 | absltest.main() 30 | -------------------------------------------------------------------------------- /src/googleclouddebugger/native_module.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 
15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_NATIVE_MODULE_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_NATIVE_MODULE_H_ 19 | 20 | namespace devtools { 21 | namespace cdbg { 22 | 23 | // Python Cloud Debugger native module entry point 24 | void InitDebuggerNativeModule(); 25 | 26 | } // namespace cdbg 27 | } // namespace devtools 28 | 29 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_NATIVE_MODULE_H_ 30 | -------------------------------------------------------------------------------- /src/googleclouddebugger/error_data_visibility_policy.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Always returns the provided error on visibility requests. 15 | 16 | Example Usage: 17 | 18 | policy = ErrorDataVisibilityPolicy('An error message') 19 | 20 | policy.IsDataVisible('org.foo.bar') -> (False, 'An error message') 21 | """ 22 | 23 | 24 | class ErrorDataVisibilityPolicy(object): 25 | """Visibility policy that always returns an error to the caller.""" 26 | 27 | def __init__(self, error_message): 28 | self.error_message = error_message 29 | 30 | def IsDataVisible(self, unused_path): 31 | return (False, self.error_message) 32 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # How to become a contributor and submit your own code 2 | 3 | ## Contributor License Agreements 4 | 5 | We'd love to accept your patches! Before we can take them, we have to jump a couple of legal hurdles. 6 | 7 | Please fill out either the individual or corporate Contributor License Agreement (CLA). 8 | 9 | * If you are an individual writing original source code and you're sure you own the intellectual property, then you'll need to sign an [individual CLA](http://code.google.com/legal/individual-cla-v1.0.html). 10 | * If you work for a company that wants to allow you to contribute your work, then you'll need to sign a [corporate CLA](http://code.google.com/legal/corporate-cla-v1.0.html). 11 | 12 | Follow either of the two links above to access the appropriate CLA and instructions for how to sign and return it. Once we receive it, we'll be able to accept your pull requests. 13 | 14 | ## Contributing A Patch 15 | 16 | 1. Submit an issue describing your proposed change to the repo in question. 17 | 2. The repo owner will respond to your issue promptly. 18 | 3. If your proposed change is accepted, and you haven't already done so, sign a Contributor License Agreement (see details above). 19 | 4. Fork the desired repo, develop and test your code changes. 20 | 5. Submit a pull request. 
21 | -------------------------------------------------------------------------------- /alpine/Dockerfile: -------------------------------------------------------------------------------- 1 | # WARNING: Stackdriver Debugger is not regularly tested on the Alpine Linux 2 | # platform and support will be on a best effort basis. 3 | # Sample Alpine Linux image including Python and the Stackdriver Debugger agent. 4 | # To build: 5 | # docker build . # Python 2.7 6 | # docker build --build-arg PYTHON_VERSION=3 . # Python 3.6 7 | # The final image size should be around 50-60 MiB. 8 | 9 | # Stage 1: Build the agent. 10 | FROM alpine:latest 11 | 12 | ARG PYTHON_VERSION=2 13 | ENV PYTHON_VERSION=$PYTHON_VERSION 14 | ENV PYTHON=python${PYTHON_VERSION} 15 | 16 | RUN apk update 17 | RUN apk add bash git curl gcc g++ make cmake ${PYTHON}-dev 18 | RUN if [ $PYTHON_VERSION == "2" ]; then apk add py-setuptools; fi 19 | 20 | RUN git clone https://github.com/GoogleCloudPlatform/cloud-debug-python 21 | RUN PYTHON=$PYTHON bash cloud-debug-python/src/build.sh 22 | 23 | 24 | # Stage 2: Create minimal image with just Python and the debugger. 25 | FROM alpine:latest 26 | 27 | ARG PYTHON_VERSION=2 28 | ENV PYTHON_VERSION=$PYTHON_VERSION 29 | ENV PYTHON=python${PYTHON_VERSION} 30 | 31 | RUN apk --no-cache add $PYTHON libstdc++ 32 | RUN if [ $PYTHON_VERSION == "2" ]; then apk add --no-cache py-setuptools; fi 33 | 34 | COPY --from=0 /cloud-debug-python/src/dist/*.egg . 35 | RUN $PYTHON -m easy_install *.egg 36 | RUN rm *.egg 37 | -------------------------------------------------------------------------------- /tests/py/glob_data_visibility_policy_test.py: -------------------------------------------------------------------------------- 1 | """Tests for glob_data_visibility_policy.""" 2 | 3 | from absl.testing import absltest 4 | from googleclouddebugger import glob_data_visibility_policy 5 | 6 | RESPONSES = glob_data_visibility_policy.RESPONSES 7 | UNKNOWN_TYPE = (False, RESPONSES['UNKNOWN_TYPE']) 8 | BLACKLISTED = (False, RESPONSES['BLACKLISTED']) 9 | NOT_WHITELISTED = (False, RESPONSES['NOT_WHITELISTED']) 10 | VISIBLE = (True, RESPONSES['VISIBLE']) 11 | 12 | 13 | class GlobDataVisibilityPolicyTest(absltest.TestCase): 14 | 15 | def testIsDataVisible(self): 16 | blacklist_patterns = ( 17 | 'wl1.private1', 18 | 'wl2.*', 19 | '*.private2', 20 | '', 21 | ) 22 | whitelist_patterns = ('wl1.*', 'wl2.*') 23 | 24 | policy = glob_data_visibility_policy.GlobDataVisibilityPolicy( 25 | blacklist_patterns, whitelist_patterns) 26 | 27 | self.assertEqual(BLACKLISTED, policy.IsDataVisible('wl1.private1')) 28 | self.assertEqual(BLACKLISTED, policy.IsDataVisible('wl2.foo')) 29 | self.assertEqual(BLACKLISTED, policy.IsDataVisible('foo.private2')) 30 | self.assertEqual(NOT_WHITELISTED, policy.IsDataVisible('wl3.foo')) 31 | self.assertEqual(VISIBLE, policy.IsDataVisible('wl1.foo')) 32 | self.assertEqual(UNKNOWN_TYPE, policy.IsDataVisible(None)) 33 | 34 | 35 | if __name__ == '__main__': 36 | absltest.main() 37 | -------------------------------------------------------------------------------- /src/googleclouddebugger/appengine_pretty_printers.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Formatters for well known objects that don't show up nicely by default.""" 15 | 16 | try: 17 | from protorpc import messages # pylint: disable=g-import-not-at-top 18 | except ImportError: 19 | messages = None 20 | 21 | try: 22 | from google.appengine.ext import ndb # pylint: disable=g-import-not-at-top 23 | except ImportError: 24 | ndb = None 25 | 26 | 27 | def PrettyPrinter(obj): 28 | """Pretty printers for AppEngine objects.""" 29 | 30 | if ndb and isinstance(obj, ndb.Model): 31 | return obj.to_dict().items(), 'ndb.Model(%s)' % type(obj).__name__ 32 | 33 | if messages and isinstance(obj, messages.Enum): 34 | return [('name', obj.name), ('number', obj.number)], type(obj).__name__ 35 | 36 | return None 37 | -------------------------------------------------------------------------------- /src/googleclouddebugger/labels.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Defines the keys of the well known labels used by the cloud debugger. 15 | 16 | TODO: Define these strings in a common format for all agents to 17 | share. This file needs to be maintained with the code generator file 18 | being used in the UI, until the labels are unified. 19 | """ 20 | 21 | 22 | class Breakpoint(object): 23 | REQUEST_LOG_ID = 'requestlogid' 24 | 25 | SET_ALL = frozenset([ 26 | 'requestlogid', 27 | ]) 28 | 29 | 30 | class Debuggee(object): 31 | DOMAIN = 'domain' 32 | PROJECT_ID = 'projectid' 33 | MODULE = 'module' 34 | VERSION = 'version' 35 | MINOR_VERSION = 'minorversion' 36 | PLATFORM = 'platform' 37 | REGION = 'region' 38 | 39 | SET_ALL = frozenset([ 40 | 'domain', 41 | 'projectid', 42 | 'module', 43 | 'version', 44 | 'minorversion', 45 | 'platform', 46 | 'region', 47 | ]) 48 | -------------------------------------------------------------------------------- /src/googleclouddebugger/python_callback.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYTHON_CALLBACK_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYTHON_CALLBACK_H_ 19 | 20 | #include 21 | 22 | #include "common.h" 23 | #include "python_util.h" 24 | 25 | namespace devtools { 26 | namespace cdbg { 27 | 28 | // Wraps std::function in a zero arguments Python callable. 29 | class PythonCallback { 30 | public: 31 | PythonCallback() {} 32 | 33 | // Creates a zero argument Python callable that will delegate to "callback" 34 | // when invoked. The callback returns will always return None. 35 | static ScopedPyObject Wrap(std::function callback); 36 | 37 | // Disables any futher invocations of "callback_". The "method" is the 38 | // return value of "Wrap". 39 | static void Disable(PyObject* method); 40 | 41 | static PyTypeObject python_type_; 42 | 43 | private: 44 | static PyObject* Run(PyObject* self); 45 | 46 | private: 47 | // Callback to invoke or nullptr if the callback was cancelled. 48 | std::function callback_; 49 | 50 | static PyMethodDef callback_method_def_; 51 | 52 | DISALLOW_COPY_AND_ASSIGN(PythonCallback); 53 | }; 54 | 55 | } // namespace cdbg 56 | } // namespace devtools 57 | 58 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYTHON_CALLBACK_H_ 59 | -------------------------------------------------------------------------------- /WORKSPACE: -------------------------------------------------------------------------------- 1 | load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive") 2 | 3 | http_archive( 4 | name = "bazel_skylib", 5 | sha256 = "74d544d96f4a5bb630d465ca8bbcfe231e3594e5aae57e1edbf17a6eb3ca2506", 6 | urls = [ 7 | "https://mirror.bazel.build/github.com/bazelbuild/bazel-skylib/releases/download/1.3.0/bazel-skylib-1.3.0.tar.gz", 8 | "https://github.com/bazelbuild/bazel-skylib/releases/download/1.3.0/bazel-skylib-1.3.0.tar.gz", 9 | ], 10 | ) 11 | load("@bazel_skylib//:workspace.bzl", "bazel_skylib_workspace") 12 | bazel_skylib_workspace() 13 | 14 | http_archive( 15 | name = "com_github_gflags_gflags", 16 | sha256 = "34af2f15cf7367513b352bdcd2493ab14ce43692d2dcd9dfc499492966c64dcf", 17 | strip_prefix = "gflags-2.2.2", 18 | urls = ["https://github.com/gflags/gflags/archive/v2.2.2.tar.gz"], 19 | ) 20 | 21 | http_archive( 22 | name = "com_github_google_glog", 23 | sha256 = "21bc744fb7f2fa701ee8db339ded7dce4f975d0d55837a97be7d46e8382dea5a", 24 | strip_prefix = "glog-0.5.0", 25 | urls = ["https://github.com/google/glog/archive/v0.5.0.zip"], 26 | ) 27 | 28 | # Pinning to 1.12.1, the last release that supports C++11 29 | http_archive( 30 | name = "com_google_googletest", 31 | urls = ["https://github.com/google/googletest/archive/58d77fa8070e8cec2dc1ed015d66b454c8d78850.tar.gz"], 32 | strip_prefix = "googletest-58d77fa8070e8cec2dc1ed015d66b454c8d78850", 33 | ) 34 | 35 | # Used to build against Python.h 36 | http_archive( 37 | name = "pybind11_bazel", 38 | strip_prefix = "pybind11_bazel-faf56fb3df11287f26dbc66fdedf60a2fc2c6631", 39 | urls = ["https://github.com/pybind/pybind11_bazel/archive/faf56fb3df11287f26dbc66fdedf60a2fc2c6631.zip"], 40 | ) 41 | 42 | 
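# pybind11, together with the python_configure() rule from pybind11_bazel invoked below,
# generates the @local_config_python repository that provides the Python headers used by
# the agent's cc_library targets.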
http_archive( 43 | name = "pybind11", 44 | build_file = "@pybind11_bazel//:pybind11.BUILD", 45 | strip_prefix = "pybind11-2.9.2", 46 | urls = ["https://github.com/pybind/pybind11/archive/v2.9.2.tar.gz"], 47 | ) 48 | load("@pybind11_bazel//:python_configure.bzl", "python_configure") 49 | python_configure(name = "local_config_python")#, python_interpreter_target = interpreter) 50 | 51 | -------------------------------------------------------------------------------- /src/googleclouddebugger/backoff.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Implements exponential backoff for retry timeouts.""" 15 | 16 | 17 | class Backoff(object): 18 | """Exponential backoff for retry timeouts. 19 | 20 | This class manages delay between retries for a single kind of request. It 21 | starts from a small delay. The delay is exponentially increased between 22 | subsequent failures, up to the specified maximum. Once the request succeeds 23 | once, the delay is reset to minimum. 24 | 25 | Attributes: 26 | min_interval_sec: initial small delay. 27 | max_interval_sec: maximum delay between retries. 28 | multiplier: factor for exponential increase. 29 | """ 30 | 31 | def __init__(self, min_interval_sec=10, max_interval_sec=600, multiplier=2): 32 | """Class constructor. 33 | 34 | Args: 35 | min_interval_sec: initial small delay. 36 | max_interval_sec: maximum delay between retries. 37 | multiplier: factor for exponential increase. 38 | """ 39 | self.min_interval_sec = min_interval_sec 40 | self.max_interval_sec = max_interval_sec 41 | self.multiplier = multiplier 42 | self.Succeeded() 43 | 44 | def Succeeded(self): 45 | """Resets the delay to minimum upon request success.""" 46 | self._current_interval_sec = self.min_interval_sec 47 | 48 | def Failed(self): 49 | """Indicates that a request has failed. 50 | 51 | Returns: 52 | Time interval to wait before retrying (in seconds). 53 | """ 54 | interval = self._current_interval_sec 55 | self._current_interval_sec = min( 56 | self.max_interval_sec, self._current_interval_sec * self.multiplier) 57 | return interval 58 | -------------------------------------------------------------------------------- /src/googleclouddebugger/rate_limit.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_RATE_LIMIT_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_RATE_LIMIT_H_ 19 | 20 | #include 21 | 22 | #include "leaky_bucket.h" 23 | #include "common.h" 24 | 25 | namespace devtools { 26 | namespace cdbg { 27 | 28 | // Initializes quota objects if not initialized yet. 29 | void LazyInitializeRateLimit(); 30 | 31 | // Release quota objects. 32 | void CleanupRateLimit(); 33 | 34 | // Condition and dynamic logging rate limits are defined as the maximum 35 | // number of lines of Python code per second to execute. These rate are enforced 36 | // as following: 37 | // 1. If a single breakpoint contributes to half the maximum rate, that 38 | // breakpoint will be deactivated. 39 | // 2. If all breakpoints combined hit the maximum rate, any breakpoint to 40 | // exceed the limit gets disabled. 41 | // 42 | // The first rule ensures that in vast majority of scenarios expensive 43 | // breakpoints will get deactivated. The second rule guarantees that in edge 44 | // case scenarios the total amount of time spent in condition evaluation will 45 | // not exceed the alotted limit. 46 | // 47 | // While the actual cost of Python lines is not uniform, we only care about the 48 | // average. All limits ignore the number of CPUs since Python is inherently 49 | // single threaded. 50 | LeakyBucket* GetGlobalConditionQuota(); 51 | std::unique_ptr CreatePerBreakpointConditionQuota(); 52 | LeakyBucket* GetGlobalDynamicLogQuota(); 53 | LeakyBucket* GetGlobalDynamicLogBytesQuota(); 54 | } // namespace cdbg 55 | } // namespace devtools 56 | 57 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_RATE_LIMIT_H_ 58 | -------------------------------------------------------------------------------- /src/googleclouddebugger/python_callback.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | // Ensure that Python.h is included before any other header. 
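// (common.h pulls in Python.h first; per the CPython C API docs, Python.h may define
// preprocessor macros that affect the standard headers, so it must come before any
// other include.)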
18 | #include "common.h" 19 | 20 | #include "python_callback.h" 21 | 22 | namespace devtools { 23 | namespace cdbg { 24 | 25 | PyTypeObject PythonCallback::python_type_ = 26 | DefaultTypeDefinition(CDBG_SCOPED_NAME("_Callback")); 27 | 28 | PyMethodDef PythonCallback::callback_method_def_ = { 29 | const_cast("Callback"), // ml_name 30 | reinterpret_cast(PythonCallback::Run), // ml_meth 31 | METH_NOARGS, // ml_flags 32 | const_cast("") // ml_doc 33 | }; 34 | 35 | ScopedPyObject PythonCallback::Wrap(std::function callback) { 36 | ScopedPyObject callback_obj = NewNativePythonObject(); 37 | py_object_cast(callback_obj.get())->callback_ = callback; 38 | 39 | ScopedPyObject callback_method(PyCFunction_NewEx( 40 | &callback_method_def_, 41 | callback_obj.get(), 42 | GetDebugletModule())); 43 | 44 | return callback_method; 45 | } 46 | 47 | 48 | void PythonCallback::Disable(PyObject* method) { 49 | DCHECK(PyCFunction_Check(method)); 50 | 51 | auto instance = py_object_cast(PyCFunction_GET_SELF(method)); 52 | DCHECK(instance); 53 | 54 | instance->callback_ = nullptr; 55 | } 56 | 57 | 58 | PyObject* PythonCallback::Run(PyObject* self) { 59 | auto instance = py_object_cast(self); 60 | 61 | if (instance->callback_ != nullptr) { 62 | instance->callback_(); 63 | } 64 | 65 | Py_RETURN_NONE; 66 | } 67 | 68 | } // namespace cdbg 69 | } // namespace devtools 70 | -------------------------------------------------------------------------------- /src/googleclouddebugger/BUILD: -------------------------------------------------------------------------------- 1 | package(default_visibility = ["//visibility:public"]) 2 | 3 | cc_library( 4 | name = "common", 5 | hdrs = ["common.h"], 6 | deps = [ 7 | "@com_github_google_glog//:glog", 8 | "@local_config_python//:python_headers", 9 | ], 10 | ) 11 | 12 | cc_library( 13 | name = "nullable", 14 | hdrs = ["nullable.h"], 15 | deps = [ 16 | ":common", 17 | ], 18 | ) 19 | 20 | cc_library( 21 | name = "python_util", 22 | srcs = ["python_util.cc"], 23 | hdrs = ["python_util.h"], 24 | deps = [ 25 | ":common", 26 | ":nullable", 27 | "//src/third_party:pylinetable", 28 | ], 29 | ) 30 | 31 | 32 | cc_library( 33 | name = "python_callback", 34 | srcs = ["python_callback.cc"], 35 | hdrs = ["python_callback.h"], 36 | deps = [ 37 | ":common", 38 | ":python_util", 39 | ], 40 | ) 41 | 42 | cc_library( 43 | name = "leaky_bucket", 44 | srcs = ["leaky_bucket.cc"], 45 | hdrs = ["leaky_bucket.h"], 46 | deps = [ 47 | ":common", 48 | ], 49 | ) 50 | 51 | cc_library( 52 | name = "rate_limit", 53 | srcs = ["rate_limit.cc"], 54 | hdrs = ["rate_limit.h"], 55 | deps = [ 56 | ":common", 57 | ":leaky_bucket", 58 | ], 59 | ) 60 | 61 | cc_library( 62 | name = "bytecode_manipulator", 63 | srcs = ["bytecode_manipulator.cc"], 64 | hdrs = ["bytecode_manipulator.h"], 65 | deps = [ 66 | ":common", 67 | ], 68 | ) 69 | 70 | cc_library( 71 | name = "bytecode_breakpoint", 72 | srcs = ["bytecode_breakpoint.cc"], 73 | hdrs = ["bytecode_breakpoint.h"], 74 | deps = [ 75 | ":bytecode_manipulator", 76 | ":common", 77 | ":python_callback", 78 | ":python_util", 79 | ], 80 | ) 81 | 82 | cc_library( 83 | name = "immutability_tracer", 84 | srcs = ["immutability_tracer.cc"], 85 | hdrs = ["immutability_tracer.h"], 86 | deps = [ 87 | ":common", 88 | ":python_util", 89 | ], 90 | ) 91 | 92 | cc_library( 93 | name = "conditional_breakpoint", 94 | srcs = ["conditional_breakpoint.cc"], 95 | hdrs = ["conditional_breakpoint.h"], 96 | deps = [ 97 | ":common", 98 | ":immutability_tracer", 99 | ":python_util", 100 | ":rate_limit", 101 | 
":leaky_bucket", 102 | ], 103 | ) 104 | -------------------------------------------------------------------------------- /src/googleclouddebugger/nullable.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_NULLABLE_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_NULLABLE_H_ 19 | 20 | #include "common.h" 21 | 22 | namespace devtools { 23 | namespace cdbg { 24 | 25 | template 26 | class Nullable { 27 | public: 28 | Nullable() : has_value_(false) {} 29 | 30 | // Copy constructor. 31 | Nullable(const Nullable& other) 32 | : has_value_(other.has_value()) { 33 | if (other.has_value()) { 34 | value_ = other.value_; 35 | } 36 | } 37 | 38 | // Implicit initialization from the value of type T. 39 | explicit Nullable(const T& value) : has_value_(true), value_(value) {} 40 | 41 | // Assignment of the value of type Nullable. 42 | Nullable& operator= (const Nullable& other) { 43 | has_value_ = other.has_value(); 44 | if (has_value_) { 45 | value_ = other.value(); 46 | } 47 | 48 | return *this; 49 | } 50 | 51 | // Explicitly sets the value of type T. 52 | void set_value(const T& value) { 53 | has_value_ = true; 54 | value_ = value; 55 | } 56 | 57 | // Reset back to no value. 58 | void clear() { 59 | has_value_ = false; 60 | } 61 | 62 | // Returns true if value is initialized, false otherwise. 63 | bool has_value() const { 64 | return has_value_; 65 | } 66 | 67 | // Explicitly returns stored value. 68 | const T& value() const { 69 | DCHECK(has_value()); 70 | return value_; 71 | } 72 | 73 | bool operator== (const Nullable& other) const { 74 | return (!has_value_ && !other.has_value_) || 75 | (has_value_ && other.has_value_ && (value_ == other.value_)); 76 | } 77 | 78 | bool operator!= (const Nullable& other) const { 79 | return !(*this == other); 80 | } 81 | 82 | private: 83 | bool has_value_; 84 | T value_; 85 | 86 | // Intentionally copyable. 
87 | }; 88 | 89 | } // namespace cdbg 90 | } // namespace devtools 91 | 92 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_NULLABLE_H_ 93 | -------------------------------------------------------------------------------- /tests/py/application_info_test.py: -------------------------------------------------------------------------------- 1 | """Tests for application_info.""" 2 | 3 | import os 4 | from unittest import mock 5 | 6 | import requests 7 | 8 | from googleclouddebugger import application_info 9 | from absl.testing import absltest 10 | 11 | 12 | class ApplicationInfoTest(absltest.TestCase): 13 | 14 | def test_get_platform_default(self): 15 | """Returns default platform when no platform is detected.""" 16 | self.assertEqual(application_info.PlatformType.DEFAULT, 17 | application_info.GetPlatform()) 18 | 19 | def test_get_platform_gcf_name(self): 20 | """Returns cloud_function when the FUNCTION_NAME env variable is set.""" 21 | try: 22 | os.environ['FUNCTION_NAME'] = 'function-name' 23 | self.assertEqual(application_info.PlatformType.CLOUD_FUNCTION, 24 | application_info.GetPlatform()) 25 | finally: 26 | del os.environ['FUNCTION_NAME'] 27 | 28 | def test_get_platform_gcf_target(self): 29 | """Returns cloud_function when the FUNCTION_TARGET env variable is set.""" 30 | try: 31 | os.environ['FUNCTION_TARGET'] = 'function-target' 32 | self.assertEqual(application_info.PlatformType.CLOUD_FUNCTION, 33 | application_info.GetPlatform()) 34 | finally: 35 | del os.environ['FUNCTION_TARGET'] 36 | 37 | def test_get_region_none(self): 38 | """Returns None when no region is detected.""" 39 | self.assertIsNone(application_info.GetRegion()) 40 | 41 | def test_get_region_gcf(self): 42 | """Returns correct region when the FUNCTION_REGION env variable is set.""" 43 | try: 44 | os.environ['FUNCTION_REGION'] = 'function-region' 45 | self.assertEqual('function-region', application_info.GetRegion()) 46 | finally: 47 | del os.environ['FUNCTION_REGION'] 48 | 49 | @mock.patch('requests.get') 50 | def test_get_region_metadata_server(self, mock_requests_get): 51 | """Returns correct region if found in metadata server.""" 52 | success_response = mock.Mock(requests.Response) 53 | success_response.status_code = 200 54 | success_response.text = 'a/b/function-region' 55 | mock_requests_get.return_value = success_response 56 | 57 | self.assertEqual('function-region', application_info.GetRegion()) 58 | 59 | @mock.patch('requests.get') 60 | def test_get_region_metadata_server_fail(self, mock_requests_get): 61 | """Returns None if region not found in metadata server.""" 62 | exception = requests.exceptions.HTTPError() 63 | failed_response = mock.Mock(requests.Response) 64 | failed_response.status_code = 400 65 | failed_response.raise_for_status.side_effect = exception 66 | mock_requests_get.return_value = failed_response 67 | 68 | self.assertIsNone(application_info.GetRegion()) 69 | 70 | 71 | if __name__ == '__main__': 72 | absltest.main() 73 | -------------------------------------------------------------------------------- /src/googleclouddebugger/application_info.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Module to fetch information regarding the current application. 15 | 16 | Some examples of the information the methods in this module fetch are platform 17 | and region of the application. 18 | """ 19 | 20 | import enum 21 | import os 22 | import requests 23 | 24 | # These environment variables will be set automatically by cloud functions 25 | # depending on the runtime. If one of these values is set, we can infer that 26 | # the current environment is GCF. Reference: 27 | # https://cloud.google.com/functions/docs/env-var#runtime_environment_variables_set_automatically 28 | _GCF_EXISTENCE_ENV_VARIABLES = ['FUNCTION_NAME', 'FUNCTION_TARGET'] 29 | _GCF_REGION_ENV_VARIABLE = 'FUNCTION_REGION' 30 | 31 | _GCP_METADATA_REGION_URL = 'http://metadata/computeMetadata/v1/instance/region' 32 | _GCP_METADATA_HEADER = {'Metadata-Flavor': 'Google'} 33 | 34 | 35 | class PlatformType(enum.Enum): 36 | """The type of platform the application is running on. 37 | 38 | TODO: Define this enum in a common format for all agents to 39 | share. This enum needs to be maintained between the labels code generator 40 | and other agents, until there is a unified way to generate it. 41 | """ 42 | CLOUD_FUNCTION = 'cloud_function' 43 | DEFAULT = 'default' 44 | 45 | 46 | def GetPlatform(): 47 | """Returns PlatformType for the current application.""" 48 | 49 | # Check if it's a cloud function. 50 | for name in _GCF_EXISTENCE_ENV_VARIABLES: 51 | if name in os.environ: 52 | return PlatformType.CLOUD_FUNCTION 53 | 54 | # If we weren't able to identify the platform, fall back to default value. 55 | return PlatformType.DEFAULT 56 | 57 | 58 | def GetRegion(): 59 | """Returns region of the current application.""" 60 | 61 | # If it's running cloud function with an old runtime. 62 | if _GCF_REGION_ENV_VARIABLE in os.environ: 63 | return os.environ.get(_GCF_REGION_ENV_VARIABLE) 64 | 65 | # Otherwise try fetching it from the metadata server. 66 | try: 67 | response = requests.get( 68 | _GCP_METADATA_REGION_URL, headers=_GCP_METADATA_HEADER) 69 | response.raise_for_status() 70 | # Example of response text: projects/id/regions/us-central1. So we strip 71 | # everything before the last /. 72 | region = response.text.split('/')[-1] 73 | if region == 'html>': 74 | # Sometimes we get an html response! 75 | return None 76 | 77 | return region 78 | except requests.exceptions.RequestException: 79 | return None 80 | -------------------------------------------------------------------------------- /src/build-wheels.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | 3 | GFLAGS_URL=https://github.com/gflags/gflags/archive/v2.2.2.tar.gz 4 | GLOG_URL=https://github.com/google/glog/archive/v0.4.0.tar.gz 5 | 6 | SUPPORTED_VERSIONS=(cp36-cp36m cp37-cp37m cp38-cp38 cp39-cp39 cp310-cp310) 7 | 8 | ROOT=$(cd $(dirname "${BASH_SOURCE[0]}") >/dev/null; /bin/pwd -P) 9 | 10 | # Parallelize the build over N threads where N is the number of cores * 1.5. 
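# (For example, with 8 cores this expands to "-j 12"; if nproc is unavailable, the
# fallback of 4 cores yields "-j 6".)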
11 | PARALLEL_BUILD_OPTION="-j $(($(nproc 2> /dev/null || echo 4)*3/2))" 12 | 13 | # Clean up any previous build/test files. 14 | rm -rf \ 15 | ${ROOT}/build \ 16 | ${ROOT}/dist \ 17 | ${ROOT}/setup.cfg \ 18 | ${ROOT}/google_python_cloud_debugger.egg-info \ 19 | /io/dist \ 20 | /io/tests/py/__pycache__ 21 | 22 | # Create directory for third-party libraries. 23 | mkdir -p ${ROOT}/build/third_party 24 | 25 | # Build and install gflags to build/third_party. 26 | pushd ${ROOT}/build/third_party 27 | curl -Lk ${GFLAGS_URL} -o gflags.tar.gz 28 | tar xzvf gflags.tar.gz 29 | cd gflags-* 30 | mkdir build 31 | cd build 32 | cmake -DCMAKE_CXX_FLAGS=-fpic \ 33 | -DGFLAGS_NAMESPACE=google \ 34 | -DCMAKE_INSTALL_PREFIX:PATH=${ROOT}/build/third_party \ 35 | .. 36 | make ${PARALLEL_BUILD_OPTION} 37 | make install 38 | popd 39 | 40 | # Build and install glog to build/third_party. 41 | pushd ${ROOT}/build/third_party 42 | curl -L ${GLOG_URL} -o glog.tar.gz 43 | tar xzvf glog.tar.gz 44 | cd glog-* 45 | mkdir build 46 | cd build 47 | cmake -DCMAKE_CXX_FLAGS=-fpic \ 48 | -DCMAKE_PREFIX_PATH=${ROOT}/build/third_party \ 49 | -DCMAKE_INSTALL_PREFIX:PATH=${ROOT}/build/third_party \ 50 | .. 51 | make ${PARALLEL_BUILD_OPTION} 52 | make install 53 | popd 54 | 55 | # Extract build version from version.py 56 | grep "^ *__version__ *=" "/io/src/googleclouddebugger/version.py" | grep -Eo "[0-9.]+" > "version.txt" 57 | AGENT_VERSION=$(cat "version.txt") 58 | echo "Building distribution packages for python agent version ${AGENT_VERSION}" 59 | 60 | # Create setup.cfg file and point to the third_party libraries we just build. 61 | echo "[global] 62 | verbose=1 63 | 64 | [build_ext] 65 | include_dirs=${ROOT}/build/third_party/include 66 | library_dirs=${ROOT}/build/third_party/lib:${ROOT}/build/third_party/lib64" > ${ROOT}/setup.cfg 67 | 68 | # Build the Python Cloud Debugger agent. 69 | pushd ${ROOT} 70 | 71 | for PY_VERSION in ${SUPPORTED_VERSIONS[@]}; do 72 | echo "Building the ${PY_VERSION} agent" 73 | "/opt/python/${PY_VERSION}/bin/pip" install -r /io/requirements_dev.txt 74 | "/opt/python/${PY_VERSION}/bin/pip" wheel /io/src --no-deps -w /tmp/dist/ 75 | PACKAGE_NAME="google_python_cloud_debugger-${AGENT_VERSION}" 76 | WHL_FILENAME="${PACKAGE_NAME}-${PY_VERSION}-linux_x86_64.whl" 77 | auditwheel repair "/tmp/dist/${WHL_FILENAME}" -w /io/dist/ 78 | 79 | echo "Running tests" 80 | "/opt/python/${PY_VERSION}/bin/pip" install google-python-cloud-debugger --no-index -f /io/dist 81 | "/opt/python/${PY_VERSION}/bin/pytest" /io/tests/py 82 | done 83 | 84 | popd 85 | 86 | # Clean up temporary directories. 87 | rm -rf \ 88 | ${ROOT}/build \ 89 | ${ROOT}/setup.cfg \ 90 | ${ROOT}/google_python_cloud_debugger.egg-info \ 91 | /io/tests/py/__pycache__ 92 | 93 | echo "Build artifacts are in the dist directory" 94 | 95 | -------------------------------------------------------------------------------- /src/googleclouddebugger/glob_data_visibility_policy.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Determines the visibility of python data and symbols. 15 | 16 | Example Usage: 17 | 18 | blacklist_patterns = ( 19 | 'com.private.*' 20 | 'com.foo.bar' 21 | ) 22 | whitelist_patterns = ( 23 | 'com.*' 24 | ) 25 | policy = GlobDataVisibilityPolicy(blacklist_patterns, whitelist_patterns) 26 | 27 | policy.IsDataVisible('org.foo.bar') -> (False, 'not whitelisted by config') 28 | policy.IsDataVisible('com.foo.bar') -> (False, 'blacklisted by config') 29 | policy.IsDataVisible('com.private.foo') -> (False, 'blacklisted by config') 30 | policy.IsDataVisible('com.foo') -> (True, 'visible') 31 | """ 32 | 33 | import fnmatch 34 | 35 | # Possible visibility responses 36 | RESPONSES = { 37 | 'UNKNOWN_TYPE': 'could not determine type', 38 | 'BLACKLISTED': 'blacklisted by config', 39 | 'NOT_WHITELISTED': 'not whitelisted by config', 40 | 'VISIBLE': 'visible', 41 | } 42 | 43 | 44 | class GlobDataVisibilityPolicy(object): 45 | """Policy provides visibility policy details to the caller.""" 46 | 47 | def __init__(self, blacklist_patterns, whitelist_patterns): 48 | self.blacklist_patterns = blacklist_patterns 49 | self.whitelist_patterns = whitelist_patterns 50 | 51 | def IsDataVisible(self, path): 52 | """Returns a tuple (visible, reason) stating if the data should be visible. 53 | 54 | Args: 55 | path: A dot separated path that represents a package, class, method or 56 | variable. The format is identical to pythons "import" statement. 57 | 58 | Returns: 59 | (visible, reason) where visible is a boolean that is True if the data 60 | should be visible. Reason is a string reason that can be displayed 61 | to the user and indicates why data is visible or not visible. 62 | """ 63 | if path is None: 64 | return (False, RESPONSES['UNKNOWN_TYPE']) 65 | 66 | if _Matches(path, self.blacklist_patterns): 67 | return (False, RESPONSES['BLACKLISTED']) 68 | 69 | if not _Matches(path, self.whitelist_patterns): 70 | return (False, RESPONSES['NOT_WHITELISTED']) 71 | 72 | return (True, RESPONSES['VISIBLE']) 73 | 74 | 75 | def _Matches(path, pattern_list): 76 | """Returns true if path matches any patten found in pattern_list. 77 | 78 | Args: 79 | path: A dot separated path to a package, class, method or variable 80 | pattern_list: A list of wildcard patterns 81 | 82 | Returns: 83 | True if path matches any wildcard found in pattern_list. 84 | """ 85 | # Note: This code does not scale to large pattern_list sizes. 86 | return any(fnmatch.fnmatchcase(path, pattern) for pattern in pattern_list) 87 | -------------------------------------------------------------------------------- /src/googleclouddebugger/conditional_breakpoint.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_CONDITIONAL_BREAKPOINT_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_CONDITIONAL_BREAKPOINT_H_ 19 | 20 | #include "leaky_bucket.h" 21 | #include "common.h" 22 | #include "python_util.h" 23 | 24 | namespace devtools { 25 | namespace cdbg { 26 | 27 | // Breakpoints emulator will typically notify the next layer when a breakpoint 28 | // hits. However there are other situations that the next layer need to be 29 | // aware of. 30 | enum class BreakpointEvent { 31 | // The breakpoint was hit. 32 | Hit, 33 | 34 | // Error occurred (e.g. breakpoint could not be set). 35 | Error, 36 | 37 | // Evaluation of conditional expression is consuming too much resources. It is 38 | // a responsibility of the next layer to disable the offending breakpoint. 39 | GlobalConditionQuotaExceeded, 40 | BreakpointConditionQuotaExceeded, 41 | 42 | // The conditional expression changes state of the program and therefore not 43 | // allowed. 44 | ConditionExpressionMutable, 45 | }; 46 | 47 | 48 | // Implements breakpoint action to evaluate optional breakpoint condition. If 49 | // the condition matches, calls Python callable object. 50 | class ConditionalBreakpoint { 51 | public: 52 | ConditionalBreakpoint(ScopedPyCodeObject condition, ScopedPyObject callback); 53 | 54 | ~ConditionalBreakpoint(); 55 | 56 | void OnBreakpointHit(); 57 | 58 | void OnBreakpointError(); 59 | 60 | private: 61 | // Evaluates breakpoint condition within the context of the specified frame. 62 | // Returns true if the breakpoint doesn't have condition or if condition 63 | // was evaluated to True. Otherwise returns false. Raised exceptions are 64 | // considered as condition not matched. 65 | bool EvaluateCondition(PyFrameObject* frame); 66 | 67 | // Takes "time_ns" tokens from the quota for CPU consumption due to breakpoint 68 | // condition. If the quota is exceeded, this function clears the breakpoint 69 | // and reports "ConditionQuotaExceeded" breakpoint event. 70 | void ApplyConditionQuota(int time_ns); 71 | 72 | // Notifies the next layer through the callable object. 73 | void NotifyBreakpointEvent(BreakpointEvent event, PyFrameObject* frame); 74 | 75 | private: 76 | // Callable object representing the compiled conditional expression to 77 | // evaluate on each breakpoint hit. If the breakpoint has no condition, this 78 | // field will be nullptr. 79 | ScopedPyCodeObject condition_; 80 | 81 | // Python callable object to invoke on breakpoint events. 82 | ScopedPyObject python_callback_; 83 | 84 | // Per breakpoint quota on cost of evaluating breakpoint conditions. See 85 | // "rate_limit.h" file for detailed explanation. 
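// (In short: a breakpoint that alone consumes half of the global condition-evaluation
// rate, or that pushes the combined rate over the maximum, gets deactivated.)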
86 | std::unique_ptr per_breakpoint_condition_quota_; 87 | 88 | DISALLOW_COPY_AND_ASSIGN(ConditionalBreakpoint); 89 | }; 90 | 91 | } // namespace cdbg 92 | } // namespace devtools 93 | 94 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_CONDITIONAL_BREAKPOINT_H_ 95 | -------------------------------------------------------------------------------- /src/build.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | # 3 | # Copyright 2015 Google Inc. All Rights Reserved. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | # 19 | # This script builds the Python Cloud Debugger agent from source code. The 20 | # debugger is currently only supported on Linux. 21 | # 22 | # The build script assumes Python, cmake, curl and gcc are installed. 23 | # To install these dependencies on Debian, run this commandd: 24 | # sudo apt-get -y -q --no-install-recommends install \ 25 | # curl ca-certificates gcc build-essential cmake \ 26 | # python python-dev libpython2.7 python-setuptools 27 | # 28 | # The Python Cloud Debugger agent uses glog and gflags libraries. We build them 29 | # first. Then we use setuptools to build the debugger agent. The entire 30 | # build process is local and does not change any system directories. 31 | # 32 | # Home page of gflags: https://github.com/gflags/gflags 33 | # Home page of glog: https://github.com/google/glog 34 | # 35 | 36 | GFLAGS_URL=https://github.com/gflags/gflags/archive/v2.2.2.tar.gz 37 | GLOG_URL=https://github.com/google/glog/archive/v0.4.0.tar.gz 38 | 39 | ROOT=$(cd $(dirname "${BASH_SOURCE[0]}") >/dev/null; /bin/pwd -P) 40 | 41 | # Parallelize the build over N threads where N is the number of cores * 1.5. 42 | PARALLEL_BUILD_OPTION="-j $(($(nproc 2> /dev/null || echo 4)*3/2))" 43 | 44 | # Clean up any previous build files. 45 | rm -rf \ 46 | ${ROOT}/build \ 47 | ${ROOT}/dist \ 48 | ${ROOT}/setup.cfg \ 49 | ${ROOT}/google_python_cloud_debugger.egg-info 50 | 51 | # Create directory for third-party libraries. 52 | mkdir -p ${ROOT}/build/third_party 53 | 54 | # Build and install gflags to build/third_party. 55 | pushd ${ROOT}/build/third_party 56 | curl -Lk ${GFLAGS_URL} -o gflags.tar.gz 57 | tar xzvf gflags.tar.gz 58 | cd gflags-* 59 | mkdir build 60 | cd build 61 | cmake -DCMAKE_CXX_FLAGS=-fpic \ 62 | -DGFLAGS_NAMESPACE=google \ 63 | -DCMAKE_INSTALL_PREFIX:PATH=${ROOT}/build/third_party \ 64 | .. 65 | make ${PARALLEL_BUILD_OPTION} 66 | make install 67 | popd 68 | 69 | # Build and install glog to build/third_party. 70 | pushd ${ROOT}/build/third_party 71 | curl -L ${GLOG_URL} -o glog.tar.gz 72 | tar xzvf glog.tar.gz 73 | cd glog-* 74 | mkdir build 75 | cd build 76 | cmake -DCMAKE_CXX_FLAGS=-fpic \ 77 | -DCMAKE_PREFIX_PATH=${ROOT}/build/third_party \ 78 | -DCMAKE_INSTALL_PREFIX:PATH=${ROOT}/build/third_party \ 79 | .. 80 | make ${PARALLEL_BUILD_OPTION} 81 | make install 82 | popd 83 | 84 | # Create setup.cfg file and point to the third_party libraries we just built. 
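# The [build_ext] settings written below point setuptools at the gflags/glog headers and
# libraries installed into build/third_party above.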
85 | echo "[global] 86 | verbose=1 87 | 88 | [build_ext] 89 | include_dirs=${ROOT}/build/third_party/include 90 | library_dirs=${ROOT}/build/third_party/lib:${ROOT}/build/third_party/lib64" > ${ROOT}/setup.cfg 91 | 92 | # Build the Python Cloud Debugger agent. 93 | pushd ${ROOT} 94 | # Use custom python command if variable is set 95 | "${PYTHON:-python3}" -m pip wheel . --no-deps -w dist 96 | popd 97 | 98 | # Clean up temporary directories. 99 | rm -rf \ 100 | ${ROOT}/build \ 101 | ${ROOT}/setup.cfg \ 102 | ${ROOT}/google_python_cloud_debugger.egg-info 103 | -------------------------------------------------------------------------------- /tests/py/yaml_data_visibility_config_reader_test.py: -------------------------------------------------------------------------------- 1 | """Tests for yaml_data_visibility_config_reader.""" 2 | 3 | import os 4 | import sys 5 | from unittest import mock 6 | 7 | from io import StringIO 8 | 9 | from absl.testing import absltest 10 | from googleclouddebugger import yaml_data_visibility_config_reader 11 | 12 | 13 | class StringIOOpen(object): 14 | """An open for StringIO that supports "with" semantics. 15 | 16 | I tried using mock.mock_open, but the read logic in the yaml.load code is 17 | incompatible with the returned mock object, leading to a test hang/timeout. 18 | """ 19 | 20 | def __init__(self, data): 21 | self.file_obj = StringIO(data) 22 | 23 | def __enter__(self): 24 | return self.file_obj 25 | 26 | def __exit__(self, type, value, traceback): # pylint: disable=redefined-builtin 27 | pass 28 | 29 | 30 | class YamlDataVisibilityConfigReaderTest(absltest.TestCase): 31 | 32 | def testOpenAndReadSuccess(self): 33 | data = """ 34 | blacklist: 35 | - bl1 36 | """ 37 | path_prefix = 'googleclouddebugger.' 38 | with mock.patch( 39 | path_prefix + 'yaml_data_visibility_config_reader.open', 40 | create=True) as m: 41 | m.return_value = StringIOOpen(data) 42 | config = yaml_data_visibility_config_reader.OpenAndRead() 43 | m.assert_called_with( 44 | os.path.join(sys.path[0], 'debugger-blacklist.yaml'), 'r') 45 | self.assertEqual(config.blacklist_patterns, ['bl1']) 46 | 47 | def testOpenAndReadFileNotFound(self): 48 | path_prefix = 'googleclouddebugger.' 
49 | with mock.patch( 50 | path_prefix + 'yaml_data_visibility_config_reader.open', 51 | create=True, 52 | side_effect=IOError('IO Error')): 53 | f = yaml_data_visibility_config_reader.OpenAndRead() 54 | self.assertIsNone(f) 55 | 56 | def testReadDataSuccess(self): 57 | data = """ 58 | blacklist: 59 | - bl1 60 | - bl2 61 | whitelist: 62 | - wl1 63 | - wl2.* 64 | """ 65 | 66 | config = yaml_data_visibility_config_reader.Read(StringIO(data)) 67 | self.assertItemsEqual(config.blacklist_patterns, ('bl1', 'bl2')) 68 | self.assertItemsEqual(config.whitelist_patterns, ('wl1', 'wl2.*')) 69 | 70 | def testYAMLLoadError(self): 71 | 72 | class ErrorIO(object): 73 | 74 | def read(self, size): 75 | del size # Unused 76 | raise IOError('IO Error') 77 | 78 | with self.assertRaises(yaml_data_visibility_config_reader.YAMLLoadError): 79 | yaml_data_visibility_config_reader.Read(ErrorIO()) 80 | 81 | def testBadYamlSyntax(self): 82 | data = """ 83 | blacklist: whitelist: 84 | """ 85 | 86 | with self.assertRaises(yaml_data_visibility_config_reader.ParseError): 87 | yaml_data_visibility_config_reader.Read(StringIO(data)) 88 | 89 | def testUnknownConfigKeyError(self): 90 | data = """ 91 | foo: 92 | - bar 93 | """ 94 | 95 | with self.assertRaises( 96 | yaml_data_visibility_config_reader.UnknownConfigKeyError): 97 | yaml_data_visibility_config_reader.Read(StringIO(data)) 98 | 99 | def testNotAListError(self): 100 | data = """ 101 | blacklist: 102 | foo: 103 | - bar 104 | """ 105 | 106 | with self.assertRaises(yaml_data_visibility_config_reader.NotAListError): 107 | yaml_data_visibility_config_reader.Read(StringIO(data)) 108 | 109 | def testElementNotAStringError(self): 110 | data = """ 111 | blacklist: 112 | - 5 113 | """ 114 | 115 | with self.assertRaises( 116 | yaml_data_visibility_config_reader.ElementNotAStringError): 117 | yaml_data_visibility_config_reader.Read(StringIO(data)) 118 | 119 | 120 | if __name__ == '__main__': 121 | absltest.main() 122 | -------------------------------------------------------------------------------- /src/googleclouddebugger/module_utils.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Provides utility functions for module path processing.""" 15 | 16 | import os 17 | import sys 18 | 19 | def NormalizePath(path): 20 | """Normalizes a path. 21 | 22 | E.g. One example is it will convert "/a/b/./c" -> "/a/b/c" 23 | """ 24 | # TODO: Calling os.path.normpath "may change the meaning of a 25 | # path that contains symbolic links" (e.g., "A/foo/../B" != "A/B" if foo is a 26 | # symlink). This might cause trouble when matching against loaded module 27 | # paths. We should try to avoid using it. 
28 | # Example: 29 | # > import symlink.a 30 | # > symlink.a.__file__ 31 | # symlink/a.py 32 | # > import target.a 33 | # > starget.a.__file__ 34 | # target/a.py 35 | # Python interpreter treats these as two separate modules. So, we also need to 36 | # handle them the same way. 37 | return os.path.normpath(path) 38 | 39 | 40 | def IsPathSuffix(mod_path, path): 41 | """Checks whether path is a full path suffix of mod_path. 42 | 43 | Args: 44 | mod_path: Must be an absolute path to a source file. Must not have 45 | file extension. 46 | path: A relative path. Must not have file extension. 47 | 48 | Returns: 49 | True if path is a full path suffix of mod_path. False otherwise. 50 | """ 51 | return (mod_path.endswith(path) and (len(mod_path) == len(path) or 52 | mod_path[:-len(path)].endswith(os.sep))) 53 | 54 | 55 | def GetLoadedModuleBySuffix(path): 56 | """Searches sys.modules to find a module with the given file path. 57 | 58 | Args: 59 | path: Path to the source file. It can be relative or absolute, as suffix 60 | match can handle both. If absolute, it must have already been 61 | sanitized. 62 | 63 | Algorithm: 64 | The given path must be a full suffix of a loaded module to be a valid match. 65 | File extensions are ignored when performing suffix match. 66 | 67 | Example: 68 | path: 'a/b/c.py' 69 | modules: {'a': 'a.py', 'a.b': 'a/b.py', 'a.b.c': 'a/b/c.pyc'] 70 | returns: module('a.b.c') 71 | 72 | Returns: 73 | The module that corresponds to path, or None if such module was not 74 | found. 75 | """ 76 | root = os.path.splitext(path)[0] 77 | for module in sys.modules.values(): 78 | mod_root = os.path.splitext(getattr(module, '__file__', None) or '')[0] 79 | 80 | if not mod_root: 81 | continue 82 | 83 | # While mod_root can contain symlinks, we cannot eliminate them. This is 84 | # because, we must perform exactly the same transformations on mod_root and 85 | # path, yet path can be relative to an unknown directory which prevents 86 | # identifying and eliminating symbolic links. 87 | # 88 | # Therefore, we only convert relative to absolute path. 89 | if not os.path.isabs(mod_root): 90 | mod_root = os.path.join(os.getcwd(), mod_root) 91 | 92 | # In the following invocation 'python3 ./main.py' (using the ./), the 93 | # mod_root variable will '/base/path/./main'. In order to correctly compare 94 | # it with the root variable, it needs to be '/base/path/main'. 95 | mod_root = NormalizePath(mod_root) 96 | 97 | if IsPathSuffix(mod_root, root): 98 | return module 99 | 100 | return None 101 | -------------------------------------------------------------------------------- /src/googleclouddebugger/module_search.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | """Inclusive search for module files.""" 15 | 16 | import os 17 | import sys 18 | 19 | 20 | def Search(path): 21 | """Search sys.path to find a source file that matches path. 22 | 23 | The provided input path may have an unknown number of irrelevant outer 24 | directories (e.g., /garbage1/garbage2/real1/real2/x.py'). This function 25 | does multiple search iterations until an actual Python module file that 26 | matches the input path is found. At each iteration, it strips one leading 27 | directory from the path and searches the directories at sys.path 28 | for a match. 29 | 30 | Examples: 31 | sys.path: ['/x1/x2', '/y1/y2'] 32 | Search order: [.pyo|.pyc|.py] 33 | /x1/x2/a/b/c 34 | /x1/x2/b/c 35 | /x1/x2/c 36 | /y1/y2/a/b/c 37 | /y1/y2/b/c 38 | /y1/y2/c 39 | Filesystem: ['/y1/y2/a/b/c.pyc'] 40 | 41 | 1) Search('a/b/c.py') 42 | Returns '/y1/y2/a/b/c.pyc' 43 | 2) Search('q/w/a/b/c.py') 44 | Returns '/y1/y2/a/b/c.pyc' 45 | 3) Search('q/w/c.py') 46 | Returns 'q/w/c.py' 47 | 48 | The provided input path may also be relative to an unknown directory. 49 | The path may include some or all outer package names. 50 | 51 | Examples (continued): 52 | 53 | 4) Search('c.py') 54 | Returns 'c.py' 55 | 5) Search('b/c.py') 56 | Returns 'b/c.py' 57 | 58 | Args: 59 | path: Path that describes a source file. Must contain .py file extension. 60 | Must not contain any leading os.sep character. 61 | 62 | Returns: 63 | Full path to the matched source file, if a match is found. Otherwise, 64 | returns the input path. 65 | 66 | Raises: 67 | AssertionError: if the provided path is an absolute path, or if it does not 68 | have a .py extension. 69 | """ 70 | 71 | def SearchCandidates(p): 72 | """Generates all candidates for the fuzzy search of p.""" 73 | while p: 74 | yield p 75 | (_, _, p) = p.partition(os.sep) 76 | 77 | # Verify that the os.sep is already stripped from the input. 78 | assert not path.startswith(os.sep) 79 | 80 | # Strip the file extension, it will not be needed. 81 | src_root, src_ext = os.path.splitext(path) 82 | assert src_ext == '.py' 83 | 84 | # Search longer suffixes first. Move to shorter suffixes only if longer 85 | # suffixes do not result in any matches. 86 | for src_part in SearchCandidates(src_root): 87 | # Search is done in sys.path order, which gives higher priority to earlier 88 | # entries in sys.path list. 89 | for sys_path in sys.path: 90 | f = os.path.join(sys_path, src_part) 91 | # The order in which we search the extensions does not matter. 92 | for ext in ('.pyo', '.pyc', '.py'): 93 | # The os.path.exists check internally follows symlinks and flattens 94 | # relative paths, so we don't have to deal with it. 95 | fext = f + ext 96 | if os.path.exists(fext): 97 | # Once we identify a matching file in the filesystem, we should 98 | # preserve the (1) potentially-symlinked and (2) 99 | # potentially-non-flattened file path (f+ext), because that's exactly 100 | # how we expect it to appear in sys.modules when we search the file 101 | # there. 102 | return fext 103 | 104 | # A matching file was not found in sys.path directories. 105 | return path 106 | -------------------------------------------------------------------------------- /src/googleclouddebugger/leaky_bucket.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | // Ensure that Python.h is included before any other header in Python debuglet. 18 | #include "common.h" 19 | 20 | #include "leaky_bucket.h" 21 | 22 | #include 23 | #include 24 | #include 25 | 26 | namespace devtools { 27 | namespace cdbg { 28 | 29 | static int64_t NowInNanoseconds() { 30 | timespec time; 31 | clock_gettime(CLOCK_MONOTONIC, &time); 32 | return 1000000000LL * time.tv_sec + time.tv_nsec; 33 | } 34 | 35 | LeakyBucket::LeakyBucket(int64_t capacity, int64_t fill_rate) 36 | : capacity_(capacity), 37 | fractional_tokens_(0.0), 38 | fill_rate_(fill_rate), 39 | fill_time_ns_(NowInNanoseconds()) { 40 | tokens_ = capacity; 41 | } 42 | 43 | bool LeakyBucket::RequestTokensSlow(int64_t requested_tokens) { 44 | // Getting the time outside the lock is significantly faster (reduces 45 | // contention, etc.). 46 | const int64_t current_time_ns = NowInNanoseconds(); 47 | 48 | std::lock_guard lock(mu_); 49 | 50 | const int64_t cur_tokens = AtomicLoadTokens(); 51 | if (cur_tokens >= 0) { 52 | return true; 53 | } 54 | 55 | const int64_t available_tokens = 56 | RefillBucket(requested_tokens + cur_tokens, current_time_ns); 57 | if (available_tokens >= 0) { 58 | return true; 59 | } 60 | 61 | // Since we were unable to satisfy the request, we need to restore the 62 | // requested tokens. 63 | AtomicIncrementTokens(requested_tokens); 64 | 65 | return false; 66 | } 67 | 68 | int64_t LeakyBucket::RefillBucket(int64_t available_tokens, 69 | int64_t current_time_ns) { 70 | if (current_time_ns <= fill_time_ns_) { 71 | // We check to see if the bucket has been refilled after we checked the 72 | // current time but before we grabbed mu_. If it has there's nothing to do. 73 | return AtomicLoadTokens(); 74 | } 75 | 76 | const int64_t elapsed_ns = current_time_ns - fill_time_ns_; 77 | fill_time_ns_ = current_time_ns; 78 | 79 | // Calculate the number of tokens we can add. Note elapsed is in ns while 80 | // fill_rate_ is in tokens per second, hence the scaling factor. 81 | // We can get a negative amount of tokens by calling TakeTokens. Make sure we 82 | // don't add more than the capacity of leaky bucket. 83 | fractional_tokens_ += 84 | std::min(elapsed_ns * (fill_rate_ / 1e9), static_cast(capacity_)); 85 | const int64_t ideal_tokens_to_add = fractional_tokens_; 86 | 87 | const int64_t max_tokens_to_add = capacity_ - available_tokens; 88 | int64_t real_tokens_to_add; 89 | if (max_tokens_to_add < ideal_tokens_to_add) { 90 | fractional_tokens_ = 0.0; 91 | real_tokens_to_add = max_tokens_to_add; 92 | } else { 93 | real_tokens_to_add = ideal_tokens_to_add; 94 | fractional_tokens_ -= real_tokens_to_add; 95 | } 96 | 97 | return AtomicIncrementTokens(real_tokens_to_add); 98 | } 99 | 100 | void LeakyBucket::TakeTokens(int64_t tokens) { 101 | const int64_t remaining = AtomicIncrementTokens(-tokens); 102 | 103 | if (remaining < 0) { 104 | // (Try to) refill the bucket. If we don't do this, we could just 105 | // keep decreasing forever without refilling. We need to be 106 | // refilling at least as frequently as every capacity_ / 107 | // fill_rate_ seconds. 
Otherwise, we waste tokens. 108 | const int64_t current_time_ns = NowInNanoseconds(); 109 | 110 | std::lock_guard lock(mu_); 111 | RefillBucket(remaining, current_time_ns); 112 | } 113 | } 114 | 115 | } // namespace cdbg 116 | } // namespace devtools 117 | -------------------------------------------------------------------------------- /src/googleclouddebugger/rate_limit.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | // Ensure that Python.h is included before any other header. 18 | #include "common.h" 19 | 20 | #include "rate_limit.h" 21 | 22 | #include 23 | 24 | ABSL_FLAG( 25 | int32, max_condition_lines_rate, 5000, 26 | "maximum number of Python lines/sec to spend on condition evaluation"); 27 | 28 | ABSL_FLAG( 29 | int32, max_dynamic_log_rate, 30 | 50, // maximum of 50 log entries per second on average 31 | "maximum rate of dynamic log entries in this process; short bursts are " 32 | "allowed to exceed this limit"); 33 | 34 | ABSL_FLAG(int32, max_dynamic_log_bytes_rate, 35 | 20480, // maximum of 20K bytes per second on average 36 | "maximum rate of dynamic log bytes in this process; short bursts are " 37 | "allowed to exceed this limit"); 38 | 39 | namespace devtools { 40 | namespace cdbg { 41 | 42 | // Define capacity of leaky bucket: 43 | // capacity = fill_rate * capacity_factor 44 | // 45 | // The capacity is conceptually unrelated to fill rate, but we don't want to 46 | // expose this knob to the developers. Defining it as a factor of a fill rate 47 | // is a convinient heuristics. 48 | // 49 | // Smaller factor values ensure that a burst of CPU consumption due to the 50 | // debugger wil not impact the service throughput. Longer values will allow the 51 | // burst, and will only disable the breakpoint if CPU consumption due to 52 | // debugger is continuous for a prolonged period of time. 
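// As a rough worked example using the default flag values defined above:
//   condition bucket:         capacity ~= 5000 * 0.1 = 500 tokens,
//                             refilled at 5000 tokens (lines) per second;
//   dynamic log entry bucket: capacity ~= 50 * 5 = 250, refilled at 50/sec;
//   dynamic log bytes bucket: capacity ~= 20480 * 2 = 40960, at 20480/sec.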
53 | static const double kConditionCostCapacityFactor = 0.1; 54 | static const double kDynamicLogCapacityFactor = 5; 55 | static const double kDynamicLogBytesCapacityFactor = 2; 56 | 57 | static std::unique_ptr g_global_condition_quota; 58 | static std::unique_ptr g_global_dynamic_log_quota; 59 | static std::unique_ptr g_global_dynamic_log_bytes_quota; 60 | 61 | static int64_t GetBaseConditionQuotaCapacity() { 62 | return absl::GetFlag(FLAGS_max_condition_lines_rate) * 63 | kConditionCostCapacityFactor; 64 | } 65 | 66 | void LazyInitializeRateLimit() { 67 | if (g_global_condition_quota == nullptr) { 68 | g_global_condition_quota.reset( 69 | new LeakyBucket(GetBaseConditionQuotaCapacity(), 70 | absl::GetFlag(FLAGS_max_condition_lines_rate))); 71 | 72 | g_global_dynamic_log_quota.reset(new LeakyBucket( 73 | absl::GetFlag(FLAGS_max_dynamic_log_rate) * kDynamicLogCapacityFactor, 74 | absl::GetFlag(FLAGS_max_dynamic_log_rate))); 75 | 76 | g_global_dynamic_log_bytes_quota.reset( 77 | new LeakyBucket(absl::GetFlag(FLAGS_max_dynamic_log_bytes_rate) * 78 | kDynamicLogBytesCapacityFactor, 79 | absl::GetFlag(FLAGS_max_dynamic_log_bytes_rate))); 80 | } 81 | } 82 | 83 | 84 | void CleanupRateLimit() { 85 | g_global_condition_quota = nullptr; 86 | g_global_dynamic_log_quota = nullptr; 87 | g_global_dynamic_log_bytes_quota = nullptr; 88 | } 89 | 90 | 91 | LeakyBucket* GetGlobalConditionQuota() { 92 | return g_global_condition_quota.get(); 93 | } 94 | 95 | LeakyBucket* GetGlobalDynamicLogQuota() { 96 | return g_global_dynamic_log_quota.get(); 97 | } 98 | 99 | LeakyBucket* GetGlobalDynamicLogBytesQuota() { 100 | return g_global_dynamic_log_bytes_quota.get(); 101 | } 102 | 103 | std::unique_ptr CreatePerBreakpointConditionQuota() { 104 | return std::unique_ptr( 105 | new LeakyBucket(GetBaseConditionQuotaCapacity() / 2, 106 | absl::GetFlag(FLAGS_max_condition_lines_rate) / 2)); 107 | } 108 | 109 | } // namespace cdbg 110 | } // namespace devtools 111 | -------------------------------------------------------------------------------- /src/googleclouddebugger/common.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_COMMON_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_COMMON_H_ 19 | 20 | // Open source includes and definition of common constants. 21 | // 22 | 23 | // Python.h must be included before any other header files. 
24 | // For details see: https://docs.python.org/2/c-api/intro.html 25 | #include "Python.h" 26 | #include "frameobject.h" 27 | #include "structmember.h" 28 | #include "opcode.h" 29 | 30 | #include 31 | #include 32 | #include 33 | #include 34 | 35 | #include "glog/logging.h" 36 | 37 | #define DISALLOW_COPY_AND_ASSIGN(TypeName) \ 38 | TypeName(const TypeName&) = delete; \ 39 | void operator=(const TypeName&) = delete 40 | 41 | template 42 | char (&ArraySizeHelper(const T (&array)[N]))[N]; 43 | 44 | #define arraysize(array) (sizeof(ArraySizeHelper(array))) 45 | 46 | typedef signed char int8; 47 | typedef short int16; 48 | typedef int int32; 49 | typedef long long int64; 50 | typedef unsigned char uint8; 51 | typedef unsigned short uint16; 52 | typedef unsigned int uint32; 53 | typedef unsigned long long uint64; 54 | 55 | using std::string; 56 | 57 | using google::LogSink; 58 | using google::LogSeverity; 59 | using google::AddLogSink; 60 | using google::RemoveLogSink; 61 | 62 | // The open source build uses gflags, which uses the traditional (v1) flags APIs 63 | // to define/declare/access command line flags. The internal build has upgraded 64 | // to use v2 flags API (DEFINE_FLAG/DECLARE_FLAG/GetFlag/SetFlag), which is not 65 | // supported by gflags yet (and absl is not released to open source yet). 66 | // Here, we use simple, dummy v2 flags wrappers around v1 flags implementation. 67 | // This allows us to use the same flags APIs both internally and externally. 68 | 69 | #define ABSL_FLAG(type, name, default_value, help) \ 70 | DEFINE_##type(name, default_value, help) 71 | 72 | #define ABSL_DECLARE_FLAG(type, name) DECLARE_##type(name) 73 | 74 | namespace absl { 75 | // Return the value of an old-style flag. Not thread-safe. 76 | inline bool GetFlag(bool flag) { return flag; } 77 | inline int32 GetFlag(int32 flag) { return flag; } 78 | inline int64 GetFlag(int64 flag) { return flag; } 79 | inline uint64 GetFlag(uint64 flag) { return flag; } 80 | inline double GetFlag(double flag) { return flag; } 81 | inline string GetFlag(const string& flag) { return flag; } 82 | 83 | // Change the value of an old-style flag. Not thread-safe. 84 | inline void SetFlag(bool* f, bool v) { *f = v; } 85 | inline void SetFlag(int32* f, int32 v) { *f = v; } 86 | inline void SetFlag(int64* f, int64 v) { *f = v; } 87 | inline void SetFlag(uint64* f, uint64 v) { *f = v; } 88 | inline void SetFlag(double* f, double v) { *f = v; } 89 | inline void SetFlag(string* f, const string& v) { *f = v; } 90 | } // namespace absl 91 | 92 | // Python 3 compatibility 93 | #if PY_MAJOR_VERSION >= 3 94 | // Python 2 has both an 'int' and a 'long' type, and Python 3 only as an 'int' 95 | // type which is the equivalent of Python 2's 'long'. 96 | // PyInt* functions will refer to 'int' in Python 2 and 3. 97 | #define PyInt_FromLong PyLong_FromLong 98 | #define PyInt_AsLong PyLong_AsLong 99 | #define PyInt_CheckExact PyLong_CheckExact 100 | 101 | // Python 3's 'bytes' type is the equivalent of Python 2's 'str' type, which are 102 | // byte arrays. Python 3's 'str' type represents a unicode string. 103 | // In this codebase: 104 | // PyString* functions will refer to 'str' in Python 2 and 3. 105 | // PyBytes* functions will refer to 'str' in Python 2 and 'bytes' in Python 3. 
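// For example, the PyString_AsString alias defined below resolves to
// PyUnicode_AsUTF8 on Python 3, which returns a UTF-8 encoded const char*
// whose buffer is cached and owned by the str object.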
106 | #define PyString_AsString PyUnicode_AsUTF8 107 | #endif 108 | 109 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_COMMON_H_ 110 | -------------------------------------------------------------------------------- /src/googleclouddebugger/conditional_breakpoint.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | // Ensure that Python.h is included before any other header. 18 | #include "common.h" 19 | 20 | #include "conditional_breakpoint.h" 21 | 22 | #include 23 | 24 | #include "immutability_tracer.h" 25 | #include "rate_limit.h" 26 | 27 | namespace devtools { 28 | namespace cdbg { 29 | 30 | ConditionalBreakpoint::ConditionalBreakpoint( 31 | ScopedPyCodeObject condition, 32 | ScopedPyObject callback) 33 | : condition_(condition), 34 | python_callback_(callback), 35 | per_breakpoint_condition_quota_(CreatePerBreakpointConditionQuota()) { 36 | } 37 | 38 | 39 | ConditionalBreakpoint::~ConditionalBreakpoint() { 40 | } 41 | 42 | 43 | 44 | void ConditionalBreakpoint::OnBreakpointHit() { 45 | PyFrameObject* frame = PyThreadState_Get()->frame; 46 | 47 | if (!EvaluateCondition(frame)) { 48 | return; 49 | } 50 | 51 | NotifyBreakpointEvent(BreakpointEvent::Hit, frame); 52 | } 53 | 54 | 55 | void ConditionalBreakpoint::OnBreakpointError() { 56 | NotifyBreakpointEvent(BreakpointEvent::Error, nullptr); 57 | } 58 | 59 | 60 | bool ConditionalBreakpoint::EvaluateCondition(PyFrameObject* frame) { 61 | if (condition_ == nullptr) { 62 | return true; 63 | } 64 | 65 | PyFrame_FastToLocals(frame); 66 | 67 | ScopedPyObject result; 68 | bool is_mutable_code_detected = false; 69 | int32_t line_count = 0; 70 | 71 | { 72 | ScopedImmutabilityTracer immutability_tracer; 73 | result.reset(PyEval_EvalCode( 74 | #if PY_MAJOR_VERSION >= 3 75 | reinterpret_cast(condition_.get()), 76 | #else 77 | condition_.get(), 78 | #endif 79 | frame->f_globals, 80 | frame->f_locals)); 81 | is_mutable_code_detected = immutability_tracer.IsMutableCodeDetected(); 82 | line_count = immutability_tracer.GetLineCount(); 83 | } 84 | 85 | // TODO: clear breakpoint if condition evaluation failed due to 86 | // mutable code or timeout. 87 | 88 | auto eval_exception = ClearPythonException(); 89 | 90 | if (is_mutable_code_detected) { 91 | NotifyBreakpointEvent( 92 | BreakpointEvent::ConditionExpressionMutable, 93 | nullptr); 94 | return false; 95 | } 96 | 97 | if (eval_exception.has_value()) { 98 | DLOG(INFO) << "Expression evaluation failed: " << eval_exception.value(); 99 | return false; 100 | } 101 | 102 | if (PyObject_IsTrue(result.get())) { 103 | return true; 104 | } 105 | 106 | ApplyConditionQuota(line_count); 107 | 108 | return false; 109 | } 110 | 111 | 112 | void ConditionalBreakpoint::ApplyConditionQuota(int time_ns) { 113 | // Apply global cost limit. 
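  // Note: despite the parameter name, the value charged here is the number of
  // Python lines executed while evaluating the condition (the line_count
  // passed in from EvaluateCondition), matching the lines/sec quota flags.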
114 | if (!GetGlobalConditionQuota()->RequestTokens(time_ns)) { 115 | LOG(INFO) << "Global condition quota exceeded"; 116 | NotifyBreakpointEvent( 117 | BreakpointEvent::GlobalConditionQuotaExceeded, 118 | nullptr); 119 | return; 120 | } 121 | 122 | // Apply per-breakpoint cost limit. 123 | if (!per_breakpoint_condition_quota_->RequestTokens(time_ns)) { 124 | LOG(INFO) << "Per breakpoint condition quota exceeded"; 125 | NotifyBreakpointEvent( 126 | BreakpointEvent::BreakpointConditionQuotaExceeded, 127 | nullptr); 128 | return; 129 | } 130 | } 131 | 132 | 133 | void ConditionalBreakpoint::NotifyBreakpointEvent( 134 | BreakpointEvent event, 135 | PyFrameObject* frame) { 136 | ScopedPyObject obj_event(PyInt_FromLong(static_cast(event))); 137 | PyObject* obj_frame = reinterpret_cast(frame) ?: Py_None; 138 | ScopedPyObject callback_args(PyTuple_Pack(2, obj_event.get(), obj_frame)); 139 | 140 | ScopedPyObject result( 141 | PyObject_Call(python_callback_.get(), callback_args.get(), nullptr)); 142 | ClearPythonException(); 143 | } 144 | 145 | 146 | } // namespace cdbg 147 | } // namespace devtools 148 | 149 | -------------------------------------------------------------------------------- /src/setup.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Python Cloud Debugger build and packaging script.""" 15 | 16 | from configparser import ConfigParser 17 | from glob import glob 18 | import os 19 | import re 20 | from distutils import sysconfig 21 | from setuptools import Extension 22 | from setuptools import setup 23 | 24 | 25 | def RemovePrefixes(optlist, bad_prefixes): 26 | for bad_prefix in bad_prefixes: 27 | for i, flag in enumerate(optlist): 28 | if flag.startswith(bad_prefix): 29 | optlist.pop(i) 30 | break 31 | return optlist 32 | 33 | 34 | def ReadConfig(section, value, default): 35 | try: 36 | config = ConfigParser() 37 | config.read('setup.cfg') 38 | return config.get(section, value) 39 | except: # pylint: disable=bare-except 40 | return default 41 | 42 | 43 | LONG_DESCRIPTION = ( 44 | 'The Cloud Debugger lets you inspect the state of an application at any\n' 45 | 'code location without stopping or slowing it down. 
The debugger makes it\n' 46 | 'easier to view the application state without adding logging statements.\n' 47 | '\n' 48 | 'For more details please see ' 49 | 'https://github.com/GoogleCloudPlatform/cloud-debug-python\n') 50 | 51 | lib_dirs = ReadConfig('build_ext', 'library_dirs', 52 | sysconfig.get_config_var('LIBDIR')).split(':') 53 | extra_compile_args = ReadConfig('cc_options', 'extra_compile_args', '').split() 54 | extra_link_args = ReadConfig('cc_options', 'extra_link_args', '').split() 55 | 56 | static_libs = [] 57 | deps = ['libgflags.a', 'libglog.a'] 58 | for dep in deps: 59 | for lib_dir in lib_dirs: 60 | path = os.path.join(lib_dir, dep) 61 | if os.path.isfile(path): 62 | static_libs.append(path) 63 | assert len(static_libs) == len(deps), (static_libs, deps, lib_dirs) 64 | 65 | cvars = sysconfig.get_config_vars() 66 | cvars['OPT'] = str.join( 67 | ' ', 68 | RemovePrefixes( 69 | cvars.get('OPT').split(), ['-g', '-O', '-Wstrict-prototypes'])) 70 | 71 | # Determine the current version of the package. The easiest way would be to 72 | # import "googleclouddebugger" and read its __version__ attribute. 73 | # Unfortunately we can't do that because "googleclouddebugger" depends on 74 | # "cdbg_native" that hasn't been built yet. 75 | version = None 76 | with open('googleclouddebugger/version.py', 'r') as version_file: 77 | version_pattern = re.compile(r"^\s*__version__\s*=\s*'([0-9.]*)'") 78 | for line in version_file: 79 | match = version_pattern.match(line) 80 | if match: 81 | version = match.groups()[0] 82 | assert version 83 | 84 | cdbg_native_module = Extension( 85 | 'googleclouddebugger.cdbg_native', 86 | sources=glob('googleclouddebugger/*.cc'), 87 | extra_compile_args=[ 88 | '-std=c++0x', 89 | '-Werror', 90 | '-g0', 91 | '-O3', 92 | ] + extra_compile_args, 93 | extra_link_args=static_libs + extra_link_args, 94 | libraries=['rt']) 95 | 96 | setup( 97 | name='google-python-cloud-debugger', 98 | description='Python Cloud Debugger', 99 | long_description=LONG_DESCRIPTION, 100 | url='https://github.com/GoogleCloudPlatform/cloud-debug-python', 101 | author='Google Inc.', 102 | version=version, 103 | install_requires=[ 104 | 'firebase-admin>=5.3.0', 105 | 'pyyaml', 106 | ], 107 | packages=['googleclouddebugger'], 108 | ext_modules=[cdbg_native_module], 109 | license='Apache License, Version 2.0', 110 | keywords='google cloud debugger', 111 | classifiers=[ 112 | 'Programming Language :: Python :: 3.6', 113 | 'Programming Language :: Python :: 3.7', 114 | 'Programming Language :: Python :: 3.8', 115 | 'Programming Language :: Python :: 3.9', 116 | 'Programming Language :: Python :: 3.10', 117 | 'Development Status :: 3 - Alpha', 118 | 'Intended Audience :: Developers', 119 | ]) 120 | -------------------------------------------------------------------------------- /tests/py/uniquifier_computer_test.py: -------------------------------------------------------------------------------- 1 | """Unit test for uniquifier_computer module.""" 2 | 3 | import os 4 | import sys 5 | import tempfile 6 | 7 | from absl.testing import absltest 8 | 9 | from googleclouddebugger import uniquifier_computer 10 | 11 | 12 | class UniquifierComputerTest(absltest.TestCase): 13 | 14 | def _Compute(self, files): 15 | """Creates a directory structure and computes uniquifier on it. 16 | 17 | Args: 18 | files: dictionary of relative path to file content. 19 | 20 | Returns: 21 | Uniquifier data lines. 
22 | """ 23 | 24 | class Hash(object): 25 | """Fake implementation of hash to collect raw data.""" 26 | 27 | def __init__(self): 28 | self.data = b'' 29 | 30 | def update(self, s): 31 | self.data += s 32 | 33 | root = tempfile.mkdtemp('', 'fake_app_') 34 | for relative_path, content in files.items(): 35 | path = os.path.join(root, relative_path) 36 | directory = os.path.split(path)[0] 37 | if not os.path.exists(directory): 38 | os.makedirs(directory) 39 | with open(path, 'w') as f: 40 | f.write(content) 41 | 42 | sys.path.insert(0, root) 43 | try: 44 | hash_obj = Hash() 45 | uniquifier_computer.ComputeApplicationUniquifier(hash_obj) 46 | return [ 47 | u.decode() for u in ( 48 | hash_obj.data.rstrip(b'\n').split(b'\n') if hash_obj.data else []) 49 | ] 50 | finally: 51 | del sys.path[0] 52 | 53 | def testEmpty(self): 54 | self.assertListEqual([], self._Compute({})) 55 | 56 | def testBundle(self): 57 | self.assertListEqual([ 58 | 'first.py:1', 'in1/__init__.py:6', 'in1/a.py:3', 'in1/b.py:4', 59 | 'in1/in2/__init__.py:7', 'in1/in2/c.py:5', 'second.py:2' 60 | ], 61 | self._Compute({ 62 | 'db.app': 'abc', 63 | 'first.py': 'a', 64 | 'second.py': 'bb', 65 | 'in1/a.py': 'ccc', 66 | 'in1/b.py': 'dddd', 67 | 'in1/in2/c.py': 'eeeee', 68 | 'in1/__init__.py': 'ffffff', 69 | 'in1/in2/__init__.py': 'ggggggg' 70 | })) 71 | 72 | def testEmptyFile(self): 73 | self.assertListEqual(['empty.py:0'], self._Compute({'empty.py': ''})) 74 | 75 | def testNonPythonFilesIgnored(self): 76 | self.assertListEqual(['real.py:1'], 77 | self._Compute({ 78 | 'file.p': '', 79 | 'file.pya': '', 80 | 'real.py': '1' 81 | })) 82 | 83 | def testNonPackageDirectoriesIgnored(self): 84 | self.assertListEqual(['dir2/__init__.py:1'], 85 | self._Compute({ 86 | 'dir1/file.py': '', 87 | 'dir2/__init__.py': 'a', 88 | 'dir2/image.gif': '' 89 | })) 90 | 91 | def testDepthLimit(self): 92 | self.assertListEqual([ 93 | ''.join(str(n) + '/' 94 | for n in range(1, m + 1)) + '__init__.py:%d' % m 95 | for m in range(9, 0, -1) 96 | ], 97 | self._Compute({ 98 | '1/__init__.py': '1', 99 | '1/2/__init__.py': '2' * 2, 100 | '1/2/3/__init__.py': '3' * 3, 101 | '1/2/3/4/__init__.py': '4' * 4, 102 | '1/2/3/4/5/__init__.py': '5' * 5, 103 | '1/2/3/4/5/6/__init__.py': '6' * 6, 104 | '1/2/3/4/5/6/7/__init__.py': '7' * 7, 105 | '1/2/3/4/5/6/7/8/__init__.py': '8' * 8, 106 | '1/2/3/4/5/6/7/8/9/__init__.py': '9' * 9, 107 | '1/2/3/4/5/6/7/8/9/10/__init__.py': 'a' * 10, 108 | '1/2/3/4/5/6/7/8/9/10/11/__init__.py': 'b' * 11 109 | })) 110 | 111 | def testPrecedence(self): 112 | self.assertListEqual(['my.py:3'], 113 | self._Compute({ 114 | 'my.pyo': 'a', 115 | 'my.pyc': 'aa', 116 | 'my.py': 'aaa' 117 | })) 118 | 119 | 120 | if __name__ == '__main__': 121 | absltest.main() 122 | -------------------------------------------------------------------------------- /src/googleclouddebugger/yaml_data_visibility_config_reader.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Reads a YAML configuration file to determine visibility policy. 15 | 16 | Example Usage: 17 | try: 18 | config = yaml_data_visibility_config_reader.OpenAndRead(filename) 19 | except yaml_data_visibility_config_reader.Error as e: 20 | ... 21 | 22 | visibility_policy = GlobDataVisibilityPolicy( 23 | config.blacklist_patterns, 24 | config.whitelist_patterns) 25 | """ 26 | 27 | import os 28 | import sys 29 | import yaml 30 | 31 | 32 | class Error(Exception): 33 | """Generic error class that other errors in this module inherit from.""" 34 | pass 35 | 36 | 37 | class YAMLLoadError(Error): 38 | """Thrown when reading an opened file fails.""" 39 | pass 40 | 41 | 42 | class ParseError(Error): 43 | """Thrown when there is a problem with the YAML structure.""" 44 | pass 45 | 46 | 47 | class UnknownConfigKeyError(Error): 48 | """Thrown when the YAML contains an unsupported keyword.""" 49 | pass 50 | 51 | 52 | class NotAListError(Error): 53 | """Thrown when a YAML key does not reference a list.""" 54 | pass 55 | 56 | 57 | class ElementNotAStringError(Error): 58 | """Thrown when a YAML list element is not a string.""" 59 | pass 60 | 61 | 62 | class Config(object): 63 | """Configuration object that Read() returns to the caller.""" 64 | 65 | def __init__(self, blacklist_patterns, whitelist_patterns): 66 | self.blacklist_patterns = blacklist_patterns 67 | self.whitelist_patterns = whitelist_patterns 68 | 69 | 70 | def OpenAndRead(relative_path='debugger-blacklist.yaml'): 71 | """Attempts to find the yaml configuration file, then read it. 72 | 73 | Args: 74 | relative_path: Optional relative path override. 75 | 76 | Returns: 77 | A Config object if the open and read were successful, None if the file 78 | does not exist (which is not considered an error). 79 | 80 | Raises: 81 | Error (some subclass): As thrown by the called Read() function. 82 | """ 83 | 84 | # Note: This logic follows the convention established by source-context.json 85 | try: 86 | with open(os.path.join(sys.path[0], relative_path), 'r') as f: 87 | return Read(f) 88 | except IOError: 89 | return None 90 | 91 | 92 | def Read(f): 93 | """Reads and returns Config data from a yaml file. 94 | 95 | Args: 96 | f: Yaml file to parse. 97 | 98 | Returns: 99 | Config object as defined in this file. 100 | 101 | Raises: 102 | Error (some subclass): If there is a problem loading or parsing the file. 103 | """ 104 | try: 105 | yaml_data = yaml.safe_load(f) 106 | except yaml.YAMLError as e: 107 | raise ParseError('%s' % e) 108 | except IOError as e: 109 | raise YAMLLoadError('%s' % e) 110 | 111 | _CheckData(yaml_data) 112 | 113 | try: 114 | return Config( 115 | yaml_data.get('blacklist', ()), yaml_data.get('whitelist', ('*'))) 116 | except UnicodeDecodeError as e: 117 | raise YAMLLoadError('%s' % e) 118 | 119 | 120 | def _CheckData(yaml_data): 121 | """Checks data for illegal keys and formatting.""" 122 | legal_keys = set(('blacklist', 'whitelist')) 123 | unknown_keys = set(yaml_data) - legal_keys 124 | if unknown_keys: 125 | raise UnknownConfigKeyError('Unknown keys in configuration: %s' % 126 | unknown_keys) 127 | 128 | for key, data in yaml_data.items(): 129 | _AssertDataIsList(key, data) 130 | 131 | 132 | def _AssertDataIsList(key, lst): 133 | """Assert that lst contains list data and is not structured.""" 134 | 135 | # list and tuple are supported. 
Not supported are direct strings 136 | # and dictionary; these indicate too much or two little structure. 137 | if not isinstance(lst, list) and not isinstance(lst, tuple): 138 | raise NotAListError('%s must be a list' % key) 139 | 140 | # each list entry must be a string 141 | for element in lst: 142 | if not isinstance(element, str): 143 | raise ElementNotAStringError('Unsupported list element %s found in %s', 144 | (element, lst)) 145 | -------------------------------------------------------------------------------- /tests/py/module_search_test.py: -------------------------------------------------------------------------------- 1 | """Unit test for module_search module.""" 2 | 3 | import os 4 | import sys 5 | import tempfile 6 | 7 | from absl.testing import absltest 8 | 9 | from googleclouddebugger import module_search 10 | 11 | 12 | # TODO: Add tests for whitespace in location path including in, 13 | # extension, basename, path 14 | class SearchModulesTest(absltest.TestCase): 15 | 16 | def setUp(self): 17 | self._test_package_dir = tempfile.mkdtemp('', 'package_') 18 | sys.path.append(self._test_package_dir) 19 | 20 | def tearDown(self): 21 | sys.path.remove(self._test_package_dir) 22 | 23 | def testSearchValidSourcePath(self): 24 | # These modules are on the sys.path. 25 | self.assertEndsWith( 26 | module_search.Search('googleclouddebugger/module_search.py'), 27 | '/site-packages/googleclouddebugger/module_search.py') 28 | 29 | # inspect and dis are libraries with no real file. So, we 30 | # can no longer match them by file path. 31 | 32 | def testSearchInvalidSourcePath(self): 33 | # This is an invalid module that doesn't exist anywhere. 34 | self.assertEqual(module_search.Search('aaaaa.py'), 'aaaaa.py') 35 | 36 | # This module exists, but the search input is missing the outer package 37 | # name. 38 | self.assertEqual(module_search.Search('absltest.py'), 'absltest.py') 39 | 40 | def testSearchInvalidExtension(self): 41 | # Test that the module rejects invalid extension in the input. 42 | with self.assertRaises(AssertionError): 43 | module_search.Search('module_search.x') 44 | 45 | def testSearchPathStartsWithSep(self): 46 | # Test that module rejects invalid leading os.sep char in the input. 47 | with self.assertRaises(AssertionError): 48 | module_search.Search('/module_search') 49 | 50 | def testSearchRelativeSysPath(self): 51 | # An entry in sys.path is in relative form, and represents the same 52 | # directory as as another absolute entry in sys.path. 53 | for directory in ['', 'a', 'a/b']: 54 | self._CreateFile(os.path.join(directory, '__init__.py')) 55 | self._CreateFile('a/b/first.py') 56 | 57 | try: 58 | # Inject a relative path into sys.path that refers to a directory already 59 | # in sys.path. It should produce the same result as the non-relative form. 60 | testdir_alias = os.path.join(self._test_package_dir, 'a/../a') 61 | 62 | # Add 'a/../a' to sys.path so that 'b/first.py' is reachable. 63 | sys.path.insert(0, testdir_alias) 64 | 65 | # Returned result should have a successful file match and relative 66 | # paths should be kept as-is. 67 | result = module_search.Search('b/first.py') 68 | self.assertEndsWith(result, 'a/../a/b/first.py') 69 | 70 | finally: 71 | sys.path.remove(testdir_alias) 72 | 73 | def testSearchSymLinkInSysPath(self): 74 | # An entry in sys.path is a symlink. 
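    # The setup below creates a/b/first.py and a symlink 'link' -> 'a', puts
    # the symlinked directory on sys.path, and expects the returned match to
    # keep the 'link/' prefix instead of resolving it to the real path.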
75 | for directory in ['', 'a', 'a/b']: 76 | self._CreateFile(os.path.join(directory, '__init__.py'), '') 77 | self._CreateFile('a/b/first.py') 78 | self._CreateSymLink('a', 'link') 79 | 80 | try: 81 | # Add 'link/' to sys.path so that 'b/first.py' is reachable. 82 | sys.path.append(os.path.join(self._test_package_dir, 'link')) 83 | 84 | # Returned result should have a successful file match and symbolic 85 | # links should be kept. 86 | self.assertEndsWith(module_search.Search('b/first.py'), 'link/b/first.py') 87 | finally: 88 | sys.path.remove(os.path.join(self._test_package_dir, 'link')) 89 | 90 | def _CreateFile(self, path, contents='assert False "Unexpected import"\n'): 91 | full_path = os.path.join(self._test_package_dir, path) 92 | directory, unused_name = os.path.split(full_path) 93 | 94 | if not os.path.isdir(directory): 95 | os.makedirs(directory) 96 | 97 | with open(full_path, 'w') as writer: 98 | writer.write(contents) 99 | 100 | return path 101 | 102 | def _CreateSymLink(self, source, link_name): 103 | full_source_path = os.path.join(self._test_package_dir, source) 104 | full_link_path = os.path.join(self._test_package_dir, link_name) 105 | os.symlink(full_source_path, full_link_path) 106 | 107 | # Since we cannot use os.path.samefile or os.path.realpath to eliminate 108 | # symlinks reliably, we only check suffix equivalence of file paths in these 109 | # unit tests. 110 | def _AssertEndsWith(self, match, path): 111 | """Asserts exactly one match ending with path.""" 112 | self.assertLen(match, 1) 113 | self.assertEndsWith(match[0], path) 114 | 115 | def _AssertEqFile(self, match, path): 116 | """Asserts exactly one match equals to the file created with _CreateFile.""" 117 | self.assertLen(match, 1) 118 | self.assertEqual(match[0], os.path.join(self._test_package_dir, path)) 119 | 120 | 121 | if __name__ == '__main__': 122 | absltest.main() 123 | -------------------------------------------------------------------------------- /src/googleclouddebugger/leaky_bucket.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_COMMON_LEAKY_BUCKET_H_ 18 | #define DEVTOOLS_CDBG_COMMON_LEAKY_BUCKET_H_ 19 | 20 | #include 21 | #include 22 | #include // NOLINT 23 | 24 | #include "common.h" 25 | 26 | namespace devtools { 27 | namespace cdbg { 28 | 29 | // Implements a bucket that fills tokens at a constant rate up to a maximum 30 | // capacity. This class is thread-safe. 31 | // 32 | class LeakyBucket { 33 | public: 34 | // "capacity": The max number of tokens the bucket can hold at any point. 35 | // "fill_rate": The rate which the bucket fills in tokens per second. 36 | LeakyBucket(int64_t capacity, int64_t fill_rate); 37 | 38 | ~LeakyBucket() {} 39 | 40 | // Requests tokens from the bucket. 
If the bucket does not contain enough 41 | // tokens, returns false, and no tokens are issued. Requesting more 42 | // tokens than the "capacity_" will always fail, and CHECKs in debug mode. 43 | // 44 | // The LeakyBucket has at most "capacity_" tokens. You can use this to control 45 | // your bursts, subject to some limitations. An example of the control that 46 | // the capacity provides: imagine that you have no traffic, and therefore no 47 | // tokens are being acquired. Suddenly, infinite demand arrives. 48 | // At most "capacity_" tokens will be granted immediately. Subsequent 49 | // requests will only be admitted based on the fill rate. 50 | inline bool RequestTokens(int64_t requested_tokens); 51 | 52 | // Takes tokens from bucket, possibly sending the number of tokens in the 53 | // bucket negative. 54 | void TakeTokens(int64_t tokens); 55 | 56 | private: 57 | // The slow path of RequestTokens. Grabs a lock and may refill tokens_ 58 | // using the fill rate and time passed since last fill. 59 | bool RequestTokensSlow(int64_t requested_tokens); 60 | 61 | // Refills the bucket with newly added tokens since last update and returns 62 | // the current amount of tokens in the bucket. 'available_tokens' indicates 63 | // the number of tokens in the bucket before refilling. 'current_time_ns' 64 | // indicates the current time in nanoseconds. 65 | int64_t RefillBucket(int64_t available_tokens, int64_t current_time_ns); 66 | 67 | // Atomically increment "tokens_". 68 | inline int64_t AtomicIncrementTokens(int64_t increment) { 69 | return tokens_.fetch_add(increment, std::memory_order_relaxed) + increment; 70 | } 71 | 72 | // Atomically load the value of "tokens_". 73 | inline int64_t AtomicLoadTokens() const { 74 | return tokens_.load(std::memory_order_relaxed); 75 | } 76 | 77 | private: 78 | // Protects fill_time_ns_ and fractional_tokens_. 79 | std::mutex mu_; 80 | 81 | // Current number of tokens in the bucket. Tokens is guarded by "mu_" 82 | // only if we're planning to increment it. This is to prevent "tokens_" 83 | // from ever exceeding "capacity_". See RequestTokens in the leaky_bucket.cc 84 | // file. 85 | // 86 | // Tokens can be momentarily negative, either via TakeTokens or 87 | // during a normal RequestTokens that was not satisfied. 88 | std::atomic tokens_; 89 | 90 | // Capacity of the bucket. 91 | const int64_t capacity_; 92 | 93 | // Although the main token count is an integer we also track fractional tokens 94 | // for increased precision. 95 | double fractional_tokens_; 96 | 97 | // Fill rate in tokens per second. 98 | const int64_t fill_rate_; 99 | 100 | // Time in nanoseconds of the last refill. 101 | int64_t fill_time_ns_; 102 | 103 | DISALLOW_COPY_AND_ASSIGN(LeakyBucket); 104 | }; 105 | 106 | // Inline fast-path. 107 | inline bool LeakyBucket::RequestTokens(int64_t requested_tokens) { 108 | if (requested_tokens > capacity_) { 109 | return false; 110 | } 111 | 112 | // Try and grab some tokens. remaining is how many tokens are 113 | // left after subtracting out requested tokens. 114 | int64_t remaining = AtomicIncrementTokens(-requested_tokens); 115 | if (remaining >= 0) { 116 | // We had at least as much as we needed. 
117 | return true; 118 | } 119 | 120 | return RequestTokensSlow(requested_tokens); 121 | } 122 | 123 | } // namespace cdbg 124 | } // namespace devtools 125 | 126 | #endif // DEVTOOLS_CDBG_COMMON_LEAKY_BUCKET_H_ 127 | -------------------------------------------------------------------------------- /src/googleclouddebugger/uniquifier_computer.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Computes a unique identifier of the deployed application. 15 | 16 | When the application runs under AppEngine, the deployment is uniquely 17 | identified by a minor version string. However when the application runs on 18 | in an unmanaged environment (such as Google Computer Engine virtual machine), 19 | we don't know the version of the application. 20 | 21 | We could ignore it, but in the absence of source context, two agents could be 22 | running different versions of the application, but still get bundled as the 23 | same debuggee. This would result in inconsistent behavior when setting 24 | breakpoints. 25 | """ 26 | 27 | import os 28 | import sys 29 | 30 | # Maximum recursion depth to follow when traversing the file system. This limit 31 | # will prevent stack overflow in case of a loop created by symbolic links. 32 | _MAX_DEPTH = 10 33 | 34 | 35 | def ComputeApplicationUniquifier(hash_obj): 36 | """Computes hash of application files. 37 | 38 | Application files can be anywhere on the disk. The application is free to 39 | import a Python module from an arbitrary path ok the disk. It is also 40 | impossible to distinguish application files from third party libraries. 41 | Third party libraries are typically installed with "pip" and there is not a 42 | good way to guarantee that all instances of the application are going to have 43 | exactly the same version of each package. There is also a huge amount of files 44 | in all sys.path directories and it will take too much time to traverse them 45 | all. We therefore make an assumption that application files are only located 46 | in sys.path[0]. 47 | 48 | When traversing files in sys.path, we can expect both .py and .pyc files. For 49 | source deployment, we will find both .py and .pyc files. In this case we will 50 | only index .py files and ignored .pyc file. In case of binary deployment, only 51 | .pyc file will be there. 52 | 53 | The naive way to hash files would be to read the file content and compute some 54 | sort of a hash (e.g. SHA1). This can be expensive as well, so instead we just 55 | hash file name and file size. It is a good enough heuristics to identify 56 | modified files across different deployments. 57 | 58 | Args: 59 | hash_obj: hash aggregator to update with application uniquifier. 60 | """ 61 | 62 | def ProcessDirectory(path, relative_path, depth=1): 63 | """Recursively computes application uniquifier for a particular directory. 
64 | 65 | Args: 66 | path: absolute path of the directory to start. 67 | relative_path: path relative to sys.path[0] 68 | depth: current recursion depth. 69 | """ 70 | 71 | if depth > _MAX_DEPTH: 72 | return 73 | 74 | try: 75 | names = os.listdir(path) 76 | except BaseException: 77 | return 78 | 79 | # Sort file names to ensure consistent hash regardless of order returned 80 | # by os.listdir. This will also put .py files before .pyc and .pyo files. 81 | modules = set() 82 | for name in sorted(names): 83 | current_path = os.path.join(path, name) 84 | if not os.path.isdir(current_path): 85 | file_name, ext = os.path.splitext(name) 86 | if ext not in ('.py', '.pyc', '.pyo'): 87 | continue # This is not an application file. 88 | if file_name in modules: 89 | continue # This is a .pyc file and we already indexed .py file. 90 | 91 | modules.add(file_name) 92 | ProcessApplicationFile(current_path, os.path.join(relative_path, name)) 93 | elif IsPackage(current_path): 94 | ProcessDirectory(current_path, os.path.join(relative_path, name), 95 | depth + 1) 96 | 97 | def IsPackage(path): 98 | """Checks if the specified directory is a valid Python package.""" 99 | init_base_path = os.path.join(path, '__init__.py') 100 | return (os.path.isfile(init_base_path) or 101 | os.path.isfile(init_base_path + 'c') or 102 | os.path.isfile(init_base_path + 'o')) 103 | 104 | def ProcessApplicationFile(path, relative_path): 105 | """Updates the hash with the specified application file.""" 106 | hash_obj.update(relative_path.encode()) 107 | hash_obj.update(':'.encode()) 108 | try: 109 | hash_obj.update(str(os.stat(path).st_size).encode()) 110 | except BaseException: 111 | pass 112 | hash_obj.update('\n'.encode()) 113 | 114 | ProcessDirectory(sys.path[0], '') 115 | -------------------------------------------------------------------------------- /src/googleclouddebugger/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Main module for Python Cloud Debugger. 15 | 16 | The debugger is enabled in a very similar way to enabling pdb. 17 | 18 | The debugger becomes the main module. It eats up its arguments until it gets 19 | to argument '--' that serves as a separator between debugger arguments and 20 | the application command line. It then attaches the debugger and runs the 21 | actual app. 22 | """ 23 | 24 | import logging 25 | import os 26 | import sys 27 | 28 | from . import appengine_pretty_printers 29 | from . import breakpoints_manager 30 | from . import collector 31 | from . import error_data_visibility_policy 32 | from . import firebase_client 33 | from . import glob_data_visibility_policy 34 | from . import yaml_data_visibility_config_reader 35 | from . import cdbg_native 36 | from . 
import version 37 | 38 | __version__ = version.__version__ 39 | 40 | _flags = None 41 | _backend_client = None 42 | _breakpoints_manager = None 43 | 44 | 45 | def _StartDebugger(): 46 | """Configures and starts the debugger.""" 47 | global _backend_client 48 | global _breakpoints_manager 49 | 50 | cdbg_native.InitializeModule(_flags) 51 | cdbg_native.LogInfo( 52 | f'Initializing Cloud Debugger Python agent version: {__version__}') 53 | 54 | _backend_client = firebase_client.FirebaseClient() 55 | _backend_client.SetupAuth( 56 | _flags.get('project_id'), _flags.get('service_account_json_file'), 57 | _flags.get('firebase_db_url')) 58 | 59 | visibility_policy = _GetVisibilityPolicy() 60 | 61 | _breakpoints_manager = breakpoints_manager.BreakpointsManager( 62 | _backend_client, visibility_policy) 63 | 64 | # Set up loggers for logpoints. 65 | collector.SetLogger(logging.getLogger()) 66 | 67 | collector.CaptureCollector.pretty_printers.append( 68 | appengine_pretty_printers.PrettyPrinter) 69 | 70 | _backend_client.on_active_breakpoints_changed = ( 71 | _breakpoints_manager.SetActiveBreakpoints) 72 | _backend_client.on_idle = _breakpoints_manager.CheckBreakpointsExpiration 73 | 74 | _backend_client.InitializeDebuggeeLabels(_flags) 75 | _backend_client.Start() 76 | 77 | 78 | def _GetVisibilityPolicy(): 79 | """If a debugger configuration is found, create a visibility policy.""" 80 | try: 81 | visibility_config = yaml_data_visibility_config_reader.OpenAndRead() 82 | except yaml_data_visibility_config_reader.Error as err: 83 | return error_data_visibility_policy.ErrorDataVisibilityPolicy( 84 | f'Could not process debugger config: {err}') 85 | 86 | if visibility_config: 87 | return glob_data_visibility_policy.GlobDataVisibilityPolicy( 88 | visibility_config.blacklist_patterns, 89 | visibility_config.whitelist_patterns) 90 | 91 | return None 92 | 93 | 94 | def _DebuggerMain(): 95 | """Starts the debugger and runs the application with debugger attached.""" 96 | global _flags 97 | 98 | # The first argument is cdbg module, which we don't care. 99 | del sys.argv[0] 100 | 101 | # Parse debugger flags until we encounter '--'. 102 | _flags = {} 103 | while sys.argv[0]: 104 | arg = sys.argv[0] 105 | del sys.argv[0] 106 | 107 | if arg == '--': 108 | break 109 | 110 | (name, value) = arg.strip('-').split('=', 2) 111 | _flags[name] = value 112 | 113 | _StartDebugger() 114 | 115 | # Run the app. The following code was mostly copied from pdb.py. 116 | app_path = sys.argv[0] 117 | 118 | sys.path[0] = os.path.dirname(app_path) 119 | 120 | import __main__ # pylint: disable=import-outside-toplevel 121 | __main__.__dict__.clear() 122 | __main__.__dict__.update({ 123 | '__name__': '__main__', 124 | '__file__': app_path, 125 | '__builtins__': __builtins__ 126 | }) 127 | locals = globals = __main__.__dict__ # pylint: disable=redefined-builtin 128 | 129 | sys.modules['__main__'] = __main__ 130 | 131 | with open(app_path, encoding='utf-8') as f: 132 | code = compile(f.read(), app_path, 'exec') 133 | exec(code, globals, locals) # pylint: disable=exec-used 134 | 135 | 136 | # pylint: disable=invalid-name 137 | def enable(**kwargs): 138 | """Starts the debugger for already running application. 139 | 140 | This function should only be called once. 141 | 142 | Args: 143 | **kwargs: debugger configuration flags. 144 | 145 | Raises: 146 | RuntimeError: if called more than once. 147 | ValueError: if flags is not a valid dictionary. 
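  Example (illustrative only; the flag names below are the ones consumed by
  _StartDebugger, and the values are placeholders):

    googleclouddebugger.enable(
        project_id='my-project',
        firebase_db_url='https://my-project-cdbg.firebaseio.com',
        service_account_json_file='/path/to/service_account.json')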
148 | """ 149 | global _flags 150 | 151 | if _flags is not None: 152 | raise RuntimeError('Debugger already attached') 153 | 154 | _flags = kwargs 155 | _StartDebugger() 156 | 157 | 158 | # AttachDebugger is an alias for enable, preserved for compatibility. 159 | AttachDebugger = enable 160 | -------------------------------------------------------------------------------- /src/googleclouddebugger/breakpoints_manager.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Manages lifetime of individual breakpoint objects.""" 15 | 16 | from datetime import datetime 17 | from threading import RLock 18 | 19 | from . import python_breakpoint 20 | 21 | 22 | class BreakpointsManager(object): 23 | """Manages active breakpoints. 24 | 25 | The primary input to this class is the callback indicating that a list of 26 | active breakpoints has changed. BreakpointsManager compares it with the 27 | current list of breakpoints. It then creates PythonBreakpoint objects 28 | corresponding to new breakpoints and removes breakpoints that are no 29 | longer active. 30 | 31 | This class is thread safe. 32 | 33 | Args: 34 | hub_client: queries active breakpoints from the backend and sends 35 | breakpoint updates back to the backend. 36 | data_visibility_policy: An object used to determine the visibiliy 37 | of a captured variable. May be None if no policy is available. 38 | """ 39 | 40 | def __init__(self, hub_client, data_visibility_policy): 41 | self._hub_client = hub_client 42 | self.data_visibility_policy = data_visibility_policy 43 | 44 | # Lock to synchronize access to data across multiple threads. 45 | self._lock = RLock() 46 | 47 | # After the breakpoint completes, it is removed from list of active 48 | # breakpoints. However it takes time until the backend is notified. During 49 | # this time, the backend will still report the just completed breakpoint 50 | # as active. We don't want to set the breakpoint again, so we keep a set 51 | # of completed breakpoint IDs. 52 | self._completed = set() 53 | 54 | # Map of active breakpoints. The key is breakpoint ID. 55 | self._active = {} 56 | 57 | # Closest expiration of all active breakpoints or past time if not known. 58 | self._next_expiration = datetime.max 59 | 60 | def SetActiveBreakpoints(self, breakpoints_data): 61 | """Adds new breakpoints and removes missing ones. 62 | 63 | Args: 64 | breakpoints_data: updated list of active breakpoints. 65 | """ 66 | with self._lock: 67 | ids = set([x['id'] for x in breakpoints_data]) 68 | 69 | # Clear breakpoints that no longer show up in active breakpoints list. 70 | for breakpoint_id in self._active.keys() - ids: 71 | self._active.pop(breakpoint_id).Clear() 72 | 73 | # Create new breakpoints. 
74 | self._active.update([ 75 | (x['id'], 76 | python_breakpoint.PythonBreakpoint(x, self._hub_client, self, 77 | self.data_visibility_policy)) 78 | for x in breakpoints_data 79 | if x['id'] in ids - self._active.keys() - self._completed 80 | ]) 81 | 82 | # Remove entries from completed_breakpoints_ that weren't listed in 83 | # breakpoints_data vector. These are confirmed to have been removed by the 84 | # hub and the debuglet can now assume that they will never show up ever 85 | # again. The backend never reuses breakpoint IDs. 86 | self._completed &= ids 87 | 88 | if self._active: 89 | self._next_expiration = datetime.min # Not known. 90 | else: 91 | self._next_expiration = datetime.max # Nothing to expire. 92 | 93 | def CompleteBreakpoint(self, breakpoint_id): 94 | """Marks the specified breaking as completed. 95 | 96 | Appends the ID to set of completed breakpoints and clears it. 97 | 98 | Args: 99 | breakpoint_id: breakpoint ID to complete. 100 | """ 101 | with self._lock: 102 | self._completed.add(breakpoint_id) 103 | if breakpoint_id in self._active: 104 | self._active.pop(breakpoint_id).Clear() 105 | 106 | def CheckBreakpointsExpiration(self): 107 | """Completes all breakpoints that have been active for too long.""" 108 | with self._lock: 109 | current_time = BreakpointsManager.GetCurrentTime() 110 | if self._next_expiration > current_time: 111 | return 112 | 113 | expired_breakpoints = [] 114 | self._next_expiration = datetime.max 115 | for breakpoint in self._active.values(): 116 | expiration_time = breakpoint.GetExpirationTime() 117 | if expiration_time <= current_time: 118 | expired_breakpoints.append(breakpoint) 119 | else: 120 | self._next_expiration = min(self._next_expiration, expiration_time) 121 | 122 | for breakpoint in expired_breakpoints: 123 | breakpoint.ExpireBreakpoint() 124 | 125 | @staticmethod 126 | def GetCurrentTime(): 127 | """Wrapper around datetime.now() function. 128 | 129 | The datetime class is a built-in one and therefore not patchable by unit 130 | tests. We wrap datetime.now() in a static method to work around it. 131 | 132 | Returns: 133 | Current time 134 | """ 135 | return datetime.utcnow() 136 | -------------------------------------------------------------------------------- /src/googleclouddebugger/bytecode_manipulator.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_BYTECODE_MANIPULATOR_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_BYTECODE_MANIPULATOR_H_ 19 | 20 | #include 21 | #include 22 | 23 | #include "common.h" 24 | 25 | namespace devtools { 26 | namespace cdbg { 27 | 28 | // Inserts breakpoint method calls into bytecode of Python method. 29 | // 30 | // By default new instructions are inserted into the bytecode. When this 31 | // happens, all other branch instructions need to be adjusted. 
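// (For instance, a jump whose target lies beyond the patched location must
// have its target offset shifted by the number of inserted bytes.)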
32 | // For example consider this Python code:
33 | //   def test():
34 | //     return 'hello'
35 | // Its bytecode without any breakpoints is:
36 | //   0 LOAD_CONST 1 ('hello')
37 | //   3 RETURN_VALUE
38 | // The transformed bytecode with a breakpoint set on the "return 'hello'" line is:
39 | //   0 LOAD_CONST 2 (cdbg_native._Callback)
40 | //   3 CALL_FUNCTION 0
41 | //   6 POP_TOP
42 | //   7 LOAD_CONST 1 ('hello')
43 | //   10 RETURN_VALUE
44 | //
45 | // Special care is given to generator methods. These are methods that use
46 | // the yield statement, which translates to YIELD_VALUE. The built-in generator class
47 | // keeps the Python frame around in between the calls. The frame stores
48 | // the offset of the instruction to return to in "f_lasti". This offset has to
49 | // stay valid, even if the breakpoint is set or cleared in between calls to the
50 | // generator function. To achieve this, the breakpoint code is appended to the
51 | // end of the method instead of being inserted in place.
52 | // For example consider this Python code:
53 | //   def test():
54 | //     yield 'hello'
55 | // Its bytecode without any breakpoints is:
56 | //   0 LOAD_CONST 1 ('hello')
57 | //   3 YIELD_VALUE
58 | //   4 POP_TOP
59 | //   5 LOAD_CONST 0 (None)
60 | //   8 RETURN_VALUE
61 | // When setting a breakpoint on the "yield" line, the bytecode is transformed:
62 | //   0 JUMP_ABSOLUTE 9
63 | //   3 YIELD_VALUE
64 | //   4 POP_TOP
65 | //   5 LOAD_CONST 0 (None)
66 | //   8 RETURN_VALUE
67 | //   9 LOAD_CONST 2 (cdbg_native._Callback)
68 | //   12 CALL_FUNCTION 0
69 | //   15 POP_TOP
70 | //   16 LOAD_CONST 1 ('hello')
71 | //   19 JUMP_ABSOLUTE 3
72 | class BytecodeManipulator {
73 |  public:
74 |   BytecodeManipulator(std::vector<uint8_t> bytecode, const bool has_linedata,
75 |                       std::vector<uint8_t> linedata);
76 | 
77 |   // Gets the transformed method bytecode.
78 |   const std::vector<uint8_t>& bytecode() const { return data_.bytecode; }
79 | 
80 |   // Returns true if this class was initialized with a line numbers table.
81 |   bool has_linedata() const { return has_linedata_; }
82 | 
83 |   // Gets the method line numbers table or an empty vector if not available.
84 |   const std::vector<uint8_t>& linedata() const { return data_.linedata; }
85 | 
86 |   // Rewrites the method bytecode to invoke the callable at the specified offset.
87 |   // Returns false if the method call could not be inserted; in that case the
88 |   // bytecode is not affected.
89 |   bool InjectMethodCall(int offset, int callable_const_index);
90 | 
91 |  private:
92 |   // Algorithm to insert the breakpoint callback into the method bytecode.
93 |   enum Strategy {
94 |     // Fail any attempts to set a breakpoint in this method.
95 |     STRATEGY_FAIL,
96 | 
97 |     // Inserts the method call instruction right into the method bytecode. This
98 |     // strategy works for all possible locations, but can't be used in
99 |     // generators (i.e. methods that use "yield").
100 |     STRATEGY_INSERT,
101 | 
102 |     // Appends the method call instruction at the end of the method bytecode. This
103 |     // strategy works for generators (i.e. methods that use "yield"). The downside
104 |     // is that breakpoints can't be set at all locations.
105 |     STRATEGY_APPEND
106 |   };
107 | 
108 |   struct Data {
109 |     // Bytecode of a transformed method.
110 |     std::vector<uint8_t> bytecode;
111 | 
112 |     // Method line numbers table or empty vector if "has_linedata_" is false.
113 |     std::vector<uint8_t> linedata;
114 |   };
115 | 
116 |   // Inserts space into the bytecode. This space is later used to add new
117 |   // instructions.
118 | bool InsertSpace(Data* data, int offset, int size) const; 119 | 120 | // Injects a method call using STRATEGY_INSERT on a temporary copy of "Data" 121 | // that can be dropped in case of a failure. 122 | bool InsertMethodCall(Data* data, int offset, int const_index) const; 123 | 124 | // Injects a method call using STRATEGY_APPEND on a temporary copy of "Data" 125 | // that can be dropped in case of a failure. 126 | bool AppendMethodCall(Data* data, int offset, int const_index) const; 127 | 128 | private: 129 | // Method bytecode and line number table. 130 | Data data_; 131 | 132 | // True if the method has line number table. 133 | const bool has_linedata_; 134 | 135 | // Algorithm to insert breakpoint callback into method bytecode. 136 | Strategy strategy_; 137 | 138 | DISALLOW_COPY_AND_ASSIGN(BytecodeManipulator); 139 | }; 140 | 141 | } // namespace cdbg 142 | } // namespace devtools 143 | 144 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_BYTECODE_MANIPULATOR_H_ 145 | -------------------------------------------------------------------------------- /tests/py/python_test_util.py: -------------------------------------------------------------------------------- 1 | """Set of helper methods for Python debuglet unit and component tests.""" 2 | 3 | import inspect 4 | import re 5 | 6 | 7 | def GetModuleInfo(obj): 8 | """Gets the source file path and breakpoint tags for a module. 9 | 10 | Breakpoint tag is a named label of a source line. The tag is marked 11 | with "# BPTAG: XXX" comment. 12 | 13 | Args: 14 | obj: any object inside the queried module. 15 | 16 | Returns: 17 | (path, tags) tuple where tags is a dictionary mapping tag name to 18 | line numbers where this tag appears. 19 | """ 20 | return (inspect.getsourcefile(obj), GetSourceFileTags(obj)) 21 | 22 | 23 | def GetSourceFileTags(source): 24 | """Gets breakpoint tags for the specified source file. 25 | 26 | Breakpoint tag is a named label of a source line. The tag is marked 27 | with "# BPTAG: XXX" comment. 28 | 29 | Args: 30 | source: either path to the .py file to analyze or any code related 31 | object (e.g. module, function, code object). 32 | 33 | Returns: 34 | Dictionary mapping tag name to line numbers where this tag appears. 35 | """ 36 | if isinstance(source, str): 37 | lines = open(source, 'r').read().splitlines() 38 | start_line = 1 # line number is 1 based 39 | else: 40 | lines, start_line = inspect.getsourcelines(source) 41 | if not start_line: # "getsourcelines" returns start_line of 0 for modules. 42 | start_line = 1 43 | 44 | tags = {} 45 | regex = re.compile(r'# BPTAG: ([0-9a-zA-Z_]+)\s*$') 46 | for n, line in enumerate(lines): 47 | m = regex.search(line) 48 | if m: 49 | tag = m.group(1) 50 | if tag in tags: 51 | tags[tag].append(n + start_line) 52 | else: 53 | tags[tag] = [n + start_line] 54 | 55 | return tags 56 | 57 | 58 | def ResolveTag(obj, tag): 59 | """Resolves the breakpoint tag into source file path and a line number. 60 | 61 | Breakpoint tag is a named label of a source line. The tag is marked 62 | with "# BPTAG: XXX" comment. 63 | 64 | Raises 65 | 66 | Args: 67 | obj: any object inside the queried module. 68 | tag: tag name to resolve. 69 | 70 | Raises: 71 | Exception: if no line in the source file define the specified tag or if 72 | more than one line define the tag. 73 | 74 | Returns: 75 | (path, line) tuple, where line is the line number where the tag appears. 
76 | """ 77 | path, tags = GetModuleInfo(obj) 78 | if tag not in tags: 79 | raise Exception('tag %s not found' % tag) 80 | lines = tags[tag] 81 | if len(lines) != 1: 82 | raise Exception('tag %s is ambiguous (lines: %s)' % (tag, lines)) 83 | return path, lines[0] 84 | 85 | 86 | def DateTimeToTimestamp(t): 87 | """Converts the specified time to Timestamp format. 88 | 89 | Args: 90 | t: datetime instance 91 | 92 | Returns: 93 | Time in Timestamp format 94 | """ 95 | return t.strftime('%Y-%m-%dT%H:%M:%S.%f') + 'Z' 96 | 97 | 98 | def DateTimeToTimestampNew(t): 99 | """Converts the specified time to Timestamp format in seconds granularity. 100 | 101 | Args: 102 | t: datetime instance 103 | 104 | Returns: 105 | Time in Timestamp format in seconds granularity 106 | """ 107 | return t.strftime('%Y-%m-%dT%H:%M:%S') + 'Z' 108 | 109 | def DateTimeToUnixMsec(t): 110 | """Returns the Unix time as in integer value in milliseconds""" 111 | return int(t.timestamp() * 1000) 112 | 113 | 114 | def PackFrameVariable(breakpoint, name, frame=0, collection='locals'): 115 | """Finds local variable or argument by name. 116 | 117 | Indirections created through varTableIndex are recursively collapsed. Fails 118 | the test case if the named variable is not found. 119 | 120 | Args: 121 | breakpoint: queried breakpoint. 122 | name: name of the local variable or argument. 123 | frame: stack frame index to examine. 124 | collection: 'locals' to get local variable or 'arguments' for an argument. 125 | 126 | Returns: 127 | Single dictionary of variable data. 128 | 129 | Raises: 130 | AssertionError: if the named variable not found. 131 | """ 132 | for variable in breakpoint['stackFrames'][frame][collection]: 133 | if variable['name'] == name: 134 | return _Pack(variable, breakpoint) 135 | 136 | raise AssertionError('Variable %s not found in frame %d collection %s' % 137 | (name, frame, collection)) 138 | 139 | 140 | def PackWatchedExpression(breakpoint, expression): 141 | """Finds watched expression by index. 142 | 143 | Indirections created through varTableIndex are recursively collapsed. Fails 144 | the test case if the named variable is not found. 145 | 146 | Args: 147 | breakpoint: queried breakpoint. 148 | expression: index of the watched expression. 149 | 150 | Returns: 151 | Single dictionary of variable data. 152 | """ 153 | return _Pack(breakpoint['evaluatedExpressions'][expression], breakpoint) 154 | 155 | 156 | def _Pack(variable, breakpoint): 157 | """Recursively collapses indirections created through varTableIndex. 158 | 159 | Circular references by objects are not supported. If variable subtree 160 | has circular references, this function will hang. 161 | 162 | Variable members are sorted by name. This helps asserting the content of 163 | variable since Python has no guarantees over the order of keys of a 164 | dictionary. 165 | 166 | Args: 167 | variable: variable object to pack. Not modified. 168 | breakpoint: queried breakpoint. 169 | 170 | Returns: 171 | A new dictionary with packed variable object. 
172 | """ 173 | packed = dict(variable) 174 | 175 | while 'varTableIndex' in packed: 176 | ref = breakpoint['variableTable'][packed['varTableIndex']] 177 | assert 'name' not in ref 178 | assert 'value' not in packed 179 | assert 'members' not in packed 180 | assert 'status' not in ref and 'status' not in packed 181 | del packed['varTableIndex'] 182 | packed.update(ref) 183 | 184 | if 'members' in packed: 185 | packed['members'] = sorted( 186 | [_Pack(m, breakpoint) for m in packed['members']], 187 | key=lambda m: m.get('name', '')) 188 | 189 | return packed 190 | -------------------------------------------------------------------------------- /src/googleclouddebugger/immutability_tracer.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_IMMUTABILITY_TRACER_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_IMMUTABILITY_TRACER_H_ 19 | 20 | #include 21 | #include 22 | 23 | #include "common.h" 24 | #include "python_util.h" 25 | 26 | namespace devtools { 27 | namespace cdbg { 28 | 29 | // Uses Python line tracer to track evaluation of Python expression. As the 30 | // evaluation progresses, verifies that no opcodes with side effect are 31 | // executed. 32 | // 33 | // Execution of code with side effects will be blocked and exception will 34 | // be thrown. 35 | // 36 | // This class is not thread safe. All the functions assume Interpreter Lock 37 | // held by the current thread. 38 | // 39 | // This class resets tracer ("PyEval_SetTrace") in destructor. It does not 40 | // restore the previous one (because such Python does not provide such API). 41 | // It is up to the caller to reset the tracer. 42 | class ImmutabilityTracer { 43 | public: 44 | ImmutabilityTracer(); 45 | 46 | ~ImmutabilityTracer(); 47 | 48 | // Starts immutability tracer on the current thread. 49 | void Start(PyObject* self); 50 | 51 | // Stops immutability tracer on the current thread. 52 | void Stop(); 53 | 54 | // Returns true if the expression wasn't completely executed because of 55 | // a mutable code. 56 | bool IsMutableCodeDetected() const { return mutable_code_detected_; } 57 | 58 | // Gets the number of lines executed while the tracer was enabled. Native 59 | // functions calls are counted as a single line. 60 | int32_t GetLineCount() const { return line_count_; } 61 | 62 | private: 63 | // Python tracer callback function. 64 | static int OnTraceCallback( 65 | PyObject* obj, 66 | PyFrameObject* frame, 67 | int what, 68 | PyObject* arg) { 69 | auto* instance = py_object_cast(obj); 70 | return instance->OnTraceCallbackInternal(frame, what, arg); 71 | } 72 | 73 | // Python tracer callback function (instance function for convenience). 
74 | int OnTraceCallbackInternal(PyFrameObject* frame, int what, PyObject* arg); 75 | 76 | // Verifies that the code object doesn't include calls to blocked primitives. 77 | void VerifyCodeObject(ScopedPyCodeObject code_object); 78 | 79 | // Verifies immutability of code on a single line. 80 | void ProcessCodeLine(PyCodeObject* code_object, int line_number); 81 | 82 | // Verifies immutability of block of opcodes. 83 | void ProcessCodeRange(const uint8_t* code_start, const uint8_t* opcodes, 84 | int size); 85 | 86 | // Verifies that the called C function is whitelisted. 87 | void ProcessCCall(PyObject* function); 88 | 89 | // Sets an exception indicating that the code is mutable. 90 | void SetMutableCodeException(); 91 | 92 | public: 93 | // Definition of Python type object. 94 | static PyTypeObject python_type_; 95 | 96 | private: 97 | // Weak reference to Python object wrapping this class. 98 | PyObject* self_; 99 | 100 | // Evaluation thread. 101 | PyThreadState* thread_state_; 102 | 103 | // Set of code object verified to not have any blocked primitives. 104 | std::unordered_set< 105 | ScopedPyCodeObject, 106 | ScopedPyCodeObject::Hash> verified_code_objects_; 107 | 108 | // Original value of PyThreadState::tracing. We revert it to 0 to enforce 109 | // trace callback on this thread, even if the whole thing was executed from 110 | // within another trace callback (that caught the breakpoint). 111 | int32_t original_thread_state_tracing_; 112 | 113 | // Counts the number of lines executed while the tracer was enabled. Native 114 | // functions calls are counted as a single line. 115 | int32_t line_count_; 116 | 117 | // Set to true after immutable statement is detected. When it happens we 118 | // want to stop execution of the entire construct entirely. 119 | bool mutable_code_detected_; 120 | 121 | DISALLOW_COPY_AND_ASSIGN(ImmutabilityTracer); 122 | }; 123 | 124 | // Creates and initializes instance of "ImmutabilityTracer" in constructor and 125 | // stops the tracer in destructor. 126 | // 127 | // This class assumes Interpreter Lock held by the current thread throughout 128 | // its lifetime. 129 | class ScopedImmutabilityTracer { 130 | public: 131 | ScopedImmutabilityTracer() 132 | : tracer_(NewNativePythonObject()) { 133 | Instance()->Start(tracer_.get()); 134 | } 135 | 136 | ~ScopedImmutabilityTracer() { 137 | Instance()->Stop(); 138 | } 139 | 140 | // Returns true if the expression wasn't completely executed because of 141 | // a mutable code. 142 | bool IsMutableCodeDetected() const { 143 | return Instance()->IsMutableCodeDetected(); 144 | } 145 | 146 | // Gets the number of lines executed while the tracer was enabled. Native 147 | // functions calls are counted as a single line. 
148 | int32_t GetLineCount() const { return Instance()->GetLineCount(); } 149 | 150 | private: 151 | ImmutabilityTracer* Instance() { 152 | return py_object_cast(tracer_.get()); 153 | } 154 | 155 | const ImmutabilityTracer* Instance() const { 156 | return py_object_cast(tracer_.get()); 157 | } 158 | 159 | private: 160 | const ScopedPyObject tracer_; 161 | 162 | DISALLOW_COPY_AND_ASSIGN(ScopedImmutabilityTracer); 163 | }; 164 | 165 | } // namespace cdbg 166 | } // namespace devtools 167 | 168 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_IMMUTABILITY_TRACER_H_ 169 | -------------------------------------------------------------------------------- /src/third_party/pylinetable.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2001-2023 Python Software Foundation; All Rights Reserved 3 | * 4 | * You may obtain a copy of the PSF License at 5 | * 6 | * https://docs.python.org/3/license.html 7 | */ 8 | 9 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYLINETABLE_H_ 10 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYLINETABLE_H_ 11 | 12 | /* Python Linetable helper methods. 13 | * They are not part of the cpython api. 14 | * This code has been extracted from: 15 | * https://github.com/python/cpython/blob/main/Objects/codeobject.c 16 | * 17 | * See https://peps.python.org/pep-0626/#out-of-process-debuggers-and-profilers 18 | * for more information about this code and its usage. 19 | */ 20 | 21 | #if PY_VERSION_HEX >= 0x030B0000 22 | // Things are different in 3.11 than 3.10. 23 | // See https://github.com/python/cpython/blob/main/Objects/locations.md 24 | 25 | typedef enum _PyCodeLocationInfoKind { 26 | /* short forms are 0 to 9 */ 27 | PY_CODE_LOCATION_INFO_SHORT0 = 0, 28 | /* one lineforms are 10 to 12 */ 29 | PY_CODE_LOCATION_INFO_ONE_LINE0 = 10, 30 | PY_CODE_LOCATION_INFO_ONE_LINE1 = 11, 31 | PY_CODE_LOCATION_INFO_ONE_LINE2 = 12, 32 | 33 | PY_CODE_LOCATION_INFO_NO_COLUMNS = 13, 34 | PY_CODE_LOCATION_INFO_LONG = 14, 35 | PY_CODE_LOCATION_INFO_NONE = 15 36 | } _PyCodeLocationInfoKind; 37 | 38 | /** Out of process API for initializing the location table. */ 39 | extern void _PyLineTable_InitAddressRange( 40 | const char *linetable, 41 | Py_ssize_t length, 42 | int firstlineno, 43 | PyCodeAddressRange *range); 44 | 45 | /** API for traversing the line number table. 
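 * Returns 1 and advances "range" to the next address range, or returns 0
 * once the whole table has been consumed (see the implementation below).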
*/ 46 | extern int _PyLineTable_NextAddressRange(PyCodeAddressRange *range); 47 | 48 | 49 | void _PyLineTable_InitAddressRange(const char *linetable, Py_ssize_t length, int firstlineno, PyCodeAddressRange *range) { 50 | range->opaque.lo_next = linetable; 51 | range->opaque.limit = range->opaque.lo_next + length; 52 | range->ar_start = -1; 53 | range->ar_end = 0; 54 | range->opaque.computed_line = firstlineno; 55 | range->ar_line = -1; 56 | } 57 | 58 | static int 59 | scan_varint(const uint8_t *ptr) 60 | { 61 | unsigned int read = *ptr++; 62 | unsigned int val = read & 63; 63 | unsigned int shift = 0; 64 | while (read & 64) { 65 | read = *ptr++; 66 | shift += 6; 67 | val |= (read & 63) << shift; 68 | } 69 | return val; 70 | } 71 | 72 | static int 73 | scan_signed_varint(const uint8_t *ptr) 74 | { 75 | unsigned int uval = scan_varint(ptr); 76 | if (uval & 1) { 77 | return -(int)(uval >> 1); 78 | } 79 | else { 80 | return uval >> 1; 81 | } 82 | } 83 | 84 | static int 85 | get_line_delta(const uint8_t *ptr) 86 | { 87 | int code = ((*ptr) >> 3) & 15; 88 | switch (code) { 89 | case PY_CODE_LOCATION_INFO_NONE: 90 | return 0; 91 | case PY_CODE_LOCATION_INFO_NO_COLUMNS: 92 | case PY_CODE_LOCATION_INFO_LONG: 93 | return scan_signed_varint(ptr+1); 94 | case PY_CODE_LOCATION_INFO_ONE_LINE0: 95 | return 0; 96 | case PY_CODE_LOCATION_INFO_ONE_LINE1: 97 | return 1; 98 | case PY_CODE_LOCATION_INFO_ONE_LINE2: 99 | return 2; 100 | default: 101 | /* Same line */ 102 | return 0; 103 | } 104 | } 105 | 106 | static int 107 | is_no_line_marker(uint8_t b) 108 | { 109 | return (b >> 3) == 0x1f; 110 | } 111 | 112 | 113 | #define ASSERT_VALID_BOUNDS(bounds) \ 114 | assert(bounds->opaque.lo_next <= bounds->opaque.limit && \ 115 | (bounds->ar_line == -1 || bounds->ar_line == bounds->opaque.computed_line) && \ 116 | (bounds->opaque.lo_next == bounds->opaque.limit || \ 117 | (*bounds->opaque.lo_next) & 128)) 118 | 119 | static int 120 | next_code_delta(PyCodeAddressRange *bounds) 121 | { 122 | assert((*bounds->opaque.lo_next) & 128); 123 | return (((*bounds->opaque.lo_next) & 7) + 1) * sizeof(_Py_CODEUNIT); 124 | } 125 | 126 | static void 127 | advance(PyCodeAddressRange *bounds) 128 | { 129 | ASSERT_VALID_BOUNDS(bounds); 130 | bounds->opaque.computed_line += get_line_delta(reinterpret_cast(bounds->opaque.lo_next)); 131 | if (is_no_line_marker(*bounds->opaque.lo_next)) { 132 | bounds->ar_line = -1; 133 | } 134 | else { 135 | bounds->ar_line = bounds->opaque.computed_line; 136 | } 137 | bounds->ar_start = bounds->ar_end; 138 | bounds->ar_end += next_code_delta(bounds); 139 | do { 140 | bounds->opaque.lo_next++; 141 | } while (bounds->opaque.lo_next < bounds->opaque.limit && 142 | ((*bounds->opaque.lo_next) & 128) == 0); 143 | ASSERT_VALID_BOUNDS(bounds); 144 | } 145 | 146 | static inline int 147 | at_end(PyCodeAddressRange *bounds) { 148 | return bounds->opaque.lo_next >= bounds->opaque.limit; 149 | } 150 | 151 | int 152 | _PyLineTable_NextAddressRange(PyCodeAddressRange *range) 153 | { 154 | if (at_end(range)) { 155 | return 0; 156 | } 157 | advance(range); 158 | assert(range->ar_end > range->ar_start); 159 | return 1; 160 | } 161 | #elif PY_VERSION_HEX >= 0x030A0000 162 | void 163 | _PyLineTable_InitAddressRange(const char *linetable, Py_ssize_t length, int firstlineno, PyCodeAddressRange *range) 164 | { 165 | range->opaque.lo_next = linetable; 166 | range->opaque.limit = range->opaque.lo_next + length; 167 | range->ar_start = -1; 168 | range->ar_end = 0; 169 | range->opaque.computed_line = firstlineno; 170 | 
range->ar_line = -1; 171 | } 172 | 173 | static void 174 | advance(PyCodeAddressRange *bounds) 175 | { 176 | bounds->ar_start = bounds->ar_end; 177 | int delta = ((unsigned char *)bounds->opaque.lo_next)[0]; 178 | bounds->ar_end += delta; 179 | int ldelta = ((signed char *)bounds->opaque.lo_next)[1]; 180 | bounds->opaque.lo_next += 2; 181 | if (ldelta == -128) { 182 | bounds->ar_line = -1; 183 | } 184 | else { 185 | bounds->opaque.computed_line += ldelta; 186 | bounds->ar_line = bounds->opaque.computed_line; 187 | } 188 | } 189 | 190 | static inline int 191 | at_end(PyCodeAddressRange *bounds) { 192 | return bounds->opaque.lo_next >= bounds->opaque.limit; 193 | } 194 | 195 | int 196 | _PyLineTable_NextAddressRange(PyCodeAddressRange *range) 197 | { 198 | if (at_end(range)) { 199 | return 0; 200 | } 201 | advance(range); 202 | while (range->ar_start == range->ar_end) { 203 | assert(!at_end(range)); 204 | advance(range); 205 | } 206 | return 1; 207 | } 208 | #endif 209 | 210 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYLINETABLE_H_ 211 | -------------------------------------------------------------------------------- /tests/py/module_utils_test.py: -------------------------------------------------------------------------------- 1 | """Tests for googleclouddebugger.module_utils.""" 2 | 3 | import os 4 | import sys 5 | import tempfile 6 | 7 | from absl.testing import absltest 8 | 9 | from googleclouddebugger import module_utils 10 | 11 | 12 | class TestModule(object): 13 | """Dummy class with __name__ and __file__ attributes.""" 14 | 15 | def __init__(self, name, path): 16 | self.__name__ = name 17 | self.__file__ = path 18 | 19 | 20 | def _AddSysModule(name, path): 21 | sys.modules[name] = TestModule(name, path) 22 | 23 | 24 | class ModuleUtilsTest(absltest.TestCase): 25 | 26 | def setUp(self): 27 | self._test_package_dir = tempfile.mkdtemp('', 'package_') 28 | self.modules = sys.modules.copy() 29 | 30 | def tearDown(self): 31 | sys.modules = self.modules 32 | self.modules = None 33 | 34 | def _CreateFile(self, path): 35 | full_path = os.path.join(self._test_package_dir, path) 36 | directory, unused_name = os.path.split(full_path) 37 | 38 | if not os.path.isdir(directory): 39 | os.makedirs(directory) 40 | 41 | with open(full_path, 'w') as writer: 42 | writer.write('') 43 | 44 | return full_path 45 | 46 | def _CreateSymLink(self, source, link_name): 47 | full_source_path = os.path.join(self._test_package_dir, source) 48 | full_link_path = os.path.join(self._test_package_dir, link_name) 49 | os.symlink(full_source_path, full_link_path) 50 | return full_link_path 51 | 52 | def _AssertEndsWith(self, a, b, msg=None): 53 | """Assert that string a ends with string b.""" 54 | if not a.endswith(b): 55 | standard_msg = '%s does not end with %s' % (a, b) 56 | self.fail(self._formatMessage(msg, standard_msg)) 57 | 58 | def testSimpleLoadedModuleFromSuffix(self): 59 | # Lookup simple module. 60 | _AddSysModule('m1', '/a/b/p1/m1.pyc') 61 | for suffix in [ 62 | 'm1.py', 'm1.pyc', 'm1.pyo', 'p1/m1.py', 'b/p1/m1.py', 'a/b/p1/m1.py', 63 | '/a/b/p1/m1.py' 64 | ]: 65 | m1 = module_utils.GetLoadedModuleBySuffix(suffix) 66 | self.assertTrue(m1, 'Module not found') 67 | self.assertEqual('/a/b/p1/m1.pyc', m1.__file__) 68 | 69 | # Lookup simple package, no ext. 
70 | _AddSysModule('p1', '/a/b/p1/__init__.pyc') 71 | for suffix in [ 72 | 'p1/__init__.py', 'b/p1/__init__.py', 'a/b/p1/__init__.py', 73 | '/a/b/p1/__init__.py' 74 | ]: 75 | p1 = module_utils.GetLoadedModuleBySuffix(suffix) 76 | self.assertTrue(p1, 'Package not found') 77 | self.assertEqual('/a/b/p1/__init__.pyc', p1.__file__) 78 | 79 | # Lookup via bad suffix. 80 | for suffix in [ 81 | 'm2.py', 'p2/m1.py', 'b2/p1/m1.py', 'a2/b/p1/m1.py', '/a2/b/p1/m1.py' 82 | ]: 83 | m1 = module_utils.GetLoadedModuleBySuffix(suffix) 84 | self.assertFalse(m1, 'Module found unexpectedly') 85 | 86 | def testComplexLoadedModuleFromSuffix(self): 87 | # Lookup complex module. 88 | _AddSysModule('b.p1.m1', '/a/b/p1/m1.pyc') 89 | for suffix in [ 90 | 'm1.py', 'p1/m1.py', 'b/p1/m1.py', 'a/b/p1/m1.py', '/a/b/p1/m1.py' 91 | ]: 92 | m1 = module_utils.GetLoadedModuleBySuffix(suffix) 93 | self.assertTrue(m1, 'Module not found') 94 | self.assertEqual('/a/b/p1/m1.pyc', m1.__file__) 95 | 96 | # Lookup complex package, no ext. 97 | _AddSysModule('a.b.p1', '/a/b/p1/__init__.pyc') 98 | for suffix in [ 99 | 'p1/__init__.py', 'b/p1/__init__.py', 'a/b/p1/__init__.py', 100 | '/a/b/p1/__init__.py' 101 | ]: 102 | p1 = module_utils.GetLoadedModuleBySuffix(suffix) 103 | self.assertTrue(p1, 'Package not found') 104 | self.assertEqual('/a/b/p1/__init__.pyc', p1.__file__) 105 | 106 | def testSimilarLoadedModuleFromSuffix(self): 107 | # Lookup similar module, no ext. 108 | _AddSysModule('m1', '/a/b/p2/m1.pyc') 109 | _AddSysModule('p1.m1', '/a/b1/p1/m1.pyc') 110 | _AddSysModule('b.p1.m1', '/a1/b/p1/m1.pyc') 111 | _AddSysModule('a.b.p1.m1', '/a/b/p1/m1.pyc') 112 | 113 | m1 = module_utils.GetLoadedModuleBySuffix('/a/b/p1/m1.py') 114 | self.assertTrue(m1, 'Module not found') 115 | self.assertEqual('/a/b/p1/m1.pyc', m1.__file__) 116 | 117 | # Lookup similar package, no ext. 118 | _AddSysModule('p1', '/a1/b1/p1/__init__.pyc') 119 | _AddSysModule('b.p1', '/a1/b/p1/__init__.pyc') 120 | _AddSysModule('a.b.p1', '/a/b/p1/__init__.pyc') 121 | p1 = module_utils.GetLoadedModuleBySuffix('/a/b/p1/__init__.py') 122 | self.assertTrue(p1, 'Package not found') 123 | self.assertEqual('/a/b/p1/__init__.pyc', p1.__file__) 124 | 125 | def testDuplicateLoadedModuleFromSuffix(self): 126 | # Lookup name dup module and package. 127 | _AddSysModule('m1', '/m1/__init__.pyc') 128 | _AddSysModule('m1.m1', '/m1/m1.pyc') 129 | _AddSysModule('m1.m1.m1', '/m1/m1/m1/__init__.pyc') 130 | _AddSysModule('m1.m1.m1.m1', '/m1/m1/m1/m1.pyc') 131 | 132 | # Ambiguous request, multiple modules might have matched. 133 | m1 = module_utils.GetLoadedModuleBySuffix('/m1/__init__.py') 134 | self.assertTrue(m1, 'Package not found') 135 | self.assertIn(m1.__file__, ['/m1/__init__.pyc', '/m1/m1/m1/__init__.pyc']) 136 | 137 | # Ambiguous request, multiple modules might have matched. 138 | m1m1 = module_utils.GetLoadedModuleBySuffix('/m1/m1.py') 139 | self.assertTrue(m1m1, 'Module not found') 140 | self.assertIn(m1m1.__file__, ['/m1/m1.pyc', '/m1/m1/m1/m1.pyc']) 141 | 142 | # Not ambiguous. Only 1 match possible. 143 | m1m1m1 = module_utils.GetLoadedModuleBySuffix('/m1/m1/m1/__init__.py') 144 | self.assertTrue(m1m1m1, 'Package not found') 145 | self.assertEqual('/m1/m1/m1/__init__.pyc', m1m1m1.__file__) 146 | 147 | # Not ambiguous. Only 1 match possible. 
148 | m1m1m1m1 = module_utils.GetLoadedModuleBySuffix('/m1/m1/m1/m1.py') 149 | self.assertTrue(m1m1m1m1, 'Module not found') 150 | self.assertEqual('/m1/m1/m1/m1.pyc', m1m1m1m1.__file__) 151 | 152 | def testMainLoadedModuleFromSuffix(self): 153 | # Lookup complex module. 154 | _AddSysModule('__main__', '/a/b/p/m.pyc') 155 | m1 = module_utils.GetLoadedModuleBySuffix('/a/b/p/m.py') 156 | self.assertTrue(m1, 'Module not found') 157 | self.assertEqual('/a/b/p/m.pyc', m1.__file__) 158 | 159 | def testMainWithDotSlashLoadedModuleFromSuffix(self): 160 | # Lookup module started via 'python3 ./m.py', notice the './' 161 | _AddSysModule('__main__', '/a/b/p/./m.pyc') 162 | m1 = module_utils.GetLoadedModuleBySuffix('/a/b/p/m.py') 163 | self.assertIsNotNone(m1) 164 | self.assertTrue(m1, 'Module not found') 165 | self.assertEqual('/a/b/p/./m.pyc', m1.__file__) 166 | 167 | if __name__ == '__main__': 168 | absltest.main() 169 | -------------------------------------------------------------------------------- /src/googleclouddebugger/bytecode_breakpoint.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_BYTECODE_BREAKPOINT_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_BYTECODE_BREAKPOINT_H_ 19 | 20 | #include 21 | #include 22 | #include 23 | 24 | #include "common.h" 25 | #include "python_util.h" 26 | 27 | namespace devtools { 28 | namespace cdbg { 29 | 30 | // Enum representing the status of a breakpoint. State tracking is helpful 31 | // for testing and debugging the bytecode breakpoints. 32 | // ======================================================================= 33 | // State transition map: 34 | // 35 | // (start) kUnknown 36 | // |- [CreateBreakpoint] 37 | // | 38 | // | 39 | // | [ActivateBreakpoint] [PatchCodeObject] 40 | // v | | 41 | // kInactive ----> kActive <---> kError 42 | // | | | 43 | // |-------| | |-------| 44 | // | | | 45 | // |- |- |- [ClearBreakpoint] 46 | // v v v 47 | // kDone 48 | // 49 | // ======================================================================= 50 | enum class BreakpointStatus { 51 | // Unknown status for the breakpoint 52 | kUnknown = 0, 53 | 54 | // Breakpoint is created and is patched in the bytecode. 55 | kActive, 56 | 57 | // Breakpoint is created but is currently not patched in the bytecode. 58 | kInactive, 59 | 60 | // Breakpoint has been cleared. 61 | kDone, 62 | 63 | // Breakpoint is created but failed to be activated (patched in the bytecode). 64 | kError 65 | }; 66 | 67 | // Sets breakpoints in Python code with zero runtime overhead. 68 | // BytecodeBreakpoint rewrites Python bytecode to insert a breakpoint. The 69 | // implementation is specific to CPython 2.7. 70 | // TODO: rename to BreakpointsEmulator when the original implementation 71 | // of BreakpointsEmulator goes away. 
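//
// Typical call sequence (a sketch inferred from the method comments below,
// not additional API): CreateBreakpoint returns a cookie, ActivateBreakpoint
// patches the code object, and ClearBreakpoint removes the patch again.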
72 | class BytecodeBreakpoint {
73 |  public:
74 |   BytecodeBreakpoint();
75 | 
76 |   ~BytecodeBreakpoint();
77 | 
78 |   // Clears all the set breakpoints.
79 |   void Detach();
80 | 
81 |   // Creates a new breakpoint in the specified code object. More than one
82 |   // breakpoint can be created at the same source location. When the breakpoint
83 |   // hits, the "callback" parameter is invoked. Every time this method fails to
84 |   // create the breakpoint, "error_callback" is invoked and a cookie value of
85 |   // -1 is returned. If it succeeds in creating the breakpoint, returns the
86 |   // unique cookie used to activate and clear the breakpoint. Note this method
87 |   // only creates the breakpoint, to activate it you must call
88 |   // "ActivateBreakpoint".
89 |   int CreateBreakpoint(PyCodeObject* code_object, int line,
90 |                        std::function<void()> hit_callback,
91 |                        std::function<void()> error_callback);
92 | 
93 |   // Activates a previously created breakpoint. If it fails to set any
94 |   // breakpoint, the error callback will be invoked. This method is kept
95 |   // separate from "CreateBreakpoint" to ensure that the cookie is available
96 |   // before the "error_callback" is invoked. Calling this method with a cookie
97 |   // value of -1 is a no-op. Note that any breakpoints in the same function that
98 |   // previously failed to activate will retry to activate during this call.
99 |   // TODO: Provide a method "ActivateAllBreakpoints" to optimize
100 |   // the code and patch the code once, instead of multiple times.
101 |   void ActivateBreakpoint(int cookie);
102 | 
103 |   // Removes a previously set breakpoint. Calling this method with a cookie
104 |   // value of -1 is a no-op. Note that any breakpoints in the same function that
105 |   // previously failed to activate will retry to activate during this call.
106 |   void ClearBreakpoint(int cookie);
107 | 
108 |   // Get the status of a breakpoint.
109 |   BreakpointStatus GetBreakpointStatus(int cookie);
110 | 
111 |  private:
112 |   // Information about the breakpoint.
113 |   struct Breakpoint {
114 |     // Method in which the breakpoint is set.
115 |     ScopedPyCodeObject code_object;
116 | 
117 |     // Line number on which the breakpoint is set.
118 |     int line;
119 | 
120 |     // Offset to the instruction on which the breakpoint is set.
121 |     int offset;
122 | 
123 |     // Python callable object to invoke on breakpoint hit.
124 |     ScopedPyObject hit_callable;
125 | 
126 |     // Callback to invoke every time this class fails to install
127 |     // the breakpoint.
128 |     std::function<void()> error_callback;
129 | 
130 |     // Breakpoint ID used to clear the breakpoint.
131 |     int cookie;
132 | 
133 |     // Status of the breakpoint.
134 |     BreakpointStatus status;
135 |   };
136 | 
137 |   // Set of breakpoints in a particular code object and original data of
138 |   // the code object to clear breakpoints.
139 |   struct CodeObjectBreakpoints {
140 |     // Patched code object.
141 |     ScopedPyCodeObject code_object;
142 | 
143 |     // Maps breakpoint offset to breakpoint information. The map is sorted in
144 |     // a descending order.
145 |     std::multimap<int, Breakpoint*, std::greater<int>> breakpoints;
146 | 
147 |     // Python runtime assumes that objects referenced by "PyCodeObject" stay
148 |     // alive as long as the code object is alive. Therefore when patching the
149 |     // code object, we can't just decrement reference count for code and
150 |     // constants. Instead we store these references in a special zombie pool.
151 |     // Then once we know that no Python thread is executing the code object,
152 |     // we can release all of them.
153 | // TODO: implement garbage collection for zombie refs. 154 | std::vector zombie_refs; 155 | 156 | // Original value of PyCodeObject::co_stacksize before patching. 157 | int original_stacksize; 158 | 159 | // Original value of PyCodeObject::co_consts before patching. 160 | ScopedPyObject original_consts; 161 | 162 | // Original value of PyCodeObject::co_code before patching. 163 | ScopedPyObject original_code; 164 | 165 | // Original value of PythonCode::co_lnotab or PythonCode::co_linetable 166 | // before patching. This is the line numbers table in CPython <= 3.9 and 167 | // CPython >= 3.10 respectively 168 | ScopedPyObject original_linedata; 169 | }; 170 | 171 | // Loads code object into "patches_" if not there yet. Returns nullptr if 172 | // the code object has no code or corrupted. 173 | CodeObjectBreakpoints* PreparePatchCodeObject( 174 | const ScopedPyCodeObject& code_object); 175 | 176 | // Patches the code object with breakpoints. If the code object has no more 177 | // breakpoints, resets the code object to its original state. This operation 178 | // is idempotent. 179 | void PatchCodeObject(CodeObjectBreakpoints* code); 180 | 181 | private: 182 | // Global counter of breakpoints to generate a unique breakpoint cookie. 183 | int cookie_counter_; 184 | 185 | // Maps breakpoint cookie to full breakpoint information. 186 | std::map cookie_map_; 187 | 188 | // Patched code objects. 189 | std::unordered_map< 190 | ScopedPyCodeObject, 191 | CodeObjectBreakpoints*, 192 | ScopedPyCodeObject::Hash> patches_; 193 | 194 | DISALLOW_COPY_AND_ASSIGN(BytecodeBreakpoint); 195 | }; 196 | 197 | } // namespace cdbg 198 | } // namespace devtools 199 | 200 | #endif // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_BYTECODE_BREAKPOINT_H_ 201 | -------------------------------------------------------------------------------- /tests/py/breakpoints_manager_test.py: -------------------------------------------------------------------------------- 1 | """Unit test for breakpoints_manager module.""" 2 | 3 | from datetime import datetime 4 | from datetime import timedelta 5 | from unittest import mock 6 | 7 | from absl.testing import absltest 8 | 9 | from googleclouddebugger import breakpoints_manager 10 | 11 | 12 | class BreakpointsManagerTest(absltest.TestCase): 13 | """Unit test for breakpoints_manager module.""" 14 | 15 | def setUp(self): 16 | self._breakpoints_manager = breakpoints_manager.BreakpointsManager( 17 | self, None) 18 | 19 | path = 'googleclouddebugger.breakpoints_manager.' 
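    # Patch PythonBreakpoint as referenced from breakpoints_manager so these
    # tests exercise only the manager's bookkeeping, not real breakpoints.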
20 | breakpoint_class = path + 'python_breakpoint.PythonBreakpoint' 21 | 22 | patcher = mock.patch(breakpoint_class) 23 | self._mock_breakpoint = patcher.start() 24 | self.addCleanup(patcher.stop) 25 | 26 | def testEmpty(self): 27 | self.assertEmpty(self._breakpoints_manager._active) 28 | 29 | def testSetSingle(self): 30 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 31 | self._mock_breakpoint.assert_has_calls( 32 | [mock.call({'id': 'ID1'}, self, self._breakpoints_manager, None)]) 33 | self.assertLen(self._breakpoints_manager._active, 1) 34 | 35 | def testSetDouble(self): 36 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 37 | self._mock_breakpoint.assert_has_calls( 38 | [mock.call({'id': 'ID1'}, self, self._breakpoints_manager, None)]) 39 | self.assertLen(self._breakpoints_manager._active, 1) 40 | 41 | self._breakpoints_manager.SetActiveBreakpoints([{ 42 | 'id': 'ID1' 43 | }, { 44 | 'id': 'ID2' 45 | }]) 46 | self._mock_breakpoint.assert_has_calls([ 47 | mock.call({'id': 'ID1'}, self, self._breakpoints_manager, None), 48 | mock.call({'id': 'ID2'}, self, self._breakpoints_manager, None) 49 | ]) 50 | self.assertLen(self._breakpoints_manager._active, 2) 51 | 52 | def testSetRepeated(self): 53 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 54 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 55 | self.assertEqual(1, self._mock_breakpoint.call_count) 56 | 57 | def testClear(self): 58 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 59 | self._breakpoints_manager.SetActiveBreakpoints([]) 60 | self.assertEqual(1, self._mock_breakpoint.return_value.Clear.call_count) 61 | self.assertEmpty(self._breakpoints_manager._active) 62 | 63 | def testCompleteInvalidId(self): 64 | self._breakpoints_manager.CompleteBreakpoint('ID_INVALID') 65 | 66 | def testComplete(self): 67 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 68 | self._breakpoints_manager.CompleteBreakpoint('ID1') 69 | self.assertEqual(1, self._mock_breakpoint.return_value.Clear.call_count) 70 | 71 | def testSetCompleted(self): 72 | self._breakpoints_manager.CompleteBreakpoint('ID1') 73 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 74 | self.assertEqual(0, self._mock_breakpoint.call_count) 75 | 76 | def testCompletedCleanup(self): 77 | self._breakpoints_manager.CompleteBreakpoint('ID1') 78 | self._breakpoints_manager.SetActiveBreakpoints([]) 79 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 80 | self.assertEqual(1, self._mock_breakpoint.call_count) 81 | 82 | def testMultipleSetDelete(self): 83 | self._breakpoints_manager.SetActiveBreakpoints([{ 84 | 'id': 'ID1' 85 | }, { 86 | 'id': 'ID2' 87 | }, { 88 | 'id': 'ID3' 89 | }, { 90 | 'id': 'ID4' 91 | }]) 92 | self.assertLen(self._breakpoints_manager._active, 4) 93 | 94 | self._breakpoints_manager.SetActiveBreakpoints([{ 95 | 'id': 'ID1' 96 | }, { 97 | 'id': 'ID2' 98 | }, { 99 | 'id': 'ID3' 100 | }, { 101 | 'id': 'ID4' 102 | }]) 103 | self.assertLen(self._breakpoints_manager._active, 4) 104 | 105 | self._breakpoints_manager.SetActiveBreakpoints([]) 106 | self.assertEmpty(self._breakpoints_manager._active) 107 | 108 | def testCombination(self): 109 | self._breakpoints_manager.SetActiveBreakpoints([{ 110 | 'id': 'ID1' 111 | }, { 112 | 'id': 'ID2' 113 | }, { 114 | 'id': 'ID3' 115 | }]) 116 | self.assertLen(self._breakpoints_manager._active, 3) 117 | 118 | self._breakpoints_manager.CompleteBreakpoint('ID2') 119 | self.assertEqual(1, 
self._mock_breakpoint.return_value.Clear.call_count) 120 | self.assertLen(self._breakpoints_manager._active, 2) 121 | 122 | self._breakpoints_manager.SetActiveBreakpoints([{ 123 | 'id': 'ID2' 124 | }, { 125 | 'id': 'ID3' 126 | }, { 127 | 'id': 'ID4' 128 | }]) 129 | self.assertEqual(2, self._mock_breakpoint.return_value.Clear.call_count) 130 | self.assertLen(self._breakpoints_manager._active, 2) 131 | 132 | self._breakpoints_manager.CompleteBreakpoint('ID2') 133 | self.assertEqual(2, self._mock_breakpoint.return_value.Clear.call_count) 134 | self.assertLen(self._breakpoints_manager._active, 2) 135 | 136 | self._breakpoints_manager.SetActiveBreakpoints([]) 137 | self.assertEqual(4, self._mock_breakpoint.return_value.Clear.call_count) 138 | self.assertEmpty(self._breakpoints_manager._active) 139 | 140 | def testCheckExpirationNoBreakpoints(self): 141 | self._breakpoints_manager.CheckBreakpointsExpiration() 142 | 143 | def testCheckNotExpired(self): 144 | self._breakpoints_manager.SetActiveBreakpoints([{ 145 | 'id': 'ID1' 146 | }, { 147 | 'id': 'ID2' 148 | }]) 149 | self._mock_breakpoint.return_value.GetExpirationTime.return_value = ( 150 | datetime.utcnow() + timedelta(minutes=1)) 151 | self._breakpoints_manager.CheckBreakpointsExpiration() 152 | self.assertEqual( 153 | 0, self._mock_breakpoint.return_value.ExpireBreakpoint.call_count) 154 | 155 | def testCheckExpired(self): 156 | self._breakpoints_manager.SetActiveBreakpoints([{ 157 | 'id': 'ID1' 158 | }, { 159 | 'id': 'ID2' 160 | }]) 161 | self._mock_breakpoint.return_value.GetExpirationTime.return_value = ( 162 | datetime.utcnow() - timedelta(minutes=1)) 163 | self._breakpoints_manager.CheckBreakpointsExpiration() 164 | self.assertEqual( 165 | 2, self._mock_breakpoint.return_value.ExpireBreakpoint.call_count) 166 | 167 | def testCheckExpirationReset(self): 168 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 169 | self._mock_breakpoint.return_value.GetExpirationTime.return_value = ( 170 | datetime.utcnow() + timedelta(minutes=1)) 171 | self._breakpoints_manager.CheckBreakpointsExpiration() 172 | self.assertEqual( 173 | 0, self._mock_breakpoint.return_value.ExpireBreakpoint.call_count) 174 | 175 | self._breakpoints_manager.SetActiveBreakpoints([{ 176 | 'id': 'ID1' 177 | }, { 178 | 'id': 'ID2' 179 | }]) 180 | self._mock_breakpoint.return_value.GetExpirationTime.return_value = ( 181 | datetime.utcnow() - timedelta(minutes=1)) 182 | self._breakpoints_manager.CheckBreakpointsExpiration() 183 | self.assertEqual( 184 | 2, self._mock_breakpoint.return_value.ExpireBreakpoint.call_count) 185 | 186 | def testCheckExpirationCacheNegative(self): 187 | base = datetime(2015, 1, 1) 188 | 189 | with mock.patch.object(breakpoints_manager.BreakpointsManager, 190 | 'GetCurrentTime') as mock_time: 191 | mock_time.return_value = base 192 | 193 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 194 | self._mock_breakpoint.return_value.GetExpirationTime.return_value = ( 195 | base + timedelta(minutes=1)) 196 | 197 | self._breakpoints_manager.CheckBreakpointsExpiration() 198 | self.assertEqual( 199 | 0, self._mock_breakpoint.return_value.ExpireBreakpoint.call_count) 200 | 201 | # The nearest expiration time is cached, so this should have no effect. 
202 | self._mock_breakpoint.return_value.GetExpirationTime.return_value = ( 203 | base - timedelta(minutes=1)) 204 | self._breakpoints_manager.CheckBreakpointsExpiration() 205 | self.assertEqual( 206 | 0, self._mock_breakpoint.return_value.ExpireBreakpoint.call_count) 207 | 208 | def testCheckExpirationCachePositive(self): 209 | base = datetime(2015, 1, 1) 210 | 211 | with mock.patch.object(breakpoints_manager.BreakpointsManager, 212 | 'GetCurrentTime') as mock_time: 213 | self._breakpoints_manager.SetActiveBreakpoints([{'id': 'ID1'}]) 214 | self._mock_breakpoint.return_value.GetExpirationTime.return_value = ( 215 | base + timedelta(minutes=1)) 216 | 217 | mock_time.return_value = base 218 | self._breakpoints_manager.CheckBreakpointsExpiration() 219 | self.assertEqual( 220 | 0, self._mock_breakpoint.return_value.ExpireBreakpoint.call_count) 221 | 222 | mock_time.return_value = base + timedelta(minutes=2) 223 | self._breakpoints_manager.CheckBreakpointsExpiration() 224 | self.assertEqual( 225 | 1, self._mock_breakpoint.return_value.ExpireBreakpoint.call_count) 226 | 227 | 228 | if __name__ == '__main__': 229 | absltest.main() 230 | -------------------------------------------------------------------------------- /src/googleclouddebugger/module_explorer.py: -------------------------------------------------------------------------------- 1 | # Copyright 2015 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS-IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Finds all the code objects defined by a module.""" 15 | 16 | import gc 17 | import os 18 | import sys 19 | import types 20 | 21 | # Maximum traversal depth when looking for all the code objects referenced by 22 | # a module or another code object. 23 | _MAX_REFERENTS_BFS_DEPTH = 15 24 | 25 | # Absolute limit on the amount of objects to scan when looking for all the code 26 | # objects implemented in a module. 27 | _MAX_VISIT_OBJECTS = 100000 28 | 29 | # Maximum referents an object can have before it is skipped in the BFS 30 | # traversal. This is to prevent things like long objects or dictionaries that 31 | # probably do not contain code objects from using the _MAX_VISIT_OBJECTS quota. 32 | _MAX_OBJECT_REFERENTS = 1000 33 | 34 | # Object types to ignore when looking for the code objects. 35 | _BFS_IGNORE_TYPES = (types.ModuleType, type(None), bool, float, bytes, str, int, 36 | types.BuiltinFunctionType, types.BuiltinMethodType, list) 37 | 38 | 39 | def GetCodeObjectAtLine(module, line): 40 | """Searches for a code object at the specified line in the specified module. 41 | 42 | Args: 43 | module: module to explore. 44 | line: 1-based line number of the statement. 45 | 46 | Returns: 47 | (True, Code object) on success or (False, (prev_line, next_line)) on 48 | failure, where prev_line and next_line are the closest lines with code above 49 | and below the specified line, or None if they do not exist. 
50 | """ 51 | if not hasattr(module, '__file__'): 52 | return (False, (None, None)) 53 | 54 | prev_line = 0 55 | next_line = sys.maxsize 56 | 57 | for code_object in _GetModuleCodeObjects(module): 58 | for co_line_number in _GetLineNumbers(code_object): 59 | if co_line_number == line: 60 | return (True, code_object) 61 | elif co_line_number < line: 62 | prev_line = max(prev_line, co_line_number) 63 | elif co_line_number > line: 64 | next_line = min(next_line, co_line_number) 65 | # Continue because line numbers may not be sequential. 66 | 67 | prev_line = None if prev_line == 0 else prev_line 68 | next_line = None if next_line == sys.maxsize else next_line 69 | return (False, (prev_line, next_line)) 70 | 71 | 72 | def _GetLineNumbers(code_object): 73 | """Generator for getting the line numbers of a code object. 74 | 75 | Args: 76 | code_object: the code object. 77 | 78 | Yields: 79 | The next line number in the code object. 80 | """ 81 | 82 | if sys.version_info.minor < 10: 83 | # Get the line number deltas, which are the odd number entries, from the 84 | # lnotab. See 85 | # https://svn.python.org/projects/python/branches/pep-0384/Objects/lnotab_notes.txt 86 | # In Python 3, prior to 3.10, this is just a byte array. 87 | line_incrs = code_object.co_lnotab[1::2] 88 | current_line = code_object.co_firstlineno 89 | for line_incr in line_incrs: 90 | if line_incr >= 0x80: 91 | # line_incrs is an array of 8-bit signed integers 92 | line_incr -= 0x100 93 | current_line += line_incr 94 | yield current_line 95 | else: 96 | # Get the line numbers directly, which are the third entry in the tuples. 97 | # https://peps.python.org/pep-0626/#the-new-co-lines-method-of-code-objects 98 | line_numbers = [entry[2] for entry in code_object.co_lines()] 99 | for line_number in line_numbers: 100 | if line_number is not None: 101 | yield line_number 102 | 103 | 104 | def _GetModuleCodeObjects(module): 105 | """Gets all code objects defined in the specified module. 106 | 107 | There are two BFS traversals involved. One in this function and the other in 108 | _FindCodeObjectsReferents. Only the BFS in _FindCodeObjectsReferents has 109 | a depth limit. This function does not. The motivation is that this function 110 | explores code object of the module and they can have any arbitrary nesting 111 | level. _FindCodeObjectsReferents, on the other hand, traverses through class 112 | definitions and random references. It's much more expensive and will likely 113 | go into unrelated objects. 114 | 115 | There is also a limit on how many total objects are going to be traversed in 116 | all. This limit makes sure that if something goes wrong, the lookup doesn't 117 | hang. 118 | 119 | Args: 120 | module: module to explore. 121 | 122 | Returns: 123 | Set of code objects defined in module. 124 | """ 125 | 126 | visit_recorder = _VisitRecorder() 127 | current = [module] 128 | code_objects = set() 129 | while current: 130 | current = _FindCodeObjectsReferents(module, current, visit_recorder) 131 | code_objects |= current 132 | 133 | # Unfortunately Python code objects don't implement tp_traverse, so this 134 | # type can't be used with gc.get_referents. The workaround is to get the 135 | # relevant objects explicitly here. 136 | current = [code_object.co_consts for code_object in current] 137 | 138 | return code_objects 139 | 140 | 141 | def _FindCodeObjectsReferents(module, start_objects, visit_recorder): 142 | """Looks for all the code objects referenced by objects in start_objects. 
143 |
144 |   The traversal implemented by this function is a shallow one. In other words
145 |   if the reference chain is a -> b -> co1 -> c -> co2, this function will
146 |   return [co1] only.
147 |
148 |   The traversal is implemented with BFS. The maximum depth is limited to avoid
149 |   touching all the objects in the process. Each object is only visited once
150 |   using visit_recorder.
151 |
152 |   Args:
153 |     module: module in which we are looking for code objects.
154 |     start_objects: initial set of objects for the BFS traversal.
155 |     visit_recorder: instance of _VisitRecorder class to ensure each object is
156 |       visited at most once.
157 |
158 |   Returns:
159 |     List of code objects.
160 |   """
161 |
162 |   def CheckIgnoreCodeObject(code_object):
163 |     """Checks if the code object can be ignored.
164 |
165 |     Code objects that are not implemented in the module, or are from a lambda or
166 |     generator expression can be ignored.
167 |
168 |     If the module was precompiled, the code object may point to .py file, while
169 |     the module says that it originated from .pyc file. We just strip extension
170 |     altogether to work around it.
171 |
172 |     Args:
173 |       code_object: code object that we want to check against module.
174 |
175 |     Returns:
176 |       True if the code object can be ignored, False otherwise.
177 |     """
178 |     if code_object.co_name in ('<lambda>', '<genexpr>'):
179 |       return True
180 |
181 |     code_object_file = os.path.splitext(code_object.co_filename)[0]
182 |     module_file = os.path.splitext(module.__file__)[0]
183 |
184 |     # The simple case.
185 |     if code_object_file == module_file:
186 |       return False
187 |
188 |     return True
189 |
190 |   def CheckIgnoreClass(cls):
191 |     """Returns True if the class is definitely not coming from "module"."""
192 |     cls_module = sys.modules.get(cls.__module__)
193 |     if not cls_module:
194 |       return False  # We can't tell for sure, so explore this class.
195 |
196 |     return (cls_module is not module and
197 |             getattr(cls_module, '__file__', None) != module.__file__)
198 |
199 |   code_objects = set()
200 |   current = start_objects
201 |   for obj in current:
202 |     visit_recorder.Record(obj)
203 |
204 |   depth = 0
205 |   while current and depth < _MAX_REFERENTS_BFS_DEPTH:
206 |     new_current = []
207 |     for current_obj in current:
208 |       referents = gc.get_referents(current_obj)
209 |       if (current_obj is not module.__dict__ and
210 |           len(referents) > _MAX_OBJECT_REFERENTS):
211 |         continue
212 |
213 |       for obj in referents:
214 |         if isinstance(obj, _BFS_IGNORE_TYPES) or not visit_recorder.Record(obj):
215 |           continue
216 |
217 |         if isinstance(obj, types.CodeType) and CheckIgnoreCodeObject(obj):
218 |           continue
219 |
220 |         if isinstance(obj, type) and CheckIgnoreClass(obj):
221 |           continue
222 |
223 |         if isinstance(obj, types.CodeType):
224 |           code_objects.add(obj)
225 |         else:
226 |           new_current.append(obj)
227 |
228 |     current = new_current
229 |     depth += 1
230 |
231 |   return code_objects
232 |
233 |
234 | class _VisitRecorder(object):
235 |   """Helper class to keep track of already visited objects and implement quota.
236 |
237 |   This class keeps a map from integer to object. The key is a unique object
238 |   ID (raw object pointer). The value is the object itself. We need to keep the
239 |   object in the map, so that it doesn't get released during iteration (since
240 |   object ID is only unique as long as the object is alive).
241 |   """
242 |
243 |   def __init__(self):
244 |     self._visit_recorder_objects = {}
245 |
246 |   def Record(self, obj):
247 |     """Records the object as visited.
248 | 249 | Args: 250 | obj: visited object. 251 | 252 | Returns: 253 | True if the object hasn't been previously visited or False if it has 254 | already been recorded or the quota has been exhausted. 255 | """ 256 | if len(self._visit_recorder_objects) >= _MAX_VISIT_OBJECTS: 257 | return False 258 | 259 | obj_id = id(obj) 260 | if obj_id in self._visit_recorder_objects: 261 | return False 262 | 263 | self._visit_recorder_objects[obj_id] = obj 264 | return True 265 | -------------------------------------------------------------------------------- /tests/py/native_module_test.py: -------------------------------------------------------------------------------- 1 | """Unit tests for native module.""" 2 | 3 | import inspect 4 | import sys 5 | import threading 6 | import time 7 | 8 | from absl.testing import absltest 9 | 10 | from googleclouddebugger import cdbg_native as native 11 | import python_test_util 12 | 13 | 14 | def _DoHardWork(base): 15 | for i in range(base): 16 | if base * i < 0: 17 | return True 18 | return False 19 | 20 | 21 | class NativeModuleTest(absltest.TestCase): 22 | """Unit tests for native module.""" 23 | 24 | def setUp(self): 25 | # Lock for thread safety. 26 | self._lock = threading.Lock() 27 | 28 | # Count hit count for the breakpoints we set. 29 | self._breakpoint_counter = 0 30 | 31 | # Registers breakpoint events other than breakpoint hit. 32 | self._breakpoint_events = [] 33 | 34 | # Keep track of breakpoints we set to reset them on cleanup. 35 | self._cookies = [] 36 | 37 | def tearDown(self): 38 | # Verify that we didn't get any breakpoint events that the test did 39 | # not expect. 40 | self.assertEqual([], self._PopBreakpointEvents()) 41 | 42 | self._ClearAllBreakpoints() 43 | 44 | def testUnconditionalBreakpoint(self): 45 | 46 | def Trigger(): 47 | unused_lock = threading.Lock() 48 | print('Breakpoint trigger') # BPTAG: UNCONDITIONAL_BREAKPOINT 49 | 50 | self._SetBreakpoint(Trigger, 'UNCONDITIONAL_BREAKPOINT') 51 | Trigger() 52 | self.assertEqual(1, self._breakpoint_counter) 53 | 54 | def testConditionalBreakpoint(self): 55 | 56 | def Trigger(): 57 | d = {} 58 | for i in range(1, 10): 59 | d[i] = i**2 # BPTAG: CONDITIONAL_BREAKPOINT 60 | 61 | self._SetBreakpoint(Trigger, 'CONDITIONAL_BREAKPOINT', 'i % 3 == 1') 62 | Trigger() 63 | self.assertEqual(3, self._breakpoint_counter) 64 | 65 | def testClearBreakpoint(self): 66 | """Set two breakpoint on the same line, then clear one.""" 67 | 68 | def Trigger(): 69 | print('Breakpoint trigger') # BPTAG: CLEAR_BREAKPOINT 70 | 71 | self._SetBreakpoint(Trigger, 'CLEAR_BREAKPOINT') 72 | self._SetBreakpoint(Trigger, 'CLEAR_BREAKPOINT') 73 | native.ClearConditionalBreakpoint(self._cookies.pop()) 74 | Trigger() 75 | self.assertEqual(1, self._breakpoint_counter) 76 | 77 | def testMissingModule(self): 78 | 79 | def Test(): 80 | native.CreateConditionalBreakpoint(None, 123123, None, 81 | self._BreakpointEvent) 82 | 83 | self.assertRaises(TypeError, Test) 84 | 85 | def testBadModule(self): 86 | 87 | def Test(): 88 | native.CreateConditionalBreakpoint('str', 123123, None, 89 | self._BreakpointEvent) 90 | 91 | self.assertRaises(TypeError, Test) 92 | 93 | def testInvalidCondition(self): 94 | 95 | def Test(): 96 | native.CreateConditionalBreakpoint(sys.modules[__name__], 123123, '2+2', 97 | self._BreakpointEvent) 98 | 99 | self.assertRaises(TypeError, Test) 100 | 101 | def testMissingCallback(self): 102 | 103 | def Test(): 104 | native.CreateConditionalBreakpoint('code.py', 123123, None, None) 105 | 106 | 
self.assertRaises(TypeError, Test) 107 | 108 | def testInvalidCallback(self): 109 | 110 | def Test(): 111 | native.CreateConditionalBreakpoint('code.py', 123123, None, {}) 112 | 113 | self.assertRaises(TypeError, Test) 114 | 115 | def testMissingCookie(self): 116 | self.assertRaises(TypeError, 117 | lambda: native.ClearConditionalBreakpoint(None)) 118 | 119 | def testInvalidCookie(self): 120 | native.ClearConditionalBreakpoint(387873457) 121 | 122 | def testMutableCondition(self): 123 | 124 | def Trigger(): 125 | 126 | def MutableMethod(): 127 | self._evil = True 128 | return True 129 | 130 | print('MutableMethod = %s' % MutableMethod) # BPTAG: MUTABLE_CONDITION 131 | 132 | self._SetBreakpoint(Trigger, 'MUTABLE_CONDITION', 'MutableMethod()') 133 | Trigger() 134 | self.assertEqual([native.BREAKPOINT_EVENT_CONDITION_EXPRESSION_MUTABLE], 135 | self._PopBreakpointEvents()) 136 | 137 | def testGlobalConditionQuotaExceeded(self): 138 | 139 | def Trigger(): 140 | print('Breakpoint trigger') # BPTAG: GLOBAL_CONDITION_QUOTA 141 | 142 | self._SetBreakpoint(Trigger, 'GLOBAL_CONDITION_QUOTA', '_DoHardWork(1000)') 143 | Trigger() 144 | self._ClearAllBreakpoints() 145 | 146 | self.assertListEqual( 147 | [native.BREAKPOINT_EVENT_GLOBAL_CONDITION_QUOTA_EXCEEDED], 148 | self._PopBreakpointEvents()) 149 | 150 | # Sleep for some time to let the quota recover. 151 | time.sleep(0.1) 152 | 153 | def testBreakpointConditionQuotaExceeded(self): 154 | 155 | def Trigger(): 156 | print('Breakpoint trigger') # BPTAG: PER_BREAKPOINT_CONDITION_QUOTA 157 | 158 | time.sleep(1) 159 | 160 | # Per-breakpoint quota is lower than the global one. Exponentially 161 | # increase the complexity of a condition until we hit it. 162 | base = 100 163 | while True: 164 | self._SetBreakpoint(Trigger, 'PER_BREAKPOINT_CONDITION_QUOTA', 165 | '_DoHardWork(%d)' % base) 166 | Trigger() 167 | self._ClearAllBreakpoints() 168 | 169 | events = self._PopBreakpointEvents() 170 | if events: 171 | self.assertEqual( 172 | [native.BREAKPOINT_EVENT_BREAKPOINT_CONDITION_QUOTA_EXCEEDED], 173 | events) 174 | break 175 | 176 | base *= 1.2 177 | time.sleep(0.1) 178 | 179 | # Sleep for some time to let the quota recover. 
180 | time.sleep(0.1) 181 | 182 | def testImmutableCallSuccess(self): 183 | 184 | def Add(a, b, c): 185 | return a + b + c 186 | 187 | def Magic(): 188 | return 'cake' 189 | 190 | self.assertEqual('643535', 191 | self._CallImmutable(inspect.currentframe(), 'str(643535)')) 192 | self.assertEqual( 193 | 786 + 23 + 891, 194 | self._CallImmutable(inspect.currentframe(), 'Add(786, 23, 891)')) 195 | self.assertEqual('cake', 196 | self._CallImmutable(inspect.currentframe(), 'Magic()')) 197 | return Add or Magic 198 | 199 | def testImmutableCallMutable(self): 200 | 201 | def Change(): 202 | dictionary['bad'] = True 203 | 204 | dictionary = {} 205 | frame = inspect.currentframe() 206 | self.assertRaises(SystemError, 207 | lambda: self._CallImmutable(frame, 'Change()')) 208 | self.assertEqual({}, dictionary) 209 | return Change 210 | 211 | def testImmutableCallExceptionPropagation(self): 212 | 213 | def Divide(a, b): 214 | return a / b 215 | 216 | frame = inspect.currentframe() 217 | self.assertRaises(ZeroDivisionError, 218 | lambda: self._CallImmutable(frame, 'Divide(1, 0)')) 219 | return Divide 220 | 221 | def testImmutableCallInvalidFrame(self): 222 | self.assertRaises(TypeError, lambda: native.CallImmutable(None, lambda: 1)) 223 | self.assertRaises(TypeError, 224 | lambda: native.CallImmutable('not a frame', lambda: 1)) 225 | 226 | def testImmutableCallInvalidCallable(self): 227 | frame = inspect.currentframe() 228 | self.assertRaises(TypeError, lambda: native.CallImmutable(frame, None)) 229 | self.assertRaises(TypeError, 230 | lambda: native.CallImmutable(frame, 'not a callable')) 231 | 232 | def _SetBreakpoint(self, method, tag, condition=None): 233 | """Sets a breakpoint in this source file. 234 | 235 | The line number is identified by tag. This function does not verify that 236 | the source line is in the specified method. 237 | 238 | The breakpoint may have an optional condition. 239 | 240 | Args: 241 | method: method in which the breakpoint will be set. 242 | tag: label for a source line. 243 | condition: optional breakpoint condition. 244 | """ 245 | unused_path, line = python_test_util.ResolveTag(type(self), tag) 246 | 247 | compiled_condition = None 248 | if condition is not None: 249 | compiled_condition = compile(condition, '', 'eval') 250 | 251 | cookie = native.CreateConditionalBreakpoint(method.__code__, line, 252 | compiled_condition, 253 | self._BreakpointEvent) 254 | 255 | self._cookies.append(cookie) 256 | native.ActivateConditionalBreakpoint(cookie) 257 | 258 | def _ClearAllBreakpoints(self): 259 | """Removes all previously set breakpoints.""" 260 | for cookie in self._cookies: 261 | native.ClearConditionalBreakpoint(cookie) 262 | 263 | def _CallImmutable(self, frame, expression): 264 | """Wrapper over native.ImmutableCall for callable.""" 265 | return native.CallImmutable(frame, 266 | compile(expression, '', 'eval')) 267 | 268 | def _BreakpointEvent(self, event, frame): 269 | """Callback on breakpoint event. 270 | 271 | See thread_breakpoints.h for more details of possible events. 272 | 273 | Args: 274 | event: breakpoint event (see kIntegerConstants in native_module.cc). 275 | frame: Python stack frame of breakpoint hit or None for other events. 
276 | """ 277 | with self._lock: 278 | if event == native.BREAKPOINT_EVENT_HIT: 279 | self.assertTrue(inspect.isframe(frame)) 280 | self._breakpoint_counter += 1 281 | else: 282 | self._breakpoint_events.append(event) 283 | 284 | def _PopBreakpointEvents(self): 285 | """Gets and resets the list of breakpoint events received so far.""" 286 | with self._lock: 287 | events = self._breakpoint_events 288 | self._breakpoint_events = [] 289 | return events 290 | 291 | def _HasBreakpointEvents(self): 292 | """Checks whether there are unprocessed breakpoint events.""" 293 | with self._lock: 294 | if self._breakpoint_events: 295 | return True 296 | return False 297 | 298 | 299 | if __name__ == '__main__': 300 | absltest.main() 301 | -------------------------------------------------------------------------------- /src/googleclouddebugger/python_util.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright 2015 Google Inc. All Rights Reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | // Ensure that Python.h is included before any other header. 18 | #include "common.h" 19 | 20 | #include "python_util.h" 21 | 22 | #include 23 | 24 | #include 25 | 26 | #if PY_VERSION_HEX >= 0x030A0000 27 | #include "../third_party/pylinetable.h" 28 | #endif // PY_VERSION_HEX >= 0x030A0000 29 | 30 | 31 | namespace devtools { 32 | namespace cdbg { 33 | 34 | // Python module object corresponding to the debuglet extension. 35 | static PyObject* g_debuglet_module = nullptr; 36 | 37 | 38 | CodeObjectLinesEnumerator::CodeObjectLinesEnumerator( 39 | PyCodeObject* code_object) { 40 | #if PY_VERSION_HEX < 0x030A0000 41 | Initialize(code_object->co_firstlineno, code_object->co_lnotab); 42 | #else 43 | Initialize(code_object->co_firstlineno, code_object->co_linetable); 44 | #endif // PY_VERSION_HEX < 0x030A0000 45 | } 46 | 47 | 48 | CodeObjectLinesEnumerator::CodeObjectLinesEnumerator( 49 | int firstlineno, 50 | PyObject* linedata) { 51 | Initialize(firstlineno, linedata); 52 | } 53 | 54 | 55 | #if PY_VERSION_HEX < 0x030A0000 56 | void CodeObjectLinesEnumerator::Initialize( 57 | int firstlineno, 58 | PyObject* lnotab) { 59 | offset_ = 0; 60 | line_number_ = firstlineno; 61 | remaining_entries_ = PyBytes_Size(lnotab) / 2; 62 | next_entry_ = reinterpret_cast(PyBytes_AsString(lnotab)); 63 | 64 | // If the line table starts with offset 0, the first line is not 65 | // "code_object->co_firstlineno", but the following line. 66 | if ((remaining_entries_ > 0) && (next_entry_[0] == 0)) { 67 | Next(); 68 | } 69 | } 70 | 71 | 72 | // See this URL for explanation of "co_lnotab" data structure: 73 | // http://svn.python.org/projects/python/branches/pep-0384/Objects/lnotab_notes.txt // NOLINT 74 | // For reference implementation see PyCode_Addr2Line (Objects/codeobject.c). 
75 | bool CodeObjectLinesEnumerator::Next() { 76 | if (remaining_entries_ == 0) { 77 | return false; 78 | } 79 | 80 | while (true) { 81 | offset_ += next_entry_[0]; 82 | line_number_ += static_cast(next_entry_[1]); 83 | 84 | bool stop = ((next_entry_[0] != 0xFF) || (next_entry_[1] != 0)) && 85 | ((next_entry_[0] != 0) || (next_entry_[1] != 0xFF)); 86 | 87 | --remaining_entries_; 88 | next_entry_ += 2; 89 | 90 | if (stop) { 91 | return true; 92 | } 93 | 94 | if (remaining_entries_ <= 0) { // Corrupted line table. 95 | return false; 96 | } 97 | } 98 | } 99 | #else 100 | 101 | void CodeObjectLinesEnumerator::Initialize( 102 | int firstlineno, 103 | PyObject* linetable) { 104 | Py_ssize_t length = PyBytes_Size(linetable); 105 | _PyLineTable_InitAddressRange(PyBytes_AsString(linetable), length, firstlineno, &range_); 106 | } 107 | 108 | bool CodeObjectLinesEnumerator::Next() { 109 | while (_PyLineTable_NextAddressRange(&range_)) { 110 | if (range_.ar_line >= 0) { 111 | line_number_ = range_.ar_line; 112 | offset_ = range_.ar_start; 113 | return true; 114 | } 115 | } 116 | return false; 117 | } 118 | #endif // PY_VERSION_HEX < 0x030A0000 119 | 120 | PyObject* GetDebugletModule() { 121 | DCHECK(g_debuglet_module != nullptr); 122 | return g_debuglet_module; 123 | } 124 | 125 | 126 | void SetDebugletModule(PyObject* module) { 127 | DCHECK_NE(g_debuglet_module == nullptr, module == nullptr); 128 | 129 | g_debuglet_module = module; 130 | } 131 | 132 | 133 | PyTypeObject DefaultTypeDefinition(const char* type_name) { 134 | return { 135 | #if PY_MAJOR_VERSION >= 3 136 | PyVarObject_HEAD_INIT(nullptr, /* ob_size */ 0) 137 | #else 138 | PyObject_HEAD_INIT(nullptr) 139 | 0, /* ob_size */ 140 | #endif 141 | type_name, /* tp_name */ 142 | 0, /* tp_basicsize */ 143 | 0, /* tp_itemsize */ 144 | 0, /* tp_dealloc */ 145 | 0, /* tp_print */ 146 | 0, /* tp_getattr */ 147 | 0, /* tp_setattr */ 148 | 0, /* tp_compare */ 149 | 0, /* tp_repr */ 150 | 0, /* tp_as_number */ 151 | 0, /* tp_as_sequence */ 152 | 0, /* tp_as_mapping */ 153 | 0, /* tp_hash */ 154 | 0, /* tp_call */ 155 | 0, /* tp_str */ 156 | 0, /* tp_getattro */ 157 | 0, /* tp_setattro */ 158 | 0, /* tp_as_buffer */ 159 | Py_TPFLAGS_DEFAULT, /* tp_flags */ 160 | 0, /* tp_doc */ 161 | 0, /* tp_traverse */ 162 | 0, /* tp_clear */ 163 | 0, /* tp_richcompare */ 164 | 0, /* tp_weaklistoffset */ 165 | 0, /* tp_iter */ 166 | 0, /* tp_iternext */ 167 | 0, /* tp_methods */ 168 | 0, /* tp_members */ 169 | 0, /* tp_getset */ 170 | 0, /* tp_base */ 171 | 0, /* tp_dict */ 172 | 0, /* tp_descr_get */ 173 | 0, /* tp_descr_set */ 174 | 0, /* tp_dictoffset */ 175 | 0, /* tp_init */ 176 | 0, /* tp_alloc */ 177 | 0, /* tp_new */ 178 | }; 179 | } 180 | 181 | 182 | bool RegisterPythonType(PyTypeObject* type) { 183 | if (PyType_Ready(type) < 0) { 184 | LOG(ERROR) << "Python type not ready: " << type->tp_name; 185 | return false; 186 | } 187 | 188 | const char* type_name = strrchr(type->tp_name, '.'); 189 | if (type_name != nullptr) { 190 | ++type_name; 191 | } else { 192 | type_name = type->tp_name; 193 | } 194 | 195 | Py_INCREF(type); 196 | if (PyModule_AddObject( 197 | GetDebugletModule(), 198 | type_name, 199 | reinterpret_cast(type))) { 200 | LOG(ERROR) << "Failed to add type object to native module"; 201 | return false; 202 | } 203 | 204 | return true; 205 | } 206 | 207 | Nullable ClearPythonException() { 208 | PyObject* exception_obj = PyErr_Occurred(); 209 | if (exception_obj == nullptr) { 210 | return Nullable(); // return nullptr. 
211 |   }
212 |
213 |   // TODO: call str(exception_obj) with a verification of immutability
214 |   // that the object state is not being altered.
215 |
216 |   auto exception_type = reinterpret_cast<PyTypeObject*>(exception_obj->ob_type);
217 |   std::string msg = exception_type->tp_name;
218 |
219 | #ifndef NDEBUG
220 |   PyErr_Print();
221 | #else
222 |   static constexpr time_t EXCEPTION_THROTTLE_SECONDS = 30;
223 |   static time_t last_exception_reported = 0;
224 |
225 |   time_t current_time = time(nullptr);
226 |   if (current_time - last_exception_reported >= EXCEPTION_THROTTLE_SECONDS) {
227 |     last_exception_reported = current_time;
228 |     PyErr_Print();
229 |   }
230 | #endif  // NDEBUG
231 |
232 |   PyErr_Clear();
233 |
234 |   return Nullable<std::string>(msg);
235 | }
236 |
237 | PyObject* GetDebugletModuleObject(const char* key) {
238 |   PyObject* module_dict = PyModule_GetDict(GetDebugletModule());
239 |   if (module_dict == nullptr) {
240 |     LOG(ERROR) << "Module has no dictionary";
241 |     return nullptr;
242 |   }
243 |
244 |   PyObject* object = PyDict_GetItemString(module_dict, key);
245 |   if (object == nullptr) {
246 |     LOG(ERROR) << "Object " << key << " not found in module dictionary";
247 |     return nullptr;
248 |   }
249 |
250 |   return object;
251 | }
252 |
253 | std::string CodeObjectDebugString(PyCodeObject* code_object) {
254 |   if (code_object == nullptr) {
255 |     return "<null>";
256 |   }
257 |
258 |   if (!PyCode_Check(code_object)) {
259 |     return "<not a code object>";
260 |   }
261 |
262 |   std::string str;
263 |
264 |   if ((code_object->co_name != nullptr) &&
265 |       PyBytes_CheckExact(code_object->co_name)) {
266 |     str += PyBytes_AS_STRING(code_object->co_name);
267 |   } else {
268 |     str += "<unnamed>";
269 |   }
270 |
271 |   str += ':';
272 |   str += std::to_string(static_cast<int>(code_object->co_firstlineno));
273 |
274 |   if ((code_object->co_filename != nullptr) &&
275 |       PyBytes_CheckExact(code_object->co_filename)) {
276 |     str += " at ";
277 |     str += PyBytes_AS_STRING(code_object->co_filename);
278 |   }
279 |
280 |   return str;
281 | }
282 |
283 | std::vector<uint8_t> PyBytesToByteArray(PyObject* obj) {
284 |   DCHECK(PyBytes_CheckExact(obj));
285 |
286 |   const size_t bytecode_size = PyBytes_GET_SIZE(obj);
287 |   const uint8_t* const bytecode_data =
288 |       reinterpret_cast<const uint8_t*>(PyBytes_AS_STRING(obj));
289 |   return std::vector<uint8_t>(bytecode_data, bytecode_data + bytecode_size);
290 | }
291 |
292 | // Creates a new tuple by appending "items" to elements in "tuple".
293 | ScopedPyObject AppendTuple(
294 |     PyObject* tuple,
295 |     const std::vector<PyObject*>& items) {
296 |   const size_t tuple_size = PyTuple_GET_SIZE(tuple);
297 |   ScopedPyObject new_tuple(PyTuple_New(tuple_size + items.size()));
298 |
299 |   for (size_t i = 0; i < tuple_size; ++i) {
300 |     PyObject* item = PyTuple_GET_ITEM(tuple, i);
301 |     Py_XINCREF(item);
302 |     PyTuple_SET_ITEM(new_tuple.get(), i, item);
303 |   }
304 |
305 |   for (size_t i = 0; i < items.size(); ++i) {
306 |     Py_XINCREF(items[i]);
307 |     PyTuple_SET_ITEM(new_tuple.get(), tuple_size + i, items[i]);
308 |   }
309 |
310 |   return new_tuple;
311 | }
312 |
313 | }  // namespace cdbg
314 | }  // namespace devtools
315 |
316 |
--------------------------------------------------------------------------------
/src/googleclouddebugger/python_util.h:
--------------------------------------------------------------------------------
1 | /**
2 |  * Copyright 2015 Google Inc. All Rights Reserved.
3 |  *
4 |  * Licensed under the Apache License, Version 2.0 (the "License");
5 |  * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYTHON_UTIL_H_ 18 | #define DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYTHON_UTIL_H_ 19 | 20 | #include 21 | #include 22 | #include 23 | 24 | #include "common.h" 25 | #include "nullable.h" 26 | 27 | #define CDBG_MODULE_NAME "cdbg_native" 28 | #define CDBG_SCOPED_NAME(n) CDBG_MODULE_NAME "." n 29 | 30 | namespace devtools { 31 | namespace cdbg { 32 | 33 | // 34 | // Note: all methods in this module must be called with Interpreter Lock held 35 | // by the current thread. 36 | // 37 | 38 | // Wraps C++ class as Python object 39 | struct PyObjectWrapper { 40 | PyObject_HEAD 41 | void* data; 42 | }; 43 | 44 | 45 | // Helper class to automatically increase/decrease reference count on 46 | // a Python object. 47 | // 48 | // This class can assumes the calling thread holds the Interpreter Lock. This 49 | // is particularly important in "ScopedPyObjectT" destructor. 50 | // 51 | // This class is not thread safe. 52 | template 53 | class ScopedPyObjectT { 54 | public: 55 | // STL compatible class to compute hash of PyObject. 56 | class Hash { 57 | public: 58 | size_t operator() (const ScopedPyObjectT& value) const { 59 | return reinterpret_cast(value.get()); 60 | } 61 | }; 62 | 63 | ScopedPyObjectT() : obj_(nullptr) {} 64 | 65 | // Takes over the reference. 66 | explicit ScopedPyObjectT(TPointer* obj) : obj_(obj) {} 67 | 68 | ScopedPyObjectT(const ScopedPyObjectT& other) { 69 | obj_ = other.obj_; 70 | Py_XINCREF(obj_); 71 | } 72 | 73 | ~ScopedPyObjectT() { 74 | // Only do anything if Python is running. If not, we get might get a 75 | // segfault when we try to decrement the reference count of the underlying 76 | // object when this destructor is run after Python itself has cleaned up. 77 | // https://bugs.python.org/issue17703 78 | if (Py_IsInitialized()) { 79 | reset(nullptr); 80 | } 81 | } 82 | 83 | static ScopedPyObjectT NewReference(TPointer* obj) { 84 | Py_XINCREF(obj); 85 | return ScopedPyObjectT(obj); 86 | } 87 | 88 | TPointer* get() const { return obj_; } 89 | 90 | bool is_null() const { return obj_ == nullptr; } 91 | 92 | ScopedPyObjectT& operator= (const ScopedPyObjectT& other) { 93 | if (obj_ == other.obj_) { 94 | return *this; 95 | } 96 | 97 | Py_XDECREF(obj_); 98 | obj_ = other.obj_; 99 | Py_XINCREF(obj_); 100 | 101 | return *this; 102 | } 103 | 104 | bool operator== (TPointer* other) const { 105 | return obj_ == other; 106 | } 107 | 108 | bool operator!= (TPointer* other) const { 109 | return obj_ != other; 110 | } 111 | 112 | bool operator== (const ScopedPyObjectT& other) const { 113 | return obj_ == other.obj_; 114 | } 115 | 116 | bool operator!= (const ScopedPyObjectT& other) const { 117 | return obj_ != other.obj_; 118 | } 119 | 120 | // Resets the ScopedPyObject, releasing the reference to the 121 | // underlying python object. Claims the reference to the new object, 122 | // if it is non-NULL. 123 | void reset(TPointer* obj) { 124 | Py_XDECREF(obj_); 125 | obj_ = obj; 126 | } 127 | 128 | // Releases the reference to the underlying python object. 
This 129 | // does not decrement the reference count. This function should be 130 | // used when the reference is being passed to some other function, 131 | // class, etc. The return value of this function is the underlying 132 | // Python object itself. 133 | TPointer* release() { 134 | TPointer* ret_val = obj_; 135 | obj_ = nullptr; 136 | return ret_val; 137 | } 138 | 139 | // Swaps the underlying python objects for two ScopedPyObjects. 140 | void swap(const ScopedPyObjectT& other) { 141 | std::swap(obj_, other.obj_); 142 | } 143 | 144 | private: 145 | // The underlying python object for which we hold a reference. Can be nullptr. 146 | TPointer* obj_; 147 | }; 148 | 149 | typedef ScopedPyObjectT ScopedPyObject; 150 | typedef ScopedPyObjectT ScopedPyCodeObject; 151 | 152 | // Helper class to call "PyThreadState_Swap" and revert it back to the 153 | // previous thread in destructor. 154 | class ScopedThreadStateSwap { 155 | public: 156 | explicit ScopedThreadStateSwap(PyThreadState* thread_state) 157 | : prev_thread_state_(PyThreadState_Swap(thread_state)) {} 158 | 159 | ~ScopedThreadStateSwap() { 160 | PyThreadState_Swap(prev_thread_state_); 161 | } 162 | 163 | private: 164 | PyThreadState* const prev_thread_state_; 165 | 166 | DISALLOW_COPY_AND_ASSIGN(ScopedThreadStateSwap); 167 | }; 168 | 169 | // Enumerates code object line table. 170 | // Usage example: 171 | // CodeObjectLinesEnumerator e; 172 | // while (enumerator.Next()) { 173 | // LOG(INFO) << "Line " << e.line_number() << " @ " << e.offset(); 174 | // } 175 | class CodeObjectLinesEnumerator { 176 | public: 177 | // Does not change reference count of "code_object". 178 | explicit CodeObjectLinesEnumerator(PyCodeObject* code_object); 179 | 180 | // Uses explicitly provided line table. 181 | CodeObjectLinesEnumerator(int firstlineno, PyObject* linedata); 182 | 183 | // Moves over to the next entry in code object line table. 184 | bool Next(); 185 | 186 | // Gets the bytecode offset of the current line. 187 | int32_t offset() const { return offset_; } 188 | 189 | // Gets the current source code line number. 190 | int32_t line_number() const { return line_number_; } 191 | 192 | private: 193 | void Initialize(int firstlineno, PyObject* linedata); 194 | 195 | private: 196 | // Bytecode offset of the current line. 197 | int32_t offset_; 198 | 199 | // Current source code line number 200 | int32_t line_number_; 201 | 202 | #if PY_VERSION_HEX < 0x030A0000 203 | // Number of remaining entries in line table. 204 | int remaining_entries_; 205 | 206 | // Pointer to the next entry of line table. 207 | const uint8_t* next_entry_; 208 | 209 | #else 210 | // Current address range in the linetable data. 211 | PyCodeAddressRange range_; 212 | 213 | #endif 214 | DISALLOW_COPY_AND_ASSIGN(CodeObjectLinesEnumerator); 215 | }; 216 | 217 | 218 | template 219 | bool operator== (TPointer* ref1, const ScopedPyObjectT& ref2) { 220 | return ref2 == ref1; 221 | } 222 | 223 | 224 | template 225 | bool operator!= (TPointer* ref1, const ScopedPyObjectT& ref2) { 226 | return ref2 != ref1; 227 | } 228 | 229 | 230 | // Sets the debuglet's Python module object. Should only be called during 231 | // initialization. 232 | void SetDebugletModule(PyObject* module); 233 | 234 | // Gets the debuglet's Python module object. Returns borrowed reference. 235 | PyObject* GetDebugletModule(); 236 | 237 | // Default value for "PyTypeObject" with no methods. Size, initialization and 238 | // cleanup routines are filled in by RegisterPythonType method. 
239 | PyTypeObject DefaultTypeDefinition(const char* type_name); 240 | 241 | // Registers a custom Python type. Does not take ownership over "type". 242 | // "type" has to stay unchanged throughout the Python module lifetime. 243 | bool RegisterPythonType(PyTypeObject* type); 244 | 245 | template 246 | int DefaultPythonTypeInit(PyObject* self, PyObject* args, PyObject* kwds) { 247 | PyObjectWrapper* wrapper = reinterpret_cast(self); 248 | wrapper->data = new T; 249 | 250 | return 0; 251 | } 252 | 253 | template 254 | void DefaultPythonTypeDestructor(PyObject* self) { 255 | PyObjectWrapper* wrapper = reinterpret_cast(self); 256 | delete reinterpret_cast(wrapper->data); 257 | 258 | PyObject_Del(self); 259 | } 260 | 261 | template 262 | bool RegisterPythonType() { 263 | // Set defaults for the native type. 264 | if (T::python_type_.tp_basicsize == 0) { 265 | T::python_type_.tp_basicsize = sizeof(PyObjectWrapper); 266 | } 267 | 268 | if ((T::python_type_.tp_init == nullptr) && 269 | (T::python_type_.tp_dealloc == nullptr)) { 270 | T::python_type_.tp_init = DefaultPythonTypeInit; 271 | T::python_type_.tp_dealloc = DefaultPythonTypeDestructor; 272 | } 273 | 274 | return RegisterPythonType(&T::python_type_); 275 | } 276 | 277 | 278 | // Safe cast of PyObject to a native C++ object. Returns nullptr if "obj" is 279 | // nullptr or a different type. 280 | template 281 | T* py_object_cast(PyObject* obj) { 282 | if (obj == nullptr) { 283 | return nullptr; 284 | } 285 | 286 | if (Py_TYPE(obj) != &T::python_type_) { 287 | DCHECK(false); 288 | return nullptr; 289 | } 290 | 291 | return reinterpret_cast( 292 | reinterpret_cast(obj)->data); 293 | } 294 | 295 | 296 | // Creates a new native Python object. 297 | template 298 | ScopedPyObject NewNativePythonObject() { 299 | PyObject* new_object = PyObject_New(PyObject, &T::python_type_); 300 | if (new_object == nullptr) { 301 | return ScopedPyObject(); // return nullptr. 302 | } 303 | 304 | if (T::python_type_.tp_init(new_object, nullptr, nullptr) < 0) { 305 | PyObject_Del(new_object); 306 | return ScopedPyObject(); // return nullptr. 307 | } 308 | 309 | return ScopedPyObject(new_object); 310 | } 311 | 312 | // Checks whether the previous call generated an exception. If not, returns 313 | // nullptr. Otherwise formats the exception to string. 314 | Nullable ClearPythonException(); 315 | 316 | // Gets Python object from dictionary of a native module. Returns nullptr if not 317 | // found. In case of success returns borrowed reference. 318 | PyObject* GetDebugletModuleObject(const char* key); 319 | 320 | // Formats the name and the origin of the code object for logging. 321 | std::string CodeObjectDebugString(PyCodeObject* code_object); 322 | 323 | // Reads Python string as a byte array. The function does not verify that 324 | // "obj" is of a string type. 325 | std::vector PyBytesToByteArray(PyObject* obj); 326 | 327 | // Creates a new tuple by appending "items" to elements in "tuple". 
328 | ScopedPyObject AppendTuple(
329 |     PyObject* tuple,
330 |     const std::vector<PyObject*>& items);
331 |
332 | }  // namespace cdbg
333 | }  // namespace devtools
334 |
335 | #endif  // DEVTOOLS_CDBG_DEBUGLETS_PYTHON_PYTHON_UTIL_H_
336 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Python Snapshot Debugger Agent
2 |
3 | [Snapshot debugger](https://github.com/GoogleCloudPlatform/snapshot-debugger/)
4 | agent for Python 3.6, Python 3.7, Python 3.8, Python 3.9, and Python 3.10.
5 |
6 |
7 | ## Project Status: Archived
8 |
9 | This project has been archived and is no longer supported. There will be no
10 | further bug fixes or security patches. The repository can be forked by users
11 | if they want to maintain it going forward.
12 |
13 |
14 | ## Overview
15 |
16 | Snapshot Debugger lets you inspect the state
17 | of a running cloud application, at any code location, without stopping or
18 | slowing it down. It is not your traditional process debugger but rather an
19 | always on, whole app debugger taking snapshots from any instance of the app.
20 |
21 | Snapshot Debugger is safe for use with production apps or during development. The
22 | Python debugger agent adds only a few milliseconds to the request latency when a
23 | debug snapshot is captured. In most cases, this is not noticeable to users.
24 | Furthermore, the Python debugger agent does not allow modification of
25 | application state in any way, and has close to zero impact on the app instances.
26 |
27 | Snapshot Debugger attaches to all instances of the app providing the ability to
28 | take debug snapshots and add logpoints. A snapshot captures the call-stack and
29 | variables from any one instance that executes the snapshot location. A logpoint
30 | writes a formatted message to the application log whenever any instance of the
31 | app executes the logpoint location.
32 |
33 | The Python debugger agent is only supported on Linux at the moment. It was
34 | tested on Debian Linux, but it should work on other distributions as well.
35 |
36 | Snapshot Debugger consists of 3 primary components:
37 |
38 | 1. The Python debugger agent (this repo implements one for CPython 3.6,
39 |    3.7, 3.8, 3.9, and 3.10).
40 | 2. A Firebase Realtime Database for storing and managing snapshots/logpoints.
41 |    Explore the
42 |    [schema](https://github.com/GoogleCloudPlatform/snapshot-debugger/blob/main/docs/SCHEMA.md).
43 | 3. User interface, including a command line interface
44 |    [`snapshot-dbg-cli`](https://pypi.org/project/snapshot-dbg-cli/) and a
45 |    [VSCode extension](https://github.com/GoogleCloudPlatform/snapshot-debugger/tree/main/snapshot_dbg_extension)
46 |
47 | ## Installation
48 |
49 | The easiest way to install the Python Cloud Debugger is with PyPI:
50 |
51 | ```shell
52 | pip install google-python-cloud-debugger
53 | ```
54 |
55 | You can also build the agent from source code:
56 |
57 | ```shell
58 | git clone https://github.com/GoogleCloudPlatform/cloud-debug-python.git
59 | cd cloud-debug-python/src/
60 | ./build.sh
61 | pip install dist/google_python_cloud_debugger-*.whl
62 | ```
63 |
64 | Note that the build script assumes some dependencies.
To install these 65 | dependencies on Debian, run this command: 66 | 67 | ```shell 68 | sudo apt-get -y -q --no-install-recommends install \ 69 | curl ca-certificates gcc build-essential cmake \ 70 | python3 python3-dev python3-pip 71 | ``` 72 | 73 | If the desired target version of Python is not the default version of 74 | the 'python3' command on your system, run the build script as `PYTHON=python3.x 75 | ./build.sh`. 76 | 77 | ### Alpine Linux 78 | 79 | The Python agent is not regularly tested on Alpine Linux, and support will be on 80 | a best effort basis. The [Dockerfile](alpine/Dockerfile) shows how to build a 81 | minimal image with the agent installed. 82 | 83 | ## Setup 84 | 85 | ### Google Cloud Platform 86 | 87 | 1. First, make sure that the VM has the 88 | [required scopes](https://github.com/GoogleCloudPlatform/snapshot-debugger/blob/main/docs/configuration.md#access-scopes). 89 | 90 | 2. Install the Python debugger agent as explained in the 91 | [Installation](#installation) section. 92 | 93 | 3. Enable the debugger in your application: 94 | 95 | ```python 96 | # Attach Python Cloud Debugger 97 | try: 98 | import googleclouddebugger 99 | googleclouddebugger.enable(module='[MODULE]', version='[VERSION]') 100 | except ImportError: 101 | pass 102 | ``` 103 | 104 | Where: 105 | 106 | * `[MODULE]` is the name of your app. This, along with the version, is 107 | used to identify the debug target in the UI.
108 | Example values: `MyApp`, `Backend`, or `Frontend`. 109 | 110 | * `[VERSION]` is the app version (for example, the build ID). The UI 111 | displays the running version as `[MODULE] - [VERSION]`.
112 | Example values: `v1.0`, `build_147`, or `v20170714`. 113 | 114 | ### Outside Google Cloud Platform 115 | 116 | To use the Python debugger agent on machines not hosted by Google Cloud 117 | Platform, you must set up credentials to authenticate with Google Cloud APIs. By 118 | default, the debugger agent tries to find the [Application Default 119 | Credentials](https://cloud.google.com/docs/authentication/production) on the 120 | system. This can either be from your personal account or a dedicated service 121 | account. 122 | 123 | #### Personal Account 124 | 125 | 1. Set up Application Default Credentials through 126 | [gcloud](https://cloud.google.com/sdk/gcloud/reference/auth/application-default/login). 127 | 128 | ```shell 129 | gcloud auth application-default login 130 | ``` 131 | 132 | 2. Follow the rest of the steps in the [GCP](#google-cloud-platform) section. 133 | 134 | #### Service Account 135 | 136 | 1. Use the Google Cloud Console Service Accounts 137 | [page](https://console.cloud.google.com/iam-admin/serviceaccounts/project) 138 | to create a credentials file for an existing or new service account. The 139 | service account must have at least the `roles/firebasedatabase.admin` role. 140 | 141 | 2. Once you have the service account credentials JSON file, deploy it alongside 142 | the Python debugger agent. 143 | 144 | 3. Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. 145 | 146 | ```shell 147 | export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json 148 | ``` 149 | 150 | Alternatively, you can provide the path to the credentials file directly to 151 | the debugger agent. 152 | 153 | ```python 154 | # Attach Python Cloud Debugger 155 | try: 156 | import googleclouddebugger 157 | googleclouddebugger.enable( 158 | module='[MODULE]', 159 | version='[VERSION]', 160 | service_account_json_file='/path/to/credentials.json') 161 | except ImportError: 162 | pass 163 | ``` 164 | 4. Follow the rest of the steps in the [GCP](#google-cloud-platform) section. 165 | 166 | ### Django Web Framework 167 | 168 | You can use the Cloud Debugger to debug Django web framework applications. 169 | 170 | The best way to enable the Cloud Debugger with Django is to add the following 171 | code fragment to your `manage.py` file: 172 | 173 | ```python 174 | # Attach the Python Cloud debugger (only the main server process). 175 | if os.environ.get('RUN_MAIN') or '--noreload' in sys.argv: 176 | try: 177 | import googleclouddebugger 178 | googleclouddebugger.enable(module='[MODULE]', version='[VERSION]') 179 | except ImportError: 180 | pass 181 | ``` 182 | 183 | Alternatively, you can pass the `--noreload` flag when running the Django 184 | `manage.py` and use any one of the option A and B listed earlier. Note that 185 | using the `--noreload` flag disables the autoreload feature in Django, which 186 | means local changes to files will not be automatically picked up by Django. 187 | 188 | ## Historical note 189 | 190 | Version 3.x of this agent supported both the now shutdown Cloud Debugger service 191 | (by default) and the 192 | [Snapshot Debugger](https://github.com/GoogleCloudPlatform/snapshot-debugger/) 193 | (Firebase RTDB backend) by setting the `use_firebase` flag to true. Version 4.0 194 | removed support for the Cloud Debugger service, making the Snapshot Debugger the 195 | default. To note the `use_firebase` flag is now obsolete, but still present for 196 | backward compatibility. 
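For example, an existing call site that still passes the obsolete flag continues to work
unchanged (a minimal sketch using the same `[MODULE]`/`[VERSION]` placeholders as above):

```python
# Attach Python Cloud Debugger; the obsolete use_firebase flag is still
# accepted for backward compatibility.
try:
  import googleclouddebugger
  googleclouddebugger.enable(
      module='[MODULE]', version='[VERSION]', use_firebase=True)
except ImportError:
  pass
```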
197 | 198 | ## Flag Reference 199 | 200 | The agent offers various flags to configure its behavior. Flags can be specified 201 | as keyword arguments: 202 | 203 | ```python 204 | googleclouddebugger.enable(flag_name='flag_value') 205 | ``` 206 | 207 | or as command line arguments when running the agent as a module: 208 | 209 | ```shell 210 | python -m googleclouddebugger --flag_name=flag_value -- myapp.py 211 | ``` 212 | 213 | The following flags are available: 214 | 215 | `module`: A name for your app. This, along with the version, is used to identify 216 | the debug target in the UI.
217 | Example values: `MyApp`, `Backend`, or `Frontend`. 218 | 219 | `version`: A version for your app. The UI displays the running version as 220 | `[MODULE] - [VERSION]`.
221 | If not provided, the UI will display the generated debuggee ID instead.
222 | Example values: `v1.0`, `build_147`, or `v20170714`.
223 |
224 | `service_account_json_file`: Path to JSON credentials of a [service
225 | account](https://cloud.google.com/iam/docs/service-accounts) to use for
226 | authentication. If not provided, the agent will fall back to [Application
227 | Default Credentials](https://cloud.google.com/docs/authentication/production)
228 | which are automatically available on machines hosted on GCP, or can be set via
229 | `gcloud auth application-default login` or the `GOOGLE_APPLICATION_CREDENTIALS`
230 | environment variable.
231 |
232 | `firebase_db_url`: URL pointing to a configured Firebase Realtime Database for
233 | the agent to use to store snapshot data.
234 | https://**PROJECT_ID**-cdbg.firebaseio.com will be used if not provided, where
235 | **PROJECT_ID** is your project ID.
236 |
237 | ## Development
238 |
239 | The following instructions are intended to help with modifying the codebase.
240 |
241 | ### Testing
242 |
243 | #### Unit tests
244 |
245 | Run the `build_and_test.sh` script from the root of the repository to build and
246 | run the unit tests using the locally installed version of Python.
247 |
248 | Run `bazel test tests/cpp:all` from the root of the repository to run unit
249 | tests against the C++ portion of the codebase.
250 |
251 | #### Local development
252 |
253 | You may want to run an agent with local changes in an application in order to
254 | validate functionality in a way that unit tests don't fully cover. To do this,
255 | you will need to build the agent:
256 | ```
257 | cd src
258 | ./build.sh
259 | cd ..
260 | ```
261 |
262 | The built agent will be available in the `src/dist` directory. You can now
263 | force the installation of the agent using:
264 | ```
265 | pip3 install src/dist/* --force-reinstall
266 | ```
267 |
268 | You can now run your test application using the development build of the agent
269 | in whatever way you desire.
270 |
271 | It is recommended that you do this within a
272 | [virtual environment](https://docs.python.org/3/library/venv.html).
273 |
274 | ### Build & Release (for project owners)
275 |
276 | Before performing a release, be sure to update the version number in
277 | `src/googleclouddebugger/version.py`. Tag the commit that increments the
278 | version number (e.g. `v3.1`) and create a GitHub release.
279 |
280 | Run the `build-dist.sh` script from the root of the repository to build,
281 | test, and generate the distribution wheels. You may need to use `sudo`
282 | depending on your system's Docker setup.
283 |
284 | Build artifacts will be placed in `/dist` and can be pushed to PyPI by running:
285 | ```
286 | twine upload dist/*.whl
287 | ```
288 |
--------------------------------------------------------------------------------