├── CHANGELOG.md ├── CONTRIBUTING.md ├── LICENSE ├── MANIFEST.in ├── README.md ├── googlecloudprofiler ├── __init__.py ├── __version__.py ├── backoff.py ├── builder.py ├── client.py ├── cpu_profiler.py ├── profile_pb2.py ├── pythonprofiler.py └── src │ ├── _profiler.cc │ ├── clock.cc │ ├── clock.h │ ├── log.cc │ ├── log.h │ ├── populate_frames.cc │ ├── populate_frames.h │ ├── profiler.cc │ ├── profiler.h │ ├── stacktraces.cc │ └── stacktraces.h ├── kokoro ├── common.cfg ├── continuous.cfg ├── integration_test.go ├── integration_test.sh ├── presubmit.cfg └── release.cfg └── setup.py /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | ## 4.1.0 2 | 3 | ### Features 4 | 5 | * feat: support Python 3.11 by using `_PyInternalFrame` directly 6 | ([244e7ca](https://github.com/GoogleCloudPlatform/cloud-profiler-python/244e7ca370fc2261980626394f2d2f91ab180fd2)) 7 | 8 | ### Bug Fixes 9 | 10 | * fix: update Go installation method 11 | ([ae0c31b](https://github.com/GoogleCloudPlatform/cloud-profiler-python/ae0c31b7ed962c8158e8a24ba44be6442c7374f8)) 12 | * fix(deps): Add protobuf version lower bound 13 | ([ea964fc](https://github.com/GoogleCloudPlatform/cloud-profiler-python/ea964fc72095ca60492e7e5e1e4a45b01a91eccc)) 14 | 15 | ### Internal / Testing Changes 16 | 17 | * chore: add aaronabbott@ as a python agent author 18 | ([ca0bc43](https://github.com/GoogleCloudPlatform/cloud-profiler-python/ca0bc431498cd56f75e6eb474b2f4cfb37dc4818)) 19 | 20 | ## 4.0.0 21 | 22 | ### ⚠ BREAKING CHANGES 23 | 24 | * fix(deps)!: make agent compatible with protobuf v21.1 25 | ([8d2ad1e](https://github.com/GoogleCloudPlatform/cloud-profiler-python/8d2ad1ee0cfe5b15d346040442c28433c1786e28)) 26 | 27 | ## 3.1.0 28 | 29 | ### Features 30 | 31 | * fix: relax service name regexp to allow service name to start with number 32 | ([179adcb](https://github.com/GoogleCloudPlatform/cloud-profiler-python/179adcb407bf84a74a935b753e02aedf7b007138)) 33 | 34 | ### 3.0.8 
35 | 36 | ### Internal / Testing Changes 37 | 38 | * test: make integration test use Go 1.17.7 39 | ([6996cf6](https://github.com/GoogleCloudPlatform/cloud-profiler-python/6996cf6eea8ba814abab5ff625ca5a03b09dbc08)) 40 | 41 | * test: Use Python 3.6 specific get-pip.py when testing with Python 3.6. 42 | ([70f93b5](https://github.com/GoogleCloudPlatform/cloud-profiler-python/70f93b53187e074f5fd354a9f1fd19e25de79a6d)) 43 | 44 | ### 3.0.7 45 | 46 | ### Bug Fixes 47 | 48 | * fix: rollback workaround for certification issue 49 | ([ca588f5](https://github.com/GoogleCloudPlatform/cloud-profiler-python/ca588f58081258b1259a2564f2a1614c6c949495)) 50 | 51 | ### 3.0.6 52 | 53 | ### Bug Fixes 54 | 55 | * fix: workaround certificate expiration issue in integration tests 56 | ([80f423f](https://github.com/GoogleCloudPlatform/cloud-profiler-python/80f423f439cbc780d2da8930abc0b99308378abb)) 57 | 58 | ### Internal / Testing Changes 59 | 60 | * chore: log most errors at warning level. 61 | ([8147311](https://github.com/GoogleCloudPlatform/cloud-profiler-python/814731125216fd1c332c9d90074635485e8ac62f)) 62 | 63 | ### 3.0.5 64 | 65 | ### Documentation 66 | 67 | * doc: update the changelog for release of 3.0.4 68 | ([57b45cf](https://github.com/GoogleCloudPlatform/cloud-profiler-python/57b45cf2a72333063bc64c3dc636098e0571e8cf)) 69 | 70 | ### Internal / Testing Changes 71 | 72 | * test: display environment variables when encountering an error 73 | ([ad2ce5b](https://github.com/GoogleCloudPlatform/cloud-profiler-python/ad2ce5bad286fd7f1dddee741babb5a374339518)) 74 | 75 | * test: temporarily disable testing for Python 3.10 until 76 | https://github.com/pypa/pip/issues/9951 is resolved 77 | ([4197241](https://github.com/GoogleCloudPlatform/cloud-profiler-python/41972412bb45e484552bac803bf1319222224415)) 78 | 79 | * chore: make CHANGELOG.md a top-level file 80 | ([058e646](https://github.com/GoogleCloudPlatform/cloud-profiler-python/058e6467a217f48b2155b2f31336fcd4e7fb4030)) 81 | 82 | ### 
3.0.4 83 | 84 | ### Dependencies 85 | 86 | * chore: requires google-api-python-client != 2.0.2 to avoid private API 87 | incompatibility issue 88 | ([5738fe8](https://github.com/GoogleCloudPlatform/cloud-profiler-python/5738fe8e2a68dee548c0b4ba9465bfa48d019706)) 89 | 90 | ### Documentation 91 | 92 | * doc: add CHANGELOG.md file 93 | ([dabfdd6](https://github.com/GoogleCloudPlatform/cloud-profiler-python/dabfdd6cdd8c3a181c4d8dec607a7e907e4fac7e)) 94 | 95 | ### 3.0.3 96 | 97 | ### Bug Fixes 98 | 99 | * Fix the import issue that breaks the Python agent on macOS. 100 | ([a254dd6](https://github.com/GoogleCloudPlatform/cloud-profiler-python/a254dd60eb871332789d9b10d0cb97a35e82cbc9)) 101 | 102 | ### 3.0.2 103 | 104 | ### Internal / Testing Changes 105 | 106 | * Add integration tests to officially support 3.8 and 3.9. 107 | ([58eeb62](https://github.com/GoogleCloudPlatform/cloud-profiler-python/58eeb622d44919e0ab622dbfa90b3e75888c9b04)) 108 | 109 | ### 3.0.1 110 | 111 | ### Bug Fixes 112 | 113 | * Use google-api-python-client < 2.0.0 since latest version is not compatible 114 | with test endpoints. 115 | ([5f6459a](https://github.com/GoogleCloudPlatform/cloud-profiler-python/5f6459ac968195890bccf918b19959b3f5ed317d)) 116 | 117 | ## 3.0.0 118 | 119 | ### ⚠ BREAKING CHANGES 120 | 121 | * Drop support for Python version prior to 3.6. 122 | ([c64557c](https://github.com/GoogleCloudPlatform/cloud-profiler-python/c64557c8c13cf84f38edeb70080c0db6dd3b2bac)) 123 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # How to Contribute 2 | 3 | We'd love to accept your patches and contributions to this project. There are 4 | just a few small guidelines you need to follow. 5 | 6 | ## Contributor License Agreement 7 | 8 | Contributions to this project must be accompanied by a Contributor License 9 | Agreement. 
You (or your employer) retain the copyright to your contribution; 10 | this simply gives us permission to use and redistribute your contributions as 11 | part of the project. Head over to to see 12 | your current agreements on file or to sign a new one. 13 | 14 | You generally only need to submit a CLA once, so if you've already submitted one 15 | (even if it was for a different project), you probably don't need to do it 16 | again. 17 | 18 | ## Code reviews 19 | 20 | All submissions, including submissions by project members, require review. We 21 | use GitHub pull requests for this purpose. Consult 22 | [GitHub Help](https://help.github.com/articles/about-pull-requests/) for more 23 | information on using pull requests. 24 | 25 | ## Community Guidelines 26 | 27 | This project follows 28 | [Google's Open Source Community Guidelines](https://opensource.google.com/conduct/). 29 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. 
For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 
48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include MANIFEST.in 2 | include LICENSE 3 | 4 | # Header files are not by default included in the source distribution created 5 | # by sdist. 6 | recursive-include googlecloudprofiler *.h -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Google Cloud Python profiling agent 2 | 3 | Python profiling agent for 4 | [Google Cloud Profiler](https://cloud.google.com/profiler/). 5 | 6 | See 7 | [Google Cloud Profiler profiling Python code](https://cloud.google.com/profiler/docs/profiling-python) 8 | for detailed documentation. 9 | 10 | ## Supported OS 11 | 12 | Linux. Profiling Python applications is supported for Linux kernels whose 13 | standard C library is implemented with `glibc` or with `musl`. 
For configuration 14 | information specific to Linux Alpine kernels, see 15 | [Running on Linux Alpine](https://cloud.google.com/profiler/docs/profiling-python#running_with_linux_alpine). 16 | 17 | ## Supported Python Versions 18 | 19 | Python >= 3.7 and <= 3.11 20 | 21 | ## Installation & usage 22 | 23 | 1. Install the profiler package using PyPI: 24 | 25 | ```shell 26 | pip3 install google-cloud-profiler 27 | ``` 28 | 29 | 2. Enable the profiler in your application: 30 | 31 | ```python 32 | import googlecloudprofiler 33 | 34 | def main(): 35 | # Profiler initialization. It starts a daemon thread which continuously 36 | # collects and uploads profiles. Best done as early as possible. 37 | try: 38 | googlecloudprofiler.start( 39 | service='hello-profiler', 40 | service_version='1.0.1', 41 | # verbose is the logging level. 0-error, 1-warning, 2-info, 42 | # 3-debug. It defaults to 0 (error) if not set. 43 | verbose=3, 44 | # project_id must be set if not running on GCP. 45 | # project_id='my-project-id', 46 | ) 47 | except (ValueError, NotImplementedError) as exc: 48 | print(exc) # Handle errors here 49 | ``` 50 | 51 | ## Installation on Linux Alpine 52 | 53 | The Python profiling agent has a native component. The base Alpine image for 54 | Python does not have all dependencies required to build this native component 55 | installed. To build the Python profiling agent on Alpine, one must install the 56 | package `build-base`. 57 | 58 | To use the Python profiling agent on Alpine without installing additional 59 | dependencies on to the final Alpine image, one can use a two-stage build and 60 | compile the Python profiling agent in the first stage. 61 | 62 | Here is an example of a Docker image that uses a multi-stage build to compile 63 | and install the Python profiling agent: 64 | 65 | ``` 66 | FROM python:3.7-alpine as builder 67 | 68 | # Install build-base to allow for compilation of the profiling agent. 
69 | RUN apk add --update --no-cache build-base 70 | 71 | # Compile the profiling agent, generating wheels for it. 72 | RUN pip3 wheel --wheel-dir=/tmp/wheels google-cloud-profiler 73 | 74 | 75 | FROM python:3.7-alpine 76 | 77 | # Copy over the directory containing wheels for the profiling agent. 78 | COPY --from=builder /tmp/wheels /tmp/wheels 79 | 80 | # Install the profiling agent. 81 | RUN pip3 install --no-index --find-links=/tmp/wheels google-cloud-profiler 82 | 83 | # Install any other required modules or dependencies, and copy an app which 84 | # enables the profiler as described in "Enable the profiler in your 85 | # application". 86 | COPY ./bench.py . 87 | 88 | # Run the application when the docker image is run, using either CMD (as is done 89 | # here) or ENTRYPOINT. 90 | CMD python3 -u bench.py 91 | ``` 92 | 93 | 94 | ## Troubleshooting 95 | 96 | ### Resource temporarily unavailable errors with Python 97 | 98 | If you see the following log entries after enabling the Profiler: 99 | 100 | ``` 101 | BlockingIOError: [Errno 11] Resource temporarily unavailable 102 | Exception ignored when trying to write to the signal wakeup fd 103 | ``` 104 | 105 | see https://cloud.google.com/profiler/docs/troubleshooting#python-blocking for 106 | the cause and the workaround. 107 | -------------------------------------------------------------------------------- /googlecloudprofiler/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Init module for Python Cloud Profiler.""" 15 | 16 | import logging 17 | import sys 18 | from googlecloudprofiler import __version__ as version 19 | from googlecloudprofiler import client 20 | 21 | _started = False 22 | 23 | logger = logging.getLogger(__name__) 24 | 25 | 26 | def start(service=None, 27 | service_version=None, 28 | project_id=None, 29 | service_account_json_file=None, 30 | verbose=0, 31 | disable_cpu_profiling=False, 32 | disable_wall_profiling=False, 33 | period_ms=10, 34 | discovery_service_url=None): 35 | """Starts the profiler. 36 | 37 | This function starts a daemon thread which polls the profiler server for 38 | instructions, and collects and uploads profiles as requested. It should only 39 | be called once. Subsequent calls will be ignored. If wall profiling is 40 | enabled, this function must be called on the main thread. 41 | 42 | Args: 43 | service: A string specifying the name of the service under which the 44 | profiled data will be recorded and exposed at the profiler UI for the 45 | project. The string should be the same across different replicas of your 46 | service so that the globally constant profiling rate is maintained. Do not 47 | put things like PID or unique pod ID in the name. The string must match 48 | the regular expression '^[a-z0-9]([-a-z0-9_.]{0,253}[a-z0-9])?$'. 
When not 49 | specified, the value of GAE_SERVICE environment variable will be used, 50 | which is set for applications running on Google App Engine; if GAE_SERVICE 51 | is not set, the value of K_SERVICE environment variable, which is set on 52 | Knative containers, will be used. If specified neither here nor via an 53 | environment variable, a value error will be raised. 54 | service_version: An optional string specifying the version of the service. 55 | It can be an arbitrary string. Profiler profiles once per minute for each 56 | version of each service in each zone. It defaults to GAE_VERSION 57 | environment variable if that is set, to K_REVISION environment variable if 58 | that is set and GAE_VERSION is not set, and to empty string otherwise. 59 | project_id: A string specifying the cloud project ID. When not specified, 60 | the value can be read from the credential file or otherwise read from the 61 | VM metadata server. If specified neither here nor via the environment, a 62 | value error will be raised. 63 | service_account_json_file: An optional string providing the path to the 64 | service account json file. If not provided, application default 65 | credentials are used. 66 | verbose: An optional int specifying the logging level. Logging messages 67 | which are less severe than verbose will be ignored. 0-error, 1-warn, 68 | 2-info, 3-debug. Defaults to error. 69 | disable_cpu_profiling: An optional bool specifying whether or not the CPU 70 | time profiling should be disabled. CPU profiling is only supported for 71 | Python 3.2 or higher. This flag is ignored on unsupported Python versions. 72 | Defaults to False. 73 | disable_wall_profiling: An optional bool specifying whether or not the Wall 74 | time profiling should be disabled. Wall profiling is supported for Python 75 | 2 and Python 3.6 and higher. This flag is ignored on unsupported Python 76 | versions. It defaults to False for the supported versions. 
The current 77 | wall time profiling avoids dependency on any native code by using the Python 78 | signal module. It only profiles the main thread. The start function must 79 | be called from the main thread if wall time profiling is enabled. 80 | Using the SIGALRM signal after starting the profiler will cause problems: 81 | registering a handler for SIGALRM will prevent the profiler from 82 | working. SIGALRM will be triggered by the profiler at unpredictable times. 83 | Wall profiling has some other limitations as documented in the 84 | pythonprofiler module. 85 | period_ms: An optional integer specifying the sampling interval in 86 | milliseconds. Applies to both CPU profiling and wall profiling. Defaults 87 | to 10. 88 | discovery_service_url: Optional discovery service URL override. Only useful 89 | to developers of the profiler (to specify API key to use with a testing 90 | API endpoint). 91 | 92 | Raises: 93 | ValueError: If arguments are invalid or if necessary information can't be 94 | determined from the environment and arguments. Or if service name doesn't 95 | match '^[a-z0-9]([-a-z0-9_.]{0,253}[a-z0-9])?$'. Or if called from 96 | a non-main thread when Wall time profiling is enabled. Or if no profiling 97 | mode is enabled. 98 | NotImplementedError: If not run on Linux or Mac. 99 | """ 100 | global _started 101 | if _started: 102 | logger.warning('googlecloudprofiler.start() called again after it was ' 103 | 'previously called. This function should only be called ' 104 | 'once. This call is ignored.') 105 | return 106 | 107 | # Adds a StreamHandler with a default Formatter to the root logger. 108 | # It does nothing if the root logger already has handlers. 109 | logging.basicConfig() 110 | 111 | if not (sys.platform.startswith('linux') or 112 | sys.platform.startswith('darwin')): 113 | raise NotImplementedError('%s OS is not supported.' 
% (sys.platform)) 114 | 115 | logging_level = [logging.ERROR, logging.WARNING, logging.INFO, 116 | logging.DEBUG][min(verbose, 3)] 117 | logger.setLevel(logging_level) 118 | 119 | if sys.version_info < (3, 2): 120 | logger.warning( 121 | 'Python version %d.%d is not supported. Minimum supported ' 122 | 'Python version is 3.2.', sys.version_info[0], sys.version_info[1]) 123 | 124 | profiler_client = client.Client() 125 | project_id = profiler_client.setup_auth(project_id, service_account_json_file) 126 | profiler_client.config(project_id, service, service_version, 127 | disable_cpu_profiling, disable_wall_profiling, 128 | period_ms, discovery_service_url) 129 | logger.info('Google Cloud Profiler Python agent version: %s', 130 | version.__version__) 131 | profiler_client.start() 132 | 133 | _started = True 134 | -------------------------------------------------------------------------------- /googlecloudprofiler/__version__.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | 15 | # pylint: skip-file 16 | """Version of Python Cloud Profiler module.""" 17 | 18 | # setup.py reads the version information from here to set package version 19 | __version__ = '4.1.0' 20 | -------------------------------------------------------------------------------- /googlecloudprofiler/backoff.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Implements profiler backoff.""" 15 | 16 | import errno 17 | import json 18 | import logging 19 | import random 20 | import googleapiclient 21 | from google.protobuf import duration_pb2 22 | from google.protobuf import json_format 23 | 24 | logger = logging.getLogger(__name__) 25 | _NANOS_PER_SEC = 1000 * 1000 * 1000 26 | 27 | 28 | class Backoff: 29 | """This class calculates the backoff duration for a failed request. 30 | 31 | A backoff duration specified by the server is used if it is present in the 32 | error message. Otherwise exponential backoff is used. The actual duration is a 33 | random value between 0 and an envelope. The envelope starts from the specified 34 | minimum. It is exponentially increased between subsequent failures, up to the 35 | specified maximum. 36 | """ 37 | 38 | def __init__(self, 39 | min_envelope_sec=60.0, 40 | max_envelope_sec=3600.0, 41 | multiplier=1.3): 42 | """Constructs a Backoff object. 
43 | 44 | Args: 45 | min_envelope_sec: A float specifying the initial minimum backoff duration 46 | envelope in seconds. 47 | max_envelope_sec: A float specifying the maximum backoff duration envelope 48 | in seconds. 49 | multiplier: A float specifying the factor for exponential increase. 50 | """ 51 | random.seed() 52 | self._min_envelope_sec = min_envelope_sec 53 | self._max_envelope_sec = max_envelope_sec 54 | self._multiplier = multiplier 55 | self._current_envelope_sec = min_envelope_sec 56 | 57 | def next_backoff(self, error=None): 58 | """Calculates the backoff duration for a failed request. 59 | 60 | Args: 61 | error: The exception that caused the failure. 62 | 63 | Returns: 64 | A float representing the desired backoff duration in seconds. 65 | """ 66 | try: 67 | # Add short retry period for "broken pipe" exception. See b/158130635 for 68 | # more details. 69 | if isinstance(error, OSError) and error.errno == errno.EPIPE: 70 | broken_pipe_sec = random.uniform(1, 10) 71 | logger.warning('Agent will back off for %.3f seconds due to %s', 72 | broken_pipe_sec, str(error)) 73 | return broken_pipe_sec 74 | elif isinstance(error, googleapiclient.errors.HttpError): 75 | content = json.loads(error.content.decode('utf-8')) 76 | for detail in content.get('error', {}).get('details', []): 77 | if 'retryDelay' in detail: 78 | delay = duration_pb2.Duration() 79 | json_format.Parse(json.dumps(detail['retryDelay']), delay) 80 | return delay.seconds + float(delay.nanos) / _NANOS_PER_SEC 81 | # It's safe to catch BaseException because this runs in a daemon thread. 
82 | except BaseException as e: # pylint: disable=broad-except 83 | logger.warning( 84 | 'Failed to extract server-specified backoff duration ' 85 | '(will use exponential backoff): %s', str(e)) 86 | 87 | duration = random.uniform(0, self._current_envelope_sec) 88 | self._current_envelope_sec = min( 89 | self._max_envelope_sec, self._current_envelope_sec * self._multiplier) 90 | return duration 91 | -------------------------------------------------------------------------------- /googlecloudprofiler/builder.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Builds the profile proto from call stack traces.""" 15 | 16 | import collections 17 | import gzip 18 | import io 19 | from googlecloudprofiler import profile_pb2 20 | 21 | Func = collections.namedtuple('Func', ['name', 'filename']) 22 | Loc = collections.namedtuple('Loc', ['func_id', 'line_number']) 23 | 24 | 25 | class Builder: 26 | """Builds the profile proto from call stack traces.""" 27 | 28 | def __init__(self): 29 | self._profile = profile_pb2.Profile() 30 | self._function_map = {} 31 | self._location_map = {} 32 | self._string_map = {} 33 | # string_table[0] in the profile proto must be an empty string. 
34 | self._string_id('') 35 | 36 | def populate_profile(self, traces, profile_type, period_unit, period, 37 | duration_ns): 38 | """Populates call stack traces into a profile proto. 39 | 40 | Args: 41 | traces: A dict mapping each trace to its count. A trace is a sequence of 42 | frames. The leaf frame is at trace[0]. A frame is represented as a tuple 43 | of (function name, filename, line number). 44 | profile_type: A string specifying the profile type, e.g. 'CPU' or 'WALL'. 45 | See https://github.com/google/pprof/blob/master/proto/profile.proto for 46 | possible profile types. 47 | period_unit: A string specifying the measurement unit of the sampling 48 | period, e.g. 'nanoseconds'. 49 | period: An integer specifying the interval between sampled occurrences. 50 | The measurement unit is specified by the period_unit argument. 51 | duration_ns: An integer specifying the profiling duration in nanoseconds. 52 | """ 53 | self._profile.period_type.type = self._string_id(profile_type) 54 | self._profile.period_type.unit = self._string_id(period_unit) 55 | self._profile.period = period 56 | self._profile.duration_nanos = duration_ns 57 | type1 = self._profile.sample_type.add() 58 | type1.type = self._string_id('sample') 59 | type1.unit = self._string_id('count') 60 | type2 = self._profile.sample_type.add() 61 | type2.type = self._string_id(profile_type) 62 | type2.unit = self._string_id(period_unit) 63 | 64 | for trace, count in traces.items(): 65 | sample = self._profile.sample.add() 66 | sample.value.append(count) 67 | sample.value.append(period * count) 68 | for frame in trace: 69 | # TODO: try to use named tuple for frame if it doesn't 70 | # overcomplicate the native profiler.
71 | func_id = self._function_id(frame[0], frame[1]) 72 | location_id = self._location_id(func_id, frame[2]) 73 | sample.location_id.append(location_id) 74 | 75 | def emit(self): 76 | """Returns the profile in gzip-compressed profile proto format.""" 77 | profile = self._profile.SerializeToString() 78 | out = io.BytesIO() 79 | with gzip.GzipFile(fileobj=out, mode='wb') as f: 80 | f.write(profile) 81 | return out.getvalue() 82 | 83 | def _function_id(self, name, filename): 84 | """Finds the function ID in the proto, adding the function if it does not exist yet. 85 | 86 | Args: 87 | name: A string representing the function name. 88 | filename: A string representing the file name. 89 | 90 | Returns: 91 | An integer representing the unique ID of the function in the profile 92 | proto. 93 | """ 94 | name_id = self._string_id(name) 95 | filename_id = self._string_id(filename) 96 | func = Func(name_id, filename_id) 97 | 98 | func_id = self._function_map.get(func) 99 | if func_id is None: 100 | # Function ID in profile proto must not be zero. 101 | func_id = len(self._function_map) + 1 102 | self._function_map[func] = func_id 103 | function = self._profile.function.add() 104 | function.name = name_id 105 | function.filename = filename_id 106 | function.id = func_id 107 | return func_id 108 | 109 | def _location_id(self, func_id, line_number): 110 | """Finds the location ID in the proto, adding the location if it does not exist yet. 111 | 112 | Args: 113 | func_id: An integer representing the ID of the corresponding function in 114 | the profile proto. 115 | line_number: An integer representing the line number in the source code. 116 | 117 | Returns: 118 | An integer representing the unique ID of the location in the profile 119 | proto. 120 | """ 121 | loc = Loc(func_id=func_id, line_number=line_number) 122 | 123 | location_id = self._location_map.get(loc) 124 | if location_id is None: 125 | # Location ID in profile proto must not be zero.
126 | location_id = len(self._location_map) + 1 127 | self._location_map[loc] = location_id 128 | location = self._profile.location.add() 129 | location.id = location_id 130 | line = location.line.add() 131 | line.line = line_number 132 | line.function_id = func_id 133 | return location_id 134 | 135 | def _string_id(self, value): 136 | """Finds the string ID in the proto, adding the string if it does not exist yet.""" 137 | string_id = self._string_map.get(value) 138 | if string_id is None: 139 | string_id = len(self._string_map) 140 | self._string_map[value] = string_id 141 | self._profile.string_table.append(value) 142 | return string_id 143 | -------------------------------------------------------------------------------- /googlecloudprofiler/client.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License.
14 | """Communicates with the profiler backend over HTTP.""" 15 | 16 | import base64 17 | import inspect 18 | import json 19 | import logging 20 | import os 21 | import re 22 | import sys 23 | import threading 24 | import time 25 | import traceback 26 | 27 | import google.auth 28 | from google.oauth2 import service_account 29 | import google_auth_httplib2 30 | import googleapiclient 31 | import googleapiclient.discovery 32 | import googleapiclient.errors 33 | from googlecloudprofiler import __version__ as version 34 | from googlecloudprofiler import backoff 35 | # pylint: disable=g-import-not-at-top 36 | if sys.platform.startswith('linux'): 37 | from googlecloudprofiler import cpu_profiler 38 | else: 39 | # CPU profiling is only supported on Linux. 40 | cpu_profiler = None 41 | from googlecloudprofiler import pythonprofiler 42 | import httplib2 43 | import requests 44 | from google.protobuf import duration_pb2 45 | from google.protobuf import json_format 46 | 47 | # This module sometimes catches the general BaseException. This is safe because 48 | # it runs in a daemon thread. Signal is always handled by the main thread, so we 49 | # are not blocking user interruptions such as Ctrl+C. We need to catch the 50 | # general exception sometimes because we can't predict what exception the HTTP 51 | # client can throw. 52 | 53 | # Auth scope to use for the profiler API calls. 
54 | _SCOPE = ['https://www.googleapis.com/auth/monitoring.write'] 55 | 56 | _GCE_METADATA_URL = 'http://metadata/computeMetadata/v1/' 57 | _GCE_METADATA_HEADERS = {'Metadata-Flavor': 'Google'} 58 | 59 | _SERVICE_VERSION_LABEL = 'version' 60 | _INSTANCE_LABEL = 'instance' 61 | _ZONE_LABEL = 'zone' 62 | _LANGUAGE_LABEL = 'language' 63 | 64 | _PROFILER_SERVICE_TIMEOUT_SEC = 60 * 60 65 | 66 | _NANOS_PER_SEC = 1000 * 1000 * 1000 67 | 68 | logger = logging.getLogger(__name__) 69 | 70 | 71 | def retrieve_gce_metadata(metadata_key): 72 | """Retrieves the metadata for the given key from the GCE metadata server. 73 | 74 | Args: 75 | metadata_key: A string specifying the metadata key, e.g 76 | 'project/project-id'. See 77 | https://cloud.google.com/compute/docs/storing-retrieving-metadata for the 78 | list of keys. 79 | 80 | Returns: 81 | A string representing the metadata value, or None if not found. 82 | """ 83 | url = _GCE_METADATA_URL + metadata_key 84 | try: 85 | response = requests.get(url, headers=_GCE_METADATA_HEADERS) 86 | if response.status_code == requests.codes.ok: 87 | return response.text 88 | except BaseException as e: 89 | # Ignore any exceptions. 90 | logger.warning('Failed to fetch %s from GCE metadata server: %s', 91 | metadata_key, str(e)) 92 | return None 93 | 94 | 95 | class Client: 96 | """Communicates with the profiler backend over HTTP.""" 97 | 98 | def __init__(self): 99 | self._backoff = backoff.Backoff() 100 | self._filter_log() 101 | self._started = False 102 | self._profiler_service = None 103 | 104 | def setup_auth(self, project_id=None, service_account_json_file=None): 105 | """Sets up authentication with Google APIs. 106 | 107 | This will use the credentials from service_account_json_file if provided, 108 | falling back to application default credentials. See 109 | https://cloud.google.com/docs/authentication/production. 110 | 111 | Args: 112 | project_id: A string specifying the GCP project ID (e.g. my-project). 
If 113 | not provided, will attempt to retrieve it from the credentials. 114 | service_account_json_file: A string specifying the path to a service 115 | account JSON file. If not provided, will default to application default 116 | credentials. 117 | 118 | Returns: 119 | A string representing the project ID. 120 | """ 121 | if service_account_json_file: 122 | self._credentials = ( 123 | service_account.Credentials.from_service_account_file( 124 | service_account_json_file, scopes=_SCOPE)) 125 | if not project_id: 126 | with open(service_account_json_file) as f: 127 | project_id = json.load(f).get('project_id') 128 | else: 129 | self._credentials, credentials_project_id = google.auth.default( 130 | scopes=_SCOPE) 131 | project_id = project_id or credentials_project_id 132 | return project_id 133 | 134 | def config(self, project_id, service, service_version, disable_cpu_profiling, 135 | disable_wall_profiling, period_ms, discovery_service_url): 136 | """Sets up the client config. 137 | 138 | Args: 139 | project_id: A string specifying the cloud project ID. When not specified, 140 | the value can be read from the credential file or otherwise read from 141 | the VM metadata server. If specified neither here nor via the 142 | environment, a ValueError will be raised. 143 | service: A string specifying the name of the service under which the 144 | profiled data will be recorded and exposed at the profiler UI for the 145 | project. If specified neither here nor via the environment variable 146 | GAE_SERVICE or the environment variable K_SERVICE, a ValueError will be 147 | raised. See docs in __init__.py for more details. 148 | service_version: A string specifying the version of the service. See docs 149 | in __init__.py for more details. 150 | disable_cpu_profiling: A bool specifying whether or not the CPU time 151 | profiling should be disabled. See docs in __init__.py for more details.
152 | disable_wall_profiling: A bool specifying whether or not the WALL time 153 | profiling should be disabled. See docs in __init__.py for more details. 154 | period_ms: An integer specifying the sampling interval in milliseconds. 155 | discovery_service_url: A URL that points to the location of the discovery 156 | service. 157 | 158 | Raises: 159 | ValueError: If the project ID or service can't be determined from the 160 | environment and arguments. Or if service name doesn't match 161 | '^[a-z0-9]([-a-z0-9_.]{0,253}[a-z0-9])?$'. Or if no profiling mode is 162 | enabled. 163 | """ 164 | self._profilers = {} 165 | self._config_cpu_profiling(disable_cpu_profiling, period_ms) 166 | self._config_wall_profiling(disable_wall_profiling, period_ms) 167 | if not self._profilers: 168 | raise ValueError('No profiling mode is enabled.') 169 | 170 | project_id = project_id or retrieve_gce_metadata('project/project-id') 171 | if not project_id: 172 | raise ValueError( 173 | 'Unable to determine the project ID from the environment. 
' 174 | 'Project ID must be provided if running outside of GCP.') 175 | 176 | service = service or os.environ.get('GAE_SERVICE') or os.environ.get( 177 | 'K_SERVICE') 178 | if not service: 179 | raise ValueError('Service name must be provided via configuration or ' 180 | 'GAE_SERVICE environment variable.') 181 | service_re = re.compile('^[a-z0-9]([-a-z0-9_.]{0,253}[a-z0-9])?$') 182 | if not service_re.match(service): 183 | raise ValueError('Service name "%s" does not match regular expression ' 184 | '"%s"' % (service, service_re.pattern)) 185 | deployment_labels = {_LANGUAGE_LABEL: 'python'} 186 | service_version = service_version or os.environ.get( 187 | 'GAE_VERSION') or os.environ.get('K_REVISION') 188 | if service_version: 189 | deployment_labels[_SERVICE_VERSION_LABEL] = service_version 190 | zone = retrieve_gce_metadata('instance/zone') 191 | if zone: 192 | deployment_labels[_ZONE_LABEL] = zone.split('/')[-1] 193 | 194 | self._deployment = { 195 | 'projectId': project_id, 196 | 'target': service, 197 | 'labels': deployment_labels, 198 | } 199 | 200 | self._profile_labels = {} 201 | instance = retrieve_gce_metadata('instance/name') 202 | if instance: 203 | self._profile_labels[_INSTANCE_LABEL] = instance 204 | 205 | self._discovery_service_url = googleapiclient.discovery.DISCOVERY_URI 206 | if discovery_service_url: 207 | self._discovery_service_url = discovery_service_url 208 | 209 | def start(self): 210 | """Starts collecting profiles. 211 | 212 | Starts an endless daemon thread that polls the profiler server, and collects 213 | and uploads profiles as requested. 214 | 215 | Raises: 216 | ValueError: If called from a non-main thread when Wall time profiling 217 | is enabled.
218 | """ 219 | if self._started: 220 | logger.warning('Profiler already started, will not start again') 221 | return 222 | 223 | if 'WALL' in self._profilers: 224 | self._profilers['WALL'].register_handler() 225 | self._polling_thread = threading.Thread(target=self._poll_profiler_service) 226 | self._polling_thread.name = 'Profiler API polling thread' 227 | self._polling_thread.daemon = True 228 | self._polling_thread.start() 229 | 230 | def _config_cpu_profiling(self, disable_cpu_profiling, period_ms): 231 | """Adds CPU profiler if CPU profiling is supported and not disabled.""" 232 | cpu_profiling_supported = cpu_profiler is not None 233 | if not cpu_profiling_supported: 234 | logger.info('CPU profiling is not supported on the current Operating ' 235 | 'System. Linux is the only supported Operating System.') 236 | elif disable_cpu_profiling: 237 | logger.info('CPU profiling is disabled by disable_cpu_profiling') 238 | else: 239 | self._profilers['CPU'] = cpu_profiler.CPUProfiler(period_ms) 240 | 241 | def _config_wall_profiling(self, disable_wall_profiling, period_ms): 242 | """Adds wall profiler if wall profiling is supported and not disabled.""" 243 | if disable_wall_profiling: 244 | logger.info('Wall profiling is disabled by disable_wall_profiling') 245 | else: 246 | self._profilers['WALL'] = pythonprofiler.WallProfiler(period_ms) 247 | 248 | def _build_service(self): 249 | """Builds a discovery client for talking to the Profiler.""" 250 | http = httplib2.Http(timeout=_PROFILER_SERVICE_TIMEOUT_SEC) 251 | http = google_auth_httplib2.AuthorizedHttp(self._credentials, http) 252 | profiler_api = googleapiclient.discovery.build( 253 | 'cloudprofiler', 254 | 'v2', 255 | http=http, 256 | cache_discovery=False, 257 | requestBuilder=ProfilerHttpRequest, 258 | discoveryServiceUrl=self._discovery_service_url) 259 | return profiler_api.projects().profiles() 260 | 261 | def _create_profile(self): 262 | """Calls the profiler server for instructions on the next profile to 
create. 263 | 264 | The request hangs until the profiler server thinks it's the desired time 265 | to profile. In some cases, the server may also return an error containing 266 | a desired backoff duration. 267 | 268 | Returns: 269 | A Profile object containing necessary information, such as type and 270 | duration, to collect profile data. 271 | """ 272 | profile_types = list(self._profilers.keys()) 273 | request = { 274 | 'profileType': profile_types, 275 | 'deployment': self._deployment, 276 | } 277 | parent = 'projects/' + self._deployment['projectId'] 278 | return self._profiler_service.create(parent=parent, body=request).execute() 279 | 280 | def _collect_and_upload_profile(self, profile): 281 | """Collects a profile and uploads it to the profiler server.""" 282 | try: 283 | profile_type = profile['profileType'] 284 | if profile_type not in self._profilers: 285 | logger.warning('Unexpected profile type: %s', profile_type) 286 | return 287 | 288 | duration = duration_pb2.Duration() 289 | profile_duration = profile['duration'] 290 | json_format.Parse(json.dumps(profile_duration), duration) 291 | duration_ns = duration.seconds * _NANOS_PER_SEC + duration.nanos 292 | 293 | profile_bytes = self._profilers[profile_type].profile(duration_ns) 294 | profile['profileBytes'] = base64.b64encode(profile_bytes).decode('UTF-8') 295 | logger.debug('Starting to upload profile') 296 | self._profiler_service.patch( 297 | name=profile['name'], body=profile).execute(num_retries=3) 298 | except BaseException: # pylint: disable=broad-except 299 | logger.warning( 300 | 'Failed to collect and upload profile whose profile type is %s: %s', 301 | profile.get('profileType'), traceback.format_exc()) 302 | 303 | def _poll_profiler_service(self): 304 | """Polls the profiler server indefinitely.""" 305 | logger.debug('Profiler has started') 306 | build_service_backoff = backoff.Backoff() 307 | while self._profiler_service is None: 308 | try: 309 | self._profiler_service = self._build_service() 310 | except
BaseException as e: # pylint: disable=broad-except 311 | # Exponential backoff. 312 | backoff_duration = build_service_backoff.next_backoff() 313 | logger.error( 314 | 'Failed to build the Discovery client for profiler ' 315 | '(will retry after %.3fs): %s', backoff_duration, str(e)) 316 | time.sleep(backoff_duration) 317 | 318 | while True: 319 | profile = None 320 | while not profile: 321 | try: 322 | logger.debug('Starting to create profile') 323 | profile = self._create_profile() 324 | self._backoff = backoff.Backoff() 325 | logger.debug('Successfully created a %s profile', 326 | profile['profileType']) 327 | except BaseException as e: 328 | # Uses the server specified backoff duration if it is present in the 329 | # error message, otherwise uses exponential backoff. 330 | backoff_duration = self._backoff.next_backoff(e) 331 | logger.debug('Failed to create profile (will retry after %.3fs): %s', 332 | backoff_duration, str(e)) 333 | time.sleep(backoff_duration) 334 | 335 | self._collect_and_upload_profile(profile) 336 | 337 | def _filter_log(self): 338 | """Disables logging in the discovery API to avoid excessive logging.""" 339 | 340 | class _ChildLogFilter(logging.Filter): 341 | """Filter to eliminate info-level logging when called from this module.""" 342 | 343 | def __init__(self, filter_levels=None): 344 | super().__init__() 345 | self._filter_levels = filter_levels or {logging.INFO} 346 | # Get name without extension to avoid .py vs .pyc issues 347 | self._my_filename = os.path.splitext( 348 | inspect.getmodule(_ChildLogFilter).__file__)[0] 349 | 350 | def filter(self, record): 351 | if record.levelno not in self._filter_levels: 352 | return True 353 | callerframes = inspect.getouterframes(inspect.currentframe()) 354 | for f in callerframes: 355 | if os.path.splitext(f[1])[0] == self._my_filename: 356 | return False 357 | return True 358 | 359 | googleapiclient.discovery.logger.addFilter(_ChildLogFilter({logging.INFO})) 360 | 361 | 362 | class
ProfilerHttpRequest(googleapiclient.http.HttpRequest): 363 | """Attaches headers specific to the profiling agent. 364 | 365 | The x-goog-api-format-version header is needed for the newer error format. 366 | This format sets the retry info in a separate field in the error response. 367 | This makes it easier (or even possible) to retrieve the retry delay. 368 | 369 | The user-agent and x-goog-api-format-version headers are added 370 | (if not present) or updated (if already present) to note the version 371 | of the profiling agent. 372 | """ 373 | 374 | def __init__(self, *args, **kwargs): 375 | headers = kwargs.setdefault('headers', {}) 376 | headers['x-goog-api-format-version'] = '2' 377 | 378 | # user-agent and x-goog-api-client should be a space-separated list of 379 | # libraries and their versions. 380 | for (h, val) in [('user-agent', 381 | 'gcloud-python-profiler/' + version.__version__), 382 | ('x-goog-api-client', 'gccl/' + version.__version__)]: 383 | if h in headers: 384 | headers[h] += ' ' 385 | else: 386 | headers[h] = '' 387 | headers[h] += val 388 | 389 | super().__init__(*args, **kwargs) 390 | -------------------------------------------------------------------------------- /googlecloudprofiler/cpu_profiler.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | """CPU time profiler.""" 15 | 16 | import logging 17 | from googlecloudprofiler import _profiler 18 | from googlecloudprofiler import builder 19 | 20 | logger = logging.getLogger(__name__) 21 | 22 | 23 | class CPUProfiler: 24 | """CPU time profiler. 25 | 26 | The profiler collects CPU time usage data and builds the data as 27 | a gzip-compressed profile proto. 28 | """ 29 | 30 | def __init__(self, period_ms=10): 31 | """Constructs the CPU time profiler. 32 | 33 | Args: 34 | period_ms: An optional integer specifying the sampling interval in 35 | milliseconds. Defaults to 10. 36 | """ 37 | self._profile_type = 'CPU' 38 | self._period_ms = period_ms 39 | 40 | def profile(self, duration_ns): 41 | """Profiles the CPU time usage for the given duration. 42 | 43 | Args: 44 | duration_ns: An integer specifying the duration to profile in nanoseconds. 45 | 46 | Returns: 47 | A bytes object containing gzip-compressed profile proto. 48 | """ 49 | traces = self._profile(duration_ns) 50 | return self._build_profile(duration_ns, traces) 51 | 52 | def _profile(self, duration_ns): 53 | return _profiler.profile_cpu(duration_ns, self._period_ms) 54 | 55 | def _build_profile(self, duration_ns, traces): 56 | profile_builder = builder.Builder() 57 | profile_builder.populate_profile(traces, self._profile_type, 'nanoseconds', 58 | self._period_ms * 1000 * 1000, duration_ns) 59 | return profile_builder.emit() 60 | -------------------------------------------------------------------------------- /googlecloudprofiler/profile_pb2.py: -------------------------------------------------------------------------------- 1 | # Copyright 2022 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # -*- coding: utf-8 -*- 15 | # Generated by the protocol buffer compiler. DO NOT EDIT! 16 | # source: profile.proto 17 | # pylint:skip-file 18 | """Generated protocol buffer code.""" 19 | from google.protobuf.internal import builder as _builder 20 | from google.protobuf import descriptor as _descriptor 21 | from google.protobuf import descriptor_pool as _descriptor_pool 22 | from google.protobuf import symbol_database as _symbol_database 23 | # @@protoc_insertion_point(imports) 24 | 25 | _sym_db = _symbol_database.Default() 26 | 27 | 28 | 29 | 30 | DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile( 31 | b'\n\rprofile.proto\x12\x12perftools.profiles\"\xd5\x03\n\x07Profile\x12\x32\n\x0bsample_type\x18\x01 \x03(\x0b\x32\x1d.perftools.profiles.ValueType\x12*\n\x06sample\x18\x02 \x03(\x0b\x32\x1a.perftools.profiles.Sample\x12,\n\x07mapping\x18\x03 \x03(\x0b\x32\x1b.perftools.profiles.Mapping\x12.\n\x08location\x18\x04 \x03(\x0b\x32\x1c.perftools.profiles.Location\x12.\n\x08\x66unction\x18\x05 \x03(\x0b\x32\x1c.perftools.profiles.Function\x12\x14\n\x0cstring_table\x18\x06 \x03(\t\x12\x13\n\x0b\x64rop_frames\x18\x07 \x01(\x03\x12\x13\n\x0bkeep_frames\x18\x08 \x01(\x03\x12\x12\n\ntime_nanos\x18\t \x01(\x03\x12\x16\n\x0e\x64uration_nanos\x18\n \x01(\x03\x12\x32\n\x0bperiod_type\x18\x0b \x01(\x0b\x32\x1d.perftools.profiles.ValueType\x12\x0e\n\x06period\x18\x0c \x01(\x03\x12\x0f\n\x07\x63omment\x18\r \x03(\x03\x12\x1b\n\x13\x64\x65\x66\x61ult_sample_type\x18\x0e \x01(\x03\"\'\n\tValueType\x12\x0c\n\x04type\x18\x01 
\x01(\x03\x12\x0c\n\x04unit\x18\x02 \x01(\x03\"V\n\x06Sample\x12\x13\n\x0blocation_id\x18\x01 \x03(\x04\x12\r\n\x05value\x18\x02 \x03(\x03\x12(\n\x05label\x18\x03 \x03(\x0b\x32\x19.perftools.profiles.Label\"@\n\x05Label\x12\x0b\n\x03key\x18\x01 \x01(\x03\x12\x0b\n\x03str\x18\x02 \x01(\x03\x12\x0b\n\x03num\x18\x03 \x01(\x03\x12\x10\n\x08num_unit\x18\x04 \x01(\x03\"\xdd\x01\n\x07Mapping\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x14\n\x0cmemory_start\x18\x02 \x01(\x04\x12\x14\n\x0cmemory_limit\x18\x03 \x01(\x04\x12\x13\n\x0b\x66ile_offset\x18\x04 \x01(\x04\x12\x10\n\x08\x66ilename\x18\x05 \x01(\x03\x12\x10\n\x08\x62uild_id\x18\x06 \x01(\x03\x12\x15\n\rhas_functions\x18\x07 \x01(\x08\x12\x15\n\rhas_filenames\x18\x08 \x01(\x08\x12\x18\n\x10has_line_numbers\x18\t \x01(\x08\x12\x19\n\x11has_inline_frames\x18\n \x01(\x08\"v\n\x08Location\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x12\n\nmapping_id\x18\x02 \x01(\x04\x12\x0f\n\x07\x61\x64\x64ress\x18\x03 \x01(\x04\x12&\n\x04line\x18\x04 \x03(\x0b\x32\x18.perftools.profiles.Line\x12\x11\n\tis_folded\x18\x05 \x01(\x08\")\n\x04Line\x12\x13\n\x0b\x66unction_id\x18\x01 \x01(\x04\x12\x0c\n\x04line\x18\x02 \x01(\x03\"_\n\x08\x46unction\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x0c\n\x04name\x18\x02 \x01(\x03\x12\x13\n\x0bsystem_name\x18\x03 \x01(\x03\x12\x10\n\x08\x66ilename\x18\x04 \x01(\x03\x12\x12\n\nstart_line\x18\x05 \x01(\x03\x42-\n\x1d\x63om.google.perftools.profilesB\x0cProfileProtob\x06proto3' 32 | ) 33 | 34 | _builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, globals()) 35 | _builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'profile_pb2', globals()) 36 | if _descriptor._USE_C_DESCRIPTORS == False: 37 | 38 | DESCRIPTOR._options = None 39 | DESCRIPTOR._serialized_options = b'\n\035com.google.perftools.profilesB\014ProfileProto' 40 | _PROFILE._serialized_start = 38 41 | _PROFILE._serialized_end = 507 42 | _VALUETYPE._serialized_start = 509 43 | _VALUETYPE._serialized_end = 548 44 | _SAMPLE._serialized_start = 550 45 | 
_SAMPLE._serialized_end = 636 46 | _LABEL._serialized_start = 638 47 | _LABEL._serialized_end = 702 48 | _MAPPING._serialized_start = 705 49 | _MAPPING._serialized_end = 926 50 | _LOCATION._serialized_start = 928 51 | _LOCATION._serialized_end = 1046 52 | _LINE._serialized_start = 1048 53 | _LINE._serialized_end = 1089 54 | _FUNCTION._serialized_start = 1091 55 | _FUNCTION._serialized_end = 1186 56 | # @@protoc_insertion_point(module_scope) 57 | -------------------------------------------------------------------------------- /googlecloudprofiler/pythonprofiler.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | """Python2-compatible implementation for profilers.""" 15 | 16 | import atexit 17 | import collections 18 | import logging 19 | import signal 20 | import threading 21 | import time 22 | import timeit 23 | from googlecloudprofiler import builder 24 | 25 | # Maximum stack frames to record. 26 | _MAX_STACK_DEPTH = 128 27 | _NANOS_PER_SEC = 1000 * 1000 * 1000 28 | 29 | logger = logging.getLogger(__name__) 30 | 31 | 32 | class WallProfiler: 33 | """Python2-compatible implementation for Wall time profiler. 34 | 35 | This Wall profiler avoids dependency on any native code by using 36 | Python signal module. It only profiles the main thread. It has the following 37 | limitations: 38 | 39 | 1. 
A system call interrupted by a signal fails with EINTR. Python doesn't 40 | handle EINTR properly until Python 3.5. Python 3.4 handles it in some 41 | modules but not all, see https://www.python.org/dev/peps/pep-0475/. In 42 | earlier Python, when EINTR occurs, some system calls fail with Interrupted 43 | Error, such as socket.recv, and some fail silently, such as sleep (Python 44 | doc does mention that sleep can be terminated earlier than requested: 45 | https://docs.python.org/2/library/time.html#time.sleep). Python assumes 46 | that the application code takes the responsibility to handle EINTR properly 47 | (usually by retrying in a loop). This profiler triggers signals, so it can 48 | cause EINTR. The risk is reduced by requiring the operating system to 49 | restart the interrupted system call automatically. But not all system calls 50 | can be restarted by the OS. See "Interruption of system calls and library 51 | functions by signal handlers" in 52 | http://man7.org/linux/man-pages/man7/signal.7.html. Using this profiler 53 | with Python earlier than 3.5 requires that the application code handles 54 | EINTR properly. 55 | 2. The Python-side signal handling that this profiler uses is capable of 56 | executing the signal handler on the main thread only. It achieves that by 57 | quickly acquiring and releasing GIL in a loop until the main thread gets to 58 | execute (see slide 23 of http://www.dabeaz.com/python/GIL.pdf). This comes 59 | at a cost, especially in programs with a large number of threads (which, 60 | fortunately, are not that common in Python). Also this means that the 61 | main thread must be in a state where it can get something executed - e.g. 62 | it won't if it waits on a thread join. 63 | 3. The Python signal module in Python versions older than 3.6 doesn't handle 64 | signals properly. Briefly speaking, when a signal arrives, a global flag 65 | is_tripped is used to track whether PyErr_CheckSignals is already added to 66 | pending calls. 
PyErr_CheckSignals will clear this flag when it's called. In 67 | some race conditions, the flag is set to 1 but PyErr_CheckSignals is not 68 | added to the pending calls. This causes the program to no longer handle any 69 | signals. For example, Ctrl + C will not kill the process. The profiler 70 | triggers signal with high frequency, thus makes this problem more likely to 71 | happen. Based on experimentation, the problem is more likely to occur in 72 | Python 3, so the Wall profiler is not supported for Python versions 73 | (inclusive) 3 to 3.5. For Python 2, the problem theoretically can happen, 74 | but is much less likely to manifest. Users can use it at their own risk. 75 | """ 76 | 77 | def __init__(self, period_ms): 78 | """Constructs the Wall time profiler. 79 | 80 | Args: 81 | period_ms: An integer specifying the sampling interval in milliseconds. 82 | """ 83 | self._profile_type = 'wall' 84 | self._period_sec = float(period_ms) / 1000 85 | self._traces = collections.defaultdict(int) 86 | self._in_handler = False 87 | self._started = False 88 | self._last_sample_time = None 89 | self._trace_count = 0 90 | self._sample_time_lock = threading.RLock() 91 | 92 | def register_handler(self): 93 | """Registers the handler to the SIGALRM signal. 94 | 95 | This method must be called from the main thread. Attempting to call it from 96 | other threads will cause a ValueError exception. 97 | """ 98 | signal.signal(signal.SIGALRM, self._handler) 99 | # Requires that the system restarts system calls interrupted by SIGALRM. 100 | # Not all calls can be restarted, see 101 | # http://man7.org/linux/man-pages/man7/signal.7.html. 102 | signal.siginterrupt(signal.SIGALRM, False) 103 | 104 | # Stop sending SIGALRM before the program exits. If SIGALRM is received 105 | # during the program exit, sometimes the program exits with non-zero code 106 | # 142. See b/133360821. 
107 | atexit.register(signal.setitimer, signal.ITIMER_REAL, 0) 108 | 109 | def profile(self, duration_ns): 110 | """Profiles for the given duration. 111 | 112 | This function can be called from a non-main thread. It assumes 113 | register_handler has been called. 114 | 115 | Args: 116 | duration_ns: An integer specifying the duration to profile in nanoseconds. 117 | 118 | Returns: 119 | A bytes object containing gzip-compressed profile proto. 120 | """ 121 | self._reset() 122 | 123 | profile_duration = float(duration_ns) / _NANOS_PER_SEC 124 | target_time = timeit.default_timer() + profile_duration 125 | 126 | self._start_profiling() 127 | # In Python 2, sleep can be interrupted by a signal. Retries sleep until the 128 | # target time is reached. 129 | while profile_duration > 0: 130 | time.sleep(profile_duration) 131 | self._sample_time_lock.acquire() 132 | 133 | # Signal timer must be disabled before allocating memory. A fork call that 134 | # takes longer than the signal interval will enter an endless loop of 135 | # retrying the interrupted clone system call: 136 | # http://lists.debian.org/debian-glibc/2010/03/msg00161.html. If that 137 | # happens, trying to allocate memory may hang waiting for the memory lock. 138 | # Failing to disable the signal before allocating memory may cause a 139 | # deadlock. 140 | signal.setitimer(signal.ITIMER_REAL, 0) 141 | profile_duration = target_time - timeit.default_timer() 142 | signal.setitimer(signal.ITIMER_REAL, self._period_sec, self._period_sec) 143 | self._last_sample_time = None 144 | self._sample_time_lock.release() 145 | 146 | self._stop_profiling() 147 | 148 | return self._serialize_and_clear_traces(duration_ns) 149 | 150 | def _record_trace(self, frame): 151 | """Records the call stack trace of the given frame. 152 | 153 | Args: 154 | frame: A Frame object representing the leaf frame of the stack. 155 | 156 | Returns: 157 | A tuple of frames. The leaf frame is at position 0. 
A frame is a 158 | (function name, filename, line number) tuple. 159 | """ 160 | depth = 0 161 | trace = [] 162 | while frame is not None and depth < _MAX_STACK_DEPTH: 163 | frame_tuple = (frame.f_code.co_name, frame.f_code.co_filename, 164 | frame.f_lineno) 165 | trace.append(frame_tuple) 166 | frame = frame.f_back 167 | depth += 1 168 | return tuple(trace) 169 | 170 | def _handler(self, unused_signum, frame): 171 | """Records the current call stack trace when a signal is received. 172 | 173 | In Python, a signal can only occur between the atomic instructions of the 174 | Python interpreter. Since creating a string is "atomic" in the Python sense, 175 | it's safe to copy strings in the signal handler. Also, signals are only 176 | handled by the main thread. Simply using a bool flag to prevent reentry is 177 | fine. 178 | 179 | Args: 180 | frame: A Frame object representing the current stack frame. 181 | """ 182 | 183 | # The _started flag is used to ignore late signals that may be delivered after 184 | # the profiler has been stopped. 185 | if not self._started or self._in_handler: 186 | return 187 | 188 | self._in_handler = True 189 | trace = self._record_trace(frame) 190 | 191 | # Signal handler is only called when the execution returns to Python level 192 | # and when the main thread acquires the GIL. It's possible that multiple 193 | # signals occurred before the handler is called, for example, when Python 194 | # code calls into long-running C code such as gzip. The good news is that 195 | # we know that when the missed signal happened, the main thread Python level 196 | # stack is the same as the current one: if the main thread got a chance to 197 | # update the Python level stack, it already handled the signal. It's 198 | # appropriate to attribute the missed signals to the current stack. 199 | 200 | # Python signal handler is only called on the main thread. This lock is a 201 | # reentrant lock which can be acquired again by the same thread without 202 | # blocking. 
We also prevented reentry of the handler function. So it's fine 203 | # to use this lock in the signal handler. 204 | self._sample_time_lock.acquire() 205 | now = timeit.default_timer() 206 | signal_tick_count = 1 207 | if self._last_sample_time is None: 208 | self._last_sample_time = now 209 | else: 210 | signal_tick_count = int((now - self._last_sample_time) / self._period_sec) 211 | signal_tick_count = max(1, signal_tick_count) 212 | self._last_sample_time += self._period_sec * signal_tick_count 213 | self._sample_time_lock.release() 214 | 215 | self._traces[trace] += signal_tick_count 216 | self._trace_count += signal_tick_count 217 | self._in_handler = False 218 | 219 | def _start_profiling(self): 220 | self._started = True 221 | signal.setitimer(signal.ITIMER_REAL, self._period_sec, self._period_sec) 222 | 223 | def _stop_profiling(self): 224 | """Stops timer and waits for the last handler to finish.""" 225 | signal.setitimer(signal.ITIMER_REAL, 0) 226 | self._started = False 227 | 228 | # Waits for the last signal handler to finish. 229 | count = 0 230 | while self._in_handler: 231 | if count % 1000 == 0: 232 | logger.info('Wait for the last signal handler to finish') 233 | count += 1 234 | # This also releases the GIL to allow the main thread to finish handling. 
235 | time.sleep(0.01) 236 | 237 | def _serialize_and_clear_traces(self, duration_ns): 238 | period_ns = int(self._period_sec * _NANOS_PER_SEC) 239 | unknown_trace_count = int(duration_ns / period_ns) - self._trace_count 240 | if unknown_trace_count > 0: 241 | self._traces[(('unknown', 'unknown', 0),)] = unknown_trace_count 242 | profile_builder = builder.Builder() 243 | profile_builder.populate_profile(self._traces, self._profile_type, 244 | 'nanoseconds', period_ns, duration_ns) 245 | self._reset() 246 | return profile_builder.emit() 247 | 248 | def _reset(self): 249 | self._traces = collections.defaultdict(int) 250 | self._last_sample_time = None 251 | self._trace_count = 0 252 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/_profiler.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 
14 | 15 | #include 16 | 17 | #include "clock.h" 18 | #include "profiler.h" 19 | 20 | namespace { 21 | PyObject* ProfileCPU(PyObject* self, PyObject* args) { 22 | uint64_t duration_nanos = 0; 23 | uint64_t period_msec = 0; 24 | if (!PyArg_ParseTuple(args, "LL", &duration_nanos, &period_msec)) { 25 | return nullptr; 26 | } 27 | 28 | CPUProfiler p(duration_nanos, period_msec * kNanosPerMilli); 29 | return p.Collect(); 30 | } 31 | 32 | PyMethodDef ProfilerMethods[] = { 33 | {"profile_cpu", ProfileCPU, METH_VARARGS, "A function for CPU profiling."}, 34 | {nullptr, nullptr, 0, nullptr} /* Sentinel */ 35 | }; 36 | 37 | struct PyModuleDef moduledef = { 38 | PyModuleDef_HEAD_INIT, "_profiler", /* name of module */ 39 | "Google Cloud Profiler C++ extension module", /* module documentation */ 40 | -1, ProfilerMethods}; 41 | } // namespace 42 | 43 | PyMODINIT_FUNC PyInit__profiler(void) { return PyModule_Create(&moduledef); } 44 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/clock.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 
14 | 15 | #include "clock.h" 16 | 17 | namespace { 18 | Clock DefaultClockInstance; 19 | } // namespace 20 | 21 | Clock *DefaultClock() { return &DefaultClockInstance; } 22 | 23 | struct timespec TimeAdd(const struct timespec t1, const struct timespec t2) { 24 | struct timespec t = {t1.tv_sec + t2.tv_sec, t1.tv_nsec + t2.tv_nsec}; 25 | if (t.tv_nsec >= kNanosPerSecond) { 26 | t.tv_sec += t.tv_nsec / kNanosPerSecond; 27 | t.tv_nsec = t.tv_nsec % kNanosPerSecond; 28 | } 29 | return t; 30 | } 31 | 32 | bool TimeLessThan(const struct timespec &t1, const struct timespec &t2) { 33 | return (t1.tv_sec < t2.tv_sec) || 34 | (t1.tv_sec == t2.tv_sec && t1.tv_nsec < t2.tv_nsec); 35 | } 36 | 37 | struct timespec NanosToTimeSpec(int64_t nanos) { 38 | time_t seconds = nanos / kNanosPerSecond; 39 | int32_t nano_seconds = nanos % kNanosPerSecond; 40 | return timespec{seconds, nano_seconds}; 41 | } 42 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/clock.h: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 
14 | 15 | #ifndef GOOGLECLOUDPROFILER_SRC_CLOCK_H_ 16 | #define GOOGLECLOUDPROFILER_SRC_CLOCK_H_ 17 | 18 | #include 19 | #include 20 | 21 | static const int64_t kNanosPerSecond = 1000 * 1000 * 1000; 22 | static const int64_t kMicrosPerSecond = 1000 * 1000; 23 | static const int64_t kNanosPerMilli = 1000 * 1000; 24 | 25 | // Clock interface that can be mocked for tests. The default implementation 26 | // delegates to the system and so is thread-safe. 27 | class Clock { 28 | public: 29 | virtual ~Clock() {} 30 | 31 | // Returns the current time. 32 | virtual struct timespec Now() { 33 | struct timespec now; 34 | clock_gettime(CLOCK_MONOTONIC, &now); 35 | return now; 36 | } 37 | 38 | // Blocks the current thread until the specified point in time. 39 | virtual void SleepUntil(struct timespec ts) { 40 | while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, nullptr) > 0) { 41 | } 42 | } 43 | 44 | // Blocks the current thread for the specified duration. 45 | virtual void SleepFor(struct timespec ts) { 46 | while (clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, &ts) > 0) { 47 | } 48 | } 49 | }; 50 | 51 | struct timespec TimeAdd(const struct timespec t1, const struct timespec t2); 52 | struct timespec NanosToTimeSpec(int64_t nanos); 53 | bool TimeLessThan(const struct timespec &t1, const struct timespec &t2); 54 | 55 | // Returns a singleton Clock instance which uses the system implementation. 56 | Clock *DefaultClock(); 57 | 58 | #endif // GOOGLECLOUDPROFILER_SRC_CLOCK_H_ 59 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/log.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 
5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | #include "log.h" 16 | 17 | #include 18 | 19 | void Log(const char *level, const char *fmt, va_list ap) { 20 | // Ensures the current thread is ready to call the Python C API. GIL is 21 | // guaranteed to be held. 22 | PyGILState_STATE gil_state = PyGILState_Ensure(); 23 | static PyObject *logging = nullptr; 24 | if (logging == nullptr) { 25 | // GIL is held as PyGILState_Ensure is called above, no need to worry about 26 | // thread safety. 27 | logging = PyImport_ImportModuleNoBlock("logging"); 28 | } 29 | if (logging == nullptr) { 30 | fputs( 31 | "googlecloudprofiler: failed to import logging module, logging " 32 | "is not enabled.\n", 33 | stderr); 34 | PyGILState_Release(gil_state); 35 | return; 36 | } 37 | char msg[200]; 38 | // The variadic wrappers below own the va_list: they call va_start and 39 | // va_end, so Log only consumes it here. 40 | vsnprintf(msg, sizeof(msg), fmt, ap); 41 | 42 | PyObject_CallMethod(logging, const_cast<char *>(level), 43 | const_cast<char *>("s"), msg); 44 | // Resets the Python state to be the same as it was prior to the 45 | // corresponding PyGILState_Ensure() call. 46 | PyGILState_Release(gil_state); 47 | } 48 | 49 | void LogError(const char *fmt, ...) { 50 | va_list ap; 51 | va_start(ap, fmt); 52 | Log("error", fmt, ap); 53 | va_end(ap); 54 | } 55 | 56 | void LogWarning(const char *fmt, ...) { 57 | va_list ap; 58 | va_start(ap, fmt); 59 | Log("warning", fmt, ap); 60 | va_end(ap); 61 | } 62 | 63 | void LogInfo(const char *fmt, ...) 
{ 64 | va_list ap; 65 | va_start(ap, fmt); 66 | Log("info", fmt, ap); 67 | va_end(ap); 68 | } 69 | 70 | void LogDebug(const char *fmt, ...) { 71 | va_list ap; 72 | va_start(ap, fmt); 73 | Log("debug", fmt, ap); 74 | va_end(ap); 75 | } 76 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/log.h: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | #ifndef GOOGLECLOUDPROFILER_SRC_LOG_H_ 16 | #define GOOGLECLOUDPROFILER_SRC_LOG_H_ 17 | 18 | // Logs the error message using Python logging.error. It accepts arguments 19 | // like printf: format specifiers in the given fmt are replaced by the 20 | // corresponding additional arguments. 21 | void LogError(const char *fmt, ...); 22 | 23 | // Logs the warning message using Python logging.warning. It accepts arguments 24 | // like printf: format specifiers in the given fmt are replaced by the 25 | // corresponding additional arguments. 26 | void LogWarning(const char *fmt, ...); 27 | 28 | // Logs the info message using Python logging.info. It accepts arguments 29 | // like printf: format specifiers in the given fmt are replaced by the 30 | // corresponding additional arguments. 31 | void LogInfo(const char *fmt, ...); 32 | 33 | // Logs the debug message using Python logging.debug. 
It accepts arguments 34 | // like printf: format specifiers in the given fmt are replaced by the 35 | // corresponding additional arguments. 36 | void LogDebug(const char *fmt, ...); 37 | 38 | #endif // GOOGLECLOUDPROFILER_SRC_LOG_H_ 39 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/populate_frames.cc: -------------------------------------------------------------------------------- 1 | #include "populate_frames.h" 2 | 3 | #include 4 | 5 | #include "stacktraces.h" 6 | 7 | // 0x030B0000 is 3.11. 8 | #define PY_311 0x030B0000 9 | #if PY_VERSION_HEX >= PY_311 10 | 11 | /** 12 | * The PyFrameObject structure members have been removed from the public C API 13 | * in 3.11: 14 | https://docs.python.org/3/whatsnew/3.11.html#pyframeobject-3-11-hiding. 15 | * 16 | * Instead, getters are provided which participate in reference counting; since 17 | * this code runs as part of the SIGPROF handler, it cannot modify Python 18 | * objects (including their refcounts) and the getters can't be used. Instead, 19 | * we expose the internal _PyInterpreterFrame and use that directly. 
20 | * 21 | */ 22 | 23 | #define Py_BUILD_CORE 24 | #include "internal/pycore_frame.h" 25 | #undef Py_BUILD_CORE 26 | 27 | // Modified from 28 | // https://github.com/python/cpython/blob/v3.11.4/Python/pystate.c#L1278-L1285 29 | static inline _PyInterpreterFrame *unsafe_PyThreadState_GetInterpreterFrame( 30 | PyThreadState *tstate) { 31 | assert(tstate != NULL); 32 | _PyInterpreterFrame *f = tstate->cframe->current_frame; 33 | while (f && _PyFrame_IsIncomplete(f)) { 34 | f = f->previous; 35 | } 36 | if (f == NULL) { 37 | return NULL; 38 | } 39 | return f; 40 | } 41 | 42 | // Modified from 43 | // https://github.com/python/cpython/blob/v3.11.4/Objects/frameobject.c#L1310-L1315 44 | // with refcounting removed 45 | static inline PyCodeObject *unsafe_PyInterpreterFrame_GetCode( 46 | _PyInterpreterFrame *frame) { 47 | assert(frame != NULL); 48 | assert(!_PyFrame_IsIncomplete(frame)); 49 | PyCodeObject *code = frame->f_code; 50 | assert(code != NULL); 51 | return code; 52 | } 53 | 54 | // Modified from 55 | // https://github.com/python/cpython/blob/v3.11.4/Objects/frameobject.c#L1326-L1329 56 | // with refcounting removed 57 | static inline _PyInterpreterFrame *unsafe_PyInterpreterFrame_GetBack( 58 | _PyInterpreterFrame *frame) { 59 | assert(frame != NULL); 60 | assert(!_PyFrame_IsIncomplete(frame)); 61 | _PyInterpreterFrame *prev = frame->previous; 62 | while (prev && _PyFrame_IsIncomplete(prev)) { 63 | prev = prev->previous; 64 | } 65 | return prev; 66 | } 67 | 68 | // Copied from 69 | // https://github.com/python/cpython/blob/v3.11.4/Python/frame.c#L165-L170 as 70 | // this function is not available in libpython 71 | int _PyInterpreterFrame_GetLine(_PyInterpreterFrame *frame) { 72 | int addr = _PyInterpreterFrame_LASTI(frame) * sizeof(_Py_CODEUNIT); 73 | return PyCode_Addr2Line(frame->f_code, addr); 74 | } 75 | 76 | int PopulateFrames(CallFrame *frames, PyThreadState *ts) { 77 | if (ts == nullptr) { 78 | frames[0].lineno = kNoPyState; 79 | frames[0].py_code = 
nullptr; 80 | return 1; 81 | } 82 | 83 | // We are running in the context of the thread interrupted by the signal 84 | // so the frame object for the current thread is stable. 85 | // Unfortunately, we can't use PyFrameObjects because they are initialized 86 | // lazily and will not have the info we need directly. 87 | _PyInterpreterFrame *frame = unsafe_PyThreadState_GetInterpreterFrame(ts); 88 | int num_frames = 0; 89 | while (frame != nullptr && num_frames < kMaxFramesToCapture) { 90 | frames[num_frames].lineno = _PyInterpreterFrame_GetLine(frame); 91 | frames[num_frames].py_code = unsafe_PyInterpreterFrame_GetCode(frame); 92 | num_frames++; 93 | frame = unsafe_PyInterpreterFrame_GetBack(frame); 94 | } 95 | return num_frames; 96 | } 97 | 98 | #else 99 | // python versions before 3.11 100 | 101 | int PopulateFrames(CallFrame *frames, PyThreadState *ts) { 102 | if (ts == nullptr) { 103 | frames[0].lineno = kNoPyState; 104 | frames[0].py_code = nullptr; 105 | return 1; 106 | } 107 | // We are running in the context of the thread interrupted by the signal 108 | // so the frame object for the current thread is stable. 
109 | PyFrameObject *frame = ts->frame; 110 | int num_frames = 0; 111 | while (frame != nullptr && num_frames < kMaxFramesToCapture) { 112 | frames[num_frames].lineno = frame->f_lineno; 113 | frames[num_frames].py_code = frame->f_code; 114 | num_frames++; 115 | frame = frame->f_back; 116 | } 117 | return num_frames; 118 | } 119 | 120 | #endif // PY_VERSION_HEX >= PY_311 121 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/populate_frames.h: -------------------------------------------------------------------------------- 1 | #ifndef THIRD_PARTY_PY_GOOGLECLOUDPROFILER_SRC_POPULATE_FRAMES_H_ 2 | #define THIRD_PARTY_PY_GOOGLECLOUDPROFILER_SRC_POPULATE_FRAMES_H_ 3 | 4 | #include 5 | 6 | #include "stacktraces.h" 7 | 8 | /** 9 | * Populates the CallFrame array with at-most kMaxFramesToCapture python frames 10 | * from the provided PyThreadState. Returns the number of frames populated. 11 | */ 12 | int PopulateFrames(CallFrame* frames, PyThreadState* ts); 13 | 14 | #endif // THIRD_PARTY_PY_GOOGLECLOUDPROFILER_SRC_POPULATE_FRAMES_H_ 15 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/profiler.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 
14 | 15 | #include "profiler.h" 16 | 17 | #include 18 | #include 19 | #include 20 | #include 21 | 22 | #include 23 | #include 24 | #include 25 | 26 | #include "clock.h" 27 | #include "log.h" 28 | #include "populate_frames.h" 29 | 30 | AsyncSafeTraceMultiset *Profiler::fixed_traces_ = nullptr; 31 | std::atomic Profiler::unknown_stack_count_; 32 | GetThreadStateFunc get_thread_state_func = PyGILState_GetThisThreadState; 33 | bool CPUProfiler::fork_handlers_registered_; 34 | 35 | namespace { 36 | 37 | struct PyObjectDecReffer { 38 | void operator()(PyObject *py_object) const { 39 | // Ensures the current thread is ready to call the Python C API. 40 | PyGILState_STATE gil_state = PyGILState_Ensure(); 41 | Py_XDECREF(py_object); 42 | PyGILState_Release(gil_state); 43 | } 44 | }; 45 | 46 | typedef std::unique_ptr PyObjectRef; 47 | 48 | // Helper class to store and reset errno when in a signal handler. 49 | class ErrnoRaii { 50 | public: 51 | ErrnoRaii() { stored_errno_ = errno; } 52 | // Not copyable or assignable. 
53 | ErrnoRaii(const ErrnoRaii &) = delete; 54 | ErrnoRaii &operator=(const ErrnoRaii &) = delete; 55 | 56 | ~ErrnoRaii() { errno = stored_errno_; } 57 | 58 | private: 59 | int stored_errno_; 60 | }; 61 | 62 | } // namespace 63 | 64 | destructor CodeDeallocHook::old_code_dealloc_ = nullptr; 65 | std::unordered_map 66 | *CodeDeallocHook::deallocated_code_ = nullptr; 67 | 68 | void CodeDeallocHook::CodeDealloc(PyObject *py_object) { 69 | FuncLoc func_loc; 70 | PyCodeObject *code_object = reinterpret_cast(py_object); 71 | GetFuncLoc(code_object, &func_loc); 72 | deallocated_code_->insert(std::make_pair(code_object, func_loc)); 73 | 74 | old_code_dealloc_(py_object); 75 | } 76 | 77 | void CodeDeallocHook::Reset() { 78 | if (deallocated_code_ == nullptr) { 79 | deallocated_code_ = new std::unordered_map; 80 | } else { 81 | deallocated_code_->clear(); 82 | } 83 | } 84 | 85 | bool CodeDeallocHook::Find(PyCodeObject *pointer, FuncLoc *func_loc) { 86 | auto recorded_code = deallocated_code_->find(pointer); 87 | if (recorded_code == deallocated_code_->end()) { 88 | return false; 89 | } 90 | *func_loc = recorded_code->second; 91 | return true; 92 | } 93 | 94 | // This method schedules the SIGPROF timer to go off every specified interval. 
95 | bool SignalHandler::SetSigprofInterval(int64_t period_usec) { 96 | static struct itimerval timer; 97 | timer.it_interval.tv_sec = period_usec / kMicrosPerSecond; 98 | timer.it_interval.tv_usec = period_usec % kMicrosPerSecond; 99 | timer.it_value = timer.it_interval; 100 | if (setitimer(ITIMER_PROF, &timer, NULL) == -1) { 101 | LogError("Failed to set ITIMER_PROF: %s", strerror(errno)); 102 | return false; 103 | } 104 | return true; 105 | } 106 | 107 | struct sigaction SignalHandler::SetAction(void (*action)(int, siginfo_t *, 108 | void *)) { 109 | struct sigaction sa; 110 | sa.sa_handler = nullptr; 111 | sa.sa_sigaction = action; 112 | sa.sa_flags = SA_RESTART | SA_SIGINFO; 113 | 114 | sigemptyset(&sa.sa_mask); 115 | 116 | struct sigaction old_handler; 117 | if (sigaction(SIGPROF, &sa, &old_handler) != 0) { 118 | LogError("Failed to set SIGPROF handler: %s", strerror(errno)); 119 | return old_handler; 120 | } 121 | 122 | return old_handler; 123 | } 124 | 125 | const char *CallTraceErrorToName(CallTraceErrors err) { 126 | switch (err) { 127 | case kNoPyState: 128 | return "[Unknown - No Python thread state]"; 129 | default: 130 | return "[Unknown]"; 131 | } 132 | } 133 | 134 | void Profiler::Handle(int signum, siginfo_t *info, void *context) { 135 | // Gets around -Wunused-parameter. 136 | (void)signum; 137 | (void)info; 138 | (void)context; 139 | 140 | ErrnoRaii err_storage; // stores and resets errno 141 | 142 | CallTrace trace; 143 | CallFrame frames[kMaxFramesToCapture]; 144 | trace.frames = frames; 145 | trace.num_frames = 0; 146 | 147 | // PyGILState_GetThisThreadState uses pthread_getspecific which is not 148 | // guaranteed to be async-signal-safe per POSIX. Some issues can be 149 | // found at https://sourceware.org/glibc/wiki/TLSandSignals. 150 | // TODO: check if the limitations are practical here and if 151 | // there are ways to avoid the problems. 
152 | PyThreadState *ts = get_thread_state_func(); 153 | 154 | trace.num_frames = PopulateFrames(frames, ts); 155 | if (!fixed_traces_->Add(&trace)) { 156 | unknown_stack_count_++; 157 | return; 158 | } 159 | } 160 | 161 | void GetFuncLoc(PyCodeObject *code_object, FuncLoc *func_loc) { 162 | // Note that PyUnicode_AsUTF8 caches the char array in the unicodeobject 163 | // and the memory is released when the unicodeobject is deallocated. 164 | const char *name = PyUnicode_AsUTF8(code_object->co_name); 165 | const char *filename = PyUnicode_AsUTF8(code_object->co_filename); 166 | func_loc->name = name != nullptr ? name : "unknown"; 167 | func_loc->filename = filename != nullptr ? filename : "unknown"; 168 | } 169 | 170 | // Should be called when GIL is held if PyCode_Type.tp_dealloc is modified, 171 | // otherwise PyCode_Type.tp_dealloc may be updating 172 | // CodeDeallocHook.deallocated_code_ in another thread. 173 | void Profiler::Reset() { 174 | if (fixed_traces_ == nullptr) { 175 | fixed_traces_ = new AsyncSafeTraceMultiset(); 176 | } else { 177 | fixed_traces_->Reset(); 178 | } 179 | CodeDeallocHook::Reset(); 180 | unknown_stack_count_ = 0; 181 | handler_.SetAction(&Profiler::Handle); 182 | } 183 | 184 | // Must be called when GIL is held. 185 | PyObject *Profiler::PythonTraces() { 186 | // Asserts that GIL is held in debug mode. 
187 | assert(PyGILState_Check()); 188 | if (unknown_stack_count_ > 0) { 189 | CallFrame fakeFrame = {kUnknown, nullptr}; 190 | aggregated_traces_.Add(1, &fakeFrame, unknown_stack_count_); 191 | } 192 | 193 | PyObjectRef py_traces(PyDict_New()); 194 | if (py_traces == nullptr) { 195 | return nullptr; 196 | } 197 | for (const auto &trace : aggregated_traces_) { 198 | PyObjectRef py_frames(PyTuple_New(trace.first.size())); 199 | if (py_frames == nullptr) { 200 | return nullptr; 201 | } 202 | 203 | for (size_t i = 0; i < trace.first.size(); i++) { 204 | const auto &frame = trace.first[i]; 205 | FuncLoc func_loc; 206 | PyCodeObject *pointer = frame.py_code; 207 | if (pointer == nullptr) { 208 | func_loc = { 209 | CallTraceErrorToName(static_cast(frame.lineno)), 210 | ""}; 211 | } else { 212 | // All PyCodeObjects deallocated during profiling should be recorded 213 | // by CodeDeallocHook. As we are holding GIL, no deallocation can happen 214 | // elsewhere now. It's safe to assume that a PyCodeObject pointer not 215 | // recorded by CodeDeallocHook points to a live object. 216 | // TODO: If multiple code objects are allocated at the same 217 | // address, the func_loc stored by CodeDeallocHook may not belong to the 218 | // sampled frame. At least we should mark the func_loc as invalid if we 219 | // see an address is reused, probably by hooking PyCode_Type.tp_alloc. 220 | if (!CodeDeallocHook::Find(pointer, &func_loc)) { 221 | GetFuncLoc(pointer, &func_loc); 222 | } 223 | } 224 | PyObject *py_frame = 225 | Py_BuildValue("(ssi)", func_loc.name.c_str(), 226 | func_loc.filename.c_str(), frame.lineno); 227 | if (py_frame == nullptr) { 228 | return nullptr; 229 | } 230 | // PyTuple_SET_ITEM is like PyTuple_SetItem(), but does no error checking. 231 | // Error checking is unnecessary here as we are filling precreated brand 232 | // new tuple. Note that PyTuple_SET_ITEM does NOT increase the reference 233 | // count for the inserted item. 
We are no longer responsible for 234 | // decreasing the reference count of py_frame. It'll be decreased when 235 | // py_frames is deallocated. 236 | PyTuple_SET_ITEM(py_frames.get(), i, py_frame); 237 | } 238 | uint64_t count = trace.second; 239 | PyObject *py_count = PyDict_GetItem(py_traces.get(), py_frames.get()); 240 | if (py_count != nullptr) { 241 | uint64_t previous_count = PyLong_AsUnsignedLong(py_count); 242 | if (PyErr_Occurred()) { 243 | return nullptr; 244 | } 245 | count += previous_count; 246 | } 247 | PyObjectRef trace_count(PyLong_FromUnsignedLongLong(count)); 248 | // PyDict_SetItem increases the reference count for both key and item. We 249 | // are responsible for decreasing the reference count for py_frames and 250 | // trace_count. 251 | if (PyDict_SetItem(py_traces.get(), py_frames.get(), trace_count.get()) < 252 | 0) { 253 | return nullptr; 254 | } 255 | } 256 | 257 | return py_traces.release(); 258 | } 259 | 260 | bool AlmostThere(const struct timespec &finish, const struct timespec &lap) { 261 | // Determine if there is time for another lap before reaching the 262 | // finish line. Have a margin of multiple laps to ensure we do not 263 | // overrun the finish line. 264 | int64_t margin_laps = 2; 265 | 266 | struct timespec now = DefaultClock()->Now(); 267 | struct timespec laps = {lap.tv_sec * margin_laps, lap.tv_nsec * margin_laps}; 268 | 269 | return TimeLessThan(finish, TimeAdd(now, laps)); 270 | } 271 | 272 | PyObject *CPUProfiler::Collect() { 273 | Reset(); 274 | // Hooks to PyCode_Type.tp_dealloc so that a PyCodeObject is recorded before 275 | // being deallocated. The hook is cancelled when dealloc_hook goes out of 276 | // scope. 277 | CodeDeallocHook dealloc_hook; 278 | 279 | if (!Start()) { 280 | return nullptr; 281 | } 282 | // Releases GIL so that the user threads can execute. 
283 | Py_BEGIN_ALLOW_THREADS; 284 | 285 | Clock *clock = DefaultClock(); 286 | // Flush the async table every 100 ms 287 | struct timespec flush_interval = {0, 100 * 1000 * 1000}; // 100 millisec 288 | struct timespec finish_line = 289 | TimeAdd(clock->Now(), NanosToTimeSpec(duration_nanos_)); 290 | 291 | // Sleep until finish_line, but wakeup periodically to flush the 292 | // internal tables. 293 | while (!AlmostThere(finish_line, flush_interval)) { 294 | clock->SleepFor(flush_interval); 295 | Flush(); 296 | } 297 | clock->SleepUntil(finish_line); 298 | Stop(); 299 | // Delay to allow last signals to be processed. 300 | clock->SleepUntil(TimeAdd(finish_line, flush_interval)); 301 | Flush(); 302 | // Reacquire the GIL. 303 | Py_END_ALLOW_THREADS; 304 | 305 | PyObject *traces = PythonTraces(); 306 | return traces; 307 | } 308 | 309 | bool CPUProfiler::Start() { 310 | int period_usec = period_nanos_ / 1000; 311 | return handler_.SetSigprofInterval(period_usec); 312 | } 313 | 314 | void CPUProfiler::Stop() { 315 | handler_.SetSigprofInterval(0); 316 | // Breaks encapsulation, but whatever. 317 | signal(SIGPROF, SIG_IGN); 318 | } 319 | 320 | // Blocks the SIGPROF signal for the calling thread. 321 | void BlockSigprof() { 322 | sigset_t signals; 323 | sigemptyset(&signals); 324 | sigaddset(&signals, SIGPROF); 325 | pthread_sigmask(SIG_BLOCK, &signals, nullptr); 326 | } 327 | 328 | // Unblocks the SIGPROF signal for the calling thread. 
329 | void UnblockSigprof() { 330 | sigset_t signals; 331 | sigemptyset(&signals); 332 | sigaddset(&signals, SIGPROF); 333 | pthread_sigmask(SIG_UNBLOCK, &signals, nullptr); 334 | } 335 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/profiler.h: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | #ifndef GOOGLECLOUDPROFILER_SRC_PROFILER_H_ 16 | #define GOOGLECLOUDPROFILER_SRC_PROFILER_H_ 17 | 18 | #include 19 | #include 20 | 21 | #include 22 | #include 23 | #include 24 | #include 25 | 26 | #include "stacktraces.h" 27 | 28 | struct FuncLoc { 29 | std::string name; 30 | std::string filename; 31 | }; 32 | 33 | void GetFuncLoc(PyCodeObject *code_object, FuncLoc *func_loc); 34 | 35 | // Blocks the SIGPROF signal for the calling thread. 36 | void BlockSigprof(); 37 | 38 | // Unblocks the SIGPROF signal for the calling thread. 39 | void UnblockSigprof(); 40 | 41 | class SignalHandler { 42 | public: 43 | SignalHandler() {} 44 | 45 | // Not copyable or assignable. 
46 | SignalHandler(const SignalHandler &) = delete; 47 | SignalHandler &operator=(const SignalHandler &) = delete; 48 | 49 | struct sigaction SetAction(void (*sigaction)(int, siginfo_t *, void *)); 50 | 51 | bool SetSigprofInterval(int64_t period_usec); 52 | }; 53 | 54 | class CodeDeallocHook { 55 | public: 56 | // The constructor must be called when GIL is held. 57 | CodeDeallocHook() { 58 | Reset(); 59 | old_code_dealloc_ = PyCode_Type.tp_dealloc; 60 | PyCode_Type.tp_dealloc = &CodeDealloc; 61 | } 62 | // Not copyable or assignable. 63 | CodeDeallocHook(const CodeDeallocHook &) = delete; 64 | CodeDeallocHook &operator=(const CodeDeallocHook &) = delete; 65 | 66 | // The destructor must be called when GIL is held. 67 | ~CodeDeallocHook() { PyCode_Type.tp_dealloc = old_code_dealloc_; } 68 | 69 | // A wrapper function on PyCode_Type.tp_dealloc that records the code object 70 | // to deallocated_code_ before the actual deallocation. 71 | static void CodeDealloc(PyObject *py_object); 72 | 73 | // The first call to Reset() allocates deallocated_code_. Subsequent calls 74 | // clear deallocated_code_. When PyCode_Type.tp_dealloc points to CodeDealloc, 75 | // this function must be called when GIL is held, otherwise another thread 76 | // may be updating deallocated_code_ during PyCodeObject deallocation. 77 | static void Reset(); 78 | 79 | // If the given pointer exists in deallocated_code_ as key, assign the value 80 | // to func_loc and return true, otherwise return false. When 81 | // PyCode_Type.tp_dealloc points to CodeDealloc, this function must be called 82 | // when GIL is held, otherwise another thread may be updating 83 | // deallocated_code_ during PyCodeObject deallocation. 84 | static bool Find(PyCodeObject *pointer, FuncLoc *func_loc); 85 | 86 | private: 87 | // When PyCode_Type.tp_dealloc points to CodeDealloc, a code object is 88 | // recorded in this map before being deallocated.
The map maps a code object 89 | // pointer to function information of interest. 90 | static std::unordered_map<PyCodeObject *, FuncLoc> *deallocated_code_; 91 | 92 | static destructor old_code_dealloc_; 93 | }; 94 | 95 | typedef PyThreadState *(*GetThreadStateFunc)(); 96 | 97 | // get_thread_state_func defaults to PyGILState_GetThisThreadState. It's 98 | // declared here so that PyGILState_GetThisThreadState can be stubbed in tests. 99 | extern GetThreadStateFunc get_thread_state_func; 100 | 101 | class Profiler { 102 | public: 103 | Profiler(int64_t duration_nanos, int64_t period_nanos) 104 | : duration_nanos_(duration_nanos), period_nanos_(period_nanos) { 105 | Reset(); 106 | } 107 | // Not copyable or assignable. 108 | Profiler(const Profiler &) = delete; 109 | Profiler &operator=(const Profiler &) = delete; 110 | 111 | virtual ~Profiler() {} 112 | 113 | // Collects performance data. 114 | // Implicitly does a Reset() before starting collection. 115 | virtual PyObject *Collect() = 0; 116 | 117 | // Returns the traces as a Python dictionary object, which maps a trace to its 118 | // count. 119 | PyObject *PythonTraces(); 120 | 121 | // Signal handler, which records the current stack trace. 122 | static void Handle(int signum, siginfo_t *info, void *context); 123 | 124 | // Resets internal state to support data collection. 125 | void Reset(); 126 | 127 | // Migrates data from fixed internal table into growable data structure. 128 | // Returns number of entries extracted. 129 | int Flush() { return HarvestSamples(fixed_traces_, &aggregated_traces_); } 130 | 131 | protected: 132 | SignalHandler handler_; 133 | int64_t duration_nanos_; 134 | int64_t period_nanos_; 135 | 136 | private: 137 | // Points to a fixed multiset of traces used during collection. This 138 | // is allocated on the first call to Reset(). Will be reused by 139 | // subsequent allocations. Cannot be deallocated as it could be in 140 | // use by other threads, triggered from a signal handler.
141 | static AsyncSafeTraceMultiset *fixed_traces_; 142 | 143 | // Aggregated profile data, populated using data extracted from 144 | // fixed_traces_. 145 | TraceMultiset aggregated_traces_; 146 | 147 | static std::atomic unknown_stack_count_; 148 | }; 149 | 150 | // CPUProfiler collects CPU profiles by setting up a CPU timer and 151 | // collecting a sample each time it is triggered (via SIGPROF). 152 | class CPUProfiler : public Profiler { 153 | public: 154 | CPUProfiler(int64_t duration_nanos, int64_t period_nanos) 155 | : Profiler(duration_nanos, period_nanos) { 156 | // When a fork runs longer than the signal interval, it is interrupted by 157 | // the signal and then retries. This repeats until the profiler 158 | // thread stops sending the signal. In unlucky cases, the profiler 159 | // thread gets blocked on acquiring the memory lock, which is held by 160 | // fork. The process may thus hang for an unpredictably long time. 161 | // The fix is to block the signal for the calling thread before fork and 162 | // re-enable it after fork. The caveat is that forks will not be sampled. 163 | if (!fork_handlers_registered_) { 164 | pthread_atfork(&BlockSigprof, &UnblockSigprof, &UnblockSigprof); 165 | // Updating fork_handlers_registered_ here is not thread safe. It's 166 | // fine because the profiler is only allowed to start once, which means 167 | // that CPUProfiler is only created by a single thread. 168 | fork_handlers_registered_ = true; 169 | } 170 | } 171 | // Not copyable or assignable. 172 | CPUProfiler(const CPUProfiler &) = delete; 173 | CPUProfiler &operator=(const CPUProfiler &) = delete; 174 | 175 | // Collects profiling data. 176 | PyObject *Collect() override; 177 | 178 | private: 179 | // Initiates data collection at a fixed interval. 180 | bool Start(); 181 | 182 | // Stops data collection.
183 | void Stop(); 184 | 185 | static bool fork_handlers_registered_; 186 | }; 187 | 188 | #endif // GOOGLECLOUDPROFILER_SRC_PROFILER_H_ 189 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/stacktraces.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | #include "stacktraces.h" 16 | 17 | bool AsyncSafeTraceMultiset::Add(const CallTrace *trace) { 18 | uint64_t hash_val = CalculateHash(trace->num_frames, trace->frames); 19 | for (int64_t i = 0; i < MaxEntries(); i++) { 20 | int64_t idx = (i + hash_val) % MaxEntries(); 21 | auto &entry = traces_[idx]; 22 | int64_t count_zero = 0; 23 | entry.active_updates.fetch_add(1, std::memory_order_acquire); 24 | int64_t count = entry.count.load(std::memory_order_acquire); 25 | switch (count) { 26 | case 0: 27 | if (entry.count.compare_exchange_weak(count_zero, kTraceCountLocked, 28 | std::memory_order_relaxed)) { 29 | // This entry is reserved, there is no danger of interacting 30 | // with Extract, so decrement active_updates early. 
31 | entry.active_updates.fetch_sub(1, std::memory_order_release); 32 | 33 | // memcpy is not async safe 34 | CallFrame *fb = entry.frame_buffer; 35 | for (int frame_num = 0; frame_num < trace->num_frames; ++frame_num) { 36 | fb[frame_num].lineno = trace->frames[frame_num].lineno; 37 | fb[frame_num].py_code = trace->frames[frame_num].py_code; 38 | } 39 | entry.trace.frames = fb; 40 | entry.trace.num_frames = trace->num_frames; 41 | entry.count.store(int64_t(1), std::memory_order_release); 42 | return true; 43 | } 44 | break; 45 | case kTraceCountLocked: 46 | // This entry is being updated by another thread. Move on. 47 | // Worst case we may end with multiple entries with the same trace. 48 | break; 49 | default: 50 | if (trace->num_frames == entry.trace.num_frames && 51 | Equal(trace->num_frames, trace->frames, entry.trace.frames)) { 52 | // Bump using a compare-swap instead of fetch_add to ensure 53 | // it hasn't been locked by a thread doing Extract(). 54 | // Reload count in case it was updated while we were 55 | // examining the trace. 56 | count = entry.count.load(std::memory_order_relaxed); 57 | if (count != kTraceCountLocked && 58 | entry.count.compare_exchange_weak(count, count + 1, 59 | std::memory_order_relaxed)) { 60 | entry.active_updates.fetch_sub(1, std::memory_order_release); 61 | return true; 62 | } 63 | } 64 | } 65 | // Did nothing, but we still need storage ordering between this 66 | // store and preceding loads. 67 | entry.active_updates.fetch_sub(1, std::memory_order_release); 68 | } 69 | return false; 70 | } 71 | 72 | int AsyncSafeTraceMultiset::Extract(int location, int max_frames, 73 | CallFrame *frames, int64_t *count) { 74 | if (location < 0 || location >= MaxEntries()) { 75 | return 0; 76 | } 77 | auto &entry = traces_[location]; 78 | int64_t c = entry.count.load(std::memory_order_acquire); 79 | if (c <= 0) { 80 | // Unused or in process of being updated, skip for now. 
81 | return 0; 82 | } 83 | int num_frames = entry.trace.num_frames; 84 | if (num_frames > max_frames) { 85 | num_frames = max_frames; 86 | } 87 | 88 | c = entry.count.exchange(kTraceCountLocked, std::memory_order_acquire); 89 | 90 | for (int i = 0; i < num_frames; ++i) { 91 | frames[i].lineno = entry.trace.frames[i].lineno; 92 | frames[i].py_code = entry.trace.frames[i].py_code; 93 | } 94 | 95 | while (entry.active_updates.load(std::memory_order_acquire) != 0) { 96 | // spin 97 | // TODO: Introduce a limit to detect and break 98 | // deadlock 99 | } 100 | 101 | entry.count.store(0, std::memory_order_release); 102 | *count = c; 103 | return num_frames; 104 | } 105 | 106 | void TraceMultiset::Add(int num_frames, CallFrame *frames, int64_t count) { 107 | std::vector trace(frames, frames + num_frames); 108 | 109 | auto entry = traces_.find(trace); 110 | if (entry != traces_.end()) { 111 | entry->second += count; 112 | return; 113 | } 114 | traces_.emplace(std::move(trace), count); 115 | } 116 | 117 | int HarvestSamples(AsyncSafeTraceMultiset *from, TraceMultiset *to) { 118 | int trace_count = 0; 119 | int64_t num_traces = from->MaxEntries(); 120 | for (int64_t i = 0; i < num_traces; i++) { 121 | CallFrame frame[kMaxFramesToCapture]; 122 | int64_t count; 123 | 124 | int num_frames = from->Extract(i, kMaxFramesToCapture, &frame[0], &count); 125 | if (num_frames > 0 && count > 0) { 126 | ++trace_count; 127 | to->Add(num_frames, &frame[0], count); 128 | } 129 | } 130 | return trace_count; 131 | } 132 | 133 | uint64_t CalculateHash(int num_frames, const CallFrame *frame) { 134 | uint64_t h = 0; 135 | for (int i = 0; i < num_frames; i++) { 136 | h += static_cast(frame[i].lineno); 137 | h += h << 10; 138 | h ^= h >> 6; 139 | h += reinterpret_cast(frame[i].py_code); 140 | h += h << 10; 141 | h ^= h >> 6; 142 | } 143 | h += h << 3; 144 | h ^= h >> 11; 145 | return h; 146 | } 147 | 148 | bool Equal(int num_frames, const CallFrame *f1, const CallFrame *f2) { 149 | for (int i = 0; 
i < num_frames; i++) { 150 | if (f1[i].lineno != f2[i].lineno || f1[i].py_code != f2[i].py_code) { 151 | return false; 152 | } 153 | } 154 | return true; 155 | } 156 | -------------------------------------------------------------------------------- /googlecloudprofiler/src/stacktraces.h: -------------------------------------------------------------------------------- 1 | // Copyright 2018 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | #ifndef GOOGLECLOUDPROFILER_SRC_STACKTRACES_H_ 16 | #define GOOGLECLOUDPROFILER_SRC_STACKTRACES_H_ 17 | 18 | #include 19 | #include 20 | 21 | #include 22 | #include // NOLINT 23 | #include 24 | #include 25 | 26 | typedef struct { 27 | int lineno; 28 | PyCodeObject *py_code; 29 | } CallFrame; 30 | 31 | typedef struct { 32 | int num_frames; 33 | CallFrame *frames; 34 | } CallTrace; 35 | 36 | enum CallTraceErrors { 37 | kUnknown = 0, 38 | kNoPyState = -1, 39 | }; 40 | 41 | // Maximum number of frames to store from the stack traces sampled. 42 | const int kMaxFramesToCapture = 128; 43 | 44 | uint64_t CalculateHash(int num_frames, const CallFrame *frame); 45 | bool Equal(int num_frames, const CallFrame *f1, const CallFrame *f2); 46 | 47 | // Multiset of stack traces. 
There is a maximum number of distinct 48 | // traces that can be held, returned by MaxEntries(). 49 | // 50 | // The Add() operation is async-safe, but will fail and return false 51 | // if there is no room to store the trace. 52 | // 53 | // The Extract() operation will remove a specific entry, and it can 54 | // run concurrently with multiple Add() operations. Multiple 55 | // invocations of Extract() cannot be executed concurrently. 56 | // 57 | // The synchronization is implemented by using a sentinel count value 58 | // to reserve entries. Add() will reserve the first available entry, 59 | // save the stack frame, and then release the entry for other calls to 60 | // Add() or Extract(). Extract() will reserve the entry, wait until no 61 | // additions are in progress, and then release the entry to be reused 62 | // by a subsequent call to Add(). It is important for Extract() to 63 | // wait until no additions are in progress to avoid releasing the 64 | // entry while another thread is inspecting it. 65 | class AsyncSafeTraceMultiset { 66 | public: 67 | AsyncSafeTraceMultiset() { Reset(); } 68 | // Not copyable or assignable. 69 | AsyncSafeTraceMultiset(const AsyncSafeTraceMultiset &) = delete; 70 | AsyncSafeTraceMultiset &operator=(const AsyncSafeTraceMultiset &) = delete; 71 | 72 | void Reset() { memset(traces_, 0, sizeof(traces_)); } 73 | 74 | // Adds a trace to the set. If it is already present, increments its 75 | // count. This operation is thread safe and async safe. 76 | bool Add(const CallTrace *trace); 77 | 78 | // Extracts a trace from the array. frames must point to at least 79 | // max_frames contiguous frames. It will return the number of frames 80 | // written starting at frames[0], up to max_frames. Returns 0 if 81 | // there is no valid trace at this location. This operation is 82 | // thread safe with respect to Add() but only a single call to 83 | // Extract can be done at a time.
84 | int Extract(int location, int max_frames, CallFrame *frames, int64_t *count); 85 | 86 | int64_t MaxEntries() const { return kMaxStackTraces; } 87 | 88 | private: 89 | struct TraceData { 90 | // trace contains the frame count and a pointer to the frames. The 91 | // frames are stored in frame_buffer. 92 | CallTrace trace; 93 | // frame_buffer is the storage for stack frames. 94 | CallFrame frame_buffer[kMaxFramesToCapture]; 95 | // Number of times a trace has been encountered. 96 | // 0 indicates that the trace is unused, 97 | // <0 values are reserved, used for concurrency control. 98 | std::atomic<int64_t> count; 99 | // Number of active attempts to increase the counter on the trace. 100 | std::atomic active_updates; 101 | }; 102 | 103 | // TODO: Re-evaluate kMaxStackTraces, to minimize storage 104 | // consumption while maintaining good performance and avoiding 105 | // overflow. 106 | static const int kMaxStackTraces = 2048; 107 | 108 | // Sentinel to use as trace count while the frames are being updated. 109 | static const int64_t kTraceCountLocked = -1; 110 | 111 | TraceData traces_[kMaxStackTraces]; 112 | }; 113 | 114 | // TraceMultiset implements a growable multi-set of traces. It is not 115 | // thread or async safe. It is intended to be used to aggregate traces 116 | // collected atomically from AsyncSafeTraceMultiset, which implements 117 | // async and thread safe add/extract methods, but has fixed maximum 118 | // size.
119 | class TraceMultiset { 120 | private: 121 | struct TraceHash { 122 | std::size_t operator()(const std::vector<CallFrame> &t) const { 123 | return CalculateHash(t.size(), t.data()); 124 | } 125 | }; 126 | 127 | struct TraceEqual { 128 | bool operator()(const std::vector<CallFrame> &t1, 129 | const std::vector<CallFrame> &t2) const { 130 | if (t1.size() != t2.size()) { 131 | return false; 132 | } 133 | return Equal(t1.size(), t1.data(), t2.data()); 134 | } 135 | }; 136 | 137 | typedef std::unordered_map<std::vector<CallFrame>, uint64_t, TraceHash, 138 | TraceEqual> 139 | CountMap; 140 | 141 | public: 142 | TraceMultiset() {} 143 | // Not copyable or assignable. 144 | TraceMultiset(const TraceMultiset &) = delete; 145 | TraceMultiset &operator=(const TraceMultiset &) = delete; 146 | 147 | // Add a trace to the array. If it is already in the array, 148 | // increment its count. 149 | void Add(int num_frames, CallFrame *frames, int64_t count); 150 | 151 | typedef CountMap::iterator iterator; 152 | typedef CountMap::const_iterator const_iterator; 153 | 154 | iterator begin() { return traces_.begin(); } 155 | iterator end() { return traces_.end(); } 156 | 157 | const_iterator begin() const { return const_iterator(traces_.begin()); } 158 | const_iterator end() const { return const_iterator(traces_.end()); } 159 | 160 | iterator erase(iterator it) { return traces_.erase(it); } 161 | 162 | void Clear() { traces_.clear(); } 163 | 164 | private: 165 | CountMap traces_; 166 | }; 167 | 168 | // HarvestSamples extracts traces from an asyncsafe trace multiset 169 | // and copies them into a trace multiset. It returns the number of samples 170 | // that were copied. This is thread-safe with respect to other threads adding 171 | // samples into the asyncsafe set.
172 | int HarvestSamples(AsyncSafeTraceMultiset *from, TraceMultiset *to); 173 | 174 | #endif // GOOGLECLOUDPROFILER_SRC_STACKTRACES_H_ 175 | -------------------------------------------------------------------------------- /kokoro/common.cfg: -------------------------------------------------------------------------------- 1 | # Copyright 2019 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | before_action { 16 | fetch_keystore { 17 | keystore_resource { 18 | keystore_config_id: 72935 19 | keyname: "cloud-profiler-e2e-service-account-key" 20 | } 21 | } 22 | } 23 | -------------------------------------------------------------------------------- /kokoro/continuous.cfg: -------------------------------------------------------------------------------- 1 | # Copyright 2019 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | 15 | # The path to the build script including the github_scm.name directory. 16 | build_file: "cloud-profiler-python/kokoro/integration_test.sh" 17 | -------------------------------------------------------------------------------- /kokoro/integration_test.go: -------------------------------------------------------------------------------- 1 | // Copyright 2019 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | package e2e 16 | 17 | import ( 18 | "bytes" 19 | "flag" 20 | "fmt" 21 | "os" 22 | "runtime" 23 | "strings" 24 | "testing" 25 | "text/template" 26 | "time" 27 | 28 | "cloud.google.com/go/profiler/proftest" 29 | "golang.org/x/net/context" 30 | "golang.org/x/oauth2/google" 31 | 32 | compute "google.golang.org/api/compute/v1" 33 | ) 34 | 35 | var ( 36 | gcsLocation = flag.String("gcs_location", "", "GCS location for the agent") 37 | runBackoffTest = flag.Bool("run_backoff_test", false, "Enables the backoff integration test. 
This integration test requires over 45 mins to run, so it is not run by default.") 38 | runID = strings.ToLower(strings.Replace(time.Now().Format("2006-01-02-15-04-05.000000-MST"), ".", "-", -1)) 39 | benchFinishString = "benchmark application(s) complete" 40 | errorString = "failed to set up or run the benchmark" 41 | ) 42 | 43 | const ( 44 | cloudScope = "https://www.googleapis.com/auth/cloud-platform" 45 | storageReadScope = "https://www.googleapis.com/auth/devstorage.read_only" 46 | defaultGetPipURL = "https://bootstrap.pypa.io/get-pip.py" 47 | 48 | gceBenchDuration = 600 * time.Second 49 | gceTestTimeout = 20 * time.Minute 50 | 51 | // For any agents to receive backoff, there must be more than 32 agents in 52 | // the deployment. The initial backoff received will be 33 minutes; each 53 | // subsequent backoff will be one minute longer. Running 45 benchmarks for 54 | // 45 minutes will ensure that several agents receive backoff responses and 55 | // are able to wait for the backoff duration then send another request. 56 | numBackoffBenchmarks = 45 57 | backoffBenchDuration = 45 * time.Minute 58 | backoffTestTimeout = 60 * time.Minute 59 | ) 60 | 61 | const startupTemplate = ` 62 | {{ define "setup"}} 63 | 64 | # Install dependencies. 65 | {{if .InstallPythonVersion}} 66 | # ppa:deadsnakes/ppa contains desired Python versions. 67 | retry add-apt-repository -y ppa:deadsnakes/ppa >/dev/null 68 | {{end}} 69 | # Force IPv4 to prevent long IPv6 timeouts. 70 | # TODO : Validate this solves the issue. Remove if not. 71 | retry apt-get -o Acquire::ForceIPv4=true update >/dev/null 72 | retry apt-get -o Acquire::ForceIPv4=true install -yq git build-essential python3-distutils {{.PythonDev}} {{if .InstallPythonVersion}}{{.InstallPythonVersion}}{{end}} >/dev/ttyS2 73 | # Print current Python version. 74 | {{.PythonCommand}} --version 75 | # Distutils need to be installed separately when explicitly testing various 76 | # Python versions. 
77 | {{if .InstallPythonVersion}} 78 | retry apt-get -yq install {{.InstallPythonVersion}}-distutils 79 | {{end}} 80 | 81 | # Install Python dependencies. 82 | retry wget -O /tmp/get-pip.py {{.GetPipURL}} >/dev/null 83 | retry {{.PythonCommand}} /tmp/get-pip.py >/dev/null 84 | retry {{.PythonCommand}} -m pip install --upgrade pyasn1 >/dev/null 85 | 86 | # Setup pipenv 87 | retry {{.PythonCommand}} -m pip install pipenv > /dev/null 88 | mkdir bench && cd bench 89 | retry pipenv install > /dev/null 90 | 91 | 92 | # Fetch agent. 93 | mkdir /tmp/agent 94 | retry gsutil cp gs://{{.GCSLocation}}/* /tmp/agent 95 | 96 | # Install agent. 97 | retry pipenv run {{.PythonCommand}} -m pip install --ignore-installed "$(find /tmp/agent -name "google_cloud_profiler*")" 98 | 99 | # Write bench app. 100 | export BENCH_DIR="$HOME/bench" 101 | 102 | cat << EOF > bench.py 103 | import googlecloudprofiler 104 | import sys 105 | import time 106 | import traceback 107 | 108 | def python_bench(): 109 | for counter in range(1, 5000): 110 | pass 111 | 112 | def repeat_bench(dur_sec): 113 | t_end = time.time() + dur_sec 114 | while time.time() < t_end or dur_sec == 0: 115 | python_bench() 116 | 117 | if __name__ == '__main__': 118 | if not {{.VersionCheck}}: 119 | raise EnvironmentError('Python version %s failed to satisfy "{{.VersionCheck}}".' % str(sys.version_info)) 120 | 121 | try: 122 | googlecloudprofiler.start( 123 | service='{{.Service}}', 124 | service_version='1.0.0', 125 | verbose=3) 126 | except BaseException: 127 | sys.exit('Failed to start the profiler: %s' % traceback.format_exc()) 128 | repeat_bench({{.DurationSec}}) 129 | EOF 130 | 131 | {{- end }} 132 | 133 | {{ define "integration" -}} 134 | {{- template "prologue" . }} 135 | {{- template "setup" . }} 136 | 137 | # Run bench app. 138 | pipenv run {{.PythonCommand}} bench.py 139 | 140 | # Indicate to test that script has finished running. 141 | echo "{{.FinishString}}" 142 | 143 | {{ template "epilogue" . 
-}} 144 | {{end}} 145 | 146 | {{ define "integration_backoff" -}} 147 | {{- template "prologue" . }} 148 | {{- template "setup" . }} 149 | 150 | # Do not display commands being run to simplify logging output. 151 | set +x 152 | 153 | echo "Starting {{.NumBackoffBenchmarks}} benchmarks." 154 | for (( i = 0; i < {{.NumBackoffBenchmarks}}; i++ )); do 155 | (pipenv run {{.PythonCommand}} bench.py) |& while read line; \ 156 | do echo "benchmark $i: ${line}"; done & 157 | done 158 | echo "Successfully started {{.NumBackoffBenchmarks}} benchmarks." 159 | 160 | wait 161 | 162 | # Continue displaying commands being run. 163 | set -x 164 | 165 | echo "{{.FinishString}}" 166 | 167 | {{ template "epilogue" . -}} 168 | {{ end }} 169 | ` 170 | 171 | type testCase struct { 172 | proftest.InstanceConfig 173 | name string 174 | // Python version to install. Empty string means no installation is needed. 175 | installPythonVersion string 176 | // Python command name, e.g. "python" or "python3". 177 | pythonCommand string 178 | // The python-dev package to install, e.g. "python-dev" or "python3.5-dev". 179 | pythonDev string 180 | // Used in the bench code to check the Python version, e.g. 181 | // "sys.version_info[:2] == (2, 7)". 182 | versionCheck string 183 | // URL of the get-pip.py script, defaults to 184 | // https://bootstrap.pypa.io/get-pip.py when not specified. 185 | getPipURL string 186 | // Timeout for the integration test. 187 | timeout time.Duration 188 | // When true, a backoff test should be run. Otherwise, run a standard 189 | // integration test. 190 | backoffTest bool 191 | // Duration for which benchmark application should run. 192 | benchDuration time.Duration 193 | // Maps profile type to function name wanted for that type. Empty function 194 | // name means the type should not be profiled. Only used when backoffTest is 195 | // false.
196 | wantProfiles map[string]string 197 | } 198 | 199 | func (tc *testCase) initializeStartUpScript(template *template.Template) error { 200 | params := struct { 201 | Service string 202 | GCSLocation string 203 | InstallPythonVersion string 204 | PythonCommand string 205 | PythonDev string 206 | VersionCheck string 207 | GetPipURL string 208 | FinishString string 209 | ErrorString string 210 | DurationSec int 211 | NumBackoffBenchmarks int 212 | }{ 213 | Service: tc.name, 214 | GCSLocation: *gcsLocation, 215 | InstallPythonVersion: tc.installPythonVersion, 216 | PythonCommand: tc.pythonCommand, 217 | PythonDev: tc.pythonDev, 218 | VersionCheck: tc.versionCheck, 219 | GetPipURL: tc.getPipURL, 220 | FinishString: benchFinishString, 221 | ErrorString: errorString, 222 | DurationSec: int(tc.benchDuration.Seconds()), 223 | } 224 | 225 | testTemplate := "integration" 226 | if tc.backoffTest { 227 | testTemplate = "integration_backoff" 228 | params.NumBackoffBenchmarks = numBackoffBenchmarks 229 | } 230 | 231 | var buf bytes.Buffer 232 | err := template.Lookup(testTemplate).Execute(&buf, params) 233 | if err != nil { 234 | return fmt.Errorf("failed to render startup script for %s: %v", tc.name, err) 235 | } 236 | tc.StartupScript = buf.String() 237 | return nil 238 | } 239 | 240 | func TestAgentIntegration(t *testing.T) { 241 | projectID := os.Getenv("GCLOUD_TESTS_PYTHON_PROJECT_ID") 242 | if projectID == "" { 243 | t.Fatalf("Getenv(GCLOUD_TESTS_PYTHON_PROJECT_ID) got empty string") 244 | } 245 | 246 | zone := os.Getenv("GCLOUD_TESTS_PYTHON_ZONE") 247 | if zone == "" { 248 | t.Fatalf("Getenv(GCLOUD_TESTS_PYTHON_ZONE) got empty string") 249 | } 250 | 251 | if *gcsLocation == "" { 252 | t.Fatal("gcsLocation flag is not set") 253 | } 254 | 255 | ctx := context.Background() 256 | 257 | client, err := google.DefaultClient(ctx, cloudScope) 258 | if err != nil { 259 | t.Fatalf("failed to get default client: %v", err) 260 | } 261 | 262 | computeService, err := 
compute.New(client) 263 | if err != nil { 264 | t.Fatalf("failed to initialize compute Service: %v", err) 265 | } 266 | template, err := proftest.BaseStartupTmpl.Parse(startupTemplate) 267 | if err != nil { 268 | t.Fatalf("failed to parse startup script template: %v", err) 269 | } 270 | 271 | gceTr := proftest.GCETestRunner{ 272 | TestRunner: proftest.TestRunner{ 273 | Client: client, 274 | }, 275 | ComputeService: computeService, 276 | } 277 | 278 | testcases := generateTestCases(projectID, zone) 279 | 280 | // Allow test cases to run in parallel. 281 | runtime.GOMAXPROCS(len(testcases)) 282 | 283 | for _, tc := range testcases { 284 | tc := tc // capture range variable 285 | t.Run(tc.name, func(t *testing.T) { 286 | t.Parallel() 287 | if err := tc.initializeStartUpScript(template); err != nil { 288 | t.Fatalf("failed to initialize startup script: %v", err) 289 | } 290 | 291 | gceTr.StartInstance(ctx, &tc.InstanceConfig) 292 | defer func() { 293 | if err := gceTr.DeleteInstance(ctx, &tc.InstanceConfig); err != nil { 294 | t.Fatalf("failed to delete instance: %v", err) 295 | } 296 | }() 297 | 298 | timeoutCtx, cancel := context.WithTimeout(ctx, tc.timeout) 299 | defer cancel() 300 | output, err := gceTr.PollAndLogSerialPort(timeoutCtx, &tc.InstanceConfig, benchFinishString, errorString, t.Logf) 301 | if err != nil { 302 | t.Fatal(err) 303 | } 304 | 305 | if tc.backoffTest { 306 | if err := proftest.CheckSerialOutputForBackoffs(output, numBackoffBenchmarks, "generic::aborted: action throttled, backoff for", "Starting to create profile", "benchmark"); err != nil { 307 | t.Errorf("failed to check serial output for backoffs: %v", err) 308 | } 309 | return 310 | } 311 | 312 | timeNow := time.Now() 313 | endTime := timeNow.Format(time.RFC3339) 314 | startTime := timeNow.Add(-1 * time.Hour).Format(time.RFC3339) 315 | for pType, function := range tc.wantProfiles { 316 | pr, err := gceTr.TestRunner.QueryProfilesWithZone(tc.ProjectID, tc.name, startTime, endTime, pType, zone) 317 |
if function == "" { 318 | if err == nil { 319 | t.Errorf("QueryProfilesWithZone(%s, %s, %s, %s, %s, %s) got profile, want no profile", tc.ProjectID, tc.name, startTime, endTime, pType, zone) 320 | } 321 | continue 322 | } 323 | 324 | if err != nil { 325 | t.Errorf("QueryProfilesWithZone(%s, %s, %s, %s, %s, %s) got error: %v", tc.ProjectID, tc.name, startTime, endTime, pType, zone, err) 326 | continue 327 | } 328 | 329 | if err := pr.HasFunction(function); err != nil { 330 | t.Errorf("Function %s not found in profiles of type %s: %v", function, pType, err) 331 | } 332 | } 333 | }) 334 | } 335 | } 336 | 337 | func generateTestCases(projectID, zone string) []testCase { 338 | tcs := []testCase{ 339 | // Test GCE Ubuntu default Python 3, expect Python 3.10. 340 | { 341 | InstanceConfig: proftest.InstanceConfig{ 342 | ProjectID: projectID, 343 | Zone: zone, 344 | Name: fmt.Sprintf("profiler-test-python3-%s", runID), 345 | MachineType: "n1-standard-1", 346 | ImageProject: "ubuntu-os-cloud", 347 | ImageFamily: "ubuntu-2204-lts", 348 | Scopes: []string{storageReadScope}, 349 | }, 350 | name: fmt.Sprintf("profiler-test-python3-%s-gce", runID), 351 | wantProfiles: map[string]string{ 352 | "WALL": "repeat_bench", 353 | "CPU": "repeat_bench", 354 | }, 355 | pythonCommand: "python3", 356 | pythonDev: "python3-dev", 357 | versionCheck: "sys.version_info[:2] == (3, 10)", 358 | getPipURL: defaultGetPipURL, 359 | timeout: gceTestTimeout, 360 | benchDuration: gceBenchDuration, 361 | }, 362 | } 363 | 364 | for _, minorVersion := range []int{7, 8, 9, 10, 11} { 365 | getPipURL := defaultGetPipURL 366 | // TODO: remove special case once 3.7 is dropped 367 | if minorVersion == 7 { 368 | getPipURL = "https://bootstrap.pypa.io/pip/3.7/get-pip.py" 369 | } 370 | 371 | tcs = append(tcs, testCase{ 372 | InstanceConfig: proftest.InstanceConfig{ 373 | ProjectID: projectID, 374 | Zone: zone, 375 | Name: fmt.Sprintf("profiler-test-python3%d-%s", minorVersion, runID), 376 | MachineType: "n1-standard-1", 377 |
ImageProject: "ubuntu-os-cloud", 378 | ImageFamily: "ubuntu-2204-lts", 379 | Scopes: []string{storageReadScope}, 380 | }, 381 | name: fmt.Sprintf("profiler-test-python3%d-%s-gce", minorVersion, runID), 382 | wantProfiles: map[string]string{ 383 | "WALL": "repeat_bench", 384 | "CPU": "repeat_bench", 385 | }, 386 | installPythonVersion: fmt.Sprintf("python3.%d", minorVersion), 387 | pythonCommand: fmt.Sprintf("python3.%d", minorVersion), 388 | getPipURL: getPipURL, 389 | pythonDev: fmt.Sprintf("python3.%d-dev", minorVersion), 390 | versionCheck: fmt.Sprintf("sys.version_info[:2] >= (3, %d)", minorVersion), 391 | timeout: gceTestTimeout, 392 | benchDuration: gceBenchDuration, 393 | }) 394 | } 395 | 396 | if *runBackoffTest { 397 | tcs = append(tcs, testCase{ 398 | // Test GCE Ubuntu default Python 3, expect Python 3.10. 399 | InstanceConfig: proftest.InstanceConfig{ 400 | ProjectID: projectID, 401 | Zone: zone, 402 | Name: fmt.Sprintf("profiler-test-python3-backoff-%s", runID), 403 | ImageProject: "ubuntu-os-cloud", 404 | ImageFamily: "ubuntu-2204-lts", 405 | Scopes: []string{storageReadScope}, 406 | 407 | // Running many copies of the benchmark requires more 408 | // memory than is available on an n1-standard-1. Use a 409 | // machine type with more memory for backoff test. 
410 | MachineType: "n1-highmem-2", 411 | }, 412 | name: fmt.Sprintf("profiler-test-python3-backoff-%s-gce", runID), 413 | wantProfiles: map[string]string{ 414 | "WALL": "repeat_bench", 415 | "CPU": "repeat_bench", 416 | }, 417 | pythonCommand: "python3", 418 | pythonDev: "python3-dev", 419 | getPipURL: defaultGetPipURL, 420 | versionCheck: "sys.version_info[:2] == (3, 10)", 421 | timeout: backoffTestTimeout, 422 | benchDuration: backoffBenchDuration, 423 | backoffTest: true, 424 | }) 425 | } 426 | 427 | return tcs 428 | } 429 | -------------------------------------------------------------------------------- /kokoro/integration_test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # 3 | # Copyright 2019 Google LLC 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # https://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | 17 | retry() { 18 | for i in {1..3}; do 19 | [[ $i == 1 ]] || sleep 10 # Backing off after a failed attempt. 20 | "${@}" && return 0 21 | done 22 | return 1 23 | } 24 | 25 | # Fail on any error. 26 | set -eo pipefail 27 | 28 | # Display commands being run. 29 | set -x 30 | 31 | cd $(dirname $0)/.. 32 | 33 | export GCLOUD_TESTS_PYTHON_PROJECT_ID="cloud-profiler-e2e" 34 | 35 | export GCLOUD_TESTS_PYTHON_ZONE="us-west3-b" 36 | 37 | export GOOGLE_APPLICATION_CREDENTIALS="${KOKORO_KEYSTORE_DIR}/72935_cloud-profiler-e2e-service-account-key" 38 | 39 | # Package the agent and upload to GCS. 
40 | retry python3 -m pip install --user --upgrade setuptools wheel twine 41 | python3 setup.py sdist 42 | AGENT_PATH=$(find "$PWD/dist" -name "google_cloud_profiler*") 43 | GCS_LOCATION="cprof-e2e-artifacts/python/kokoro/${KOKORO_JOB_TYPE}/${KOKORO_BUILD_NUMBER}" 44 | retry gcloud auth activate-service-account --key-file="${GOOGLE_APPLICATION_CREDENTIALS}" 45 | retry gsutil cp "${AGENT_PATH}" "gs://${GCS_LOCATION}/" 46 | 47 | # Run test. 48 | cd "kokoro" 49 | 50 | # Backoff test should not be run on presubmit. 51 | RUN_BACKOFF_TEST="true" 52 | if [[ "$KOKORO_JOB_TYPE" == "PRESUBMIT_GITHUB" ]]; then 53 | RUN_BACKOFF_TEST="false" 54 | fi 55 | 56 | # Pull in newer version of Go than provided by Kokoro image 57 | go version 58 | GO_VERSION="1.22.4" 59 | retry curl -LO https://go.dev/dl/go${GO_VERSION}.linux-amd64.tar.gz 60 | sudo rm -rf /usr/local/go && tar -C /usr/local -xzf go${GO_VERSION}.linux-amd64.tar.gz 61 | export PATH=$PATH:/usr/local/go/bin 62 | go version 63 | 64 | # Initializing go modules allows our dependencies to install versions of their 65 | # dependencies specified by their go.mod files. This reduces the likelihood of 66 | # dependencies breaking this test. 67 | go mod init e2e 68 | 69 | # Compile test before running to download dependencies. 70 | retry go get cloud.google.com/go/profiler/proftest@main 71 | retry go test -c 72 | ./e2e.test -gcs_location="${GCS_LOCATION}" -run_backoff_test=$RUN_BACKOFF_TEST 73 | 74 | # Exit with success code if no need to release the agent. 75 | if [[ "$KOKORO_JOB_TYPE" != "RELEASE" ]]; then 76 | exit 0 77 | fi 78 | 79 | # Release the agent to PyPI. 
80 | PYPI_PASSWORD="$(cat "$KOKORO_KEYSTORE_DIR"/72935_pypi-google-cloud-profiler-team-password)" 81 | cat >~/.pypirc <=1.0.0', 31 | 'google-auth-httplib2', 32 | 'protobuf>=3.20', 33 | 'requests', 34 | ] 35 | 36 | ext_module = [ 37 | Extension( 38 | 'googlecloudprofiler._profiler', 39 | sources=glob.glob('googlecloudprofiler/src/*.cc'), 40 | include_dirs=['googlecloudprofiler/src'], 41 | language='c++', 42 | extra_compile_args=['-std=c++11'], 43 | extra_link_args=[ 44 | '-std=c++11', 45 | '-static-libstdc++', 46 | # While libgcc_s.so.1 is pretty much always installed by default 47 | # for non-Alpine linux, it is not installed by default in Alpine. 48 | # So, to support Alpine, we will always statically link "libgcc" 49 | # package. We could alternatively require users to install the 50 | # "libgcc" package, but the static linkage seems less 51 | # invasive. 52 | '-static-libgcc' 53 | ]) 54 | ] 55 | 56 | if not (sys.platform.startswith('linux') or sys.platform.startswith('darwin')): 57 | print( 58 | sys.platform, 'is not a supported operating system.\n' 59 | 'Profiler Python agent modules will be installed but will not ' 60 | 'be functional. Refer to the documentation for a list of ' 61 | 'supported operating systems.\n') 62 | ext_module = [] 63 | 64 | if sys.platform.startswith('darwin'): 65 | print( 66 | 'Profiler Python agent has limited support for ', sys.platform, '. ' 67 | 'Wall profiler is available with supported Python versions. ' 68 | 'CPU profiler is not available. 
' 69 | 'Refer to the documentation for a list of supported operating ' 70 | 'systems and Python versions.\n') 71 | ext_module = [] 72 | 73 | 74 | def get_version(): 75 | """Read the version from __version__.py.""" 76 | 77 | with open('googlecloudprofiler/__version__.py') as fp: 78 | # Do not handle exceptions from open() so setup will fail when it cannot 79 | # open the file. 80 | line = fp.read() 81 | version = re.search(r"^__version__ = '([0-9]+\.[0-9]+(\.[0-9]+)?-?.*)'", 82 | line, re.M) 83 | if version: 84 | return version.group(1) 85 | 86 | raise RuntimeError( 87 | 'Cannot determine version from googlecloudprofiler/__version__.py.') 88 | 89 | 90 | setup( 91 | name='google-cloud-profiler', 92 | description='Google Cloud Profiler Python Agent', 93 | long_description=open('README.md').read(), 94 | long_description_content_type='text/markdown', 95 | url='https://github.com/GoogleCloudPlatform/cloud-profiler-python', 96 | author='Google LLC', 97 | version=get_version(), 98 | install_requires=install_requires, 99 | setup_requires=['wheel'], 100 | packages=['googlecloudprofiler'], 101 | ext_modules=ext_module, 102 | license='Apache License, Version 2.0', 103 | keywords='google cloud profiler', 104 | classifiers=[ 105 | 'Development Status :: 5 - Production/Stable', 106 | 'Intended Audience :: Developers', 107 | 'License :: OSI Approved :: Apache Software License', 108 | 'Programming Language :: Python :: 3.7', 109 | 'Programming Language :: Python :: 3.8', 110 | 'Programming Language :: Python :: 3.9', 111 | 'Programming Language :: Python :: 3.10', 112 | 'Programming Language :: Python :: 3.11', 113 | ], 114 | ) 115 | --------------------------------------------------------------------------------