├── .editorconfig ├── .github ├── ISSUE_TEMPLATE │ ├── bug_report.md │ ├── custom.md │ └── feature_request.md ├── pull_request_template.md └── workflows │ ├── greetings.yml │ └── pylint.yml ├── .gitignore ├── .gitmodules ├── CITATION.cff ├── LICENSE ├── README.md ├── data_save └── funcs.json ├── doc ├── api-key.png ├── generate.png ├── how-it-works.png ├── result.gif ├── result.png └── trace.png ├── gpttrace ├── GPTtrace.py ├── __init__.py ├── __main__.py ├── bpftrace.py ├── cmd.py ├── config.py ├── examples.py ├── execute.py ├── prompt.py └── utils │ ├── __init__.py │ └── common.py ├── install.sh ├── pyproject.toml ├── requirements.txt └── tools ├── bashreadline.bt ├── bashreadline_example.txt ├── biolatency-kp.bt ├── biolatency.bt ├── biolatency_example.txt ├── biosnoop.bt ├── biosnoop_example.txt ├── biostacks.bt ├── biostacks_example.txt ├── bitesize.bt ├── bitesize_example.txt ├── capable.bt ├── capable_example.txt ├── cpuwalk.bt ├── cpuwalk_example.txt ├── dcsnoop.bt ├── dcsnoop_example.txt ├── examples.json ├── execsnoop.bt ├── execsnoop_example.txt ├── generate.py ├── gethostlatency.bt ├── gethostlatency_example.txt ├── killsnoop.bt ├── killsnoop_example.txt ├── loads.bt ├── loads_example.txt ├── mdflush.bt ├── mdflush_example.txt ├── naptime.bt ├── naptime_example.txt ├── oomkill.bt ├── oomkill_example.txt ├── opensnoop.bt ├── opensnoop_example.txt ├── output.json ├── pidpersec.bt ├── pidpersec_example.txt ├── runqlat.bt ├── runqlat_example.txt ├── runqlen.bt ├── runqlen_example.txt ├── setuids.bt ├── setuids_example.txt ├── ssllatency.bt ├── ssllatency_example.txt ├── sslsnoop.bt ├── sslsnoop_example.txt ├── statsnoop.bt ├── statsnoop_example.txt ├── swapin.bt ├── swapin_example.txt ├── syncsnoop.bt ├── syncsnoop_example.txt ├── syscount.bt ├── syscount_example.txt ├── tcpaccept.bt ├── tcpaccept_example.txt ├── tcpconnect.bt ├── tcpconnect_example.txt ├── tcpdrop.bt ├── tcpdrop_example.txt ├── tcplife.bt ├── tcplife_example.txt ├── tcpretrans.bt ├── 
tcpretrans_example.txt ├── tcpsynbl.bt ├── tcpsynbl_example.txt ├── threadsnoop.bt ├── threadsnoop_example.txt ├── undump.bt ├── undump_example.txt ├── vfscount.bt ├── vfscount_example.txt ├── vfsstat.bt ├── vfsstat_example.txt ├── writeback.bt ├── writeback_example.txt ├── xfsdist.bt └── xfsdist_example.txt /.editorconfig: -------------------------------------------------------------------------------- 1 | # EditorConfig is awesome: http://EditorConfig.org 2 | 3 | # top-most EditorConfig file 4 | root = true 5 | 6 | # Unix-style newlines with a newline ending every file 7 | [*] 8 | end_of_line = lf 9 | insert_final_newline = true 10 | trim_trailing_whitespace = true 11 | charset = utf-8 12 | 13 | # 4 space indentation 14 | [*.{py,java,r,R}] 15 | indent_style = space 16 | indent_size = 4 17 | 18 | # 2 space indentation 19 | [*.{js,json,y{a,}ml,html,cwl}] 20 | indent_style = space 21 | indent_size = 2 22 | 23 | [*.{md,Rmd,rst}] 24 | trim_trailing_whitespace = false 25 | indent_style = space 26 | indent_size = 2 27 | 28 | [*.py] 29 | skip = build,.tox,.venv 30 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help me improve 4 | title: "[BUG]" 5 | labels: bug 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 16 | 1. Go to '...' 17 | 2. Click on '....' 18 | 3. Scroll down to '....' 19 | 4. See error 20 | 21 | **Expected behavior** 22 | A clear and concise description of what you expected to happen. 23 | 24 | **Screenshots** 25 | If applicable, add screenshots to help explain your problem. 26 | 27 | **Desktop (please complete the following information):** 28 | 29 | * OS: [e.g. Windows] 30 | * Version [e.g. 
10] 31 | 32 | **Additional context** 33 | Add any other context about the problem here. 34 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/custom.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Custom issue template 3 | about: Describe this issue template's purpose here. 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | 11 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: "[FEATURE]" 5 | labels: enhancement 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated 12 | when [...] 13 | 14 | **Describe the solution you'd like** 15 | A clear and concise description of what you want to happen. 16 | 17 | **Describe alternatives you've considered** 18 | A clear and concise description of any alternative solutions or features you've considered. 19 | 20 | **Provide usage examples** 21 | A few examples of how the feature should be used. Please make sure they are clear 22 | and concise. 23 | 24 | **Additional context** 25 | Add any other context or screenshots about the feature request here. 26 | -------------------------------------------------------------------------------- /.github/pull_request_template.md: -------------------------------------------------------------------------------- 1 | # Pull Request Template 2 | 3 | ## Description 4 | 5 | Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. 
6 | 7 | Fixes # (issue) 8 | 9 | ## Type of change 10 | 11 | Please delete options that are not relevant. 12 | 13 | - [ ] Bug fix (non-breaking change which fixes an issue) 14 | - [ ] New feature (non-breaking change which adds functionality) 15 | - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) 16 | - [ ] This change requires a documentation update 17 | 18 | ## How Has This Been Tested? 19 | 20 | Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration 21 | 22 | - [ ] Test A 23 | - [ ] Test B 24 | 25 | **Test Configuration**: 26 | 27 | - Firmware version: 28 | - Hardware: 29 | - Toolchain: 30 | - SDK: 31 | 32 | ## Checklist 33 | 34 | - [ ] My code follows the style guidelines of this project 35 | - [ ] I have performed a self-review of my own code 36 | - [ ] I have commented my code, particularly in hard-to-understand areas 37 | - [ ] I have made corresponding changes to the documentation 38 | - [ ] My changes generate no new warnings 39 | - [ ] I have added tests that prove my fix is effective or that my feature works 40 | - [ ] New and existing unit tests pass locally with my changes 41 | - [ ] Any dependent changes have been merged and published in downstream modules 42 | - [ ] I have checked my code and corrected any misspellings 43 | -------------------------------------------------------------------------------- /.github/workflows/greetings.yml: -------------------------------------------------------------------------------- 1 | name: Greetings 2 | 3 | on: [pull_request_target, issues] 4 | 5 | jobs: 6 | greeting: 7 | runs-on: ubuntu-latest 8 | permissions: 9 | issues: write 10 | pull-requests: write 11 | steps: 12 | - uses: actions/first-interaction@v1 13 | with: 14 | repo-token: ${{ secrets.GITHUB_TOKEN }} 15 | issue-message: 'Thanks for using eunomia-bpf! 
We appreciate your help and we’ll take care of this as soon as possible.' 16 | pr-message: 'Thanks for your contribution! You are making eunomia-bpf even better.' 17 | -------------------------------------------------------------------------------- /.github/workflows/pylint.yml: -------------------------------------------------------------------------------- 1 | name: Pylint 2 | 3 | on: [push] 4 | 5 | jobs: 6 | build: 7 | runs-on: ubuntu-latest 8 | strategy: 9 | matrix: 10 | python-version: ["3.9"] 11 | steps: 12 | - uses: actions/checkout@v3 13 | - name: Set up Python ${{ matrix.python-version }} 14 | uses: actions/setup-python@v3 15 | with: 16 | python-version: ${{ matrix.python-version }} 17 | - name: Install dependencies 18 | run: | 19 | python -m pip install --upgrade pip 20 | pip install pylint 21 | pip install -r requirements.txt 22 | - name: Analysing the code with pylint 23 | run: | 24 | # pylint $(git ls-files '*.py') 25 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | token.txt 2 | conv.txt 3 | 4 | # Byte-compiled / optimized / DLL files 5 | *__pycache__/ 6 | *.py[cod] 7 | *$py.class 8 | 9 | # C extensions 10 | *.so 11 | 12 | # Distribution / packaging 13 | .Python 14 | build/ 15 | develop-eggs/ 16 | dist/ 17 | downloads/ 18 | eggs/ 19 | .eggs/ 20 | lib/ 21 | lib64/ 22 | parts/ 23 | sdist/ 24 | var/ 25 | wheels/ 26 | share/python-wheels/ 27 | *.egg-info/ 28 | .installed.cfg 29 | *.egg 30 | MANIFEST 31 | 32 | # PyInstaller 33 | # Usually these files are written by a python script from a template 34 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
35 | *.manifest 36 | *.spec 37 | 38 | # Installer logs 39 | pip-log.txt 40 | pip-delete-this-directory.txt 41 | 42 | # Unit test / coverage reports 43 | htmlcov/ 44 | .tox/ 45 | .nox/ 46 | .coverage 47 | .coverage.* 48 | .cache 49 | nosetests.xml 50 | coverage.xml 51 | *.cover 52 | *.py,cover 53 | .hypothesis/ 54 | .pytest_cache/ 55 | cover/ 56 | 57 | # Translations 58 | *.mo 59 | *.pot 60 | 61 | # Django stuff: 62 | *.log 63 | local_settings.py 64 | db.sqlite3 65 | db.sqlite3-journal 66 | 67 | # Flask stuff: 68 | instance/ 69 | .webassets-cache 70 | 71 | # Scrapy stuff: 72 | .scrapy 73 | 74 | # Sphinx documentation 75 | docs/_build/ 76 | 77 | # PyBuilder 78 | .pybuilder/ 79 | target/ 80 | 81 | # Jupyter Notebook 82 | .ipynb_checkpoints 83 | 84 | # IPython 85 | profile_default/ 86 | ipython_config.py 87 | 88 | # pyenv 89 | # For a library or package, you might want to ignore these files since the code is 90 | # intended to run in multiple environments; otherwise, check them in: 91 | # .python-version 92 | 93 | # pipenv 94 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 95 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 96 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 97 | # install all needed dependencies. 98 | #Pipfile.lock 99 | 100 | # poetry 101 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 102 | # This is especially recommended for binary packages to ensure reproducibility, and is more 103 | # commonly ignored for libraries. 104 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 105 | #poetry.lock 106 | 107 | # pdm 108 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 
109 | #pdm.lock 110 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it 111 | # in version control. 112 | # https://pdm.fming.dev/#use-with-ide 113 | .pdm.toml 114 | 115 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 116 | __pypackages__/ 117 | 118 | # Celery stuff 119 | celerybeat-schedule 120 | celerybeat.pid 121 | 122 | # SageMath parsed files 123 | *.sage.py 124 | 125 | # Environments 126 | .env 127 | .venv 128 | env/ 129 | venv/ 130 | ENV/ 131 | env.bak/ 132 | venv.bak/ 133 | 134 | # Spyder project settings 135 | .spyderproject 136 | .spyproject 137 | 138 | # Rope project settings 139 | .ropeproject 140 | 141 | # mkdocs documentation 142 | /site 143 | 144 | # mypy 145 | .mypy_cache/ 146 | .dmypy.json 147 | dmypy.json 148 | 149 | # Pyre type checker 150 | .pyre/ 151 | 152 | # pytype static type analyzer 153 | .pytype/ 154 | 155 | # Cython debug symbols 156 | cython_debug/ 157 | 158 | # PyCharm 159 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 160 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 161 | # and can be added to the global gitignore or merged into this file. For a more nuclear 162 | # option (not recommended) you can uncomment the following to ignore the entire idea folder. 
163 | #.idea/ 164 | 165 | 166 | *.tmp 167 | 168 | bpf_tutorial 169 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "bpf_tutorial"] 2 | path = bpf_tutorial 3 | url = https://github.com/eunomia-bpf/bpf-developer-tutorial 4 | -------------------------------------------------------------------------------- /CITATION.cff: -------------------------------------------------------------------------------- 1 | @inproceedings{10.1145/3672197.3673434, 2 | author = {Zheng, Yusheng and Yang, Yiwei and Chen, Maolin and Quinn, Andrew}, 3 | title = {Kgent: Kernel Extensions Large Language Model Agent}, 4 | year = {2024}, 5 | isbn = {9798400707124}, 6 | publisher = {Association for Computing Machinery}, 7 | address = {New York, NY, USA}, 8 | url = {https://doi.org/10.1145/3672197.3673434}, 9 | doi = {10.1145/3672197.3673434}, 10 | abstract = {The extended Berkeley Packet Filters (eBPF) ecosystem allows for the extension of Linux and Windows kernels, but writing eBPF programs is challenging due to the required knowledge of OS internals and programming limitations enforced by the eBPF verifier. These limitations ensure that only expert kernel developers can extend their kernels, making it difficult for junior sys admins, patch makers, and DevOps personnel to maintain extensions. This paper presents Kgent, an alternative framework that alleviates the difficulty of writing an eBPF program by allowing Kernel Extensions to be written in Natural language. Kgent uses recent advances in large language models (LLMs) to synthesize an eBPF program given a user's English language prompt. To ensure that LLM's output is semantically equivalent to the user's prompt, Kgent employs a combination of LLM-empowered program comprehension, symbolic execution, and a series of feedback loops. Kgent's key novelty is the combination of these techniques. 
In particular, the system uses symbolic execution in a novel structure that allows it to combine the results of program synthesis and program comprehension and build on the recent success that LLMs have shown for each of these tasks individually.To evaluate Kgent, we develop a new corpus of natural language prompts for eBPF programs. We show that Kgent produces correct eBPF programs on 80\%---which is an improvement of a factor of 2.67 compared to GPT-4 program synthesis baseline. Moreover, we find that Kgent very rarely synthesizes "false positive" eBPF programs--- i.e., eBPF programs that Kgent verifies as correct but manual inspection reveals to be semantically incorrect for the input prompt. The code for Kgent is publicly accessible at https://github.com/eunomia-bpf/KEN.}, 11 | booktitle = {Proceedings of the ACM SIGCOMM 2024 Workshop on EBPF and Kernel Extensions}, 12 | pages = {30–36}, 13 | numpages = {7}, 14 | keywords = {Large Language Model, Symbolic Execution, eBPF}, 15 | location = {Sydney, NSW, Australia}, 16 | series = {eBPF '24} 17 | } 18 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 eunomia-bpf org. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # GPTtrace 🤖 2 | 3 | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) 4 | [![Actions Status](https://github.com/eunomia-bpf/GPTtrace/workflows/Pylint/badge.svg)](https://github.com/eunomia-bpf/GPTtrace/actions) 5 | [![DeepSource](https://deepsource.io/gh/eunomia-bpf/eunomia-bpf.svg/?label=active+issues&show_trend=true&token=rcSI3J1-gpwLIgZWtKZC-N6C)](https://deepsource.io/gh/eunomia-bpf/eunomia-bpf/?ref=repository-badge) 6 | [![CodeFactor](https://www.codefactor.io/repository/github/eunomia-bpf/eunomia-bpf/badge)](https://www.codefactor.io/repository/github/eunomia-bpf/eunomia-bpf) 7 | [![DOI](https://zenodo.org/badge/603351016.svg)](https://zenodo.org/badge/latestdoi/603351016) 8 | 9 | An experiment for generating eBPF programs and tracing with GPT and natural language 10 | 11 | Want the online version? please see [GPTtrace-web](https://github.com/eunomia-bpf/GPTtrace-web) for **online demo**! 
12 | 13 | ### **Check out our paper [Kgent: Kernel Extensions Large Language Model Agent](https://dl.acm.org/doi/10.1145/3672197.3673434) in eBPF'24!** 14 | 15 | ## Key Features 💡 16 | 17 | ### Interact with and trace your Linux system with natural language 18 | 19 | Example: tracing with the prompt "Count page faults by process" 20 | 21 | Image 22 | 23 | - start tracing with natural language 24 | - let AI explain the result to you 25 | 26 | ### Generate eBPF programs with natural language 27 | 28 | Example: "Write an eBPF program that prints entered bash commands from all running shells, save the bpf program to a file and exit without actually running it." 29 | 30 | Image 31 | 32 | We use examples from [bpftrace tools](tools) to build a vector store and search it. 33 | 34 | For more detailed documents and tutorials about how to write eBPF programs, please refer to: [`bpf-developer-tutorial`](https://github.com/eunomia-bpf/bpf-developer-tutorial) (a libbpf tool tutorial to teach ChatGPT to write eBPF programs) 35 | 36 | ### Choose the right bcc command line tool to complete the tracing task 37 | 38 | Use the right bcc tools to trace the kernel 39 | 40 | ```console 41 | $ python3 gpttrace "Trace allocations and display each individual allocator function call" 42 | Run: sudo memleak-bpfcc --trace 43 | Attaching to kernel allocators, Ctrl+C to quit. 
44 | (b'Relay(35)', 402, 6, b'd...1', 20299.252425, b'alloc exited, size = 4096, result = ffff8881009cc000') 45 | (b'Relay(35)', 402, 6, b'd...1', 20299.252425, b'free entered, address = ffff8881009cc000, size = 4096') 46 | (b'Relay(35)', 402, 6, b'd...1', 20299.252426, b'free entered, address = 588a6f, size = 4096') 47 | (b'Relay(35)', 402, 6, b'd...1', 20299.252427, b'alloc entered, size = 4096') 48 | (b'Relay(35)', 402, 6, b'd...1', 20299.252427, b'alloc exited, size = 4096, result = ffff8881009cc000') 49 | (b'Relay(35)', 402, 6, b'd...1', 20299.252428, b'free entered, address = ffff8881009cc000, size = 4096') 50 | (b'sudo', 6938, 10, b'd...1', 20299.252437, b'alloc entered, size = 2048') 51 | (b'sudo', 6938, 10, b'd...1', 20299.252439, b'alloc exited, size = 2048, result = ffff88822e845800') 52 | (b'node', 410, 18, b'd...1', 20299.252455, b'alloc entered, size = 256') 53 | (b'node', 410, 18, b'd...1', 20299.252457, b'alloc exited, size = 256, result = ffff8882e9b66400') 54 | (b'node', 410, 18, b'd...1', 20299.252458, b'alloc entered, size = 2048') 55 | ``` 56 | 57 | ## How it works 58 | 59 | ![GPTtrace/doc/how-it-works.png](doc/how-it-works.png) 60 | 61 | 1. **User Input**: The user provides their operating system information and kernel version. This information is crucial as it helps to tailor the eBPF program to the specific environment of the user. 62 | 2. **Prompt Construction**: The user's input, along with the OS info and kernel version, is used to construct a prompt. This prompt is designed to guide the generation of the eBPF program. 63 | 3. **Vector Database Query**: The constructed prompt is used to query the Vector Database for eBPF program examples. These examples serve as a basis for generating the eBPF program that will be inserted into the kernel. 64 | 4. **Hook Point Identification**: The GPT API is used to identify potential hook points in the eBPF program. 
These hook points are locations in the code where the eBPF program can be inserted to monitor or modify the behavior of the kernel. 65 | 5. **eBPF Program Generation**: The identified hook points, along with the examples from the Vector Database, are used to generate the eBPF program. This program is designed to be inserted into the kernel to perform the desired tracing tasks. 66 | 6. **Kernel Insertion**: The generated eBPF program is inserted into the kernel. If there are any errors during this process, the tool will retry the steps from querying the Vector Database to kernel insertion a few times. 67 | 7. **Result Explanation**: Once the eBPF program is successfully inserted into the kernel, the AI will explain the result to the user. This includes an explanation of what the eBPF program is doing and how it is interacting with the kernel. 68 | 69 | This process ensures that the eBPF program is tailored to the user's specific environment and needs, and that the user understands how the program works and what it is doing. 70 | 71 | ## Installation 🔧 72 | 73 | ```sh 74 | pip install gpttrace 75 | ``` 76 | 77 | ## Usage and Setup 🛠 78 | 79 | ```console 80 | $ python3 -m gpttrace -h 81 | usage: GPTtrace [-h] [-c CMD_NAME QUERY] [-v] [-k OPENAI_API_KEY] 82 | input_string 83 | 84 | Use ChatGPT to write eBPF programs (bpftrace, etc.) 
85 | 86 | positional arguments: 87 | input_string Your question or request for a bpf program 88 | 89 | options: 90 | -h, --help show this help message and exit 91 | -c CMD_NAME QUERY, --cmd CMD_NAME QUERY 92 | Use the bcc tool to complete the trace task 93 | -v, --verbose Show more details 94 | -k OPENAI_API_KEY, --key OPENAI_API_KEY 95 | Openai api key, see 96 | `https://platform.openai.com/docs/quickstart/add- 97 | your-api-key` or passed through `OPENAI_API_KEY` 98 | ``` 99 | 100 | ### First: log in to ChatGPT 101 | 102 | - Visit https://platform.openai.com/docs/quickstart/add-your-api-key, then create your OpenAI API key as follows: 103 | 104 | ![image-20230402163041886](doc/api-key.png) 105 | 106 | - Save your key, then set it in the environment variable `OPENAI_API_KEY` or pass it with the `-k` option. 107 | 108 | ### Start your tracing! 🚀 109 | 110 | For example: 111 | 112 | ```sh 113 | python3 gpttrace "Count page faults by process" 114 | ``` 115 | 116 | If the eBPF program cannot be loaded into the kernel, the error message is fed back to ChatGPT so that it can correct the program, and the result is printed to the console. 
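The error-feedback behaviour described above can be sketched roughly as follows. This is a minimal illustration and not GPTtrace's actual implementation: `trace_with_retries`, the `generate` and `load` callbacks, and the default of three attempts are all hypothetical names and values chosen for the example, and the two `fake_*` functions stand in for the ChatGPT call and the kernel load step.

```python
from typing import Callable, Optional, Tuple

# Hypothetical callback shapes (assumptions, not GPTtrace's real API):
#   generate(prompt, last_error) -> program text
#   load(program)                -> (returncode, stderr)
Generate = Callable[[str, Optional[str]], str]
Load = Callable[[str], Tuple[int, str]]


def trace_with_retries(prompt: str, generate: Generate, load: Load,
                       max_attempts: int = 3) -> Tuple[Optional[str], int]:
    """Return (program, attempts); program is None if every attempt failed."""
    last_error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        # Pass the previous error (if any) back to the generator.
        program = generate(prompt, last_error)
        returncode, stderr = load(program)
        if returncode == 0:
            return program, attempt
        last_error = stderr
    return None, max_attempts


def fake_generate(prompt: str, last_error: Optional[str]) -> str:
    # Stand-in for the LLM call: "fixes" the program once it sees an error.
    return "good" if last_error else "bad"


def fake_load(program: str) -> Tuple[int, str]:
    # Stand-in for loading into the kernel: only "good" loads cleanly.
    return (0, "") if program == "good" else (1, "ERROR: syntax error")


RESULT = trace_with_retries("Count page faults by process", fake_generate, fake_load)
print(RESULT)
```

Keeping the generator and loader as plain callables means the loop can be exercised without an API key or root privileges, which is why the example uses fakes.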
117 | 118 | ## Examples 119 | 120 | - Files opened by process 121 | - Syscall count by program 122 | - Read bytes by process: 123 | - Read size distribution by process: 124 | - Show per-second syscall rates: 125 | - Trace disk size by process 126 | - Count page faults by process 127 | - Count LLC cache misses by process name and PID (uses PMCs): 128 | - Profile user-level stacks at 99 Hertz, for PID 189: 129 | - Files opened, for processes in the root cgroup-v2 130 | 131 | ## Citation 132 | 133 | ```bibtex 134 | @inproceedings{10.1145/3672197.3673434, 135 | author = {Zheng, Yusheng and Yang, Yiwei and Chen, Maolin and Quinn, Andrew}, 136 | title = {Kgent: Kernel Extensions Large Language Model Agent}, 137 | year = {2024}, 138 | isbn = {9798400707124}, 139 | publisher = {Association for Computing Machinery}, 140 | address = {New York, NY, USA}, 141 | url = {https://doi.org/10.1145/3672197.3673434}, 142 | doi = {10.1145/3672197.3673434}, 143 | abstract = {The extended Berkeley Packet Filters (eBPF) ecosystem allows for the extension of Linux and Windows kernels, but writing eBPF programs is challenging due to the required knowledge of OS internals and programming limitations enforced by the eBPF verifier. These limitations ensure that only expert kernel developers can extend their kernels, making it difficult for junior sys admins, patch makers, and DevOps personnel to maintain extensions. This paper presents Kgent, an alternative framework that alleviates the difficulty of writing an eBPF program by allowing Kernel Extensions to be written in Natural language. Kgent uses recent advances in large language models (LLMs) to synthesize an eBPF program given a user's English language prompt. To ensure that LLM's output is semantically equivalent to the user's prompt, Kgent employs a combination of LLM-empowered program comprehension, symbolic execution, and a series of feedback loops. Kgent's key novelty is the combination of these techniques. 
In particular, the system uses symbolic execution in a novel structure that allows it to combine the results of program synthesis and program comprehension and build on the recent success that LLMs have shown for each of these tasks individually.To evaluate Kgent, we develop a new corpus of natural language prompts for eBPF programs. We show that Kgent produces correct eBPF programs on 80\%---which is an improvement of a factor of 2.67 compared to GPT-4 program synthesis baseline. Moreover, we find that Kgent very rarely synthesizes "false positive" eBPF programs--- i.e., eBPF programs that Kgent verifies as correct but manual inspection reveals to be semantically incorrect for the input prompt. The code for Kgent is publicly accessible at https://github.com/eunomia-bpf/KEN.}, 144 | booktitle = {Proceedings of the ACM SIGCOMM 2024 Workshop on EBPF and Kernel Extensions}, 145 | pages = {30–36}, 146 | numpages = {7}, 147 | keywords = {Large Language Model, Symbolic Execution, eBPF}, 148 | location = {Sydney, NSW, Australia}, 149 | series = {eBPF '24} 150 | } 151 | ``` 152 | 153 | ## LICENSE 154 | 155 | MIT 156 | 157 | ## 🔗 Links 158 | 159 | - Detailed documents and tutorials about how we train ChatGPT to write eBPF programs: https://github.com/eunomia-bpf/bpf-developer-tutorial (an eBPF developer tutorial based on CO-RE (Compile Once, Run Everywhere) libbpf: learn eBPF step by step through 20 small tools, and try to teach ChatGPT to write eBPF programs) 160 | - bpftrace: https://github.com/iovisor/bpftrace 161 | - ChatGPT: https://chat.openai.com/ 162 | -------------------------------------------------------------------------------- /doc/api-key.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eunomia-bpf/GPTtrace/96ca69662e4d59e9c63531608c38b9da57b4198c/doc/api-key.png -------------------------------------------------------------------------------- /doc/generate.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/eunomia-bpf/GPTtrace/96ca69662e4d59e9c63531608c38b9da57b4198c/doc/generate.png -------------------------------------------------------------------------------- /doc/how-it-works.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eunomia-bpf/GPTtrace/96ca69662e4d59e9c63531608c38b9da57b4198c/doc/how-it-works.png -------------------------------------------------------------------------------- /doc/result.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eunomia-bpf/GPTtrace/96ca69662e4d59e9c63531608c38b9da57b4198c/doc/result.gif -------------------------------------------------------------------------------- /doc/result.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eunomia-bpf/GPTtrace/96ca69662e4d59e9c63531608c38b9da57b4198c/doc/result.png -------------------------------------------------------------------------------- /doc/trace.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eunomia-bpf/GPTtrace/96ca69662e4d59e9c63531608c38b9da57b4198c/doc/trace.png -------------------------------------------------------------------------------- /gpttrace/GPTtrace.py: -------------------------------------------------------------------------------- 1 | #! /bin/env python 2 | import argparse 3 | import os 4 | import pathlib 5 | 6 | from gpttrace.cmd import cmd 7 | from gpttrace.execute import execute 8 | from gpttrace.utils.common import pretty_print 9 | 10 | 11 | def main() -> None: 12 | """ 13 | Program entry. 
14 | """ 15 | parser = argparse.ArgumentParser( 16 | prog="GPTtrace", 17 | description="Use ChatGPT to write eBPF programs (bpftrace, etc.)", 18 | ) 19 | parser.add_argument( 20 | "-c", "--cmd", 21 | help="Use the bcc tool to complete the trace task", 22 | nargs=2, 23 | metavar=("CMD_NAME", "QUERY")) 24 | parser.add_argument( 25 | "-v", "--verbose", 26 | help="Show more details", 27 | action="store_true") 28 | parser.add_argument( 29 | "-k", "--key", 30 | help="Openai api key, see `https://platform.openai.com/docs/quickstart/add-your-api-key` or passed through `OPENAI_API_KEY`", 31 | metavar="OPENAI_API_KEY") 32 | parser.add_argument('input_string', type=str, help='Your question or request for a bpf program') 33 | args = parser.parse_args() 34 | 35 | if os.getenv('OPENAI_API_KEY', args.key) is None: 36 | # Plain string: the original f-string referenced an undefined name and raised NameError 37 | print("Either provide your access token through `-k` or through environment variable `OPENAI_API_KEY`") 38 | return 39 | if args.cmd is not None: 40 | cmd(args.cmd[0], args.cmd[1], args.verbose) 41 | elif args.input_string is not None: 42 | execute(args.input_string, args.verbose) 43 | else: 44 | parser.print_help() 45 | 46 | if __name__ == "__main__": 47 | main() 48 | -------------------------------------------------------------------------------- /gpttrace/__init__.py: -------------------------------------------------------------------------------- 1 | from gpttrace.GPTtrace import main 2 | __version__ = "0.1.2" 3 | -------------------------------------------------------------------------------- /gpttrace/__main__.py: -------------------------------------------------------------------------------- 1 | # This file allows the construction of executable file. If there is no such file, only the package is generated. 
2 | from gpttrace.GPTtrace import main 3 | 4 | main() 5 | -------------------------------------------------------------------------------- /gpttrace/bpftrace.py: -------------------------------------------------------------------------------- 1 | #!/bin/python 2 | import subprocess 3 | import openai 4 | import json 5 | import unittest 6 | import threading 7 | from typing import List, TypedDict 8 | 9 | functions = [ 10 | { 11 | "name": "bpftrace", 12 | "description": "A tool used to run bpftrace eBPF programs", 13 | "parameters": { 14 | "type": "object", 15 | "properties": { 16 | "bufferingMode": { 17 | "type": "string", 18 | "description": "output buffering mode" 19 | }, 20 | "format": { 21 | "type": "string", 22 | "description": "output format" 23 | }, 24 | "outputFile": { 25 | "type": "string", 26 | "description": "redirect bpftrace output to file" 27 | }, 28 | "debugInfo": { 29 | "type": "boolean", 30 | "description": "debug info dry run" 31 | }, 32 | "verboseDebugInfo": { 33 | "type": "boolean", 34 | "description": "verbose debug info dry run" 35 | }, 36 | "program": { 37 | "type": "string", 38 | "description": "program to execute" 39 | }, 40 | "includeDir": { 41 | "type": "array", 42 | "items": { 43 | "type": "string" 44 | }, 45 | "description": "directories to add to the include search path" 46 | }, 47 | "usdtFileActivation": { 48 | "type": "boolean", 49 | "description": "activate usdt semaphores based on file path" 50 | }, 51 | "unsafe": { 52 | "type": "boolean", 53 | "description": "allow unsafe builtin functions" 54 | }, 55 | "quiet": { 56 | "type": "boolean", 57 | "description": "keep messages quiet" 58 | }, 59 | "verbose": { 60 | "type": "boolean", 61 | "description": "verbose messages" 62 | }, 63 | "noWarnings": { 64 | "type": "boolean", 65 | "description": "disable all warning messages" 66 | }, 67 | "timeout": { 68 | "type": "integer", 69 | "description": "seconds to run the command" 70 | }, 71 | "continue": { 72 | "type": "boolean", 73 | "description": 
"finish conversation and not continue." 74 | } 75 | }, 76 | "required": ["program"] 77 | } 78 | }, 79 | { 80 | "name": "SaveFile", 81 | "description": "Save the eBPF program to file", 82 | "parameters": { 83 | "type": "object", 84 | "properties": { 85 | "filename": { 86 | "type": "string", 87 | "description": "the file name to save to" 88 | }, 89 | "content": { 90 | "type": "string", 91 | "description": "the file content" 92 | } 93 | }, 94 | "required": ["filename", "content"] 95 | } 96 | } 97 | ] 98 | 99 | class CommandResult(TypedDict): 100 | command: str 101 | stdout: str 102 | stderr: str 103 | returncode: int 104 | 105 | def run_command_with_timeout(command: List[str], timeout: int) -> CommandResult: 106 | """ 107 | This function runs a command with a timeout. 108 | """ 109 | print("The bpf program to run is: " + ' '.join(command)) 110 | print("timeout: " + str(timeout)) 111 | user_input = input("Enter 'y' to proceed: ") 112 | if user_input.lower() != 'y': 113 | print("Aborting...") 114 | exit() 115 | # Start the process 116 | with subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True) as process: 117 | timer = threading.Timer(timeout, process.kill) 118 | stdout = "" 119 | stderr = "" 120 | try: 121 | # Set a timer to kill the process if it doesn't finish within the timeout 122 | timer.start() 123 | while process.poll() is None: 124 | # Only try to read output if the process is still running 125 | if process.stdout.readable(): 126 | line = process.stdout.read() 127 | print(line, end='') 128 | stdout += line 129 | # Wait for the process to finish and get the output 130 | last_stdout, last_stderr = process.communicate() 131 | stdout += last_stdout 132 | stderr += last_stderr 133 | except Exception as e: 134 | print("Exception: " + str(e)) 135 | finally: 136 | # Make sure the timer is canceled 137 | timer.cancel() 138 | if process.poll() is None and process.stdout.readable(): 139 | stdout += process.stdout.read() 140 | print(stdout) 
141 | if process.poll() is None and process.stderr.readable(): 142 | stderr += process.stderr.read() 143 | print(stderr) 144 | return { 145 | "command": ' '.join(command), 146 | "stdout": stdout, 147 | "stderr": stderr, 148 | "returncode": process.returncode 149 | } 150 | 151 | 152 | def construct_command(operation: dict) -> list: 153 | """ 154 | This function constructs a command from a dictionary of options. 155 | """ 156 | cmd = [] 157 | if "bufferingMode" in operation: 158 | cmd += ["-B", operation["bufferingMode"]] 159 | if "format" in operation: 160 | cmd += ["-f", operation["format"]] 161 | if "outputFile" in operation: 162 | cmd += ["-o", operation["outputFile"]] 163 | if "debugInfo" in operation and operation["debugInfo"]: 164 | cmd += ["-d"] 165 | if "verboseDebugInfo" in operation and operation["verboseDebugInfo"]: 166 | cmd += ["-dd"] 167 | if "program" in operation: 168 | cmd += ["-e", operation["program"]] 169 | if "includeDir" in operation: 170 | for dir in operation["includeDir"]: 171 | cmd += ["-I", dir] 172 | if "usdtFileActivation" in operation and operation["usdtFileActivation"]: 173 | cmd += ["--usdt-file-activation"] 174 | if "unsafe" in operation and operation["unsafe"]: 175 | cmd += ["--unsafe"] 176 | if "quiet" in operation and operation["quiet"]: 177 | cmd += ["-q"] 178 | if "verbose" in operation and operation["verbose"]: 179 | cmd += ["-v"] 180 | if "noWarnings" in operation and operation["noWarnings"]: 181 | cmd += ["--no-warnings"] 182 | # ...add other options similarly... 183 | return cmd 184 | 185 | 186 | def run_bpftrace(prompt: str, verbose: bool = False) -> CommandResult: 187 | """ 188 | This function sends a list of messages and functions to the GPT model 189 | and runs the function call returned by the model. 190 | 191 | Parameters: 192 | messages (list): A list of dictionaries where each dictionary represents a message. 
193 | 194 | Returns: 195 | None 196 | """ 197 | # Send the conversation and available functions to GPT 198 | messages = [{"role": "user", "content": prompt}] 199 | response = openai.ChatCompletion.create( 200 | model="gpt-3.5-turbo", 201 | messages=messages, 202 | functions=functions, 203 | function_call="auto", # auto is default, but we'll be explicit 204 | ) 205 | response_message = response["choices"][0]["message"] 206 | if verbose: 207 | print(response_message) 208 | # Check if GPT wanted to call a function 209 | if response_message.get("function_call"): 210 | full_command = ["sudo"] 211 | if response_message["function_call"]["name"] == "bpftrace": 212 | # call bpftrace function 213 | full_command.append(response_message["function_call"]["name"]) 214 | args = json.loads(response_message["function_call"]["arguments"]) 215 | 216 | command = construct_command(args) 217 | full_command.extend(command) 218 | timeout = 300 # default is running for 5 mins 219 | if args.get("timeout"): 220 | timeout = args["timeout"] 221 | # run the bpftrace command 222 | res = run_command_with_timeout(full_command, int(timeout)) 223 | if args.get("continue") and res["stderr"] == "": 224 | # continue conversation 225 | res["stderr"] = "The conversation shall not complete." 
226 | return res 227 | elif response_message["function_call"]["name"] == "SaveFile": 228 | # call save to file, need to save the response data to a file 229 | args = json.loads(response_message["function_call"]["arguments"]) 230 | filename = args["filename"] 231 | print("Save to file: " + filename) 232 | print(args["content"]) 233 | with open(filename, 'w') as file: 234 | file.write(args["content"]) 235 | res = { 236 | "command": "SaveFile", 237 | "stdout": args["content"], 238 | "stderr": "", 239 | "returncode": 0 240 | } 241 | return res 242 | else: 243 | # not function call 244 | return { 245 | "command": "response_message", 246 | "stdout": response_message["content"], 247 | "stderr": "", 248 | "returncode": 0 249 | } 250 | 251 | 252 | class TestRunBpftrace(unittest.TestCase): 253 | def test_summary(self): 254 | res = run_bpftrace("tracing with Count page faults by process for 3s") 255 | print(res) 256 | print(res["stderr"]) 257 | 258 | def test_construct_command(self): 259 | operation_json = """ 260 | { 261 | "bufferingMode": "full", 262 | "format": "json", 263 | "outputFile": "output.txt", 264 | "program": "kprobe:do_nanosleep { printf(\\"PID %d sleeping...\\n\\", pid); }", 265 | "includeDir": ["dir1", "dir2"] 266 | } 267 | """ 268 | 269 | operation = json.loads(operation_json) 270 | command = construct_command(operation) 271 | print(command) 272 | 273 | def test_construct_complex_command(self): 274 | operation_json = """ 275 | { 276 | "bufferingMode": "full", 277 | "format": "json", 278 | "outputFile": "output.txt", 279 | "debugInfo": true, 280 | "verboseDebugInfo": true, 281 | "program": "kprobe:do_nanosleep { printf(\\"PID %d sleeping...\\n\\", pid); }", 282 | "includeDir": ["dir1", "dir2"], 283 | "usdtFileActivation": true, 284 | "unsafe": true, 285 | "quiet": true, 286 | "verbose": true, 287 | "noWarnings": true 288 | } 289 | """ 290 | operation = json.loads(operation_json) 291 | command = construct_command(operation) 292 | print(command) 293 | 294 | def 
test_run_command_with_timeout_short_live(self): 295 | command = ["ls", "-l"] 296 | timeout = 5 297 | result = run_command_with_timeout(command, timeout) 298 | print(result) 299 | self.assertNotEqual(result["stdout"], "") 300 | self.assertEqual(result["command"], "ls -l") 301 | self.assertEqual(result["returncode"], 0) 302 | -------------------------------------------------------------------------------- /gpttrace/cmd.py: -------------------------------------------------------------------------------- 1 | import subprocess 2 | import openai 3 | import json 4 | 5 | from langchain.chains.conversation.memory import ConversationBufferMemory 6 | from langchain.chat_models import ChatOpenAI 7 | from langchain.chains import ConversationChain 8 | from gpttrace.prompt import func_call_prompt 9 | from gpttrace.config import cfg 10 | 11 | def cmd(cmd_name: str, query: str, verbose=False) -> None: 12 | """ 13 | Generate the command based on query and execute the command. 14 | 15 | :param cmd_name: name of command. 16 | :param query: The task that the user wants to accomplish with `cmd`. 17 | :param verbose: Whether to print extra information.
18 | """ 19 | func_call = None 20 | functions = get_predifine_funcs() 21 | for func in functions: 22 | # Use the predefined function call if one exists for this command 23 | if func["name"] == cmd_name: 24 | func_call = func 25 | break 26 | if func_call is None: 27 | # Otherwise generate a function call from the command's help text 28 | func_call = gen_func_call(cmd_name, verbose) 29 | messages = [{"role": "user", "content": query}] 30 | response = openai.ChatCompletion.create( 31 | model=cfg.get("DEFAULT_MODEL"), 32 | messages=messages, 33 | functions=[func_call], 34 | function_call="auto", 35 | ) 36 | response_message = response["choices"][0]["message"] 37 | if response_message.get("function_call"): 38 | cmd_name = response_message["function_call"]["name"] 39 | args = json.loads(response_message["function_call"]["arguments"]) 40 | func_descript = get_specify_func(functions, cmd_name) 41 | exec_cmd(cmd_name, args, func_descript) 42 | else: 43 | print("The LLM did not call any BCC tool.") 44 | 45 | 46 | def exec_cmd(cmd_name: str, args: dict, func_descript: dict) -> None: 47 | """ 48 | Execute the command. 49 | 50 | :param cmd_name: The name of the command. 51 | :param args: The arguments to pass to the command, keyed by parameter name. 52 | :param func_descript: The JSON-style function call description of the command.
53 | """ 54 | full_command = ["sudo"] 55 | full_command.append(cmd_name) 56 | positional_args = [] 57 | for arg, value in args.items(): 58 | if (value is not True) and (value is not False): 59 | if not is_positional_arg(cmd_name, arg): 60 | # Determine parameter type 61 | arg_type = func_descript["parameters"]["properties"][arg]["type"] 62 | if arg_type in ["integer", "float"]: 63 | full_command.extend([f"--{arg}", f'{value}']) 64 | else: 65 | full_command.extend([f"--{arg}", f'"{value}"']) 66 | else: 67 | positional_args.append(value) # keep every positional value, e.g. both interval and count 68 | elif value: # only pass boolean flags that are set to True 69 | full_command.append(f"--{arg}") 70 | if positional_args: 71 | full_command.extend(positional_args) 72 | full_command = [str(item) for item in full_command] 73 | print("\u001b[1;32m", "Run: ", " ".join(full_command), "\u001b[0m") 74 | try: 75 | subprocess.run(full_command, text=True, check=True) 76 | except subprocess.CalledProcessError as err: 77 | print("\u001b[1;31m\bFailed to execute command!\u001b[0m") 78 | print(err) 79 | 80 | def get_predifine_funcs() -> list: 81 | """ 82 | Gets the JSON format descriptions of the predefined function calls. 83 | 84 | :return: List of JSON-formatted function calls. 85 | """ 86 | func_path = cfg.get("BCC_FUNC_CALL_PATH") 87 | with open(func_path, 'r', encoding='utf-8') as file: 88 | data = file.read() 89 | functions = json.loads(data) 90 | return functions 91 | 92 | def gen_func_call(cmd_name: str, verbose: bool) -> str: 93 | """ 94 | Generates a function call for the given command. 95 | 96 | :param cmd_name: Name of command. 97 | :param verbose: Whether to print extra information. 98 | :return: The function call in JSON format corresponding to the command.
99 | """ 100 | model_name = cfg.get("DEFAULT_MODEL") 101 | llm = ChatOpenAI(model=model_name,temperature=0) 102 | agent_chain = ConversationChain(llm=llm, verbose=verbose, 103 | memory=ConversationBufferMemory()) 104 | help_doc = get_command_help(cmd_name) 105 | prompt = func_call_prompt(cmd_name, help_doc) 106 | response = agent_chain.predict(input=prompt) 107 | return response 108 | 109 | def get_command_help(command) -> str: 110 | """ 111 | Gets help documentation for the command. 112 | 113 | :param command: Name of command. 114 | :return: Help documentation for the command. 115 | """ 116 | try: 117 | output = subprocess.check_output([command, '--help'], universal_newlines=True) 118 | return output 119 | except subprocess.CalledProcessError as err: 120 | return f"Error executing help command: {err.output}" 121 | 122 | def get_specify_func(functions, cmd_name) -> json: 123 | """ 124 | Gets the JSON format description of the function call for the specified command. 125 | 126 | :param functions: A predefined list of function calls. 127 | :param cmd_name: A specific command. 128 | :return func: The function call corresponding to cmd. 129 | """ 130 | for func in functions: 131 | if func.get('name') == cmd_name: 132 | return func 133 | return None 134 | 135 | 136 | def is_positional_arg(cmd_name, arg) -> bool: 137 | """ 138 | Determine if the argument passed to the command is a position argument. 139 | 140 | :param cmd_name: name of command. 141 | :param arg: argument passed to command. 142 | :return: Whether arg is a positional parameter of cmd. 
143 | """ 144 | 145 | positional_dict = { 146 | "biolatency-bpfcc": ["interval", "count"], 147 | "biotop-bpfcc": ["interval", "count"], 148 | "btrfsdist-bpfcc": ["interval", "count"], 149 | "btrfsslower-bpfcc": ["min_ms"], 150 | "cachestat-bpfcc": ["interval", "count"], 151 | "cachetop-bpfcc": ["interval"], 152 | "cobjnew-bpfcc": ["pid", "interval"], 153 | "cpudist-bpfcc": ["interval", "count"], 154 | "cpuunclaimed-bpfcc": ["interval", "count"], 155 | "dbslower-bpfcc": ["engine"], 156 | "dbstat-bpfcc": ["engine"], 157 | "deadlock-bpfcc": ["pid"], 158 | "ext4dist-bpfcc": ["interval", "count"], 159 | "ext4slower-bpfcc": ["min_ms"], 160 | "fileslower-bpfcc": ["min_ms"], 161 | "filetop-bpfcc": ["interval", "count"], 162 | "funccount-bpfcc": ["pattern"], 163 | "funclatency-bpfcc": ["pattern"], 164 | "funcslower-bpfcc": ["function"], 165 | "hardirqs-bpfcc": ["interval", "outputs"], 166 | "inject-bpfcc": ["base_function", "spec"], 167 | "javacalls-bpfcc": ["pid", "interval"], 168 | "javaflow-bpfcc": ["pid"], 169 | "javagc-bpfcc": ["pid"], 170 | "javaobjnew-bpfcc": ["pid", "interval"], 171 | "javastat-bpfcc": ["interval", "count"], 172 | "javathreads-bpfcc": ["pid"], 173 | "llcstat-bpfcc": ["duration"], 174 | "memleak-bpfcc": ["interval", "count"], 175 | "nfsdist-bpfcc": ["interval", "count"], 176 | "nfsslower-bpfcc": ["min_ms"], 177 | "nodegc-bpfcc": ["pid"], 178 | "nodestat-bpfcc": ["interval", "count"], 179 | "offcputime-bpfcc": ["duration"], 180 | "offwaketime-bpfcc": ["duration"], 181 | "perlcalls-bpfcc": ["pid", "interval"], 182 | "perlflow-bpfcc": ["pid"], 183 | "perlstat-bpfcc": ["interval", "count"], 184 | "phpcalls-bpfcc": ["pid", "interval"], 185 | "phpflow-bpfcc": ["pid"], 186 | "phpstat-bpfcc": ["interval", "count"], 187 | "profile-bpfcc": ["duration"], 188 | "pythoncalls-bpfcc": ["pid", "interval"], 189 | "pythonflow-bpfcc": ["pid"], 190 | "pythongc-bpfcc": ["pid"], 191 | "pythonstat-bpfcc": ["interval", "count"], 192 | "rubycalls-bpfcc": ["pid", "interval"], 
193 | "rubyflow-bpfcc": ["pid"], 194 | "rubygc-bpfcc": ["pid"], 195 | "rubyobjnew-bpfcc": ["pid", "interval"], 196 | "rubystat-bpfcc": ["interval", "count"], 197 | "runqlat-bpfcc": ["interval", "count"], 198 | "runqlen-bpfcc": ["interval", "count"], 199 | "runqslower-bpfcc": ["min_us"], 200 | "slabratetop-bpfcc": ["interval", "count"], 201 | "softirqs-bpfcc": ["interval", "count"], 202 | "stackcount-bpfcc": ["pattern"], 203 | "tclcalls-bpfcc": ["pid", "interval"], 204 | "tclflow-bpfcc": ["pid"], 205 | "tclobjnew-bpfcc": ["pid", "interval"], 206 | "tclstat-bpfcc": ["interval", "count"], 207 | "tcpconnlat-bpfcc": ["duration_ms"], 208 | "tcpsubnet-bpfcc": ["subnets"], 209 | "tcptop-bpfcc": ["interval", "count"], 210 | "tplist-bpfcc": ["filter"], 211 | "trace-bpfcc": ["probe"], 212 | "ttysnoop-bpfcc": ["device"], 213 | "ucalls": ["pid", "interval"], 214 | "uflow": ["pid"], 215 | "ugc": ["pid"], 216 | "uobjnew": ["pid", "interval"], 217 | "ustat": ["interval", "count"], 218 | "uthreads": ["pid"], 219 | "wakeuptime-bpfcc": ["duration"], 220 | "xfsdist-bpfcc": ["interval", "count"], 221 | "xfsslower-bpfcc": ["min_ms"], 222 | "zfsdist-bpfcc": ["interval", "count"], 223 | "zfsslower-bpfcc": ["min_ms"], 224 | } 225 | 226 | if positional_dict.get(cmd_name) is not None: 227 | return arg in positional_dict[cmd_name] 228 | return False 229 | 230 | if __name__ == "__main__": 231 | cmd("biolatency-bpfcc", "print 1 second summaries, 10 times", True) # the command name here is an assumed example 232 | -------------------------------------------------------------------------------- /gpttrace/config.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | from getpass import getpass 4 | from pathlib import Path 5 | from typing import Any 6 | from click import UsageError 7 | 8 | CONFIG_FOLDER = os.path.expanduser("~/.config") 9 | GPT_TRACE_CONFIG_FOLDER = Path(CONFIG_FOLDER) / "gpt_trace" 10 | GPT_TRACE_CONFIG_PATH = GPT_TRACE_CONFIG_FOLDER / ".gpt_trace_rc" 11 | 12 | OPENAI_API_KEY = "OPENAI_API_KEY" 13 |
14 | PROJECT_ROOT_PATH = Path(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) 15 | DOC_PATH = PROJECT_ROOT_PATH / "bpf_tutorial"/ "src" 16 | DATA_SAVE_PATH = PROJECT_ROOT_PATH / "data_save" 17 | VECTOR_DATABASE_PATH = DATA_SAVE_PATH / "vector_database" 18 | BCC_FUNC_CALL_PATH = DATA_SAVE_PATH / "funcs.json" 19 | MODEL_NAME = "gpt-3.5-turbo-0613" 20 | 21 | DEFAULT_CONFIG = { 22 | "DEFAULT_MODEL": os.getenv("DEFAULT_MODEL", "gpt-3.5-turbo"), 23 | "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY", "OPENAI_API_KEY"), 24 | "DATA_SAVE_PATH": os.getenv("DATA_SAVE_PATH", DATA_SAVE_PATH), 25 | "VECTOR_DATABASE_PATH": os.getenv("VECTOR_DATABASE_PATH", VECTOR_DATABASE_PATH), 26 | "BCC_FUNC_CALL_PATH": os.getenv("BCC_FUNC_CALL_PATH", BCC_FUNC_CALL_PATH), 27 | "DOC_PATH": os.getenv("DOC_PATH", DOC_PATH), 28 | "MODEL_NAME": os.getenv("MODEL_NAME", MODEL_NAME) 29 | } 30 | 31 | class Config(dict): 32 | """ 33 | A dictionary for handling configuration items in this project. 34 | """ 35 | def __init__(self, config_path: Path, **defaults: Any): 36 | """ 37 | Initializes a Config object. 38 | 39 | :param config_path: The path to the configuration file. 40 | :param **defaults: Default values for configuration options. 41 | """ 42 | self.config_path = config_path 43 | 44 | if self._exists: 45 | self._read() 46 | has_new_config = False 47 | for key, value in defaults.items(): 48 | if key not in self: 49 | has_new_config = True 50 | self[key] = value 51 | if has_new_config: 52 | self._write() 53 | else: 54 | config_path.parent.mkdir(parents=True, exist_ok=True) 55 | # Don't write API key to config file if it is in the environment. 
56 | if not defaults.get("OPENAI_API_KEY") and not os.getenv("OPENAI_API_KEY"): 57 | __api_key = getpass(prompt="Please enter your OpenAI API key: ") 58 | defaults["OPENAI_API_KEY"] = __api_key 59 | super().__init__(**defaults) 60 | self._write() 61 | 62 | 63 | @property 64 | def _exists(self) -> bool: 65 | """ 66 | Checks if the configuration file exists. 67 | 68 | :return: True if the configuration file exists, False otherwise. 69 | """ 70 | return self.config_path.exists() 71 | 72 | def _write(self) -> None: 73 | """ 74 | Writes the configuration options to the configuration file. 75 | """ 76 | with open(self.config_path, "w", encoding="utf-8") as file: 77 | string_config = "" 78 | for key, value in self.items(): 79 | string_config += f"{key}={value}\n" 80 | file.write(string_config) 81 | 82 | def _read(self) -> None: 83 | """ 84 | Reads the configuration options from the configuration file. 85 | """ 86 | with open(self.config_path, "r", encoding="utf-8") as file: 87 | for line in file: 88 | if "=" in line and not line.startswith("#"): # skip comments and blank lines 89 | key, value = line.strip().split("=", 1) # the value itself may contain '=' 90 | self[key] = value 91 | 92 | def get(self, key: str) -> str: # type: ignore 93 | """ 94 | Retrieves the value associated with the specified key. 95 | 96 | :param key: The key of the configuration option to retrieve. 97 | :return: The value associated with the specified key. 98 | """ 99 | # Prioritize environment variables over config file.
100 | value = os.getenv(key) or super().get(key) 101 | if not value: 102 | raise UsageError(f"Missing config key: {key}") 103 | return value 104 | 105 | cfg = Config(GPT_TRACE_CONFIG_PATH, **DEFAULT_CONFIG) 106 | -------------------------------------------------------------------------------- /gpttrace/examples.py: -------------------------------------------------------------------------------- 1 | import os 2 | from langchain.document_loaders import JSONLoader 3 | from langchain.vectorstores import FAISS 4 | from langchain.embeddings.openai import OpenAIEmbeddings 5 | # from langchain.embeddings.openai import OpenAIEmbeddings 6 | # from langchain.vectorstores import DocArrayInMemorySearch 7 | # from langchain.document_loaders import DirectoryLoader 8 | 9 | # loader = DirectoryLoader('gpttrace/tools/', glob="**/*.bt") 10 | # docs = loader.load() 11 | # embeddings = OpenAIEmbeddings() 12 | 13 | # db = DocArrayInMemorySearch.from_documents(docs, embeddings) 14 | 15 | simple_examples = """ 16 | # list probes containing "sleep" 17 | bpftrace -l '*sleep*' 18 | 19 | # trace processes calling sleep 20 | bpftrace -e 'kprobe:do_nanosleep { printf("PID %d sleeping...\n", pid); }' 21 | 22 | # count syscalls by process name 23 | bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }' 24 | 25 | # Files opened by process 26 | bpftrace -e 'tracepoint:syscalls:sys_enter_open { printf("%s %s\n", comm, str(args->filename)); }' 27 | 28 | # Syscall count by program 29 | bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }' 30 | 31 | # Read bytes by process: 32 | bpftrace -e 'tracepoint:syscalls:sys_exit_read /args->ret/ { @[comm] = sum(args->ret); }' 33 | 34 | # Read size distribution by process: 35 | bpftrace -e 'tracepoint:syscalls:sys_exit_read { @[comm] = hist(args->ret); }' 36 | 37 | # Show per-second syscall rates: 38 | bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @ = count(); } interval:s:1 { print(@); clear(@); }' 39 | 40 | # Trace disk size 
by process 41 | bpftrace -e 'tracepoint:block:block_rq_issue { printf("%d %s %d\n", pid, comm, args->bytes); }' 42 | 43 | # Count page faults by process 44 | bpftrace -e 'software:faults:1 { @[comm] = count(); }' 45 | 46 | # Count LLC cache misses by process name and PID (uses PMCs): 47 | bpftrace -e 'hardware:cache-misses:1000000 { @[comm, pid] = count(); }' 48 | 49 | # Profile user-level stacks at 99 Hertz, for PID 189: 50 | bpftrace -e 'profile:hz:99 /pid == 189/ { @[ustack] = count(); }' 51 | 52 | # Files opened, for processes in the root cgroup-v2 53 | bpftrace -e 'tracepoint:syscalls:sys_enter_openat /cgroup == cgroupid("/sys/fs/cgroup/unified/mycg")/ { printf("%s\n", str(args->filename)); }' 54 | 55 | """ 56 | 57 | def get_bpftrace_basic_examples(query: str) -> str: 58 | loader = JSONLoader( 59 | file_path='./tools/examples.json', 60 | jq_schema='.data[].content', 61 | json_lines=True 62 | ) 63 | documents = loader.load() 64 | embeddings = OpenAIEmbeddings() 65 | 66 | # Check if the vector database files exist 67 | if not (os.path.exists("./data_save/vector_db.faiss") and os.path.exists("./data_save/vector_db.pkl")): 68 | db = FAISS.from_documents(documents, embeddings) 69 | db.save_local("./data_save", index_name="vector_db") 70 | else: 71 | # Load an existing FAISS vector store 72 | db = FAISS.load_local("./data_save", index_name="vector_db", embeddings=embeddings) 73 | 74 | results = db.search(query, search_type='similarity') 75 | results = [result.page_content for result in results] 76 | return "\n".join(results[:2]) 77 | 78 | def construct_bpftrace_examples(text: str) -> str: 79 | examples = get_bpftrace_basic_examples(text) 80 | # docs = db.similarity_search(text) 81 | # examples += "\n The following is a more complex example: \n" 82 | # examples += docs[0].page_content 83 | return examples 84 | 85 | if __name__ == "__main__": 86 | get_bpftrace_basic_examples("Trace allocations and display each individual allocator function call") 87 | 
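`get_bpftrace_basic_examples` above ranks the stored examples against the query and returns the two best matches joined by newlines. The sketch below illustrates only that "rank, keep the top k, join" step without calling OpenAI or FAISS. It swaps in a plain token-overlap (Jaccard) score for the embedding similarity, and `top_examples` is a hypothetical helper, not part of GPTtrace.

```python
import re


def top_examples(query: str, examples: list, k: int = 2) -> str:
    """Return the k examples sharing the most tokens with the query, joined by newlines.

    Hypothetical stand-in for the FAISS similarity search: it uses Jaccard
    token overlap instead of embedding similarity.
    """

    def tokens(text: str) -> set:
        # Lowercased alphanumeric/underscore tokens; punctuation is ignored.
        return set(re.findall(r"[a-z0-9_]+", text.lower()))

    query_tokens = tokens(query)

    def score(example: str) -> float:
        # Jaccard similarity between the query and example token sets.
        example_tokens = tokens(example)
        union = query_tokens | example_tokens
        return len(query_tokens & example_tokens) / len(union) if union else 0.0

    ranked = sorted(examples, key=score, reverse=True)
    return "\n".join(ranked[:k])


examples = [
    "# Count page faults by process\nbpftrace -e 'software:faults:1 { @[comm] = count(); }'",
    "# Show per-second syscall rates:\nbpftrace -e 'tracepoint:raw_syscalls:sys_enter { @ = count(); } interval:s:1 { print(@); clear(@); }'",
    "# Syscall count by program\nbpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'",
]
best = top_examples("count page faults by process", examples, k=1)
print(best)
```

Token overlap misses paraphrases that an embedding model would catch, so this is only useful for seeing the shape of the retrieval step or for testing the prompt construction offline.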
-------------------------------------------------------------------------------- /gpttrace/execute.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import shutil 4 | import tempfile 5 | import openai 6 | from litellm import completion 7 | 8 | from gpttrace.utils.common import get_doc_content_for_query, init_conversation 9 | from gpttrace.prompt import construct_running_prompt, construct_prompt_on_error, construct_prompt_for_explain 10 | from gpttrace.bpftrace import run_bpftrace 11 | 12 | 13 | def call_gpt_api(prompt: str) -> str: 14 | """ 15 | This function sends a list of messages to the GPT model 16 | """ 17 | messages = [{"role": "user", "content": prompt}] 18 | response = openai.ChatCompletion.create( 19 | model="gpt-3.5-turbo-0613", 20 | messages=messages, 21 | ) 22 | return response["choices"][0]["message"]["content"] 23 | 24 | def call_litellm(prompt: str) -> str: 25 | """ 26 | This function sends a list of messages to the selected litellm model 27 | OpenAI, Azure, Cohere, Anthropic, Replicate models supported 28 | """ 29 | messages = [{"role": "user", "content": prompt}] 30 | # see supported models here: 31 | # https://litellm.readthedocs.io/en/latest/supported/ 32 | response = completion( 33 | model="claude-instant-1", 34 | messages=messages, 35 | ) 36 | return response["choices"][0]["message"]["content"] 37 | 38 | def execute(user_input: str, verbose: bool = False, retry: int = 5, previous_prompt: str = None, output: str = None) -> None: 39 | """ 40 | Convert the user request into a BPF command and execute it. 41 | 42 | :param user_input: The user's request. 43 | :param need_train: Whether to use the vector database. 44 | :param verbose: Whether to print extra information. 
45 | """ 46 | if retry <= 0: 47 | print("Retry times exceeded...") 48 | return # give up here instead of recursing with a negative retry count 49 | # print("Sending query to ChatGPT: " + user_input) 50 | if previous_prompt is None: 51 | prompt = construct_running_prompt(user_input) 52 | else: 53 | prompt = construct_prompt_on_error( 54 | previous_prompt, user_input, output) 55 | if verbose is True: 56 | print("Prompt: " + prompt) 57 | res = run_bpftrace(prompt, verbose) 58 | if res["stderr"] != '': 59 | print("output: " + json.dumps(res)) 60 | print("retry time " + str(retry) + "...") 61 | # retry 62 | res = execute(user_input, verbose, retry - 1, prompt, json.dumps(res)) 63 | else: 64 | # success 65 | print("AI explanation:") 66 | prompt = construct_prompt_for_explain(user_input, res["stdout"]) 67 | if verbose is True: 68 | print("Prompt: " + prompt) 69 | explain = call_gpt_api(prompt) 70 | print(explain) 71 | 72 | -------------------------------------------------------------------------------- /gpttrace/prompt.py: -------------------------------------------------------------------------------- 1 | import os 2 | from gpttrace.examples import construct_bpftrace_examples 3 | 4 | 5 | def construct_prompt_on_error(previous_prompt: str, text: str, output: str) -> str: 6 | """ 7 | Construct prompts when an error occurs. 8 | 9 | :param text: User request. 10 | :return: Prompt. 11 | """ 12 | examples = construct_bpftrace_examples(text) 13 | return f""" 14 | {previous_prompt} 15 | 16 | The previous command failed to execute or did not finish. 17 | Maybe you can try listing the attach points and choosing one to attach to, 18 | if you have not done so before.
19 | The original command and its output are as follows: 20 | 21 | {output} 22 | """ 23 | 24 | def construct_prompt_for_explain(text: str, output: str) -> str: 25 | # Truncate long output to stay within the token limit 26 | if len(output) > 4096: 27 | output = output[:4096] 28 | return f""" 29 | Please explain the output of the previous bpftrace run: 30 | 31 | {output} 32 | 33 | The original user request is: 34 | 35 | {text} 36 | """ 37 | 38 | def construct_running_prompt(text: str) -> str: 39 | """ 40 | Construct prompts that translate user requests into bpf commands. 41 | 42 | :param text: User request. 43 | :return: Prompt. 44 | """ 45 | examples = construct_bpftrace_examples(text) 46 | return f""" 47 | As a supportive assistant to a Linux system administrator, 48 | your role involves leveraging bpftrace to generate eBPF code that aids 49 | in problem-solving, as well as responding to queries. 50 | Note that you may not always need to call the bpftrace tool function. 51 | Here are some pertinent examples that align with the user's requests: 52 | 53 | {examples} 54 | 55 | Now, you have received the following request from a user: {text} 56 | Please utilize your capabilities to the fullest extent to accomplish this task. 57 | """ 58 | 59 | 60 | def func_call_prompt(cmd: str, help_doc: str) -> str: 61 | """ 62 | Construct a prompt that generates a function call. 63 | 64 | :param cmd: Name of command. 65 | :param help_doc: Help documentation for the command. 66 | :return: The prompt. 67 | """ 68 | example = """ 69 | ```json 70 | { 71 | "name": "get_current_weather", 72 | "description": "Get the current weather", 73 | "parameters": { 74 | "type": "object", 75 | "properties": { 76 | "location": { 77 | "type": "string", 78 | "description": "The city and state, e.g. San Francisco, CA" 79 | }, 80 | "format": { 81 | "type": "string", 82 | "enum": ["celsius", "fahrenheit"], 83 | "description": "The temperature unit to use.
Infer this from the user's location." 84 | } 85 | }, 86 | "required": ["location", "format"] 87 | } 88 | } 89 | ```""" 90 | prompts = f""" 91 | Please generate a JSON representation of the command `{cmd}` as per the provided help documentation: 92 | 93 | {help_doc} 94 | 95 | Your JSON should strictly adhere to the following guidelines: 96 | 97 | - Do not include extra fields such as examples. 98 | - Ensure the command description accurately matches the help documentation. 99 | - Parameter names should not start with a '-' or contain a ','. 100 | - Your format should align with the provided example: {example} 101 | - Assign the most appropriate data type to each parameter. Possible types include "string", "boolean", "integer", and "float". 102 | 103 | IMPORTANT: Provide the JSON representation directly, without any additional explanation or detail. If any information is missing from the help documentation, use your best judgment to provide a logical solution. You are not permitted to request additional information. Do not concern yourself with potential errors or confusion that may arise; your sole responsibility is to generate the JSON code.
104 | """ 105 | return prompts 106 | -------------------------------------------------------------------------------- /gpttrace/utils/__init__.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | 4 | project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) 5 | sys.path.append(project_root) 6 | -------------------------------------------------------------------------------- /gpttrace/utils/common.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pygments 3 | from prompt_toolkit import print_formatted_text 4 | from prompt_toolkit.formatted_text import PygmentsTokens 5 | from pygments_markdown_lexer import MarkdownLexer 6 | from typing import Any 7 | from langchain import ConversationChain, OpenAI 8 | from langchain.chat_models import ChatOpenAI 9 | from langchain.chains.conversation.memory import ConversationBufferMemory 10 | from llama_index import LLMPredictor, ServiceContext, StorageContext, VectorStoreIndex, SimpleDirectoryReader, load_index_from_storage 11 | 12 | from gpttrace.config import cfg 13 | 14 | def get_doc_content_for_query(index: VectorStoreIndex, query: str) -> str: 15 | """ 16 | Find the content from the document that is closest to the user's request 17 | 18 | :param index: Vector database 19 | :param query: User's request 20 | :return: The content that is most relevant to the user's request. 
21 | """ 22 | query_engine = index.as_query_engine() 23 | response = query_engine.query(query) 24 | related_contents = response.source_nodes 25 | if related_contents is not None: 26 | contents = "\nThere are some related information about this query:\n" 27 | for i, content in enumerate(related_contents): 28 | info = f"Info {i}: {content.node.get_text()}\n" 29 | contents += info 30 | return contents 31 | else: 32 | return None 33 | 34 | def pretty_print(input_info: str, lexer: Any = MarkdownLexer, *args: Any, **kwargs: Any) -> None: 35 | """ 36 | This function takes an input string and a lexer (default is MarkdownLexer), 37 | lexes the input using the provided lexer, and then pretty prints the lexed tokens. 38 | 39 | :param input: The string to be lexed and pretty printed. 40 | :param lexer: The lexer to use for lexing the input. Defaults to MarkdownLexer. 41 | :param args: Additional arguments to be passed to the print_formatted_text function. 42 | :param kwargs: Additional keyword arguments to be passed to the print_formatted_text function. 43 | """ 44 | tokens = list(pygments.lex(input_info, lexer=lexer())) 45 | print_formatted_text(PygmentsTokens(tokens), *args, **kwargs) 46 | 47 | def init_conversation(need_train: bool, verbose: bool) -> list[ConversationChain, VectorStoreIndex]: 48 | """ 49 | Initialize the conversation and vector database. 50 | 51 | :param need_train: Whether you need to use a vector database. 52 | :verbose: Whether to print extra information. 53 | :return: Containing two elements: The ConversationChain object is a conversation between a human and an AI. The VectorStoreIndex object is vector database. 
54 | """ 55 | model_name = cfg.get("DEFAULT_MODEL") 56 | llm = ChatOpenAI(model_name=model_name, temperature=0) 57 | agent_chain = ConversationChain(llm=llm, verbose=verbose, 58 | memory=ConversationBufferMemory()) 59 | if need_train: 60 | vector_path = cfg.get("VECTOR_DATABASE_PATH") 61 | if not os.path.exists(vector_path): 62 | print(f"{vector_path} not found. Training...") 63 | md_files = [] 64 | # Get all markdown files in the tutorial 65 | for root, _, files in os.walk(cfg.get("DOC_PATH")): 66 | for file in files: 67 | if file.endswith('.md'): 68 | md_files.append(os.path.join(root, file)) 69 | print(f":: {cfg.get('DOC_PATH')}, {md_files}") 70 | documents = SimpleDirectoryReader(input_files=md_files).load_data() 71 | llm_predictor = LLMPredictor(llm=OpenAI( 72 | temperature=0, model_name="text-davinci-003")) 73 | service_context = ServiceContext.from_defaults( 74 | llm_predictor=llm_predictor) 75 | index = VectorStoreIndex.from_documents( 76 | documents, service_context=service_context) 77 | index.storage_context.persist(vector_path) 78 | print( 79 | f"Training completed, {vector_path} has been saved.") 80 | else: 81 | print(f"Loading the {vector_path}...") 82 | storage_context = StorageContext.from_defaults( 83 | persist_dir=vector_path) 84 | index = load_index_from_storage(storage_context) 85 | else: 86 | index = None 87 | return agent_chain, index 88 | -------------------------------------------------------------------------------- /install.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | sudo apt install bpftrace 4 | sudo apt-get install python3-pip 5 | pip install -r requirements.txt 6 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["hatchling"] 3 | build-backend = "hatchling.build" 4 | 5 | [project] 6 | name = "gpttrace" 7 | 
description = "Generate eBPF programs and tracing with ChatGPT and natural language." 8 | keywords = ["shell", "gpt", "openai", "cli", "productivity"] 9 | readme = "README.md" 10 | license = "MIT" 11 | requires-python = ">=3.6" 12 | authors = [{ name = "eunomia-bpf", email = "2629757717@qq.com" }] 13 | dynamic = ["version"] 14 | classifiers = [ 15 | "Operating System :: OS Independent", 16 | "Topic :: Software Development", 17 | "License :: OSI Approved :: MIT License", 18 | "Intended Audience :: Information Technology", 19 | "Intended Audience :: System Administrators", 20 | "Intended Audience :: Developers", 21 | "Programming Language :: Python :: 3 :: Only", 22 | "Programming Language :: Python :: 3.6", 23 | "Programming Language :: Python :: 3.7", 24 | "Programming Language :: Python :: 3.8", 25 | "Programming Language :: Python :: 3.9", 26 | "Programming Language :: Python :: 3.10", 27 | "Programming Language :: Python :: 3.11", 28 | ] 29 | dependencies = [ 30 | "langchain>=0.0.227", 31 | "llama_index>=0.7.3", 32 | "marko>=2.0.0", 33 | "openai>=0.27.8", 34 | "prompt_toolkit>=3.0.38", 35 | "Pygments>=2.15.1", 36 | "pygments_markdown_lexer>=0.1.0.dev39", 37 | "click>=8.1.4" 38 | ] 39 | 40 | # Program entry function of gpttrace. 41 | [project.scripts] 42 | gpttrace = "gpttrace:main" 43 | 44 | [project.urls] 45 | homepage = "https://github.com/eunomia-bpf/GPTtrace" 46 | repository = "https://github.com/eunomia-bpf/GPTtrace" 47 | documentation = "https://github.com/eunomia-bpf/GPTtrace/blob/main/README.md" 48 | 49 | [tool.hatch.version] 50 | path = "gpttrace/__init__.py" 51 | 52 | # Generate the .whl file, which is installed when the user use `pip install package` 53 | [tool.hatch.build.targets.wheel] 54 | only-include = [ 55 | "gpttrace", 56 | "data_save", 57 | ] 58 | # This will be uploaded to pypi. 
59 | [tool.hatch.build.targets.sdist] 60 | only-include = [ 61 | "gpttrace", 62 | "doc", 63 | "data_save", 64 | "README.md", 65 | "LICENSE", 66 | "pyproject.toml", 67 | ] 68 | 69 | [tool.isort] 70 | profile = "black" 71 | skip = "__init__.py" 72 | 73 | [tool.mypy] 74 | strict = true 75 | 76 | [tool.ruff] 77 | select = [ 78 | "E", # pycodestyle errors. 79 | "W", # pycodestyle warnings. 80 | "F", # pyflakes. 81 | "C", # flake8-comprehensions. 82 | "B", # flake8-bugbear. 83 | ] 84 | ignore = [ 85 | "E501", # line too long, handled by black. 86 | "C901", # too complex. 87 | "B008", # do not perform function calls in argument defaults. 88 | ] 89 | 90 | [tool.codespell] 91 | skip = '.git,venv' 92 | # ignore-words-list = '' 93 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | langchain==0.0.227 2 | llama_index==0.7.3 3 | marko==2.0.0 4 | openai==0.27.8 5 | litellm==0.1.226 6 | prompt_toolkit==3.0.38 7 | Pygments==2.15.1 8 | pygments_markdown_lexer==0.1.0.dev39 9 | click==8.1.4 10 | -------------------------------------------------------------------------------- /tools/bashreadline.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * bashreadline Print entered bash commands from all running shells. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * This works by tracing the readline() function using a uretprobe (uprobes). 7 | * 8 | * USAGE: bashreadline.bt 9 | * 10 | * This is a bpftrace version of the bcc tool of the same name. 11 | * 12 | * Copyright 2018 Netflix, Inc. 13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 06-Sep-2018 Brendan Gregg Created this. 16 | */ 17 | 18 | BEGIN 19 | { 20 | printf("Tracing bash commands... 
Hit Ctrl-C to end.\n"); 21 | printf("%-9s %-6s %s\n", "TIME", "PID", "COMMAND"); 22 | } 23 | 24 | uretprobe:/bin/bash:readline 25 | { 26 | time("%H:%M:%S "); 27 | printf("%-6d %s\n", pid, str(retval)); 28 | } 29 | -------------------------------------------------------------------------------- /tools/bashreadline_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of bashreadline, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This prints bash commands from all running bash shells on the system. For 5 | example: 6 | 7 | # ./bashreadline.bt 8 | Attaching 2 probes... 9 | Tracing bash commands... Hit Ctrl-C to end. 10 | TIME PID COMMAND 11 | 06:40:06 5526 df -h 12 | 06:40:09 5526 ls -l 13 | 06:40:18 5526 echo hello bpftrace 14 | 06:40:42 5526 echooo this is a failed command, but we can see it anyway 15 | ^C 16 | 17 | The entered command may fail. This is just showing what command lines were 18 | entered interactively for bash to process. 19 | 20 | It works by tracing the return of the readline() function using uprobes 21 | (specifically a uretprobe). 22 | 23 | 24 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 25 | -------------------------------------------------------------------------------- /tools/biolatency-kp.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * biolatency.bt Block I/O latency as a histogram. 4 | * For Linux, uses bpftrace, eBPF. 5 | * 6 | * This is a bpftrace version of the bcc tool of the same name. 7 | * 8 | * Copyright 2018 Netflix, Inc. 9 | * Licensed under the Apache License, Version 2.0 (the "License") 10 | * 11 | * 13-Sep-2018 Brendan Gregg Created this. 12 | */ 13 | 14 | BEGIN 15 | { 16 | printf("Tracing block device I/O... 
Hit Ctrl-C to end.\n"); 17 | } 18 | 19 | kprobe:blk_account_io_start, 20 | kprobe:__blk_account_io_start 21 | { 22 | @start[arg0] = nsecs; 23 | } 24 | 25 | kprobe:blk_account_io_done, 26 | kprobe:__blk_account_io_done 27 | /@start[arg0]/ 28 | { 29 | @usecs = hist((nsecs - @start[arg0]) / 1000); 30 | delete(@start[arg0]); 31 | } 32 | 33 | END 34 | { 35 | clear(@start); 36 | } 37 | -------------------------------------------------------------------------------- /tools/biolatency.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * biolatency.bt Block I/O latency as a histogram. 4 | * For Linux, uses bpftrace, eBPF. 5 | * 6 | * This is a bpftrace version of the bcc tool of the same name. 7 | * 8 | * Copyright 2018 Netflix, Inc. 9 | * Licensed under the Apache License, Version 2.0 (the "License") 10 | * 11 | * 13-Sep-2018 Brendan Gregg Created this. 12 | */ 13 | 14 | BEGIN 15 | { 16 | printf("Tracing block device I/O... Hit Ctrl-C to end.\n"); 17 | } 18 | 19 | tracepoint:block:block_bio_queue 20 | { 21 | @start[args.sector] = nsecs; 22 | } 23 | 24 | tracepoint:block:block_rq_complete, 25 | tracepoint:block:block_bio_complete 26 | /@start[args.sector]/ 27 | { 28 | @usecs = hist((nsecs - @start[args.sector]) / 1000); 29 | delete(@start[args.sector]); 30 | } 31 | 32 | END 33 | { 34 | clear(@start); 35 | } 36 | -------------------------------------------------------------------------------- /tools/biolatency_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of biolatency, the Linux BPF/bpftrace version. 2 | 3 | 4 | This traces block I/O, and shows latency as a power-of-2 histogram. For example: 5 | 6 | # ./biolatency-kp.bt 7 | Attaching 3 probes... 8 | Tracing block device I/O... Hit Ctrl-C to end. 
9 | ^C 10 | 11 | @usecs: 12 | [256, 512) 2 | | 13 | [512, 1K) 10 |@ | 14 | [1K, 2K) 426 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 15 | [2K, 4K) 230 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 16 | [4K, 8K) 9 |@ | 17 | [8K, 16K) 128 |@@@@@@@@@@@@@@@ | 18 | [16K, 32K) 68 |@@@@@@@@ | 19 | [32K, 64K) 0 | | 20 | [64K, 128K) 0 | | 21 | [128K, 256K) 10 |@ | 22 | 23 | While tracing, this shows that 426 block I/O had a latency of between 1K and 2K 24 | usecs (1024 and 2048 microseconds), which is between 1 and 2 milliseconds. 25 | There are also two modes visible, one between 1 and 2 milliseconds, and another 26 | between 8 and 16 milliseconds: this sounds like cache hits and cache misses. 27 | There were also 10 I/O with latency 128 to 256 ms: outliers. Other tools and 28 | instrumentation, like biosnoop.bt, can shed more light on those outliers. 29 | 30 | 31 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 32 | The bcc version provides options to customize the output. 33 | 34 | "biolatency.bt" is an updated version of "biolatency-kp.bt" and does basically 35 | the same thing utilizing the tracepoints instead of kprobes. 36 | -------------------------------------------------------------------------------- /tools/biosnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * biosnoop.bt Block I/O tracing tool, showing per I/O latency. 4 | * For Linux, uses bpftrace, eBPF. 5 | * 6 | * TODO: switch to block tracepoints. Add offset and size columns. 7 | * 8 | * This is a bpftrace version of the bcc tool of the same name. 9 | * 10 | * 15-Nov-2017 Brendan Gregg Created this. 
11 | */ 12 | 13 | #ifndef BPFTRACE_HAVE_BTF 14 | #include 15 | #include 16 | #endif 17 | 18 | BEGIN 19 | { 20 | printf("%-12s %-7s %-16s %-6s %7s\n", "TIME(ms)", "DISK", "COMM", "PID", "LAT(ms)"); 21 | } 22 | 23 | kprobe:blk_account_io_start, 24 | kprobe:__blk_account_io_start 25 | { 26 | @start[arg0] = nsecs; 27 | @iopid[arg0] = pid; 28 | @iocomm[arg0] = comm; 29 | @disk[arg0] = ((struct request *)arg0)->q->disk->disk_name; 30 | } 31 | 32 | kprobe:blk_account_io_done, 33 | kprobe:__blk_account_io_done 34 | /@start[arg0] != 0 && @iopid[arg0] != 0 && @iocomm[arg0] != ""/ 35 | 36 | { 37 | $now = nsecs; 38 | printf("%-12u %-7s %-16s %-6d %7d\n", 39 | elapsed / 1e6, @disk[arg0], @iocomm[arg0], @iopid[arg0], 40 | ($now - @start[arg0]) / 1e6); 41 | 42 | delete(@start[arg0]); 43 | delete(@iopid[arg0]); 44 | delete(@iocomm[arg0]); 45 | delete(@disk[arg0]); 46 | } 47 | 48 | END 49 | { 50 | clear(@start); 51 | clear(@iopid); 52 | clear(@iocomm); 53 | clear(@disk); 54 | } 55 | -------------------------------------------------------------------------------- /tools/biosnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of biosnoop, the Linux BPF/bpftrace version. 2 | 3 | 4 | This traces block I/O, and shows the issuing process (at least, the process 5 | that was on-CPU at the time of queue insert) and the latency of the I/O: 6 | 7 | # ./biosnoop.bt 8 | Attaching 4 probes... 
9 | TIME(ms) DISK COMM PID LAT(ms) 10 | 611 nvme0n1 bash 4179 10 11 | 611 nvme0n1 cksum 4179 0 12 | 627 nvme0n1 cksum 4179 15 13 | 641 nvme0n1 cksum 4179 13 14 | 644 nvme0n1 cksum 4179 3 15 | 658 nvme0n1 cksum 4179 13 16 | 673 nvme0n1 cksum 4179 14 17 | 686 nvme0n1 cksum 4179 13 18 | 701 nvme0n1 cksum 4179 14 19 | 710 nvme0n1 cksum 4179 8 20 | 717 nvme0n1 cksum 4179 6 21 | 728 nvme0n1 cksum 4179 10 22 | 735 nvme0n1 cksum 4179 6 23 | 751 nvme0n1 cksum 4179 10 24 | 758 nvme0n1 cksum 4179 17 25 | 783 nvme0n1 cksum 4179 12 26 | 796 nvme0n1 cksum 4179 25 27 | 802 nvme0n1 cksum 4179 32 28 | [...] 29 | 30 | This output shows the cksum process was issuing block I/O, which were 31 | completing with around 12 milliseconds of latency. Each block I/O event is 32 | printed out, with a completion time as the first column, measured from 33 | program start. 34 | 35 | 36 | An example of some background flushing: 37 | 38 | # ./biosnoop.bt 39 | Attaching 4 probes... 40 | TIME(ms) DISK COMM PID LAT(ms) 41 | 2966 nvme0n1 jbd2/nvme0n1-8 615 0 42 | 2967 nvme0n1 jbd2/nvme0n1-8 615 0 43 | [...] 44 | 45 | 46 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 47 | The bcc version provides more fields. 48 | -------------------------------------------------------------------------------- /tools/biostacks.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * biostacks - Show disk I/O latency with initialization stacks. 4 | * 5 | * See BPF Performance Tools, Chapter 9, for an explanation of this tool. 6 | * 7 | * Copyright (c) 2019 Brendan Gregg. 8 | * Licensed under the Apache License, Version 2.0 (the "License"). 9 | * This was originally created for the BPF Performance Tools book 10 | * published by Addison Wesley. ISBN-13: 9780136554820 11 | * When copying or porting, include this comment. 12 | * 13 | * 19-Mar-2019 Brendan Gregg Created this. 
14 | */ 15 | 16 | BEGIN 17 | { 18 | printf("Tracing block I/O with init stacks. Hit Ctrl-C to end.\n"); 19 | } 20 | 21 | kprobe:blk_account_io_start, 22 | kprobe:__blk_account_io_start 23 | { 24 | @reqstack[arg0] = kstack; 25 | @reqts[arg0] = nsecs; 26 | } 27 | 28 | kprobe:blk_start_request, 29 | kprobe:blk_mq_start_request 30 | /@reqts[arg0]/ 31 | { 32 | @usecs[@reqstack[arg0]] = hist(nsecs - @reqts[arg0]); 33 | delete(@reqstack[arg0]); 34 | delete(@reqts[arg0]); 35 | } 36 | 37 | END 38 | { 39 | clear(@reqstack); clear(@reqts); 40 | } 41 | -------------------------------------------------------------------------------- /tools/biostacks_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of biostacks, the Linux BCC/eBPF version. 2 | 3 | 4 | This tool shows block I/O latency as a histogram, with the kernel stack trace 5 | that initiated the I/O. This can help explain disk I/O that is not directly 6 | requested by applications (eg, metadata reads on writes, resilvering, etc). 7 | For example: 8 | 9 | # ./biostacks.bt 10 | Attaching 5 probes... 11 | Tracing block I/O with init stacks. Hit Ctrl-C to end. 12 | ^C 13 | 14 | @usecs[ 15 | blk_account_io_start+1 16 | blk_mq_make_request+1102 17 | generic_make_request+292 18 | submit_bio+115 19 | _xfs_buf_ioapply+798 20 | xfs_buf_submit+101 21 | xlog_bdstrat+43 22 | xlog_sync+705 23 | xlog_state_release_iclog+108 24 | _xfs_log_force+542 25 | xfs_log_force+44 26 | xfsaild+428 27 | kthread+289 28 | ret_from_fork+53 29 | ]: 30 | [64K, 128K) 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 31 | 32 | [...] 
33 | 34 | @usecs[ 35 | blk_account_io_start+1 36 | blk_mq_make_request+707 37 | generic_make_request+292 38 | submit_bio+115 39 | xfs_add_to_ioend+455 40 | xfs_do_writepage+758 41 | write_cache_pages+524 42 | xfs_vm_writepages+190 43 | do_writepages+75 44 | __writeback_single_inode+69 45 | writeback_sb_inodes+481 46 | __writeback_inodes_wb+103 47 | wb_writeback+625 48 | wb_workfn+384 49 | process_one_work+478 50 | worker_thread+50 51 | kthread+289 52 | ret_from_fork+53 53 | ]: 54 | [8K, 16K) 560 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 55 | [16K, 32K) 218 |@@@@@@@@@@@@@@@@@@@@ | 56 | [32K, 64K) 26 |@@ | 57 | [64K, 128K) 2 | | 58 | [128K, 256K) 53 |@@@@ | 59 | [256K, 512K) 60 |@@@@@ | 60 | 61 | This output shows the most frequent stack was XFS writeback, with latencies 62 | between 8 and 512 microseconds. The other stack included here shows an XFS 63 | log sync. 64 | -------------------------------------------------------------------------------- /tools/bitesize.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * bitesize Show disk I/O size as a histogram. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: bitesize.bt 7 | * 8 | * This is a bpftrace version of the bcc tool of the same name. 9 | * 10 | * Copyright 2018 Netflix, Inc. 11 | * Licensed under the Apache License, Version 2.0 (the "License") 12 | * 13 | * 07-Sep-2018 Brendan Gregg Created this. 14 | */ 15 | 16 | BEGIN 17 | { 18 | printf("Tracing block device I/O... 
Hit Ctrl-C to end.\n"); 19 | } 20 | 21 | tracepoint:block:block_rq_issue 22 | { 23 | @[args.comm] = hist(args.bytes); 24 | } 25 | 26 | END 27 | { 28 | printf("\nI/O size (bytes) histograms by process name:"); 29 | } 30 | -------------------------------------------------------------------------------- /tools/bitesize_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of bitesize, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This traces disk I/O via the block I/O interface, and prints a summary of I/O 5 | sizes as histograms for each process name. For example: 6 | 7 | # ./bitesize.bt 8 | Attaching 3 probes... 9 | Tracing block device I/O... Hit Ctrl-C to end. 10 | ^C 11 | I/O size (bytes) histograms by process name: 12 | 13 | @[cleanup]: 14 | [4K, 8K) 2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 15 | 16 | @[postdrop]: 17 | [4K, 8K) 2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 18 | 19 | @[jps]: 20 | [4K, 8K) 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | 21 | [8K, 16K) 0 | | 22 | [16K, 32K) 0 | | 23 | [32K, 64K) 2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 24 | 25 | @[kworker/2:1H]: 26 | [0] 3 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 27 | [1] 0 | | 28 | [2, 4) 0 | | 29 | [4, 8) 0 | | 30 | [8, 16) 0 | | 31 | [16, 32) 0 | | 32 | [32, 64) 0 | | 33 | [64, 128) 0 | | 34 | [128, 256) 0 | | 35 | [256, 512) 0 | | 36 | [512, 1K) 0 | | 37 | [1K, 2K) 0 | | 38 | [2K, 4K) 0 | | 39 | [4K, 8K) 0 | | 40 | [8K, 16K) 0 | | 41 | [16K, 32K) 0 | | 42 | [32K, 64K) 0 | | 43 | [64K, 128K) 1 |@@@@@@@@@@@@@@@@@ | 44 | 45 | @[jbd2/nvme0n1-8]: 46 | [4K, 8K) 3 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 47 | [8K, 16K) 0 | | 48 | [16K, 32K) 0 | | 49 | [32K, 64K) 2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 50 | [64K, 128K) 1 |@@@@@@@@@@@@@@@@@ | 51 | 52 | @[dd]: 53 | [16K, 32K) 921 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 54 | 55 | The most active process while tracing was "dd", 
which issues 921 I/O between 56 | 16 Kbytes and 32 Kbytes in size. 57 | 58 | 59 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 60 | -------------------------------------------------------------------------------- /tools/capable.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * capable Trace security capability checks (cap_capable()). 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: capable.bt 7 | * 8 | * This is a bpftrace version of the bcc tool of the same name. 9 | * 10 | * Copyright 2018 Netflix, Inc. 11 | * Licensed under the Apache License, Version 2.0 (the "License") 12 | * 13 | * 08-Sep-2018 Brendan Gregg Created this. 14 | */ 15 | 16 | BEGIN 17 | { 18 | printf("Tracing cap_capable syscalls... Hit Ctrl-C to end.\n"); 19 | printf("%-9s %-6s %-6s %-16s %-4s %-20s AUDIT\n", "TIME", "UID", "PID", 20 | "COMM", "CAP", "NAME"); 21 | @cap[0] = "CAP_CHOWN"; 22 | @cap[1] = "CAP_DAC_OVERRIDE"; 23 | @cap[2] = "CAP_DAC_READ_SEARCH"; 24 | @cap[3] = "CAP_FOWNER"; 25 | @cap[4] = "CAP_FSETID"; 26 | @cap[5] = "CAP_KILL"; 27 | @cap[6] = "CAP_SETGID"; 28 | @cap[7] = "CAP_SETUID"; 29 | @cap[8] = "CAP_SETPCAP"; 30 | @cap[9] = "CAP_LINUX_IMMUTABLE"; 31 | @cap[10] = "CAP_NET_BIND_SERVICE"; 32 | @cap[11] = "CAP_NET_BROADCAST"; 33 | @cap[12] = "CAP_NET_ADMIN"; 34 | @cap[13] = "CAP_NET_RAW"; 35 | @cap[14] = "CAP_IPC_LOCK"; 36 | @cap[15] = "CAP_IPC_OWNER"; 37 | @cap[16] = "CAP_SYS_MODULE"; 38 | @cap[17] = "CAP_SYS_RAWIO"; 39 | @cap[18] = "CAP_SYS_CHROOT"; 40 | @cap[19] = "CAP_SYS_PTRACE"; 41 | @cap[20] = "CAP_SYS_PACCT"; 42 | @cap[21] = "CAP_SYS_ADMIN"; 43 | @cap[22] = "CAP_SYS_BOOT"; 44 | @cap[23] = "CAP_SYS_NICE"; 45 | @cap[24] = "CAP_SYS_RESOURCE"; 46 | @cap[25] = "CAP_SYS_TIME"; 47 | @cap[26] = "CAP_SYS_TTY_CONFIG"; 48 | @cap[27] = "CAP_MKNOD"; 49 | @cap[28] = "CAP_LEASE"; 50 | @cap[29] = "CAP_AUDIT_WRITE"; 51 | @cap[30] = "CAP_AUDIT_CONTROL"; 52 | @cap[31] = 
"CAP_SETFCAP"; 53 | @cap[32] = "CAP_MAC_OVERRIDE"; 54 | @cap[33] = "CAP_MAC_ADMIN"; 55 | @cap[34] = "CAP_SYSLOG"; 56 | @cap[35] = "CAP_WAKE_ALARM"; 57 | @cap[36] = "CAP_BLOCK_SUSPEND"; 58 | @cap[37] = "CAP_AUDIT_READ"; 59 | @cap[38] = "CAP_PERFMON"; 60 | @cap[39] = "CAP_BPF"; 61 | @cap[40] = "CAP_CHECKPOINT_RESTORE"; 62 | } 63 | 64 | kprobe:cap_capable 65 | { 66 | $cap = arg2; 67 | $audit = arg3; 68 | time("%H:%M:%S "); 69 | printf("%-6d %-6d %-16s %-4d %-20s %d\n", uid, pid, comm, $cap, 70 | @cap[$cap], $audit); 71 | } 72 | 73 | END 74 | { 75 | clear(@cap); 76 | } 77 | -------------------------------------------------------------------------------- /tools/capable_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of capable, the Linux bpftrace/eBPF version. 2 | 3 | 4 | capable traces calls to the kernel cap_capable() function, which does security 5 | capability checks, and prints details for each call. For example: 6 | 7 | # ./capable.bt 8 | TIME UID PID COMM CAP NAME AUDIT 9 | 22:11:23 114 2676 snmpd 12 CAP_NET_ADMIN 1 10 | 22:11:23 0 6990 run 24 CAP_SYS_RESOURCE 1 11 | 22:11:23 0 7003 chmod 3 CAP_FOWNER 1 12 | 22:11:23 0 7003 chmod 4 CAP_FSETID 1 13 | 22:11:23 0 7005 chmod 4 CAP_FSETID 1 14 | 22:11:23 0 7005 chmod 4 CAP_FSETID 1 15 | 22:11:23 0 7006 chown 4 CAP_FSETID 1 16 | 22:11:23 0 7006 chown 4 CAP_FSETID 1 17 | 22:11:23 0 6990 setuidgid 6 CAP_SETGID 1 18 | 22:11:23 0 6990 setuidgid 6 CAP_SETGID 1 19 | 22:11:23 0 6990 setuidgid 7 CAP_SETUID 1 20 | 22:11:24 0 7013 run 24 CAP_SYS_RESOURCE 1 21 | 22:11:24 0 7026 chmod 3 CAP_FOWNER 1 22 | 22:11:24 0 7026 chmod 4 CAP_FSETID 1 23 | 22:11:24 0 7028 chmod 4 CAP_FSETID 1 24 | 22:11:24 0 7028 chmod 4 CAP_FSETID 1 25 | 22:11:24 0 7029 chown 4 CAP_FSETID 1 26 | 22:11:24 0 7029 chown 4 CAP_FSETID 1 27 | 22:11:24 0 7013 setuidgid 6 CAP_SETGID 1 28 | 22:11:24 0 7013 setuidgid 6 CAP_SETGID 1 29 | 22:11:24 0 7013 setuidgid 7 CAP_SETUID 1 30 | 22:11:25 0 7036 run 
24 CAP_SYS_RESOURCE 1 31 | 22:11:25 0 7049 chmod 3 CAP_FOWNER 1 32 | 22:11:25 0 7049 chmod 4 CAP_FSETID 1 33 | 22:11:25 0 7051 chmod 4 CAP_FSETID 1 34 | 22:11:25 0 7051 chmod 4 CAP_FSETID 1 35 | [...] 36 | 37 | This can be useful for general debugging, and also security enforcement: 38 | determining a whitelist of capabilities an application needs. 39 | 40 | The output above includes various capability checks: snmpd checking 41 | CAP_NET_ADMIN, run checking CAP_SYS_RESOURCE, then some short-lived processes 42 | checking CAP_FOWNER, CAP_FSETID, etc. 43 | 44 | To see what each of these capabilities does, check the capabilities(7) man 45 | page and the kernel source. 46 | 47 | 48 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 49 | The bcc version provides options to customize the output. 50 | -------------------------------------------------------------------------------- /tools/cpuwalk.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * cpuwalk Sample which CPUs are executing processes. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: cpuwalk.bt 7 | * 8 | * This is a bpftrace version of the DTraceToolkit tool of the same name. 9 | * 10 | * Copyright 2018 Netflix, Inc. 11 | * Licensed under the Apache License, Version 2.0 (the "License") 12 | * 13 | * 08-Sep-2018 Brendan Gregg Created this. 14 | */ 15 | 16 | BEGIN 17 | { 18 | printf("Sampling CPU at 99hz... Hit Ctrl-C to end.\n"); 19 | } 20 | 21 | profile:hz:99 22 | /pid/ 23 | { 24 | @cpu = lhist(cpu, 0, 1000, 1); 25 | } 26 | -------------------------------------------------------------------------------- /tools/cpuwalk_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of cpuwalk, the Linux bpftrace/eBPF version. 2 | 3 | 4 | cpuwalk samples which CPUs processes are running on, and prints a summary 5 | histogram. 
For example, here is a Linux kernel build on a 36-CPU server: 6 | 7 | # ./cpuwalk.bt 8 | Attaching 2 probes... 9 | Sampling CPU at 99hz... Hit Ctrl-C to end. 10 | ^C 11 | 12 | @cpu: 13 | [0, 1) 130 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 14 | [1, 2) 137 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 15 | [2, 3) 99 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 16 | [3, 4) 99 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 17 | [4, 5) 82 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 18 | [5, 6) 34 |@@@@@@@@@@@@ | 19 | [6, 7) 67 |@@@@@@@@@@@@@@@@@@@@@@@@ | 20 | [7, 8) 41 |@@@@@@@@@@@@@@@ | 21 | [8, 9) 97 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 22 | [9, 10) 140 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 23 | [10, 11) 105 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 24 | [11, 12) 77 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 25 | [12, 13) 39 |@@@@@@@@@@@@@@ | 26 | [13, 14) 58 |@@@@@@@@@@@@@@@@@@@@@ | 27 | [14, 15) 64 |@@@@@@@@@@@@@@@@@@@@@@@ | 28 | [15, 16) 57 |@@@@@@@@@@@@@@@@@@@@@ | 29 | [16, 17) 99 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 30 | [17, 18) 56 |@@@@@@@@@@@@@@@@@@@@ | 31 | [18, 19) 44 |@@@@@@@@@@@@@@@@ | 32 | [19, 20) 80 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 33 | [20, 21) 64 |@@@@@@@@@@@@@@@@@@@@@@@ | 34 | [21, 22) 59 |@@@@@@@@@@@@@@@@@@@@@ | 35 | [22, 23) 88 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 36 | [23, 24) 84 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 37 | [24, 25) 29 |@@@@@@@@@@ | 38 | [25, 26) 48 |@@@@@@@@@@@@@@@@@ | 39 | [26, 27) 62 |@@@@@@@@@@@@@@@@@@@@@@@ | 40 | [27, 28) 66 |@@@@@@@@@@@@@@@@@@@@@@@@ | 41 | [28, 29) 57 |@@@@@@@@@@@@@@@@@@@@@ | 42 | [29, 30) 59 |@@@@@@@@@@@@@@@@@@@@@ | 43 | [30, 31) 56 |@@@@@@@@@@@@@@@@@@@@ | 44 | [31, 32) 23 |@@@@@@@@ | 45 | [32, 33) 90 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 46 | [33, 34) 62 |@@@@@@@@@@@@@@@@@@@@@@@ | 47 | [34, 35) 39 |@@@@@@@@@@@@@@ | 48 | [35, 36) 68 |@@@@@@@@@@@@@@@@@@@@@@@@@ | 49 | 50 | This shows that all 36 CPUs were active, with some busier than others. 
51 | 52 | 53 | Compare that output to the following workload from an application: 54 | 55 | # ./cpuwalk.bt 56 | Attaching 2 probes... 57 | Sampling CPU at 99hz... Hit Ctrl-C to end. 58 | ^C 59 | 60 | @cpu: 61 | [6, 7) 243 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 62 | [7, 8) 0 | | 63 | [8, 9) 0 | | 64 | [9, 10) 0 | | 65 | [10, 11) 0 | | 66 | [11, 12) 0 | | 67 | [12, 13) 0 | | 68 | [13, 14) 0 | | 69 | [14, 15) 0 | | 70 | [15, 16) 0 | | 71 | [16, 17) 0 | | 72 | [17, 18) 0 | | 73 | [18, 19) 0 | | 74 | [19, 20) 0 | | 75 | [20, 21) 1 | | 76 | 77 | In this case, only a single CPU (6) is really active doing work. Only a single 78 | sample was taken of another CPU (20) running a process. If the workload was 79 | supposed to be making use of multiple CPUs, it isn't, and that can be 80 | investigated (application's configuration, number of threads, CPU binding, etc). 81 | -------------------------------------------------------------------------------- /tools/dcsnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * dcsnoop Trace directory entry cache (dcache) lookups. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * This uses kernel dynamic tracing of kernel functions, lookup_fast() and 7 | * d_lookup(), which will need to be modified to match kernel changes. See 8 | * code comments. 9 | * 10 | * USAGE: dcsnoop.bt 11 | * 12 | * Copyright 2018 Netflix, Inc. 13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 08-Sep-2018 Brendan Gregg Created this. 16 | */ 17 | 18 | #ifndef BPFTRACE_HAVE_BTF 19 | #include 20 | #include 21 | 22 | // from fs/namei.c: 23 | struct nameidata { 24 | struct path path; 25 | struct qstr last; 26 | // [...] 27 | }; 28 | #endif 29 | 30 | BEGIN 31 | { 32 | printf("Tracing dcache lookups... 
Hit Ctrl-C to end.\n"); 33 | printf("%-8s %-6s %-16s %1s %s\n", "TIME", "PID", "COMM", "T", "FILE"); 34 | } 35 | 36 | // comment out this block to avoid showing hits: 37 | kprobe:lookup_fast, 38 | kprobe:lookup_fast.constprop.* 39 | { 40 | $nd = (struct nameidata *)arg0; 41 | printf("%-8d %-6d %-16s R %s\n", elapsed / 1e6, pid, comm, 42 | str($nd->last.name)); 43 | } 44 | 45 | kprobe:d_lookup 46 | { 47 | $name = (struct qstr *)arg1; 48 | @fname[tid] = $name->name; 49 | } 50 | 51 | kretprobe:d_lookup 52 | /@fname[tid]/ 53 | { 54 | printf("%-8d %-6d %-16s M %s\n", elapsed / 1e6, pid, comm, 55 | str(@fname[tid])); 56 | delete(@fname[tid]); 57 | } 58 | -------------------------------------------------------------------------------- /tools/dcsnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of dcsnoop, the Linux bpftrace/eBPF version. 2 | 3 | 4 | dcsnoop traces directory entry cache (dcache) lookups, and can be used for 5 | further investigation beyond dcstat(8). The output is likely verbose, as 6 | dcache lookups are likely frequent. For example: 7 | 8 | # ./dcsnoop.bt 9 | Attaching 4 probes... 10 | Tracing dcache lookups... Hit Ctrl-C to end. 
11 | TIME PID COMM T FILE 12 | 427 1518 irqbalance R proc/interrupts 13 | 427 1518 irqbalance R interrupts 14 | 427 1518 irqbalance R proc/stat 15 | 427 1518 irqbalance R stat 16 | 483 2440 snmp-pass R proc/cpuinfo 17 | 483 2440 snmp-pass R cpuinfo 18 | 486 2440 snmp-pass R proc/stat 19 | 486 2440 snmp-pass R stat 20 | 834 1744 snmpd R proc/net/dev 21 | 834 1744 snmpd R net/dev 22 | 834 1744 snmpd R self/net 23 | 834 1744 snmpd R 1744 24 | 834 1744 snmpd R net 25 | 834 1744 snmpd R dev 26 | 834 1744 snmpd R proc/net/if_inet6 27 | 834 1744 snmpd R net/if_inet6 28 | 834 1744 snmpd R self/net 29 | 834 1744 snmpd R 1744 30 | 834 1744 snmpd R net 31 | 834 1744 snmpd R if_inet6 32 | 835 1744 snmpd R sys/class/net/docker0/device/vendor 33 | 835 1744 snmpd R class/net/docker0/device/vendor 34 | 835 1744 snmpd R net/docker0/device/vendor 35 | 835 1744 snmpd R docker0/device/vendor 36 | 835 1744 snmpd R devices/virtual/net/docker0 37 | 835 1744 snmpd R virtual/net/docker0 38 | 835 1744 snmpd R net/docker0 39 | 835 1744 snmpd R docker0 40 | 835 1744 snmpd R device/vendor 41 | 835 1744 snmpd R proc/sys/net/ipv4/neigh/docker0/retrans_time_ms 42 | 835 1744 snmpd R sys/net/ipv4/neigh/docker0/retrans_time_ms 43 | 835 1744 snmpd R net/ipv4/neigh/docker0/retrans_time_ms 44 | 835 1744 snmpd R ipv4/neigh/docker0/retrans_time_ms 45 | 835 1744 snmpd R neigh/docker0/retrans_time_ms 46 | 835 1744 snmpd R docker0/retrans_time_ms 47 | 835 1744 snmpd R retrans_time_ms 48 | 835 1744 snmpd R proc/sys/net/ipv6/neigh/docker0/retrans_time_ms 49 | 835 1744 snmpd R sys/net/ipv6/neigh/docker0/retrans_time_ms 50 | 835 1744 snmpd R net/ipv6/neigh/docker0/retrans_time_ms 51 | 835 1744 snmpd R ipv6/neigh/docker0/retrans_time_ms 52 | 835 1744 snmpd R neigh/docker0/retrans_time_ms 53 | 835 1744 snmpd R docker0/retrans_time_ms 54 | 835 1744 snmpd R retrans_time_ms 55 | 835 1744 snmpd R proc/sys/net/ipv6/conf/docker0/forwarding 56 | 835 1744 snmpd R sys/net/ipv6/conf/docker0/forwarding 57 | 835 1744 snmpd R 
net/ipv6/conf/docker0/forwarding 58 | 835 1744 snmpd R ipv6/conf/docker0/forwarding 59 | 835 1744 snmpd R conf/docker0/forwarding 60 | [...] 61 | 5154 934 cksum R usr/bin/basename 62 | 5154 934 cksum R bin/basename 63 | 5154 934 cksum R basename 64 | 5154 934 cksum R usr/bin/bashbug 65 | 5154 934 cksum R bin/bashbug 66 | 5154 934 cksum R bashbug 67 | 5154 934 cksum M bashbug 68 | 5155 934 cksum R usr/bin/batch 69 | 5155 934 cksum R bin/batch 70 | 5155 934 cksum R batch 71 | 5155 934 cksum M batch 72 | 5155 934 cksum R usr/bin/bc 73 | 5155 934 cksum R bin/bc 74 | 5155 934 cksum R bc 75 | 5155 934 cksum M bc 76 | 5169 934 cksum R usr/bin/bdftopcf 77 | 5169 934 cksum R bin/bdftopcf 78 | 5169 934 cksum R bdftopcf 79 | 5169 934 cksum M bdftopcf 80 | 5173 934 cksum R usr/bin/bdftruncate 81 | 5173 934 cksum R bin/bdftruncate 82 | 5173 934 cksum R bdftruncate 83 | 5173 934 cksum M bdftruncate 84 | 85 | The way the dcache is currently implemented, each component of a path is 86 | checked in turn. The first line, showing "proc/interrupts" from irqbalance, 87 | will be a lookup for "proc" in a directory (that isn't shown here). If it 88 | finds "proc", it will then look up "interrupts" inside proc. 89 | 90 | The script is easily modifiable to only show misses, reducing the volume of 91 | the output. Or use the bcc version of this tool, which only shows misses by 92 | default: https://github.com/iovisor/bcc 93 | -------------------------------------------------------------------------------- /tools/execsnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * execsnoop.bt Trace new processes via exec() syscalls. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * This traces when processes call exec(). It is handy for identifying new 7 | * processes created via the usual fork()->exec() sequence. Note that the 8 | * return value is not currently traced, so the exec() may have failed. 
9 | * 10 | * TODO: switch to tracepoints args. Support more args. Include retval. 11 | * 12 | * This is a bpftrace version of the bcc tool of the same name. 13 | * 14 | * 15-Nov-2017 Brendan Gregg Created this. 15 | * 11-Sep-2018 " " Switched to use join(). 16 | */ 17 | 18 | BEGIN 19 | { 20 | printf("%-10s %-5s %s\n", "TIME(ms)", "PID", "ARGS"); 21 | } 22 | 23 | tracepoint:syscalls:sys_enter_exec* 24 | { 25 | printf("%-10u %-5d ", elapsed / 1e6, pid); 26 | join(args.argv); 27 | } 28 | -------------------------------------------------------------------------------- /tools/execsnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of execsnoop, the Linux BPF/bpftrace version. 2 | 3 | 4 | Tracing all new process execution (via exec()): 5 | 6 | # ./execsnoop.bt 7 | Attaching 3 probes... 8 | TIME(ms) PID ARGS 9 | 2460 3466 ls --color=auto -lh execsnoop.bt execsnoop.bt.0 execsnoop.bt.1 10 | 3996 3467 man ls 11 | 4005 3473 preconv -e UTF-8 12 | 4005 3473 preconv -e UTF-8 13 | 4005 3473 preconv -e UTF-8 14 | 4005 3473 preconv -e UTF-8 15 | 4005 3473 preconv -e UTF-8 16 | 4005 3474 tbl 17 | 4005 3474 tbl 18 | 4005 3474 tbl 19 | 4005 3474 tbl 20 | 4005 3474 tbl 21 | 4005 3476 nroff -mandoc -rLL=193n -rLT=193n -Tutf8 22 | 4005 3476 nroff -mandoc -rLL=193n -rLT=193n -Tutf8 23 | 4005 3476 nroff -mandoc -rLL=193n -rLT=193n -Tutf8 24 | 4005 3476 nroff -mandoc -rLL=193n -rLT=193n -Tutf8 25 | 4005 3476 nroff -mandoc -rLL=193n -rLT=193n -Tutf8 26 | 4006 3479 pager -rLL=193n 27 | 4006 3479 pager -rLL=193n 28 | 4006 3479 pager -rLL=193n 29 | 4006 3479 pager -rLL=193n 30 | 4006 3479 pager -rLL=193n 31 | 4007 3481 locale charmap 32 | 4008 3482 groff -mtty-char -Tutf8 -mandoc -rLL=193n -rLT=193n 33 | 4009 3483 troff -mtty-char -mandoc -rLL=193n -rLT=193n -Tutf8 34 | 35 | The output begins by showing an "ls" command, and then the process execution 36 | to serve "man ls". 
The same exec arguments appear multiple times: in this case 37 | they are failing as the $PATH variable is walked, until one finally succeeds. 38 | 39 | This tool can be used to discover unwanted short-lived processes that may be 40 | causing performance issues such as latency perturbations. 41 | 42 | 43 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 44 | The bcc version provides more fields and command line options. 45 | -------------------------------------------------------------------------------- /tools/generate.py: -------------------------------------------------------------------------------- 1 | import re 2 | import openai 3 | import os 4 | import json 5 | import glob 6 | import time 7 | openai.api_key = os.getenv("OPENAI_API_KEY") 8 | 9 | 10 | def get_bpf_summary(bpf_code): 11 | response = openai.ChatCompletion.create( 12 | model="gpt-3.5-turbo", 13 | messages=[ 14 | {"role": "system", "content": "You are a helpful assistant."}, 15 | {"role": "user", "content": 16 | f""" 17 | Given this BPF code: 18 | 19 | {bpf_code} 20 | 21 | Provide a concise and clear summary in one or two sentences. 22 | Frame the explanation as a user's request to a developer to write a BPF code. 23 | Avoid beginning with phrases like 'This BPF code is...' Instead, 24 | start with action-oriented phrases such as 'Write a BPF code that...' 
25 | """ 26 | 27 | 28 | } 29 | ], 30 | max_tokens=1500 31 | ) 32 | 33 | output = response.choices[0].message['content'] 34 | return output 35 | 36 | 37 | def get_all_bpf_in_dir(directory): 38 | all_files = glob.glob(directory + '/*.bt') 39 | all_bpf = [] 40 | for file in all_files: 41 | print("opening file: " + file) 42 | # Sleep briefly between requests to avoid hitting the rate limit 43 | # (3 seconds per file) 44 | time.sleep(3) 45 | with open(file, 'r') as f: 46 | bpf_code = f.read() 47 | summary = get_bpf_summary(bpf_code) 48 | all_bpf.append({ 49 | "request": summary, 50 | "bpf": bpf_code 51 | }) 52 | print(bpf_code) 53 | print(summary) 54 | 55 | return all_bpf 56 | 57 | def write_example_to_json(): 58 | directory = './' 59 | all_bpf = get_all_bpf_in_dir(directory) 60 | with open('output.json', 'w') as outfile: 61 | json.dump(all_bpf, outfile) 62 | 63 | def remove_multiline_comments(lines): 64 | """ 65 | Remove multiline comments from a list of lines. 66 | 67 | This function takes a list of lines as input and removes multiline comments 68 | that start with '/*' and end with '*/'. The function returns the cleaned 69 | content without the multiline comments. 
70 | """ 71 | 72 | inside_comment = False 73 | cleaned_lines = [] 74 | 75 | for line in lines: 76 | if not inside_comment: 77 | start_index = line.find('/*') 78 | end_index = line.find('*/', start_index + 2) 79 | 80 | if start_index != -1 and end_index != -1: 81 | inside_comment = False 82 | cleaned_line = line[:start_index] + line[end_index + 2:] 83 | cleaned_lines.append(cleaned_line) 84 | elif start_index != -1: 85 | inside_comment = True 86 | cleaned_line = line[:start_index] 87 | cleaned_lines.append(cleaned_line) 88 | else: 89 | cleaned_lines.append(line) 90 | else: 91 | end_index = line.find('*/') 92 | if end_index != -1: 93 | inside_comment = False 94 | cleaned_line = line[end_index + 2:] 95 | cleaned_lines.append(cleaned_line) 96 | 97 | cleaned_content = ''.join(cleaned_lines) 98 | return cleaned_content 99 | 100 | def reformat(): 101 | rearranged_data = {"data": []} 102 | with open("./tools/output.json", mode='r', encoding='utf-8') as file: 103 | contents = json.load(file) 104 | for content in contents: 105 | cleaned_content = remove_multiline_comments(re.split(r'(\n)', content['bpf'])) 106 | example = f"example: {content['request']}\n\n```\n{cleaned_content}\n```\n" 107 | info = {"content": example} 108 | rearranged_data["data"].append(info) 109 | 110 | with open("./tools/examples.json", 'w', encoding='utf-8') as file: 111 | json.dump(rearranged_data, file) 112 | 113 | if __name__ == "__main__": 114 | reformat() 115 | -------------------------------------------------------------------------------- /tools/gethostlatency.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * gethostlatency Trace getaddrinfo/gethostbyname[2] calls. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * This can be useful for identifying DNS latency, by identifying which 7 | * remote host name lookups were slow, and by how much. 
8 | * 9 | * This uses dynamic tracing of user-level functions and registers, and may 10 | * need modifications to match your software and processor architecture. 11 | * 12 | * USAGE: gethostlatency.bt 13 | * 14 | * This is a bpftrace version of the bcc tool of the same name. 15 | * 16 | * Copyright 2018 Netflix, Inc. 17 | * Licensed under the Apache License, Version 2.0 (the "License") 18 | * 19 | * 08-Sep-2018 Brendan Gregg Created this. 20 | */ 21 | 22 | BEGIN 23 | { 24 | printf("Tracing getaddr/gethost calls... Hit Ctrl-C to end.\n"); 25 | printf("%-9s %-6s %-16s %6s %s\n", "TIME", "PID", "COMM", "LATms", 26 | "HOST"); 27 | } 28 | 29 | uprobe:libc:getaddrinfo, 30 | uprobe:libc:gethostbyname, 31 | uprobe:libc:gethostbyname2 32 | { 33 | @start[tid] = nsecs; 34 | @name[tid] = arg0; 35 | } 36 | 37 | uretprobe:libc:getaddrinfo, 38 | uretprobe:libc:gethostbyname, 39 | uretprobe:libc:gethostbyname2 40 | /@start[tid]/ 41 | { 42 | $latms = (nsecs - @start[tid]) / 1e6; 43 | time("%H:%M:%S "); 44 | printf("%-6d %-16s %6d %s\n", pid, comm, $latms, str(@name[tid])); 45 | delete(@start[tid]); 46 | delete(@name[tid]); 47 | } 48 | -------------------------------------------------------------------------------- /tools/gethostlatency_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of gethostlatency, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This traces host name lookup calls (getaddrinfo(), gethostbyname(), and 5 | gethostbyname2()), and shows the PID and command performing the lookup, the 6 | latency (duration) of the call in milliseconds, and the host string: 7 | 8 | # ./gethostlatency.bt 9 | Attaching 7 probes... 10 | Tracing getaddr/gethost calls... Hit Ctrl-C to end. 
11 | TIME PID COMM LATms HOST 12 | 02:52:05 19105 curl 81 www.netflix.com 13 | 02:52:12 19111 curl 17 www.netflix.com 14 | 02:52:19 19116 curl 9 www.facebook.com 15 | 02:52:23 19118 curl 3 www.facebook.com 16 | 17 | In this example, the first call to lookup "www.netflix.com" took 81 ms, and 18 | the second took 17 ms (sounds like some caching). 19 | 20 | 21 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 22 | The bcc version provides options to customize the output. 23 | -------------------------------------------------------------------------------- /tools/killsnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * killsnoop Trace signals issued by the kill() syscall. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: killsnoop.bt 7 | * 8 | * Also a basic example of bpftrace. 9 | * 10 | * This is a bpftrace version of the bcc tool of the same name. 11 | * 12 | * Copyright 2018 Netflix, Inc. 13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 07-Sep-2018 Brendan Gregg Created this. 16 | */ 17 | 18 | BEGIN 19 | { 20 | printf("Tracing kill() signals... Hit Ctrl-C to end.\n"); 21 | printf("%-9s %-6s %-16s %-4s %-6s %s\n", "TIME", "PID", "COMM", "SIG", 22 | "TPID", "RESULT"); 23 | } 24 | 25 | tracepoint:syscalls:sys_enter_kill 26 | { 27 | @tpid[tid] = args.pid; 28 | @tsig[tid] = args.sig; 29 | } 30 | 31 | tracepoint:syscalls:sys_exit_kill 32 | /@tpid[tid]/ 33 | { 34 | time("%H:%M:%S "); 35 | printf("%-6d %-16s %-4d %-6d %d\n", pid, comm, @tsig[tid], @tpid[tid], 36 | args.ret); 37 | delete(@tpid[tid]); 38 | delete(@tsig[tid]); 39 | } 40 | -------------------------------------------------------------------------------- /tools/killsnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of killsnoop, the Linux bpftrace/eBPF version. 
2 | 3 | 4 | 5 | This traces signals sent via the kill() syscall. For example: 6 | 7 | # ./killsnoop.bt 8 | Attaching 3 probes... 9 | Tracing kill() signals... Hit Ctrl-C to end. 10 | TIME PID COMM SIG TPID RESULT 11 | 00:09:37 22485 bash 2 23856 0 12 | 00:09:40 22485 bash 2 23856 -3 13 | 00:09:31 22485 bash 15 23814 -3 14 | 15 | The first line showed a SIGINT (2) sent from PID 22485 (a bash shell) to 16 | PID 23856. The result, 0, means success. The next line shows the same signal 17 | sent, which resulted in -3, a failure (likely because the target process 18 | no longer existed). 19 | 20 | 21 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 22 | The bcc version provides command line options to customize the output. 23 | -------------------------------------------------------------------------------- /tools/loads.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * loads Prints load averages. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * These are the same load averages printed by "uptime", but to three decimal 7 | * places instead of two (not that it really matters). This is really a 8 | * demonstration of fetching and processing a kernel structure from bpftrace. 9 | * 10 | * USAGE: loads.bt 11 | * 12 | * This is a bpftrace version of a DTraceToolkit tool. 13 | * 14 | * Copyright 2018 Netflix, Inc. 15 | * Licensed under the Apache License, Version 2.0 (the "License") 16 | * 17 | * 10-Sep-2018 Brendan Gregg Created this. 18 | */ 19 | 20 | BEGIN 21 | { 22 | printf("Reading load averages... Hit Ctrl-C to end.\n"); 23 | } 24 | 25 | interval:s:1 26 | { 27 | /* 28 | * See fs/proc/loadavg.c and include/linux/sched/loadavg.h for the 29 | * following calculations. 
30 | */ 31 | $avenrun = kaddr("avenrun"); 32 | $load1 = *$avenrun; 33 | $load5 = *($avenrun + 8); 34 | $load15 = *($avenrun + 16); 35 | time("%H:%M:%S "); 36 | printf("load averages: %d.%03d %d.%03d %d.%03d\n", 37 | ($load1 >> 11), (($load1 & ((1 << 11) - 1)) * 1000) >> 11, 38 | ($load5 >> 11), (($load5 & ((1 << 11) - 1)) * 1000) >> 11, 39 | ($load15 >> 11), (($load15 & ((1 << 11) - 1)) * 1000) >> 11 40 | ); 41 | } 42 | -------------------------------------------------------------------------------- /tools/loads_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of loads, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This is a simple tool that prints the system load averages, to three decimal 5 | places each (not that it really matters), as a demonstration of fetching 6 | kernel structures from bpftrace: 7 | 8 | # ./loads.bt 9 | Attaching 2 probes... 10 | Reading load averages... Hit Ctrl-C to end. 11 | 21:29:17 load averages: 2.091 2.048 1.947 12 | 21:29:18 load averages: 2.091 2.048 1.947 13 | 21:29:19 load averages: 2.091 2.048 1.947 14 | 21:29:20 load averages: 2.091 2.048 1.947 15 | 21:29:21 load averages: 2.164 2.064 1.953 16 | 21:29:22 load averages: 2.164 2.064 1.953 17 | 21:29:23 load averages: 2.164 2.064 1.953 18 | ^C 19 | 20 | These are the same load averages printed by uptime: 21 | 22 | # uptime 23 | 21:29:24 up 2 days, 18:57, 3 users, load average: 2.16, 2.06, 1.95 24 | 25 | 26 | For more on load averages, see my post: 27 | http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html 28 | -------------------------------------------------------------------------------- /tools/mdflush.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * mdflush Trace md flush events. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: mdflush.bt 7 | * 8 | * This is a bpftrace version of the bcc tool of the same name. 
9 | * 10 | * For Linux 5.12+ (see tools/old for script for lower versions). 11 | * 12 | * Copyright 2018 Netflix, Inc. 13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 08-Sep-2018 Brendan Gregg Created this. 16 | */ 17 | 18 | #ifndef BPFTRACE_HAVE_BTF 19 | #include 20 | #include 21 | #endif 22 | 23 | BEGIN 24 | { 25 | printf("Tracing md flush events... Hit Ctrl-C to end.\n"); 26 | printf("%-8s %-6s %-16s %s\n", "TIME", "PID", "COMM", "DEVICE"); 27 | } 28 | 29 | kprobe:md_flush_request 30 | { 31 | time("%H:%M:%S "); 32 | printf("%-6d %-16s %s\n", pid, comm, 33 | ((struct bio *)arg1)->bi_bdev->bd_disk->disk_name); 34 | } 35 | -------------------------------------------------------------------------------- /tools/mdflush_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of mdflush, the Linux bpftrace/eBPF version. 2 | 3 | 4 | The mdflush tool traces flushes at the md driver level, and prints details 5 | including the time of the flush: 6 | 7 | # ./mdflush.bt 8 | Tracing md flush requests... Hit Ctrl-C to end. 
9 | TIME PID COMM DEVICE 10 | 03:13:49 16770 sync md0 11 | 03:14:08 16864 sync md0 12 | 03:14:49 496 kworker/1:0H md0 13 | 03:14:49 488 xfsaild/md0 md0 14 | 03:14:54 488 xfsaild/md0 md0 15 | 03:15:00 488 xfsaild/md0 md0 16 | 03:15:02 85 kswapd0 md0 17 | 03:15:02 488 xfsaild/md0 md0 18 | 03:15:05 488 xfsaild/md0 md0 19 | 03:15:08 488 xfsaild/md0 md0 20 | 03:15:10 488 xfsaild/md0 md0 21 | 03:15:11 488 xfsaild/md0 md0 22 | 03:15:11 488 xfsaild/md0 md0 23 | 03:15:11 488 xfsaild/md0 md0 24 | 03:15:11 488 xfsaild/md0 md0 25 | 03:15:11 488 xfsaild/md0 md0 26 | 03:15:12 488 xfsaild/md0 md0 27 | 03:15:13 488 xfsaild/md0 md0 28 | 03:15:15 488 xfsaild/md0 md0 29 | 03:15:19 496 kworker/1:0H md0 30 | 03:15:49 496 kworker/1:0H md0 31 | 03:15:55 18840 sync md0 32 | 03:16:49 496 kworker/1:0H md0 33 | 03:17:19 496 kworker/1:0H md0 34 | 03:20:19 496 kworker/1:0H md0 35 | 03:21:19 496 kworker/1:0H md0 36 | 03:21:49 496 kworker/1:0H md0 37 | 03:25:19 496 kworker/1:0H md0 38 | [...] 39 | 40 | This can be useful for correlation with latency outliers or spikes in disk 41 | latency, as measured using another tool (eg, system monitoring). If spikes in 42 | disk latency often coincide with md flush events, then it would make flushing 43 | a target for tuning. 44 | 45 | Note that the flush events are likely to originate from higher in the I/O 46 | stack, such as from file systems. This traces md processing them, and the 47 | timestamp corresponds with when md began to issue the flush to disks. 48 | 49 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 50 | -------------------------------------------------------------------------------- /tools/naptime.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * naptime - Show voluntary sleep calls. 4 | * 5 | * See BPF Performance Tools, Chapter 13, for an explanation of this tool. 6 | * 7 | * Copyright (c) 2019 Brendan Gregg. 
8 | * Licensed under the Apache License, Version 2.0 (the "License"). 9 | * This was originally created for the BPF Performance Tools book 10 | * published by Addison Wesley. ISBN-13: 9780136554820 11 | * When copying or porting, include this comment. 12 | * 13 | * 16-Feb-2019 Brendan Gregg Created this. 14 | */ 15 | 16 | #ifndef BPFTRACE_HAVE_BTF 17 | #include 18 | #include 19 | #endif 20 | 21 | BEGIN 22 | { 23 | printf("Tracing sleeps. Hit Ctrl-C to end.\n"); 24 | printf("%-8s %-6s %-16s %-6s %-16s %s\n", "TIME", "PPID", "PCOMM", 25 | "PID", "COMM", "SECONDS"); 26 | } 27 | 28 | tracepoint:syscalls:sys_enter_nanosleep 29 | /args.rqtp->tv_sec + args.rqtp->tv_nsec/ 30 | { 31 | $task = (struct task_struct *)curtask; 32 | time("%H:%M:%S "); 33 | printf("%-6d %-16s %-6d %-16s %d.%03d\n", $task->real_parent->pid, 34 | $task->real_parent->comm, pid, comm, 35 | args.rqtp->tv_sec, (uint64)args.rqtp->tv_nsec / 1e6); 36 | } 37 | -------------------------------------------------------------------------------- /tools/naptime_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of naptime, the Linux bpftrace/eBPF version. 2 | 3 | 4 | Tracing application sleeps via the nanosleep(2) syscall: 5 | 6 | # ./naptime.bt 7 | Attaching 2 probes... 8 | Tracing sleeps. Hit Ctrl-C to end. 9 | TIME PPID PCOMM PID COMM SECONDS 10 | 15:50:00 1 systemd 1319 mysqld 1.000 11 | 15:50:01 4388 bash 25250 sleep 5.000 12 | 15:50:01 1 systemd 1319 mysqld 1.000 13 | 15:50:01 1 systemd 1180 cron 60.000 14 | 15:50:01 1 systemd 1180 cron 60.000 15 | 15:50:02 1 systemd 1319 mysqld 1.000 16 | [...] 17 | 18 | The output shows mysqld performing a one second sleep every second (likely 19 | a daemon thread), a sleep(1) command sleeping for five seconds and called 20 | by bash, and cron threads sleeping for 60 seconds. 
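The SECONDS column above is produced by the script's "%d.%03d" format applied to the tv_sec and tv_nsec fields of the nanosleep request, after a filter that skips zero-length requests. A minimal Python sketch of that arithmetic follows; the function names are illustrative assumptions and are not part of naptime.bt.

```python
def is_traced(tv_sec, tv_nsec):
    """Mirror the script's filter: skip requests whose fields sum to zero."""
    return (tv_sec + tv_nsec) != 0


def format_seconds(tv_sec, tv_nsec):
    """Render a sleep request as seconds with millisecond precision."""
    millis = tv_nsec // 1_000_000  # integer nanoseconds -> whole milliseconds
    return f"{tv_sec}.{millis:03d}"


if __name__ == "__main__":
    # A five second sleep, as issued by the sleep command in the output above.
    print(format_seconds(5, 0))
```

Because the nanosecond field is truncated to whole milliseconds, a request shorter than one millisecond passes the filter but prints as 0.000.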
21 | -------------------------------------------------------------------------------- /tools/oomkill.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * oomkill Trace OOM killer. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * This traces the kernel out-of-memory killer, and prints basic details, 7 | * including the system load averages. This can provide more context on the 8 | * system state at the time of OOM: was it getting busier or steady, based 9 | * on the load averages? This tool may also be useful to customize for 10 | * investigations; for example, by adding other task_struct details at the 11 | * time of the OOM, or other commands in the system() call. 12 | * 13 | * This currently works by using kernel dynamic tracing of oom_kill_process(). 14 | * 15 | * USAGE: oomkill.bt 16 | * 17 | * Copyright 2018 Netflix, Inc. 18 | * Licensed under the Apache License, Version 2.0 (the "License") 19 | * 20 | * 07-Sep-2018 Brendan Gregg Created this. 21 | */ 22 | 23 | #ifndef BPFTRACE_HAVE_BTF 24 | #include 25 | #endif 26 | 27 | BEGIN 28 | { 29 | printf("Tracing oom_kill_process()... Hit Ctrl-C to end.\n"); 30 | } 31 | 32 | kprobe:oom_kill_process 33 | { 34 | $oc = (struct oom_control *)arg0; 35 | time("%H:%M:%S "); 36 | printf("Triggered by PID %d (\"%s\"), ", pid, comm); 37 | printf("OOM kill of PID %d (\"%s\"), %d pages, loadavg: ", 38 | $oc->chosen->pid, $oc->chosen->comm, $oc->totalpages); 39 | cat("/proc/loadavg"); 40 | } 41 | -------------------------------------------------------------------------------- /tools/oomkill_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of oomkill, the Linux bpftrace/eBPF version. 2 | 3 | 4 | oomkill is a simple program that traces the Linux out-of-memory (OOM) killer, 5 | and shows basic details on one line per OOM kill: 6 | 7 | # ./oomkill.bt 8 | Tracing oom_kill_process()... 
Ctrl-C to end. 9 | 21:03:39 Triggered by PID 3297 ("ntpd"), OOM kill of PID 22516 ("perl"), 3850642 pages, loadavg: 0.99 0.39 0.30 3/282 22724 10 | 21:03:48 Triggered by PID 22517 ("perl"), OOM kill of PID 22517 ("perl"), 3850642 pages, loadavg: 0.99 0.41 0.30 2/282 22932 11 | 12 | The first line shows that PID 22516, with process name "perl", was OOM killed 13 | when it reached 3850642 pages (usually 4 Kbytes per page). This OOM kill 14 | happened to be triggered by PID 3297, process name "ntpd", doing some memory 15 | allocation. 16 | 17 | The system log (dmesg) shows pages of details and system context about an OOM 18 | kill. What it currently lacks, however, is context on how the system had been 19 | changing over time. I've seen OOM kills where I wanted to know if the system 20 | was at steady state at the time, or if there had been a recent increase in 21 | workload that triggered the OOM event. oomkill provides some context: at the 22 | end of the line is the load average information from /proc/loadavg. For both 23 | of the oomkills here, we can see that the system was getting busier at the 24 | time (a higher 1 minute "average" of 0.99, compared to the 15 minute "average" 25 | of 0.30). 26 | 27 | oomkill can also be the basis of other tools and customizations. For example, 28 | you can edit it to include other task_struct details from the target PID at 29 | the time of the OOM kill, or to run other commands from the shell. 30 | 31 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 32 | -------------------------------------------------------------------------------- /tools/opensnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * opensnoop Trace open() syscalls. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * Also a basic example of bpftrace. 7 | * 8 | * USAGE: opensnoop.bt 9 | * 10 | * This is a bpftrace version of the bcc tool of the same name. 
11 | * 12 | * Copyright 2018 Netflix, Inc. 13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 08-Sep-2018 Brendan Gregg Created this. 16 | */ 17 | 18 | BEGIN 19 | { 20 | printf("Tracing open syscalls... Hit Ctrl-C to end.\n"); 21 | printf("%-6s %-16s %4s %3s %s\n", "PID", "COMM", "FD", "ERR", "PATH"); 22 | } 23 | 24 | tracepoint:syscalls:sys_enter_open, 25 | tracepoint:syscalls:sys_enter_openat 26 | { 27 | @filename[tid] = args.filename; 28 | } 29 | 30 | tracepoint:syscalls:sys_exit_open, 31 | tracepoint:syscalls:sys_exit_openat 32 | /@filename[tid]/ 33 | { 34 | $ret = args.ret; 35 | $fd = $ret >= 0 ? $ret : -1; 36 | $errno = $ret >= 0 ? 0 : - $ret; 37 | 38 | printf("%-6d %-16s %4d %3d %s\n", pid, comm, $fd, $errno, 39 | str(@filename[tid])); 40 | delete(@filename[tid]); 41 | } 42 | 43 | END 44 | { 45 | clear(@filename); 46 | } 47 | -------------------------------------------------------------------------------- /tools/opensnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of opensnoop, the Linux bpftrace/eBPF version. 2 | 3 | 4 | opensnoop traces the open() syscall system-wide, and prints various details. 5 | Example output: 6 | 7 | # ./opensnoop.bt 8 | Attaching 3 probes... 9 | Tracing open syscalls... Hit Ctrl-C to end. 10 | PID COMM FD ERR PATH 11 | 2440 snmp-pass 4 0 /proc/cpuinfo 12 | 2440 snmp-pass 4 0 /proc/stat 13 | 25706 ls 3 0 /etc/ld.so.cache 14 | 25706 ls 3 0 /lib/x86_64-linux-gnu/libselinux.so.1 15 | 25706 ls 3 0 /lib/x86_64-linux-gnu/libc.so.6 16 | 25706 ls 3 0 /lib/x86_64-linux-gnu/libpcre.so.3 17 | 25706 ls 3 0 /lib/x86_64-linux-gnu/libdl.so.2 18 | 25706 ls 3 0 /lib/x86_64-linux-gnu/libpthread.so.0 19 | 25706 ls 3 0 /proc/filesystems 20 | 25706 ls 3 0 /usr/lib/locale/locale-archive 21 | 25706 ls 3 0 . 
22 | 1744 snmpd 8 0 /proc/net/dev 23 | 1744 snmpd 21 0 /proc/net/if_inet6 24 | 1744 snmpd 21 0 /sys/class/net/eth0/device/vendor 25 | 1744 snmpd 21 0 /sys/class/net/eth0/device/device 26 | 1744 snmpd 21 0 /proc/sys/net/ipv4/neigh/eth0/retrans_time_ms 27 | 1744 snmpd 21 0 /proc/sys/net/ipv6/neigh/eth0/retrans_time_ms 28 | 1744 snmpd 21 0 /proc/sys/net/ipv6/conf/eth0/forwarding 29 | 1744 snmpd 21 0 /proc/sys/net/ipv6/neigh/eth0/base_reachable_time_ms 30 | 1744 snmpd -1 2 /sys/class/net/lo/device/vendor 31 | 1744 snmpd 21 0 /proc/sys/net/ipv4/neigh/lo/retrans_time_ms 32 | 1744 snmpd 21 0 /proc/sys/net/ipv6/neigh/lo/retrans_time_ms 33 | 1744 snmpd 21 0 /proc/sys/net/ipv6/conf/lo/forwarding 34 | 1744 snmpd 21 0 /proc/sys/net/ipv6/neigh/lo/base_reachable_time_ms 35 | 2440 snmp-pass 4 0 /proc/cpuinfo 36 | 2440 snmp-pass 4 0 /proc/stat 37 | 22884 pickup 12 0 maildrop 38 | 2440 snmp-pass 4 0 /proc/cpuinfo 39 | 2440 snmp-pass 4 0 /proc/stat 40 | 41 | While tracing, an "ls" command was launched: the libraries it uses can be seen 42 | as they were opened. Also, the snmpd process opened various /proc and /sys 43 | files (reading metrics). 44 | 45 | opensnoop can be useful for discovering configuration and log files, if used 46 | during application startup. 47 | 48 | 49 | 50 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 51 | The bcc version provides command line options to customize the output. 52 | -------------------------------------------------------------------------------- /tools/pidpersec.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * pidpersec Count new processes (via fork). 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * Written as a basic example of counting on an event. 7 | * 8 | * USAGE: pidpersec.bt 9 | * 10 | * This is a bpftrace version of the bcc tool of the same name. 11 | * 12 | * Copyright 2018 Netflix, Inc. 
13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 06-Sep-2018 Brendan Gregg Created this. 16 | */ 17 | 18 | BEGIN 19 | { 20 | printf("Tracing new processes... Hit Ctrl-C to end.\n"); 21 | 22 | } 23 | 24 | tracepoint:sched:sched_process_fork 25 | { 26 | @ = count(); 27 | } 28 | 29 | interval:s:1 30 | { 31 | time("%H:%M:%S PIDs/sec: "); 32 | print(@); 33 | clear(@); 34 | } 35 | 36 | END 37 | { 38 | clear(@); 39 | } 40 | -------------------------------------------------------------------------------- /tools/pidpersec_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of pidpersec, the Linux bpftrace/eBPF version. 2 | 3 | 4 | Tracing new processes: 5 | 6 | # ./pidpersec.bt 7 | Attaching 4 probes... 8 | Tracing new processes... Hit Ctrl-C to end. 9 | 22:29:50 PIDs/sec: @: 121 10 | 22:29:51 PIDs/sec: @: 120 11 | 22:29:52 PIDs/sec: @: 122 12 | 22:29:53 PIDs/sec: @: 124 13 | 22:29:54 PIDs/sec: @: 123 14 | 22:29:55 PIDs/sec: @: 121 15 | 22:29:56 PIDs/sec: @: 121 16 | 22:29:57 PIDs/sec: @: 121 17 | 22:29:58 PIDs/sec: @: 49 18 | 22:29:59 PIDs/sec: 19 | 22:30:00 PIDs/sec: 20 | 22:30:01 PIDs/sec: 21 | 22:30:02 PIDs/sec: 22 | ^C 23 | 24 | The output begins by showing a rate of new processes over 120 per second. 25 | That then ends at time 22:29:59, and for the next few seconds there are zero 26 | new processes per second. 27 | 28 | 29 | The following example shows a Linux build launched at 6:33:40, on a 36 CPU 30 | server, with make -j36: 31 | 32 | # ./pidpersec.bt 33 | Attaching 4 probes... 34 | Tracing new processes... Hit Ctrl-C to end. 
35 | 06:33:38 PIDs/sec: 36 | 06:33:39 PIDs/sec: 37 | 06:33:40 PIDs/sec: @: 2314 38 | 06:33:41 PIDs/sec: @: 2517 39 | 06:33:42 PIDs/sec: @: 1345 40 | 06:33:43 PIDs/sec: @: 1752 41 | 06:33:44 PIDs/sec: @: 1744 42 | 06:33:45 PIDs/sec: @: 1549 43 | 06:33:46 PIDs/sec: @: 1643 44 | 06:33:47 PIDs/sec: @: 1487 45 | 06:33:48 PIDs/sec: @: 1534 46 | 06:33:49 PIDs/sec: @: 1279 47 | 06:33:50 PIDs/sec: @: 1392 48 | 06:33:51 PIDs/sec: @: 1556 49 | 06:33:52 PIDs/sec: @: 1580 50 | 06:33:53 PIDs/sec: @: 1944 51 | 52 | A Linux kernel build involves launching many thousands of short-lived processes, 53 | which can be seen in the above output: a rate of over 1,000 processes per 54 | second. 55 | 56 | 57 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 58 | -------------------------------------------------------------------------------- /tools/runqlat.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * runqlat.bt CPU scheduler run queue latency as a histogram. 4 | * For Linux, uses bpftrace, eBPF. 5 | * 6 | * This is a bpftrace version of the bcc tool of the same name. 7 | * 8 | * Copyright 2018 Netflix, Inc. 9 | * Licensed under the Apache License, Version 2.0 (the "License") 10 | * 11 | * 17-Sep-2018 Brendan Gregg Created this. 12 | */ 13 | 14 | #include 15 | 16 | BEGIN 17 | { 18 | printf("Tracing CPU scheduler... 
Hit Ctrl-C to end.\n"); 19 | } 20 | 21 | tracepoint:sched:sched_wakeup, 22 | tracepoint:sched:sched_wakeup_new 23 | { 24 | @qtime[args.pid] = nsecs; 25 | } 26 | 27 | tracepoint:sched:sched_switch 28 | { 29 | if (args.prev_state == TASK_RUNNING) { 30 | @qtime[args.prev_pid] = nsecs; 31 | } 32 | 33 | $ns = @qtime[args.next_pid]; 34 | if ($ns) { 35 | @usecs = hist((nsecs - $ns) / 1000); 36 | } 37 | delete(@qtime[args.next_pid]); 38 | } 39 | 40 | END 41 | { 42 | clear(@qtime); 43 | } 44 | -------------------------------------------------------------------------------- /tools/runqlat_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of runqlat, the Linux BPF/bpftrace version. 2 | 3 | 4 | This traces time spent waiting in the CPU scheduler for a turn on-CPU. This 5 | metric is often called run queue latency, or scheduler latency. This tool shows 6 | this latency as a power-of-2 histogram in microseconds. For example: 7 | 8 | # ./runqlat.bt 9 | Attaching 5 probes... 10 | Tracing CPU scheduler... Hit Ctrl-C to end. 11 | ^C 12 | 13 | 14 | 15 | @usecs: 16 | [0] 1 | | 17 | [1] 11 |@@ | 18 | [2, 4) 16 |@@@ | 19 | [4, 8) 43 |@@@@@@@@@@ | 20 | [8, 16) 134 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 21 | [16, 32) 220 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 22 | [32, 64) 117 |@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 23 | [64, 128) 84 |@@@@@@@@@@@@@@@@@@@ | 24 | [128, 256) 10 |@@ | 25 | [256, 512) 2 | | 26 | [512, 1K) 5 |@ | 27 | [1K, 2K) 5 |@ | 28 | [2K, 4K) 5 |@ | 29 | [4K, 8K) 4 | | 30 | [8K, 16K) 1 | | 31 | [16K, 32K) 2 | | 32 | [32K, 64K) 0 | | 33 | [64K, 128K) 1 | | 34 | [128K, 256K) 0 | | 35 | [256K, 512K) 0 | | 36 | [512K, 1M) 1 | | 37 | 38 | This is an idle system where most of the time we are waiting for less than 39 | 128 microseconds, shown by the mode above. As an example of reading the output, 40 | the above histogram shows 220 scheduling events with a run queue latency of 41 | between 16 and 32 microseconds. 
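To make the bucket labels of @usecs concrete, here is a small Python sketch that maps one latency value to its power-of-2 range. It only reproduces the labelling visible in the output above; it is an assumption-laden illustration, not bpftrace's own hist() implementation, and the function name is hypothetical.

```python
def log2_bucket(usecs):
    """Return (low, high) for the power-of-2 bucket containing usecs.

    The range is half-open, [low, high), matching labels such as [16, 32).
    The single-value buckets [0] and [1] come back as (0, 1) and (1, 2).
    """
    if usecs < 0:
        raise ValueError("latency cannot be negative in this sketch")
    if usecs < 2:
        return (usecs, usecs + 1)
    low = 1 << (usecs.bit_length() - 1)  # largest power of 2 <= usecs
    return (low, low * 2)


if __name__ == "__main__":
    # A 20 microsecond wait falls in the [16, 32) row.
    print(log2_bucket(20))
```

Under this mapping, a wait of 700,000 microseconds lands in (524288, 1048576), which is the row printed as [512K, 1M).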
42 | 43 | The output also shows an outlier taking between 0.5 and 1 seconds: 44 | likely work was scheduled behind another higher priority task, and had to wait 45 | briefly. The kernel decides whether it is worth migrating such work to an 46 | idle CPU, or leaving it to wait its turn on its current CPU run queue where 47 | the CPU caches should be hotter. 48 | 49 | 50 | I'll now add a single-threaded CPU bound workload to this system, and bind 51 | it on one CPU: 52 | 53 | # ./runqlat.bt 54 | Attaching 5 probes... 55 | Tracing CPU scheduler... Hit Ctrl-C to end. 56 | ^C 57 | 58 | 59 | 60 | @usecs: 61 | [1] 6 |@@@ | 62 | [2, 4) 26 |@@@@@@@@@@@@@ | 63 | [4, 8) 97 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 64 | [8, 16) 72 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 65 | [16, 32) 17 |@@@@@@@@@ | 66 | [32, 64) 19 |@@@@@@@@@@ | 67 | [64, 128) 20 |@@@@@@@@@@ | 68 | [128, 256) 3 |@ | 69 | [256, 512) 0 | | 70 | [512, 1K) 0 | | 71 | [1K, 2K) 1 | | 72 | [2K, 4K) 1 | | 73 | [4K, 8K) 4 |@@ | 74 | [8K, 16K) 3 |@ | 75 | [16K, 32K) 0 | | 76 | [32K, 64K) 0 | | 77 | [64K, 128K) 0 | | 78 | [128K, 256K) 1 | | 79 | [256K, 512K) 0 | | 80 | [512K, 1M) 0 | | 81 | [1M, 2M) 1 | | 82 | 83 | That didn't make much difference. 84 | 85 | 86 | Now I'll add a second single-threaded CPU workload, and bind it to the same 87 | CPU, causing contention: 88 | 89 | # ./runqlat.bt 90 | Attaching 5 probes... 91 | Tracing CPU scheduler... Hit Ctrl-C to end. 
92 | ^C 93 | 94 | 95 | 96 | @usecs: 97 | [0] 1 | | 98 | [1] 8 |@@@ | 99 | [2, 4) 28 |@@@@@@@@@@@@ | 100 | [4, 8) 95 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 101 | [8, 16) 120 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 102 | [16, 32) 22 |@@@@@@@@@ | 103 | [32, 64) 10 |@@@@ | 104 | [64, 128) 7 |@@@ | 105 | [128, 256) 3 |@ | 106 | [256, 512) 1 | | 107 | [512, 1K) 0 | | 108 | [1K, 2K) 0 | | 109 | [2K, 4K) 2 | | 110 | [4K, 8K) 4 |@ | 111 | [8K, 16K) 107 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | 112 | [16K, 32K) 0 | | 113 | [32K, 64K) 0 | | 114 | [64K, 128K) 0 | | 115 | [128K, 256K) 0 | | 116 | [256K, 512K) 1 | | 117 | 118 | There's now a second mode between 8 and 16 milliseconds, as each thread must 119 | wait its turn on the one CPU. 120 | 121 | 122 | Now I'll run 10 CPU-bound threads on one CPU: 123 | 124 | # ./runqlat.bt 125 | Attaching 5 probes... 126 | Tracing CPU scheduler... Hit Ctrl-C to end. 127 | ^C 128 | 129 | 130 | 131 | @usecs: 132 | [0] 2 | | 133 | [1] 10 |@ | 134 | [2, 4) 38 |@@@@ | 135 | [4, 8) 63 |@@@@@@ | 136 | [8, 16) 106 |@@@@@@@@@@@ | 137 | [16, 32) 28 |@@@ | 138 | [32, 64) 13 |@ | 139 | [64, 128) 15 |@ | 140 | [128, 256) 2 | | 141 | [256, 512) 2 | | 142 | [512, 1K) 1 | | 143 | [1K, 2K) 1 | | 144 | [2K, 4K) 2 | | 145 | [4K, 8K) 4 | | 146 | [8K, 16K) 3 | | 147 | [16K, 32K) 0 | | 148 | [32K, 64K) 478 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 149 | [64K, 128K) 1 | | 150 | [128K, 256K) 0 | | 151 | [256K, 512K) 0 | | 152 | [512K, 1M) 0 | | 153 | [1M, 2M) 1 | | 154 | 155 | This shows that most of the time threads need to wait their turn, with the 156 | largest mode between 32 and 64 milliseconds. 157 | 158 | 159 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 160 | The bcc version provides options to customize the output. 
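The histograms above can also be read programmatically. What follows is a minimal Python sketch, not part of runqlat or bpftrace: the names `parse_hist` and `local_modes`, the `min_count` threshold, and the abbreviated `SAMPLE` text are all hypothetical. It assumes the K and M suffixes in bucket bounds are powers of two (K = 1024), so the `[8K, 16K)` bucket of `@usecs` covers roughly 8.2 to 16.4 milliseconds.

```python
import re

# One row of a bpftrace hist() printout, e.g. "[8K, 16K) 107 |@@@ |".
# Single-value rows such as "[1] 11 |@@ |" are accepted as well.
ROW = re.compile(r"\[(\d+[KMG]?)(?:,\s*(\d+[KMG]?)\))?\]?\s+(\d+)\s*\|")

# Assumption: bucket suffixes are powers of two.
SUFFIX = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_int(token):
    """Convert a bucket bound such as '16K' to an integer."""
    if token[-1] in SUFFIX:
        return int(token[:-1]) * SUFFIX[token[-1]]
    return int(token)


def parse_hist(text):
    """Return (low, high, count) tuples; high is exclusive."""
    buckets = []
    for line in text.splitlines():
        m = ROW.search(line)
        if not m:
            continue
        low = to_int(m.group(1))
        high = to_int(m.group(2)) if m.group(2) else low + 1
        buckets.append((low, high, int(m.group(3))))
    return buckets


def local_modes(buckets, min_count=50):
    """Buckets that are local maxima and hold at least min_count events."""
    modes = []
    for i, (low, high, count) in enumerate(buckets):
        left = buckets[i - 1][2] if i > 0 else 0
        right = buckets[i + 1][2] if i + 1 < len(buckets) else 0
        if count >= min_count and count >= left and count >= right:
            modes.append((low, high, count))
    return modes


# Abbreviated rows from the two-thread contention run above.
SAMPLE = """
[4, 8) 95 |@@@@|
[8, 16) 120 |@@@@@|
[16, 32) 22 |@|
[4K, 8K) 4 |@|
[8K, 16K) 107 |@@@@|
[16K, 32K) 0 | |
"""

if __name__ == "__main__":
    for low, high, count in local_modes(parse_hist(SAMPLE)):
        print(f"mode: {low / 1000:.1f} to {high / 1000:.1f} ms, {count} events")
```

On the abbreviated sample this reports two modes, matching the description of the contention run: one in the tens of microseconds and one between 8 and 16 milliseconds.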
161 | -------------------------------------------------------------------------------- /tools/runqlen.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * runqlen.bt CPU scheduler run queue length as a histogram. 4 | * For Linux, uses bpftrace, eBPF. 5 | * 6 | * This is a bpftrace version of the bcc tool of the same name. 7 | * 8 | * Copyright 2018 Netflix, Inc. 9 | * Licensed under the Apache License, Version 2.0 (the "License") 10 | * 11 | * 07-Oct-2018 Brendan Gregg Created this. 12 | */ 13 | 14 | #ifndef BPFTRACE_HAVE_BTF 15 | #include 16 | 17 | // Until BTF is available, we'll need to declare some of this struct manually, 18 | // since it isn't available to be #included. This will need maintenance to match 19 | // your kernel version. It is from kernel/sched/sched.h: 20 | struct cfs_rq { 21 | struct load_weight load; 22 | unsigned long runnable_weight; 23 | unsigned int nr_running; 24 | unsigned int h_nr_running; 25 | }; 26 | #endif 27 | 28 | BEGIN 29 | { 30 | printf("Sampling run queue length at 99 Hertz... Hit Ctrl-C to end.\n"); 31 | } 32 | 33 | profile:hz:99 34 | { 35 | $task = (struct task_struct *)curtask; 36 | $my_q = (struct cfs_rq *)$task->se.cfs_rq; 37 | $len = $my_q->nr_running; 38 | $len = $len > 0 ? $len - 1 : 0; // subtract currently running task 39 | @runqlen = lhist($len, 0, 100, 1); 40 | } 41 | -------------------------------------------------------------------------------- /tools/runqlen_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of runqlen, the Linux BPF/bpftrace version. 2 | 3 | 4 | This tool samples the length of the CPU scheduler run queues, showing these 5 | sampled lengths as a histogram. This can be used to characterize demand for 6 | CPU resources. For example: 7 | 8 | # ./runqlen.bt 9 | Attaching 2 probes... 10 | Sampling run queue length at 99 Hertz... Hit Ctrl-C to end. 
11 | ^C 12 | 13 | @runqlen: 14 | [0, 1) 1967 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 15 | [1, 2) 0 | | 16 | [2, 3) 0 | | 17 | [3, 4) 306 |@@@@@@@@ | 18 | 19 | This output shows that the run queue length was usually zero, except for some 20 | samples where it was 3. This was caused by binding 4 CPU bound threads to a 21 | single CPU. 22 | 23 | 24 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 25 | The bcc version provides options to customize the output. 26 | -------------------------------------------------------------------------------- /tools/setuids.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * setuids - Trace the setuid syscalls: privilege escalation. 4 | * 5 | * See BPF Performance Tools, Chapter 11, for an explanation of this tool. 6 | * 7 | * Copyright (c) 2019 Brendan Gregg. 8 | * Licensed under the Apache License, Version 2.0 (the "License"). 9 | * This was originally created for the BPF Performance Tools book 10 | * published by Addison Wesley. ISBN-13: 9780136554820 11 | * When copying or porting, include this comment. 12 | * 13 | * 26-Feb-2019 Brendan Gregg Created this. 14 | */ 15 | 16 | BEGIN 17 | { 18 | printf("Tracing setuid(2) family syscalls. 
Hit Ctrl-C to end.\n"); 19 | printf("%-8s %-6s %-16s %-6s %-9s %s\n", "TIME", 20 | "PID", "COMM", "UID", "SYSCALL", "ARGS (RET)"); 21 | } 22 | 23 | tracepoint:syscalls:sys_enter_setuid, 24 | tracepoint:syscalls:sys_enter_setfsuid 25 | { 26 | @uid[tid] = uid; 27 | @setuid[tid] = args.uid; 28 | @seen[tid] = 1; 29 | } 30 | 31 | tracepoint:syscalls:sys_enter_setresuid 32 | { 33 | @uid[tid] = uid; 34 | @ruid[tid] = args.ruid; 35 | @euid[tid] = args.euid; 36 | @suid[tid] = args.suid; 37 | @seen[tid] = 1; 38 | } 39 | 40 | tracepoint:syscalls:sys_exit_setuid 41 | /@seen[tid]/ 42 | { 43 | time("%H:%M:%S "); 44 | printf("%-6d %-16s %-6d setuid uid=%d (%d)\n", pid, comm, 45 | @uid[tid], @setuid[tid], args.ret); 46 | delete(@seen[tid]); delete(@uid[tid]); delete(@setuid[tid]); 47 | } 48 | 49 | tracepoint:syscalls:sys_exit_setfsuid 50 | /@seen[tid]/ 51 | { 52 | time("%H:%M:%S "); 53 | printf("%-6d %-16s %-6d setfsuid uid=%d (prevuid=%d)\n", pid, comm, 54 | @uid[tid], @setuid[tid], args.ret); 55 | delete(@seen[tid]); delete(@uid[tid]); delete(@setuid[tid]); 56 | } 57 | 58 | tracepoint:syscalls:sys_exit_setresuid 59 | /@seen[tid]/ 60 | { 61 | time("%H:%M:%S "); 62 | printf("%-6d %-16s %-6d setresuid ", pid, comm, @uid[tid]); 63 | printf("ruid=%d euid=%d suid=%d (%d)\n", @ruid[tid], @euid[tid], 64 | @suid[tid], args.ret); 65 | delete(@seen[tid]); delete(@uid[tid]); delete(@ruid[tid]); 66 | delete(@euid[tid]); delete(@suid[tid]); 67 | } 68 | -------------------------------------------------------------------------------- /tools/setuids_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of setuids, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This tool traces privilege escalation via setuid syscalls (setuid(2), 5 | setfsuid(2), setresuid(2)). For example, here are the setuid calls during an 6 | ssh login: 7 | 8 | # ./setuids.bt 9 | Attaching 7 probes... 10 | Tracing setuid(2) family syscalls. Hit Ctrl-C to end. 
11 | TIME PID COMM UID SYSCALL ARGS (RET) 12 | 14:28:22 21785 ssh 1000 setresuid ruid=-1 euid=1000 suid=-1 (0) 13 | 14:28:22 21787 sshd 0 setresuid ruid=122 euid=122 suid=122 (0) 14 | 14:28:22 21787 sshd 122 setuid uid=0 (-1) 15 | 14:28:22 21787 sshd 122 setresuid ruid=-1 euid=0 suid=-1 (-1) 16 | 14:28:24 21786 sshd 0 setresuid ruid=-1 euid=1000 suid=-1 (0) 17 | 14:28:24 21786 sshd 0 setresuid ruid=-1 euid=0 suid=-1 (0) 18 | 14:28:24 21786 sshd 0 setresuid ruid=-1 euid=1000 suid=-1 (0) 19 | 14:28:24 21786 sshd 0 setresuid ruid=-1 euid=0 suid=-1 (0) 20 | 14:28:24 21786 sshd 0 setfsuid uid=1000 (prevuid=0) 21 | 14:28:24 21786 sshd 0 setfsuid uid=1000 (prevuid=1000) 22 | 14:28:24 21786 sshd 0 setfsuid uid=0 (prevuid=1000) 23 | 14:28:24 21786 sshd 0 setfsuid uid=0 (prevuid=0) 24 | 14:28:24 21786 sshd 0 setfsuid uid=1000 (prevuid=0) 25 | 14:28:24 21786 sshd 0 setfsuid uid=1000 (prevuid=1000) 26 | 14:28:24 21786 sshd 0 setfsuid uid=0 (prevuid=1000) 27 | 14:28:24 21786 sshd 0 setfsuid uid=0 (prevuid=0) 28 | 14:28:24 21786 sshd 0 setfsuid uid=1000 (prevuid=0) 29 | 14:28:24 21786 sshd 0 setfsuid uid=1000 (prevuid=1000) 30 | 14:28:24 21786 sshd 0 setfsuid uid=0 (prevuid=1000) 31 | 14:28:24 21786 sshd 0 setfsuid uid=0 (prevuid=0) 32 | 14:28:24 21851 sshd 0 setresuid ruid=1000 euid=1000 suid=1000 (0) 33 | 14:28:24 21851 sshd 1000 setuid uid=0 (-1) 34 | 14:28:24 21851 sshd 1000 setresuid ruid=-1 euid=0 suid=-1 (-1) 35 | 36 | Why does sshd make so many calls? I don't know! Nevertheless, this shows what 37 | this tool can do: it shows the caller details (PID, COMM, and UID), the syscall 38 | (SYSCALL), and the syscall arguments (ARGS) and return value (RET). You can 39 | modify this tool to print user stack traces for each call, which will show the 40 | code path in sshd (provided it is compiled with frame pointers). 
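As a way of making the column layout concrete, the following Python sketch splits setuids output lines into the fields described above and lists the calls that returned a negative value. It is an illustration only, not part of the tool: `LINE`, `failed_calls`, and `SAMPLE` are hypothetical names, and the pattern assumes COMM contains no spaces. For setfsuid the parenthesised value is the previous UID rather than a return code, so those lines are never reported as failures.

```python
import re

# TIME PID COMM UID SYSCALL ARGS (RET), as printed by setuids.bt.
# Assumption: COMM has no embedded whitespace.
LINE = re.compile(
    r"(?P<time>\d\d:\d\d:\d\d)\s+(?P<pid>\d+)\s+(?P<comm>\S+)\s+"
    r"(?P<uid>\d+)\s+(?P<syscall>set\w+)\s+(?P<args>.*?)\s+"
    r"\((?P<ret>[^)]*)\)\s*$"
)


def failed_calls(text):
    """Return (pid, comm, syscall, args) for calls with a negative return."""
    failures = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        ret = m["ret"]
        # setfsuid prints "prevuid=N" here, which is not a return code.
        if ret.lstrip("-").isdigit() and int(ret) < 0:
            failures.append((int(m["pid"]), m["comm"], m["syscall"], m["args"]))
    return failures


# A few lines copied from the ssh login trace above.
SAMPLE = """
14:28:22 21785 ssh 1000 setresuid ruid=-1 euid=1000 suid=-1 (0)
14:28:22 21787 sshd 0 setresuid ruid=122 euid=122 suid=122 (0)
14:28:22 21787 sshd 122 setuid uid=0 (-1)
14:28:22 21787 sshd 122 setresuid ruid=-1 euid=0 suid=-1 (-1)
14:28:24 21786 sshd 0 setfsuid uid=1000 (prevuid=0)
"""

if __name__ == "__main__":
    for pid, comm, syscall, args in failed_calls(SAMPLE):
        print(f"{comm}[{pid}] {syscall} {args} failed")
```

In the sample, the two failing calls are sshd PID 21787 trying to regain UID 0 after it had already switched to UID 122.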
41 | -------------------------------------------------------------------------------- /tools/ssllatency.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/bpftrace 2 | /* 3 | * ssllatency Trace SSL/TLS handshake for OpenSSL. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * ssllatency shows handshake latency stats and distribution. 7 | * 8 | * Copyright (c) 2021 Tao Xu. 9 | * Licensed under the Apache License, Version 2.0 (the "License") 10 | * 11 | * 17-Dec-2021 Tao Xu created this. 12 | */ 13 | 14 | BEGIN 15 | { 16 | printf("Tracing SSL/TLS handshake... Hit Ctrl-C to end.\n"); 17 | } 18 | 19 | uprobe:libssl:SSL_read, 20 | uprobe:libssl:SSL_write, 21 | uprobe:libssl:SSL_do_handshake 22 | { 23 | @start_ssl[tid] = nsecs; 24 | @func_ssl[tid] = func; // store for uretprobe 25 | } 26 | 27 | uretprobe:libssl:SSL_read, 28 | uretprobe:libssl:SSL_write, 29 | uretprobe:libssl:SSL_do_handshake 30 | /@start_ssl[tid] != 0/ 31 | { 32 | $lat_us = (nsecs - @start_ssl[tid]) / 1000; 33 | if ((int8)retval >= 1) { 34 | @hist[@func_ssl[tid]] = lhist($lat_us, 0, 1000, 200); 35 | @stat[@func_ssl[tid]] = stats($lat_us); 36 | } else { 37 | @histF[@func_ssl[tid]] = lhist($lat_us, 0, 1000, 200); 38 | @statF[@func_ssl[tid]] = stats($lat_us); 39 | } 40 | delete(@start_ssl[tid]); delete(@func_ssl[tid]); 41 | } 42 | 43 | // need debug symbol for ossl local functions 44 | uprobe:libcrypto:rsa_ossl_public_encrypt, 45 | uprobe:libcrypto:rsa_ossl_public_decrypt, 46 | uprobe:libcrypto:rsa_ossl_private_encrypt, 47 | uprobe:libcrypto:rsa_ossl_private_decrypt, 48 | uprobe:libcrypto:RSA_sign, 49 | uprobe:libcrypto:RSA_verify, 50 | uprobe:libcrypto:ossl_ecdsa_sign, 51 | uprobe:libcrypto:ossl_ecdsa_verify, 52 | uprobe:libcrypto:ecdh_simple_compute_key 53 | { 54 | @start_crypto[tid] = nsecs; 55 | @func_crypto[tid] = func; // store for uretprobe 56 | } 57 | 58 | uretprobe:libcrypto:rsa_ossl_public_encrypt, 59 | uretprobe:libcrypto:rsa_ossl_public_decrypt, 
60 | uretprobe:libcrypto:rsa_ossl_private_encrypt, 61 | uretprobe:libcrypto:rsa_ossl_private_decrypt, 62 | uretprobe:libcrypto:RSA_sign, 63 | uretprobe:libcrypto:RSA_verify, 64 | uretprobe:libcrypto:ossl_ecdsa_sign, 65 | uretprobe:libcrypto:ossl_ecdsa_verify, 66 | uretprobe:libcrypto:ecdh_simple_compute_key 67 | /@start_crypto[tid] != 0/ 68 | { 69 | $lat_us = (nsecs - @start_crypto[tid]) / 1000; 70 | @hist[@func_crypto[tid]] = lhist($lat_us, 0, 1000, 200); 71 | @stat[@func_crypto[tid]] = stats($lat_us); 72 | delete(@start_crypto[tid]); delete(@func_crypto[tid]); 73 | } 74 | 75 | END 76 | { 77 | printf("\nLatency distribution in microsecond:"); 78 | } 79 | -------------------------------------------------------------------------------- /tools/ssllatency_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of ssllatency, the Linux bpftrace/eBPF version. 2 | 3 | ssllatency traces OpenSSL handshake functions. It's the statistical summary 4 | version of sslsnoop. This is useful for performance analysis with different 5 | crypto algorithms or async SSL acceleration by CPU or offload device. 6 | For example: 7 | 8 | # wrk -t 1 -c 10 -d 1s -H 'Connection: close' https://localhost:443/0kb.bin 9 | Running 1s test @ https://localhost:443/0kb.bin 10 | 1 threads and 10 connections 11 | Thread Stats Avg Stdev Max +/- Stdev 12 | Latency 839.94us 323.68us 5.98ms 98.50% 13 | Req/Sec 1.28k 9.05 1.29k 54.55% 14 | 1400 requests in 1.10s, 414.26KB read 15 | Non-2xx or 3xx responses: 1400 16 | Requests/sec: 1272.97 17 | Transfer/sec: 376.67KB 18 | 19 | # ./ssllatency.bt 20 | Attaching 26 probes... 21 | Tracing SSL/TLS handshake... Hit Ctrl-C to end. 
22 | ^C 23 | Latency distribution in microsecond: 24 | 25 | @hist[SSL_write]: 26 | [0, 200) 1401 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 27 | @hist[SSL_read]: 28 | [0, 200) 1401 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 29 | @hist[SSL_do_handshake]: 30 | [600, 800) 1359 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 31 | [800, 1000) 12 | | 32 | [1000, ...) 32 |@ | 33 | @hist[rsa_ossl_private_decrypt]: 34 | [600, 800) 1359 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 35 | [800, 1000) 44 |@ | 36 | @histF[SSL_do_handshake]: 37 | [0, 200) 1410 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 38 | @histF[SSL_read]: 39 | [0, 200) 2804 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 40 | 41 | @stat[SSL_read]: count 1401, average 2, total 2834 42 | @stat[SSL_write]: count 1401, average 5, total 7062 43 | @stat[rsa_ossl_private_decrypt]: count 1403, average 643, total 902780 44 | @stat[SSL_do_handshake]: count 1403, average 706, total 991605 45 | 46 | @statF[SSL_read]: count 2804, average 1, total 3951 47 | @statF[SSL_do_handshake]: count 1410, average 29, total 41964 48 | 49 | This output shows latency distribution for wrk benchmark which saturated 50 | one CPU core used by nginx server. wrk issued 1400 requests in 1.10s, and 51 | req/s is 1272. Server side RSA function is counted 1403 times averaging 52 | 643us latency. And there's same amount(1410/1403) of failed/successful 53 | SSL_do_handshake calls for the round trip. This is the default behavior. 
54 | 55 | # wrk -t 1 -c 10 -d 1s -H 'Connection: close' https://localhost:443/0kb.bin 56 | Running 1s test @ https://localhost:443/0kb.bin 57 | 1 threads and 10 connections 58 | Thread Stats Avg Stdev Max +/- Stdev 59 | Latency 448.67us 148.67us 1.28ms 82.00% 60 | Req/Sec 2.95k 43.03 2.99k 80.00% 61 | 2933 requests in 1.00s, 867.87KB read 62 | Non-2xx or 3xx responses: 2933 63 | Requests/sec: 2930.53 64 | Transfer/sec: 867.14KB 65 | 66 | # ./ssllatency.bt 67 | Attaching 26 probes... 68 | Tracing SSL/TLS handshake... Hit Ctrl-C to end. 69 | ^C 70 | Latency distribution in microsecond: 71 | 72 | @hist[SSL_write]: 73 | [0, 200) 2933 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 74 | @hist[SSL_read]: 75 | [0, 200) 2933 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 76 | @hist[SSL_do_handshake]: 77 | [0, 200) 2941 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 78 | @histF[SSL_read]: 79 | [0, 200) 5873 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 80 | @histF[SSL_do_handshake]: 81 | [0, 200) 5884 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 82 | 83 | @stat[SSL_read]: count 2933, average 4, total 12088 84 | @stat[SSL_write]: count 2933, average 7, total 20683 85 | @stat[SSL_do_handshake]: count 2941, average 51, total 151149 86 | 87 | @statF[SSL_read]: count 5873, average 2, total 13942 88 | @statF[SSL_do_handshake]: count 5884, average 19, total 113061 89 | 90 | This is the hardware accelerated result by using async SSL and CPU crypto 91 | SIMD. req/s is more than doubled under the same wrk workload. Peak throughput 92 | can be more than 3x if adding more wrk connections. Keep using same 93 | workload for comparison. 94 | 95 | libcrypto_mb.so is used instead of libcrypto.so, to batch process multiple 96 | async requests simultaneously by SIMD. As a result, wrk issued 2933 requests 97 | in 1s. Failed SSL_do_handshake calls have doubled (5884), and successful 98 | calls (2941) returned quickly (51us). This is expected from async routines. 
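The figures quoted in the two runs can be cross-checked with a little arithmetic. The Python sketch below is illustrative only (`STAT`, `check_stats`, `BASELINE`, and `ASYNC` are hypothetical names). It assumes the printed average is the total divided by the count with the fraction dropped, which is consistent with every stats line shown above, and it compares the Requests/sec values reported by wrk.

```python
import re

# One line of a stats() printout, e.g.
# "@stat[SSL_do_handshake]: count 1403, average 706, total 991605"
STAT = re.compile(r"@(\w+)\[(\w+)\]: count (\d+), average (\d+), total (\d+)")


def check_stats(text):
    """Map (map name, function) to (count, average, total, consistent)."""
    rows = {}
    for m in STAT.finditer(text):
        count, average, total = (int(m.group(i)) for i in (3, 4, 5))
        # Assumption: the printed average is total // count.
        rows[(m.group(1), m.group(2))] = (
            count, average, total, total // count == average)
    return rows


# Lines copied from the default run and the async run above.
BASELINE = """
@stat[rsa_ossl_private_decrypt]: count 1403, average 643, total 902780
@stat[SSL_do_handshake]: count 1403, average 706, total 991605
"""
ASYNC = """
@stat[SSL_do_handshake]: count 2941, average 51, total 151149
@statF[SSL_do_handshake]: count 5884, average 19, total 113061
"""

# Requests/sec reported by wrk for the async run and the default run.
SPEEDUP = 2930.53 / 1272.97

if __name__ == "__main__":
    for label, text in (("default", BASELINE), ("async", ASYNC)):
        for (name, func), (count, avg, total, ok) in check_stats(text).items():
            print(f"{label} @{name}[{func}]: {total}/{count} -> {avg} us, ok={ok}")
    print(f"throughput ratio: {SPEEDUP:.2f}x")
```

The ratio comes out a little above 2.3, in line with the statement that req/s more than doubled under the same workload.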
99 | 100 | The above effect is based on the huge bottleneck of RSA which is very CPU 101 | intensive. If the key type is changed to ECDSA, the overhead would be much lower, and the overall 102 | improvement would be less obvious. 103 | -------------------------------------------------------------------------------- /tools/sslsnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/bpftrace 2 | /* 3 | * sslsnoop Trace SSL/TLS handshake for OpenSSL. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * sslsnoop shows handshake latency and retval. This is useful for SSL/TLS 7 | * performance analysis. 8 | * 9 | * Copyright (c) 2021 Tao Xu. 10 | * Licensed under the Apache License, Version 2.0 (the "License") 11 | * 12 | * 15-Dec-2021 Tao Xu created this. 13 | */ 14 | 15 | BEGIN 16 | { 17 | printf("Tracing SSL/TLS handshake... Hit Ctrl-C to end.\n"); 18 | printf("%-10s %-8s %-8s %7s %5s %s\n", "TIME(us)", "TID", 19 | "COMM", "LAT(us)", "RET", "FUNC"); 20 | } 21 | 22 | uprobe:libssl:SSL_read, 23 | uprobe:libssl:SSL_write, 24 | uprobe:libssl:SSL_do_handshake 25 | { 26 | @start_ssl[tid] = nsecs; 27 | @func_ssl[tid] = func; // store for uretprobe 28 | } 29 | 30 | uretprobe:libssl:SSL_read, 31 | uretprobe:libssl:SSL_write, 32 | uretprobe:libssl:SSL_do_handshake 33 | /@start_ssl[tid] != 0/ 34 | { 35 | printf("%-10u %-8d %-8s %7u %5d %s\n", elapsed/1000, tid, comm, 36 | (nsecs - @start_ssl[tid])/1000, retval, @func_ssl[tid]); 37 | delete(@start_ssl[tid]); delete(@func_ssl[tid]); 38 | } 39 | 40 | // need debug symbol for ossl local functions 41 | uprobe:libcrypto:rsa_ossl_public_encrypt, 42 | uprobe:libcrypto:rsa_ossl_public_decrypt, 43 | uprobe:libcrypto:rsa_ossl_private_encrypt, 44 | uprobe:libcrypto:rsa_ossl_private_decrypt, 45 | uprobe:libcrypto:RSA_sign, 46 | uprobe:libcrypto:RSA_verify, 47 | uprobe:libcrypto:ossl_ecdsa_sign, 48 | uprobe:libcrypto:ossl_ecdsa_verify, 49 | uprobe:libcrypto:ossl_ecdh_compute_key 50 | { 51 | @start_crypto[tid] = nsecs; 52 
| @func_crypto[tid] = func; // store for uretprobe 53 | } 54 | 55 | uretprobe:libcrypto:rsa_ossl_public_encrypt, 56 | uretprobe:libcrypto:rsa_ossl_public_decrypt, 57 | uretprobe:libcrypto:rsa_ossl_private_encrypt, 58 | uretprobe:libcrypto:rsa_ossl_private_decrypt, 59 | uretprobe:libcrypto:RSA_sign, 60 | uretprobe:libcrypto:RSA_verify, 61 | uretprobe:libcrypto:ossl_ecdsa_sign, 62 | uretprobe:libcrypto:ossl_ecdsa_verify, 63 | uretprobe:libcrypto:ossl_ecdh_compute_key 64 | /@start_crypto[tid] != 0/ 65 | { 66 | printf("%-10u %-8d %-8s %7u %5d %s\n", elapsed/1000, tid, comm, 67 | (nsecs - @start_crypto[tid])/1000, retval, @func_crypto[tid]); 68 | delete(@start_crypto[tid]); delete(@func_crypto[tid]); 69 | } 70 | -------------------------------------------------------------------------------- /tools/sslsnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of sslsnoop, the Linux bpftrace/eBPF version. 2 | 3 | sslsnoop shows OpenSSL handshake function latency and return value. This 4 | can be used to analyze SSL/TLS performance. For example: 5 | 6 | # ./sslsnoop.bt 7 | Attaching 25 probes... 8 | Tracing SSL/TLS handshake... Hit Ctrl-C to end. 9 | TIME(us) TID COMM LAT(us) RET FUNC 10 | 1623016 2834695 openssl 71 1 ossl_ecdh_compute_key 11 | 1623319 2834695 openssl 32 51 rsa_ossl_public_decrypt 12 | 1623418 2834695 openssl 31 51 rsa_ossl_public_decrypt 13 | 1623547 2834695 openssl 27 256 rsa_ossl_public_decrypt 14 | 1623612 2834695 openssl 361150 0 SSL_write 15 | 1804646 2834695 openssl 92 -1 SSL_read 16 | 1804730 2834695 openssl 76 -1 SSL_read 17 | ^C 18 | 19 | Above shows the output of 'openssl s_client -connect example.com:443'. 20 | The first SSL_write call returned after 361ms, which is the TLS handshake 21 | time. Local ECDH and RSA crypto calculation is fast at client side. Most 22 | time is spent at server side including crypto and network latency. 23 | 24 | # ./sslsnoop.bt 25 | Attaching 25 probes... 
26 | Tracing SSL/TLS handshake... Hit Ctrl-C to end. 27 | TIME(us) TID COMM LAT(us) RET FUNC 28 | 1133960 2826460 nginx 81 -1 SSL_do_handshake 29 | 1134910 2826460 nginx 631 256 rsa_ossl_private_decrypt 30 | 1134977 2826460 nginx 709 1 SSL_do_handshake 31 | 1134984 2826460 nginx 3 -1 SSL_read 32 | 1134209 2834970 openssl 37 256 rsa_ossl_public_encrypt 33 | 1134994 2834970 openssl 1244 0 SSL_write 34 | ^C 35 | 36 | Change example.com to localhost to exclude network latency. Output above 37 | shows 1.2ms overall handshake time, and RSA calculation took 0.7ms at nginx 38 | server. As event print is asynchronous, timestamp is not guaranteed to show 39 | in order. 40 | 41 | The bcc tool sslsniff shows similar event latency, and additional plaintext 42 | and ciphertext in SSL_read/write: https://github.com/iovisor/bcc 43 | -------------------------------------------------------------------------------- /tools/statsnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * statsnoop Trace stat() syscalls. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * This traces the tracepoints for statfs(), statx(), newstat(), and 7 | * newlstat(). These aren't the only stat syscalls: if you are missing 8 | * activity, you may need to add more variants. 9 | * 10 | * Also a basic example of bpftrace. 11 | * 12 | * USAGE: statsnoop.bt 13 | * 14 | * This is a bpftrace version of the bcc tool of the same name. 15 | * 16 | * Copyright 2018 Netflix, Inc. 17 | * Licensed under the Apache License, Version 2.0 (the "License") 18 | * 19 | * 08-Sep-2018 Brendan Gregg Created this. 20 | */ 21 | 22 | BEGIN 23 | { 24 | printf("Tracing stat syscalls... 
Hit Ctrl-C to end.\n"); 25 | printf("%-6s %-16s %3s %s\n", "PID", "COMM", "ERR", "PATH"); 26 | } 27 | 28 | tracepoint:syscalls:sys_enter_statfs 29 | { 30 | @filename[tid] = args.pathname; 31 | } 32 | 33 | tracepoint:syscalls:sys_enter_statx, 34 | tracepoint:syscalls:sys_enter_newstat, 35 | tracepoint:syscalls:sys_enter_newlstat 36 | { 37 | @filename[tid] = args.filename; 38 | } 39 | 40 | tracepoint:syscalls:sys_exit_statfs, 41 | tracepoint:syscalls:sys_exit_statx, 42 | tracepoint:syscalls:sys_exit_newstat, 43 | tracepoint:syscalls:sys_exit_newlstat 44 | /@filename[tid]/ 45 | { 46 | $ret = args.ret; 47 | $errno = $ret >= 0 ? 0 : - $ret; 48 | 49 | printf("%-6d %-16s %3d %s\n", pid, comm, $errno, 50 | str(@filename[tid])); 51 | delete(@filename[tid]); 52 | } 53 | 54 | END 55 | { 56 | clear(@filename); 57 | } 58 | -------------------------------------------------------------------------------- /tools/statsnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of statsnoop, the Linux bpftrace/eBPF version. 2 | 3 | 4 | statsnoop traces different stat() syscalls system-wide, and prints details. 5 | Example output: 6 | 7 | # ./statsnoop.bt 8 | Attaching 9 probes... 9 | Tracing stat syscalls... Hit Ctrl-C to end. 10 | PID COMM ERR PATH 11 | 27835 bash 0 . 
12 | 27835 bash 2 /usr/local/sbin/iconfig 13 | 27835 bash 2 /usr/local/bin/iconfig 14 | 27835 bash 2 /usr/sbin/iconfig 15 | 27835 bash 2 /usr/bin/iconfig 16 | 27835 bash 2 /sbin/iconfig 17 | 27835 bash 2 /bin/iconfig 18 | 27835 bash 2 /usr/games/iconfig 19 | 27835 bash 2 /usr/local/games/iconfig 20 | 27835 bash 2 /snap/bin/iconfig 21 | 27835 bash 2 /apps/python/bin/iconfig 22 | 30573 command-not-fou 2 /usr/bin/Modules/Setup 23 | 30573 command-not-fou 2 /usr/bin/lib/python3.5/os.py 24 | 30573 command-not-fou 2 /usr/bin/lib/python3.5/os.pyc 25 | 30573 command-not-fou 0 /usr/lib/python3.5/os.py 26 | 30573 command-not-fou 2 /usr/bin/pybuilddir.txt 27 | 30573 command-not-fou 2 /usr/bin/lib/python3.5/lib-dynload 28 | 30573 command-not-fou 0 /usr/lib/python3.5/lib-dynload 29 | 30573 command-not-fou 2 /usr/lib/python35.zip 30 | 30573 command-not-fou 0 /usr/lib 31 | 30573 command-not-fou 2 /usr/lib/python35.zip 32 | 30573 command-not-fou 0 /usr/lib/python3.5/ 33 | 30573 command-not-fou 0 /usr/lib/python3.5/ 34 | 30573 command-not-fou 0 /usr/lib/python3.5/ 35 | 30573 command-not-fou 2 /usr/lib/python3.5/encodings/__init__.cpython-35m-x86_64-linux- 36 | 30573 command-not-fou 2 /usr/lib/python3.5/encodings/__init__.abi3.so 37 | 30573 command-not-fou 2 /usr/lib/python3.5/encodings/__init__.so 38 | 30573 command-not-fou 0 /usr/lib/python3.5/encodings/__init__.py 39 | 30573 command-not-fou 0 /usr/lib/python3.5/encodings/__init__.py 40 | 41 | This output has caught me mistyping a command in another shell, "iconfig" 42 | instead of "ifconfig". The first several lines show the bash shell searching 43 | the $PATH (why is games in my $PATH??), and failing to find it (ERR == 2 is 44 | file not found). Then, a "command-not-found" program executes (the name is 45 | truncated to 16 characters in the COMM field, including the NULL), which 46 | begins the process of searching for and suggesting a package. ie, this: 47 | 48 | # iconfig 49 | The program 'iconfig' is currently not installed. 
You can install it by typing: 50 | apt install ipmiutil 51 | 52 | statsnoop can be used for general debugging, to see what file information has 53 | been requested, and whether those files exist. It can be used as a companion 54 | to opensnoop, which shows what files were actually opened. 55 | 56 | 57 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 58 | The bcc version provides options to customize the output. 59 | -------------------------------------------------------------------------------- /tools/swapin.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * swapin - Show swapins by process. 4 | * 5 | * See BPF Performance Tools, Chapter 7, for an explanation of this tool. 6 | * 7 | * Copyright (c) 2019 Brendan Gregg. 8 | * Licensed under the Apache License, Version 2.0 (the "License"). 9 | * This was originally created for the BPF Performance Tools book 10 | * published by Addison Wesley. ISBN-13: 9780136554820 11 | * When copying or porting, include this comment. 12 | * 13 | * 26-Jan-2019 Brendan Gregg Created this. 14 | */ 15 | 16 | kprobe:swap_readpage 17 | { 18 | @[comm, pid] = count(); 19 | } 20 | 21 | interval:s:1 22 | { 23 | time(); 24 | print(@); 25 | clear(@); 26 | } 27 | -------------------------------------------------------------------------------- /tools/swapin_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of swapin, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This tool counts swapins by process, to show which process is affected by 5 | swapping. For example: 6 | 7 | # ./swapin.bt 8 | Attaching 2 probes... 
9 | 13:36:59 10 | 11 | 13:37:00 12 | @[chrome, 4536]: 10809 13 | @[gnome-shell, 2239]: 12410 14 | 15 | 13:37:01 16 | @[chrome, 4536]: 3826 17 | 18 | 13:37:02 19 | @[cron, 1180]: 23 20 | @[gnome-shell, 2239]: 2462 21 | 22 | 13:37:03 23 | @[gnome-shell, 1444]: 4 24 | @[gnome-shell, 2239]: 3420 25 | 26 | 13:37:04 27 | 28 | 13:37:05 29 | [...] 30 | 31 | While tracing, this showed that PID 2239 (gnome-shell) and PID 4536 (chrome) 32 | suffered over ten thousand swapins. 33 | -------------------------------------------------------------------------------- /tools/syncsnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * syncsnoop Trace sync() variety of syscalls. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * Also a basic example of bpftrace. 7 | * 8 | * USAGE: syncsnoop.bt 9 | * 10 | * This is a bpftrace version of the bcc tool of the same name. 11 | * 12 | * Copyright 2018 Netflix, Inc. 13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 06-Sep-2018 Brendan Gregg Created this. 16 | */ 17 | 18 | BEGIN 19 | { 20 | printf("Tracing sync syscalls... Hit Ctrl-C to end.\n"); 21 | printf("%-9s %-6s %-16s %s\n", "TIME", "PID", "COMM", "EVENT"); 22 | } 23 | 24 | tracepoint:syscalls:sys_enter_sync, 25 | tracepoint:syscalls:sys_enter_syncfs, 26 | tracepoint:syscalls:sys_enter_fsync, 27 | tracepoint:syscalls:sys_enter_fdatasync, 28 | tracepoint:syscalls:sys_enter_sync_file_range*, 29 | tracepoint:syscalls:sys_enter_msync 30 | { 31 | time("%H:%M:%S "); 32 | printf("%-6d %-16s %s\n", pid, comm, probe); 33 | } 34 | -------------------------------------------------------------------------------- /tools/syncsnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of syncsnoop, the Linux bpftrace/eBPF version. 2 | 3 | 4 | Tracing file system sync events: 5 | 6 | # ./syncsnoop.bt 7 | Attaching 7 probes... 
8 | Tracing sync syscalls... Hit Ctrl-C to end. 9 | TIME PID COMM EVENT 10 | 02:02:17 27933 sync tracepoint:syscalls:sys_enter_sync 11 | 02:03:43 27936 sync tracepoint:syscalls:sys_enter_sync 12 | 13 | The output shows calls to the sync() syscall (traced via its tracepoint), 14 | along with various details. 15 | 16 | 17 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 18 | -------------------------------------------------------------------------------- /tools/syscount.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * syscount.bt Count system calls. 4 | * For Linux, uses bpftrace, eBPF. 5 | * 6 | * This is a bpftrace version of the bcc tool of the same name. 7 | * The bcc version translates syscall IDs to their names, and this version 8 | * currently does not. Syscall IDs can be listed by "ausyscall --dump". 9 | * 10 | * Copyright 2018 Netflix, Inc. 11 | * Licensed under the Apache License, Version 2.0 (the "License") 12 | * 13 | * 13-Sep-2018 Brendan Gregg Created this. 14 | */ 15 | 16 | BEGIN 17 | { 18 | printf("Counting syscalls... Hit Ctrl-C to end.\n"); 19 | // ausyscall --dump | awk 'NR > 1 { printf("\t@sysname[%d] = \"%s\";\n", $1, $2); }' 20 | } 21 | 22 | tracepoint:raw_syscalls:sys_enter 23 | { 24 | @syscall[args.id] = count(); 25 | @process[comm] = count(); 26 | } 27 | 28 | END 29 | { 30 | printf("\nTop 10 syscalls IDs:\n"); 31 | print(@syscall, 10); 32 | clear(@syscall); 33 | 34 | printf("\nTop 10 processes:\n"); 35 | print(@process, 10); 36 | clear(@process); 37 | } 38 | -------------------------------------------------------------------------------- /tools/syscount_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of syscount, the Linux bpftrace/eBPF version. 
2 | 3 | 4 | syscount counts system calls, and prints summaries of the top ten syscall IDs, 5 | and the top ten process names making syscalls. For example: 6 | 7 | # ./syscount.bt 8 | Attaching 3 probes... 9 | Counting syscalls... Hit Ctrl-C to end. 10 | ^C 11 | Top 10 syscalls IDs: 12 | @syscall[6]: 36862 13 | @syscall[21]: 42189 14 | @syscall[13]: 44532 15 | @syscall[12]: 58456 16 | @syscall[9]: 82113 17 | @syscall[8]: 95575 18 | @syscall[5]: 147658 19 | @syscall[3]: 163269 20 | @syscall[2]: 270801 21 | @syscall[4]: 326333 22 | 23 | Top 10 processes: 24 | @process[rm]: 14360 25 | @process[tail]: 16011 26 | @process[objtool]: 20767 27 | @process[fixdep]: 28489 28 | @process[as]: 48982 29 | @process[gcc]: 90652 30 | @process[command-not-fou]: 172874 31 | @process[sh]: 270515 32 | @process[cc1]: 482888 33 | @process[make]: 1404065 34 | 35 | The above output was traced during a Linux kernel build, and the process name 36 | with the most syscalls was "make" with 1,404,065 syscalls while tracing. The 37 | most frequent syscall ID was 4, which is stat(). 38 | 39 | 40 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 41 | The bcc version provides different command line options, and translates the 42 | syscall IDs to their syscall names. 43 | -------------------------------------------------------------------------------- /tools/tcpaccept.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * tcpaccept.bt Trace TCP accept()s 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: tcpaccept.bt 7 | * 8 | * This is a bpftrace version of the bcc tool of the same name. 9 | * 10 | * This uses dynamic tracing of the kernel inet_csk_accept() socket function 11 | * (from tcp_prot.accept), and will need to be modified to match kernel changes. 12 | 13 | * Copyright (c) 2018 Dale Hamel.
14 | * Licensed under the Apache License, Version 2.0 (the "License") 15 | 16 | * 23-Nov-2018 Dale Hamel created this. 17 | */ 18 | 19 | #ifndef BPFTRACE_HAVE_BTF 20 | #include 21 | #include 22 | #else 23 | #include 24 | #endif 25 | 26 | BEGIN 27 | { 28 | printf("Tracing TCP accepts. Hit Ctrl-C to end.\n"); 29 | printf("%-8s %-6s %-14s ", "TIME", "PID", "COMM"); 30 | printf("%-39s %-5s %-39s %-5s %s\n", "RADDR", "RPORT", "LADDR", 31 | "LPORT", "BL"); 32 | } 33 | 34 | kretprobe:inet_csk_accept 35 | { 36 | $sk = (struct sock *)retval; 37 | $inet_family = $sk->__sk_common.skc_family; 38 | 39 | if ($inet_family == AF_INET || $inet_family == AF_INET6) { 40 | // initialize variable type: 41 | $daddr = ntop(0); 42 | $saddr = ntop(0); 43 | if ($inet_family == AF_INET) { 44 | $daddr = ntop($sk->__sk_common.skc_daddr); 45 | $saddr = ntop($sk->__sk_common.skc_rcv_saddr); 46 | } else { 47 | $daddr = ntop( 48 | $sk->__sk_common.skc_v6_daddr.in6_u.u6_addr8); 49 | $saddr = ntop( 50 | $sk->__sk_common.skc_v6_rcv_saddr.in6_u.u6_addr8); 51 | } 52 | $lport = $sk->__sk_common.skc_num; 53 | $dport = $sk->__sk_common.skc_dport; 54 | $qlen = $sk->sk_ack_backlog; 55 | $qmax = $sk->sk_max_ack_backlog; 56 | 57 | // Destination port is big endian, it must be flipped 58 | $dport = bswap($dport); 59 | 60 | time("%H:%M:%S "); 61 | printf("%-6d %-14s ", pid, comm); 62 | printf("%-39s %-5d %-39s %-5d ", $daddr, $dport, $saddr, 63 | $lport); 64 | printf("%d/%d\n", $qlen, $qmax); 65 | } 66 | } 67 | -------------------------------------------------------------------------------- /tools/tcpaccept_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of tcpaccept, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This tool traces the kernel function accepting TCP socket connections (eg, a 5 | passive connection via accept(); not connect()). 
Some example output (IP 6 | addresses changed to protect the innocent): 7 | 8 | # ./tcpaccept.bt 9 | Tracing tcp accepts. Hit Ctrl-C to end. 10 | TIME PID COMM RADDR RPORT LADDR LPORT BL 11 | 00:34:19 3949061 nginx 10.228.22.228 44226 10.229.20.169 8088 0/128 12 | 00:34:19 3951399 ruby 127.0.0.1 52422 127.0.0.1 8000 0/128 13 | 00:34:19 3949062 nginx 10.228.23.128 35408 10.229.20.169 8080 0/128 14 | 15 | 16 | This output shows three connections: two to "nginx" processes listening on 17 | ports 8088 and 8080, and one IPv4 connection to PID 3951399, a "ruby" process 18 | listening on port 8000. The remote address and port are also printed, along with 19 | the current and maximum size of the accept queue (the BL column). 20 | 21 | The overhead of this tool should be negligible, since it is only tracing the 22 | kernel function performing accept. It is not tracing every packet and then 23 | filtering. 24 | 25 | This tool only traces successful TCP accept()s. Connection attempts to closed 26 | ports will not be shown (those can be traced via other functions). 27 | 28 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 29 | 30 | USAGE message: 31 | 32 | # ./tcpaccept.bt 33 | -------------------------------------------------------------------------------- /tools/tcpconnect.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * tcpconnect.bt Trace TCP connect()s. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: tcpconnect.bt 7 | * 8 | * This is a bpftrace version of the bcc tool of the same name. 9 | * It is limited to ipv4 addresses. 10 | * 11 | * All connection attempts are traced, even if they ultimately fail. 12 | * 13 | * This uses dynamic tracing of kernel functions, and will need to be updated 14 | * to match kernel changes. 15 | * 16 | * Copyright (c) 2018 Dale Hamel.
17 | * Licensed under the Apache License, Version 2.0 (the "License") 18 | * 19 | * 23-Nov-2018 Dale Hamel created this. 20 | */ 21 | 22 | #ifndef BPFTRACE_HAVE_BTF 23 | #include 24 | #include 25 | #else 26 | #include 27 | #endif 28 | 29 | BEGIN 30 | { 31 | printf("Tracing tcp connections. Hit Ctrl-C to end.\n"); 32 | printf("%-8s %-8s %-16s ", "TIME", "PID", "COMM"); 33 | printf("%-39s %-6s %-39s %-6s\n", "SADDR", "SPORT", "DADDR", "DPORT"); 34 | } 35 | 36 | kprobe:tcp_connect 37 | { 38 | $sk = ((struct sock *) arg0); 39 | $inet_family = $sk->__sk_common.skc_family; 40 | 41 | if ($inet_family == AF_INET || $inet_family == AF_INET6) { 42 | if ($inet_family == AF_INET) { 43 | $daddr = ntop($sk->__sk_common.skc_daddr); 44 | $saddr = ntop($sk->__sk_common.skc_rcv_saddr); 45 | } else { 46 | $daddr = ntop($sk->__sk_common.skc_v6_daddr.in6_u.u6_addr8); 47 | $saddr = ntop($sk->__sk_common.skc_v6_rcv_saddr.in6_u.u6_addr8); 48 | } 49 | $lport = $sk->__sk_common.skc_num; 50 | $dport = $sk->__sk_common.skc_dport; 51 | 52 | // Destination port is big endian, it must be flipped 53 | $dport = bswap($dport); 54 | 55 | time("%H:%M:%S "); 56 | printf("%-8d %-16s ", pid, comm); 57 | printf("%-39s %-6d %-39s %-6d\n", $saddr, $lport, $daddr, $dport); 58 | } 59 | } 60 | -------------------------------------------------------------------------------- /tools/tcpconnect_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of tcpconnect, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This tool traces the kernel function performing active TCP connections 5 | (eg, via a connect() syscall; accept() are passive connections). 
Some example 6 | output (IP addresses changed to protect the innocent): 7 | 8 | # ./tcpconnect.bt 9 | TIME PID COMM SADDR SPORT DADDR DPORT 10 | 00:36:45 1798396 agent 127.0.0.1 5001 10.229.20.82 56114 11 | 00:36:45 1798396 curl 127.0.0.1 10255 10.229.20.82 56606 12 | 00:36:45 3949059 nginx 127.0.0.1 8000 127.0.0.1 37780 13 | 14 | 15 | This output shows three connections: one from an "agent" process, one from 16 | "curl", and one from "nginx". The output shows the source address, source 17 | port, destination address, and destination port. This traces attempted 18 | connections: these may have failed. 19 | 20 | The overhead of this tool should be negligible, since it is only tracing the 21 | kernel functions performing connect. It is not tracing every packet and then 22 | filtering. 23 | 24 | USAGE message: 25 | 26 | # ./tcpconnect.bt 27 | -------------------------------------------------------------------------------- /tools/tcpdrop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * tcpdrop.bt Trace TCP kernel-dropped packets/segments. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: tcpdrop.bt 7 | * 8 | * This is a bpftrace version of the bcc tool of the same name. 9 | * It is limited to ipv4 addresses, and cannot show tcp flags. 10 | * 11 | * This provides information such as packet details, socket state, and kernel 12 | * stack trace for packets/segments that were dropped via kfree_skb. 13 | * 14 | * For Linux 5.17+ (see tools/old for script for lower versions). 15 | * 16 | * Copyright (c) 2018 Dale Hamel. 17 | * Licensed under the Apache License, Version 2.0 (the "License") 18 | * 19 | * 23-Nov-2018 Dale Hamel created this. 20 | * 01-Oct-2022 Rong Tao use tracepoint:skb:kfree_skb 21 | */ 22 | 23 | #ifndef BPFTRACE_HAVE_BTF 24 | #include 25 | #include 26 | #else 27 | #include 28 | #endif 29 | 30 | BEGIN 31 | { 32 | printf("Tracing tcp drops.
Hit Ctrl-C to end.\n"); 33 | printf("%-8s %-8s %-16s %-21s %-21s %-8s\n", "TIME", "PID", "COMM", "SADDR:SPORT", "DADDR:DPORT", "STATE"); 34 | 35 | // See https://github.com/torvalds/linux/blob/master/include/net/tcp_states.h 36 | @tcp_states[1] = "ESTABLISHED"; 37 | @tcp_states[2] = "SYN_SENT"; 38 | @tcp_states[3] = "SYN_RECV"; 39 | @tcp_states[4] = "FIN_WAIT1"; 40 | @tcp_states[5] = "FIN_WAIT2"; 41 | @tcp_states[6] = "TIME_WAIT"; 42 | @tcp_states[7] = "CLOSE"; 43 | @tcp_states[8] = "CLOSE_WAIT"; 44 | @tcp_states[9] = "LAST_ACK"; 45 | @tcp_states[10] = "LISTEN"; 46 | @tcp_states[11] = "CLOSING"; 47 | @tcp_states[12] = "NEW_SYN_RECV"; 48 | } 49 | 50 | tracepoint:skb:kfree_skb 51 | { 52 | $reason = args.reason; 53 | $skb = (struct sk_buff *)args.skbaddr; 54 | $sk = ((struct sock *) $skb->sk); 55 | $inet_family = $sk->__sk_common.skc_family; 56 | 57 | if ($reason > SKB_DROP_REASON_NOT_SPECIFIED && 58 | ($inet_family == AF_INET || $inet_family == AF_INET6)) { 59 | if ($inet_family == AF_INET) { 60 | $daddr = ntop($sk->__sk_common.skc_daddr); 61 | $saddr = ntop($sk->__sk_common.skc_rcv_saddr); 62 | } else { 63 | $daddr = ntop($sk->__sk_common.skc_v6_daddr.in6_u.u6_addr8); 64 | $saddr = ntop($sk->__sk_common.skc_v6_rcv_saddr.in6_u.u6_addr8); 65 | } 66 | $lport = $sk->__sk_common.skc_num; 67 | $dport = $sk->__sk_common.skc_dport; 68 | 69 | // Destination port is big endian, it must be flipped 70 | $dport = bswap($dport); 71 | 72 | $state = $sk->__sk_common.skc_state; 73 | $statestr = @tcp_states[$state]; 74 | 75 | time("%H:%M:%S "); 76 | printf("%-8d %-16s ", pid, comm); 77 | printf("%39s:%-6d %39s:%-6d %-10s\n", $saddr, $lport, $daddr, $dport, $statestr); 78 | printf("%s\n", kstack); 79 | } 80 | } 81 | 82 | END 83 | { 84 | clear(@tcp_states); 85 | } 86 | -------------------------------------------------------------------------------- /tools/tcpdrop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of tcpdrop, 
the Linux bpftrace/eBPF version. 2 | 3 | 4 | tcpdrop prints details of TCP packets or segments that were dropped by the 5 | kernel, including the kernel stack trace that led to the drop: 6 | 7 | # ./tcpdrop.bt 8 | TIME PID COMM SADDR:SPORT DADDR:DPORT STATE 9 | 00:39:21 0 swapper/2 10.231.244.31:3306 10.229.20.82:50552 ESTABLISHE 10 | tcp_drop+0x1 11 | tcp_v4_do_rcv+0x135 12 | tcp_v4_rcv+0x9c7 13 | ip_local_deliver_finish+0x62 14 | ip_local_deliver+0x6f 15 | ip_rcv_finish+0x129 16 | ip_rcv+0x28f 17 | __netif_receive_skb_core+0x432 18 | __netif_receive_skb+0x18 19 | netif_receive_skb_internal+0x37 20 | napi_gro_receive+0xc5 21 | ena_clean_rx_irq+0x3c3 22 | ena_io_poll+0x33f 23 | net_rx_action+0x140 24 | __softirqentry_text_start+0xdf 25 | irq_exit+0xb6 26 | do_IRQ+0x82 27 | ret_from_intr+0x0 28 | native_safe_halt+0x6 29 | default_idle+0x20 30 | arch_cpu_idle+0x15 31 | default_idle_call+0x23 32 | do_idle+0x17f 33 | cpu_startup_entry+0x73 34 | rest_init+0xae 35 | start_kernel+0x4dc 36 | x86_64_start_reservations+0x24 37 | x86_64_start_kernel+0x74 38 | secondary_startup_64+0xa5 39 | [...] 40 | 41 | The last column shows the state of the TCP session. 42 | 43 | This tool is useful for debugging high rates of drops, which can cause the 44 | remote end to do timer-based retransmits, hurting performance. 45 | 46 | USAGE: 47 | 48 | # ./tcpdrop.bt 49 | -------------------------------------------------------------------------------- /tools/tcplife.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * tcplife - Trace TCP session lifespans with connection details. 4 | * 5 | * See BPF Performance Tools, Chapter 10, for an explanation of this tool. 6 | * 7 | * Copyright (c) 2019 Brendan Gregg. 8 | * Licensed under the Apache License, Version 2.0 (the "License"). 9 | * This was originally created for the BPF Performance Tools book 10 | * published by Addison Wesley. 
ISBN-13: 9780136554820 11 | * When copying or porting, include this comment. 12 | * 13 | * 17-Apr-2019 Brendan Gregg Created this. 14 | */ 15 | 16 | #ifndef BPFTRACE_HAVE_BTF 17 | #include 18 | #include 19 | #include 20 | #include 21 | #else 22 | #include 23 | #endif 24 | 25 | BEGIN 26 | { 27 | printf("%-5s %-10s %-15s %-5s %-15s %-5s ", "PID", "COMM", 28 | "LADDR", "LPORT", "RADDR", "RPORT"); 29 | printf("%5s %5s %s\n", "TX_KB", "RX_KB", "MS"); 30 | } 31 | 32 | kprobe:tcp_set_state 33 | { 34 | $sk = (struct sock *)arg0; 35 | $newstate = arg1; 36 | 37 | /* 38 | * This tool includes PID and comm context. From TCP this is best 39 | * effort, and may be wrong in some situations. It does this: 40 | * - record timestamp on any state < TCP_FIN_WAIT1 41 | * note some state transitions may not be present via this kprobe 42 | * - cache task context on: 43 | * TCP_SYN_SENT: tracing from client 44 | * TCP_LAST_ACK: client-closed from server 45 | * - do output on TCP_CLOSE: 46 | * fetch task context if cached, or use current task 47 | */ 48 | 49 | // record first timestamp seen for this socket 50 | if ($newstate < TCP_FIN_WAIT1 && @birth[$sk] == 0) { 51 | @birth[$sk] = nsecs; 52 | } 53 | 54 | // record PID & comm on SYN_SENT 55 | if ($newstate == TCP_SYN_SENT || $newstate == TCP_LAST_ACK) { 56 | @skpid[$sk] = pid; 57 | @skcomm[$sk] = comm; 58 | } 59 | 60 | // session ended: calculate lifespan and print 61 | if ($newstate == TCP_CLOSE && @birth[$sk]) { 62 | $delta_ms = (nsecs - @birth[$sk]) / 1e6; 63 | $lport = $sk->__sk_common.skc_num; 64 | $dport = $sk->__sk_common.skc_dport; 65 | $dport = bswap($dport); 66 | $tp = (struct tcp_sock *)$sk; 67 | $pid = @skpid[$sk]; 68 | $comm = @skcomm[$sk]; 69 | if ($comm == "") { 70 | // not cached, use current task 71 | $pid = pid; 72 | $comm = comm; 73 | } 74 | 75 | $family = $sk->__sk_common.skc_family; 76 | $saddr = ntop(0); 77 | $daddr = ntop(0); 78 | if ($family == AF_INET) { 79 | $saddr = ntop(AF_INET, $sk->__sk_common.skc_rcv_saddr); 
80 | $daddr = ntop(AF_INET, $sk->__sk_common.skc_daddr); 81 | } else { 82 | // AF_INET6 83 | $saddr = ntop(AF_INET6, 84 | $sk->__sk_common.skc_v6_rcv_saddr.in6_u.u6_addr8); 85 | $daddr = ntop(AF_INET6, 86 | $sk->__sk_common.skc_v6_daddr.in6_u.u6_addr8); 87 | } 88 | printf("%-5d %-10.10s %-15s %-5d %-15s %-6d ", $pid, 89 | $comm, $saddr, $lport, $daddr, $dport); 90 | printf("%5d %5d %d\n", $tp->bytes_acked / 1024, 91 | $tp->bytes_received / 1024, $delta_ms); 92 | 93 | delete(@birth[$sk]); 94 | delete(@skpid[$sk]); 95 | delete(@skcomm[$sk]); 96 | } 97 | } 98 | 99 | END 100 | { 101 | clear(@birth); clear(@skpid); clear(@skcomm); 102 | } 103 | -------------------------------------------------------------------------------- /tools/tcplife_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of tcplife, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This tool shows the lifespan of TCP sessions, including throughput statistics, 5 | and for efficiency only instruments TCP state changes (rather than all packets).
6 | For example: 7 | 8 | # ./tcplife.bt 9 | PID COMM LADDR LPORT RADDR RPORT TX_KB RX_KB MS 10 | 20976 ssh 127.0.0.1 56766 127.0.0.1 22 6 10584 3059 11 | 20977 sshd 127.0.0.1 22 127.0.0.1 56766 10584 6 3059 12 | 14519 monitord 127.0.0.1 44832 127.0.0.1 44444 0 0 0 13 | 4496 Chrome_IOT 7f00:6:5ea7::a00:0 42846 0:0:bb01:: 443 0 3 12441 14 | 4496 Chrome_IOT 7f00:6:5aa7::a00:0 42842 0:0:bb01:: 443 0 3 12436 15 | 4496 Chrome_IOT 7f00:6:62a7::a00:0 42850 0:0:bb01:: 443 0 3 12436 16 | 4496 Chrome_IOT 7f00:6:5ca7::a00:0 42844 0:0:bb01:: 443 0 3 12442 17 | 4496 Chrome_IOT 7f00:6:60a7::a00:0 42848 0:0:bb01:: 443 0 3 12436 18 | 4496 Chrome_IOT 10.0.0.65 33342 54.241.2.241 443 0 3 10717 19 | 4496 Chrome_IOT 10.0.0.65 33350 54.241.2.241 443 0 3 10711 20 | 4496 Chrome_IOT 10.0.0.65 33352 54.241.2.241 443 0 3 10712 21 | 14519 monitord 127.0.0.1 44832 127.0.0.1 44444 0 0 0 22 | 23 | The output begins with a localhost ssh connection, so both endpoints can be 24 | seen: the ssh process (PID 20976) which received 10584 Kbytes, and the sshd 25 | process (PID 20977) which transmitted 10584 Kbytes. This session lasted 3059 26 | milliseconds. Other sessions can also be seen, including IPv6 connections. 27 | -------------------------------------------------------------------------------- /tools/tcpretrans.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * tcpretrans.bt Trace or count TCP retransmits 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * USAGE: tcpretrans.bt 7 | * 8 | * This is a bpftrace version of the bcc tool of the same name. 9 | * It is limited to ipv4 addresses, and doesn't support tracking TLPs. 10 | * 11 | * This uses dynamic tracing of kernel functions, and will need to be updated 12 | * to match kernel changes. 13 | * 14 | * Copyright (c) 2018 Dale Hamel. 15 | * Licensed under the Apache License, Version 2.0 (the "License") 16 | * 17 | * 23-Nov-2018 Dale Hamel created this. 
18 | */ 19 | 20 | #ifndef BPFTRACE_HAVE_BTF 21 | #include 22 | #include 23 | #else 24 | #include 25 | #endif 26 | 27 | BEGIN 28 | { 29 | printf("Tracing tcp retransmits. Hit Ctrl-C to end.\n"); 30 | printf("%-8s %-8s %20s %21s %6s\n", "TIME", "PID", "LADDR:LPORT", 31 | "RADDR:RPORT", "STATE"); 32 | 33 | // See include/net/tcp_states.h: 34 | @tcp_states[1] = "ESTABLISHED"; 35 | @tcp_states[2] = "SYN_SENT"; 36 | @tcp_states[3] = "SYN_RECV"; 37 | @tcp_states[4] = "FIN_WAIT1"; 38 | @tcp_states[5] = "FIN_WAIT2"; 39 | @tcp_states[6] = "TIME_WAIT"; 40 | @tcp_states[7] = "CLOSE"; 41 | @tcp_states[8] = "CLOSE_WAIT"; 42 | @tcp_states[9] = "LAST_ACK"; 43 | @tcp_states[10] = "LISTEN"; 44 | @tcp_states[11] = "CLOSING"; 45 | @tcp_states[12] = "NEW_SYN_RECV"; 46 | } 47 | 48 | kprobe:tcp_retransmit_skb 49 | { 50 | $sk = (struct sock *)arg0; 51 | $inet_family = $sk->__sk_common.skc_family; 52 | 53 | if ($inet_family == AF_INET || $inet_family == AF_INET6) { 54 | // initialize variable type: 55 | $daddr = ntop(0); 56 | $saddr = ntop(0); 57 | if ($inet_family == AF_INET) { 58 | $daddr = ntop($sk->__sk_common.skc_daddr); 59 | $saddr = ntop($sk->__sk_common.skc_rcv_saddr); 60 | } else { 61 | $daddr = ntop( 62 | $sk->__sk_common.skc_v6_daddr.in6_u.u6_addr8); 63 | $saddr = ntop( 64 | $sk->__sk_common.skc_v6_rcv_saddr.in6_u.u6_addr8); 65 | } 66 | $lport = $sk->__sk_common.skc_num; 67 | $dport = $sk->__sk_common.skc_dport; 68 | 69 | // Destination port is big endian, it must be flipped 70 | $dport = bswap($dport); 71 | 72 | $state = $sk->__sk_common.skc_state; 73 | $statestr = @tcp_states[$state]; 74 | 75 | time("%H:%M:%S "); 76 | printf("%-8d %14s:%-6d %14s:%-6d %6s\n", pid, $saddr, $lport, 77 | $daddr, $dport, $statestr); 78 | } 79 | } 80 | 81 | END 82 | { 83 | clear(@tcp_states); 84 | } 85 | -------------------------------------------------------------------------------- /tools/tcpretrans_example.txt: -------------------------------------------------------------------------------- 1 | 
Demonstrations of tcpretrans, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This tool traces the kernel TCP retransmit function to show details of these 5 | retransmits. For example: 6 | 7 | # ./tcpretrans.bt 8 | TIME PID LADDR:LPORT RADDR:RPORT STATE 9 | 01:55:05 0 10.153.223.157:22 69.53.245.40:34619 ESTABLISHED 10 | 01:55:05 0 10.153.223.157:22 69.53.245.40:34619 ESTABLISHED 11 | 01:55:17 0 10.153.223.157:22 69.53.245.40:22957 ESTABLISHED 12 | [...] 13 | 14 | This output shows three TCP retransmits, the first two were for an IPv4 15 | connection from 10.153.223.157 port 22 to 69.53.245.40 port 34619. The TCP 16 | state was "ESTABLISHED" at the time of the retransmit. The on-CPU PID at the 17 | time of the retransmit is printed, in this case 0 (the kernel, which will 18 | be the case most of the time). 19 | 20 | Retransmits are usually a sign of poor network health, and this tool is 21 | useful for their investigation. Unlike using tcpdump, this tool has very 22 | low overhead, as it only traces the retransmit function. It also prints 23 | additional kernel details: the state of the TCP session at the time of the 24 | retransmit. 25 | 26 | USAGE message: 27 | 28 | # ./tcpretrans.bt 29 | -------------------------------------------------------------------------------- /tools/tcpsynbl.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * tcpsynbl - Show TCP SYN backlog as a histogram. 4 | * 5 | * See BPF Performance Tools, Chapter 10, for an explanation of this tool. 6 | * 7 | * Copyright (c) 2019 Brendan Gregg. 8 | * Licensed under the Apache License, Version 2.0 (the "License"). 9 | * This was originally created for the BPF Performance Tools book 10 | * published by Addison Wesley. ISBN-13: 9780136554820 11 | * When copying or porting, include this comment. 12 | * 13 | * 19-Apr-2019 Brendan Gregg Created this. 
14 | */ 15 | 16 | #ifndef BPFTRACE_HAVE_BTF 17 | #include 18 | #endif 19 | 20 | BEGIN 21 | { 22 | printf("Tracing SYN backlog size. Ctrl-C to end.\n"); 23 | } 24 | 25 | kprobe:tcp_v4_syn_recv_sock, 26 | kprobe:tcp_v6_syn_recv_sock 27 | { 28 | $sock = (struct sock *)arg0; 29 | @backlog[$sock->sk_max_ack_backlog & 0xffffffff] = 30 | hist($sock->sk_ack_backlog); 31 | if ($sock->sk_ack_backlog > $sock->sk_max_ack_backlog) { 32 | time("%H:%M:%S dropping a SYN.\n"); 33 | } 34 | } 35 | 36 | END 37 | { 38 | printf("\n@backlog[backlog limit]: histogram of backlog size\n"); 39 | } 40 | -------------------------------------------------------------------------------- /tools/tcpsynbl_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of tcpsynbl, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This tool shows the TCP SYN backlog size during SYN arrival as a histogram. 5 | This lets you see how close your applications are to hitting the backlog limit 6 | and dropping SYNs (causing performance issues with SYN retransmits). For 7 | example: 8 | 9 | # ./tcpsynbl.bt 10 | Attaching 4 probes... 11 | Tracing SYN backlog size. Ctrl-C to end. 12 | ^C 13 | @backlog[backlog limit]: histogram of backlog size 14 | 15 | 16 | @backlog[500]: 17 | [0] 2266 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| 18 | [1] 3 | | 19 | [2, 4) 1 | | 20 | 21 | This output shows that for the backlog limit of 500, there were 2266 SYN 22 | arrivals where the backlog was zero, three where the backlog was one, and 23 | one where the backlog was either two or three. This indicates that we are 24 | nowhere near this limit. 25 | -------------------------------------------------------------------------------- /tools/threadsnoop.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * threadsnoop - List new thread creation. 
4 | * 5 | * See BPF Performance Tools, Chapter 13, for an explanation of this tool. 6 | * 7 | * Copyright (c) 2019 Brendan Gregg. 8 | * Licensed under the Apache License, Version 2.0 (the "License"). 9 | * This was originally created for the BPF Performance Tools book 10 | * published by Addison Wesley. ISBN-13: 9780136554820 11 | * When copying or porting, include this comment. 12 | * 13 | * 15-Feb-2019 Brendan Gregg Created this. 14 | */ 15 | 16 | BEGIN 17 | { 18 | printf("%-10s %-6s %-16s %s\n", "TIME(ms)", "PID", "COMM", "FUNC"); 19 | } 20 | 21 | uprobe:libpthread:pthread_create, 22 | uprobe:libc:pthread_create 23 | { 24 | printf("%-10u %-6d %-16s %s\n", elapsed / 1e6, pid, comm, 25 | usym(arg2)); 26 | } 27 | -------------------------------------------------------------------------------- /tools/threadsnoop_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of threadsnoop, the Linux bpftrace/eBPF version. 2 | 3 | 4 | Tracing new threads via pthread_create(): 5 | 6 | # ./threadsnoop.bt 7 | Attaching 2 probes... 8 | TIME(ms) PID COMM FUNC 9 | 1938 28549 dockerd threadentry 10 | 1939 28549 dockerd threadentry 11 | 1939 28549 dockerd threadentry 12 | 1940 28549 dockerd threadentry 13 | 1949 28549 dockerd threadentry 14 | 1958 28549 dockerd threadentry 15 | 1939 28549 dockerd threadentry 16 | 1950 28549 dockerd threadentry 17 | 2013 28579 docker-containe 0x562f30f2e710 18 | 2036 28549 dockerd threadentry 19 | 2083 28579 docker-containe 0x562f30f2e710 20 | 2116 629 systemd-journal 0x7fb7114955c0 21 | 2116 629 systemd-journal 0x7fb7114955c0 22 | [...] 23 | 24 | The output shows a dockerd process creating several threads with the start 25 | routine threadentry(), and docker-containe (truncated) and systemd-journal 26 | also starting threads: in their cases, the function had no symbol information 27 | available, so their addresses are printed in hex.
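Output like the above can be tallied per process with a short post-processing script. The following is only a sketch in Python and is not part of bpftrace or of this repository: the function name count_threads is hypothetical, and it assumes the four-column layout shown above with a COMM value that contains no spaces.

```python
from collections import Counter


def count_threads(output: str) -> Counter:
    """Tally thread creations per (PID, COMM) from threadsnoop.bt output.

    Hypothetical helper: assumes rows of TIME(ms), PID, COMM, FUNC and a
    COMM without embedded whitespace.
    """
    counts = Counter()
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly four columns and a numeric TIME(ms).
        # This skips the "Attaching ..." banner, the header row, and "[...]".
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        _time_ms, pid, comm, _func = fields
        counts[(int(pid), comm)] += 1
    return counts


sample = """\
Attaching 2 probes...
TIME(ms)   PID    COMM             FUNC
1938       28549  dockerd          threadentry
2013       28579  docker-containe  0x562f30f2e710
2036       28549  dockerd          threadentry
[...]
"""

# Print one line per process, most thread creations first.
for (pid, comm), n in count_threads(sample).most_common():
    print(pid, comm, n)
```

Keying on both PID and COMM keeps separate processes with the same name apart.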
28 | -------------------------------------------------------------------------------- /tools/undump.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * undump Trace unix domain socket package receive. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * Also a basic example of bpftrace. 7 | * 8 | * This is a bpftrace version of the bcc examples/tracing of the same name. 9 | * 10 | * USAGE: undump.bt 11 | * 12 | * Copyright 2022 CESTC, Inc. 13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 22-May-2022 Rong Tao Created this. 16 | */ 17 | #ifndef BPFTRACE_HAVE_BTF 18 | #include 19 | #endif 20 | 21 | BEGIN 22 | { 23 | printf("Dump UNIX socket packages RX. Ctrl-C to end\n"); 24 | printf("%-8s %-16s %-8s %-8s %-s\n", "TIME", "COMM", "PID", "SIZE", "DATA"); 25 | } 26 | 27 | kprobe:unix_stream_read_actor 28 | { 29 | $skb = (struct sk_buff *)arg0; 30 | time("%H:%M:%S "); 31 | printf("%-16s %-8d %-8d %r\n", comm, pid, $skb->len, buf($skb->data, $skb->len)); 32 | } 33 | 34 | END 35 | { 36 | } 37 | -------------------------------------------------------------------------------- /tools/undump_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of undump.bt, the Linux eBPF/bpftrace version. 2 | 3 | This example traces the kernel function that receives AF_UNIX socket 4 | packets. Some example output: 5 | 6 | Terminal 1, UNIX Socket Server: 7 | 8 | ``` 9 | $ nc -lU /var/tmp/unixsocket 10 | # receive from Client 11 | Hello, world 12 | 123abc 13 | ``` 14 | 15 | Terminal 2, UNIX socket Client: 16 | 17 | ``` 18 | $ nc -U /var/tmp/unixsocket 19 | # Input some lines 20 | Hello, world 21 | 123abc 22 | ``` 23 | 24 | Terminal 3, receive tracing: 25 | 26 | ``` 27 | $ sudo ./undump.bt 28 | Attaching 3 probes... 29 | Dump UNIX socket packages RX.
Ctrl-C to end 30 | TIME COMM PID SIZE DATA 31 | 20:40:11 nc 139071 13 Hello, world\x0a 32 | 20:40:14 nc 139071 7 123abc\x0a 33 | ^C 34 | ``` 35 | 36 | -------------------------------------------------------------------------------- /tools/vfscount.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * vfscount Count VFS calls ("vfs_*"). 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * Written as a basic example of counting kernel functions. 7 | * 8 | * USAGE: vfscount.bt 9 | * 10 | * This is a bpftrace version of the bcc tool of the same name. 11 | * 12 | * Copyright 2018 Netflix, Inc. 13 | * Licensed under the Apache License, Version 2.0 (the "License") 14 | * 15 | * 06-Sep-2018 Brendan Gregg Created this. 16 | */ 17 | 18 | BEGIN 19 | { 20 | printf("Tracing VFS calls... Hit Ctrl-C to end.\n"); 21 | 22 | } 23 | 24 | kprobe:vfs_* 25 | { 26 | @[func] = count(); 27 | } 28 | -------------------------------------------------------------------------------- /tools/vfscount_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of vfscount, the Linux bpftrace/eBPF version. 2 | 3 | 4 | Tracing all VFS calls: 5 | 6 | # ./vfscount.bt 7 | Attaching 54 probes... 8 | cannot attach kprobe, Invalid argument 9 | Warning: could not attach probe kprobe:vfs_dedupe_get_page.isra.21, skipping. 10 | Tracing VFS calls... Hit Ctrl-C to end. 11 | ^C 12 | 13 | @[vfs_fsync_range]: 4 14 | @[vfs_readlink]: 14 15 | @[vfs_statfs]: 56 16 | @[vfs_lock_file]: 60 17 | @[vfs_write]: 276 18 | @[vfs_statx]: 328 19 | @[vfs_statx_fd]: 394 20 | @[vfs_open]: 541 21 | @[vfs_getattr]: 595 22 | @[vfs_getattr_nosec]: 597 23 | @[vfs_read]: 1113 24 | 25 | While tracing, the vfs_read() call was the most frequent, occurring 1,113 times. 26 | 27 | VFS is the Virtual File System: a kernel abstraction for file systems and other 28 | resources that expose a file system interface. 
Much of VFS maps directly to the 29 | syscall interface. Tracing VFS calls gives you a high level breakdown of the 30 | kernel workload, and starting points for further investigation. 31 | 32 | Note that a warning was printed: "Warning: could not attach probe 33 | kprobe:vfs_dedupe_get_page.isra.21": these are not currently instrumentable by 34 | bpftrace/kprobes, so a warning is printed to let you know that they will be 35 | missed. 36 | 37 | 38 | There is another version of this tool in bcc: https://github.com/iovisor/bcc 39 | -------------------------------------------------------------------------------- /tools/vfsstat.bt: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bpftrace 2 | /* 3 | * vfsstat Count some VFS calls, with per-second summaries. 4 | * For Linux, uses bpftrace and eBPF. 5 | * 6 | * Written as a basic example of counting multiple events and printing a 7 | * per-second summary. 8 | * 9 | * USAGE: vfsstat.bt 10 | * 11 | * This is a bpftrace version of the bcc tool of the same name. 12 | * 13 | * Copyright 2018 Netflix, Inc. 14 | * Licensed under the Apache License, Version 2.0 (the "License") 15 | * 16 | * 06-Sep-2018 Brendan Gregg Created this. 17 | */ 18 | 19 | BEGIN 20 | { 21 | printf("Tracing key VFS calls... Hit Ctrl-C to end.\n"); 22 | 23 | } 24 | 25 | kprobe:vfs_read*, 26 | kprobe:vfs_write*, 27 | kprobe:vfs_fsync, 28 | kprobe:vfs_open, 29 | kprobe:vfs_create 30 | { 31 | @[func] = count(); 32 | } 33 | 34 | interval:s:1 35 | { 36 | time(); 37 | print(@); 38 | clear(@); 39 | } 40 | 41 | END 42 | { 43 | clear(@); 44 | } 45 | -------------------------------------------------------------------------------- /tools/vfsstat_example.txt: -------------------------------------------------------------------------------- 1 | Demonstrations of vfsstat, the Linux bpftrace/eBPF version. 2 | 3 | 4 | This traces some common VFS calls (see the script for the list) and prints 5 | per-second summaries. 
6 |
7 | # ./vfsstat.bt
8 | Attaching 8 probes...
9 | Tracing key VFS calls... Hit Ctrl-C to end.
10 | 21:30:38
11 | @[vfs_write]: 1274
12 | @[vfs_open]: 8675
13 | @[vfs_read]: 11515
14 |
15 | 21:30:39
16 | @[vfs_write]: 1155
17 | @[vfs_open]: 8077
18 | @[vfs_read]: 10398
19 |
20 | 21:30:40
21 | @[vfs_write]: 1222
22 | @[vfs_open]: 8554
23 | @[vfs_read]: 11011
24 |
25 | 21:30:41
26 | @[vfs_write]: 1230
27 | @[vfs_open]: 8605
28 | @[vfs_read]: 11077
29 |
30 | 21:30:42
31 | @[vfs_write]: 1229
32 | @[vfs_open]: 8591
33 | @[vfs_read]: 11061
34 |
35 | ^C
36 |
37 | Each second, a timestamp is printed ("HH:MM:SS") followed by common VFS
38 | functions and the number of calls for that second. While tracing, the vfs_read()
39 | kernel function was most frequent, occurring over 10,000 times per second.
40 |
41 |
42 | There is another version of this tool in bcc: https://github.com/iovisor/bcc
43 | The bcc version provides command line options.
44 |
--------------------------------------------------------------------------------
/tools/writeback.bt:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bpftrace
2 | /*
3 |  * writeback Trace file system writeback events with details.
4 |  * For Linux, uses bpftrace and eBPF.
5 |  *
6 |  * This traces when file system dirtied pages are flushed to disk by kernel
7 |  * writeback, and prints details including when the event occurred, and the
8 |  * duration of the event. This can be useful for correlating these times with
9 |  * other performance problems, and if there is a match, it would be a clue
10 |  * that the problem may be caused by writeback. How quickly the kernel does
11 |  * writeback can be tuned: see the kernel docs, eg,
12 |  * vm.dirty_writeback_centisecs.
13 |  *
14 |  * USAGE: writeback.bt
15 |  *
16 |  * Copyright 2018 Netflix, Inc.
17 |  * Licensed under the Apache License, Version 2.0 (the "License")
18 |  *
19 |  * 14-Sep-2018 Brendan Gregg Created this.
20 |  */
21 |
22 | BEGIN
23 | {
24 |     printf("Tracing writeback... Hit Ctrl-C to end.\n");
25 |     printf("%-9s %-8s %-8s %-16s %s\n", "TIME", "DEVICE", "PAGES",
26 |         "REASON", "ms");
27 |
28 |     // see /sys/kernel/debug/tracing/events/writeback/writeback_start/format
29 |     @reason[0] = "background";
30 |     @reason[1] = "vmscan";
31 |     @reason[2] = "sync";
32 |     @reason[3] = "periodic";
33 |     @reason[4] = "laptop_timer";
34 |     @reason[5] = "free_more_memory";
35 |     @reason[6] = "fs_free_space";
36 |     @reason[7] = "forker_thread";
37 | }
38 |
39 | tracepoint:writeback:writeback_start
40 | {
41 |     @start[args.sb_dev] = nsecs;
42 | }
43 |
44 | tracepoint:writeback:writeback_written
45 | {
46 |     $sb_dev = args.sb_dev;
47 |     $s = @start[$sb_dev];
48 |     delete(@start[$sb_dev]);
49 |     $lat = $s ? (nsecs - $s) / 1000 : 0;
50 |
51 |     time("%H:%M:%S ");
52 |     printf("%-8s %-8d %-16s %d.%03d\n", args.name,
53 |         args.nr_pages & 0xffff, // TODO: explain these bitmasks
54 |         @reason[args.reason & 0xffffffff],
55 |         $lat / 1000, $lat % 1000);
56 | }
57 |
58 | END
59 | {
60 |     clear(@reason);
61 |     clear(@start);
62 | }
63 |
--------------------------------------------------------------------------------
/tools/writeback_example.txt:
--------------------------------------------------------------------------------
1 | Demonstrations of writeback, the Linux bpftrace/eBPF version.
2 |
3 |
4 | This tool traces when the kernel writeback procedure is writing dirtied pages
5 | to disk, and shows details such as the time, device numbers, reason for the
6 | write back, and the duration. For example:
7 |
8 | # ./writeback.bt
9 | Attaching 4 probes...
10 | Tracing writeback... Hit Ctrl-C to end.
11 | TIME      DEVICE   PAGES    REASON           ms
12 | 23:28:47  259:1    15791    periodic         0.005
13 | 23:28:48  259:0    15792    periodic         0.004
14 | 23:28:52  259:1    15784    periodic         0.003
15 | 23:28:53  259:0    18682    periodic         0.003
16 | 23:28:55  259:0    41970    background       326.663
17 | 23:28:56  259:0    18418    background       332.689
18 | 23:28:56  259:0    60402    background       362.446
19 | 23:28:57  259:1    18230    periodic         0.005
20 | 23:28:57  259:1    65492    background       3.343
21 | 23:28:57  259:1    65492    background       0.002
22 | 23:28:58  259:0    36850    background       0.000
23 | 23:28:58  259:0    13298    background       597.198
24 | 23:28:58  259:0    55282    background       322.050
25 | 23:28:59  259:0    31730    background       336.031
26 | 23:28:59  259:0    8178     background       357.119
27 | 23:29:01  259:0    50162    background       1803.146
28 | 23:29:02  259:0    27634    background       1311.876
29 | 23:29:03  259:0    6130     background       331.599
30 | 23:29:03  259:0    50162    background       293.968
31 | 23:29:03  259:0    28658    background       284.946
32 | 23:29:03  259:0    7154     background       286.572
33 | [...]
34 |
35 | By looking at the timestamps and latency, it can be seen that the system was
36 | not spending much time in writeback until 23:28:55, when "background"
37 | writeback began, taking over 300 milliseconds per flush.
38 |
39 | If timestamps of heavy writeback coincide with times when applications suffered
40 | performance issues, that would be a clue that they are correlated and there
41 | is contention for the disk devices. There are various ways to tune this:
42 | eg, vm.dirty_writeback_centisecs.
43 |
--------------------------------------------------------------------------------
/tools/xfsdist.bt:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bpftrace
2 | /*
3 |  * xfsdist Summarize XFS operation latency.
4 |  * For Linux, uses bpftrace and eBPF.
5 |  *
6 |  * This traces four common file system calls: read, write, open, and fsync.
7 |  * It can be customized to trace more if desired.
8 |  *
9 |  * USAGE: xfsdist.bt
10 |  *
11 |  * This is a bpftrace version of the bcc tool of the same name.
12 |  *
13 |  * Copyright 2018 Netflix, Inc.
14 |  * Licensed under the Apache License, Version 2.0 (the "License")
15 |  *
16 |  * 08-Sep-2018 Brendan Gregg Created this.
17 |  */
18 |
19 | BEGIN
20 | {
21 |     printf("Tracing XFS operation latency... Hit Ctrl-C to end.\n");
22 | }
23 |
24 | kprobe:xfs_file_read_iter,
25 | kprobe:xfs_file_write_iter,
26 | kprobe:xfs_file_open,
27 | kprobe:xfs_file_fsync
28 | {
29 |     @start[tid] = nsecs;
30 |     @name[tid] = func;
31 | }
32 |
33 | kretprobe:xfs_file_read_iter,
34 | kretprobe:xfs_file_write_iter,
35 | kretprobe:xfs_file_open,
36 | kretprobe:xfs_file_fsync
37 | /@start[tid]/
38 | {
39 |     @us[@name[tid]] = hist((nsecs - @start[tid]) / 1000);
40 |     delete(@start[tid]);
41 |     delete(@name[tid]);
42 | }
43 |
44 | END
45 | {
46 |     clear(@start);
47 |     clear(@name);
48 | }
49 |
--------------------------------------------------------------------------------
/tools/xfsdist_example.txt:
--------------------------------------------------------------------------------
1 | Demonstrations of xfsdist, the Linux bpftrace/eBPF version.
2 |
3 |
4 | xfsdist traces XFS reads, writes, opens, and fsyncs, and summarizes their
5 | latency as a power-of-2 histogram. For example:
6 |
7 | # xfsdist.bt
8 | Attaching 9 probes...
9 | Tracing XFS operation latency... Hit Ctrl-C to end.
10 | ^C
11 |
12 | @us[xfs_file_write_iter]:
13 | [8, 16)                1 |@@@@@@@@@@@@@@@@@@@@@@@@@@                          |
14 | [16, 32)               2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
15 |
16 | @us[xfs_file_read_iter]:
17 | [1]                  724 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
18 | [2, 4)               137 |@@@@@@@@@                                           |
19 | [4, 8)               143 |@@@@@@@@@@                                          |
20 | [8, 16)               37 |@@                                                  |
21 | [16, 32)              11 |                                                    |
22 | [32, 64)              22 |@                                                   |
23 | [64, 128)              7 |                                                    |
24 | [128, 256)             0 |                                                    |
25 | [256, 512)           485 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                  |
26 | [512, 1K)            149 |@@@@@@@@@@                                          |
27 | [1K, 2K)              98 |@@@@@@@                                             |
28 | [2K, 4K)              85 |@@@@@@                                              |
29 | [4K, 8K)              27 |@                                                   |
30 | [8K, 16K)             29 |@@                                                  |
31 | [16K, 32K)            25 |@                                                   |
32 | [32K, 64K)             1 |                                                    |
33 | [64K, 128K)            0 |                                                    |
34 | [128K, 256K)           6 |                                                    |
35 |
36 | @us[xfs_file_open]:
37 | [1]                 1819 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
38 | [2, 4)               272 |@@@@@@@                                             |
39 | [4, 8)                 0 |                                                    |
40 | [8, 16)                9 |                                                    |
41 | [16, 32)               7 |                                                    |
42 |
43 | This output shows a bi-modal distribution for read latency, with a faster
44 | mode of 724 reads that took between 0 and 1 microseconds, and a slower
45 | mode of over 485 reads that took between 256 and 512 microseconds. It's
46 | likely that the faster mode was a hit from the in-memory file system cache,
47 | and the slower mode is a read from a storage device (disk).
48 |
49 | This "latency" is measured from when the operation was issued from the VFS
50 | interface to the file system, to when it completed. This spans everything:
51 | block device I/O (disk I/O), file system CPU cycles, file system locks, run
52 | queue latency, etc. This is a better measure of the latency suffered by
53 | applications reading from the file system than measuring this down at the
54 | block device interface.
55 |
56 | Note that this only traces the common file system operations previously
57 | listed: other file system operations (eg, inode operations including
58 | getattr()) are not traced.
59 |
60 |
61 | There is another version of this tool in bcc: https://github.com/iovisor/bcc
62 | The bcc version provides command line options to customize the output.
63 |
--------------------------------------------------------------------------------
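
The latency arithmetic used by xfsdist.bt and writeback.bt can be sketched in Python. This is an illustrative sketch only and is not part of the repository: the function names `elapsed_us`, `log2_bucket`, and `format_ms` are hypothetical, and the bucketing assumes that hist() keeps 0 and 1 in their own buckets and groups larger values into power-of-2 ranges, as the "[1]" and "[2, 4)" rows in the xfsdist output suggest.

```python
from __future__ import annotations


def elapsed_us(start_ns: int, end_ns: int) -> int:
    """Mirror `(nsecs - @start[tid]) / 1000`: integer nanoseconds to microseconds."""
    return (end_ns - start_ns) // 1000


def log2_bucket(value: int) -> tuple[int, int]:
    """Return the half-open [low, high) power-of-2 range that holds `value`.

    Assumption: 0 and 1 each get their own bucket, matching the "[1]" rows
    in the xfsdist output. Negative values are not handled in this sketch.
    """
    if value < 0:
        raise ValueError("negative values are outside this sketch")
    if value == 0:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)  # largest power of 2 that is <= value
    return (low, low * 2)


def format_ms(latency_us: int) -> str:
    """Mirror writeback.bt's `%d.%03d` applied to `$lat / 1000, $lat % 1000`."""
    return f"{latency_us // 1000}.{latency_us % 1000:03d}"


if __name__ == "__main__":
    us = elapsed_us(1_000_000, 1_300_500)  # a read that took 300.5 microseconds
    print(us, log2_bucket(us))             # which histogram row it would land in
    print(format_ms(326_663))              # same layout as the "ms" column
```

Under these assumptions, a read of about 300 microseconds falls in the [256, 512) row of the xfsdist histogram, and a writeback latency of 326,663 microseconds is printed as 326.663 in the "ms" column.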