├── .git-blame-ignore-revs ├── .github ├── ISSUE_TEMPLATE │ ├── bug-report.yml │ ├── config.yml │ ├── feature-request.yml │ └── question.yml └── workflows │ ├── ci.yml │ ├── macos.yml │ ├── nightly-ci.yml │ └── windows.yml ├── .gitignore ├── .pre-commit-config.yaml ├── .readthedocs.yml ├── LICENSE ├── README.md ├── ailment ├── __init__.py ├── block.py ├── block_walker.py ├── constant.py ├── converter_common.py ├── converter_pcode.py ├── converter_vex.py ├── expression.py ├── manager.py ├── py.typed ├── statement.py ├── tagged_object.py └── utils.py ├── docs ├── Makefile ├── api.rst ├── conf.py ├── index.rst ├── make.bat └── readme.rst ├── pyproject.toml └── tests ├── test_expression.py └── test_irsb.py /.git-blame-ignore-revs: -------------------------------------------------------------------------------- 1 | # Black + pre-commit 2 | a5ba0e3f88b7d258a849dbb2647aad46122bd815 # Black 3 | 35c4bfb0def7d63d9ded4d9f5314f728a0fe5d40 # Trailing whitespace, pyupgrade, prefer builtin constructors 4 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug-report.yml: -------------------------------------------------------------------------------- 1 | name: Report a bug 2 | description: Report a bug in ailment 3 | labels: [bug,needs-triage] 4 | body: 5 | - type: markdown 6 | attributes: 7 | value: | 8 | Thank you for taking the time to submit this bug report! 9 | 10 | Before submitting this bug report, please check the following, which may resolve your issue: 11 | * Have you checked that you are running the latest versions of angr and its components? angr is rapidly-evolving! 12 | * Have you [searched existing issues](https://github.com/angr/ailment/issues?q=is%3Aopen+is%3Aissue+label%3Abug) to see if this bug has been reported before? 13 | * Have you checked the [documentation](https://docs.angr.io/)? 14 | * Have you checked the [FAQ](https://docs.angr.io/introductory-errata/faq)? 
15 | 16 | **Important:** If this bug is a security vulnerability, please submit it privately. See our [security policy](https://github.com/angr/angr/blob/master/SECURITY.md) for more details. 17 | 18 | Please note: The angr suite is maintained by a small team. While we cannot guarantee any timeliness for fixes and enhancements, we will do our best. For more real-time help with angr, from us and the community, join our [Slack](https://angr.io/invite/). 19 | 20 | - type: textarea 21 | attributes: 22 | label: Description 23 | description: Brief description of the bug, with any relevant log messages. 24 | validations: 25 | required: true 26 | 27 | - type: textarea 28 | attributes: 29 | label: Steps to reproduce the bug 30 | description: | 31 | If appropriate, include both a **script to reproduce the bug**, and if possible **attach the binary used**. 32 | 33 | **Tip:** You can attach files to the issue by first clicking on the textarea to select it, then dragging & dropping the file onto the textarea. 34 | - type: textarea 35 | attributes: 36 | label: Environment 37 | description: Many common issues are caused by problems with the local Python environment. Before submitting, double-check that your versions of all modules in the angr suite (angr, cle, pyvex, ...) are up to date and include the output of `python -m angr.misc.bug_report` here. 38 | 39 | - type: textarea 40 | attributes: 41 | label: Additional context 42 | description: Any additional context about the problem. 
43 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/config.yml: -------------------------------------------------------------------------------- 1 | blank_issues_enabled: false 2 | contact_links: 3 | - name: Join our Slack community 4 | url: https://angr.io/invite/ 5 | about: For questions and help with angr, you are invited to join the angr Slack community 6 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature-request.yml: -------------------------------------------------------------------------------- 1 | name: Request a feature 2 | description: Request a new feature for ailment 3 | labels: [enhancement,needs-triage] 4 | body: 5 | - type: markdown 6 | attributes: 7 | value: | 8 | Thank you for taking the time to submit this feature request! 9 | 10 | Before submitting this feature request, please check the following: 11 | * Have you checked that you are running the latest versions of angr and its components? angr is rapidly-evolving! 12 | * Have you checked the [documentation](https://docs.angr.io/) to see if this feature exists already? 13 | * Have you [searched existing issues](https://github.com/angr/ailment/issues?q=is%3Aissue+label%3Aenhancement+) to see if this feature has been requested before? 14 | 15 | Please note: The angr suite is maintained by a small team. While we cannot guarantee any timeliness for fixes and enhancements, we will do our best. For more real-time help with angr, from us and the community, join our [Slack](https://angr.io/invite/). 16 | 17 | - type: textarea 18 | attributes: 19 | label: Description 20 | description: | 21 | Brief description of the desired feature. If the feature is intended to solve some problem, please clearly describe the problem, including any relevant binaries, etc. 
22 | 23 | **Tip:** You can attach files to the issue by first clicking on the textarea to select it, then dragging & dropping the file onto the textarea. 24 | validations: 25 | required: true 26 | 27 | - type: textarea 28 | attributes: 29 | label: Alternatives 30 | description: Possible alternative solutions or features that you have considered. 31 | 32 | - type: textarea 33 | attributes: 34 | label: Additional context 35 | description: Any other context or screenshots about the feature request. 36 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/question.yml: -------------------------------------------------------------------------------- 1 | name: Ask a question 2 | description: Ask a question about ailment 3 | labels: [question,needs-triage] 4 | body: 5 | - type: markdown 6 | attributes: 7 | value: | 8 | If you have a question about ailment that is not a bug report or a feature request, you can ask it here. For more real-time help with ailment, from us and the community, join our [Slack](https://angr.io/invite/). 9 | 10 | Before submitting this question, please check the following, which may answer your question: 11 | * Have you checked the [documentation](https://docs.angr.io/)? 12 | * Have you checked the [FAQ](https://docs.angr.io/introductory-errata/faq)? 13 | * Have you checked our library of [examples](https://github.com/angr/angr-doc/tree/master/examples)? 14 | * Have you [searched existing issues](https://github.com/angr/ailment/issues?q=is%3Aissue+label%3Aquestion) to see if this question has been answered before? 15 | * Have you checked that you are running the latest versions of angr and its components? angr is rapidly-evolving! 16 | 17 | Please note: The angr suite is maintained by a small team. While we cannot guarantee any timeliness for fixes and enhancements, we will do our best. 
18 | 19 | - type: textarea 20 | attributes: 21 | label: Question 22 | description: 23 | validations: 24 | required: true 25 | -------------------------------------------------------------------------------- /.github/workflows/ci.yml: -------------------------------------------------------------------------------- 1 | name: CI 2 | 3 | on: 4 | push: 5 | branches: 6 | - master 7 | pull_request: 8 | workflow_dispatch: 9 | 10 | jobs: 11 | ci: 12 | uses: angr/ci-settings/.github/workflows/angr-ci.yml@master 13 | windows: 14 | uses: ./.github/workflows/windows.yml 15 | macos: 16 | uses: ./.github/workflows/macos.yml 17 | 18 | -------------------------------------------------------------------------------- /.github/workflows/macos.yml: -------------------------------------------------------------------------------- 1 | name: Test on macOS 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | jobs: 8 | macos: 9 | name: Test macOS 10 | runs-on: macos-15 11 | steps: 12 | - uses: actions/checkout@v3 13 | - uses: actions/setup-python@v4 14 | with: 15 | python-version: "3.10" 16 | - run: python -m venv $HOME/venv 17 | name: Create venv 18 | shell: bash 19 | - run: | 20 | source $HOME/venv/bin/activate 21 | pip install .[testing] 22 | name: Install 23 | - run: | 24 | source $HOME/venv/bin/activate 25 | pytest -n auto 26 | name: Run pytest 27 | -------------------------------------------------------------------------------- /.github/workflows/nightly-ci.yml: -------------------------------------------------------------------------------- 1 | name: Nightly CI 2 | 3 | on: 4 | schedule: 5 | - cron: "0 0 * * *" 6 | workflow_dispatch: 7 | 8 | jobs: 9 | ci: 10 | uses: angr/ci-settings/.github/workflows/angr-ci.yml@master 11 | with: 12 | nightly: true 13 | secrets: inherit 14 | -------------------------------------------------------------------------------- /.github/workflows/windows.yml: -------------------------------------------------------------------------------- 1 | name: 
Test on Windows 2 | 3 | on: 4 | workflow_dispatch: 5 | workflow_call: 6 | 7 | jobs: 8 | windows: 9 | name: Test Windows 10 | runs-on: windows-2022 11 | steps: 12 | - uses: actions/checkout@v3 13 | - uses: actions/setup-python@v4 14 | with: 15 | python-version: "3.10" 16 | - run: python -m venv $HOME/venv 17 | name: Create venv 18 | shell: bash 19 | - run: | 20 | call %USERPROFILE%\venv\Scripts\activate 21 | pip install .[testing] 22 | name: Install 23 | shell: cmd 24 | - run: | 25 | call %USERPROFILE%\venv\Scripts\activate 26 | pytest -n auto 27 | name: Run pytest 28 | shell: cmd 29 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .idea 2 | *.pyc 3 | *.egg-info 4 | dist 5 | build 6 | docs/_build 7 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | repos: 2 | 3 | # 4 | # Fail fast 5 | # 6 | 7 | - repo: https://github.com/abravalheri/validate-pyproject 8 | rev: v0.24.1 9 | hooks: 10 | - id: validate-pyproject 11 | fail_fast: true 12 | 13 | - repo: https://github.com/pre-commit/pre-commit-hooks 14 | rev: v5.0.0 15 | hooks: 16 | # General 17 | - id: check-merge-conflict 18 | fail_fast: true 19 | - id: check-case-conflict 20 | fail_fast: true 21 | - id: destroyed-symlinks 22 | fail_fast: true 23 | - id: check-symlinks 24 | fail_fast: true 25 | - id: check-added-large-files 26 | fail_fast: true 27 | # Syntax 28 | - id: check-toml 29 | fail_fast: true 30 | - id: check-json 31 | fail_fast: true 32 | - id: check-yaml 33 | fail_fast: true 34 | 35 | - repo: https://github.com/pre-commit/pre-commit-hooks 36 | rev: v5.0.0 37 | hooks: 38 | - id: check-ast 39 | fail_fast: true 40 | 41 | # 42 | # Modifiers 43 | # 44 | 45 | - repo: https://github.com/pre-commit/pre-commit-hooks 46 | rev: v5.0.0 47 | hooks: 48 | 
- id: mixed-line-ending 49 | - id: trailing-whitespace 50 | 51 | - repo: https://github.com/dannysepler/rm_unneeded_f_str 52 | rev: v0.2.0 53 | hooks: 54 | - id: rm-unneeded-f-str 55 | 56 | - repo: https://github.com/asottile/pyupgrade 57 | rev: v3.20.0 58 | hooks: 59 | - id: pyupgrade 60 | args: [--py310-plus] 61 | 62 | - repo: https://github.com/astral-sh/ruff-pre-commit 63 | rev: v0.11.11 64 | hooks: 65 | - id: ruff 66 | args: [--fix, --exit-non-zero-on-fix] 67 | 68 | # Last modifier: Coding Standard 69 | - repo: https://github.com/psf/black 70 | rev: 25.1.0 71 | hooks: 72 | - id: black 73 | 74 | # 75 | # Static Checks 76 | # 77 | 78 | - repo: https://github.com/pre-commit/pygrep-hooks 79 | rev: v1.10.0 80 | hooks: 81 | # Python 82 | - id: python-use-type-annotations 83 | - id: python-no-log-warn 84 | # Documentation 85 | - id: rst-backticks 86 | - id: rst-directive-colons 87 | - id: rst-inline-touching-normal 88 | 89 | - repo: https://github.com/pre-commit/pre-commit-hooks 90 | rev: v5.0.0 91 | hooks: 92 | - id: debug-statements 93 | - id: check-builtin-literals 94 | - id: check-docstring-first 95 | -------------------------------------------------------------------------------- /.readthedocs.yml: -------------------------------------------------------------------------------- 1 | # Read the Docs configuration file 2 | # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details 3 | 4 | version: 2 5 | sphinx: 6 | configuration: docs/conf.py 7 | build: 8 | os: ubuntu-22.04 9 | tools: 10 | python: "3.10" 11 | 12 | python: 13 | install: 14 | - method: pip 15 | path: . 16 | extra_requirements: 17 | - docs 18 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015, The Regents of the University of California 2 | All rights reserved. 
3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | * Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 9 | 10 | * Redistributions in binary form must reproduce the above copyright notice, 11 | this list of conditions and the following disclaimer in the documentation 12 | and/or other materials provided with the distribution. 13 | 14 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 15 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 16 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 17 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 18 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 19 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 20 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 21 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 22 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 23 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 24 | 25 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | **Archival:** ailment has been [merged](https://github.com/angr/angr/pull/5501) into angr. 
2 | 3 | --- 4 | 5 | # AILment 6 | [![Latest Release](https://img.shields.io/pypi/v/ailment.svg)](https://pypi.python.org/pypi/ailment/) 7 | [![Python Version](https://img.shields.io/pypi/pyversions/ailment)](https://pypi.python.org/pypi/ailment/) 8 | [![PyPI Statistics](https://img.shields.io/pypi/dm/ailment.svg)](https://pypistats.org/packages/ailment) 9 | [![License](https://img.shields.io/github/license/angr/ailment.svg)](https://github.com/angr/ailment/blob/master/LICENSE) 10 | 11 | AIL is the angr intermediate language. 12 | 13 | ## Project Links 14 | Project repository: https://github.com/angr/ailment 15 | 16 | Documentation: https://api.angr.io/projects/ailment/en/latest/ 17 | -------------------------------------------------------------------------------- /ailment/__init__.py: -------------------------------------------------------------------------------- 1 | __version__ = "9.2.159.dev0" 2 | 3 | import logging 4 | 5 | from .block import Block 6 | from . import statement 7 | from . 
import expression 8 | from .statement import Assignment, Statement 9 | from .expression import Expression, Const, Tmp, Register, UnaryOp, BinaryOp 10 | from .converter_common import Converter 11 | from .manager import Manager 12 | from .block_walker import AILBlockWalker, AILBlockWalkerBase 13 | 14 | log = logging.getLogger(__name__) 15 | 16 | # REALLY BAD 17 | Expr = expression 18 | Stmt = statement 19 | 20 | available_converters: set[str] = set() 21 | 22 | try: 23 | from .converter_vex import VEXIRSBConverter 24 | import pyvex 25 | 26 | available_converters.add("vex") 27 | except ImportError as e: 28 | log.debug("Could not import VEXIRSBConverter") 29 | log.debug(e) 30 | VEXIRSBConverter = None 31 | 32 | try: 33 | from .converter_pcode import PCodeIRSBConverter 34 | from angr.engines import pcode 35 | 36 | available_converters.add("pcode") 37 | except ImportError as e: 38 | log.debug("Could not import PCodeIRSBConverter") 39 | log.debug(e) 40 | PCodeIRSBConverter = None 41 | 42 | 43 | class IRSBConverter(Converter): 44 | @staticmethod 45 | def convert(irsb, manager): # pylint:disable=arguments-differ 46 | """ 47 | Convert the given IRSB to an AIL block 48 | 49 | :param irsb: The IRSB to convert 50 | :param manager: The manager to use 51 | :return: Returns the converted block 52 | """ 53 | 54 | if "pcode" in available_converters and isinstance(irsb, pcode.lifter.IRSB): 55 | return PCodeIRSBConverter.convert(irsb, manager) 56 | elif "vex" in available_converters and isinstance(irsb, pyvex.IRSB): 57 | return VEXIRSBConverter.convert(irsb, manager) 58 | else: 59 | raise ValueError("No converter available for %s" % type(irsb)) 60 | 61 | 62 | __all__ = [ 63 | "available_converters", 64 | "Block", 65 | "expression", 66 | "statement", 67 | "Stmt", 68 | "Expr", 69 | "Statement", 70 | "Assignment", 71 | "Expression", 72 | "Const", 73 | "Tmp", 74 | "Register", 75 | "UnaryOp", 76 | "BinaryOp", 77 | "Manager", 78 | "IRSBConverter", 79 | "AILBlockWalkerBase", 80 | 
"AILBlockWalker", 81 | "PCodeIRSBConverter", 82 | "VEXIRSBConverter", 83 | ] 84 | -------------------------------------------------------------------------------- /ailment/block.py: -------------------------------------------------------------------------------- 1 | from typing import TYPE_CHECKING 2 | 3 | if TYPE_CHECKING: 4 | from .statement import Statement 5 | 6 | 7 | class Block: 8 | """ 9 | Describes an AIL block. 10 | """ 11 | 12 | __slots__ = ( 13 | "addr", 14 | "original_size", 15 | "statements", 16 | "idx", 17 | "_hash", 18 | ) 19 | 20 | def __init__(self, addr: int, original_size, statements=None, idx=None): 21 | self.addr = addr 22 | self.original_size = original_size 23 | self.statements: list["Statement"] = [] if statements is None else statements 24 | self.idx = idx 25 | self._hash = None # cached hash value 26 | 27 | def copy(self, statements=None): 28 | return Block( 29 | addr=self.addr, 30 | original_size=self.original_size, 31 | statements=self.statements[::] if statements is None else statements, 32 | idx=self.idx, 33 | ) 34 | 35 | def __repr__(self): 36 | if self.idx is None: 37 | return "" % (self.addr, len(self.statements)) 38 | else: 39 | return "" % (self.addr, self.idx, len(self.statements)) 40 | 41 | def dbg_repr(self, indent=0): 42 | indent_str = " " * indent 43 | if self.idx is None: 44 | block_str = f"{indent_str}## Block {self.addr:x}\n" 45 | else: 46 | block_str = "%s## Block %x.%d\n" % (indent_str, self.addr, self.idx) 47 | stmts_str = "\n".join( 48 | [ 49 | ("%s%02d | %s | " % (indent_str, i, hex(getattr(stmt, "ins_addr", 0)))) + str(stmt) 50 | for i, stmt in enumerate(self.statements) 51 | ] 52 | ) 53 | block_str += stmts_str + "\n" 54 | return block_str 55 | 56 | def __str__(self): 57 | return self.dbg_repr() 58 | 59 | def __eq__(self, other): 60 | return ( 61 | type(other) is Block 62 | and self.addr == other.addr 63 | and self.statements == other.statements 64 | and self.idx == other.idx 65 | ) 66 | 67 | def likes(self, other): 
68 | return ( 69 | type(other) is Block 70 | and len(self.statements) == len(other.statements) 71 | and all(s1.likes(s2) for s1, s2 in zip(self.statements, other.statements)) 72 | ) 73 | 74 | def clear_hash(self): 75 | self._hash = None 76 | 77 | def __hash__(self): 78 | # Changing statements does not change the hash of a block, which allows in-place statement editing 79 | if self._hash is None: 80 | self._hash = hash((Block, self.addr, self.idx)) 81 | return self._hash 82 | -------------------------------------------------------------------------------- /ailment/block_walker.py: -------------------------------------------------------------------------------- 1 | # pylint:disable=unused-argument,no-self-use 2 | # pyright: reportIncompatibleMethodOverride=false 3 | from typing import Any 4 | from collections.abc import Callable 5 | 6 | from . import Block 7 | from .statement import ( 8 | Call, 9 | CAS, 10 | Statement, 11 | ConditionalJump, 12 | Assignment, 13 | Store, 14 | Return, 15 | Jump, 16 | DirtyStatement, 17 | WeakAssignment, 18 | ) 19 | from .expression import ( 20 | Load, 21 | Expression, 22 | BinaryOp, 23 | UnaryOp, 24 | Convert, 25 | ITE, 26 | DirtyExpression, 27 | VEXCCallExpression, 28 | Tmp, 29 | Register, 30 | Const, 31 | Reinterpret, 32 | MultiStatementExpression, 33 | VirtualVariable, 34 | Phi, 35 | ) 36 | 37 | 38 | class AILBlockWalkerBase: 39 | """ 40 | Walks all statements and expressions of an AIL node and does nothing. 
41 | """ 42 | 43 | def __init__(self, stmt_handlers=None, expr_handlers=None): 44 | _default_stmt_handlers = { 45 | Assignment: self._handle_Assignment, 46 | WeakAssignment: self._handle_WeakAssignment, 47 | CAS: self._handle_CAS, 48 | Call: self._handle_Call, 49 | Store: self._handle_Store, 50 | ConditionalJump: self._handle_ConditionalJump, 51 | Jump: self._handle_Jump, 52 | Return: self._handle_Return, 53 | DirtyStatement: self._handle_DirtyStatement, 54 | } 55 | 56 | _default_expr_handlers = { 57 | Call: self._handle_CallExpr, 58 | Load: self._handle_Load, 59 | BinaryOp: self._handle_BinaryOp, 60 | UnaryOp: self._handle_UnaryOp, 61 | Convert: self._handle_Convert, 62 | ITE: self._handle_ITE, 63 | DirtyExpression: self._handle_DirtyExpression, 64 | VEXCCallExpression: self._handle_VEXCCallExpression, 65 | Tmp: self._handle_Tmp, 66 | Register: self._handle_Register, 67 | Reinterpret: self._handle_Reinterpret, 68 | Const: self._handle_Const, 69 | MultiStatementExpression: self._handle_MultiStatementExpression, 70 | VirtualVariable: self._handle_VirtualVariable, 71 | Phi: self._handle_Phi, 72 | } 73 | 74 | self.stmt_handlers: dict[type, Callable] = stmt_handlers if stmt_handlers else _default_stmt_handlers 75 | self.expr_handlers: dict[type, Callable] = expr_handlers if expr_handlers else _default_expr_handlers 76 | 77 | def walk(self, block: Block) -> None: 78 | i = 0 79 | while i < len(block.statements): 80 | stmt = block.statements[i] 81 | self._handle_stmt(i, stmt, block) 82 | i += 1 83 | 84 | def walk_statement(self, stmt: Statement, block: Block | None = None): 85 | return self._handle_stmt(0, stmt, block) 86 | 87 | def walk_expression( 88 | self, 89 | expr: Expression, 90 | stmt_idx: int | None = None, 91 | stmt: Statement | None = None, 92 | block: Block | None = None, 93 | ): 94 | return self._handle_expr(0, expr, stmt_idx or 0, stmt, block) 95 | 96 | def _handle_stmt(self, stmt_idx: int, stmt: Statement, block: Block | None) -> Any: 97 | try: 98 | handler 
= self.stmt_handlers[type(stmt)] 99 | except KeyError: 100 | handler = None 101 | 102 | if handler: 103 | return handler(stmt_idx, stmt, block) 104 | return None 105 | 106 | def _handle_expr( 107 | self, expr_idx: int, expr: Expression, stmt_idx: int, stmt: Statement | None, block: Block | None 108 | ) -> Any: 109 | try: 110 | handler = self.expr_handlers[type(expr)] 111 | except KeyError: 112 | handler = None 113 | 114 | if handler: 115 | return handler(expr_idx, expr, stmt_idx, stmt, block) 116 | return None 117 | 118 | # 119 | # Default handlers 120 | # 121 | 122 | def _handle_Assignment(self, stmt_idx: int, stmt: Assignment, block: Block | None): 123 | self._handle_expr(0, stmt.dst, stmt_idx, stmt, block) 124 | self._handle_expr(1, stmt.src, stmt_idx, stmt, block) 125 | 126 | def _handle_WeakAssignment(self, stmt_idx: int, stmt: WeakAssignment, block: Block | None): 127 | self._handle_expr(0, stmt.dst, stmt_idx, stmt, block) 128 | self._handle_expr(1, stmt.src, stmt_idx, stmt, block) 129 | 130 | def _handle_CAS(self, stmt_idx: int, stmt: CAS, block: Block | None): 131 | self._handle_expr(0, stmt.addr, stmt_idx, stmt, block) 132 | self._handle_expr(1, stmt.data_lo, stmt_idx, stmt, block) 133 | if stmt.data_hi is not None: 134 | self._handle_expr(2, stmt.data_hi, stmt_idx, stmt, block) 135 | self._handle_expr(3, stmt.expd_lo, stmt_idx, stmt, block) 136 | if stmt.expd_hi is not None: 137 | self._handle_expr(4, stmt.expd_hi, stmt_idx, stmt, block) 138 | self._handle_expr(5, stmt.old_lo, stmt_idx, stmt, block) 139 | if stmt.old_hi is not None: 140 | self._handle_expr(6, stmt.old_hi, stmt_idx, stmt, block) 141 | 142 | def _handle_Call(self, stmt_idx: int, stmt: Call, block: Block | None): 143 | if not isinstance(stmt.target, str): 144 | self._handle_expr(-1, stmt.target, stmt_idx, stmt, block) 145 | if stmt.args: 146 | for i, arg in enumerate(stmt.args): 147 | self._handle_expr(i, arg, stmt_idx, stmt, block) 148 | 149 | def _handle_Store(self, stmt_idx: int, stmt: 
Store, block: Block | None): 150 | self._handle_expr(0, stmt.addr, stmt_idx, stmt, block) 151 | self._handle_expr(1, stmt.data, stmt_idx, stmt, block) 152 | if stmt.guard is not None: 153 | self._handle_expr(2, stmt.guard, stmt_idx, stmt, block) 154 | 155 | def _handle_Jump(self, stmt_idx: int, stmt: Jump, block: Block | None): 156 | self._handle_expr(0, stmt.target, stmt_idx, stmt, block) 157 | 158 | def _handle_ConditionalJump(self, stmt_idx: int, stmt: ConditionalJump, block: Block | None): 159 | self._handle_expr(0, stmt.condition, stmt_idx, stmt, block) 160 | if stmt.true_target is not None: 161 | self._handle_expr(1, stmt.true_target, stmt_idx, stmt, block) 162 | if stmt.false_target is not None: 163 | self._handle_expr(2, stmt.false_target, stmt_idx, stmt, block) 164 | 165 | def _handle_Return(self, stmt_idx: int, stmt: Return, block: Block | None): 166 | if stmt.ret_exprs: 167 | for i, ret_expr in enumerate(stmt.ret_exprs): 168 | self._handle_expr(i, ret_expr, stmt_idx, stmt, block) 169 | 170 | def _handle_DirtyStatement(self, stmt_idx: int, stmt: DirtyStatement, block: Block | None): 171 | self._handle_expr(0, stmt.dirty, stmt_idx, stmt, block) 172 | 173 | def _handle_Load(self, expr_idx: int, expr: Load, stmt_idx: int, stmt: Statement, block: Block | None): 174 | self._handle_expr(0, expr.addr, stmt_idx, stmt, block) 175 | 176 | def _handle_CallExpr(self, expr_idx: int, expr: Call, stmt_idx: int, stmt: Statement, block: Block | None): 177 | if not isinstance(expr.target, str): 178 | self._handle_expr(-1, expr.target, stmt_idx, stmt, block) 179 | if expr.args: 180 | for i, arg in enumerate(expr.args): 181 | self._handle_expr(i, arg, stmt_idx, stmt, block) 182 | 183 | def _handle_BinaryOp(self, expr_idx: int, expr: BinaryOp, stmt_idx: int, stmt: Statement, block: Block | None): 184 | self._handle_expr(0, expr.operands[0], stmt_idx, stmt, block) 185 | self._handle_expr(1, expr.operands[1], stmt_idx, stmt, block) 186 | 187 | def _handle_UnaryOp(self, 
expr_idx: int, expr: UnaryOp, stmt_idx: int, stmt: Statement, block: Block | None): 188 | self._handle_expr(0, expr.operand, stmt_idx, stmt, block) 189 | 190 | def _handle_Convert(self, expr_idx: int, expr: Convert, stmt_idx: int, stmt: Statement, block: Block | None): 191 | self._handle_expr(expr_idx, expr.operand, stmt_idx, stmt, block) 192 | 193 | def _handle_Reinterpret( 194 | self, expr_idx: int, expr: Reinterpret, stmt_idx: int, stmt: Statement, block: Block | None 195 | ): 196 | self._handle_expr(expr_idx, expr.operand, stmt_idx, stmt, block) 197 | 198 | def _handle_ITE(self, expr_idx: int, expr: ITE, stmt_idx: int, stmt: Statement, block: Block | None): 199 | self._handle_expr(0, expr.cond, stmt_idx, stmt, block) 200 | self._handle_expr(1, expr.iftrue, stmt_idx, stmt, block) 201 | self._handle_expr(2, expr.iffalse, stmt_idx, stmt, block) 202 | 203 | def _handle_Tmp(self, expr_idx: int, expr: Tmp, stmt_idx: int, stmt: Statement, block: Block | None): 204 | pass 205 | 206 | def _handle_Register(self, expr_idx: int, expr: Register, stmt_idx: int, stmt: Statement, block: Block | None): 207 | pass 208 | 209 | def _handle_Const(self, expr_idx: int, expr: Const, stmt_idx: int, stmt: Statement, block: Block | None): 210 | pass 211 | 212 | def _handle_VirtualVariable( 213 | self, expr_idx: int, expr: VirtualVariable, stmt_idx: int, stmt: Statement, block: Block | None 214 | ): 215 | pass 216 | 217 | def _handle_Phi(self, expr_id: int, expr: Phi, stmt_idx: int, stmt: Statement, block: Block | None): 218 | for idx, (_, vvar) in enumerate(expr.src_and_vvars): 219 | if vvar is not None: 220 | self._handle_expr(idx, vvar, stmt_idx, stmt, block) 221 | 222 | def _handle_MultiStatementExpression( 223 | self, expr_idx, expr: MultiStatementExpression, stmt_idx: int, stmt: Statement, block: Block | None 224 | ): 225 | for idx, stmt_ in enumerate(expr.stmts): 226 | self._handle_stmt(idx, stmt_, None) 227 | self._handle_expr(0, expr.expr, stmt_idx, stmt, block) 228 | 229 | def 
_handle_DirtyExpression( 230 | self, expr_idx: int, expr: DirtyExpression, stmt_idx: int, stmt: Statement, block: Block | None 231 | ): 232 | for idx, operand in enumerate(expr.operands): 233 | self._handle_expr(idx, operand, stmt_idx, stmt, block) 234 | if expr.guard is not None: 235 | self._handle_expr(len(expr.operands) + 1, expr.guard, stmt_idx, stmt, block) 236 | 237 | def _handle_VEXCCallExpression( 238 | self, expr_idx: int, expr: VEXCCallExpression, stmt_idx: int, stmt: Statement, block: Block | None 239 | ): 240 | for idx, operand in enumerate(expr.operands): 241 | self._handle_expr(idx, operand, stmt_idx, stmt, block) 242 | 243 | 244 | class AILBlockWalker(AILBlockWalkerBase): 245 | """ 246 | Walks all statements and expressions of an AIL node, and rebuilds expressions, statements, or blocks if needed. 247 | 248 | If you need a pure walker without rebuilding, use AILBlockWalkerBase instead. 249 | 250 | :ivar update_block: True if the block should be updated in place, False if a new block should be created and 251 | returned as the result of walk(). 252 | :ivar replace_phi_stmt: True if you want _handle_Phi to be called and vvars potentially replaced; False otherwise. 253 | Defaults to False because in the vast majority of cases you do not want vvars in a Phi 254 | variable to be replaced. 255 | """ 256 | 257 | def __init__( 258 | self, stmt_handlers=None, expr_handlers=None, update_block: bool = True, replace_phi_stmt: bool = False 259 | ): 260 | super().__init__(stmt_handlers=stmt_handlers, expr_handlers=expr_handlers) 261 | self._update_block = update_block 262 | self._replace_phi_stmt = replace_phi_stmt 263 | 264 | def walk(self, block: Block) -> Block | None: 265 | """ 266 | Walk the block and rebuild it if necessary. The block will be rebuilt in-place (by updating statements in the 267 | original block when self._update_block is set to True), or a new block will be created and returned. 268 | 269 | :param block: The block to walk. 
270 | :return: The new block that is rebuilt, or None if the block is not changed or when self._update_block 271 | is set to True. 272 | """ 273 | 274 | changed = False 275 | new_block: Block | None = None 276 | 277 | i = 0 278 | while i < len(block.statements): 279 | stmt = block.statements[i] 280 | new_stmt = self._handle_stmt(i, stmt, block) 281 | if new_stmt is not None: 282 | changed = True 283 | if not self._update_block: 284 | if new_block is None: 285 | new_block = block.copy(statements=block.statements[:i]) 286 | new_block.statements.append(new_stmt) 287 | else: 288 | if new_block is not None: 289 | new_block.statements.append(stmt) 290 | i += 1 291 | 292 | return new_block if changed else None 293 | 294 | def _handle_stmt(self, stmt_idx: int, stmt: Statement, block: Block | None) -> Any: 295 | try: 296 | handler = self.stmt_handlers[type(stmt)] 297 | except KeyError: 298 | handler = None 299 | 300 | if handler: 301 | return handler(stmt_idx, stmt, block) 302 | return None 303 | 304 | def _handle_expr( 305 | self, expr_idx: int, expr: Expression, stmt_idx: int, stmt: Statement | None, block: Block | None 306 | ) -> Any: 307 | try: 308 | handler = self.expr_handlers[type(expr)] 309 | except KeyError: 310 | handler = None 311 | 312 | if handler: 313 | expr = handler(expr_idx, expr, stmt_idx, stmt, block) 314 | if expr is not None: 315 | r = self._handle_expr(expr_idx, expr, stmt_idx, stmt, block) 316 | return r if r is not None else expr 317 | return None # unchanged 318 | 319 | # 320 | # Default handlers 321 | # 322 | 323 | def _handle_Assignment(self, stmt_idx: int, stmt: Assignment, block: Block | None) -> Assignment | None: 324 | changed = False 325 | 326 | dst = self._handle_expr(0, stmt.dst, stmt_idx, stmt, block) 327 | if dst is not None and dst is not stmt.dst: 328 | changed = True 329 | else: 330 | dst = stmt.dst 331 | 332 | src = self._handle_expr(1, stmt.src, stmt_idx, stmt, block) 333 | if src is not None and src is not stmt.src: 334 | changed = 
True 335 | else: 336 | src = stmt.src 337 | 338 | if changed: 339 | # update the statement directly in the block 340 | new_stmt = Assignment(stmt.idx, dst, src, **stmt.tags) 341 | if self._update_block and block is not None: 342 | block.statements[stmt_idx] = new_stmt 343 | return new_stmt 344 | return None 345 | 346 | def _handle_WeakAssignment(self, stmt_idx: int, stmt: WeakAssignment, block: Block | None) -> WeakAssignment | None: 347 | changed = False 348 | 349 | dst = self._handle_expr(0, stmt.dst, stmt_idx, stmt, block) 350 | if dst is not None and dst is not stmt.dst: 351 | changed = True 352 | else: 353 | dst = stmt.dst 354 | 355 | src = self._handle_expr(1, stmt.src, stmt_idx, stmt, block) 356 | if src is not None and src is not stmt.src: 357 | changed = True 358 | else: 359 | src = stmt.src 360 | 361 | if changed: 362 | # update the statement directly in the block 363 | new_stmt = WeakAssignment(stmt.idx, dst, src, **stmt.tags) 364 | if self._update_block and block is not None: 365 | block.statements[stmt_idx] = new_stmt 366 | return new_stmt 367 | return None 368 | 369 | def _handle_CAS(self, stmt_idx: int, stmt: CAS, block: Block | None) -> CAS | None: 370 | changed = False 371 | 372 | addr = self._handle_expr(0, stmt.addr, stmt_idx, stmt, block) 373 | if addr is not None and addr is not stmt.addr: 374 | changed = True 375 | else: 376 | addr = stmt.addr 377 | 378 | data_lo = self._handle_expr(1, stmt.data_lo, stmt_idx, stmt, block) 379 | if data_lo is not None and data_lo is not stmt.data_lo: 380 | changed = True 381 | else: 382 | data_lo = stmt.data_lo 383 | 384 | data_hi = None 385 | if stmt.data_hi is not None: 386 | data_hi = self._handle_expr(2, stmt.data_hi, stmt_idx, stmt, block) 387 | if data_hi is not None and data_hi is not stmt.data_hi: 388 | changed = True 389 | else: 390 | data_hi = stmt.data_hi 391 | 392 | expd_lo = self._handle_expr(3, stmt.expd_lo, stmt_idx, stmt, block) 393 | if expd_lo is not None and expd_lo is not stmt.expd_lo: 394 | 
changed = True 395 | else: 396 | expd_lo = stmt.expd_lo 397 | 398 | expd_hi = None 399 | if stmt.expd_hi is not None: 400 | expd_hi = self._handle_expr(4, stmt.expd_hi, stmt_idx, stmt, block) 401 | if expd_hi is not None and expd_hi is not stmt.expd_hi: 402 | changed = True 403 | else: 404 | expd_hi = stmt.expd_hi 405 | 406 | old_lo = self._handle_expr(5, stmt.old_lo, stmt_idx, stmt, block) 407 | if old_lo is not None and old_lo is not stmt.old_lo: 408 | changed = True 409 | else: 410 | old_lo = stmt.old_lo 411 | 412 | old_hi = None 413 | if stmt.old_hi is not None: 414 | old_hi = self._handle_expr(6, stmt.old_hi, stmt_idx, stmt, block) 415 | if old_hi is not None and old_hi is not stmt.old_hi: 416 | changed = True 417 | else: 418 | old_hi = stmt.old_hi 419 | 420 | if changed: 421 | # update the statement directly in the block 422 | new_stmt = CAS( 423 | stmt.idx, 424 | addr, 425 | data_lo, 426 | data_hi, 427 | expd_lo, 428 | expd_hi, 429 | old_lo, 430 | old_hi, 431 | stmt.endness, 432 | **stmt.tags, 433 | ) 434 | if self._update_block and block is not None: 435 | block.statements[stmt_idx] = new_stmt 436 | return new_stmt 437 | return None 438 | 439 | def _handle_Call(self, stmt_idx: int, stmt: Call, block: Block | None): 440 | changed = False 441 | 442 | if isinstance(stmt.target, str): 443 | new_target = None 444 | else: 445 | new_target = self._handle_expr(-1, stmt.target, stmt_idx, stmt, block) 446 | if new_target is not None and new_target is not stmt.target: 447 | changed = True 448 | 449 | new_args = None 450 | if stmt.args is not None: 451 | new_args = [] 452 | 453 | i = 0 454 | while i < len(stmt.args): 455 | arg = stmt.args[i] 456 | new_arg = self._handle_expr(i, arg, stmt_idx, stmt, block) 457 | if new_arg is not None and new_arg is not arg: 458 | if not changed: 459 | # initialize new_args 460 | new_args = list(stmt.args[:i]) 461 | new_args.append(new_arg) 462 | changed = True 463 | else: 464 | if changed: 465 | new_args.append(arg) 466 | i += 1 467 | 
468 | if changed: 469 | new_stmt = Call( 470 | stmt.idx, 471 | new_target if new_target is not None else stmt.target, 472 | calling_convention=stmt.calling_convention, 473 | prototype=stmt.prototype, 474 | args=new_args, 475 | ret_expr=stmt.ret_expr, 476 | **stmt.tags, 477 | ) 478 | if self._update_block and block is not None: 479 | block.statements[stmt_idx] = new_stmt 480 | return new_stmt 481 | return None 482 | 483 | def _handle_Store(self, stmt_idx: int, stmt: Store, block: Block | None): 484 | changed = False 485 | 486 | addr = self._handle_expr(0, stmt.addr, stmt_idx, stmt, block) 487 | if addr is not None and addr is not stmt.addr: 488 | changed = True 489 | else: 490 | addr = stmt.addr 491 | 492 | data = self._handle_expr(1, stmt.data, stmt_idx, stmt, block) 493 | if data is not None and data is not stmt.data: 494 | changed = True 495 | else: 496 | data = stmt.data 497 | 498 | guard = None if stmt.guard is None else self._handle_expr(2, stmt.guard, stmt_idx, stmt, block) 499 | if guard is not None and guard is not stmt.guard: 500 | changed = True 501 | else: 502 | guard = stmt.guard 503 | 504 | if changed: 505 | # update the statement directly in the block 506 | new_stmt = Store( 507 | stmt.idx, 508 | addr, 509 | data, 510 | stmt.size, 511 | stmt.endness, 512 | guard=guard, 513 | variable=stmt.variable, 514 | offset=stmt.offset, 515 | **stmt.tags, 516 | ) 517 | if self._update_block and block is not None: 518 | block.statements[stmt_idx] = new_stmt 519 | return new_stmt 520 | return None 521 | 522 | def _handle_Jump(self, stmt_idx: int, stmt: Jump, block: Block | None): 523 | changed = False 524 | 525 | target = self._handle_expr(0, stmt.target, stmt_idx, stmt, block) 526 | if target is not None and target is not stmt.target: 527 | changed = True 528 | else: 529 | target = stmt.target 530 | 531 | if changed: 532 | new_stmt = Jump( 533 | stmt.idx, 534 | target, 535 | target_idx=stmt.target_idx, 536 | **stmt.tags, 537 | ) 538 | if self._update_block and 
block is not None: 539 | block.statements[stmt_idx] = new_stmt 540 | return new_stmt 541 | return None 542 | 543 | def _handle_ConditionalJump(self, stmt_idx: int, stmt: ConditionalJump, block: Block | None): 544 | changed = False 545 | 546 | condition = self._handle_expr(0, stmt.condition, stmt_idx, stmt, block) 547 | if condition is not None and condition is not stmt.condition: 548 | changed = True 549 | else: 550 | condition = stmt.condition 551 | 552 | true_target = None 553 | if stmt.true_target is not None: 554 | true_target = self._handle_expr(1, stmt.true_target, stmt_idx, stmt, block) 555 | if true_target is not None and true_target is not stmt.true_target: 556 | changed = True 557 | else: 558 | true_target = stmt.true_target 559 | 560 | false_target = None 561 | if stmt.false_target is not None: 562 | false_target = self._handle_expr(2, stmt.false_target, stmt_idx, stmt, block) 563 | if false_target is not None and false_target is not stmt.false_target: 564 | changed = True 565 | else: 566 | false_target = stmt.false_target 567 | 568 | if changed: 569 | new_stmt = ConditionalJump( 570 | stmt.idx, 571 | condition, 572 | true_target, 573 | false_target, 574 | true_target_idx=stmt.true_target_idx, 575 | false_target_idx=stmt.false_target_idx, 576 | **stmt.tags, 577 | ) 578 | if self._update_block and block is not None: 579 | block.statements[stmt_idx] = new_stmt 580 | return new_stmt 581 | return None 582 | 583 | def _handle_Return(self, stmt_idx: int, stmt: Return, block: Block | None): 584 | if stmt.ret_exprs: 585 | i = 0 586 | changed = False 587 | new_ret_exprs = [None] * len(stmt.ret_exprs) 588 | while i < len(stmt.ret_exprs): 589 | new_ret_expr = self._handle_expr(i, stmt.ret_exprs[i], stmt_idx, stmt, block) 590 | if new_ret_expr is not None: 591 | new_ret_exprs[i] = new_ret_expr 592 | changed = True 593 | else: 594 | new_ret_exprs[i] = stmt.ret_exprs[i] 595 | i += 1 596 | 597 | if changed: 598 | new_stmt = Return(stmt.idx, new_ret_exprs, **stmt.tags) 
599 | if self._update_block and block is not None: 600 | block.statements[stmt_idx] = new_stmt 601 | return new_stmt 602 | return None 603 | 604 | def _handle_DirtyStatement(self, stmt_idx: int, stmt: DirtyStatement, block: Block | None): 605 | changed = False 606 | 607 | dirty = self._handle_expr(0, stmt.dirty, stmt_idx, stmt, block) 608 | if dirty is not None and dirty is not stmt.dirty: 609 | changed = True 610 | else: 611 | dirty = stmt.dirty 612 | 613 | if changed: 614 | new_stmt = DirtyStatement(stmt.idx, dirty, **stmt.tags) 615 | if self._update_block and block is not None: 616 | block.statements[stmt_idx] = new_stmt 617 | return new_stmt 618 | return None 619 | 620 | # 621 | # Expression handlers 622 | # 623 | 624 | def _handle_Load(self, expr_idx: int, expr: Load, stmt_idx: int, stmt: Statement, block: Block | None): 625 | addr = self._handle_expr(0, expr.addr, stmt_idx, stmt, block) 626 | 627 | if addr is not None and addr is not expr.addr: 628 | new_expr = expr.copy() 629 | new_expr.addr = addr 630 | return new_expr 631 | return None 632 | 633 | def _handle_CallExpr(self, expr_idx: int, expr: Call, stmt_idx: int, stmt: Statement, block: Block | None): 634 | changed = False 635 | 636 | if isinstance(expr.target, str): 637 | new_target = None 638 | else: 639 | new_target = self._handle_expr(-1, expr.target, stmt_idx, stmt, block) 640 | if new_target is not None and new_target is not expr.target: 641 | changed = True 642 | 643 | new_args = None 644 | if expr.args is not None: 645 | i = 0 646 | new_args = [] 647 | while i < len(expr.args): 648 | arg = expr.args[i] 649 | new_arg = self._handle_expr(i, arg, stmt_idx, stmt, block) 650 | if new_arg is not None and new_arg is not arg: 651 | if not changed: 652 | # initialize new_args 653 | new_args = list(expr.args[:i]) 654 | new_args.append(new_arg) 655 | changed = True 656 | else: 657 | if changed: 658 | new_args.append(arg) 659 | i += 1 660 | 661 | if changed: 662 | expr = expr.copy() 663 | if new_target is 
not None: 664 | expr.target = new_target 665 | expr.args = new_args 666 | return expr 667 | return None 668 | 669 | def _handle_BinaryOp(self, expr_idx: int, expr: BinaryOp, stmt_idx: int, stmt: Statement, block: Block | None): 670 | changed = False 671 | 672 | operand_0 = self._handle_expr(0, expr.operands[0], stmt_idx, stmt, block) 673 | if operand_0 is not None and operand_0 is not expr.operands[0]: 674 | changed = True 675 | else: 676 | operand_0 = expr.operands[0] 677 | 678 | operand_1 = self._handle_expr(1, expr.operands[1], stmt_idx, stmt, block) 679 | if operand_1 is not None and operand_1 is not expr.operands[1]: 680 | changed = True 681 | else: 682 | operand_1 = expr.operands[1] 683 | 684 | if changed: 685 | new_expr = expr.copy() 686 | new_expr.operands = (operand_0, operand_1) 687 | new_expr.depth = max(operand_0.depth, operand_1.depth) + 1 688 | return new_expr 689 | return None 690 | 691 | def _handle_UnaryOp(self, expr_idx: int, expr: UnaryOp, stmt_idx: int, stmt: Statement, block: Block | None): 692 | new_operand = self._handle_expr(0, expr.operand, stmt_idx, stmt, block) 693 | if new_operand is not None and new_operand is not expr.operand: 694 | new_expr = expr.copy() 695 | new_expr.operand = new_operand 696 | return new_expr 697 | return None 698 | 699 | def _handle_Convert(self, expr_idx: int, expr: Convert, stmt_idx: int, stmt: Statement, block: Block | None): 700 | new_operand = self._handle_expr(expr_idx, expr.operand, stmt_idx, stmt, block) 701 | if new_operand is not None and new_operand is not expr.operand: 702 | return Convert(expr.idx, expr.from_bits, expr.to_bits, expr.is_signed, new_operand, **expr.tags) 703 | return None 704 | 705 | def _handle_Reinterpret( 706 | self, expr_idx: int, expr: Reinterpret, stmt_idx: int, stmt: Statement, block: Block | None 707 | ): 708 | new_operand = self._handle_expr(expr_idx, expr.operand, stmt_idx, stmt, block) 709 | if new_operand is not None and new_operand is not expr.operand: 710 | return 
Reinterpret( 711 | expr.idx, expr.from_bits, expr.from_type, expr.to_bits, expr.to_type, new_operand, **expr.tags 712 | ) 713 | return None 714 | 715 | def _handle_ITE(self, expr_idx: int, expr: ITE, stmt_idx: int, stmt: Statement, block: Block | None): 716 | changed = False 717 | 718 | cond = self._handle_expr(0, expr.cond, stmt_idx, stmt, block) 719 | if cond is not None and cond is not expr.cond: 720 | changed = True 721 | else: 722 | cond = expr.cond 723 | 724 | iftrue = self._handle_expr(1, expr.iftrue, stmt_idx, stmt, block) 725 | if iftrue is not None and iftrue is not expr.iftrue: 726 | changed = True 727 | else: 728 | iftrue = expr.iftrue 729 | 730 | iffalse = self._handle_expr(2, expr.iffalse, stmt_idx, stmt, block) 731 | if iffalse is not None and iffalse is not expr.iffalse: 732 | changed = True 733 | else: 734 | iffalse = expr.iffalse 735 | 736 | if changed: 737 | new_expr = expr.copy() 738 | new_expr.cond = cond 739 | new_expr.iftrue = iftrue 740 | new_expr.iffalse = iffalse 741 | return new_expr 742 | return None 743 | 744 | def _handle_Phi(self, expr_id: int, expr: Phi, stmt_idx: int, stmt: Statement, block: Block | None) -> Phi | None: 745 | if not self._replace_phi_stmt: 746 | # fallback to the read-only version 747 | return super()._handle_Phi(expr_id, expr, stmt_idx, stmt, block) 748 | 749 | changed = False 750 | 751 | src_and_vvars = [] 752 | for idx, (src, vvar) in enumerate(expr.src_and_vvars): 753 | if vvar is None: 754 | if src_and_vvars is not None: 755 | src_and_vvars.append((src, None)) 756 | continue 757 | new_vvar = self._handle_expr(idx, vvar, stmt_idx, stmt, block) 758 | if new_vvar is not None and new_vvar is not vvar: 759 | changed = True 760 | if src_and_vvars is None: 761 | src_and_vvars = expr.src_and_vvars[:idx] 762 | src_and_vvars.append((src, new_vvar)) 763 | elif src_and_vvars is not None: 764 | src_and_vvars.append((src, vvar)) 765 | 766 | return Phi(expr.idx, expr.bits, src_and_vvars, **expr.tags) if changed else None 767 
| 768 | def _handle_DirtyExpression( 769 | self, expr_idx: int, expr: DirtyExpression, stmt_idx: int, stmt: Statement, block: Block | None 770 | ): 771 | changed = False 772 | new_operands = [] 773 | for operand in expr.operands: 774 | new_operand = self._handle_expr(0, operand, stmt_idx, stmt, block) 775 | if new_operand is not None and new_operand is not operand: 776 | changed = True 777 | new_operands.append(new_operand) 778 | else: 779 | new_operands.append(operand) 780 | 781 | new_guard = expr.guard 782 | if expr.guard is not None: 783 | new_guard = self._handle_expr(2, expr.guard, stmt_idx, stmt, block) 784 | if new_guard is not None and new_guard is not expr.guard: 785 | changed = True 786 | 787 | if changed: 788 | return DirtyExpression( 789 | expr.idx, 790 | expr.callee, 791 | new_operands, 792 | guard=new_guard if new_guard is not None else expr.guard, 793 | mfx=expr.mfx, 794 | maddr=expr.maddr, 795 | msize=expr.msize, 796 | bits=expr.bits, 797 | **expr.tags, 798 | ) 799 | return None 800 | 801 | def _handle_VEXCCallExpression( 802 | self, expr_idx: int, expr: VEXCCallExpression, stmt_idx: int, stmt: Statement, block: Block | None 803 | ): 804 | changed = False 805 | new_operands = [] 806 | for idx, operand in enumerate(expr.operands): 807 | new_operand = self._handle_expr(idx, operand, stmt_idx, stmt, block) 808 | if new_operand is not None and new_operand is not operand: 809 | changed = True 810 | new_operands.append(new_operand) 811 | else: 812 | new_operands.append(operand) 813 | 814 | if changed: 815 | new_expr = expr.copy() 816 | new_expr.operands = tuple(new_operands) 817 | return new_expr 818 | return None 819 | 820 | def _handle_MultiStatementExpression( 821 | self, expr_idx, expr: MultiStatementExpression, stmt_idx: int, stmt: Statement, block: Block | None 822 | ): 823 | changed = False 824 | new_statements = [] 825 | for idx, stmt_ in enumerate(expr.stmts): 826 | new_stmt = self._handle_stmt(idx, stmt_, None) 827 | if new_stmt is not None and new_stmt is not stmt_: 828 | changed = 
True 829 | new_statements.append(new_stmt) 830 | else: 831 | new_statements.append(stmt_) 832 | 833 | new_expr = self._handle_expr(0, expr.expr, stmt_idx, stmt, block) 834 | if new_expr is not None and new_expr is not expr.expr: 835 | changed = True 836 | else: 837 | new_expr = expr.expr 838 | 839 | if changed: 840 | expr_ = expr.copy() 841 | expr_.expr = new_expr 842 | expr_.stmts = new_statements 843 | return expr_ 844 | return None 845 | -------------------------------------------------------------------------------- /ailment/constant.py: -------------------------------------------------------------------------------- 1 | UNDETERMINED_SIZE = -0xC0DE 2 | -------------------------------------------------------------------------------- /ailment/converter_common.py: -------------------------------------------------------------------------------- 1 | class SkipConversionNotice(Exception): 2 | pass 3 | 4 | 5 | class Converter: 6 | @staticmethod 7 | def convert(thing): 8 | raise NotImplementedError() 9 | -------------------------------------------------------------------------------- /ailment/converter_pcode.py: -------------------------------------------------------------------------------- 1 | import logging 2 | 3 | from angr.utils.constants import DEFAULT_STATEMENT 4 | from angr.engines.pcode.lifter import IRSB 5 | from pypcode import OpCode, Varnode 6 | import pypcode 7 | 8 | from .block import Block 9 | from .statement import Statement, Assignment, Store, Jump, ConditionalJump, Return, Call 10 | from .expression import Expression, DirtyExpression, Const, Register, Tmp, UnaryOp, BinaryOp, Load, Convert 11 | 12 | # FIXME: Convert, ITE 13 | from .manager import Manager 14 | from .converter_common import Converter 15 | 16 | 17 | log = logging.getLogger(name=__name__) 18 | 19 | # FIXME: Not all ops are mapped to AIL expressions! 
20 | opcode_to_generic_name = { 21 | # OpCode.MULTIEQUAL : '', 22 | # OpCode.INDIRECT : '', 23 | # OpCode.PIECE : '', 24 | # OpCode.SUBPIECE : '', 25 | OpCode.INT_EQUAL: "CmpEQ", 26 | OpCode.INT_NOTEQUAL: "CmpNE", 27 | OpCode.INT_SLESS: "CmpLTs", 28 | OpCode.INT_SLESSEQUAL: "CmpLEs", 29 | OpCode.INT_LESS: "CmpLT", 30 | OpCode.INT_LESSEQUAL: "CmpLE", 31 | # OpCode.INT_ZEXT : '', 32 | # OpCode.INT_SEXT : '', 33 | OpCode.INT_ADD: "Add", 34 | OpCode.INT_SUB: "Sub", 35 | OpCode.INT_CARRY: "Carry", 36 | OpCode.INT_SCARRY: "SCarry", 37 | OpCode.INT_SBORROW: "SBorrow", 38 | # OpCode.INT_2COMP : '', 39 | OpCode.INT_NEGATE: "Neg", 40 | OpCode.INT_XOR: "Xor", 41 | OpCode.INT_AND: "And", 42 | OpCode.INT_OR: "Or", 43 | OpCode.INT_LEFT: "Shl", 44 | OpCode.INT_RIGHT: "Shr", 45 | OpCode.INT_SRIGHT: "Sar", 46 | OpCode.INT_MULT: "Mul", 47 | OpCode.INT_DIV: "Div", 48 | # OpCode.INT_SDIV : '', 49 | # OpCode.INT_REM : '', 50 | # OpCode.INT_SREM : '', 51 | OpCode.BOOL_NEGATE: "Not", 52 | OpCode.BOOL_XOR: "LogicalXor", 53 | OpCode.BOOL_AND: "LogicalAnd", 54 | OpCode.BOOL_OR: "LogicalOr", 55 | # OpCode.CAST : '', 56 | # OpCode.PTRADD : '', 57 | # OpCode.PTRSUB : '', 58 | # OpCode.FLOAT_EQUAL : '', 59 | # OpCode.FLOAT_NOTEQUAL : '', 60 | # OpCode.FLOAT_LESS : '', 61 | # OpCode.FLOAT_LESSEQUAL : '', 62 | # OpCode.FLOAT_NAN : '', 63 | # OpCode.FLOAT_ADD : '', 64 | OpCode.FLOAT_DIV: "Div", 65 | OpCode.FLOAT_MULT: "Mul", 66 | OpCode.FLOAT_SUB: "Sub", 67 | # OpCode.FLOAT_NEG : '', 68 | # OpCode.FLOAT_ABS : '', 69 | # OpCode.FLOAT_SQRT : '', 70 | # OpCode.FLOAT_INT2FLOAT : '', 71 | # OpCode.FLOAT_FLOAT2FLOAT : '', 72 | # OpCode.FLOAT_TRUNC : '', 73 | # OpCode.FLOAT_CEIL : '', 74 | # OpCode.FLOAT_FLOOR : '', 75 | # OpCode.FLOAT_ROUND : '', 76 | # OpCode.SEGMENTOP : '', 77 | # OpCode.CPOOLREF : '', 78 | # OpCode.NEW : '', 79 | # OpCode.INSERT : '', 80 | # OpCode.EXTRACT : '', 81 | # OpCode.POPCOUNT : '', 82 | } 83 | 84 | 85 | class PCodeIRSBConverter(Converter): 86 | """ 87 | Converts a p-code 
IRSB to an AIL block 88 | """ 89 | 90 | @staticmethod 91 | def convert(irsb: IRSB, manager: Manager): # pylint:disable=arguments-differ 92 | """ 93 | Convert the given IRSB to an AIL block 94 | 95 | :param irsb: IRSB to convert 96 | :param manager: Manager to use 97 | :return: Converted block 98 | """ 99 | return PCodeIRSBConverter(irsb, manager)._convert() 100 | 101 | def __init__(self, irsb: IRSB, manager: Manager): 102 | self._irsb = irsb 103 | self._manager = manager 104 | self._statements = [] 105 | self._current_op = None 106 | self._next_ins_addr = None 107 | self._current_behavior = None 108 | self._statement_idx = 0 109 | 110 | # Remap all uniques s.t. they are write-once with values starting from 0 111 | self._unique_tracker: dict[int, tuple[int, int]] = {} 112 | self._unique_counter = 0 113 | 114 | self._special_op_handlers = { 115 | OpCode.COPY: self._convert_copy, 116 | OpCode.INT_ZEXT: self._convert_zext, 117 | OpCode.INT_SEXT: self._convert_sext, 118 | OpCode.LOAD: self._convert_load, 119 | OpCode.STORE: self._convert_store, 120 | OpCode.BRANCH: self._convert_branch, 121 | OpCode.CBRANCH: self._convert_cbranch, 122 | OpCode.BRANCHIND: self._convert_branchind, 123 | OpCode.CALL: self._convert_call, 124 | OpCode.CALLIND: self._convert_callind, 125 | OpCode.CALLOTHER: self._convert_callother, 126 | OpCode.RETURN: self._convert_ret, 127 | OpCode.MULTIEQUAL: self._convert_multiequal, 128 | OpCode.INDIRECT: self._convert_indirect, 129 | OpCode.SEGMENTOP: self._convert_segment_op, 130 | OpCode.CPOOLREF: self._convert_cpool_ref, 131 | OpCode.NEW: self._convert_new, 132 | OpCode.FLOAT_INT2FLOAT: self._convert_int2float, 133 | OpCode.FLOAT_FLOAT2FLOAT: self._convert_float2float, 134 | } 135 | 136 | manager.tyenv = None 137 | manager.block_addr = irsb.addr 138 | manager.vex_stmt_idx = DEFAULT_STATEMENT # Reset after loop. Necessary? 
139 | 140 | def _convert(self) -> Block: 141 | """ 142 | Convert the given IRSB to an AIL Block 143 | """ 144 | self._statement_idx = 0 145 | 146 | for op in self._irsb._ops: 147 | self._current_op = op 148 | if op.opcode == pypcode.OpCode.IMARK: 149 | self._manager.ins_addr = op.inputs[0].offset 150 | self._next_ins_addr = op.inputs[-1].offset + op.inputs[-1].size 151 | else: 152 | self._current_behavior = self._irsb.behaviors.get_behavior_for_opcode(self._current_op.opcode) 153 | self._convert_current_op() 154 | self._statement_idx += 1 155 | 156 | if "sparc:" in self._irsb.arch.name and self._irsb.arch.bits == 32: 157 | if self._current_op.opcode == OpCode.CALL: 158 | break 159 | 160 | return Block(self._irsb.addr, self._irsb.size, statements=self._statements) 161 | 162 | def _convert_current_op(self) -> None: 163 | """ 164 | Convert the current op to corresponding AIL statement 165 | """ 166 | assert self._current_behavior is not None 167 | 168 | is_special = self._current_behavior.opcode in self._special_op_handlers 169 | 170 | if is_special: 171 | try: 172 | self._special_op_handlers[self._current_behavior.opcode]() 173 | except NotImplementedError as ex: 174 | log.warning("Unsupported opcode: %s", ex) 175 | elif self._current_behavior.is_unary: 176 | self._convert_unary() 177 | else: 178 | self._convert_binary() 179 | 180 | def _convert_unary(self) -> None: 181 | """ 182 | Convert the current unary op to corresponding AIL statement 183 | """ 184 | opcode = self._current_op.opcode 185 | 186 | op = opcode_to_generic_name.get(opcode, None) 187 | in1 = self._get_value(self._current_op.inputs[0]) 188 | if op is None: 189 | log.warning("p-code: Unsupported opcode of type %s", opcode.__name__) 190 | out = DirtyExpression(self._manager.next_atom(), opcode.__name__, [], bits=self._current_op.output.size * 8) 191 | else: 192 | out = UnaryOp(self._manager.next_atom(), op, in1, ins_addr=self._manager.ins_addr) 193 | 194 | stmt = self._set_value(self._current_op.output, 
out) 195 | self._statements.append(stmt) 196 | 197 | def _convert_binary(self) -> None: 198 | """ 199 | Convert the current binary op to corresponding AIL statement 200 | """ 201 | opcode = self._current_op.opcode 202 | op = opcode_to_generic_name.get(opcode, None) 203 | in1 = self._get_value(self._current_op.inputs[0]) 204 | in2 = self._get_value(self._current_op.inputs[1]) 205 | signed = op in {"CmpLTs", "CmpLEs"} 206 | 207 | if op is None: 208 | log.warning("p-code: Unsupported opcode of type %s.", opcode.__name__) 209 | out = DirtyExpression(self._manager.next_atom(), opcode.__name__, [], bits=self._current_op.output.size * 8) 210 | else: 211 | out = BinaryOp(self._manager.next_atom(), op, [in1, in2], signed, ins_addr=self._manager.ins_addr) 212 | 213 | # Zero-extend 1-bit results 214 | zextend_ops = { 215 | OpCode.INT_EQUAL, 216 | OpCode.INT_NOTEQUAL, 217 | OpCode.INT_SLESS, 218 | OpCode.INT_SLESSEQUAL, 219 | OpCode.INT_LESS, 220 | OpCode.INT_LESSEQUAL, 221 | } 222 | if opcode in zextend_ops: 223 | out = Convert(self._manager.next_atom(), 1, self._current_op.output.size * 8, False, out) 224 | 225 | stmt = self._set_value(self._current_op.output, out) 226 | self._statements.append(stmt) 227 | 228 | def _map_register_name(self, varnode: Varnode) -> int: 229 | """ 230 | Map SLEIGH register offset to ArchInfo register offset based on name. 231 | 232 | :param varnode: The varnode to translate 233 | :return: The register file offset 234 | """ 235 | # FIXME: Will need performance optimization 236 | # FIXME: Should not get trans object this way. Moreover, should have a 237 | # faster mapping method than going through trans 238 | reg_name = varnode.getRegisterName() 239 | try: 240 | reg_offset = self._manager.arch.get_register_offset(reg_name.lower()) 241 | log.debug("Mapped register '%s' to offset %x", reg_name, reg_offset) 242 | except ValueError: 243 | reg_offset = varnode.offset + 0x100000 244 | log.warning("Could not map register '%s' from archinfo. 
Mapping to %x", reg_name, reg_offset) 245 | return reg_offset 246 | 247 | def _remap_temp(self, offset: int, size: int, is_write: bool) -> int | None: 248 | """ 249 | Remap any unique space addresses such that they are written only once 250 | 251 | :param offset: The unique space address 252 | :param is_write: Whether the access is a write or a read 253 | :return: The remapped temporary register index 254 | """ 255 | if is_write: 256 | self._unique_tracker[offset] = self._unique_counter, size 257 | self._unique_counter += 1 258 | return self._unique_tracker[offset][0] 259 | else: 260 | if offset in self._unique_tracker: 261 | return self._unique_tracker[offset][0] 262 | # this might be a partial access of an existing temporary variable. return None for now 263 | return None 264 | 265 | def _convert_varnode(self, varnode: Varnode, is_write: bool) -> Expression: 266 | """ 267 | Convert a varnode to a corresponding AIL expression 268 | 269 | :param varnode: The varnode to remap 270 | :param is_write: Whether the varnode is being read or written to 271 | :return: The corresponding AIL expression 272 | """ 273 | space_name = varnode.space.name 274 | size = varnode.size * 8 275 | 276 | if space_name == "const": 277 | return Const(self._manager.next_atom(), None, varnode.offset, size) 278 | elif space_name == "register": 279 | offset = self._map_register_name(varnode) 280 | return Register(self._manager.next_atom(), None, offset, size, reg_name=varnode.getRegisterName()) 281 | elif space_name == "unique": 282 | offset = self._remap_temp(varnode.offset, varnode.size, is_write) 283 | if offset is None: 284 | # this might be a partial access of an existing temporary variable 285 | unique_offset = None 286 | for delta in range(-1, -8, -1): 287 | if varnode.offset + delta in self._unique_tracker: 288 | unique_offset = varnode.offset + delta 289 | break 290 | assert unique_offset is not None, "Cannot find the source unique variable" 291 | # TODO: Check size 292 | _, 
ori_tmp_size = self._unique_tracker[unique_offset] 293 | t = Tmp(self._manager.next_atom(), None, unique_offset, ori_tmp_size * 8) 294 | # FIXME: Asserting BE 295 | right_shift_amount = varnode.offset + varnode.size - (unique_offset + ori_tmp_size) 296 | if right_shift_amount != 0: 297 | t = BinaryOp( 298 | self._manager.next_atom(), 299 | "Shr", 300 | [t, Const(self._manager.next_atom(), None, right_shift_amount * 8, 8)], 301 | False, 302 | ins_addr=self._manager.ins_addr, 303 | ) 304 | return Convert(self._manager.next_atom(), t.bits, size, False, t, ins_addr=self._manager.ins_addr) 305 | 306 | return Tmp(self._manager.next_atom(), None, offset, size) 307 | elif space_name in ["ram", "mem"]: 308 | assert not is_write 309 | addr = Const(self._manager.next_atom(), None, varnode.offset, self._manager.arch.bits) 310 | # Note: Load takes bytes, not bits, for size 311 | return Load( 312 | self._manager.next_atom(), 313 | addr, 314 | varnode.size, 315 | self._manager.arch.memory_endness, 316 | ins_addr=self._manager.ins_addr, 317 | ) 318 | else: 319 | raise NotImplementedError() 320 | 321 | def _set_value(self, varnode: Varnode, value: Expression) -> Statement: 322 | """ 323 | Create the appropriate assignment statement to store to a varnode 324 | 325 | This method stores to the appropriate register, or unique space, 326 | depending on the space indicated by the varnode. 
327 | 328 | :param varnode: Varnode to store into 329 | :param value: Value to store 330 | :return: Corresponding AIL statement 331 | """ 332 | space_name = varnode.space.name 333 | 334 | if space_name in ["register", "unique"]: 335 | return Assignment( 336 | self._statement_idx, self._convert_varnode(varnode, True), value, ins_addr=self._manager.ins_addr 337 | ) 338 | elif space_name in ["ram", "mem"]: 339 | addr = Const(self._manager.next_atom(), None, varnode.offset, self._manager.arch.bits) 340 | return Store( 341 | self._statement_idx, 342 | addr, 343 | value, 344 | varnode.size, 345 | self._manager.arch.memory_endness, 346 | ins_addr=self._manager.ins_addr, 347 | ) 348 | else: 349 | raise NotImplementedError() 350 | 351 | def _get_value(self, varnode: Varnode) -> Expression: 352 | """ 353 | Create the appropriate expression to load from a varnode 354 | 355 | This method loads from the appropriate const, register, unique, or RAM 356 | space, depending on the space indicated by the varnode. 357 | 358 | :param varnode: Varnode to load from. 
359 | :return: Value loaded 360 | """ 361 | return self._convert_varnode(varnode, False) 362 | 363 | def _convert_copy(self) -> None: 364 | """ 365 | Convert copy operation 366 | """ 367 | out = self._current_op.output 368 | inp = self._get_value(self._current_op.inputs[0]) 369 | stmt = self._set_value(out, inp) 370 | self._statements.append(stmt) 371 | 372 | def _convert_zext(self) -> None: 373 | """ 374 | Convert zext operation 375 | """ 376 | out = self._current_op.output 377 | inp = Convert( 378 | self._manager.next_atom(), 379 | self._current_op.inputs[0].size * 8, 380 | out.size * 8, 381 | False, 382 | self._get_value(self._current_op.inputs[0]), 383 | ) 384 | stmt = self._set_value(out, inp) 385 | self._statements.append(stmt) 386 | 387 | def _convert_sext(self) -> None: 388 | """ 389 | Convert the sign-extension operation 390 | """ 391 | out = self._current_op.output 392 | inp = Convert( 393 | self._manager.next_atom(), 394 | self._current_op.inputs[0].size * 8, 395 | out.size * 8, 396 | True, 397 | self._get_value(self._current_op.inputs[0]), 398 | ) 399 | stmt = self._set_value(out, inp) 400 | self._statements.append(stmt) 401 | 402 | def _convert_negate(self) -> None: 403 | """ 404 | Convert bool negate operation 405 | """ 406 | out = self._current_op.output 407 | inp = self._get_value(self._current_op.inputs[0]) 408 | 409 | cval = Const(self._manager.next_atom(), None, 0, self._current_op.inputs[0].size * 8) 410 | 411 | expr = BinaryOp(self._manager.next_atom(), "CmpEQ", [inp, cval], signed=False, ins_addr=self._manager.ins_addr) 412 | 413 | stmt = self._set_value(out, expr) 414 | self._statements.append(stmt) 415 | 416 | def _convert_load(self) -> None: 417 | """ 418 | Convert a p-code load operation 419 | """ 420 | spc = self._current_op.inputs[0].getSpaceFromConst() 421 | out = self._current_op.output 422 | spc_name = spc.name.lower() 423 | assert spc_name in {"ram", "mem", "register"} 424 | if spc_name == "register": 425 | # load from register 
426 | res = self._get_value(self._current_op.inputs[1]) 427 | stmt = self._set_value(out, res) 428 | else: 429 | # load from memory 430 | off = self._get_value(self._current_op.inputs[1]) 431 | res = Load( 432 | self._manager.next_atom(), 433 | off, 434 | self._current_op.output.size, 435 | self._manager.arch.memory_endness, 436 | ins_addr=self._manager.ins_addr, 437 | ) 438 | stmt = self._set_value(out, res) 439 | self._statements.append(stmt) 440 | 441 | def _convert_store(self) -> None: 442 | """ 443 | Convert a p-code store operation 444 | """ 445 | spc = self._current_op.inputs[0].getSpaceFromConst() 446 | spc_name = spc.name.lower() 447 | assert spc_name in {"ram", "mem", "register"} 448 | if spc_name == "register": 449 | # store to register 450 | out = self._current_op.inputs[2] 451 | res = self._get_value(self._current_op.inputs[1]) 452 | stmt = self._set_value(out, res) 453 | else: 454 | # store to memory 455 | off = self._get_value(self._current_op.inputs[1]) 456 | data = self._get_value(self._current_op.inputs[2]) 457 | log.debug("Storing %s at offset %s", data, off) 458 | # self.state.memory.store(off, data, endness=self.project.arch.memory_endness) 459 | stmt = Store( 460 | self._statement_idx, 461 | off, 462 | data, 463 | self._current_op.inputs[2].size, 464 | self._manager.arch.memory_endness, 465 | ins_addr=self._manager.ins_addr, 466 | ) 467 | self._statements.append(stmt) 468 | 469 | def _convert_branch(self) -> None: 470 | """ 471 | Convert a p-code branch operation 472 | """ 473 | if self._current_op.inputs[0].space == "const": 474 | raise NotImplementedError("p-code relative branch not supported yet") 475 | dest_addr = self._current_op.inputs[0].offset 476 | 477 | # special handling: if the previous statement is a ConditionalJump with a None destination address, then we 478 | # back-patch the previous statement 479 | dest = Const(self._manager.next_atom(), None, dest_addr, self._manager.arch.bits) 480 | if self._statements: 481 | last_stmt = 
self._statements[-1] 482 | if isinstance(last_stmt, ConditionalJump) and last_stmt.false_target is None: 483 | last_stmt.false_target = dest 484 | return 485 | 486 | stmt = Jump(self._statement_idx, dest, ins_addr=self._manager.ins_addr) 487 | self._statements.append(stmt) 488 | 489 | def _convert_cbranch(self) -> None: 490 | """ 491 | Convert a p-code conditional branch operation 492 | """ 493 | if self._current_op.inputs[0].space == "const": 494 | raise NotImplementedError("p-code relative branch not supported yet") 495 | dest_addr = self._current_op.inputs[0].offset 496 | cond = self._get_value(self._current_op.inputs[1]) 497 | cval = Const(self._manager.next_atom(), None, 0, cond.bits) 498 | condition = BinaryOp(self._manager.next_atom(), "CmpNE", [cond, cval], signed=False) 499 | dest = Const(self._manager.next_atom(), None, dest_addr, self._manager.arch.bits) 500 | if self._irsb._ops[-1] is self._current_op: 501 | # if the cbranch op is the last op, then we need to generate a fallthru target 502 | fallthru = Const( 503 | self._manager.next_atom(), 504 | None, 505 | self._next_ins_addr, 506 | self._manager.arch.bits, 507 | ) 508 | else: 509 | # there will be a Jump statement that follows the cbranch 510 | fallthru = None 511 | stmt = ConditionalJump(self._statement_idx, condition, dest, fallthru, ins_addr=self._manager.ins_addr) 512 | self._statements.append(stmt) 513 | 514 | def _convert_ret(self) -> None: 515 | """ 516 | Convert a p-code return operation 517 | """ 518 | Const(self._manager.next_atom(), None, self._irsb.next, self._manager.arch.bits) 519 | stmt = Return( 520 | self._statement_idx, 521 | [], 522 | ins_addr=self._manager.ins_addr, 523 | vex_block_addr=self._manager.block_addr, 524 | vex_stmt_idx=DEFAULT_STATEMENT, 525 | ) 526 | self._statements.append(stmt) 527 | 528 | def _convert_branchind(self) -> None: 529 | """ 530 | Convert a p-code indirect branch operation 531 | """ 532 | dest = self._get_value(self._current_op.inputs[0]) 533 | stmt = 
Jump(self._statement_idx, dest, ins_addr=self._manager.ins_addr) 534 | self._statements.append(stmt) 535 | 536 | def _convert_call(self) -> None: 537 | """ 538 | Convert a p-code call operation 539 | """ 540 | ret_reg_offset = self._manager.arch.ret_offset 541 | if ret_reg_offset is not None: 542 | ret_expr = Register(None, None, ret_reg_offset, self._manager.arch.bits) # ??? 543 | else: 544 | ret_expr = None 545 | dest = Const(self._manager.next_atom(), None, self._irsb.next, self._manager.arch.bits) 546 | stmt = Call( 547 | self._manager.next_atom(), 548 | dest, 549 | ret_expr=ret_expr, 550 | ins_addr=self._manager.ins_addr, 551 | vex_block_addr=self._manager.block_addr, 552 | vex_stmt_idx=DEFAULT_STATEMENT, 553 | ) 554 | self._statements.append(stmt) 555 | 556 | def _convert_callind(self) -> None: 557 | """ 558 | Convert a p-code indirect call operation 559 | """ 560 | ret_reg_offset = self._manager.arch.ret_offset 561 | ret_expr = Register(None, None, ret_reg_offset, self._manager.arch.bits) # ??? 562 | dest = self._get_value(self._current_op.inputs[0]) 563 | stmt = Call( 564 | self._manager.next_atom(), 565 | dest, 566 | ret_expr=ret_expr, 567 | ins_addr=self._manager.ins_addr, 568 | vex_block_addr=self._manager.block_addr, 569 | vex_stmt_idx=DEFAULT_STATEMENT, 570 | ) 571 | self._statements.append(stmt) 572 | 573 | def _convert_int2float(self) -> None: 574 | """ 575 | Convert INT2FLOAT operation. 576 | """ 577 | out = self._current_op.output 578 | inp = Convert( 579 | self._manager.next_atom(), 580 | self._current_op.inputs[0].size * 8, 581 | out.size * 8, 582 | True, 583 | self._get_value(self._current_op.inputs[0]), 584 | from_type=Convert.TYPE_INT, 585 | to_type=Convert.TYPE_FP, 586 | ) 587 | stmt = self._set_value(out, inp) 588 | self._statements.append(stmt) 589 | 590 | def _convert_float2float(self) -> None: 591 | """ 592 | Convert FLOAT2FLOAT operation. 
593 | """ 594 | out = self._current_op.output 595 | inp = Convert( 596 | self._manager.next_atom(), 597 | self._current_op.inputs[0].size * 8, 598 | out.size * 8, 599 | True, 600 | self._get_value(self._current_op.inputs[0]), 601 | from_type=Convert.TYPE_FP, 602 | to_type=Convert.TYPE_FP, 603 | ) 604 | stmt = self._set_value(out, inp) 605 | self._statements.append(stmt) 606 | 607 | def _convert_callother(self) -> None: 608 | raise NotImplementedError("CALLOTHER emulation not currently supported") 609 | 610 | def _convert_multiequal(self) -> None: 611 | raise NotImplementedError("MULTIEQUAL appearing in unheritaged code?") 612 | 613 | def _convert_indirect(self) -> None: 614 | raise NotImplementedError("INDIRECT appearing in unheritaged code?") 615 | 616 | def _convert_segment_op(self) -> None: 617 | raise NotImplementedError("SEGMENTOP emulation not currently supported") 618 | 619 | def _convert_cpool_ref(self) -> None: 620 | raise NotImplementedError("Cannot currently emulate cpool operator") 621 | 622 | def _convert_new(self) -> None: 623 | raise NotImplementedError("Cannot currently emulate new operator") 624 | -------------------------------------------------------------------------------- /ailment/converter_vex.py: -------------------------------------------------------------------------------- 1 | # pylint:disable=missing-class-docstring 2 | import logging 3 | 4 | import pyvex 5 | from angr.utils.constants import DEFAULT_STATEMENT 6 | from angr.engines.vex.claripy.irop import vexop_to_simop 7 | from angr.errors import UnsupportedIROpError 8 | 9 | from .block import Block 10 | from .statement import Assignment, CAS, Store, Jump, Call, ConditionalJump, DirtyStatement, Return 11 | from .expression import ( 12 | Const, 13 | Register, 14 | Tmp, 15 | DirtyExpression, 16 | UnaryOp, 17 | Convert, 18 | BinaryOp, 19 | Load, 20 | ITE, 21 | Reinterpret, 22 | VEXCCallExpression, 23 | ) 24 | from .converter_common import SkipConversionNotice, Converter 25 | 26 | 27 | log = 
logging.getLogger(name=__name__) 28 | 29 | 30 | class VEXExprConverter(Converter): 31 | @staticmethod 32 | def simop_from_vexop(vex_op): 33 | return vexop_to_simop(vex_op) 34 | 35 | @staticmethod 36 | def generic_name_from_vex_op(vex_op): 37 | return vexop_to_simop(vex_op)._generic_name 38 | 39 | @staticmethod 40 | def convert(expr, manager): # pylint:disable=arguments-differ 41 | """ 42 | 43 | :param expr: 44 | :return: 45 | """ 46 | if isinstance(expr, pyvex.const.IRConst): 47 | return VEXExprConverter.const_n(expr, manager) 48 | 49 | func = EXPRESSION_MAPPINGS.get(type(expr)) 50 | if func is not None: 51 | # When something goes wrong, return a DirtyExpression instead of crashing the program 52 | try: 53 | return func(expr, manager) 54 | except UnsupportedIROpError: 55 | log.warning("VEXExprConverter: Unsupported IROp %s.", expr.op) 56 | return DirtyExpression( 57 | manager.next_atom(), f"unsupported_{expr.op}", [], bits=expr.result_size(manager.tyenv) 58 | ) 59 | 60 | log.warning("VEXExprConverter: Unsupported VEX expression of type %s.", type(expr)) 61 | try: 62 | bits = expr.result_size(manager.tyenv) 63 | except ValueError: 64 | # e.g., "ValueError: Type Ity_INVALID does not have size" 65 | bits = 0 66 | return DirtyExpression(manager.next_atom(), f"unsupported_{str(type(expr))}", [], bits=bits) 67 | 68 | @staticmethod 69 | def convert_list(exprs, manager): 70 | converted = [] 71 | for expr in exprs: 72 | converted.append(VEXExprConverter.convert(expr, manager)) 73 | return converted 74 | 75 | @staticmethod 76 | def register(offset, bits, manager): 77 | reg_size = bits // manager.arch.byte_width 78 | reg_name = manager.arch.translate_register_name(offset, reg_size) 79 | return Register( 80 | manager.next_atom(), 81 | None, 82 | offset, 83 | bits, 84 | reg_name=reg_name, 85 | ins_addr=manager.ins_addr, 86 | vex_block_addr=manager.block_addr, 87 | vex_stmt_idx=manager.vex_stmt_idx, 88 | ) 89 | 90 | @staticmethod 91 | def tmp(tmp_idx, bits, manager): 92 | return 
Tmp( 93 | manager.next_atom(), 94 | None, 95 | tmp_idx, 96 | bits, 97 | ins_addr=manager.ins_addr, 98 | vex_block_addr=manager.block_addr, 99 | vex_stmt_idx=manager.vex_stmt_idx, 100 | ) 101 | 102 | @staticmethod 103 | def RdTmp(expr, manager): 104 | return VEXExprConverter.tmp(expr.tmp, expr.result_size(manager.tyenv), manager) 105 | 106 | @staticmethod 107 | def Get(expr, manager): 108 | return VEXExprConverter.register(expr.offset, expr.result_size(manager.tyenv), manager) 109 | 110 | @staticmethod 111 | def Load(expr, manager): 112 | return Load( 113 | manager.next_atom(), 114 | VEXExprConverter.convert(expr.addr, manager), 115 | expr.result_size(manager.tyenv) // 8, 116 | expr.end, 117 | ins_addr=manager.ins_addr, 118 | vex_block_addr=manager.block_addr, 119 | vex_stmt_idx=manager.vex_stmt_idx, 120 | ) 121 | 122 | @staticmethod 123 | def Unop(expr, manager): 124 | op_name = VEXExprConverter.generic_name_from_vex_op(expr.op) 125 | if op_name == "Reinterp": 126 | simop = vexop_to_simop(expr.op) 127 | return Reinterpret( 128 | manager.next_atom(), 129 | simop._from_size, 130 | simop._from_type, 131 | simop._to_size, 132 | simop._to_type, 133 | VEXExprConverter.convert(expr.args[0], manager), 134 | ins_addr=manager.ins_addr, 135 | vex_block_addr=manager.block_addr, 136 | vex_stmt_idx=manager.vex_stmt_idx, 137 | ) 138 | elif op_name is None: 139 | # is it a conversion? 
140 | simop = vexop_to_simop(expr.op) 141 | if simop._conversion: 142 | if simop._from_side == "HI": 143 | # returns the high-half of the argument 144 | inner = VEXExprConverter.convert(expr.args[0], manager) 145 | shifted = BinaryOp( 146 | manager.next_atom(), 147 | "Shr", 148 | [ 149 | inner, 150 | Const( 151 | manager.next_atom(), 152 | None, 153 | simop._to_size, 154 | 8, 155 | ins_addr=manager.ins_addr, 156 | vex_block_addr=manager.block_addr, 157 | vex_stmt_idx=manager.vex_stmt_idx, 158 | ), 159 | ], 160 | False, 161 | ins_addr=manager.ins_addr, 162 | vex_block_addr=manager.block_addr, 163 | vex_stmt_idx=manager.vex_stmt_idx, 164 | ) 165 | return Convert( 166 | manager.next_atom(), 167 | simop._from_size, 168 | simop._to_size, 169 | simop.is_signed, 170 | shifted, 171 | ins_addr=manager.ins_addr, 172 | vex_block_addr=manager.block_addr, 173 | vex_stmt_idx=manager.vex_stmt_idx, 174 | ) 175 | 176 | return Convert( 177 | manager.next_atom(), 178 | simop._from_size, 179 | simop._to_size, 180 | simop.is_signed, 181 | VEXExprConverter.convert(expr.args[0], manager), 182 | ins_addr=manager.ins_addr, 183 | vex_block_addr=manager.block_addr, 184 | vex_stmt_idx=manager.vex_stmt_idx, 185 | ) 186 | raise NotImplementedError("Unsupported operation") 187 | elif op_name == "Not" and expr.op != "Iop_Not1": 188 | # NotN (N != 1) is equivalent to bitwise negation 189 | op_name = "BitwiseNeg" 190 | 191 | return UnaryOp( 192 | manager.next_atom(), 193 | op_name, 194 | VEXExprConverter.convert(expr.args[0], manager), 195 | bits=expr.result_size(manager.tyenv), 196 | ins_addr=manager.ins_addr, 197 | vex_block_addr=manager.block_addr, 198 | vex_stmt_idx=manager.vex_stmt_idx, 199 | ) 200 | 201 | @staticmethod 202 | def Binop(expr, manager): 203 | op = VEXExprConverter.simop_from_vexop(expr.op) 204 | op_name = op._generic_name 205 | operands = VEXExprConverter.convert_list(expr.args, manager) 206 | 207 | if op_name == "Add" and type(operands[1]) is Const and operands[1].sign_bit == 
1: 208 | # convert it to a sub 209 | op_name = "Sub" 210 | op1_val, op1_bits = operands[1].value, operands[1].bits 211 | operands[1] = Const(operands[1].idx, None, (1 << op1_bits) - op1_val, op1_bits) 212 | 213 | signed = False 214 | vector_count = None 215 | vector_size = None 216 | if op._vector_count is not None and op._vector_size is not None: 217 | # SIMD conversions 218 | op_name += "V" # vectorized 219 | vector_count = op._vector_count 220 | vector_size = op._vector_size 221 | elif op_name in {"CmpLE", "CmpLT", "CmpGE", "CmpGT", "Div", "DivMod", "Mod", "Mul", "Mull"}: 222 | if op.is_signed: 223 | signed = True 224 | 225 | if op_name == "Cmp" and op._float: 226 | # Rename Cmp to CmpF 227 | op_name = "CmpF" 228 | 229 | if op_name is None and op._conversion: 230 | # conversion 231 | # TODO: Finish this 232 | if op._from_type == "I" and op._to_type == "F": 233 | # integer to floating point 234 | rm = operands[0] 235 | operand = operands[1] 236 | return Convert( 237 | manager.next_atom(), 238 | op._from_size, 239 | op._to_size, 240 | op.is_signed, 241 | operand, 242 | from_type=Convert.TYPE_INT, 243 | to_type=Convert.TYPE_FP, 244 | rounding_mode=rm, 245 | ins_addr=manager.ins_addr, 246 | vex_block_addr=manager.block_addr, 247 | vex_stmt_idx=manager.vex_stmt_idx, 248 | ) 249 | elif op._from_side == "HL": 250 | # Concatenating the two arguments and form a new value 251 | op_name = "Concat" 252 | elif op._from_type == "F" and op._to_type == "F": 253 | # floating point to floating point 254 | rm = operands[0] 255 | operand = operands[1] 256 | return Convert( 257 | manager.next_atom(), 258 | op._from_size, 259 | op._to_size, 260 | op.is_signed, 261 | operand, 262 | from_type=Convert.TYPE_FP, 263 | to_type=Convert.TYPE_FP, 264 | rounding_mode=rm, 265 | ins_addr=manager.ins_addr, 266 | vex_block_addr=manager.block_addr, 267 | vex_stmt_idx=manager.vex_stmt_idx, 268 | ) 269 | elif op._from_type == "F" and op._to_type == "I": 270 | # floating point to integer 271 | # 
floating point to floating point 272 | rm = operands[0] 273 | operand = operands[1] 274 | return Convert( 275 | manager.next_atom(), 276 | op._from_size, 277 | op._to_size, 278 | op.is_signed, 279 | operand, 280 | from_type=Convert.TYPE_FP, 281 | to_type=Convert.TYPE_INT, 282 | rounding_mode=rm, 283 | ins_addr=manager.ins_addr, 284 | vex_block_addr=manager.block_addr, 285 | vex_stmt_idx=manager.vex_stmt_idx, 286 | ) 287 | 288 | bits = op._output_size_bits 289 | 290 | if op_name == "DivMod": 291 | op1_size = op._from_size if op._from_size is not None else operands[0].bits 292 | op2_size = op._to_size if op._to_size is not None else operands[1].bits 293 | 294 | if op2_size < op1_size: 295 | # e.g., DivModU64to32 296 | operands[1] = Convert( 297 | manager.next_atom(), 298 | op2_size, 299 | op1_size, 300 | op._from_signed != "U", 301 | operands[1], 302 | ins_addr=manager.ins_addr, 303 | vex_block_addr=manager.block_addr, 304 | vex_stmt_idx=manager.vex_stmt_idx, 305 | ) 306 | chunk_bits = bits // 2 307 | 308 | div = BinaryOp( 309 | manager.next_atom(), 310 | "Div", 311 | operands, 312 | signed, 313 | ins_addr=manager.ins_addr, 314 | vex_block_addr=manager.block_addr, 315 | vex_stmt_idx=manager.vex_stmt_idx, 316 | bits=op1_size, 317 | ) 318 | truncated_div = Convert( 319 | manager.next_atom(), 320 | op1_size, 321 | chunk_bits, 322 | signed, 323 | div, 324 | ins_addr=manager.ins_addr, 325 | vex_block_addr=manager.block_addr, 326 | vex_stmt_idx=manager.vex_stmt_idx, 327 | ) 328 | mod = BinaryOp( 329 | manager.next_atom(), 330 | "Mod", 331 | operands, 332 | signed, 333 | ins_addr=manager.ins_addr, 334 | vex_block_addr=manager.block_addr, 335 | vex_stmt_idx=manager.vex_stmt_idx, 336 | bits=op1_size, 337 | ) 338 | truncated_mod = Convert( 339 | manager.next_atom(), 340 | op1_size, 341 | chunk_bits, 342 | signed, 343 | mod, 344 | ins_addr=manager.ins_addr, 345 | vex_block_addr=manager.block_addr, 346 | vex_stmt_idx=manager.vex_stmt_idx, 347 | ) 348 | 349 | operands = 
[truncated_mod, truncated_div] 350 | op_name = "Concat" 351 | signed = False 352 | 353 | return BinaryOp( 354 | manager.next_atom(), 355 | op_name, 356 | operands, 357 | signed, 358 | ins_addr=manager.ins_addr, 359 | vex_block_addr=manager.block_addr, 360 | vex_stmt_idx=manager.vex_stmt_idx, 361 | bits=bits, 362 | vector_count=vector_count, 363 | vector_size=vector_size, 364 | ) 365 | 366 | @staticmethod 367 | def Triop(expr, manager): 368 | op = VEXExprConverter.simop_from_vexop(expr.op) 369 | op_name = op._generic_name 370 | operands = VEXExprConverter.convert_list(expr.args, manager) 371 | 372 | bits = op._output_size_bits 373 | 374 | if op._float: 375 | # this is a floating-point operation where the first argument is the rounding mode. in fact, we have a 376 | # BinaryOp here. 377 | rm = operands[0] 378 | return BinaryOp( 379 | manager.next_atom(), 380 | op_name, 381 | operands[1:], 382 | True, # all floating-point operations are signed 383 | floating_point=True, 384 | rounding_mode=rm, 385 | ins_addr=manager.ins_addr, 386 | vex_block_addr=manager.block_addr, 387 | vex_stmt_idx=manager.vex_stmt_idx, 388 | bits=bits, 389 | ) 390 | 391 | raise TypeError( 392 | "Please figure out what kind of operation this is (smart money says fused multiply) and convert it into " 393 | "multiple binops" 394 | ) 395 | 396 | @staticmethod 397 | def Const(expr, manager): 398 | # pyvex.IRExpr.Const 399 | return Const( 400 | manager.next_atom(), 401 | None, 402 | expr.con.value, 403 | expr.result_size(manager.tyenv), 404 | ins_addr=manager.ins_addr, 405 | vex_block_addr=manager.block_addr, 406 | vex_stmt_idx=manager.vex_stmt_idx, 407 | ) 408 | 409 | @staticmethod 410 | def const_n(expr, manager): 411 | # pyvex.const.xxx 412 | return Const( 413 | manager.next_atom(), 414 | None, 415 | expr.value, 416 | expr.size, 417 | ins_addr=manager.ins_addr, 418 | vex_block_addr=manager.block_addr, 419 | vex_stmt_idx=manager.vex_stmt_idx, 420 | ) 421 | 422 | @staticmethod 423 | def ITE(expr, 
manager): 424 | cond = VEXExprConverter.convert(expr.cond, manager) 425 | iffalse = VEXExprConverter.convert(expr.iffalse, manager) 426 | iftrue = VEXExprConverter.convert(expr.iftrue, manager) 427 | 428 | return ITE( 429 | manager.next_atom(), 430 | cond, 431 | iffalse, 432 | iftrue, 433 | ins_addr=manager.ins_addr, 434 | vex_block_addr=manager.block_addr, 435 | vex_stmt_idx=manager.vex_stmt_idx, 436 | ) 437 | 438 | @staticmethod 439 | def CCall(expr: pyvex.IRExpr.CCall, manager): 440 | operands = [VEXExprConverter.convert(arg, manager) for arg in expr.args] 441 | return VEXCCallExpression( 442 | manager.next_atom(), 443 | expr.cee.name, 444 | operands, 445 | bits=expr.result_size(manager.tyenv), 446 | ins_addr=manager.ins_addr, 447 | vex_block_addr=manager.block_addr, 448 | vex_stmt_idx=manager.vex_stmt_idx, 449 | ) 450 | 451 | 452 | EXPRESSION_MAPPINGS = { 453 | pyvex.IRExpr.RdTmp: VEXExprConverter.RdTmp, 454 | pyvex.IRExpr.Get: VEXExprConverter.Get, 455 | pyvex.IRExpr.Unop: VEXExprConverter.Unop, 456 | pyvex.IRExpr.Binop: VEXExprConverter.Binop, 457 | pyvex.IRExpr.Triop: VEXExprConverter.Triop, 458 | pyvex.IRExpr.Const: VEXExprConverter.Const, 459 | pyvex.const.U32: VEXExprConverter.const_n, 460 | pyvex.const.U64: VEXExprConverter.const_n, 461 | pyvex.IRExpr.Load: VEXExprConverter.Load, 462 | pyvex.IRExpr.ITE: VEXExprConverter.ITE, 463 | pyvex.IRExpr.CCall: VEXExprConverter.CCall, 464 | } 465 | 466 | 467 | class VEXStmtConverter(Converter): 468 | @staticmethod 469 | def convert(idx, stmt, manager): # pylint:disable=arguments-differ 470 | """ 471 | 472 | :param idx: 473 | :param stmt: 474 | :param manager: 475 | :return: 476 | """ 477 | 478 | try: 479 | func = STATEMENT_MAPPINGS[type(stmt)] 480 | except KeyError: 481 | dirty = DirtyExpression(manager.next_atom(), str(stmt), [], bits=0) 482 | return DirtyStatement(idx, dirty, ins_addr=manager.ins_addr) 483 | 484 | return func(idx, stmt, manager) 485 | 486 | @staticmethod 487 | def WrTmp(idx, stmt, manager): 488 | 
var = VEXExprConverter.tmp(stmt.tmp, stmt.data.result_size(manager.tyenv), manager) 489 | reg = VEXExprConverter.convert(stmt.data, manager) 490 | 491 | return Assignment( 492 | idx, 493 | var, 494 | reg, 495 | ins_addr=manager.ins_addr, 496 | vex_block_addr=manager.block_addr, 497 | vex_stmt_idx=manager.vex_stmt_idx, 498 | ) 499 | 500 | @staticmethod 501 | def Put(idx, stmt, manager): 502 | data = VEXExprConverter.convert(stmt.data, manager) 503 | reg = VEXExprConverter.register(stmt.offset, data.bits, manager) 504 | return Assignment( 505 | idx, 506 | reg, 507 | data, 508 | ins_addr=manager.ins_addr, 509 | vex_block_addr=manager.block_addr, 510 | vex_stmt_idx=manager.vex_stmt_idx, 511 | ) 512 | 513 | @staticmethod 514 | def Store(idx, stmt, manager): 515 | return Store( 516 | idx, 517 | VEXExprConverter.convert(stmt.addr, manager), 518 | VEXExprConverter.convert(stmt.data, manager), 519 | stmt.data.result_size(manager.tyenv) // 8, 520 | stmt.endness, 521 | ins_addr=manager.ins_addr, 522 | vex_block_addr=manager.block_addr, 523 | vex_stmt_idx=manager.vex_stmt_idx, 524 | ) 525 | 526 | @staticmethod 527 | def Exit(idx, stmt, manager): 528 | if stmt.jumpkind in { 529 | "Ijk_EmWarn", 530 | "Ijk_NoDecode", 531 | "Ijk_MapFail", 532 | "Ijk_NoRedir", 533 | "Ijk_SigTRAP", 534 | "Ijk_SigSEGV", 535 | "Ijk_ClientReq", 536 | "Ijk_SigFPE_IntDiv", 537 | }: 538 | raise SkipConversionNotice() 539 | 540 | return ConditionalJump( 541 | idx, 542 | VEXExprConverter.convert(stmt.guard, manager), 543 | VEXExprConverter.convert(stmt.dst, manager), 544 | None, # it will be filled in right afterwards 545 | ins_addr=manager.ins_addr, 546 | vex_block_addr=manager.block_addr, 547 | vex_stmt_idx=manager.vex_stmt_idx, 548 | ) 549 | 550 | @staticmethod 551 | def LoadG(idx, stmt: pyvex.IRStmt.LoadG, manager): 552 | sizes = { 553 | "ILGop_Ident32": (32, 32, False), 554 | "ILGop_Ident64": (64, 64, False), 555 | "ILGop_IdentV128": (128, 128, False), 556 | "ILGop_8Uto32": (8, 32, False), 557 | 
"ILGop_8Sto32": (8, 32, True), 558 | "ILGop_16Uto32": (16, 32, False), 559 | "ILGop_16Sto32": (16, 32, True), 560 | } 561 | 562 | dst = VEXExprConverter.tmp(stmt.dst, manager.tyenv.sizeof(stmt.dst), manager) 563 | load_bits, convert_bits, signed = sizes[stmt.cvt] 564 | src = Load( 565 | manager.next_atom(), 566 | VEXExprConverter.convert(stmt.addr, manager), 567 | load_bits // 8, 568 | stmt.end, 569 | guard=VEXExprConverter.convert(stmt.guard, manager), 570 | alt=VEXExprConverter.convert(stmt.alt, manager), 571 | ) 572 | if convert_bits != load_bits: 573 | src = Convert(manager.next_atom(), load_bits, convert_bits, signed, src) 574 | 575 | return Assignment( 576 | idx, 577 | dst, 578 | src, 579 | ins_addr=manager.ins_addr, 580 | vex_block_addr=manager.block_addr, 581 | vex_stmt_idx=manager.vex_stmt_idx, 582 | ) 583 | 584 | @staticmethod 585 | def StoreG(idx, stmt: pyvex.IRStmt.StoreG, manager): 586 | return Store( 587 | idx, 588 | VEXExprConverter.convert(stmt.addr, manager), 589 | VEXExprConverter.convert(stmt.data, manager), 590 | stmt.data.result_size(manager.tyenv) // 8, 591 | stmt.endness, 592 | guard=VEXExprConverter.convert(stmt.guard, manager), 593 | ins_addr=manager.ins_addr, 594 | vex_block_addr=manager.block_addr, 595 | vex_stmt_idx=manager.vex_stmt_idx, 596 | ) 597 | 598 | @staticmethod 599 | def CAS(idx, stmt: pyvex.IRStmt.CAS, manager): 600 | # addr 601 | addr = VEXExprConverter.convert(stmt.addr, manager) 602 | data_lo = VEXExprConverter.convert(stmt.dataLo, manager) 603 | data_hi = VEXExprConverter.convert(stmt.dataHi, manager) if stmt.dataHi is not None else None 604 | expd_lo = VEXExprConverter.convert(stmt.expdLo, manager) 605 | expd_hi = VEXExprConverter.convert(stmt.expdHi, manager) if stmt.expdHi is not None else None 606 | old_lo = VEXExprConverter.tmp(stmt.oldLo, manager.tyenv.sizeof(stmt.oldLo), manager) 607 | old_hi = ( 608 | VEXExprConverter.tmp(stmt.oldHi, stmt.oldHi.result_size(manager.tyenv), manager) 609 | if stmt.oldHi != 0xFFFFFFFF 
610 | else None 611 | ) 612 | return CAS( 613 | idx, addr, data_lo, data_hi, expd_lo, expd_hi, old_lo, old_hi, stmt.endness, ins_addr=manager.ins_addr 614 | ) 615 | 616 | @staticmethod 617 | def Dirty(idx, stmt: pyvex.IRStmt.Dirty, manager): 618 | # we translate it into tmp = DirtyExpression() if possible 619 | 620 | operands = [VEXExprConverter.convert(op, manager) for op in stmt.args] 621 | guard = VEXExprConverter.convert(stmt.guard, manager) if stmt.guard is not None else None 622 | bits = manager.tyenv.sizeof(stmt.tmp) if stmt.tmp != 0xFFFFFFFF else 0 623 | maddr = VEXExprConverter.convert(stmt.mAddr, manager) if stmt.mAddr is not None else None 624 | dirty_expr = DirtyExpression( 625 | manager.next_atom(), 626 | stmt.cee.name, 627 | operands, 628 | guard=guard, 629 | mfx=stmt.mFx, 630 | maddr=maddr, 631 | msize=stmt.mSize, 632 | bits=bits, 633 | ) 634 | 635 | if stmt.tmp == 0xFFFFFFFF: 636 | return DirtyStatement( 637 | idx, 638 | dirty_expr, 639 | ins_addr=manager.ins_addr, 640 | vex_block_addr=manager.block_addr, 641 | vex_stmt_idx=manager.vex_stmt_idx, 642 | ) 643 | 644 | tmp = VEXExprConverter.tmp(stmt.tmp, bits, manager) 645 | return Assignment( 646 | idx, 647 | tmp, 648 | dirty_expr, 649 | ins_addr=manager.ins_addr, 650 | vex_block_addr=manager.block_addr, 651 | vex_stmt_idx=manager.vex_stmt_idx, 652 | ) 653 | 654 | 655 | STATEMENT_MAPPINGS = { 656 | pyvex.IRStmt.Put: VEXStmtConverter.Put, 657 | pyvex.IRStmt.WrTmp: VEXStmtConverter.WrTmp, 658 | pyvex.IRStmt.Store: VEXStmtConverter.Store, 659 | pyvex.IRStmt.Exit: VEXStmtConverter.Exit, 660 | pyvex.IRStmt.StoreG: VEXStmtConverter.StoreG, 661 | pyvex.IRStmt.LoadG: VEXStmtConverter.LoadG, 662 | pyvex.IRStmt.CAS: VEXStmtConverter.CAS, 663 | pyvex.IRStmt.Dirty: VEXStmtConverter.Dirty, 664 | } 665 | 666 | 667 | class VEXIRSBConverter(Converter): 668 | @staticmethod 669 | def convert(irsb, manager): # pylint:disable=arguments-differ 670 | """ 671 | 672 | :param irsb: 673 | :param manager: 674 | :return: 675 | 
""" 676 | 677 | # convert each VEX statement into an AIL statement 678 | statements = [] 679 | idx = 0 680 | 681 | manager.tyenv = irsb.tyenv 682 | manager.block_addr = irsb.addr 683 | 684 | addr = irsb.addr 685 | first_imark = True 686 | 687 | conditional_jumps = [] 688 | 689 | for vex_stmt_idx, stmt in enumerate(irsb.statements): 690 | if type(stmt) is pyvex.IRStmt.IMark: 691 | if first_imark: 692 | # update block address 693 | addr = stmt.addr + stmt.delta 694 | first_imark = False 695 | manager.ins_addr = stmt.addr + stmt.delta 696 | continue 697 | if type(stmt) is pyvex.IRStmt.AbiHint: 698 | # TODO: How can we use AbiHint? 699 | continue 700 | 701 | manager.vex_stmt_idx = vex_stmt_idx 702 | try: 703 | converted = VEXStmtConverter.convert(idx, stmt, manager) 704 | if isinstance(converted, list): 705 | # got multiple statements 706 | statements.extend(converted) 707 | idx += len(converted) 708 | else: 709 | # got one statement 710 | statements.append(converted) 711 | if type(converted) is ConditionalJump: 712 | conditional_jumps.append(converted) 713 | idx += 1 714 | except SkipConversionNotice: 715 | pass 716 | 717 | manager.vex_stmt_idx = DEFAULT_STATEMENT 718 | if irsb.jumpkind == "Ijk_Call" or irsb.jumpkind.startswith("Ijk_Sys"): 719 | # FIXME: Move ret_expr and fp_ret_expr creation into angr because we cannot reliably determine which 720 | # expressions can be returned from the call without performing further analysis 721 | ret_reg_offset = manager.arch.ret_offset 722 | ret_expr = Register( 723 | manager.next_atom(), 724 | None, 725 | ret_reg_offset, 726 | manager.arch.bits, 727 | reg_name=manager.arch.translate_register_name(ret_reg_offset, size=manager.arch.bits), 728 | ins_addr=manager.ins_addr, 729 | vex_block_addr=manager.block_addr, 730 | vex_stmt_idx=DEFAULT_STATEMENT, 731 | ) 732 | fp_ret_reg_offset = manager.arch.fp_ret_offset 733 | if fp_ret_reg_offset is not None and fp_ret_reg_offset != ret_reg_offset: 734 | fp_ret_expr = Register( 735 | 
manager.next_atom(), 736 | None, 737 | fp_ret_reg_offset, 738 | manager.arch.bits, 739 | reg_name=manager.arch.translate_register_name(fp_ret_reg_offset, size=manager.arch.bits), 740 | ins_addr=manager.ins_addr, 741 | vex_block_addr=manager.block_addr, 742 | vex_stmt_idx=DEFAULT_STATEMENT, 743 | ) 744 | else: 745 | fp_ret_expr = None 746 | 747 | if irsb.jumpkind == "Ijk_Call": 748 | target = VEXExprConverter.convert(irsb.next, manager) 749 | elif irsb.jumpkind.startswith("Ijk_Sys"): 750 | # FIXME: This is a hack to make syscall work. We should have a better way to handle syscalls. 751 | target = DirtyExpression(manager.next_atom(), "syscall", [], bits=manager.arch.bits) 752 | else: 753 | raise NotImplementedError("Unsupported jumpkind") 754 | 755 | statements.append( 756 | Call( 757 | manager.next_atom(), 758 | target, 759 | ret_expr=ret_expr, 760 | fp_ret_expr=fp_ret_expr, 761 | ins_addr=manager.ins_addr, 762 | vex_block_addr=manager.block_addr, 763 | vex_stmt_idx=DEFAULT_STATEMENT, 764 | ) 765 | ) 766 | elif irsb.jumpkind == "Ijk_Boring": 767 | if conditional_jumps: 768 | # fill in the false target 769 | cond_jump = conditional_jumps[-1] 770 | cond_jump.false_target = VEXExprConverter.convert(irsb.next, manager) 771 | 772 | else: 773 | # jump 774 | statements.append( 775 | Jump( 776 | manager.next_atom(), 777 | VEXExprConverter.convert(irsb.next, manager), 778 | ins_addr=manager.ins_addr, 779 | vex_block_addr=manager.block_addr, 780 | vex_stmt_idx=DEFAULT_STATEMENT, 781 | ) 782 | ) 783 | elif irsb.jumpkind == "Ijk_Ret": 784 | # return 785 | statements.append( 786 | Return( 787 | manager.next_atom(), 788 | [], 789 | ins_addr=manager.ins_addr, 790 | vex_block_addr=manager.block_addr, 791 | vex_stmt_idx=DEFAULT_STATEMENT, 792 | ) 793 | ) 794 | else: 795 | raise NotImplementedError("Unsupported jumpkind") 796 | 797 | return Block(addr, irsb.size, statements=statements) 798 | -------------------------------------------------------------------------------- 
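Note on `VEXExprConverter.Binop` in the file above: it rewrites an `Add` whose second operand is a constant with its sign bit set into a `Sub` of `(1 << bits) - value`, and it lowers `DivMod` into a `Concat` of the truncated remainder and the truncated quotient. The following is a minimal sketch of that arithmetic on plain unsigned integers. The helper names `add_to_sub` and `divmod_concat` are hypothetical and are not part of ailment, and the sketch assumes that `Concat` places its first operand in the high bits.

```python
def add_to_sub(value: int, bits: int) -> tuple[str, int]:
    """Mirror the Add -> Sub rewrite for a constant operand of width `bits`."""
    sign_bit = value >> (bits - 1)
    if sign_bit == 1:
        # x + value == x - (2**bits - value)  (mod 2**bits)
        return "Sub", (1 << bits) - value
    return "Add", value


def divmod_concat(dividend: int, divisor: int, out_bits: int) -> int:
    """Pack the remainder (high half) and quotient (low half) into one value.

    Assumption: the first Concat operand occupies the high bits.
    """
    chunk_bits = out_bits // 2
    mask = (1 << chunk_bits) - 1
    quotient = (dividend // divisor) & mask
    remainder = (dividend % divisor) & mask
    return (remainder << chunk_bits) | quotient
```

For example, a 32-bit constant `0xFFFFFFFF` is treated as "subtract 1", and `divmod_concat(17, 5, 64)` packs remainder 2 above quotient 3.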
/ailment/expression.py: -------------------------------------------------------------------------------- 1 | # pylint:disable=arguments-renamed,isinstance-second-argument-not-valid-type,missing-class-docstring,too-many-boolean-expressions 2 | from __future__ import annotations 3 | from typing import TYPE_CHECKING, cast 4 | from collections.abc import Sequence 5 | from enum import Enum, IntEnum 6 | from abc import abstractmethod 7 | from typing_extensions import Self 8 | 9 | 10 | try: 11 | import claripy 12 | except ImportError: 13 | claripy = None 14 | 15 | from .tagged_object import TaggedObject 16 | from .utils import get_bits, stable_hash, is_none_or_likeable, is_none_or_matchable 17 | 18 | if TYPE_CHECKING: 19 | from .statement import Statement 20 | 21 | 22 | class Expression(TaggedObject): 23 | """ 24 | The base class of all AIL expressions. 25 | """ 26 | 27 | bits: int 28 | 29 | __slots__ = ( 30 | "bits", 31 | "depth", 32 | ) 33 | 34 | def __init__(self, idx, depth, **kwargs): 35 | super().__init__(idx, **kwargs) 36 | self.depth = depth 37 | 38 | @abstractmethod 39 | def __repr__(self): 40 | raise NotImplementedError() 41 | 42 | def has_atom(self, atom, identity=True): 43 | if identity: 44 | return self is atom 45 | else: 46 | return self.likes(atom) 47 | 48 | def __eq__(self, other): 49 | if self is other: 50 | return True 51 | return type(self) is type(other) and self.likes(other) and self.idx == other.idx 52 | 53 | @abstractmethod 54 | def likes(self, other): # pylint:disable=unused-argument,no-self-use 55 | raise NotImplementedError() 56 | 57 | @abstractmethod 58 | def matches(self, other): # pylint:disable=unused-argument,no-self-use 59 | raise NotImplementedError() 60 | 61 | def replace(self, old_expr: Expression, new_expr: Expression) -> tuple[bool, Self]: 62 | if self is old_expr: 63 | r = True 64 | replaced = cast(Self, new_expr) 65 | elif not isinstance(self, Atom): 66 | r, replaced = self.replace(old_expr, new_expr) 67 | else: 68 | r, replaced = 
False, self 69 | 70 | return r, replaced 71 | 72 | def __add__(self, other): 73 | return BinaryOp(None, "Add", [self, other], signed=False, **self.tags) 74 | 75 | def __sub__(self, other): 76 | return BinaryOp(None, "Sub", [self, other], signed=False, **self.tags) 77 | 78 | 79 | class Atom(Expression): 80 | __slots__ = ( 81 | "variable", 82 | "variable_offset", 83 | ) 84 | 85 | def __init__(self, idx: int | None, variable=None, variable_offset=0, **kwargs): 86 | super().__init__(idx, 0, **kwargs) 87 | self.variable = variable 88 | self.variable_offset = variable_offset 89 | 90 | def __repr__(self) -> str: 91 | return "Atom (%d)" % self.idx 92 | 93 | def copy(self) -> Self: # pylint:disable=no-self-use 94 | raise NotImplementedError() 95 | 96 | 97 | class Const(Atom): 98 | __slots__ = ("value",) 99 | 100 | def __init__(self, idx: int | None, variable, value: int | float, bits: int, **kwargs): 101 | super().__init__(idx, variable, **kwargs) 102 | 103 | self.value = value 104 | self.bits = bits 105 | 106 | @property 107 | def size(self): 108 | return self.bits // 8 109 | 110 | def __repr__(self): 111 | return str(self) 112 | 113 | def __str__(self): 114 | if isinstance(self.value, int): 115 | return "%#x<%d>" % (self.value, self.bits) 116 | elif isinstance(self.value, float): 117 | return "%f<%d>" % (self.value, self.bits) 118 | else: 119 | return f"{self.value}<{self.bits}>" 120 | 121 | def likes(self, other): 122 | # nan is nan, but nan != nan 123 | return ( 124 | type(self) is type(other) 125 | and (self.value is other.value or self.value == other.value) 126 | and self.bits == other.bits 127 | ) 128 | 129 | matches = likes 130 | __hash__ = TaggedObject.__hash__ # type: ignore 131 | 132 | def _hash_core(self): 133 | return stable_hash((self.value, self.bits)) 134 | 135 | @property 136 | def sign_bit(self): 137 | if not self.is_int: 138 | raise TypeError("Sign bit is only available for int constants.") 139 | assert isinstance(self.value, int) 140 | return self.value 
>> (self.bits - 1) 141 | 142 | def copy(self) -> Const: 143 | return Const(self.idx, self.variable, self.value, self.bits, **self.tags) 144 | 145 | @property 146 | def is_int(self) -> bool: 147 | return isinstance(self.value, int) 148 | 149 | 150 | class Tmp(Atom): 151 | __slots__ = ("tmp_idx",) 152 | 153 | def __init__(self, idx: int | None, variable, tmp_idx: int, bits, **kwargs): 154 | super().__init__(idx, variable, **kwargs) 155 | 156 | self.tmp_idx = tmp_idx 157 | self.bits = bits 158 | 159 | @property 160 | def size(self): 161 | return self.bits // 8 162 | 163 | def __repr__(self): 164 | return str(self) 165 | 166 | def __str__(self): 167 | return "t%d" % self.tmp_idx 168 | 169 | def likes(self, other): 170 | return type(self) is type(other) and self.tmp_idx == other.tmp_idx and self.bits == other.bits 171 | 172 | matches = likes 173 | __hash__ = TaggedObject.__hash__ # type: ignore 174 | 175 | def _hash_core(self): 176 | return stable_hash(("tmp", self.tmp_idx, self.bits)) 177 | 178 | def copy(self) -> Tmp: 179 | return Tmp(self.idx, self.variable, self.tmp_idx, self.bits, **self.tags) 180 | 181 | 182 | class Register(Atom): 183 | __slots__ = ("reg_offset",) 184 | 185 | def __init__(self, idx: int | None, variable, reg_offset: int, bits: int, **kwargs): 186 | super().__init__(idx, variable, **kwargs) 187 | 188 | self.reg_offset = reg_offset 189 | self.bits = bits 190 | 191 | @property 192 | def size(self): 193 | return self.bits // 8 194 | 195 | def likes(self, other): 196 | return type(self) is type(other) and self.reg_offset == other.reg_offset and self.bits == other.bits 197 | 198 | def __repr__(self): 199 | return str(self) 200 | 201 | def __str__(self): 202 | if hasattr(self, "reg_name"): 203 | return "%s<%d>" % (self.reg_name, self.bits // 8) 204 | if self.variable is None: 205 | return "reg_%d<%d>" % (self.reg_offset, self.bits // 8) 206 | else: 207 | return "%s" % str(self.variable.name) 208 | 209 | matches = likes 210 | __hash__ = 
TaggedObject.__hash__ # type: ignore 211 | 212 | def _hash_core(self): 213 | return stable_hash(("reg", self.reg_offset, self.bits, self.idx)) 214 | 215 | def copy(self) -> Register: 216 | return Register(self.idx, self.variable, self.reg_offset, self.bits, **self.tags) 217 | 218 | 219 | class VirtualVariableCategory(IntEnum): 220 | REGISTER = 0 221 | STACK = 1 222 | MEMORY = 2 223 | PARAMETER = 3 224 | TMP = 4 225 | UNKNOWN = 5 226 | 227 | 228 | class VirtualVariable(Atom): 229 | 230 | __slots__ = ( 231 | "varid", 232 | "category", 233 | "oident", 234 | ) 235 | 236 | def __init__( 237 | self, 238 | idx, 239 | varid: int, 240 | bits, 241 | category: VirtualVariableCategory, 242 | oident: int | str | tuple | None = None, 243 | **kwargs, 244 | ): 245 | super().__init__(idx, **kwargs) 246 | 247 | self.varid = varid 248 | self.category = category 249 | self.oident = oident 250 | self.bits = bits 251 | 252 | @property 253 | def size(self): 254 | return self.bits // 8 255 | 256 | @property 257 | def was_reg(self) -> bool: 258 | return self.category == VirtualVariableCategory.REGISTER 259 | 260 | @property 261 | def was_stack(self) -> bool: 262 | return self.category == VirtualVariableCategory.STACK 263 | 264 | @property 265 | def was_parameter(self) -> bool: 266 | return self.category == VirtualVariableCategory.PARAMETER 267 | 268 | @property 269 | def was_tmp(self) -> bool: 270 | return self.category == VirtualVariableCategory.TMP 271 | 272 | @property 273 | def reg_offset(self) -> int: 274 | if self.was_reg: 275 | assert isinstance(self.oident, int) 276 | return self.oident 277 | elif self.was_parameter and self.parameter_category == VirtualVariableCategory.REGISTER: 278 | return self.parameter_reg_offset # type: ignore 279 | raise TypeError("Is not a register") 280 | 281 | @property 282 | def stack_offset(self) -> int: 283 | if self.was_stack: 284 | assert isinstance(self.oident, int) 285 | return self.oident 286 | elif self.was_parameter and self.parameter_category 
== VirtualVariableCategory.STACK: 287 | return self.parameter_stack_offset # type: ignore 288 | raise TypeError("Is not a stack variable") 289 | 290 | @property 291 | def tmp_idx(self) -> int | None: 292 | if self.was_tmp: 293 | assert isinstance(self.oident, int) 294 | return self.oident 295 | return None 296 | 297 | @property 298 | def parameter_category(self) -> VirtualVariableCategory | None: 299 | if self.was_parameter: 300 | assert isinstance(self.oident, tuple) 301 | return self.oident[0] 302 | return None 303 | 304 | @property 305 | def parameter_reg_offset(self) -> int | None: 306 | if self.was_parameter and self.parameter_category == VirtualVariableCategory.REGISTER: 307 | assert isinstance(self.oident, tuple) 308 | return self.oident[1] 309 | return None 310 | 311 | @property 312 | def parameter_stack_offset(self) -> int | None: 313 | if self.was_parameter and self.parameter_category == VirtualVariableCategory.STACK: 314 | assert isinstance(self.oident, tuple) 315 | return self.oident[1] 316 | return None 317 | 318 | def likes(self, other): 319 | return ( 320 | isinstance(other, VirtualVariable) 321 | and self.varid == other.varid 322 | and self.bits == other.bits 323 | and self.category == other.category 324 | and self.oident == other.oident 325 | ) 326 | 327 | def matches(self, other): 328 | return ( 329 | isinstance(other, VirtualVariable) 330 | and self.bits == other.bits 331 | and self.category == other.category 332 | and self.oident == other.oident 333 | ) 334 | 335 | def __repr__(self): 336 | ori_str = "" 337 | match self.category: 338 | case VirtualVariableCategory.REGISTER: 339 | ori_str = f"{{reg {self.reg_offset}}}" 340 | case VirtualVariableCategory.STACK: 341 | ori_str = f"{{stack {self.oident}}}" 342 | return f"vvar_{self.varid}{ori_str}" 343 | 344 | __hash__ = TaggedObject.__hash__ # type: ignore 345 | 346 | def _hash_core(self): 347 | return stable_hash(("var", self.varid, self.bits, self.category, self.oident)) 348 | 349 | def copy(self) 
-> VirtualVariable: 350 | return VirtualVariable( 351 | self.idx, 352 | self.varid, 353 | self.bits, 354 | self.category, 355 | oident=self.oident, 356 | variable=self.variable, 357 | variable_offset=self.variable_offset, 358 | **self.tags, 359 | ) 360 | 361 | 362 | class Phi(Atom): 363 | 364 | __slots__ = ("src_and_vvars",) 365 | 366 | def __init__( 367 | self, 368 | idx, 369 | bits, 370 | src_and_vvars: list[tuple[tuple[int, int | None], VirtualVariable | None]], 371 | **kwargs, 372 | ): 373 | super().__init__(idx, **kwargs) 374 | self.bits = bits 375 | self.src_and_vvars = src_and_vvars 376 | 377 | @property 378 | def size(self) -> int: 379 | return self.bits // 8 380 | 381 | @property 382 | def op(self) -> str: 383 | return "Phi" 384 | 385 | @property 386 | def verbose_op(self) -> str: 387 | return "Phi" 388 | 389 | def likes(self, other) -> bool: 390 | if isinstance(other, Phi) and self.bits == other.bits: 391 | self_src_and_vvarids = {(src, vvar.varid if vvar is not None else None) for src, vvar in self.src_and_vvars} 392 | other_src_and_vvarids = { 393 | (src, vvar.varid if vvar is not None else None) for src, vvar in other.src_and_vvars 394 | } 395 | return self_src_and_vvarids == other_src_and_vvarids 396 | return False 397 | 398 | def matches(self, other) -> bool: 399 | if isinstance(other, Phi) and self.bits == other.bits: 400 | if len(self.src_and_vvars) != len(other.src_and_vvars): 401 | return False 402 | self_src_and_vvars = dict(self.src_and_vvars) 403 | other_src_and_vvars = dict(other.src_and_vvars) 404 | for src, self_vvar in self_src_and_vvars.items(): 405 | if src not in other_src_and_vvars: 406 | return False 407 | other_vvar = other_src_and_vvars[src] 408 | if self_vvar is None and other_vvar is None: 409 | continue 410 | if ( 411 | self_vvar is None 412 | and other_vvar is not None 413 | or self_vvar is not None 414 | and other_vvar is None 415 | or (self_vvar is not None and other_vvar is not None and not self_vvar.matches(other_vvar)) 416 
| ): 417 | return False 418 | return True 419 | return False 420 | 421 | def __repr__(self): 422 | return f"𝜙@{self.bits}b {self.src_and_vvars}" 423 | 424 | __hash__ = TaggedObject.__hash__ # type: ignore 425 | 426 | def _hash_core(self): 427 | return stable_hash(("phi", self.bits, tuple(sorted(self.src_and_vvars, key=self._src_and_vvar_filter)))) 428 | 429 | def copy(self) -> Phi: 430 | return Phi( 431 | self.idx, 432 | self.bits, 433 | self.src_and_vvars[::], 434 | variable=self.variable, 435 | variable_offset=self.variable_offset, 436 | **self.tags, 437 | ) 438 | 439 | def replace(self, old_expr, new_expr): 440 | replaced = False 441 | new_src_and_vvars = [] 442 | for src, vvar in self.src_and_vvars: 443 | if vvar == old_expr and isinstance(new_expr, VirtualVariable): 444 | replaced = True 445 | new_src_and_vvars.append((src, new_expr)) 446 | else: 447 | new_src_and_vvars.append((src, vvar)) 448 | 449 | if replaced: 450 | return True, Phi( 451 | self.idx, 452 | self.bits, 453 | new_src_and_vvars, 454 | variable=self.variable, 455 | variable_offset=self.variable_offset, 456 | **self.tags, 457 | ) 458 | return False, self 459 | 460 | @staticmethod 461 | def _src_and_vvar_filter( 462 | src_and_vvar: tuple[tuple[int, int | None], VirtualVariable | None], 463 | ) -> tuple[tuple[int, int], int]: 464 | src, vvar = src_and_vvar 465 | if src[1] is None: 466 | src = src[0], -1 467 | vvar_id = vvar.varid if vvar is not None else -1 468 | return src, vvar_id # type: ignore 469 | 470 | 471 | class Op(Expression): 472 | __slots__ = ("op",) 473 | 474 | def __init__(self, idx, depth, op, **kwargs): 475 | super().__init__(idx, depth, **kwargs) 476 | self.op = op 477 | 478 | @property 479 | def verbose_op(self): 480 | return self.op 481 | 482 | 483 | class UnaryOp(Op): 484 | __slots__ = ( 485 | "operand", 486 | "variable", 487 | "variable_offset", 488 | ) 489 | 490 | def __init__( 491 | self, 492 | idx: int | None, 493 | op: str, 494 | operand: Expression, 495 | variable=None, 
496 | variable_offset: int | None = None, 497 | bits=None, 498 | **kwargs, 499 | ): 500 | super().__init__(idx, (operand.depth if isinstance(operand, Expression) else 0) + 1, op, **kwargs) 501 | 502 | self.operand = operand 503 | self.bits = operand.bits if bits is None else bits 504 | self.variable = variable 505 | self.variable_offset = variable_offset 506 | 507 | def __str__(self): 508 | return f"({self.op} {str(self.operand)})" 509 | 510 | def __repr__(self): 511 | return str(self) 512 | 513 | def likes(self, other): 514 | return ( 515 | type(other) is UnaryOp 516 | and self.op == other.op 517 | and self.bits == other.bits 518 | and self.operand.likes(other.operand) 519 | ) 520 | 521 | def matches(self, other): 522 | return ( 523 | type(other) is UnaryOp 524 | and self.op == other.op 525 | and self.bits == other.bits 526 | and self.operand.matches(other.operand) 527 | ) 528 | 529 | __hash__ = TaggedObject.__hash__ # type: ignore 530 | 531 | def _hash_core(self): 532 | return stable_hash((self.op, self.operand, self.bits)) 533 | 534 | def replace(self, old_expr, new_expr): 535 | if self.operand == old_expr: 536 | r = True 537 | replaced_operand = new_expr 538 | else: 539 | r, replaced_operand = self.operand.replace(old_expr, new_expr) 540 | 541 | if r: 542 | return True, UnaryOp(self.idx, self.op, replaced_operand, bits=self.bits, **self.tags) 543 | else: 544 | return False, self 545 | 546 | @property 547 | def operands(self): 548 | return [self.operand] 549 | 550 | @property 551 | def size(self): 552 | return self.bits // 8 553 | 554 | def copy(self) -> UnaryOp: 555 | return UnaryOp( 556 | self.idx, 557 | self.op, 558 | self.operand, 559 | variable=self.variable, 560 | variable_offset=self.variable_offset, 561 | bits=self.bits, 562 | **self.tags, 563 | ) 564 | 565 | def has_atom(self, atom, identity=True): 566 | if super().has_atom(atom, identity=identity): 567 | return True 568 | return self.operand.has_atom(atom, identity=identity) 569 | 570 | 571 | class 
ConvertType(Enum): 572 | TYPE_INT = 0 573 | TYPE_FP = 1 574 | 575 | 576 | class Convert(UnaryOp): 577 | TYPE_INT = ConvertType.TYPE_INT 578 | TYPE_FP = ConvertType.TYPE_FP 579 | 580 | __slots__ = ( 581 | "from_bits", 582 | "to_bits", 583 | "is_signed", 584 | "from_type", 585 | "to_type", 586 | "rounding_mode", 587 | ) 588 | 589 | def __init__( 590 | self, 591 | idx: int | None, 592 | from_bits: int, 593 | to_bits: int, 594 | is_signed: bool, 595 | operand: Expression, 596 | from_type: ConvertType = TYPE_INT, 597 | to_type: ConvertType = TYPE_INT, 598 | rounding_mode=None, 599 | **kwargs, 600 | ): 601 | super().__init__(idx, "Convert", operand, **kwargs) 602 | 603 | self.from_bits = from_bits 604 | self.to_bits = to_bits 605 | # override the size 606 | self.bits = to_bits 607 | self.is_signed = is_signed 608 | self.from_type = from_type 609 | self.to_type = to_type 610 | self.rounding_mode = rounding_mode 611 | 612 | def __str__(self): 613 | return "Conv(%d->%s%d, %s)" % (self.from_bits, "s" if self.is_signed else "", self.to_bits, self.operand) 614 | 615 | def __repr__(self): 616 | return str(self) 617 | 618 | def likes(self, other): 619 | return ( 620 | type(other) is Convert 621 | and self.from_bits == other.from_bits 622 | and self.to_bits == other.to_bits 623 | and self.bits == other.bits 624 | and self.is_signed == other.is_signed 625 | and self.operand.likes(other.operand) 626 | and self.from_type == other.from_type 627 | and self.to_type == other.to_type 628 | and self.rounding_mode == other.rounding_mode 629 | ) 630 | 631 | def matches(self, other): 632 | return ( 633 | type(other) is Convert 634 | and self.from_bits == other.from_bits 635 | and self.to_bits == other.to_bits 636 | and self.bits == other.bits 637 | and self.is_signed == other.is_signed 638 | and self.operand.matches(other.operand) 639 | and self.from_type == other.from_type 640 | and self.to_type == other.to_type 641 | and self.rounding_mode == other.rounding_mode 642 | ) 643 | 644 | 
__hash__ = TaggedObject.__hash__ # type: ignore 645 | 646 | def _hash_core(self): 647 | return stable_hash( 648 | ( 649 | self.operand, 650 | self.from_bits, 651 | self.to_bits, 652 | self.bits, 653 | self.is_signed, 654 | self.from_type, 655 | self.to_type, 656 | self.rounding_mode, 657 | ) 658 | ) 659 | 660 | def replace(self, old_expr, new_expr): 661 | if self.operand == old_expr: 662 | r0 = True 663 | replaced_operand = new_expr 664 | else: 665 | r0, replaced_operand = self.operand.replace(old_expr, new_expr) 666 | 667 | if self.rounding_mode is not None: 668 | if self.rounding_mode.likes(old_expr): 669 | r1 = True 670 | replaced_rm = new_expr 671 | else: 672 | r1, replaced_rm = self.rounding_mode.replace(old_expr, new_expr) 673 | else: 674 | r1 = False 675 | replaced_rm = None 676 | 677 | if r0 or r1: 678 | return True, Convert( 679 | self.idx, 680 | self.from_bits, 681 | self.to_bits, 682 | self.is_signed, 683 | replaced_operand if replaced_operand is not None else self.operand, 684 | from_type=self.from_type, 685 | to_type=self.to_type, 686 | rounding_mode=replaced_rm if replaced_rm is not None else self.rounding_mode, 687 | **self.tags, 688 | ) 689 | else: 690 | return False, self 691 | 692 | def copy(self) -> Convert: 693 | return Convert( 694 | self.idx, 695 | self.from_bits, 696 | self.to_bits, 697 | self.is_signed, 698 | self.operand, 699 | from_type=self.from_type, 700 | to_type=self.to_type, 701 | rounding_mode=self.rounding_mode, 702 | **self.tags, 703 | ) 704 | 705 | 706 | class Reinterpret(UnaryOp): 707 | __slots__ = ( 708 | "from_bits", 709 | "from_type", 710 | "to_bits", 711 | "to_type", 712 | ) 713 | 714 | def __init__(self, idx, from_bits: int, from_type: str, to_bits: int, to_type: str, operand, **kwargs): 715 | super().__init__(idx, "Reinterpret", operand, **kwargs) 716 | 717 | assert (from_type == "I" and to_type == "F") or (from_type == "F" and to_type == "I") 718 | 719 | self.from_bits = from_bits 720 | self.from_type = from_type 721 | 
self.to_bits = to_bits 722 | self.to_type = to_type 723 | 724 | self.bits = self.to_bits 725 | 726 | def __str__(self): 727 | return f"Reinterpret({self.from_type}{self.from_bits}->{self.to_type}{self.to_bits}, {self.operand})" 728 | 729 | def __repr__(self): 730 | return str(self) 731 | 732 | def likes(self, other): 733 | return ( 734 | type(other) is Reinterpret 735 | and self.from_bits == other.from_bits 736 | and self.from_type == other.from_type 737 | and self.to_bits == other.to_bits 738 | and self.to_type == other.to_type 739 | and self.operand.likes(other.operand) 740 | ) 741 | 742 | def matches(self, other): 743 | return ( 744 | type(other) is Reinterpret 745 | and self.from_bits == other.from_bits 746 | and self.from_type == other.from_type 747 | and self.to_bits == other.to_bits 748 | and self.to_type == other.to_type 749 | and self.operand.matches(other.operand) 750 | ) 751 | 752 | __hash__ = TaggedObject.__hash__ # type: ignore 753 | 754 | def _hash_core(self): 755 | return stable_hash( 756 | ( 757 | self.operand, 758 | self.from_bits, 759 | self.from_type, 760 | self.to_bits, 761 | self.to_type, 762 | ) 763 | ) 764 | 765 | def replace(self, old_expr, new_expr): 766 | if self.operand == old_expr: 767 | r = True 768 | replaced_operand = new_expr 769 | else: 770 | r, replaced_operand = self.operand.replace(old_expr, new_expr) 771 | 772 | if r: 773 | return True, Reinterpret( 774 | self.idx, self.from_bits, self.from_type, self.to_bits, self.to_type, replaced_operand, **self.tags 775 | ) 776 | else: 777 | return False, self 778 | 779 | def copy(self) -> Reinterpret: 780 | return Reinterpret( 781 | self.idx, self.from_bits, self.from_type, self.to_bits, self.to_type, self.operand, **self.tags 782 | ) 783 | 784 | 785 | class BinaryOp(Op): 786 | __slots__ = ( 787 | "operands", 788 | "variable", 789 | "variable_offset", 790 | "floating_point", 791 | "rounding_mode", 792 | "signed", 793 | "vector_count", 794 | "vector_size", 795 | ) 796 | 797 | OPSTR_MAP = { 
798 | "Add": "+", 799 | "AddF": "+", 800 | "AddV": "+", 801 | "Sub": "-", 802 | "SubF": "-", 803 | "Mul": "*", 804 | "MulF": "*", 805 | "MulV": "*", 806 | "Div": "/", 807 | "DivF": "/", 808 | "Mod": "%", 809 | "Xor": "^", 810 | "And": "&", 811 | "LogicalAnd": "&&", 812 | "Or": "|", 813 | "LogicalOr": "||", 814 | "Shl": "<<", 815 | "Shr": ">>", 816 | "Sar": ">>a", 817 | "CmpF": "CmpF", 818 | "CmpEQ": "==", 819 | "CmpNE": "!=", 820 | "CmpLT": "<", 821 | "CmpLE": "<=", 822 | "CmpGT": ">", 823 | "CmpGE": ">=", 824 | "CmpLT (signed)": "<s", 825 | "CmpLE (signed)": "<=s", 826 | "CmpGT (signed)": ">s", 827 | "CmpGE (signed)": ">=s", 828 | "Concat": "CONCAT", 829 | "Ror": "ROR", 830 | "Rol": "ROL", 831 | "Carry": "CARRY", 832 | "SCarry": "SCARRY", 833 | "SBorrow": "SBORROW", 834 | } 835 | 836 | COMPARISON_NEGATION = { 837 | "CmpEQ": "CmpNE", 838 | "CmpNE": "CmpEQ", 839 | "CmpLT": "CmpGE", 840 | "CmpGE": "CmpLT", 841 | "CmpLE": "CmpGT", 842 | "CmpGT": "CmpLE", 843 | } 844 | 845 | def __init__( 846 | self, 847 | idx: int | None, 848 | op: str, 849 | operands: Sequence[Expression], 850 | signed: bool = False, 851 | *, 852 | variable=None, 853 | variable_offset=None, 854 | bits=None, 855 | floating_point=False, 856 | rounding_mode=None, 857 | vector_count: int | None = None, 858 | vector_size: int | None = None, 859 | **kwargs, 860 | ): 861 | depth = ( 862 | max( 863 | operands[0].depth if isinstance(operands[0], Expression) else 0, 864 | operands[1].depth if isinstance(operands[1], Expression) else 0, 865 | ) 866 | + 1 867 | ) 868 | 869 | super().__init__(idx, depth, op, **kwargs) 870 | 871 | assert len(operands) == 2 872 | self.operands = operands 873 | 874 | if bits is not None: 875 | self.bits = bits 876 | elif self.op == "CmpF": 877 | self.bits = 32 # floating point comparison 878 | elif self.op in { 879 | "CmpEQ", 880 | "CmpNE", 881 | "CmpLT", 882 | "CmpGE", 883 | "CmpLE", 884 | "CmpGT", 885 | "ExpCmpNE", 886 | }: 887 | self.bits = 1 888 | elif self.op in {"Carry", "SCarry", "SBorrow"}: 889 | self.bits = 8 890 | elif self.op ==
"Concat": 891 | self.bits = get_bits(operands[0]) + get_bits(operands[1]) 892 | elif self.op == "Mull": 893 | self.bits = get_bits(operands[0]) * 2 if not isinstance(operands[0], int) else get_bits(operands[1]) * 2 894 | else: 895 | self.bits = get_bits(operands[0]) if not isinstance(operands[0], int) else get_bits(operands[1]) 896 | self.signed = signed 897 | self.variable = variable 898 | self.variable_offset = variable_offset 899 | self.floating_point = floating_point 900 | self.rounding_mode: str | None = rounding_mode 901 | self.vector_count = vector_count 902 | self.vector_size = vector_size 903 | 904 | # TODO: sanity check of operands' sizes for some ops 905 | # assert self.bits == operands[1].bits 906 | 907 | def __str__(self): 908 | op_str = self.OPSTR_MAP.get(self.verbose_op, self.verbose_op) 909 | return f"({str(self.operands[0])} {op_str} {str(self.operands[1])})" 910 | 911 | def __repr__(self): 912 | return f"{self.verbose_op}({self.operands[0]}, {self.operands[1]})" 913 | 914 | def likes(self, other): 915 | return ( 916 | type(other) is BinaryOp 917 | and self.op == other.op 918 | and self.bits == other.bits 919 | and self.signed == other.signed 920 | and is_none_or_likeable(self.operands, other.operands, is_list=True) 921 | and self.floating_point == other.floating_point 922 | and self.rounding_mode == other.rounding_mode 923 | ) 924 | 925 | def matches(self, other): 926 | return ( 927 | type(other) is BinaryOp 928 | and self.op == other.op 929 | and self.bits == other.bits 930 | and self.signed == other.signed 931 | and is_none_or_matchable(self.operands, other.operands, is_list=True) 932 | and self.floating_point == other.floating_point 933 | and self.rounding_mode == other.rounding_mode 934 | ) 935 | 936 | __hash__ = TaggedObject.__hash__ # type: ignore 937 | 938 | def _hash_core(self): 939 | return stable_hash( 940 | (self.op, tuple(self.operands), self.bits, self.signed, self.floating_point, self.rounding_mode) 941 | ) 942 | 943 | def 
has_atom(self, atom, identity=True): 944 | if super().has_atom(atom, identity=identity): 945 | return True 946 | 947 | for op in self.operands: 948 | if identity and op == atom: 949 | return True 950 | if not identity and isinstance(op, Expression) and op.likes(atom): 951 | return True 952 | if isinstance(op, Expression) and op.has_atom(atom, identity=identity): 953 | return True 954 | 955 | if self.rounding_mode is not None: 956 | if identity and self.rounding_mode == atom: 957 | return True 958 | if not identity and isinstance(self.rounding_mode, Atom) and self.rounding_mode.likes(atom): 959 | return True 960 | if isinstance(self.rounding_mode, Atom) and self.rounding_mode.has_atom(atom, identity=identity): 961 | return True 962 | 963 | return False 964 | 965 | def replace(self, old_expr: Expression, new_expr: Expression) -> tuple[bool, BinaryOp]: 966 | if self.operands[0] == old_expr: 967 | r0 = True 968 | replaced_operand_0 = new_expr 969 | elif isinstance(self.operands[0], Expression): 970 | r0, replaced_operand_0 = self.operands[0].replace(old_expr, new_expr) 971 | else: 972 | r0, replaced_operand_0 = False, new_expr 973 | 974 | if self.operands[1] == old_expr: 975 | r1 = True 976 | replaced_operand_1 = new_expr 977 | elif isinstance(self.operands[1], Expression): 978 | r1, replaced_operand_1 = self.operands[1].replace(old_expr, new_expr) 979 | else: 980 | r1, replaced_operand_1 = False, new_expr 981 | 982 | r2, replaced_rm = False, None 983 | if self.rounding_mode is not None: 984 | if self.rounding_mode == old_expr: 985 | r2 = True 986 | replaced_rm = new_expr 987 | 988 | if r0 or r1: 989 | return True, BinaryOp( 990 | self.idx, 991 | self.op, 992 | [replaced_operand_0 if r0 else self.operands[0], replaced_operand_1 if r1 else self.operands[1]], 993 | signed=self.signed, 994 | bits=self.bits, 995 | floating_point=self.floating_point, 996 | rounding_mode=replaced_rm if r2 else self.rounding_mode, 997 | **self.tags, 998 | ) 999 | else: 1000 | return False, 
self 1001 | 1002 | @property 1003 | def verbose_op(self): 1004 | op = self.op 1005 | if self.floating_point: 1006 | op += " (float)" 1007 | else: 1008 | if self.signed: 1009 | op += " (signed)" 1010 | return op 1011 | 1012 | @property 1013 | def size(self): 1014 | return self.bits // 8 1015 | 1016 | def copy(self) -> BinaryOp: 1017 | return BinaryOp( 1018 | self.idx, 1019 | self.op, 1020 | self.operands[::], 1021 | variable=self.variable, 1022 | signed=self.signed, 1023 | variable_offset=self.variable_offset, 1024 | bits=self.bits, 1025 | floating_point=self.floating_point, 1026 | rounding_mode=self.rounding_mode, 1027 | **self.tags, 1028 | ) 1029 | 1030 | 1031 | class Load(Expression): 1032 | __slots__ = ( 1033 | "addr", 1034 | "size", 1035 | "endness", 1036 | "variable", 1037 | "variable_offset", 1038 | "guard", 1039 | "alt", 1040 | ) 1041 | 1042 | def __init__( 1043 | self, 1044 | idx: int | None, 1045 | addr: Expression, 1046 | size: int, 1047 | endness: str, 1048 | variable=None, 1049 | variable_offset=None, 1050 | guard=None, 1051 | alt=None, 1052 | **kwargs, 1053 | ): 1054 | depth = max(addr.depth, size.depth if isinstance(size, Expression) else 0) + 1 1055 | super().__init__(idx, depth, **kwargs) 1056 | 1057 | self.addr = addr 1058 | self.size = size 1059 | self.endness = endness 1060 | self.guard = guard 1061 | self.alt = alt 1062 | self.variable = variable 1063 | self.variable_offset = variable_offset 1064 | self.bits = self.size * 8 1065 | 1066 | def __repr__(self): 1067 | return str(self) 1068 | 1069 | def __str__(self): 1070 | return "Load(addr=%s, size=%d, endness=%s)" % (self.addr, self.size, self.endness) 1071 | 1072 | def has_atom(self, atom, identity=True): 1073 | if super().has_atom(atom, identity=identity): 1074 | return True 1075 | 1076 | if claripy is not None and isinstance(self.addr, (int, claripy.ast.Base)): 1077 | return False 1078 | return self.addr.has_atom(atom, identity=identity) 1079 | 1080 | def replace(self, old_expr, new_expr): 
1081 | if self.addr == old_expr: 1082 | r = True 1083 | replaced_addr = new_expr 1084 | else: 1085 | r, replaced_addr = self.addr.replace(old_expr, new_expr) 1086 | 1087 | if r: 1088 | return True, Load(self.idx, replaced_addr, self.size, self.endness, **self.tags) 1089 | else: 1090 | return False, self 1091 | 1092 | def _likes_addr(self, other_addr): 1093 | if hasattr(self.addr, "likes") and hasattr(other_addr, "likes"): 1094 | return self.addr.likes(other_addr) 1095 | 1096 | return self.addr == other_addr 1097 | 1098 | def likes(self, other): 1099 | return ( 1100 | type(other) is Load 1101 | and self._likes_addr(other.addr) 1102 | and self.size == other.size 1103 | and self.endness == other.endness 1104 | and self.guard == other.guard 1105 | and self.alt == other.alt 1106 | ) 1107 | 1108 | def _matches_addr(self, other_addr): 1109 | if hasattr(self.addr, "matches") and hasattr(other_addr, "matches"): 1110 | return self.addr.matches(other_addr) 1111 | return self.addr == other_addr 1112 | 1113 | def matches(self, other): 1114 | return ( 1115 | type(other) is Load 1116 | and self._matches_addr(other.addr) 1117 | and self.size == other.size 1118 | and self.endness == other.endness 1119 | and self.guard == other.guard 1120 | and self.alt == other.alt 1121 | ) 1122 | 1123 | __hash__ = TaggedObject.__hash__ # type: ignore 1124 | 1125 | def _hash_core(self): 1126 | return stable_hash(("Load", self.addr, self.size, self.endness)) 1127 | 1128 | def copy(self) -> Load: 1129 | return Load( 1130 | self.idx, 1131 | self.addr, 1132 | self.size, 1133 | self.endness, 1134 | variable=self.variable, 1135 | variable_offset=self.variable_offset, 1136 | guard=self.guard, 1137 | alt=self.alt, 1138 | **self.tags, 1139 | ) 1140 | 1141 | 1142 | class ITE(Expression): 1143 | __slots__ = ( 1144 | "cond", 1145 | "iffalse", 1146 | "iftrue", 1147 | "variable", 1148 | "variable_offset", 1149 | ) 1150 | 1151 | def __init__( 1152 | self, 1153 | idx: int | None, 1154 | cond: Expression, 1155 | 
iffalse: Expression, 1156 | iftrue: Expression, 1157 | variable=None, 1158 | variable_offset=None, 1159 | **kwargs, 1160 | ): 1161 | depth = ( 1162 | max( 1163 | cond.depth if isinstance(cond, Expression) else 0, 1164 | iffalse.depth if isinstance(iffalse, Expression) else 0, 1165 | iftrue.depth if isinstance(iftrue, Expression) else 0, 1166 | ) 1167 | + 1 1168 | ) 1169 | super().__init__(idx, depth, **kwargs) 1170 | 1171 | self.cond = cond 1172 | self.iffalse = iffalse 1173 | self.iftrue = iftrue 1174 | self.bits = iftrue.bits 1175 | self.variable = variable 1176 | self.variable_offset = variable_offset 1177 | 1178 | def __repr__(self): 1179 | return str(self) 1180 | 1181 | def __str__(self): 1182 | return f"(({self.cond}) ? ({self.iftrue}) : ({self.iffalse}))" 1183 | 1184 | def likes(self, other): 1185 | return ( 1186 | type(other) is ITE 1187 | and self.cond.likes(other.cond) 1188 | and self.iffalse == other.iffalse 1189 | and self.iftrue == other.iftrue 1190 | and self.bits == other.bits 1191 | ) 1192 | 1193 | def matches(self, other): 1194 | return ( 1195 | type(other) is ITE 1196 | and self.cond.matches(other.cond) 1197 | and self.iffalse == other.iffalse 1198 | and self.iftrue == other.iftrue 1199 | and self.bits == other.bits 1200 | ) 1201 | 1202 | __hash__ = TaggedObject.__hash__ # type: ignore 1203 | 1204 | def _hash_core(self): 1205 | return stable_hash((ITE, self.cond, self.iffalse, self.iftrue, self.bits)) 1206 | 1207 | def has_atom(self, atom, identity=True): 1208 | if super().has_atom(atom, identity=identity): 1209 | return True 1210 | 1211 | return ( 1212 | self.cond.has_atom(atom, identity=identity) 1213 | or self.iftrue.has_atom(atom, identity=identity) 1214 | or self.iffalse.has_atom(atom, identity=identity) 1215 | ) 1216 | 1217 | def replace(self, old_expr, new_expr): 1218 | if self.cond == old_expr: 1219 | cond_replaced = True 1220 | new_cond = new_expr 1221 | else: 1222 | cond_replaced, new_cond = self.cond.replace(old_expr, new_expr) 1223 | 
1224 | if self.iffalse == old_expr: 1225 | iffalse_replaced = True 1226 | new_iffalse = new_expr 1227 | else: 1228 | iffalse_replaced, new_iffalse = self.iffalse.replace(old_expr, new_expr) 1229 | 1230 | if self.iftrue == old_expr: 1231 | iftrue_replaced = True 1232 | new_iftrue = new_expr 1233 | else: 1234 | iftrue_replaced, new_iftrue = self.iftrue.replace(old_expr, new_expr) 1235 | 1236 | replaced = cond_replaced or iftrue_replaced or iffalse_replaced 1237 | 1238 | if replaced: 1239 | return True, ITE(self.idx, new_cond, new_iffalse, new_iftrue, **self.tags) 1240 | else: 1241 | return False, self 1242 | 1243 | @property 1244 | def size(self): 1245 | return self.bits // 8 1246 | 1247 | def copy(self) -> ITE: 1248 | return ITE(self.idx, self.cond, self.iffalse, self.iftrue, **self.tags) 1249 | 1250 | 1251 | class DirtyExpression(Expression): 1252 | __slots__ = ( 1253 | "callee", 1254 | "guard", 1255 | "operands", 1256 | "mfx", 1257 | "maddr", 1258 | "msize", 1259 | ) 1260 | 1261 | def __init__( 1262 | self, 1263 | idx, 1264 | callee: str, 1265 | operands: list[Expression], 1266 | *, 1267 | guard: Expression | None = None, 1268 | mfx: str | None = None, 1269 | maddr: Expression | None = None, 1270 | msize: int | None = None, 1271 | # TODO: fxstate (guest state effects) is not modeled yet 1272 | bits: int, 1273 | **kwargs, 1274 | ): 1275 | super().__init__(idx, 1, **kwargs) 1276 | 1277 | self.callee = callee 1278 | self.guard = guard 1279 | self.operands = operands 1280 | self.mfx = mfx 1281 | self.maddr = maddr 1282 | self.msize = msize 1283 | self.bits = bits 1284 | 1285 | @property 1286 | def op(self) -> str: 1287 | return self.callee 1288 | 1289 | @property 1290 | def verbose_op(self) -> str: 1291 | return self.op 1292 | 1293 | def likes(self, other): 1294 | return ( 1295 | type(other) is DirtyExpression 1296 | and other.callee == self.callee 1297 | and is_none_or_likeable(other.guard, self.guard) 1298 | and len(self.operands) == len(other.operands) 1299 | and 
all(op1.likes(op2) for op1, op2 in zip(self.operands, other.operands)) 1300 | and other.mfx == self.mfx 1301 | and is_none_or_likeable(other.maddr, self.maddr) 1302 | and other.msize == self.msize 1303 | and self.bits == other.bits 1304 | ) 1305 | 1306 | def matches(self, other): 1307 | return ( 1308 | type(other) is DirtyExpression 1309 | and other.callee == self.callee 1310 | and is_none_or_matchable(other.guard, self.guard) 1311 | and len(self.operands) == len(other.operands) 1312 | and all(op1.matches(op2) for op1, op2 in zip(self.operands, other.operands)) 1313 | and other.mfx == self.mfx 1314 | and is_none_or_matchable(other.maddr, self.maddr) 1315 | and other.msize == self.msize 1316 | and self.bits == other.bits 1317 | ) 1318 | 1319 | __hash__ = TaggedObject.__hash__ # type: ignore 1320 | 1321 | def _hash_core(self): 1322 | return stable_hash( 1323 | ( 1324 | DirtyExpression, 1325 | self.callee, 1326 | self.guard, 1327 | tuple(self.operands), 1328 | self.mfx, 1329 | self.maddr, 1330 | self.msize, 1331 | self.bits, 1332 | ) 1333 | ) 1334 | 1335 | def __repr__(self): 1336 | return f"[D] {self.callee}({', '.join(repr(op) for op in self.operands)})" 1337 | 1338 | def __str__(self): 1339 | return f"[D] {self.callee}({', '.join(repr(op) for op in self.operands)})" 1340 | 1341 | def copy(self) -> DirtyExpression: 1342 | return DirtyExpression( 1343 | self.idx, 1344 | self.callee, 1345 | self.operands, 1346 | guard=self.guard, 1347 | mfx=self.mfx, 1348 | maddr=self.maddr, 1349 | msize=self.msize, 1350 | bits=self.bits, 1351 | **self.tags, 1352 | ) 1353 | 1354 | def replace(self, old_expr: Expression, new_expr: Expression): 1355 | new_operands = [] 1356 | replaced = False 1357 | for op in self.operands: 1358 | if old_expr == op: 1359 | replaced = True 1360 | new_operands.append(new_expr) 1361 | else: 1362 | r, new_op = op.replace(old_expr, new_expr) 1363 | if r: 1364 | replaced = True 1365 | new_operands.append(new_op) 1366 | else: 1367 | new_operands.append(op) 
1368 | 1369 | if replaced: 1370 | return True, DirtyExpression( 1371 | self.idx, 1372 | self.callee, 1373 | new_operands, 1374 | guard=self.guard, 1375 | mfx=self.mfx, 1376 | maddr=self.maddr, 1377 | msize=self.msize, 1378 | bits=self.bits, 1379 | **self.tags, 1380 | ) 1381 | else: 1382 | return False, self 1383 | 1384 | @property 1385 | def size(self): 1386 | if self.bits is None: 1387 | return None 1388 | return self.bits // 8 1389 | 1390 | 1391 | class VEXCCallExpression(Expression): 1392 | __slots__ = ( 1393 | "callee", 1394 | "operands", 1395 | ) 1396 | 1397 | def __init__(self, idx: int | None, callee: str, operands: tuple[Expression, ...], bits: int, **kwargs): 1398 | super().__init__(idx, max(operand.depth for operand in operands), **kwargs) 1399 | self.callee = callee 1400 | self.operands = operands 1401 | self.bits = bits 1402 | 1403 | @property 1404 | def op(self) -> str: 1405 | return self.callee 1406 | 1407 | @property 1408 | def verbose_op(self) -> str: 1409 | return self.op 1410 | 1411 | def likes(self, other): 1412 | return ( 1413 | type(other) is VEXCCallExpression 1414 | and other.callee == self.callee 1415 | and len(self.operands) == len(other.operands) 1416 | and self.bits == other.bits 1417 | and all(op1.likes(op2) for op1, op2 in zip(other.operands, self.operands)) 1418 | ) 1419 | 1420 | def matches(self, other): 1421 | return ( 1422 | type(other) is VEXCCallExpression 1423 | and other.callee == self.callee 1424 | and len(self.operands) == len(other.operands) 1425 | and self.bits == other.bits 1426 | and all(op1.matches(op2) for op1, op2 in zip(other.operands, self.operands)) 1427 | ) 1428 | 1429 | __hash__ = TaggedObject.__hash__ # type: ignore 1430 | 1431 | def _hash_core(self): 1432 | return stable_hash((VEXCCallExpression, self.callee, self.bits, tuple(self.operands))) 1433 | 1434 | def __repr__(self): 1435 | return f"VEXCCallExpression [{self.callee}({', '.join(repr(op) for op in self.operands)})]" 1436 | 1437 | def __str__(self): 1438 | 
operands_str = ", ".join(repr(op) for op in self.operands) 1439 | return f"{self.callee}({operands_str})" 1440 | 1441 | def copy(self) -> VEXCCallExpression: 1442 | return VEXCCallExpression(self.idx, self.callee, self.operands, bits=self.bits, **self.tags) 1443 | 1444 | def replace(self, old_expr, new_expr): 1445 | new_operands = [] 1446 | replaced = False 1447 | for operand in self.operands: 1448 | if operand is old_expr: 1449 | new_operands.append(new_expr) 1450 | replaced = True 1451 | else: 1452 | operand_replaced, new_operand = operand.replace(old_expr, new_expr) 1453 | if operand_replaced: 1454 | new_operands.append(new_operand) 1455 | replaced = True 1456 | else: 1457 | new_operands.append(operand) 1458 | 1459 | if replaced: 1460 | return True, VEXCCallExpression(self.idx, self.callee, tuple(new_operands), bits=self.bits, **self.tags) 1461 | else: 1462 | return False, self 1463 | 1464 | @property 1465 | def size(self): 1466 | if self.bits is None: 1467 | return None 1468 | return self.bits // 8 1469 | 1470 | 1471 | class MultiStatementExpression(Expression): 1472 | """ 1473 | Represents comma-separated statements followed by a final expression, as in a C comma expression. 
1474 | """ 1475 | 1476 | __slots__ = ( 1477 | "stmts", 1478 | "expr", 1479 | ) 1480 | 1481 | def __init__(self, idx: int | None, stmts: list[Statement], expr: Expression, **kwargs): 1482 | super().__init__(idx, expr.depth + 1, **kwargs) 1483 | self.stmts = stmts 1484 | self.expr = expr 1485 | self.bits = self.expr.bits 1486 | 1487 | __hash__ = TaggedObject.__hash__ # type: ignore 1488 | 1489 | def _hash_core(self): 1490 | return stable_hash((MultiStatementExpression,) + tuple(self.stmts) + (self.expr,)) 1491 | 1492 | def likes(self, other): 1493 | return ( 1494 | type(self) is type(other) 1495 | and len(self.stmts) == len(other.stmts) 1496 | and all(s_stmt.likes(o_stmt) for s_stmt, o_stmt in zip(self.stmts, other.stmts)) 1497 | and self.expr.likes(other.expr) 1498 | ) 1499 | 1500 | def matches(self, other): 1501 | return ( 1502 | type(self) is type(other) 1503 | and len(self.stmts) == len(other.stmts) 1504 | and all(s_stmt.matches(o_stmt) for s_stmt, o_stmt in zip(self.stmts, other.stmts)) 1505 | and self.expr.matches(other.expr) 1506 | ) 1507 | 1508 | def __repr__(self): 1509 | return f"MultiStatementExpression({self.stmts}, {self.expr})" 1510 | 1511 | def __str__(self): 1512 | stmts_str = [str(stmt) for stmt in self.stmts] 1513 | expr_str = str(self.expr) 1514 | concatenated_str = ", ".join(stmts_str + [expr_str]) 1515 | return f"({concatenated_str})" 1516 | 1517 | @property 1518 | def size(self): 1519 | return self.expr.size 1520 | 1521 | def replace(self, old_expr, new_expr): 1522 | replaced = False 1523 | 1524 | new_stmts = [] 1525 | for stmt in self.stmts: 1526 | r, new_stmt = stmt.replace(old_expr, new_expr) 1527 | new_stmts.append(new_stmt if new_stmt is not None else stmt) 1528 | replaced |= r 1529 | 1530 | if self.expr is old_expr: 1531 | replaced = True 1532 | new_expr_ = new_expr 1533 | else: 1534 | r, new_expr_ = self.expr.replace(old_expr, new_expr) 1535 | replaced |= r 1536 | 1537 | if replaced: 1538 | return True, MultiStatementExpression( 1539 | 
self.idx, new_stmts, new_expr_ if new_expr_ is not None else self.expr, **self.tags 1540 | ) 1541 | return False, self 1542 | 1543 | def copy(self) -> MultiStatementExpression: 1544 | return MultiStatementExpression(self.idx, self.stmts[::], self.expr, **self.tags) 1545 | 1546 | 1547 | # 1548 | # Special (Dummy) expressions 1549 | # 1550 | 1551 | 1552 | class BasePointerOffset(Expression): 1553 | __slots__ = ( 1554 | "base", 1555 | "offset", 1556 | "variable", 1557 | "variable_offset", 1558 | ) 1559 | 1560 | def __init__( 1561 | self, 1562 | idx: int | None, 1563 | bits: int, 1564 | base: Expression | str, 1565 | offset: int, 1566 | variable=None, 1567 | variable_offset=None, 1568 | **kwargs, 1569 | ): 1570 | super().__init__(idx, (offset.depth if isinstance(offset, Expression) else 0) + 1, **kwargs) 1571 | self.bits = bits 1572 | self.base = base 1573 | self.offset = offset 1574 | self.variable = variable 1575 | self.variable_offset = variable_offset 1576 | 1577 | @property 1578 | def size(self): 1579 | return self.bits // 8 1580 | 1581 | def __repr__(self): 1582 | if self.offset is None: 1583 | return "BaseOffset(%s)" % self.base 1584 | if isinstance(self.offset, int): 1585 | return "BaseOffset(%s, %d)" % (self.base, self.offset) 1586 | return f"BaseOffset({self.base}, {self.offset})" 1587 | 1588 | def __str__(self): 1589 | if self.offset is None: 1590 | return str(self.base) 1591 | if isinstance(self.offset, int): 1592 | return "%s%+d" % (self.base, self.offset) 1593 | return f"{self.base}+{self.offset}" 1594 | 1595 | def likes(self, other): 1596 | return ( 1597 | type(other) is type(self) 1598 | and self.bits == other.bits 1599 | and self.base == other.base 1600 | and self.offset == other.offset 1601 | ) 1602 | 1603 | matches = likes 1604 | __hash__ = TaggedObject.__hash__ # type: ignore 1605 | 1606 | def _hash_core(self): 1607 | return stable_hash((self.bits, self.base, self.offset)) 1608 | 1609 | def replace(self, old_expr, new_expr): 1610 | if 
isinstance(self.base, Expression): 1611 | base_replaced, new_base = self.base.replace(old_expr, new_expr) 1612 | else: 1613 | base_replaced, new_base = False, self.base 1614 | if isinstance(self.offset, Expression): 1615 | offset_replaced, new_offset = self.offset.replace(old_expr, new_expr) 1616 | else: 1617 | offset_replaced, new_offset = False, self.offset 1618 | 1619 | if base_replaced or offset_replaced: 1620 | return True, BasePointerOffset(self.idx, self.bits, new_base, new_offset, **self.tags) 1621 | return False, self 1622 | 1623 | def copy(self) -> BasePointerOffset: 1624 | return BasePointerOffset(self.idx, self.bits, self.base, self.offset, **self.tags) 1625 | 1626 | 1627 | class StackBaseOffset(BasePointerOffset): 1628 | __slots__ = () 1629 | 1630 | def __init__(self, idx: int | None, bits: int, offset: int, **kwargs): 1631 | # stack base offset is always signed 1632 | if offset >= (1 << (bits - 1)): 1633 | offset -= 1 << bits 1634 | super().__init__(idx, bits, "stack_base", offset, **kwargs) 1635 | 1636 | def copy(self) -> StackBaseOffset: 1637 | return StackBaseOffset(self.idx, self.bits, self.offset, **self.tags) 1638 | 1639 | 1640 | def negate(expr: Expression) -> Expression: 1641 | if isinstance(expr, UnaryOp) and expr.op == "Not": 1642 | # unpack 1643 | return expr.operand 1644 | if isinstance(expr, BinaryOp) and expr.op in BinaryOp.COMPARISON_NEGATION: 1645 | return BinaryOp( 1646 | expr.idx, 1647 | BinaryOp.COMPARISON_NEGATION[expr.op], 1648 | expr.operands, 1649 | signed=expr.signed, 1650 | bits=expr.bits, 1651 | floating_point=expr.floating_point, 1652 | rounding_mode=expr.rounding_mode, 1653 | **expr.tags, 1654 | ) 1655 | return UnaryOp(None, "Not", expr, **expr.tags) 1656 | -------------------------------------------------------------------------------- /ailment/manager.py: -------------------------------------------------------------------------------- 1 | import itertools 2 | 3 | 4 | class Manager: 5 | def __init__(self, name: str | None 
= None, arch=None): 6 | self.name = name 7 | self.arch = arch 8 | 9 | self.atom_ctr = itertools.count() 10 | 11 | self._ins_addr: int | None = None 12 | 13 | ### 14 | # vex specific 15 | ### 16 | self.vex_stmt_idx: int | None = None 17 | self.tyenv = None 18 | self.block_addr = None 19 | 20 | def next_atom(self): 21 | return next(self.atom_ctr) 22 | 23 | def reset(self): 24 | self.atom_ctr = itertools.count() 25 | 26 | @property 27 | def ins_addr(self) -> int | None: 28 | return self._ins_addr 29 | 30 | @ins_addr.setter 31 | def ins_addr(self, v): 32 | self._ins_addr = v 33 | -------------------------------------------------------------------------------- /ailment/py.typed: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/angr/ailment/761c9d7d2f1b7f9ed6f2652cbeea58742a30ab2e/ailment/py.typed -------------------------------------------------------------------------------- /ailment/statement.py: -------------------------------------------------------------------------------- 1 | # pylint:disable=isinstance-second-argument-not-valid-type,no-self-use,arguments-renamed,too-many-boolean-expressions 2 | from __future__ import annotations 3 | from typing import TYPE_CHECKING 4 | from collections.abc import Sequence 5 | from abc import ABC, abstractmethod 6 | from typing_extensions import Self 7 | 8 | try: 9 | import claripy 10 | except ImportError: 11 | claripy = None 12 | 13 | from .utils import stable_hash, is_none_or_likeable, is_none_or_matchable 14 | from .tagged_object import TaggedObject 15 | from .expression import Atom, Expression, DirtyExpression 16 | 17 | if TYPE_CHECKING: 18 | from angr.calling_conventions import SimCC 19 | 20 | 21 | class Statement(TaggedObject, ABC): 22 | """ 23 | The base class of all AIL statements. 
24 | """ 25 | 26 | __slots__ = () 27 | 28 | @abstractmethod 29 | def __repr__(self): 30 | raise NotImplementedError() 31 | 32 | @abstractmethod 33 | def __str__(self): 34 | raise NotImplementedError() 35 | 36 | @abstractmethod 37 | def replace(self, old_expr: Expression, new_expr: Expression) -> tuple[bool, Self]: 38 | raise NotImplementedError() 39 | 40 | def eq(self, expr0, expr1): # pylint:disable=no-self-use 41 | if claripy is not None and (isinstance(expr0, claripy.ast.Base) or isinstance(expr1, claripy.ast.Base)): 42 | return expr0 is expr1 43 | return expr0 == expr1 44 | 45 | @abstractmethod 46 | def likes(self, other) -> bool: # pylint:disable=unused-argument,no-self-use 47 | raise NotImplementedError() 48 | 49 | @abstractmethod 50 | def matches(self, other) -> bool: # pylint:disable=unused-argument,no-self-use 51 | raise NotImplementedError() 52 | 53 | 54 | class Assignment(Statement): 55 | """ 56 | Assignment statement: expr_a = expr_b 57 | """ 58 | 59 | __slots__ = ( 60 | "dst", 61 | "src", 62 | ) 63 | 64 | def __init__(self, idx: int | None, dst: Atom, src: Expression, **kwargs): 65 | super().__init__(idx, **kwargs) 66 | 67 | self.dst = dst 68 | self.src = src 69 | 70 | def __eq__(self, other): 71 | return type(other) is Assignment and self.idx == other.idx and self.dst == other.dst and self.src == other.src 72 | 73 | def likes(self, other): 74 | return type(other) is Assignment and self.dst.likes(other.dst) and self.src.likes(other.src) 75 | 76 | def matches(self, other): 77 | return type(other) is Assignment and self.dst.matches(other.dst) and self.src.matches(other.src) 78 | 79 | __hash__ = TaggedObject.__hash__ 80 | 81 | def _hash_core(self): 82 | return stable_hash((Assignment, self.idx, self.dst, self.src)) 83 | 84 | def __repr__(self): 85 | return f"Assignment ({self.dst}, {self.src})" 86 | 87 | def __str__(self): 88 | return f"{str(self.dst)} = {str(self.src)}" 89 | 90 | def replace(self, old_expr: Expression, new_expr: Expression): 91 | if 
self.dst == old_expr: 92 | r_dst = True 93 | assert isinstance(new_expr, Atom) 94 | replaced_dst = new_expr 95 | else: 96 | r_dst, replaced_dst = self.dst.replace(old_expr, new_expr) 97 | 98 | if self.src == old_expr: 99 | r_src = True 100 | replaced_src = new_expr 101 | else: 102 | r_src, replaced_src = self.src.replace(old_expr, new_expr) 103 | 104 | if r_dst or r_src: 105 | return True, Assignment(self.idx, replaced_dst, replaced_src, **self.tags) 106 | else: 107 | return False, self 108 | 109 | def copy(self) -> Assignment: 110 | return Assignment(self.idx, self.dst, self.src, **self.tags) 111 | 112 | 113 | class WeakAssignment(Statement): 114 | """ 115 | An assignment statement that does not create a new variable at its destination; It should be seen as 116 | operator=(&dst, &src) in C++-like syntax. 117 | """ 118 | 119 | __slots__ = ( 120 | "dst", 121 | "src", 122 | ) 123 | 124 | def __init__(self, idx: int | None, dst: Atom, src: Expression, **kwargs): 125 | super().__init__(idx, **kwargs) 126 | 127 | self.dst = dst 128 | self.src = src 129 | 130 | def __eq__(self, other): 131 | return ( 132 | type(other) is WeakAssignment and self.idx == other.idx and self.dst == other.dst and self.src == other.src 133 | ) 134 | 135 | def likes(self, other): 136 | return type(other) is WeakAssignment and self.dst.likes(other.dst) and self.src.likes(other.src) 137 | 138 | def matches(self, other): 139 | return type(other) is WeakAssignment and self.dst.matches(other.dst) and self.src.matches(other.src) 140 | 141 | __hash__ = TaggedObject.__hash__ 142 | 143 | def _hash_core(self): 144 | return stable_hash((WeakAssignment, self.idx, self.dst, self.src)) 145 | 146 | def __repr__(self): 147 | return f"WeakAssignment ({self.dst}, {self.src})" 148 | 149 | def __str__(self): 150 | return f"{str(self.dst)} =W {str(self.src)}" 151 | 152 | def replace(self, old_expr: Expression, new_expr: Expression): 153 | if self.dst == old_expr: 154 | r_dst = True 155 | assert isinstance(new_expr, 
Atom) 156 | replaced_dst = new_expr 157 | else: 158 | r_dst, replaced_dst = self.dst.replace(old_expr, new_expr) 159 | 160 | if self.src == old_expr: 161 | r_src = True 162 | replaced_src = new_expr 163 | else: 164 | r_src, replaced_src = self.src.replace(old_expr, new_expr) 165 | 166 | if r_dst or r_src: 167 | return True, WeakAssignment(self.idx, replaced_dst, replaced_src, **self.tags) 168 | else: 169 | return False, self 170 | 171 | def copy(self) -> WeakAssignment: 172 | return WeakAssignment(self.idx, self.dst, self.src, **self.tags) 173 | 174 | 175 | class Store(Statement): 176 | """ 177 | Store statement: *addr = data 178 | """ 179 | 180 | __slots__ = ( 181 | "addr", 182 | "size", 183 | "data", 184 | "endness", 185 | "variable", 186 | "offset", 187 | "guard", 188 | ) 189 | 190 | def __init__( 191 | self, 192 | idx: int | None, 193 | addr: Expression, 194 | data: Expression, 195 | size: int, 196 | endness: str, 197 | guard: Expression | None = None, 198 | variable=None, 199 | offset=None, 200 | **kwargs, 201 | ): 202 | super().__init__(idx, **kwargs) 203 | 204 | self.addr = addr 205 | self.data = data 206 | self.size = size 207 | self.endness = endness 208 | self.variable = variable 209 | self.guard = guard 210 | self.offset = offset # variable_offset 211 | 212 | def __eq__(self, other): 213 | return ( 214 | type(other) is Store 215 | and self.idx == other.idx 216 | and self.eq(self.addr, other.addr) 217 | and self.eq(self.data, other.data) 218 | and self.size == other.size 219 | and self.guard == other.guard 220 | and self.endness == other.endness 221 | ) 222 | 223 | def likes(self, other): 224 | return ( 225 | type(other) is Store 226 | and self.addr.likes(other.addr) 227 | and self.data.likes(other.data) 228 | and self.size == other.size 229 | and self.guard == other.guard 230 | and self.endness == other.endness 231 | ) 232 | 233 | def matches(self, other): 234 | return ( 235 | type(other) is Store 236 | and self.addr.matches(other.addr) 237 | and 
self.data.matches(other.data) 238 | and self.size == other.size 239 | and self.guard == other.guard 240 | and self.endness == other.endness 241 | ) 242 | 243 | __hash__ = TaggedObject.__hash__ 244 | 245 | def _hash_core(self): 246 | return stable_hash((Store, self.idx, self.addr, self.data, self.size, self.endness, self.guard)) 247 | 248 | def __repr__(self): 249 | return "Store (%s, %s[%d])%s" % ( 250 | self.addr, 251 | str(self.data), 252 | self.size, 253 | "" if self.guard is None else "[%s]" % self.guard, 254 | ) 255 | 256 | def __str__(self): 257 | if self.variable is None: 258 | return "STORE(addr={}, data={}, size={}, endness={}, guard={})".format( 259 | self.addr, str(self.data), self.size, self.endness, self.guard 260 | ) 261 | else: 262 | return "%s =%s %s<%d>%s" % ( 263 | self.variable.name, 264 | "L" if self.endness == "Iend_LE" else "B", 265 | str(self.data), 266 | self.size, 267 | "" if self.guard is None else "[%s]" % self.guard, 268 | ) 269 | 270 | def replace(self, old_expr, new_expr): 271 | if self.addr.likes(old_expr): 272 | r_addr = True 273 | replaced_addr = new_expr 274 | else: 275 | r_addr, replaced_addr = self.addr.replace(old_expr, new_expr) 276 | 277 | if isinstance(self.data, Expression): 278 | if self.data.likes(old_expr): 279 | r_data = True 280 | replaced_data = new_expr 281 | else: 282 | r_data, replaced_data = self.data.replace(old_expr, new_expr) 283 | else: 284 | r_data, replaced_data = False, self.data 285 | 286 | if self.guard is not None: 287 | r_guard, replaced_guard = self.guard.replace(old_expr, new_expr) 288 | else: 289 | r_guard, replaced_guard = False, None 290 | 291 | if r_addr or r_data or r_guard: 292 | return True, Store( 293 | self.idx, 294 | replaced_addr, 295 | replaced_data, 296 | self.size, 297 | self.endness, 298 | guard=replaced_guard, 299 | variable=self.variable, 300 | **self.tags, 301 | ) 302 | else: 303 | return False, self 304 | 305 | def copy(self) -> Store: 306 | return Store( 307 | self.idx, 308 | 
self.addr, 309 | self.data, 310 | self.size, 311 | self.endness, 312 | guard=self.guard, 313 | variable=self.variable, 314 | offset=self.offset, 315 | **self.tags, 316 | ) 317 | 318 | 319 | class Jump(Statement): 320 | """ 321 | Jump statement: goto target 322 | """ 323 | 324 | __slots__ = ( 325 | "target", 326 | "target_idx", 327 | ) 328 | 329 | def __init__(self, idx: int | None, target: Expression, target_idx: int | None = None, **kwargs): 330 | super().__init__(idx, **kwargs) 331 | 332 | self.target = target 333 | self.target_idx = target_idx 334 | 335 | def __eq__(self, other): 336 | return type(other) is Jump and self.idx == other.idx and self.target == other.target 337 | 338 | def likes(self, other): 339 | return type(other) is Jump and is_none_or_likeable(self.target, other.target) 340 | 341 | def matches(self, other): 342 | return type(other) is Jump and is_none_or_matchable(self.target, other.target) 343 | 344 | __hash__ = TaggedObject.__hash__ 345 | 346 | def _hash_core(self): 347 | return stable_hash((Jump, self.idx, self.target)) 348 | 349 | def __repr__(self): 350 | if self.target_idx is not None: 351 | return f"Jump ({self.target}.{self.target_idx})" 352 | return "Jump (%s)" % self.target 353 | 354 | def __str__(self): 355 | if self.target_idx is not None: 356 | return f"Goto({self.target}.{self.target_idx})" 357 | return "Goto(%s)" % self.target 358 | 359 | @property 360 | def depth(self): 361 | return self.target.depth 362 | 363 | def replace(self, old_expr, new_expr): 364 | r, replaced_target = self.target.replace(old_expr, new_expr) 365 | 366 | if r: 367 | return True, Jump(self.idx, replaced_target, target_idx=self.target_idx, **self.tags) # keep target_idx, as ConditionalJump.replace does 368 | else: 369 | return False, self 370 | 371 | def copy(self): 372 | return Jump( 373 | self.idx, 374 | self.target, 375 | target_idx=self.target_idx, **self.tags, 376 | ) 377 | 378 | 379 | class ConditionalJump(Statement): 380 | """ 381 | if (cond) {true_target} else {false_target} 382 | """ 383 | 384 | __slots__ = ( 385 | "condition", 386 | "true_target", 387 | 
"false_target", 388 | "true_target_idx", 389 | "false_target_idx", 390 | ) 391 | 392 | def __init__( 393 | self, 394 | idx: int | None, 395 | condition: Expression, 396 | true_target: Expression | None, 397 | false_target: Expression | None, 398 | true_target_idx: int | None = None, 399 | false_target_idx: int | None = None, 400 | **kwargs, 401 | ): 402 | super().__init__(idx, **kwargs) 403 | 404 | self.condition = condition 405 | self.true_target = true_target 406 | self.false_target = false_target 407 | self.true_target_idx = true_target_idx 408 | self.false_target_idx = false_target_idx 409 | 410 | def __eq__(self, other): 411 | return ( 412 | type(other) is ConditionalJump 413 | and self.idx == other.idx 414 | and self.condition == other.condition 415 | and self.true_target == other.true_target 416 | and self.false_target == other.false_target 417 | and self.true_target_idx == other.true_target_idx 418 | and self.false_target_idx == other.false_target_idx 419 | ) 420 | 421 | def likes(self, other): 422 | return ( 423 | type(other) is ConditionalJump 424 | and self.condition.likes(other.condition) 425 | and is_none_or_likeable(self.true_target, other.true_target) 426 | and is_none_or_likeable(self.false_target, other.false_target) 427 | ) 428 | 429 | def matches(self, other): 430 | return ( 431 | type(other) is ConditionalJump 432 | and self.condition.matches(other.condition) 433 | and is_none_or_matchable(self.true_target, other.true_target) 434 | and is_none_or_matchable(self.false_target, other.false_target) 435 | ) 436 | 437 | __hash__ = TaggedObject.__hash__ 438 | 439 | def _hash_core(self): 440 | return stable_hash( 441 | ( 442 | ConditionalJump, 443 | self.idx, 444 | self.condition, 445 | self.true_target, 446 | self.false_target, 447 | self.true_target_idx, 448 | self.false_target_idx, 449 | ) 450 | ) 451 | 452 | def __repr__(self): 453 | return "ConditionalJump (condition: {}, true: {}{}, false: {}{})".format( 454 | self.condition, 455 | 
self.true_target, 456 | "" if self.true_target_idx is None else f".{self.true_target_idx}", 457 | self.false_target, 458 | "" if self.false_target_idx is None else f".{self.false_target_idx}", 459 | ) 460 | 461 | def __str__(self): 462 | return "if ({}) {{ Goto {}{} }} else {{ Goto {}{} }}".format( 463 | self.condition, 464 | self.true_target, 465 | "" if self.true_target_idx is None else f".{self.true_target_idx}", 466 | self.false_target, 467 | "" if self.false_target_idx is None else f".{self.false_target_idx}", 468 | ) 469 | 470 | def replace(self, old_expr, new_expr): 471 | if self.condition == old_expr: 472 | r_cond = True 473 | replaced_cond = new_expr 474 | else: 475 | r_cond, replaced_cond = self.condition.replace(old_expr, new_expr) 476 | 477 | if self.true_target is not None: 478 | if self.true_target == old_expr: 479 | r_true = True 480 | replaced_true = new_expr 481 | else: 482 | r_true, replaced_true = self.true_target.replace(old_expr, new_expr) 483 | else: 484 | r_true, replaced_true = False, self.true_target 485 | 486 | if self.false_target is not None: 487 | if self.false_target == old_expr: 488 | r_false = True 489 | replaced_false = new_expr 490 | else: 491 | r_false, replaced_false = self.false_target.replace(old_expr, new_expr) 492 | else: 493 | r_false, replaced_false = False, self.false_target 494 | 495 | r = r_cond or r_true or r_false 496 | 497 | if r: 498 | return True, ConditionalJump( 499 | self.idx, 500 | replaced_cond, 501 | replaced_true, 502 | replaced_false, 503 | true_target_idx=self.true_target_idx, 504 | false_target_idx=self.false_target_idx, 505 | **self.tags, 506 | ) 507 | else: 508 | return False, self 509 | 510 | def copy(self) -> ConditionalJump: 511 | return ConditionalJump( 512 | self.idx, 513 | self.condition, 514 | self.true_target, 515 | self.false_target, 516 | true_target_idx=self.true_target_idx, 517 | false_target_idx=self.false_target_idx, 518 | **self.tags, 519 | ) 520 | 521 | 522 | class Call(Expression, 
Statement): 523 | """ 524 | Call is both an expression and a statement. 525 | 526 | When used as a statement, it will set ret_expr, fp_ret_expr, or both if both of them should hold return values. 527 | When used as an expression, both ret_expr and fp_ret_expr should be None (and should be ignored). The size of the 528 | call expression is stored in the bits attribute. 529 | """ 530 | 531 | __slots__ = ( 532 | "target", 533 | "calling_convention", 534 | "prototype", 535 | "args", 536 | "ret_expr", 537 | "fp_ret_expr", 538 | ) 539 | 540 | def __init__( 541 | self, 542 | idx: int | None, 543 | target: Expression | str, 544 | calling_convention: SimCC | None = None, 545 | prototype=None, 546 | args: Sequence[Expression] | None = None, 547 | ret_expr: Expression | None = None, 548 | fp_ret_expr: Expression | None = None, 549 | bits: int | None = None, 550 | **kwargs, 551 | ): 552 | super().__init__(idx, target.depth + 1 if isinstance(target, Expression) else 1, **kwargs) 553 | 554 | self.target = target 555 | self.calling_convention = calling_convention 556 | self.prototype = prototype 557 | self.args = args 558 | self.ret_expr = ret_expr 559 | self.fp_ret_expr = fp_ret_expr 560 | if bits is not None: 561 | self.bits = bits 562 | elif ret_expr is not None: 563 | self.bits = ret_expr.bits 564 | elif fp_ret_expr is not None: 565 | self.bits = fp_ret_expr.bits 566 | else: 567 | self.bits = 0 # uhhhhhhhhhhhhhhhhhhh 568 | 569 | def likes(self, other): 570 | return ( 571 | type(other) is Call 572 | and is_none_or_likeable(self.target, other.target) 573 | and self.calling_convention == other.calling_convention 574 | and self.prototype == other.prototype 575 | and is_none_or_likeable(self.args, other.args, is_list=True) 576 | and is_none_or_likeable(self.ret_expr, other.ret_expr) 577 | and is_none_or_likeable(self.fp_ret_expr, other.fp_ret_expr) 578 | ) 579 | 580 | def matches(self, other): 581 | return ( 582 | type(other) is Call 583 | and is_none_or_matchable(self.target, 
other.target) 584 | and self.calling_convention == other.calling_convention 585 | and self.prototype == other.prototype 586 | and is_none_or_matchable(self.args, other.args, is_list=True) 587 | and is_none_or_matchable(self.ret_expr, other.ret_expr) 588 | and is_none_or_matchable(self.fp_ret_expr, other.fp_ret_expr) 589 | ) 590 | 591 | __hash__ = TaggedObject.__hash__ # type: ignore 592 | 593 | def _hash_core(self): 594 | return stable_hash((Call, self.idx, self.target)) 595 | 596 | def __repr__(self): 597 | return f"Call (target: {self.target}, prototype: {self.prototype}, args: {self.args})" 598 | 599 | def __str__(self): 600 | cc = "Unknown CC" if self.calling_convention is None else "%s" % self.calling_convention 601 | if self.args is None: 602 | if self.calling_convention is not None: 603 | s = ( 604 | ("%s" % cc) 605 | if self.prototype is None 606 | else f"{self.calling_convention}: {self.calling_convention.arg_locs(self.prototype)}" 607 | ) 608 | else: 609 | s = ("%s" % cc) if self.prototype is None else repr(self.prototype) 610 | else: 611 | s = (f"{cc}: {self.args}") if self.prototype is None else f"{self.calling_convention}: {self.args}" 612 | 613 | if self.ret_expr is None: 614 | ret_s = "no-ret-value" 615 | else: 616 | ret_s = f"{self.ret_expr}" 617 | if self.fp_ret_expr is None: 618 | fp_ret_s = "no-fp-ret-value" 619 | else: 620 | fp_ret_s = f"{self.fp_ret_expr}" 621 | 622 | return f"Call({self.target}, {s}, ret: {ret_s}, fp_ret: {fp_ret_s})" 623 | 624 | @property 625 | def size(self): 626 | return self.bits // 8 627 | 628 | @property 629 | def verbose_op(self): 630 | return "call" 631 | 632 | @property 633 | def op(self): 634 | return "call" 635 | 636 | def replace(self, old_expr: Expression, new_expr: Expression): 637 | if isinstance(self.target, Expression): 638 | r0, replaced_target = self.target.replace(old_expr, new_expr) 639 | else: 640 | r0 = False 641 | replaced_target = self.target 642 | 643 | r = r0 644 | 645 | new_args = None 646 | if 
self.args: 647 | new_args = [] 648 | for arg in self.args: 649 | if arg == old_expr: 650 | r_arg = True 651 | replaced_arg = new_expr 652 | else: 653 | r_arg, replaced_arg = arg.replace(old_expr, new_expr) 654 | r |= r_arg 655 | new_args.append(replaced_arg) 656 | 657 | new_ret_expr = self.ret_expr 658 | new_bits = self.bits 659 | if self.ret_expr: 660 | if self.ret_expr == old_expr: 661 | r_ret = True 662 | replaced_ret = new_expr 663 | else: 664 | r_ret, replaced_ret = self.ret_expr.replace(old_expr, new_expr) 665 | r |= r_ret 666 | new_ret_expr = replaced_ret 667 | if replaced_ret is not None: 668 | new_bits = replaced_ret.bits 669 | 670 | new_fp_ret_expr = self.fp_ret_expr 671 | if self.fp_ret_expr: 672 | if self.fp_ret_expr == old_expr: 673 | r_ret = True 674 | replaced_fp_ret = new_expr 675 | else: 676 | r_ret, replaced_fp_ret = self.fp_ret_expr.replace(old_expr, new_expr) 677 | r |= r_ret 678 | new_fp_ret_expr = replaced_fp_ret 679 | 680 | if r: 681 | return True, Call( 682 | self.idx, 683 | replaced_target, 684 | calling_convention=self.calling_convention, 685 | prototype=self.prototype, 686 | args=new_args, 687 | ret_expr=new_ret_expr, 688 | fp_ret_expr=new_fp_ret_expr, 689 | bits=new_bits, 690 | **self.tags, 691 | ) 692 | else: 693 | return False, self 694 | 695 | def copy(self): 696 | return Call( 697 | self.idx, 698 | self.target, 699 | calling_convention=self.calling_convention, 700 | prototype=self.prototype, 701 | args=self.args[::] if self.args is not None else None, 702 | ret_expr=self.ret_expr, 703 | fp_ret_expr=self.fp_ret_expr, 704 | bits=self.bits, 705 | **self.tags, 706 | ) 707 | 708 | 709 | class Return(Statement): 710 | """ 711 | Return statement: (return expr_a), (return) 712 | """ 713 | 714 | __slots__ = ("ret_exprs",) 715 | 716 | def __init__(self, idx: int | None, ret_exprs, **kwargs): 717 | super().__init__(idx, **kwargs) 718 | self.ret_exprs = ret_exprs if isinstance(ret_exprs, list) else list(ret_exprs) 719 | 720 | def __eq__(self, 
other): 721 | return type(other) is Return and self.idx == other.idx and self.ret_exprs == other.ret_exprs 722 | 723 | def likes(self, other): 724 | return type(other) is Return and is_none_or_likeable(self.ret_exprs, other.ret_exprs, is_list=True) 725 | 726 | def matches(self, other): 727 | return type(other) is Return and is_none_or_matchable(self.ret_exprs, other.ret_exprs, is_list=True) 728 | 729 | __hash__ = TaggedObject.__hash__ 730 | 731 | def _hash_core(self): 732 | return stable_hash((Return, self.idx, tuple(self.ret_exprs))) 733 | 734 | def __repr__(self): 735 | return "Return to ({})".format(",".join(repr(x) for x in self.ret_exprs)) 736 | 737 | def __str__(self): 738 | exprs = ",".join(str(ret_expr) for ret_expr in self.ret_exprs) 739 | if not exprs: 740 | return "return;" 741 | else: 742 | return "return %s;" % exprs 743 | 744 | def replace(self, old_expr, new_expr): 745 | new_ret_exprs = [] 746 | replaced = False 747 | 748 | for expr in self.ret_exprs: 749 | if expr == old_expr: 750 | r_expr = True 751 | replaced_expr = new_expr 752 | else: 753 | r_expr, replaced_expr = expr.replace(old_expr, new_expr) 754 | if r_expr: 755 | replaced = True 756 | new_ret_exprs.append(replaced_expr) 757 | else: 758 | new_ret_exprs.append(expr) 759 | 760 | if replaced: 761 | return True, Return( 762 | self.idx, 763 | new_ret_exprs, 764 | **self.tags, 765 | ) 766 | 767 | return False, self 768 | 769 | def copy(self): 770 | return Return( 771 | self.idx, 772 | self.ret_exprs[::], 773 | **self.tags, 774 | ) 775 | 776 | 777 | class CAS(Statement): 778 | """ 779 | Atomic compare-and-swap. 780 | 781 | *_lo and *_hi are used to represent the low and high parts of a 128-bit CAS operation; *_hi is None if the CAS 782 | operation works on values that are less than or equal to 64 bits. 783 | 784 | addr: The address to be compared and swapped. 785 | data: The value to be written if the comparison is successful. 786 | expd: The expected value to be compared against. 
787 | old: The value that is currently stored at addr before compare-and-swap; it will be returned after compare-and-swap. 788 | """ 789 | 790 | __slots__ = ("addr", "data_lo", "data_hi", "expd_lo", "expd_hi", "old_lo", "old_hi", "endness") 791 | 792 | def __init__( 793 | self, 794 | idx: int | None, 795 | addr: Expression, 796 | data_lo: Expression, 797 | data_hi: Expression | None, 798 | expd_lo: Expression, 799 | expd_hi: Expression | None, 800 | old_lo: Atom, 801 | old_hi: Atom | None, 802 | endness: str, 803 | **kwargs, 804 | ): 805 | super().__init__(idx, **kwargs) 806 | self.addr = addr 807 | self.data_lo = data_lo 808 | self.data_hi = data_hi 809 | self.expd_lo = expd_lo 810 | self.expd_hi = expd_hi 811 | self.old_lo = old_lo 812 | self.old_hi = old_hi 813 | self.endness = endness 814 | 815 | def _hash_core(self): 816 | return stable_hash( 817 | ( 818 | CAS, 819 | self.idx, 820 | self.addr, 821 | self.data_lo, 822 | self.data_hi, 823 | self.expd_lo, 824 | self.expd_hi, 825 | self.old_lo, 826 | self.old_hi, 827 | self.endness, 828 | ) 829 | ) 830 | 831 | __hash__ = TaggedObject.__hash__ 832 | 833 | def __repr__(self): 834 | if self.old_hi is None: 835 | return f"CAS({self.addr}, {self.data_lo}, {self.expd_lo}, {self.old_lo})" 836 | return ( 837 | f"CAS({self.addr}, {self.data_hi} .. {self.data_lo}, {self.expd_hi} .. {self.expd_lo}, " 838 | f"{self.old_hi} .. {self.old_lo})" 839 | ) 840 | 841 | def __str__(self): 842 | if self.old_hi is None: 843 | return f"{self.old_lo} = CAS({self.addr}, {self.data_lo}, {self.expd_lo})" 844 | return ( 845 | f"{self.old_hi} .. {self.old_lo} = CAS({self.addr}, {self.data_hi} .. {self.data_lo}, " 846 | f"{self.expd_hi} .. 
{self.expd_lo})" 847 | ) 848 | 849 | def replace(self, old_expr: Expression, new_expr: Expression) -> tuple[bool, CAS]: 850 | r_addr, replaced_addr = self.addr.replace(old_expr, new_expr) 851 | r_data_lo, replaced_data_lo = self.data_lo.replace(old_expr, new_expr) 852 | r_data_hi, replaced_data_hi = self.data_hi.replace(old_expr, new_expr) if self.data_hi else (False, None) 853 | r_expd_lo, replaced_expd_lo = self.expd_lo.replace(old_expr, new_expr) 854 | r_expd_hi, replaced_expd_hi = self.expd_hi.replace(old_expr, new_expr) if self.expd_hi else (False, None) 855 | r_old_lo, replaced_old_lo = self.old_lo.replace(old_expr, new_expr) 856 | r_old_hi, replaced_old_hi = self.old_hi.replace(old_expr, new_expr) if self.old_hi else (False, None) 857 | 858 | if r_addr or r_data_lo or r_data_hi or r_expd_lo or r_expd_hi or r_old_lo or r_old_hi: 859 | return True, CAS( 860 | self.idx, 861 | replaced_addr, 862 | replaced_data_lo, 863 | replaced_data_hi, 864 | replaced_expd_lo, 865 | replaced_expd_hi, 866 | replaced_old_lo, 867 | replaced_old_hi, 868 | endness=self.endness, 869 | **self.tags, 870 | ) 871 | return False, self 872 | 873 | def copy(self) -> CAS: 874 | return CAS( 875 | self.idx, 876 | self.addr, 877 | self.data_lo, 878 | self.data_hi, 879 | self.expd_lo, 880 | self.expd_hi, 881 | self.old_lo, 882 | self.old_hi, 883 | endness=self.endness, 884 | **self.tags, 885 | ) 886 | 887 | def likes(self, other) -> bool: 888 | return ( 889 | type(other) is CAS 890 | and self.addr.likes(other.addr) 891 | and self.data_lo.likes(other.data_lo) 892 | and (self.data_hi is None or self.data_hi.likes(other.data_hi)) 893 | and self.expd_lo.likes(other.expd_lo) 894 | and (self.expd_hi is None or self.expd_hi.likes(other.expd_hi)) 895 | and self.old_lo.likes(other.old_lo) 896 | and (self.old_hi is None or self.old_hi.likes(other.old_hi)) 897 | ) 898 | 899 | def matches(self, other) -> bool: 900 | return ( 901 | type(other) is CAS 902 | and self.addr.matches(other.addr) 903 | and 
self.data_lo.matches(other.data_lo) 904 | and (self.data_hi is None or self.data_hi.matches(other.data_hi)) 905 | and self.expd_lo.matches(other.expd_lo) 906 | and (self.expd_hi is None or self.expd_hi.matches(other.expd_hi)) 907 | and self.old_lo.matches(other.old_lo) 908 | and (self.old_hi is None or self.old_hi.matches(other.old_hi)) 909 | ) 910 | 911 | @property 912 | def bits(self) -> int: 913 | if self.old_hi is None: 914 | return self.old_lo.bits 915 | return self.old_lo.bits + self.old_hi.bits 916 | 917 | @property 918 | def size(self) -> int: 919 | return self.bits // 8 920 | 921 | 922 | class DirtyStatement(Statement): 923 | """ 924 | Wrapper around the original statement, which is usually not convertible (temporarily). 925 | """ 926 | 927 | __slots__ = ("dirty",) 928 | 929 | def __init__(self, idx: int | None, dirty: DirtyExpression, **kwargs): 930 | super().__init__(idx, **kwargs) 931 | self.dirty = dirty 932 | 933 | def _hash_core(self): 934 | return stable_hash((DirtyStatement, self.dirty)) 935 | 936 | def __repr__(self): 937 | return repr(self.dirty) 938 | 939 | def __str__(self): 940 | return str(self.dirty) 941 | 942 | def replace(self, old_expr, new_expr): 943 | if self.dirty == old_expr: 944 | assert isinstance(new_expr, DirtyExpression) 945 | return True, DirtyStatement(self.idx, new_expr, **self.tags) 946 | r, new_dirty = self.dirty.replace(old_expr, new_expr) 947 | if r: 948 | return True, DirtyStatement(self.idx, new_dirty, **self.tags) 949 | return False, self 950 | 951 | def copy(self) -> DirtyStatement: 952 | return DirtyStatement(self.idx, self.dirty, **self.tags) 953 | 954 | def likes(self, other): 955 | return type(other) is DirtyStatement and self.dirty.likes(other.dirty) 956 | 957 | def matches(self, other): 958 | return type(other) is DirtyStatement and self.dirty.matches(other.dirty) 959 | 960 | 961 | class Label(Statement): 962 | """ 963 | A dummy statement that indicates a label with a name. 
964 | """ 965 | 966 | __slots__ = ( 967 | "name", 968 | "ins_addr", 969 | "block_idx", 970 | ) 971 | 972 | def __init__(self, idx: int | None, name: str, ins_addr: int, block_idx: int | None = None, **kwargs): 973 | super().__init__(idx, **kwargs) 974 | self.name = name 975 | self.ins_addr = ins_addr 976 | self.block_idx = block_idx 977 | 978 | def likes(self, other: Label): 979 | return isinstance(other, Label) 980 | 981 | def replace(self, old_expr, new_expr): 982 | return False, self 983 | 984 | matches = likes 985 | 986 | def _hash_core(self): 987 | return stable_hash( 988 | ( 989 | Label, 990 | self.name, 991 | self.ins_addr, 992 | self.block_idx, 993 | ) 994 | ) 995 | 996 | def __repr__(self): 997 | return f"Label {self.name}" 998 | 999 | def __str__(self): 1000 | return f"{self.name}:" 1001 | 1002 | def copy(self) -> Label: 1003 | return Label(self.idx, self.name, self.ins_addr, self.block_idx, **self.tags) 1004 | -------------------------------------------------------------------------------- /ailment/tagged_object.py: -------------------------------------------------------------------------------- 1 | class TaggedObject: 2 | """ 3 | A class that takes arbitrary tags. 4 | """ 5 | 6 | __slots__ = ( 7 | "idx", 8 | "_tags", 9 | "_hash", 10 | ) 11 | 12 | def __init__(self, idx: int | None, **kwargs): 13 | self._tags = None 14 | self.idx = idx 15 | self._hash = None 16 | if kwargs: 17 | self.initialize_tags(kwargs) 18 | 19 | def initialize_tags(self, tags): 20 | self._tags = {} 21 | for k, v in tags.items(): 22 | self._tags[k] = v 23 | 24 | def __getattr__(self, item): 25 | try: 26 | return self.tags[item] 27 | except KeyError: 28 | return super().__getattribute__(item) 29 | 30 | def __new__(cls, *args, **kwargs): # pylint:disable=unused-argument 31 | """Create a new instance and set `_tags` attribute. 
32 | 33 | Since TaggedObject overrides the `__getattr__` method and tries to access the 34 | `_tags` attribute, infinite recursion could occur if `_tags` does not 35 | exist yet. 36 | 37 | This behavior causes an infinite recursion error when copying 38 | `TaggedObject` with `copy.deepcopy`. 39 | 40 | Hence, we set the `_tags` attribute here to prevent this problem. 41 | """ 42 | self = super().__new__(cls) 43 | self._tags = None 44 | return self 45 | 46 | def __hash__(self) -> int: 47 | if self._hash is None: 48 | self._hash = self._hash_core() 49 | return self._hash 50 | 51 | def _hash_core(self): 52 | raise NotImplementedError() 53 | 54 | @property 55 | def tags(self) -> dict: 56 | if not self._tags: 57 | self._tags = {} 58 | return self._tags 59 | -------------------------------------------------------------------------------- /ailment/utils.py: -------------------------------------------------------------------------------- 1 | # pylint:disable=ungrouped-imports,wrong-import-position 2 | from __future__ import annotations 3 | from typing import TypeAlias 4 | import struct 5 | 6 | try: 7 | from claripy.ast import Bits 8 | except ImportError: 9 | from typing_extensions import Never as Bits 10 | 11 | try: 12 | import _md5 as md5lib 13 | except ImportError: 14 | import hashlib as md5lib 15 | 16 | GetBitsTypeParams: TypeAlias = "Bits | Expression" 17 | 18 | 19 | def get_bits(expr: GetBitsTypeParams) -> int: 20 | 21 | if isinstance(expr, Expression): 22 | return expr.bits 23 | elif isinstance(expr, Bits): 24 | return expr.size() 25 | else: 26 | raise TypeError(type(expr)) 27 | 28 | 29 | md5_unpacker = struct.Struct("4I") 30 | 31 | 32 | def stable_hash(t: tuple) -> int: 33 | cnt = _dump_tuple(t) 34 | hd = md5lib.md5(cnt).digest() 35 | return md5_unpacker.unpack(hd)[0] # 32 bits 36 | 37 | 38 | def _dump_tuple(t: tuple) -> bytes: 39 | cnt = b"" 40 | for item in t: 41 | if item is not None: 42 | type_ = type(item) 43 | if type_ in _DUMP_BY_TYPE: 44 | cnt +=
_DUMP_BY_TYPE[type_](item) 45 | else: 46 | # for TaggedObjects, hash(item) is stable 47 | # other types of items may show up, such as pyvex.expr.CCall and Dirty. they will be removed some day. 48 | cnt += struct.pack("<Q", hash(item) & 0xFFFF_FFFF_FFFF_FFFF) 49 | cnt += b"\xf0" 50 | return cnt 51 | 52 | 53 | def _dump_str(t: str) -> bytes: 54 | return t.encode("ascii") 55 | 56 | 57 | def _dump_int(t: int) -> bytes: 58 | prefix = b"" if t >= 0 else b"-" 59 | t = abs(t) 60 | if t <= 0xFFFF: 61 | return prefix + struct.pack("<H", t) 62 | elif t <= 0xFFFF_FFFF: 63 | return prefix + struct.pack("<I", t) 64 | elif t <= 0xFFFF_FFFF_FFFF_FFFF: 65 | return prefix + struct.pack("<Q", t) 66 | else: 67 | cnt = b"" 68 | while t > 0: 69 | cnt += _dump_int(t & 0xFFFF_FFFF_FFFF_FFFF) 70 | t >>= 64 71 | return prefix + cnt 72 | 73 | 74 | def _dump_type(t: type) -> bytes: 75 | return t.__name__.encode("ascii") 76 | 77 | 78 | _DUMP_BY_TYPE = { 79 | tuple: _dump_tuple, 80 | str: _dump_str, 81 | int: _dump_int, 82 | type: _dump_type, 83 | } 84 | 85 | 86 | def is_none_or_likeable(arg1, arg2, is_list=False): 87 | """ 88 | Returns whether two things are both None or can like each other 89 | """ 90 | if arg1 is None or arg2 is None: 91 | if arg1 == arg2: 92 | return True 93 | return False 94 | 95 | if is_list: 96 | return len(arg1) == len(arg2) and all(is_none_or_likeable(a1, a2) for a1, a2 in zip(arg1, arg2)) 97 | 98 | if isinstance(arg1, Expression): 99 | return arg1.likes(arg2) 100 | return arg1 == arg2 101 | 102 | 103 | def is_none_or_matchable(arg1, arg2, is_list=False): 104 | """ 105 | Returns whether two things are both None or can match each other 106 | """ 107 | if arg1 is None or arg2 is None: 108 | if arg1 == arg2: 109 | return True 110 | return False 111 | 112 | if is_list: 113 | return len(arg1) == len(arg2) and all(is_none_or_matchable(a1, a2) for a1, a2 in zip(arg1, arg2)) 114 | 115 | if isinstance(arg1, Expression): 116 | return arg1.matches(arg2) 117 | return arg1 == arg2 118 | 119 | 120 | from .expression import Expression # noqa: E402 121 | -------------------------------------------------------------------------------- /docs/Makefile: -------------------------------------------------------------------------------- 1 | # Minimal makefile for Sphinx documentation 2 | # 3 | 4 | #
You can set these variables from the command line, and also 5 | # from the environment for the first two. 6 | SPHINXOPTS ?= 7 | SPHINXBUILD ?= sphinx-build 8 | SOURCEDIR = . 9 | BUILDDIR = _build 10 | 11 | # Put it first so that "make" without argument is like "make help". 12 | help: 13 | @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) 14 | 15 | .PHONY: help Makefile 16 | 17 | # Catch-all target: route all unknown targets to Sphinx using the new 18 | # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). 19 | %: Makefile 20 | @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) 21 | -------------------------------------------------------------------------------- /docs/api.rst: -------------------------------------------------------------------------------- 1 | :mod:`ailment` --- angr Intermediate Language 2 | ============================================= 3 | 4 | .. automodule:: ailment 5 | 6 | Converter 7 | --------- 8 | .. automodule:: ailment.converter_common 9 | .. automodule:: ailment.converter_pcode 10 | .. automodule:: ailment.converter_vex 11 | 12 | Expressions 13 | ----------- 14 | .. automodule:: ailment.expression 15 | 16 | Statement 17 | --------- 18 | .. automodule:: ailment.statement 19 | 20 | Misc. Things 21 | ------------ 22 | .. automodule:: ailment.block 23 | .. automodule:: ailment.manager 24 | .. automodule:: ailment.tagged_object 25 | .. automodule:: ailment.utils 26 | -------------------------------------------------------------------------------- /docs/conf.py: -------------------------------------------------------------------------------- 1 | # Configuration file for the Sphinx documentation builder. 
2 | # 3 | # For the full list of built-in configuration values, see the documentation: 4 | # https://www.sphinx-doc.org/en/master/usage/configuration.html 5 | 6 | import datetime 7 | 8 | # -- Project information ----------------------------------------------------- 9 | # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information 10 | 11 | project = "ailment" 12 | project_copyright = f"{datetime.datetime.now().year}, The angr Project contributors" 13 | author = "The angr Project" 14 | 15 | # -- General configuration --------------------------------------------------- 16 | # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration 17 | 18 | extensions = [ 19 | "sphinx.ext.autodoc", 20 | "sphinx.ext.autosummary", 21 | "sphinx.ext.coverage", 22 | "sphinx.ext.napoleon", 23 | "sphinx.ext.todo", 24 | "sphinx.ext.viewcode", 25 | "sphinx_autodoc_typehints", 26 | "myst_parser", 27 | ] 28 | 29 | templates_path = ["_templates"] 30 | exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] 31 | 32 | # -- Options for autodoc ----------------------------------------------------- 33 | # https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#configuration 34 | autoclass_content = "class" 35 | autodoc_default_options = { 36 | "members": True, 37 | "member-order": "bysource", 38 | "show-inheritance": True, 39 | "special-members": "__init__", 40 | "undoc-members": True, 41 | } 42 | autodoc_inherit_docstrings = True 43 | autodoc_typehints = "both" 44 | 45 | # -- Options for coverage ---------------------------------------------------- 46 | # https://www.sphinx-doc.org/en/master/usage/extensions/coverage.html 47 | coverage_write_headline = False 48 | 49 | 50 | # -- Options for HTML output ------------------------------------------------- 51 | # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output 52 | 53 | html_theme = "furo" 54 | html_static_path = ["_static"] 55 | 
-------------------------------------------------------------------------------- /docs/index.rst: -------------------------------------------------------------------------------- 1 | Welcome to ailment's documentation! 2 | =================================== 3 | 4 | 5 | .. toctree:: 6 | :maxdepth: 2 7 | :caption: Contents: 8 | 9 | Readme <readme> 10 | API <api> 11 | 12 | 13 | 14 | Indices and tables 15 | ================== 16 | 17 | * :ref:`genindex` 18 | * :ref:`modindex` 19 | * :ref:`search` 20 | -------------------------------------------------------------------------------- /docs/make.bat: -------------------------------------------------------------------------------- 1 | @ECHO OFF 2 | 3 | pushd %~dp0 4 | 5 | REM Command file for Sphinx documentation 6 | 7 | if "%SPHINXBUILD%" == "" ( 8 | set SPHINXBUILD=sphinx-build 9 | ) 10 | set SOURCEDIR=. 11 | set BUILDDIR=_build 12 | 13 | %SPHINXBUILD% >NUL 2>NUL 14 | if errorlevel 9009 ( 15 | echo. 16 | echo.The 'sphinx-build' command was not found. Make sure you have Sphinx 17 | echo.installed, then set the SPHINXBUILD environment variable to point 18 | echo.to the full path of the 'sphinx-build' executable. Alternatively you 19 | echo.may add the Sphinx directory to PATH. 20 | echo. 21 | echo.If you don't have Sphinx installed, grab it from 22 | echo.https://www.sphinx-doc.org/ 23 | exit /b 1 24 | ) 25 | 26 | if "%1" == "" goto help 27 | 28 | %SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% 29 | goto end 30 | 31 | :help 32 | %SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% 33 | 34 | :end 35 | popd 36 | -------------------------------------------------------------------------------- /docs/readme.rst: -------------------------------------------------------------------------------- 1 | ..
include:: ../README.md 2 | :parser: myst_parser.sphinx_ 3 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools>=46.4.0", "wheel"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [project] 6 | name = "ailment" 7 | description = "The angr intermediate language." 8 | license = { text = "BSD-2-Clause" } 9 | classifiers = [ 10 | "Programming Language :: Python :: 3", 11 | "Programming Language :: Python :: 3 :: Only", 12 | "Programming Language :: Python :: 3.10", 13 | "Programming Language :: Python :: 3.11", 14 | "Programming Language :: Python :: 3.12", 15 | "Programming Language :: Python :: 3.13", 16 | ] 17 | requires-python = ">=3.10" 18 | dependencies = [ 19 | "typing-extensions", 20 | ] 21 | dynamic = ["version"] 22 | 23 | [project.readme] 24 | file = "README.md" 25 | content-type = "text/markdown" 26 | 27 | [project.urls] 28 | Homepage = "https://api.angr.io/projects/ailment/en/latest/" 29 | Repository = "https://github.com/angr/ailment" 30 | 31 | [project.optional-dependencies] 32 | docs = [ 33 | "furo", 34 | "myst-parser", 35 | "sphinx", 36 | "sphinx-autodoc-typehints", 37 | ] 38 | testing = [ 39 | "pytest", 40 | "pytest-xdist", 41 | ] 42 | 43 | [tool.setuptools] 44 | include-package-data = true 45 | license-files = ["LICENSE"] 46 | 47 | [tool.setuptools.dynamic] 48 | version = { attr = "ailment.__version__" } 49 | 50 | [tool.setuptools.package-data] 51 | ailment = ["py.typed"] 52 | 53 | [tool.black] 54 | line-length = 120 55 | target-version = ['py310'] 56 | 57 | [tool.ruff] 58 | line-length = 120 59 | 60 | [tool.ruff.lint.per-file-ignores] 61 | "ailment/expression.py" = ["F841"] # This is probably a bug 62 | -------------------------------------------------------------------------------- /tests/test_expression.py: 
-------------------------------------------------------------------------------- 1 | # pylint: disable=missing-class-docstring,no-self-use 2 | import unittest 3 | 4 | import ailment 5 | 6 | 7 | class TestExpression(unittest.TestCase): 8 | 9 | def test_phi_hashing(self): 10 | vvar_0 = ailment.expression.VirtualVariable(100, 0, 32, ailment.expression.VirtualVariableCategory.REGISTER, 16) 11 | vvar_1 = ailment.expression.VirtualVariable(101, 1, 32, ailment.expression.VirtualVariableCategory.REGISTER, 16) 12 | vvar_2 = ailment.expression.VirtualVariable(102, 2, 32, ailment.expression.VirtualVariableCategory.REGISTER, 16) 13 | phi_expr = ailment.expression.Phi( 14 | 0, 32, [((0, None), vvar_0), ((0, 0), vvar_2), ((1, None), vvar_1), ((4, None), None)] 15 | ) 16 | h = hash(phi_expr) # should not crash 17 | assert h is not None 18 | 19 | 20 | if __name__ == "__main__": 21 | unittest.main() 22 | -------------------------------------------------------------------------------- /tests/test_irsb.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | import ailment 4 | 5 | try: 6 | import angr 7 | import archinfo 8 | except ImportError: 9 | angr = None 10 | 11 | try: 12 | import pyvex 13 | except ImportError: 14 | pyvex = None 15 | 16 | 17 | class TestIrsb(unittest.TestCase): 18 | # pylint: disable=missing-class-docstring 19 | block_bytes = bytes.fromhex( 20 | "554889E54883EC40897DCC488975C048C745F89508400048C745F0B6064000488B45C04883C008488B00BEA70840004889C7E883FEFFFF" 21 | ) # pylint: disable=line-too-long 22 | block_addr = 0x4006C6 23 | 24 | @unittest.skipUnless(pyvex, "pyvex required") 25 | def test_convert_from_vex_irsb(self): 26 | arch = archinfo.arch_from_id("AMD64") 27 | manager = ailment.Manager(arch=arch) 28 | irsb = pyvex.IRSB(self.block_bytes, self.block_addr, arch, opt_level=0) 29 | ablock = ailment.IRSBConverter.convert(irsb, manager) 30 | assert ablock # TODO: test if this conversion is valid 31 | 32 | 
@unittest.skipUnless(angr and hasattr(angr.engines, "UberEnginePcode"), "angr and pypcode required") 33 | def test_convert_from_pcode_irsb(self): 34 | arch = archinfo.arch_from_id("AMD64") 35 | manager = ailment.Manager(arch=arch) 36 | p = angr.load_shellcode( 37 | self.block_bytes, arch, self.block_addr, self.block_addr, engine=angr.engines.UberEnginePcode 38 | ) 39 | irsb = p.factory.block(self.block_addr).vex 40 | ablock = ailment.IRSBConverter.convert(irsb, manager) 41 | assert ablock # TODO: test if this conversion is valid 42 | 43 | 44 | if __name__ == "__main__": 45 | unittest.main() 46 | --------------------------------------------------------------------------------
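The hashing behaviour that `tests/test_expression.py` exercises can be illustrated without installing ailment. The block below is a minimal standalone sketch, not the library's code: `toy_stable_hash` and `ToyTagged` are hypothetical names that mimic the pattern in `ailment/utils.py` (first 32-bit word of an MD5 digest) and `ailment/tagged_object.py` (hash computed once by `_hash_core` and cached in `_hash`). The byte encoding used here is a simplification chosen for the example and is an assumption, not ailment's `_dump_tuple` format.

```python
import hashlib
import struct

# Same layout as md5_unpacker in ailment/utils.py: four unsigned 32-bit words.
_MD5_UNPACKER = struct.Struct("4I")


def toy_stable_hash(payload: bytes) -> int:
    """Return the first 32-bit word of the MD5 digest of payload."""
    digest = hashlib.md5(payload).digest()
    return _MD5_UNPACKER.unpack(digest)[0]


class ToyTagged:
    """Hypothetical stand-in for TaggedObject: the hash is computed lazily and cached."""

    __slots__ = ("idx", "name", "_hash")

    def __init__(self, idx, name):
        self.idx = idx
        self.name = name
        self._hash = None

    def _hash_core(self) -> int:
        # Simplified serialization (an assumption of this sketch).
        text = f"{type(self).__name__}:{self.idx}:{self.name}"
        return toy_stable_hash(text.encode("ascii"))

    def __hash__(self) -> int:
        # Compute once, then reuse the cached value on later calls.
        if self._hash is None:
            self._hash = self._hash_core()
        return self._hash


a = ToyTagged(1, "x")
b = ToyTagged(1, "x")
same_hash = hash(a) == hash(b)
```

Because the value is derived from the object's contents through MD5 rather than from Python's built-in string hashing, two objects with equal fields get the same hash within a process and across interpreter runs on the same machine, which is the property the `stable_hash` helper appears designed to provide.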