├── .gitignore ├── README.md ├── parser.py └── patchdeps.py /.gitignore: -------------------------------------------------------------------------------- 1 | __pycache__/ 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | `patchdeps` 2 | =========== 3 | Tool for analyzing textual dependencies within a patch series. 4 | 5 | Given a pile of patches, `patchdeps` can find out which patch modifies 6 | which files and lines within those files. From there, it can detect that 7 | a specific patch modifies a line introduced by an earlier patch, and 8 | mark these patches as dependent. 9 | 10 | This tool is intended to sort out a pile of patches, so you can 11 | determine which patches should be applied together as a group and which 12 | can be freely reordered without problems. 13 | 14 | Features 15 | -------- 16 | - Automatically detect coarse-grained dependencies between patches 17 | based on the files they modify. 18 | - Automatically detect fine-grained dependencies between patches 19 | based on the lines they modify. 20 | - Show which patch modifies which files. 21 | - Show which patch last modified which line (*i.e.* blame). 22 | - Output dependencies as: 23 | * A list 24 | * A textual matrix 25 | * A dot format graph (optionally running xdot to display it) 26 | - Read patches from: 27 | * A series of git commits (uses the git command) 28 | * A series of patch (diff) files 29 | 30 | Limitations 31 | ----------- 32 | Note that this tool can only detect textual dependencies, when two 33 | patches touch the same line or lines close together. For logical 34 | dependencies, where a patch applies just fine without another patch, 35 | but does need the other to actually work, you'll still have to think 36 | for yourself :-) 37 | 38 | Furthermore, this tool needs a proper series of patches that are known 39 | to apply cleanly. 
Since it works without the original files and even 40 | without doing the magic that `patch` uses (offset and fuzziness) to find out 41 | how to apply a patch, it must assume that the line numbers in the patch 42 | are correct and work solely from that. `patchdeps` does do some 43 | verification of the line contents it does have (mostly from context 44 | lines) and will yell at you if it detects a problem, but it might not 45 | always catch these problems. 46 | 47 | Running patchdeps 48 | ----------------- 49 | Patchdeps supports a number of command-line parameters, which are 50 | explained when running `patchdeps --help`. 51 | 52 | Examples 53 | -------- 54 | This shows running patchdeps on a set of kernel cleanup patches, which 55 | modify a single driver containing a few files. It shows both the list 56 | output and the matrix output. In the matrix output, an `X` means the 57 | later patch changed a line introduced by the earlier patch, while a `*` 58 | means the later patch changes a line within two lines of a line changed 59 | by the earlier patch (number of lines can be configured with the 60 | `--proximity` flag). 
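
To make the `X` versus `*` distinction concrete, here is a minimal sketch of the classification idea. It is a simplification, not the code patchdeps runs: `classify` is a hypothetical helper, and it assumes both patches report line numbers in the same coordinate space, whereas the real tool also tracks how earlier patches shift line numbers.

```python
from typing import Optional, Set


def classify(earlier: Set[int], later: Set[int], proximity: int = 2) -> Optional[str]:
    """Hypothetical helper (not part of patchdeps).

    `earlier` and `later` are the line numbers touched by two patches,
    assumed to refer to the same version of the file.
    """
    # Same line touched by both patches: a hard dependency ("X").
    if earlier & later:
        return "hard"
    # A touched line within `proximity` lines of the other patch's
    # lines: a proximity dependency ("*").
    if any(abs(a - b) <= proximity for a in earlier for b in later):
        return "proximity"
    # Far enough apart: no textual dependency detected.
    return None


print(classify({10, 11}, {11}))  # hard
print(classify({10}, {12}))      # proximity
print(classify({10}, {13}))      # None
```

Setting `proximity=0` in this sketch leaves only hard dependencies, which mirrors what the `--proximity` flag is described as controlling.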
61 | 62 | $ patchdeps --git 025a9230c8373..91121c103ae93 --list --matrix 63 | d6ec53e04bf79 (staging: dwc2: simplify register shift expressions) depends on: 64 | f923463335385 (staging: dwc2: unshift non-bool register value constants) (proximity) 65 | 08b9f9db707ba (staging: dwc2: remove redundant register reads) depends on: 66 | c35205aa05124 (staging: dwc2: re-use hptxfsiz variable) (hard) 67 | a1fc524393583 (staging: dwc2: properly mask the GRXFSIZ register) depends on: 68 | 4ab799df6d716 (staging: dwc2: remove specific fifo size constants) (proximity) 69 | c35205aa05124 (staging: dwc2: re-use hptxfsiz variable) (proximity) 70 | 08b9f9db707ba (staging: dwc2: remove redundant register reads) (hard) 71 | 9badec2f9fa92 (staging: dwc2: interpret all hwcfg and related register at init time) depends on: 72 | 3b9edf88472e9 (staging: dwc2: fix off-by-one in check for max_packet_count parameter) (hard) 73 | f923463335385 (staging: dwc2: unshift non-bool register value constants) (hard) 74 | 1c58ce133971e (staging: dwc2: only read the snpsid register once) (hard) 75 | a1fc524393583 (staging: dwc2: properly mask the GRXFSIZ register) (hard) 76 | de4a193193989 (staging: dwc2: validate the value for phy_utmi_width) depends on: 77 | 9badec2f9fa92 (staging: dwc2: interpret all hwcfg and related register at init time) (proximity) 78 | 4ab799df6d716 (staging: dwc2: remove specific fifo size constants) ····················* 79 | 3b9edf88472e9 (staging: dwc2: fix off-by-one in check for max_packet_count param ······│·X 80 | f923463335385 (staging: dwc2: unshift non-bool register value constants) ··········*···│·X 81 | 1c58ce133971e (staging: dwc2: only read the snpsid register once) ·················│···│·X 82 | d6ec53e04bf79 (staging: dwc2: simplify register shift expressions) ────────────────┘ │ │ 83 | acdb9046b61a6 (staging: dwc2: add missing shift) │ │ 84 | 57bb8aeda06af (staging: dwc2: simplify debug output in dwc_hc_init) │ │ 85 | c35205aa05124 (staging: dwc2: re-use 
hptxfsiz variable) ·····························X·* │ 86 | 08b9f9db707ba (staging: dwc2: remove redundant register reads) ──────────────────────┘·X │ 87 | a1fc524393583 (staging: dwc2: properly mask the GRXFSIZ register) ─────────────────────┘·X 88 | 9badec2f9fa92 (staging: dwc2: interpret all hwcfg and related register at init t ────────┘·* 89 | de4a193193989 (staging: dwc2: validate the value for phy_utmi_width) ──────────────────────┘ 90 | 91121c103ae93 (staging: dwc2: make dwc2_core_params documentation more complete) 91 | 92 | The above uses by-line analysis. A per-file analysis is also available, 93 | but as you can see below, it is not so useful for this example (since 94 | there are only a handful of files in the driver, pretty much every patch 95 | depends on every other patch): 96 | 97 | $ patchdeps --git 025a9230c8373..91121c103ae93 --matrix --by-file 98 | 4ab799df6d716 (staging: dwc2: remove specific fifo size constants) ················X·············X···X 99 | 3b9edf88472e9 (staging: dwc2: fix off-by-one in check for max_packet_count param ··X···X···X·X·X·X·X·X 100 | f923463335385 (staging: dwc2: unshift non-bool register value constants) ──────────┘·X·X·X·X·X·X·X·X·X 101 | 1c58ce133971e (staging: dwc2: only read the snpsid register once) ───────────────────┘·X·X·│·│·│·│·X │ 102 | d6ec53e04bf79 (staging: dwc2: simplify register shift expressions) ────────────────────┘·X·X·X·X·X·X·X 103 | acdb9046b61a6 (staging: dwc2: add missing shift) ────────────────────────────────────────┘·│·│·│·│·X │ 104 | 57bb8aeda06af (staging: dwc2: simplify debug output in dwc_hc_init) ───────────────────────┘·X·X·X·X·X 105 | c35205aa05124 (staging: dwc2: re-use hptxfsiz variable) ─────────────────────────────────────┘·X·X·X·X 106 | 08b9f9db707ba (staging: dwc2: remove redundant register reads) ────────────────────────────────┘·X·X·X 107 | a1fc524393583 (staging: dwc2: properly mask the GRXFSIZ register) ───────────────────────────────┘·X·X 108 | 9badec2f9fa92 (staging: dwc2: 
interpret all hwcfg and related register at init t ──────────────────┘·X·X 109 | de4a193193989 (staging: dwc2: validate the value for phy_utmi_width) ────────────────────────────────┘·X 110 | 91121c103ae93 (staging: dwc2: make dwc2_core_params documentation more complete) ──────────────────────┘ 111 | 112 | Copyright & License 113 | ------------------- 114 | © 2013 Matthijs Kooijman <> 115 | 116 | Patchdeps is made available under the MIT license 117 | 118 | Permission is hereby granted, free of charge, to any person obtaining 119 | a copy of this software and associated documentation files (the 120 | "Software"), to deal in the Software without restriction, including 121 | without limitation the rights to use, copy, modify, merge, publish, 122 | distribute, sublicense, and/or sell copies of the Software, and to 123 | permit persons to whom the Software is furnished to do so, subject to 124 | the following conditions: 125 | 126 | The above copyright notice and this permission notice shall be 127 | included in all copies or substantial portions of the Software. 128 | 129 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 130 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 131 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 132 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 133 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 134 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 135 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
136 | -------------------------------------------------------------------------------- /parser.py: -------------------------------------------------------------------------------- 1 | # The MIT License (MIT) 2 | # Copyright (c) 2012 Matias Bordese 3 | # Copyright (c) 2013 Matthijs Kooijman 4 | # 5 | # Permission is hereby granted, free of charge, to any person obtaining a copy 6 | # of this software and associated documentation files (the "Software"), to deal 7 | # in the Software without restriction, including without limitation the rights 8 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | # copies of the Software, and to permit persons to whom the Software is 10 | # furnished to do so, subject to the following conditions: 11 | # 12 | # The above copyright notice and this permission notice shall be included in all 13 | # copies or substantial portions of the Software. 14 | # 15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 16 | # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 17 | # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 18 | # IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, 19 | # DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 20 | # OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE 21 | # OR OTHER DEALINGS IN THE SOFTWARE. 
22 | 23 | """Unified diff parser module.""" 24 | 25 | # This file is based on the unidiff library by Matías Bordese (at 26 | # https://github.com/matiasb/python-unidiff) 27 | 28 | # A "Changeset" consists of multiple "PatchFiles" / commits (GitRev) 29 | # A single patch(-file)/commit affects a "set of PatchedFiles" 30 | # For each "PatchedFile" the patch consists of multiple "Hunks" 31 | # Each "Hunk" consists of multiple "Lines" being "context or change" (LineType) 32 | 33 | from __future__ import annotations 34 | 35 | import re 36 | from enum import Enum 37 | from typing import Any, Iterable, Iterator 38 | 39 | RE_SOURCE_FILENAME = re.compile(r'^--- (?P<filename>[^\t]+)') 40 | RE_TARGET_FILENAME = re.compile(r'^\+\+\+ (?P<filename>[^\t]+)') 41 | 42 | # @@ (source offset, length) (target offset, length) @@ 43 | RE_HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))?\ @@") 44 | RE_HUNK_BODY_LINE = re.compile(r'^([- \+\\])') 45 | 46 | 47 | class LineType(Enum): 48 | ADD = '+' # added line 49 | DELETE = '-' # deleted line 50 | CONTEXT = ' ' # kept line (context) 51 | IGNORE = '\\' # No newline case (ignore) 52 | 53 | 54 | class UnidiffParseError(Exception): 55 | pass 56 | 57 | 58 | class Line: 59 | """ 60 | A single line from a patch hunk. 61 | """ 62 | def __init__( 63 | self, 64 | hunk: Hunk, 65 | action: LineType, 66 | source_lineno_rel: int, 67 | source_line: str | None, 68 | target_lineno_rel: int, 69 | target_line: str | None, 70 | ) -> None: 71 | """ 72 | The line numbers must always be present, either source_line or 73 | target_line can be None depending on the action. 
74 | """ 75 | self.hunk = hunk 76 | self.action = action 77 | self.source_lineno_rel = source_lineno_rel 78 | self.source_line = source_line 79 | self.target_lineno_rel = target_lineno_rel 80 | self.target_line = target_line 81 | 82 | self.source_lineno_abs = self.hunk.source_start + self.source_lineno_rel 83 | self.target_lineno_abs = self.hunk.target_start + self.target_lineno_rel 84 | 85 | def __str__(self) -> str: 86 | return f"(-{self.source_lineno_abs}, +{self.target_lineno_abs}) {self.action}{self.source_line or self.target_line}" 87 | 88 | 89 | class PatchedFile(list["Hunk"]): 90 | """Data from a patched file.""" 91 | 92 | def __init__(self, source: str = '', target: str = '') -> None: 93 | self.source_file = source 94 | self.target_file = target 95 | 96 | if self.source_file.startswith('a/') and self.target_file.startswith('b/'): 97 | self.path = self.source_file[2:] 98 | elif self.source_file.startswith('a/') and self.target_file == '/dev/null': 99 | self.path = self.source_file[2:] 100 | elif self.target_file.startswith('b/') and self.source_file == '/dev/null': 101 | self.path = self.target_file[2:] 102 | else: 103 | self.path = self.source_file 104 | 105 | 106 | class Hunk: 107 | """Each of the modified blocks of a file.""" 108 | 109 | def __init__(self, src_start: int = 0, src_len: int = 0, tgt_start: int = 0, tgt_len: int = 0) -> None: 110 | self.source_start = src_start 111 | self.source_length = self.source_todo = src_len 112 | self.target_start = tgt_start 113 | self.target_length = self.target_todo = tgt_len 114 | self.changes: list[Line] = [] 115 | 116 | def is_valid(self) -> bool: 117 | """Check hunk header data matches entered lines info.""" 118 | return self.source_todo == self.target_todo == 0 119 | 120 | def append_line(self, line: Line) -> None: 121 | """Append a line.""" 122 | self.changes.append(line) 123 | 124 | if line.action in {LineType.CONTEXT, LineType.DELETE}: 125 | self.source_todo -= 1 126 | if self.source_todo < 0: 127 | raise 
UnidiffParseError( 128 | f'Too many source lines in hunk: {self}') 129 | 130 | if line.action in {LineType.CONTEXT, LineType.ADD}: 131 | self.target_todo -= 1 132 | if self.target_todo < 0: 133 | raise UnidiffParseError( 134 | f'Too many target lines in hunk: {self}') 135 | 136 | def __str__(self) -> str: 137 | return f"<@@ {self.source_start},{self.source_length} {self.target_start},{self.target_length} @@>" 138 | 139 | 140 | def _parse_hunk( 141 | diff: Iterator[str], 142 | source_start: int, 143 | source_len: int, 144 | target_start: int, 145 | target_len: int, 146 | ) -> Hunk: 147 | hunk = Hunk(source_start, source_len, target_start, target_len) 148 | source_lineno = 0 149 | target_lineno = 0 150 | 151 | for line in diff: 152 | valid_line = RE_HUNK_BODY_LINE.match(line) 153 | if valid_line: 154 | action = LineType(valid_line.group(0)) 155 | original_line = line[1:] 156 | 157 | kwargs: dict[str, Any] = { 158 | "action": action, 159 | "hunk": hunk, 160 | "source_lineno_rel": source_lineno, 161 | "target_lineno_rel": target_lineno, 162 | "source_line": None, 163 | "target_line": None, 164 | } 165 | 166 | if action == LineType.ADD: 167 | kwargs['target_line'] = original_line 168 | target_lineno += 1 169 | elif action == LineType.DELETE: 170 | kwargs['source_line'] = original_line 171 | source_lineno += 1 172 | elif action == LineType.CONTEXT: 173 | kwargs['source_line'] = original_line 174 | kwargs['target_line'] = original_line 175 | source_lineno += 1 176 | target_lineno += 1 177 | hunk.append_line(Line(**kwargs)) 178 | else: 179 | raise UnidiffParseError(f'Hunk diff data expected: {line}') 180 | 181 | # check hunk len(old_lines) and len(new_lines) are ok 182 | if hunk.is_valid(): 183 | break 184 | 185 | return hunk 186 | 187 | 188 | def parse_diff(diff: Iterable[str]) -> list[PatchedFile]: 189 | ret: list[PatchedFile] = [] 190 | 191 | # Make sure we only iterate the diff once, instead of restarting 192 | # from the top inside _parse_hunk 193 | lines = iter(diff) 
194 | for line in lines: 195 | if m := RE_SOURCE_FILENAME.match(line): 196 | source_file = m['filename'] 197 | elif m := RE_TARGET_FILENAME.match(line): 198 | target_file = m['filename'] 199 | current_file = PatchedFile(source_file, target_file) 200 | ret.append(current_file) 201 | elif m := RE_HUNK_HEADER.match(line): 202 | hunk = _parse_hunk( 203 | lines, 204 | int(m[1]), 205 | _int1(m[2]), 206 | int(m[3]), 207 | _int1(m[4]), 208 | ) 209 | current_file.append(hunk) 210 | 211 | return ret 212 | 213 | 214 | def _int1(s: str | None) -> int: 215 | return 1 if s is None else int(s) 216 | -------------------------------------------------------------------------------- /patchdeps.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | # Copyright (c) 2013 Matthijs Kooijman 4 | # 5 | # Permission is hereby granted, free of charge, to any person obtaining 6 | # a copy of this software and associated documentation files (the 7 | # "Software"), to deal in the Software without restriction, including 8 | # without limitation the rights to use, copy, modify, merge, publish, 9 | # distribute, sublicense, and/or sell copies of the Software, and to 10 | # permit persons to whom the Software is furnished to do so, subject to 11 | # the following conditions: 12 | # 13 | # The above copyright notice and this permission notice shall be 14 | # included in all copies or substantial portions of the Software. 15 | # 16 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 17 | # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 18 | # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 19 | # IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 20 | # CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 21 | # TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 22 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
23 | # 24 | # Simple script to process a list of patch files and identify obvious 25 | # dependencies between them. Inspired by the similar (but more limited) 26 | # perl script published at 27 | # http://blog.mozilla.org/sfink/2012/01/05/patch-queue-dependencies/ 28 | # 29 | 30 | from __future__ import annotations 31 | 32 | import argparse 33 | import collections 34 | import itertools 35 | import os 36 | import subprocess 37 | import sys 38 | import textwrap 39 | from enum import Enum 40 | from parser import LineType, parse_diff 41 | from typing import TYPE_CHECKING, Iterable, Iterator 42 | 43 | if TYPE_CHECKING: 44 | from parser import Hunk, PatchedFile 45 | 46 | 47 | class Depend(Enum): 48 | # Used if a patch changes a line changed by another patch 49 | HARD = ("hard", "X", "solid") 50 | # Used if a patch changes a line near a line changed by another patch 51 | PROXIMITY = ("proximity", "*", "dashed") 52 | # By filename 53 | FILENAME = ("", "X", "solid") 54 | 55 | def __init__(self, desc: str, matrixmark: str, dotstyle: str) -> None: 56 | self.desc = desc 57 | self.matrixmark = matrixmark 58 | self.dotstyle = dotstyle 59 | 60 | 61 | class Changeset: 62 | def get_patch_set(self) -> list[PatchedFile]: 63 | """ 64 | Returns this changeset as a list of PatchedFiles. 
65 | """ 66 | parsed = parse_diff(self.get_diff()) 67 | if not parsed: 68 | sys.stderr.write(f"WARNING: Parsing diff {self} produced no patch hunks, maybe format is invalid?\n") 69 | return parsed 70 | 71 | def get_diff(self) -> Iterable[str]: 72 | """ 73 | Returns the textual unified diff for this changeset as an 74 | iterable of lines 75 | """ 76 | raise NotImplementedError 77 | 78 | def __repr__(self) -> str: 79 | return f"{self.__class__.__name__}({self!s})" 80 | 81 | 82 | class PatchFile(Changeset): 83 | def __init__(self, filename: str) -> None: 84 | self.filename = filename 85 | 86 | def get_diff(self) -> Iterable[str]: 87 | f = open(self.filename, encoding='utf-8') 88 | # Iterating over a file gives separate lines, with newlines 89 | # included. We want those stripped off 90 | return (line.rstrip('\n') for line in f) 91 | 92 | @staticmethod 93 | def get_changesets(args: Iterable[str]) -> Iterator[PatchFile]: 94 | """ 95 | Generate Changeset objects, given patch filenames 96 | """ 97 | for filename in args: 98 | yield PatchFile(filename) 99 | 100 | def __str__(self) -> str: 101 | return os.path.basename(self.filename) 102 | 103 | 104 | class GitRev(Changeset): 105 | def __init__(self, rev: str, msg: str) -> None: 106 | self.rev = rev 107 | self.msg = msg 108 | 109 | def get_diff(self) -> Iterable[str]: 110 | diff = subprocess.check_output(['git', 'diff', '--no-color', f"{self.rev}^", self.rev]) 111 | # Convert to utf8 and just drop any invalid characters (we're 112 | # not interested in the actual file contents and all diff 113 | # special characters are valid ascii). 114 | return diff.decode(errors='ignore').split('\n') 115 | 116 | def __str__(self) -> str: 117 | return f"{self.rev} ({self.msg})" 118 | 119 | @staticmethod 120 | def get_changesets(args: list[str]) -> Iterator[GitRev]: 121 | """ 122 | Generate Changeset objects, given arguments for git rev-list. 
123 | """ 124 | output = subprocess.check_output(['git', 'rev-list', '--oneline', '--reverse', *args]) 125 | 126 | if not output: 127 | sys.stderr.write("No revisions specified?\n") 128 | else: 129 | lines = output.decode().strip().split('\n') 130 | for line in lines: 131 | yield GitRev(*line.split(' ', 1)) 132 | 133 | 134 | def print_depends(patches: list[Changeset], depends: dict[Changeset, dict[Changeset, Depend]]) -> None: 135 | for p in patches: 136 | if dependencies := depends[p]: 137 | print(f"{p} depends on:") 138 | for dep in patches: 139 | if dependency := dependencies.get(dep): 140 | desc = dependency.desc 141 | if desc: 142 | print(f" {dep} ({desc})") 143 | else: 144 | print(f" {dep}") 145 | 146 | 147 | def print_depends_tsort(patches: list[Changeset], depends: dict[Changeset, dict[Changeset, Depend]]) -> None: 148 | for p in patches: 149 | if dependencies := depends[p]: 150 | for dep in patches: 151 | if dep in dependencies: 152 | def no_delim(x): 153 | # Tsort source has: #define DELIM " \t\n" 154 | return str(x).replace(' ', '_').replace('\t', '_').replace('\n', '_') 155 | print(f"{no_delim(dep)}\t{no_delim(p)}") 156 | 157 | 158 | def print_depends_matrix(patches: list[Changeset], depends: dict[Changeset, dict[Changeset, Depend]]) -> None: 159 | # Which patches have at least one dependency drawn (and thus 160 | # need lines from then on)? 
161 | has_deps: set[Changeset] = set() 162 | prereq: set[Changeset] = {dep for p in patches for dep in depends[p]} 163 | # Every patch depending on other patches needs a column 164 | depending: list[Changeset] = [p for p in patches if depends[p]] 165 | column = 82 166 | for p in patches: 167 | if depending and depending[0] == p: 168 | del depending[0] 169 | column += 2 170 | fill, corner = "─", "┘" 171 | else: 172 | fill = corner = "·" if p in prereq else " " 173 | line = f"{f'{p!s:.80} ':{fill}<{column}}{corner}" 174 | 175 | for i, dep in enumerate(depending): 176 | # Show ruler if a later patch depends on this one 177 | ruler = "·" if any(depends[d].get(p) for d in depending[i:]) else " " 178 | # For every later patch, print an "X" if it depends on this one 179 | if dependency := depends[dep].get(p): 180 | line += f"{ruler}{dependency.matrixmark}" 181 | has_deps.add(dep) 182 | elif dep in has_deps: 183 | line += f"{ruler}│" 184 | else: 185 | line += ruler * 2 186 | 187 | print(line) 188 | 189 | 190 | def dot_escape_string(s: str) -> str: 191 | return s.replace("\\", "\\\\").replace('"', '\\"') 192 | 193 | 194 | def depends_dot(args: argparse.Namespace, patches: list[Changeset], depends: dict[Changeset, dict[Changeset, Depend]]) -> str: 195 | """ 196 | Returns dot code for the dependency graph. 
197 | """ 198 | # Seems that 'fdp' gives the best clustering if patches are often independent 199 | res = """ 200 | digraph ConflictMap { 201 | node [shape=box] 202 | layout=neato 203 | overlap=scale 204 | """ 205 | 206 | if args.randomize: 207 | res += "start=random\n" 208 | 209 | for i, p in enumerate(patches): 210 | label = dot_escape_string(str(p)) 211 | label = "\\n".join(textwrap.wrap(label, 25)) 212 | res += f'{i} [label="{label}"]\n' 213 | for dep, v in depends[p].items(): 214 | res += f"{patches.index(dep)} -> {i} [style={v.dotstyle}]\n" 215 | res += "}\n" 216 | 217 | return res 218 | 219 | 220 | def show_xdot(dot: str) -> None: 221 | """ 222 | Shows a given dot graph in xdot 223 | """ 224 | subprocess.run(['xdot', '/dev/stdin'], input=dot.encode(), check=True) 225 | 226 | 227 | class ByFileAnalyzer: 228 | def analyze(self, args: argparse.Namespace, patches: list[Changeset]) -> dict[Changeset, dict[Changeset, Depend]]: 229 | """ 230 | Find dependencies in a list of patches by looking at the files they 231 | change. 232 | 233 | The algorithm is simple: Just keep a list of files changed, and mark 234 | two patches as conflicting when they change the same file. 235 | """ 236 | # Which patches touch a particular file. A dict of filename => list 237 | # of patches 238 | touches_file: dict[str, list[Changeset]] = collections.defaultdict(list) 239 | 240 | # Which patch depends on which other patches? 
241 | # A dict of patch => (dict of dependent patches => Depend.FILENAME) 242 | depends: dict[Changeset, dict[Changeset, Depend]] = collections.defaultdict(dict) 243 | 244 | for patch in patches: 245 | for f in patch.get_patch_set(): 246 | for other in touches_file[f.path]: 247 | depends[patch][other] = Depend.FILENAME 248 | 249 | touches_file[f.path].append(patch) 250 | 251 | if 'blame' in args.actions: 252 | for path, ps in touches_file.items(): 253 | patch = ps[-1] 254 | print(f"{patch!s:80.80} {path}") 255 | 256 | return depends 257 | 258 | 259 | class ByLineAnalyzer: 260 | def analyze(self, args: argparse.Namespace, patches: list[Changeset]) -> dict[Changeset, dict[Changeset, Depend]]: 261 | """ 262 | Find dependencies in a list of patches by looking at the lines they 263 | change. 264 | """ 265 | # Per-file info on which patch last touched a particular line. 266 | # A dict of file => ByLineFileAnalyzer (which holds the LineState list) 267 | state: dict[str, ByLineFileAnalyzer] = {} 268 | 269 | # Which patch depends on which other patches? 270 | # A dict of patch => (dict of dependent patches => Depend) 271 | depends: dict[Changeset, dict[Changeset, Depend]] = collections.defaultdict(dict) 272 | 273 | for patch in patches: 274 | for f in patch.get_patch_set(): 275 | if f.path not in state: 276 | state[f.path] = ByLineFileAnalyzer(f.path, args.proximity) 277 | 278 | state[f.path].analyze(depends, patch, f) 279 | 280 | if 'blame' in args.actions: 281 | for a in state.values(): 282 | a.print_blame() 283 | 284 | return depends 285 | 286 | 287 | class ByLineFileAnalyzer: 288 | """ 289 | Helper class for the ByLineAnalyzer, that performs the analysis for 290 | a specific file. Created once per file and called for multiple patches. 
291 | """ 292 | 293 | def __init__(self, fname: str, proximity: int) -> None: 294 | self.fname = fname 295 | self.proximity = proximity 296 | self.line_list: list[ByLineFileAnalyzer.LineState] = [] 297 | 298 | def analyze(self, depends: dict[Changeset, dict[Changeset, Depend]], patch: Changeset, hunks: PatchedFile) -> None: 299 | # This is the index in line_list of the first line state that 300 | # still uses source line numbers 301 | self.to_update_idx = 0 302 | 303 | # The index in line_list of the last line processed (i.e, 304 | # matched against a diff line) 305 | self.processed_idx = -1 306 | 307 | # Offset between source and target files at state_pos 308 | self.offset = 0 309 | 310 | for hunk in hunks: 311 | self.analyze_hunk(depends, patch, hunk) 312 | 313 | # Pretend we processed the entire list, so update_offset can 314 | # update the line numbers of any remaining (unchanged) lines 315 | # after the last hunk in this patch 316 | self.processed_idx = len(self.line_list) 317 | self.update_offset(0) 318 | 319 | def line_state(self, lineno: int, create: bool) -> LineState | None: 320 | """ 321 | Returns the state of the given (source) line number, creating a 322 | new empty state if it is not yet present and create is True. 
323 | """ 324 | 325 | self.processed_idx += 1 326 | for state in self.line_list[self.processed_idx:]: 327 | # Found it, return 328 | if state.lineno == lineno: 329 | return state 330 | # It's not in there, stop looking 331 | if state.lineno > lineno: 332 | break 333 | # We're already past this one, continue looking 334 | self.processed_idx += 1 335 | 336 | if not create: 337 | return None 338 | 339 | # We don't have state for this particular line, insert a 340 | # new empty state 341 | state = self.LineState(lineno=lineno) 342 | self.line_list.insert(self.processed_idx, state) 343 | return state 344 | 345 | def update_offset(self, amount: int) -> None: 346 | """ 347 | Update the offset between target and source lines by the 348 | specified amount. 349 | 350 | Takes care of updating the line states of all processed lines 351 | (up to but excluding self.processed_idx) with the old offset 352 | before changing it. 353 | """ 354 | 355 | for state in self.line_list[self.to_update_idx:self.processed_idx]: 356 | state.lineno += self.offset 357 | self.to_update_idx += 1 358 | 359 | self.offset += amount 360 | 361 | def analyze_hunk(self, depends: dict[Changeset, dict[Changeset, Depend]], patch: Changeset, hunk: Hunk) -> None: 362 | #print('\n'.join(map(str, self.line_list))) 363 | #print('--') 364 | for change in hunk.changes: 365 | # When adding a line, don't bother creating a new line 366 | # state, since we'll be adding one anyway (this prevents 367 | # extra unused linestates) 368 | create = change.action != LineType.ADD 369 | line_state = self.line_state(change.source_lineno_abs, create) 370 | 371 | # When changing a line, claim proximity lines before it as 372 | # well. 
373 | if change.action != LineType.CONTEXT and self.proximity != 0: 374 | # i points to the only linestate that could contain the 375 | # state for lineno 376 | i = self.processed_idx - 1 377 | lineno = change.source_lineno_abs - 1 378 | while (change.source_lineno_abs - lineno <= self.proximity and 379 | lineno > 0): 380 | if (i < 0 or 381 | i >= self.to_update_idx and 382 | self.line_list[i].lineno < lineno or 383 | i < self.to_update_idx and 384 | self.line_list[i].lineno - self.offset < lineno): 385 | # This line does not exist yet, i points to an 386 | # earlier line. Insert it 387 | # _after_ i. 388 | self.line_list.insert(i + 1, self.LineState(lineno)) 389 | # Point i at the inserted line 390 | i += 1 391 | self.processed_idx += 1 392 | assert i >= self.to_update_idx, "Inserting before already updated line" 393 | 394 | # Claim this line 395 | s = self.line_list[i] 396 | 397 | # Already claimed, stop looking. This should also 398 | # prevent us from i becoming < to_update_idx - 1, 399 | # since the state at to_update_idx - 1 should always 400 | # be claimed 401 | if patch in s.proximity or s.changed_by == patch: 402 | break 403 | 404 | s.proximity.add(patch) 405 | i -= 1 406 | lineno -= 1 407 | 408 | # For changes that know about the contents of the old line, 409 | # check if it matches our observations 410 | if change.action != LineType.ADD: 411 | assert line_state is not None 412 | if line_state.line is not None and change.source_line != line_state.line: 413 | sys.exit( 414 | f"While processing {patch}\n" 415 | "Warning: patch does not apply cleanly! 
Results are probably wrong!\n" 416 | f"According to previous patches, line {change.source_lineno_abs} is:\n" 417 | f"{line_state.line}\n" 418 | f"But according to {patch}, it should be:\n" 419 | f"{change.source_line}\n\n", 420 | ) 421 | 422 | if change.action == LineType.CONTEXT: 423 | assert line_state is not None 424 | if line_state.line is None: 425 | line_state.line = change.target_line 426 | 427 | # For context lines, only remember the line contents 428 | #claim_after(in_change, change. 429 | #in_change = False 430 | 431 | elif change.action == LineType.ADD: 432 | self.update_offset(1) 433 | 434 | # Mark this line as changed by this patch 435 | s = self.LineState(lineno=change.target_lineno_abs, 436 | line=change.target_line, 437 | changed_by=patch) 438 | self.line_list.insert(self.processed_idx, s) 439 | assert self.processed_idx == self.to_update_idx, "Not everything updated?" 440 | 441 | # Since we insert this using the target line number, it 442 | # doesn't need to be updated again 443 | self.to_update_idx += 1 444 | 445 | # Add proximity deps for patches that touched code 446 | # around this line. 
We can't get a hard dependency for 447 | # an 'add' change, since we don't actually touch any 448 | # existing code 449 | if line_state: 450 | deps = itertools.chain(line_state.proximity, 451 | [line_state.changed_by]) 452 | for p in deps: 453 | if p and p not in depends[patch] and p != patch: 454 | depends[patch][p] = Depend.PROXIMITY 455 | 456 | elif change.action == LineType.DELETE: 457 | assert line_state is not None 458 | self.update_offset(-1) 459 | 460 | # This line was changed by an earlier patch, add a hard dependency 461 | if line_state.changed_by: 462 | depends[patch][line_state.changed_by] = Depend.HARD 463 | # TODO(PHH): Assigning to singleton Depend.*.dottooltip; unused by `depends_dot` 464 | # https://graphviz.org/docs/attrs/tooltip/ 465 | # depends[patch][line_state.changed_by].dottooltip = f"-{change.source_line}" 466 | 467 | # Also add proximity deps for patches that touched code 468 | # around this line 469 | for p in line_state.proximity: 470 | if (p not in depends[patch]) and p != patch: 471 | depends[patch][p] = Depend.PROXIMITY 472 | 473 | # Forget about the state for this source line 474 | del self.line_list[self.processed_idx] 475 | self.processed_idx -= 1 476 | 477 | # After changing a line, claim proximity lines after it as well. 478 | if change.action != LineType.CONTEXT and self.proximity != 0: 479 | # i points to the only linestate that could contain the 480 | # state for lineno 481 | i = self.to_update_idx 482 | # When a file is created, the source line for the adds is 0... 483 | lineno = change.source_lineno_abs or 1 484 | while (lineno - change.source_lineno_abs < self.proximity): 485 | if i >= len(self.line_list) or self.line_list[i].lineno > lineno: 486 | # This line does not exist yet, i points to a 487 | # later line. Insert it _before_ i. 
488 | self.line_list.insert(i, self.LineState(lineno)) 489 | assert i > self.processed_idx, "Inserting before already processed line" 490 | 491 | # Claim this line 492 | self.line_list[i].proximity.add(patch) 493 | 494 | i += 1 495 | lineno += 1 496 | 497 | def print_blame(self) -> None: 498 | print(f"{self.fname}:") 499 | next_line: int | None = None 500 | for line_state in self.line_list: 501 | if line_state.line is None: 502 | continue 503 | 504 | if next_line and line_state.lineno != next_line: 505 | print(f"{'':50} …") 506 | 507 | print(f"{line_state.changed_by or ''!s:50.50} {line_state.lineno:4} {line_state.line}") 508 | next_line = line_state.lineno + 1 509 | 510 | print() 511 | 512 | 513 | class LineState: 514 | """ State of a particular line in a file """ 515 | def __init__(self, lineno: int, line: str | None = None, changed_by: Changeset | None = None) -> None: 516 | self.lineno = lineno 517 | self.line = line 518 | self.changed_by = changed_by 519 | # Set of patches that changed lines near this one 520 | self.proximity: set[Changeset] = set() 521 | 522 | def __str__(self) -> str: 523 | return f"{self.lineno}: changed by {self.changed_by}: {self.line}" 524 | 525 | 526 | def parse_args() -> argparse.Namespace: 527 | parser = argparse.ArgumentParser(description='Analyze patches for dependencies.') 528 | 529 | types = parser.add_argument_group('type').add_mutually_exclusive_group(required=True) 530 | types.add_argument('--git', dest='changeset_type', action='store_const', 531 | const=GitRev, 532 | help='Analyze a list of git revisions (non-option arguments are passed to git rev-list as-is)') 533 | types.add_argument('--patches', dest='changeset_type', action='store_const', 534 | const=PatchFile, 535 | help='Analyze a list of patch files (non-option arguments are patch filenames)') 536 | 537 | parser.add_argument('arguments', metavar="ARG", nargs='*', help=""" 538 | Specification of patches to analyze, depending 539 | on the type given. 
When --git is given, this is 540 | passed to git rev-list as-is (so use a valid 541 | revision range, like HEAD^^..HEAD). When 542 | --patches is given, these are filenames of patch 543 | files.""") 544 | parser.add_argument('--by-file', dest='analyzer', action='store_const', 545 | const=ByFileAnalyzer, default=ByLineAnalyzer, help=""" 546 | Mark patches as conflicting when they change the 547 | same file (by default, they are conflicting when 548 | they change the same lines).""") 549 | parser.add_argument('--proximity', default='2', metavar='LINES', 550 | type=int, help=""" 551 | The number of lines by which changes must be separated to 552 | avoid being marked as a dependency. Pass 0 to 553 | only consider exactly the same line. This option 554 | is not used when --by-file is passed. The default 555 | value is %(default)s.""") 556 | parser.add_argument('--randomize', action='store_true', help=""" 557 | Randomize the graph layout produced by 558 | --depends-dot and --depends-xdot.""") 559 | 560 | actions = parser.add_argument_group('actions') 561 | actions.add_argument('--blame', dest='actions', action='append_const', 562 | const='blame', help=""" 563 | Instead of outputting patch dependencies, 564 | output for each line or file which patch changed 565 | it last.""") 566 | actions.add_argument('--list', dest='actions', action='append_const', 567 | const='list', help=""" 568 | Output a list of each patch and the patches it 569 | depends on.""") 570 | actions.add_argument('--matrix', dest='actions', action='append_const', 571 | const='matrix', help=""" 572 | Output a matrix with patches on both axes and 573 | markings for dependencies. 
This is used if no 574 | action is given.""") 575 | actions.add_argument('--tsort', dest='actions', action='append_const', 576 | const='tsort', help=""" 577 | Show dependency graph as tsort input.""") 578 | actions.add_argument('--dot', dest='actions', action='append_const', 579 | const='dot', help=""" 580 | Output dot format for a dependency graph.""") 581 | actions.add_argument('--xdot', dest='actions', action='append_const', 582 | const='xdot', help=""" 583 | Show a dependency graph using xdot (if available).""") 584 | 585 | args = parser.parse_args() 586 | 587 | if not args.actions: 588 | args.actions = ['matrix'] 589 | 590 | return args 591 | 592 | 593 | def main() -> None: 594 | args = parse_args() 595 | 596 | patches: list[Changeset] = list(args.changeset_type.get_changesets(args.arguments)) 597 | 598 | depends = args.analyzer().analyze(args, patches) 599 | 600 | if 'list' in args.actions: 601 | print_depends(patches, depends) 602 | 603 | if 'matrix' in args.actions: 604 | print_depends_matrix(patches, depends) 605 | 606 | if 'tsort' in args.actions: 607 | print_depends_tsort(patches, depends) 608 | 609 | if 'dot' in args.actions: 610 | print(depends_dot(args, patches, depends)) 611 | 612 | if 'xdot' in args.actions: 613 | show_xdot(depends_dot(args, patches, depends)) 614 | 615 | 616 | if __name__ == "__main__": 617 | main() 618 | 619 | # vim: set sw=4 sts=4 et: 620 | --------------------------------------------------------------------------------
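The action flags in `parse_args` all share `dest='actions'` with `action='append_const'`, so several outputs can be requested in one run, and the matrix is used when none is given. The following is a minimal standalone sketch of that pattern, not the tool's actual parser: `build_parser` and `parse` are hypothetical names, and only three of the flags plus `--proximity` are reproduced.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, trimmed-down parser used only for illustration.
    parser = argparse.ArgumentParser(description="append_const sketch")
    # A string default is run through `type`, so this yields the int 2.
    parser.add_argument("--proximity", default="2", metavar="LINES", type=int)
    actions = parser.add_argument_group("actions")
    for name in ("list", "matrix", "dot"):
        # Every flag appends its own constant to the shared `actions` list.
        actions.add_argument(f"--{name}", dest="actions",
                             action="append_const", const=name)
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # Without any action flag, args.actions is None; fall back to the matrix.
    if not args.actions:
        args.actions = ["matrix"]
    return args


if __name__ == "__main__":
    print(parse([]).actions)
    print(parse(["--list", "--dot"]).actions)
    print(parse(["--proximity", "0"]).proximity)
```

Because the constants are appended in command-line order, `main` can test membership with `'list' in args.actions` and emit each requested output independently.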