├── .gitignore
├── README.md
├── parser.py
└── patchdeps.py


/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__/
2 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | `patchdeps`
  2 | ===========
  3 | Tool for analyzing textual dependencies within a patch series.
  4 | 
  5 | Given a pile of patches, `patchdeps` can find out which patch modifies
  6 | which files and lines within those files. From there, it can detect that
  7 | a specific patch modifies a line introduced by an earlier patch, and
  8 | mark these patches as dependent.
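
The distinction between a direct and a nearby dependency can be illustrated with a minimal sketch. This is not the implementation in `patchdeps.py` (which also tracks how line numbers shift as patches are applied); the `classify` helper and its default `proximity` of 2 are assumptions made for illustration only.

```python
from __future__ import annotations


def classify(changed_line: int, earlier_lines: set[int], proximity: int = 2) -> str | None:
    """Classify how a change to `changed_line` relates to lines an earlier patch touched.

    Hypothetical helper: line numbers are assumed to refer to the same
    version of the file, which the real tool has to work out itself.
    """
    # Same line touched by both patches: a "hard" dependency
    if changed_line in earlier_lines:
        return "hard"
    # Within `proximity` lines of an earlier change: a "proximity" dependency
    if any(abs(changed_line - line) <= proximity for line in earlier_lines):
        return "proximity"
    # Far enough away: no textual dependency detected
    return None


earlier = {10, 11, 12}
print(classify(11, earlier))  # hard
print(classify(14, earlier))  # proximity
print(classify(20, earlier))  # None
```

Setting `proximity` to 0 in this sketch leaves only the direct, same-line case.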
  9 | 
 10 | This tool is intended to sort out a pile of patches, so you can
 11 | determine which patches should be applied together as a group and which
 12 | can be freely reordered without problems.
 13 | 
 14 | Features
 15 | --------
 16 |  - Automatically detect coarse-grained dependencies between patches
 17 |    based on the files they modify.
 18 |  - Automatically detect fine-grained dependencies between patches
 19 |    based on the lines they modify.
 20 |  - Show which patch modifies which files.
 21 |  - Show which patch last modified which line (*i.e.* blame).
 22 |  - Output dependencies as:
 23 |     * A list
 24 |     * A textual matrix
 25 |     * A dot format graph (optionally running xdot to display it)
 26 |  - Read patches from:
 27 |     * A series of git commits (uses the git command)
 28 |     * A series of patch (diff) files
 29 | 
 30 | Limitations
 31 | -----------
 32 | Note that this tool can only detect textual dependencies, when two
 33 | patches touch the same line or lines close together. For logical
 34 | dependencies, where a patch applies just fine without another patch,
 35 | but does need the other to actually work, you'll still have to think
 36 | yourself :-)
 37 | 
 38 | Furthermore, this tool needs a proper series of patches that are known
 39 | to apply cleanly. Since it works without the original files and even
 40 | without doing the magic patch uses (offset and fuzziness) to find out
 41 | how to apply a patch, it must assume that the line numbers in the patch
 42 | are correct and works solely from that. `patchdeps` does do some
 43 | verification of the line contents it does have (mostly from context
 44 | lines) and will yell at you if it detects a problem, but it might not
 45 | catch these problems always...
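
To make the reliance on line numbers concrete, here is a small sketch of reading a unified diff hunk header and checking a context line against previously seen content. The regular expression mirrors the one in `parser.py`; the helper names `parse_hunk_header` and `context_matches` are hypothetical and only illustrate the idea.

```python
from __future__ import annotations

import re

# @@ -<source start>[,<source length>] +<target start>[,<target length>] @@
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line: str) -> tuple[int, int, int, int]:
    """Return (source_start, source_len, target_start, target_len)."""
    m = HUNK_HEADER.match(line)
    if m is None:
        raise ValueError(f"not a hunk header: {line!r}")
    src_start, src_len, tgt_start, tgt_len = m.groups()
    # An omitted length means the hunk covers exactly one line
    return int(src_start), int(src_len or 1), int(tgt_start), int(tgt_len or 1)


def context_matches(known: dict[int, str], lineno: int, text: str) -> bool:
    """Return False only if content was recorded for `lineno` and it differs."""
    return known.get(lineno, text) == text


print(parse_hunk_header("@@ -10,3 +12,4 @@ def f():"))  # (10, 3, 12, 4)
print(context_matches({3: "int x;"}, 3, "int y;"))  # False
```

A mismatch like the one in the last line is the kind of problem that can be reported; a line for which nothing was recorded cannot be checked at all.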
 46 | 
 47 | Running patchdeps
 48 | -----------------
 49 | Patchdeps supports a number of command-line parameters, which are
 50 | explained when running `patchdeps --help`.
 51 | 
 52 | Examples
 53 | --------
 54 | This shows running patchdeps on a set of kernel cleanup patches, which
 55 | modify a single driver containing a few files. It shows both the list
 56 | output and the matrix output. In the matrix output, an `X` means the
 57 | later patch changed a line introduced by the earlier patch, while a `*`
 58 | means the later patch changes a line within two lines of a line changed
 59 | by the earlier patch (number of lines can be configured with the
 60 | `--proximity` flag).
 61 | 
 62 |     $ patchdeps --git 025a9230c8373..91121c103ae93 --list --matrix
 63 |     d6ec53e04bf79 (staging: dwc2: simplify register shift expressions) depends on:
 64 |       f923463335385 (staging: dwc2: unshift non-bool register value constants) (proximity)
 65 |     08b9f9db707ba (staging: dwc2: remove redundant register reads) depends on:
 66 |       c35205aa05124 (staging: dwc2: re-use hptxfsiz variable) (hard)
 67 |     a1fc524393583 (staging: dwc2: properly mask the GRXFSIZ register) depends on:
 68 |       4ab799df6d716 (staging: dwc2: remove specific fifo size constants) (proximity)
 69 |       c35205aa05124 (staging: dwc2: re-use hptxfsiz variable) (proximity)
 70 |       08b9f9db707ba (staging: dwc2: remove redundant register reads) (hard)
 71 |     9badec2f9fa92 (staging: dwc2: interpret all hwcfg and related register at init time) depends on:
 72 |       3b9edf88472e9 (staging: dwc2: fix off-by-one in check for max_packet_count parameter) (hard)
 73 |       f923463335385 (staging: dwc2: unshift non-bool register value constants) (hard)
 74 |       1c58ce133971e (staging: dwc2: only read the snpsid register once) (hard)
 75 |       a1fc524393583 (staging: dwc2: properly mask the GRXFSIZ register) (hard)
 76 |     de4a193193989 (staging: dwc2: validate the value for phy_utmi_width) depends on:
 77 |       9badec2f9fa92 (staging: dwc2: interpret all hwcfg and related register at init time) (proximity)
 78 |     4ab799df6d716 (staging: dwc2: remove specific fifo size constants)  ····················*    
 79 |     3b9edf88472e9 (staging: dwc2: fix off-by-one in check for max_packet_count param  ······│·X  
 80 |     f923463335385 (staging: dwc2: unshift non-bool register value constants)  ··········*···│·X  
 81 |     1c58ce133971e (staging: dwc2: only read the snpsid register once)  ·················│···│·X  
 82 |     d6ec53e04bf79 (staging: dwc2: simplify register shift expressions)  ────────────────┘   │ │  
 83 |     acdb9046b61a6 (staging: dwc2: add missing shift)                                        │ │  
 84 |     57bb8aeda06af (staging: dwc2: simplify debug output in dwc_hc_init)                     │ │  
 85 |     c35205aa05124 (staging: dwc2: re-use hptxfsiz variable)  ·····························X·* │  
 86 |     08b9f9db707ba (staging: dwc2: remove redundant register reads)  ──────────────────────┘·X │  
 87 |     a1fc524393583 (staging: dwc2: properly mask the GRXFSIZ register)  ─────────────────────┘·X  
 88 |     9badec2f9fa92 (staging: dwc2: interpret all hwcfg and related register at init t  ────────┘·*
 89 |     de4a193193989 (staging: dwc2: validate the value for phy_utmi_width)  ──────────────────────┘
 90 |     91121c103ae93 (staging: dwc2: make dwc2_core_params documentation more complete)             
 91 | 
 92 | The above uses by-line analysis. A per-file analysis is also available,
 93 | but as you can see below, it is not so useful for this example (since
 94 | there are only a handful of files in the driver, pretty much every patch
 95 | depends on every other patch):
 96 | 
 97 |     $ patchdeps --git 025a9230c8373..91121c103ae93 --matrix --by-file
 98 |     4ab799df6d716 (staging: dwc2: remove specific fifo size constants)  ················X·············X···X  
 99 |     3b9edf88472e9 (staging: dwc2: fix off-by-one in check for max_packet_count param  ··X···X···X·X·X·X·X·X  
100 |     f923463335385 (staging: dwc2: unshift non-bool register value constants)  ──────────┘·X·X·X·X·X·X·X·X·X  
101 |     1c58ce133971e (staging: dwc2: only read the snpsid register once)  ───────────────────┘·X·X·│·│·│·│·X │  
102 |     d6ec53e04bf79 (staging: dwc2: simplify register shift expressions)  ────────────────────┘·X·X·X·X·X·X·X  
103 |     acdb9046b61a6 (staging: dwc2: add missing shift)  ────────────────────────────────────────┘·│·│·│·│·X │  
104 |     57bb8aeda06af (staging: dwc2: simplify debug output in dwc_hc_init)  ───────────────────────┘·X·X·X·X·X  
105 |     c35205aa05124 (staging: dwc2: re-use hptxfsiz variable)  ─────────────────────────────────────┘·X·X·X·X  
106 |     08b9f9db707ba (staging: dwc2: remove redundant register reads)  ────────────────────────────────┘·X·X·X  
107 |     a1fc524393583 (staging: dwc2: properly mask the GRXFSIZ register)  ───────────────────────────────┘·X·X  
108 |     9badec2f9fa92 (staging: dwc2: interpret all hwcfg and related register at init t  ──────────────────┘·X·X
109 |     de4a193193989 (staging: dwc2: validate the value for phy_utmi_width)  ────────────────────────────────┘·X
110 |     91121c103ae93 (staging: dwc2: make dwc2_core_params documentation more complete)  ──────────────────────┘
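
The per-file analysis shown in the matrix above can be sketched roughly as follows: every patch depends on all earlier patches that touched one of the same files. The function name and the `(name, files)` input shape are assumptions for this illustration, not the tool's actual interface.

```python
from __future__ import annotations

from collections import defaultdict


def by_file_dependencies(patches: list[tuple[str, list[str]]]) -> dict[str, list[str]]:
    """Map each patch name to the earlier patches that touched the same files.

    `patches` is an ordered list of (patch name, files it modifies).
    """
    touched: dict[str, list[str]] = defaultdict(list)
    depends: dict[str, list[str]] = defaultdict(list)
    for name, files in patches:
        # dict.fromkeys drops duplicate paths while keeping their order
        for path in dict.fromkeys(files):
            for other in touched[path]:
                if other not in depends[name]:
                    depends[name].append(other)
            touched[path].append(name)
    return dict(depends)


series = [("p1", ["a.c"]), ("p2", ["b.c"]), ("p3", ["a.c", "b.c"])]
print(by_file_dependencies(series))  # {'p3': ['p1', 'p2']}
```

With only a handful of files, almost every later patch ends up depending on almost every earlier one, which is why the matrix above is so densely filled.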
111 | 
112 | Copyright & License
113 | -------------------
114 | © 2013 Matthijs Kooijman <<matthijs@stdin.nl>>
115 | 
116 | Patchdeps is made available under the MIT license
117 | 
118 | Permission is hereby granted, free of charge, to any person obtaining
119 | a copy of this software and associated documentation files (the
120 | "Software"), to deal in the Software without restriction, including
121 | without limitation the rights to use, copy, modify, merge, publish,
122 | distribute, sublicense, and/or sell copies of the Software, and to
123 | permit persons to whom the Software is furnished to do so, subject to
124 | the following conditions:
125 | 
126 | The above copyright notice and this permission notice shall be
127 | included in all copies or substantial portions of the Software.
128 | 
129 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
130 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
131 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
132 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
133 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
134 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
135 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
136 | 


--------------------------------------------------------------------------------
/parser.py:
--------------------------------------------------------------------------------
  1 | # The MIT License (MIT)
  2 | # Copyright (c) 2012 Matias Bordese
  3 | # Copyright (c) 2013 Matthijs Kooijman <matthijs@stdin.nl>
  4 | #
  5 | # Permission is hereby granted, free of charge, to any person obtaining a copy
  6 | # of this software and associated documentation files (the "Software"), to deal
  7 | # in the Software without restriction, including without limitation the rights
  8 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  9 | # copies of the Software, and to permit persons to whom the Software is
 10 | # furnished to do so, subject to the following conditions:
 11 | #
 12 | # The above copyright notice and this permission notice shall be included in all
 13 | # copies or substantial portions of the Software.
 14 | #
 15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 16 | # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 17 | # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 18 | # IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
 19 | # DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
 20 | # OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
 21 | # OR OTHER DEALINGS IN THE SOFTWARE.
 22 | 
 23 | """Unified diff parser module."""
 24 | 
 25 | # This file is based on the unidiff library by Matías Bordese (at
 26 | # https://github.com/matiasb/python-unidiff)
 27 | 
 28 | # A "Changeset" consists of multiple "PatchFiles" / commits (GitRev)
 29 | # A single patch(-file)/commit affects a "set of PatchedFiles"
 30 | # For each "PatchedFile" the patch consists of multiple "Hunks"
 31 | # Each "Hunk" consists of multiple "Lines" being "context or change" (LineType)
 32 | 
 33 | from __future__ import annotations
 34 | 
 35 | import re
 36 | from enum import Enum
 37 | from typing import Any, Iterable, Iterator
 38 | 
 39 | RE_SOURCE_FILENAME = re.compile(r'^--- (?P<filename>[^\t]+)')
 40 | RE_TARGET_FILENAME = re.compile(r'^\+\+\+ (?P<filename>[^\t]+)')
 41 | 
 42 | # @@ (source offset, length) (target offset, length) @@
 43 | RE_HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))?\ @@")
 44 | RE_HUNK_BODY_LINE = re.compile(r'^([- \+\\])')
 45 | 
 46 | 
 47 | class LineType(Enum):
 48 |     ADD = '+'  # added line
 49 |     DELETE = '-'  # deleted line
 50 |     CONTEXT = ' '  # kept line (context)
 51 |     IGNORE = '\\'  # No newline case (ignore)
 52 | 
 53 | 
 54 | class UnidiffParseError(Exception):
 55 |     pass
 56 | 
 57 | 
 58 | class Line:
 59 |     """
 60 |     A single line from a patch hunk.
 61 |     """
 62 |     def __init__(
 63 |         self,
 64 |         hunk: Hunk,
 65 |         action: LineType,
 66 |         source_lineno_rel: int,
 67 |         source_line: str | None,
 68 |         target_lineno_rel: int,
 69 |         target_line: str | None,
 70 |     ) -> None:
 71 |         """
 72 |         The line numbers must always be present, either source_line or
 73 |         target_line can be None depending on the action.
 74 |         """
 75 |         self.hunk = hunk
 76 |         self.action = action
 77 |         self.source_lineno_rel = source_lineno_rel
 78 |         self.source_line = source_line
 79 |         self.target_lineno_rel = target_lineno_rel
 80 |         self.target_line = target_line
 81 | 
 82 |         self.source_lineno_abs = self.hunk.source_start + self.source_lineno_rel
 83 |         self.target_lineno_abs = self.hunk.target_start + self.target_lineno_rel
 84 | 
 85 |     def __str__(self) -> str:
 86 |         return f"(-{self.source_lineno_abs}, +{self.target_lineno_abs}) {self.action}{self.source_line or self.target_line}"
 87 | 
 88 | 
 89 | class PatchedFile(list["Hunk"]):
 90 |     """Data from a patched file."""
 91 | 
 92 |     def __init__(self, source: str = '', target: str = '') -> None:
 93 |         self.source_file = source
 94 |         self.target_file = target
 95 | 
 96 |         if self.source_file.startswith('a/') and self.target_file.startswith('b/'):
 97 |             self.path = self.source_file[2:]
 98 |         elif self.source_file.startswith('a/') and self.target_file == '/dev/null':
 99 |             self.path = self.source_file[2:]
100 |         elif self.target_file.startswith('b/') and self.source_file == '/dev/null':
101 |             self.path = self.target_file[2:]
102 |         else:
103 |             self.path = self.source_file
104 | 
105 | 
106 | class Hunk:
107 |     """Each of the modified blocks of a file."""
108 | 
109 |     def __init__(self, src_start: int = 0, src_len: int = 0, tgt_start: int = 0, tgt_len: int = 0) -> None:
110 |         self.source_start = src_start
111 |         self.source_length = self.source_todo = src_len
112 |         self.target_start = tgt_start
113 |         self.target_length = self.target_todo = tgt_len
114 |         self.changes: list[Line] = []
115 | 
116 |     def is_valid(self) -> bool:
117 |         """Check hunk header data matches entered lines info."""
118 |         return self.source_todo == self.target_todo == 0
119 | 
120 |     def append_line(self, line: Line) -> None:
121 |         """Append a line."""
122 |         self.changes.append(line)
123 | 
124 |         if line.action in {LineType.CONTEXT, LineType.DELETE}:
125 |             self.source_todo -= 1
126 |             if self.source_todo < 0:
127 |                 raise UnidiffParseError(
128 |                     f'Too many source lines in hunk: {self}')
129 | 
130 |         if line.action in {LineType.CONTEXT, LineType.ADD}:
131 |             self.target_todo -= 1
132 |             if self.target_todo < 0:
133 |                 raise UnidiffParseError(
134 |                     f'Too many target lines in hunk: {self}')
135 | 
136 |     def __str__(self) -> str:
137 |         return f"<@@ {self.source_start},{self.source_length} {self.target_start},{self.target_length} @@>"
138 | 
139 | 
140 | def _parse_hunk(
141 |     diff: Iterator[str],
142 |     source_start: int,
143 |     source_len: int,
144 |     target_start: int,
145 |     target_len: int,
146 | ) -> Hunk:
147 |     hunk = Hunk(source_start, source_len, target_start, target_len)
148 |     source_lineno = 0
149 |     target_lineno = 0
150 | 
151 |     for line in diff:
152 |         valid_line = RE_HUNK_BODY_LINE.match(line)
153 |         if valid_line:
154 |             action = LineType(valid_line.group(0))
155 |             original_line = line[1:]
156 | 
157 |             kwargs: dict[str, Any] = {
158 |                 "action": action,
159 |                 "hunk": hunk,
160 |                 "source_lineno_rel": source_lineno,
161 |                 "target_lineno_rel": target_lineno,
162 |                 "source_line": None,
163 |                 "target_line": None,
164 |             }
165 | 
166 |             if action == LineType.ADD:
167 |                 kwargs['target_line'] = original_line
168 |                 target_lineno += 1
169 |             elif action == LineType.DELETE:
170 |                 kwargs['source_line'] = original_line
171 |                 source_lineno += 1
172 |             elif action == LineType.CONTEXT:
173 |                 kwargs['source_line'] = original_line
174 |                 kwargs['target_line'] = original_line
175 |                 source_lineno += 1
176 |                 target_lineno += 1
177 |             hunk.append_line(Line(**kwargs))
178 |         else:
179 |             raise UnidiffParseError(f'Hunk diff data expected: {line}')
180 | 
181 |         # check hunk len(old_lines) and len(new_lines) are ok
182 |         if hunk.is_valid():
183 |             break
184 | 
185 |     return hunk
186 | 
187 | 
188 | def parse_diff(diff: Iterable[str]) -> list[PatchedFile]:
189 |     ret: list[PatchedFile] = []
190 | 
191 |     # Make sure we only iterate the diff once, instead of restarting
192 |     # from the top inside _parse_hunk
193 |     lines = iter(diff)
194 |     for line in lines:
195 |         if m := RE_SOURCE_FILENAME.match(line):
196 |             source_file = m['filename']
197 |         elif m := RE_TARGET_FILENAME.match(line):
198 |             target_file = m['filename']
199 |             current_file = PatchedFile(source_file, target_file)
200 |             ret.append(current_file)
201 |         elif m := RE_HUNK_HEADER.match(line):
202 |             hunk = _parse_hunk(
203 |                 lines,
204 |                 int(m[1]),
205 |                 _int1(m[2]),
206 |                 int(m[3]),
207 |                 _int1(m[4]),
208 |             )
209 |             current_file.append(hunk)
210 | 
211 |     return ret
212 | 
213 | 
214 | def _int1(s: str | None) -> int:
215 |     return 1 if s is None else int(s)
216 | 


--------------------------------------------------------------------------------
/patchdeps.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python3
  2 | 
  3 | # Copyright (c) 2013 Matthijs Kooijman <matthijs@stdin.nl>
  4 | #
  5 | # Permission is hereby granted, free of charge, to any person obtaining
  6 | # a copy of this software and associated documentation files (the
  7 | # "Software"), to deal in the Software without restriction, including
  8 | # without limitation the rights to use, copy, modify, merge, publish,
  9 | # distribute, sublicense, and/or sell copies of the Software, and to
 10 | # permit persons to whom the Software is furnished to do so, subject to
 11 | # the following conditions:
 12 | #
 13 | # The above copyright notice and this permission notice shall be
 14 | # included in all copies or substantial portions of the Software.
 15 | #
 16 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 17 | # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 18 | # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 19 | # IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
 20 | # CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
 21 | # TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 22 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 23 | #
 24 | # Simple script to process a list of patch files and identify obvious
 25 | # dependencies between them. Inspired by the similar (but more limited)
 26 | # perl script published at
 27 | # http://blog.mozilla.org/sfink/2012/01/05/patch-queue-dependencies/
 28 | #
 29 | 
 30 | from __future__ import annotations
 31 | 
 32 | import argparse
 33 | import collections
 34 | import itertools
 35 | import os
 36 | import subprocess
 37 | import sys
 38 | import textwrap
 39 | from enum import Enum
 40 | from parser import LineType, parse_diff
 41 | from typing import TYPE_CHECKING, Iterable, Iterator
 42 | 
 43 | if TYPE_CHECKING:
 44 |     from parser import Hunk, PatchedFile
 45 | 
 46 | 
 47 | class Depend(Enum):
 48 |     # Used if a patch changes a line changed by another patch
 49 |     HARD = ("hard", "X", "solid")
 50 |     # Used if a patch changes a line near a line changed by another patch
 51 |     PROXIMITY = ("proximity", "*", "dashed")
 52 |     # By filename
 53 |     FILENAME = ("", "X", "solid")
 54 | 
 55 |     def __init__(self, desc: str, matrixmark: str, dotstyle: str) -> None:
 56 |         self.desc = desc
 57 |         self.matrixmark = matrixmark
 58 |         self.dotstyle = dotstyle
 59 | 
 60 | 
 61 | class Changeset:
 62 |     def get_patch_set(self) -> list[PatchedFile]:
 63 |         """
 64 |         Returns this changeset as a list of PatchedFiles.
 65 |         """
 66 |         parsed = parse_diff(self.get_diff())
 67 |         if not parsed:
 68 |             sys.stderr.write(f"WARNING: Parsing diff {self} produced no patch hunks, maybe format is invalid?\n")
 69 |         return parsed
 70 | 
 71 |     def get_diff(self) -> Iterable[str]:
 72 |         """
 73 |         Returns the textual unified diff for this changeset as an
 74 |         iterable of lines
 75 |         """
 76 |         raise NotImplementedError
 77 | 
 78 |     def __repr__(self) -> str:
 79 |         return f"{self.__class__.__name__}({self!s})"
 80 | 
 81 | 
 82 | class PatchFile(Changeset):
 83 |     def __init__(self, filename: str) -> None:
 84 |         self.filename = filename
 85 | 
 86 |     def get_diff(self) -> Iterable[str]:
 87 |         # Iterating over a file gives separate lines, with newlines
 88 |         # included. We want those stripped off
 89 |         with open(self.filename, encoding='utf-8') as f:
 90 |             return [line.rstrip('\n') for line in f]
 91 | 
 92 |     @staticmethod
 93 |     def get_changesets(args: Iterable[str]) -> Iterator[PatchFile]:
 94 |         """
 95 |         Generate Changeset objects, given patch filenames.
 96 |         """
 97 |         for filename in args:
 98 |             yield PatchFile(filename)
 99 | 
100 |     def __str__(self) -> str:
101 |         return os.path.basename(self.filename)
102 | 
103 | 
104 | class GitRev(Changeset):
105 |     def __init__(self, rev: str, msg: str) -> None:
106 |         self.rev = rev
107 |         self.msg = msg
108 | 
109 |     def get_diff(self) -> Iterable[str]:
110 |         diff = subprocess.check_output(['git', 'diff', '--no-color', f"{self.rev}^", self.rev])
111 |         # Convert to utf8 and just drop any invalid characters (we're
112 |         # not interested in the actual file contents and all diff
113 |         # special characters are valid ascii).
114 |         return diff.decode(errors='ignore').split('\n')
115 | 
116 |     def __str__(self) -> str:
117 |         return f"{self.rev} ({self.msg})"
118 | 
119 |     @staticmethod
120 |     def get_changesets(args: list[str]) -> Iterator[GitRev]:
121 |         """
122 |         Generate Changeset objects, given arguments for git rev-list.
123 |         """
124 |         output = subprocess.check_output(['git', 'rev-list', '--oneline', '--reverse', *args])
125 | 
126 |         if not output:
127 |             sys.stderr.write("No revisions specified?\n")
128 |         else:
129 |             lines = output.decode().strip().split('\n')
130 |             for line in lines:
131 |                 yield GitRev(*line.split(' ', 1))
132 | 
133 | 
134 | def print_depends(patches: list[Changeset], depends: dict[Changeset, dict[Changeset, Depend]]) -> None:
135 |     for p in patches:
136 |         if dependencies := depends[p]:
137 |             print(f"{p} depends on:")
138 |             for dep in patches:
139 |                 if dependency := dependencies.get(dep):
140 |                     desc = dependency.desc
141 |                     if desc:
142 |                         print(f"  {dep} ({desc})")
143 |                     else:
144 |                         print(f"  {dep}")
145 | 
146 | 
147 | def print_depends_tsort(patches: list[Changeset], depends: dict[Changeset, dict[Changeset, Depend]]) -> None:
148 |     for p in patches:
149 |         if dependencies := depends[p]:
150 |             for dep in patches:
151 |                 if dep in dependencies:
152 |                     def no_delim(x):
153 |                         # Tsort source has: #define DELIM " \t\n"
154 |                         return str(x).replace(' ', '_').replace('\t', '_').replace('\n', '_')
155 |                     print(f"{no_delim(dep)}\t{no_delim(p)}")
156 | 
157 | 
158 | def print_depends_matrix(patches: list[Changeset], depends: dict[Changeset, dict[Changeset, Depend]]) -> None:
159 |     # Which patches have at least one dependency drawn (and thus
160 |     # need lines from then on)?
161 |     has_deps: set[Changeset] = set()
162 |     prereq: set[Changeset] = {dep for p in patches for dep in depends[p]}
163 |     # Every patch depending on other patches needs a column
164 |     depending: list[Changeset] = [p for p in patches if depends[p]]
165 |     column = 82
166 |     for p in patches:
167 |         if depending and depending[0] == p:
168 |             del depending[0]
169 |             column += 2
170 |             fill, corner = "─", "┘"
171 |         else:
172 |             fill = corner = "·" if p in prereq else " "
173 |         line = f"{f'{p!s:.80}  ':{fill}<{column}}{corner}"
174 | 
175 |         for i, dep in enumerate(depending):
176 |             # Show ruler if a later patch depends on this one
177 |             ruler = "·" if any(depends[d].get(p) for d in depending[i:]) else " "
178 |             # For every later patch, print an "X" if it depends on this one
179 |             if dependency := depends[dep].get(p):
180 |                 line += f"{ruler}{dependency.matrixmark}"
181 |                 has_deps.add(dep)
182 |             elif dep in has_deps:
183 |                 line += f"{ruler}│"
184 |             else:
185 |                 line += ruler * 2
186 | 
187 |         print(line)
188 | 
189 | 
190 | def dot_escape_string(s: str) -> str:
191 |     return s.replace("\\", "\\\\").replace('"', '\\"')
192 | 
193 | 
194 | def depends_dot(args: argparse.Namespace, patches: list[Changeset], depends: dict[Changeset, dict[Changeset, Depend]]) -> str:
195 |     """
196 |     Returns dot code for the dependency graph.
197 |     """
198 |     # Seems that 'fdp' gives the best clustering if patches are often independent
199 |     res = """
200 | digraph ConflictMap {
201 | node [shape=box]
202 | layout=neato
203 | overlap=scale
204 | """
205 | 
206 |     if args.randomize:
207 |         res += "start=random\n"
208 | 
209 |     for i, p in enumerate(patches):
210 |         label = dot_escape_string(str(p))
211 |         label = "\\n".join(textwrap.wrap(label, 25))
212 |         res += f'{i} [label="{label}"]\n'
213 |         for dep, v in depends[p].items():
214 |             res += f"{patches.index(dep)} -> {i} [style={v.dotstyle}]\n"
215 |     res += "}\n"
216 | 
217 |     return res
218 | 
219 | 
220 | def show_xdot(dot: str) -> None:
221 |     """
222 |     Shows a given dot graph in xdot
223 |     """
224 |     subprocess.run(['xdot', '/dev/stdin'], input=dot.encode(), check=True)
225 | 
226 | 
227 | class ByFileAnalyzer:
228 |     def analyze(self, args: argparse.Namespace, patches: list[Changeset]) -> dict[Changeset, dict[Changeset, Depend]]:
229 |         """
230 |         Find dependencies in a list of patches by looking at the files they
231 |         change.
232 | 
233 |         The algorithm is simple: Just keep a list of files changed, and mark
234 |         two patches as conflicting when they change the same file.
235 |         """
236 |         # Which patches touch a particular file. A dict of filename => list
237 |         # of patches
238 |         touches_file: dict[str, list[Changeset]] = collections.defaultdict(list)
239 | 
240 |         # Which patch depends on which other patches?
241 |         # A dict of patch => (dict of dependent patches => Depend.FILENAME)
242 |         depends: dict[Changeset, dict[Changeset, Depend]] = collections.defaultdict(dict)
243 | 
244 |         for patch in patches:
245 |             for f in patch.get_patch_set():
246 |                 for other in touches_file[f.path]:
247 |                     depends[patch][other] = Depend.FILENAME
248 | 
249 |                 touches_file[f.path].append(patch)
250 | 
251 |         if 'blame' in args.actions:
252 |             for path, ps in touches_file.items():
253 |                 patch = ps[-1]
254 |                 print(f"{patch!s:80.80} {path}")
255 | 
256 |         return depends
257 | 
258 | 
259 | class ByLineAnalyzer:
260 |     def analyze(self, args: argparse.Namespace, patches: list[Changeset]) -> dict[Changeset, dict[Changeset, Depend]]:
261 |         """
262 |         Find dependencies in a list of patches by looking at the lines they
263 |         change.
264 |         """
265 |         # Per-file info on which patch last touched a particular line.
266 |         # A dict of file => ByLineFileAnalyzer (which holds that file's line states)
267 |         state: dict[str, ByLineFileAnalyzer] = {}
268 | 
269 |         # Which patch depends on which other patches?
270 |         # A dict of patch => (dict of dependent patches => Depend)
271 |         depends: dict[Changeset, dict[Changeset, Depend]] = collections.defaultdict(dict)
272 | 
273 |         for patch in patches:
274 |             for f in patch.get_patch_set():
275 |                 if f.path not in state:
276 |                     state[f.path] = ByLineFileAnalyzer(f.path, args.proximity)
277 | 
278 |                 state[f.path].analyze(depends, patch, f)
279 | 
280 |         if 'blame' in args.actions:
281 |             for a in state.values():
282 |                 a.print_blame()
283 | 
284 |         return depends
285 | 
286 | 
287 | class ByLineFileAnalyzer:
288 |     """
289 |     Helper class for the ByLineAnalyzer, that performs the analysis for
290 |     a specific file. Created once per file and called for multiple patches.
291 |     """
292 | 
293 |     def __init__(self, fname: str, proximity: int) -> None:
294 |         self.fname = fname
295 |         self.proximity = proximity
296 |         self.line_list: list[ByLineFileAnalyzer.LineState] = []
297 | 
298 |     def analyze(self, depends: dict[Changeset, dict[Changeset, Depend]], patch: Changeset, hunks: PatchedFile) -> None:
299 |         # This is the index in line_list of the first line state that
300 |         # still uses source line numbers
301 |         self.to_update_idx = 0
302 | 
303 |         # The index in line_list of the last line processed (i.e.,
304 |         # matched against a diff line)
305 |         self.processed_idx = -1
306 | 
307 |         # Offset between source and target files at state_pos
308 |         self.offset = 0
309 | 
310 |         for hunk in hunks:
311 |             self.analyze_hunk(depends, patch, hunk)
312 | 
313 |         # Pretend we processed the entire list, so update_offset can
314 |         # update the line numbers of any remaining (unchanged) lines
315 |         # after the last hunk in this patch
316 |         self.processed_idx = len(self.line_list)
317 |         self.update_offset(0)
318 | 
319 |     def line_state(self, lineno: int, create: bool) -> LineState | None:
320 |         """
321 |         Returns the state of the given (source) line number, creating a
322 |         new empty state if it is not yet present and create is True.
323 |         """
324 | 
325 |         self.processed_idx += 1
326 |         for state in self.line_list[self.processed_idx:]:
327 |             # Found it, return
328 |             if state.lineno == lineno:
329 |                 return state
330 |             # It's not in there, stop looking
331 |             if state.lineno > lineno:
332 |                 break
333 |             # We've already passed this one, continue looking
334 |             self.processed_idx += 1
335 | 
336 |         if not create:
337 |             return None
338 | 
339 |         # We don't have state for this particular line, insert a
340 |         # new empty state
341 |         state = self.LineState(lineno=lineno)
342 |         self.line_list.insert(self.processed_idx, state)
343 |         return state
344 | 
345 |     def update_offset(self, amount: int) -> None:
346 |         """
347 |         Update the offset between target and source lines by the
348 |         specified amount.
349 | 
350 |         Takes care of updating the line states of all processed lines
351 |         (up to but excluding self.processed_idx) with the old offset
352 |         before changing it.
353 |         """
354 | 
355 |         for state in self.line_list[self.to_update_idx:self.processed_idx]:
356 |             state.lineno += self.offset
357 |             self.to_update_idx += 1
358 | 
359 |         self.offset += amount
360 | 
361 |     def analyze_hunk(self, depends: dict[Changeset, dict[Changeset, Depend]], patch: Changeset, hunk: Hunk) -> None:
362 |         #print('\n'.join(map(str, self.line_list)))
363 |         #print('--')
364 |         for change in hunk.changes:
365 |             # When adding a line, don't bother creating a new line
366 |             # state, since we'll be adding one anyway (this prevents
367 |             # extra unused linestates)
368 |             create = change.action != LineType.ADD
369 |             line_state = self.line_state(change.source_lineno_abs, create)
370 | 
371 |             # When changing a line, claim proximity lines before it as
372 |             # well.
373 |             if change.action != LineType.CONTEXT and self.proximity != 0:
374 |                 # i points to the only linestate that could contain the
375 |                 # state for lineno
376 |                 i = self.processed_idx - 1
377 |                 lineno = change.source_lineno_abs - 1
378 |                 while (change.source_lineno_abs - lineno <= self.proximity and
379 |                        lineno > 0):
380 |                     if (i < 0 or
381 |                         i >= self.to_update_idx and
382 |                         self.line_list[i].lineno < lineno or
383 |                         i < self.to_update_idx and
384 |                         self.line_list[i].lineno - self.offset < lineno):
385 |                             # This line does not exist yet, i points to an
386 |                             # earlier line. Insert it
387 |                             # _after_ i.
388 |                             self.line_list.insert(i + 1, self.LineState(lineno))
389 |                             # Point i at the inserted line
390 |                             i += 1
391 |                             self.processed_idx += 1
392 |                             assert i >= self.to_update_idx, "Inserting before already updated line"
393 | 
394 |                     # Claim this line
395 |                     s = self.line_list[i]
396 | 
397 |                     # Already claimed, stop looking. This should also
398 |                     # prevent i from becoming < to_update_idx - 1,
399 |                     # since the state at to_update_idx - 1 should always
400 |                     # be claimed
401 |                     if patch in s.proximity or s.changed_by == patch:
402 |                         break
403 | 
404 |                     s.proximity.add(patch)
405 |                     i -= 1
406 |                     lineno -= 1
407 | 
408 |             # For changes that know about the contents of the old line,
409 |             # check if it matches our observations
410 |             if change.action != LineType.ADD:
411 |                 assert line_state is not None
412 |                 if line_state.line is not None and change.source_line != line_state.line:
413 |                     sys.exit(
414 |                         f"While processing {patch}\n"
415 |                         "Warning: patch does not apply cleanly! Results are probably wrong!\n"
416 |                         f"According to previous patches, line {change.source_lineno_abs} is:\n"
417 |                         f"{line_state.line}\n"
418 |                         f"But according to {patch}, it should be:\n"
419 |                         f"{change.source_line}\n\n",
420 |                     )
421 | 
422 |             if change.action == LineType.CONTEXT:
423 |                 assert line_state is not None
424 |                 if line_state.line is None:
425 |                     line_state.line = change.target_line
426 | 
427 |                 # For context lines, only remember the line contents
428 |                 #claim_after(in_change, change.
429 |                 #in_change = False
430 | 
431 |             elif change.action == LineType.ADD:
432 |                 self.update_offset(1)
433 | 
434 |                 # Mark this line as changed by this patch
435 |                 s = self.LineState(lineno=change.target_lineno_abs,
436 |                                    line=change.target_line,
437 |                                    changed_by=patch)
438 |                 self.line_list.insert(self.processed_idx, s)
439 |                 assert self.processed_idx == self.to_update_idx, "Not everything updated?"
440 | 
441 |                 # Since we insert this using the target line number, it
442 |                 # doesn't need to be updated again
443 |                 self.to_update_idx += 1
444 | 
445 |                 # Add proximity deps for patches that touched code
446 |                 # around this line. We can't get a hard dependency for
447 |                 # an 'add' change, since we don't actually touch any
448 |                 # existing code
449 |                 if line_state:
450 |                     deps = itertools.chain(line_state.proximity,
451 |                                            [line_state.changed_by])
452 |                     for p in deps:
453 |                         if p and p not in depends[patch] and p != patch:
454 |                             depends[patch][p] = Depend.PROXIMITY
455 | 
456 |             elif change.action == LineType.DELETE:
457 |                 assert line_state is not None
458 |                 self.update_offset(-1)
459 | 
460 |                 # This line was touched by another patch, add a hard dependency
461 |                 if line_state.changed_by:
462 |                     depends[patch][line_state.changed_by] = Depend.HARD
463 |                     # TODO(PHH): Assigning to singleton Depend.*.dottooltip; unused by `depends_dot`
464 |                     # https://graphviz.org/docs/attrs/tooltip/
465 |                     # depends[patch][line_state.changed_by].dottooltip = f"-{change.source_line}"
466 | 
467 |                 # Also add proximity deps for patches that touched code
468 |                 # around this line
469 |                 for p in line_state.proximity:
470 |                     if (p not in depends[patch]) and p != patch:
471 |                         depends[patch][p] = Depend.PROXIMITY
472 | 
473 |                 # Forget about the state for this source line
474 |                 del self.line_list[self.processed_idx]
475 |                 self.processed_idx -= 1
476 | 
477 |             # After changing a line, claim proximity lines after it as well.
478 |             if change.action != LineType.CONTEXT and self.proximity != 0:
479 |                 # i points to the only linestate that could contain the
480 |                 # state for lineno
481 |                 i = self.to_update_idx
482 |                 # When a file is created, the source line for the adds is 0...
483 |                 lineno = change.source_lineno_abs or 1
484 |                 while (lineno - change.source_lineno_abs < self.proximity):
485 |                     if i >= len(self.line_list) or self.line_list[i].lineno > lineno:
486 |                         # This line does not exist yet, i points to a
487 |                         # later line. Insert it _before_ i.
488 |                         self.line_list.insert(i, self.LineState(lineno))
489 |                         assert i > self.processed_idx, "Inserting before already processed line"
490 | 
491 |                     # Claim this line
492 |                     self.line_list[i].proximity.add(patch)
493 | 
494 |                     i += 1
495 |                     lineno += 1
496 | 
497 |     def print_blame(self) -> None:
498 |         print(f"{self.fname}:")
499 |         next_line: int | None = None
500 |         for line_state in self.line_list:
501 |             if line_state.line is None:
502 |                 continue
503 | 
504 |             if next_line and line_state.lineno != next_line:
505 |                 print(f"{'':50}    …")
506 | 
507 |             print(f"{line_state.changed_by or ''!s:50.50} {line_state.lineno:4} {line_state.line}")
508 |             next_line = line_state.lineno + 1
509 | 
510 |         print()
511 | 
512 | 
513 |     class LineState:
514 |         """ State of a particular line in a file """
515 |         def __init__(self, lineno: int, line: str | None = None, changed_by: Changeset | None = None) -> None:
516 |             self.lineno = lineno
517 |             self.line = line
518 |             self.changed_by = changed_by
519 |             # Set of patches that changed lines near this one
520 |             self.proximity: set[Changeset] = set()
521 | 
522 |         def __str__(self) -> str:
523 |             return f"{self.lineno}: changed by {self.changed_by}: {self.line}"
524 | 
525 | 
526 | def parse_args() -> argparse.Namespace:
527 |     parser = argparse.ArgumentParser(description='Analyze patches for dependencies.')
528 | 
529 |     types = parser.add_argument_group('type').add_mutually_exclusive_group(required=True)
530 |     types.add_argument('--git', dest='changeset_type', action='store_const',
531 |                    const=GitRev,
532 |                    help='Analyze a list of git revisions (non-option arguments are passed to git rev-list as-is)')
533 |     types.add_argument('--patches', dest='changeset_type', action='store_const',
534 |                    const=PatchFile,
535 |                    help='Analyze a list of patch files (non-option arguments are patch filenames)')
536 | 
537 |     parser.add_argument('arguments', metavar="ARG", nargs='*', help="""
538 |                         Specification of patches to analyze, depending
539 |                         on the type given. When --git is given, this is
540 |                         passed to git rev-list as-is (so use a valid
541 |                         revision range, like HEAD^^..HEAD). When
542 |                         --patches is given, these are filenames of patch
543 |                         files.""")
544 |     parser.add_argument('--by-file', dest='analyzer', action='store_const',
545 |                         const=ByFileAnalyzer, default=ByLineAnalyzer, help="""
546 |                         Mark patches as conflicting when they change the
547 |                         same file (by default, they are conflicting when
548 |                         they change the same lines).""")
549 |     parser.add_argument('--proximity', default=2, metavar='LINES',
550 |                         type=int, help="""
551 |                         The number of lines that must separate two
552 |                         changes to avoid marking them as dependent. Pass 0 to
553 |                         only consider exactly the same line. This option
554 |                         is not used when --by-file is passed. The default
555 |                         value is %(default)s.""")
556 |     parser.add_argument('--randomize', action='store_true', help="""
557 |                         Randomize the graph layout produced by
558 |                         --depends-dot and --depends-xdot.""")
559 | 
560 |     actions = parser.add_argument_group('actions')
561 |     actions.add_argument('--blame', dest='actions', action='append_const',
562 |                         const='blame', help="""
563 |                         Instead of outputting patch dependencies,
564 |                         output for each line or file which patch changed
565 |                         it last.""")
566 |     actions.add_argument('--list', dest='actions', action='append_const',
567 |                         const='list', help="""
568 |                         Output a list of each patch and the patches it
569 |                         depends on.""")
570 |     actions.add_argument('--matrix', dest='actions', action='append_const',
571 |                         const='matrix', help="""
572 |                         Output a matrix with patches on both axes and
573 |                         markings for dependencies. This is used if no
574 |                         action is given.""")
575 |     actions.add_argument('--tsort', dest='actions', action='append_const',
576 |                         const='tsort', help="""
577 |                         Show dependency graph as tsort input.""")
578 |     actions.add_argument('--dot', dest='actions', action='append_const',
579 |                         const='dot', help="""
580 |                         Output dot format for a dependency graph.""")
581 |     actions.add_argument('--xdot', dest='actions', action='append_const',
582 |                         const='xdot', help="""
583 |                         Show a dependency graph using xdot (if available).""")
584 | 
585 |     args = parser.parse_args()
586 | 
587 |     if not args.actions:
588 |         args.actions = ['matrix']
589 | 
590 |     return args
591 | 
592 | 
593 | def main() -> None:
594 |     args = parse_args()
595 | 
596 |     patches: list[Changeset] = list(args.changeset_type.get_changesets(args.arguments))
597 | 
598 |     depends = args.analyzer().analyze(args, patches)
599 | 
600 |     if 'list' in args.actions:
601 |         print_depends(patches, depends)
602 | 
603 |     if 'matrix' in args.actions:
604 |         print_depends_matrix(patches, depends)
605 | 
606 |     if 'tsort' in args.actions:
607 |         print_depends_tsort(patches, depends)
608 | 
609 |     if 'dot' in args.actions:
610 |         print(depends_dot(args, patches, depends))
611 | 
612 |     if 'xdot' in args.actions:
613 |         show_xdot(depends_dot(args, patches, depends))
614 | 
615 | 
616 | if __name__ == "__main__":
617 |     main()
618 | 
619 | # vim: set sw=4 sts=4 et:
620 | 


--------------------------------------------------------------------------------