├── .python-version
├── messages.json
├── .gitignore
├── Fmt.sublime-commands
├── Main.sublime-menu
├── messages
│   ├── 0.1.8.md
│   ├── 0.1.11.md
│   └── install.md
├── unlicense
├── Fmt.sublime-settings
├── readme.md
├── Fmt.py
└── difflib.py
/.python-version:
--------------------------------------------------------------------------------
1 | 3.8
--------------------------------------------------------------------------------
/messages.json:
--------------------------------------------------------------------------------
1 | {
2 | "install": "messages/install.md",
3 | "0.1.8": "messages/0.1.8.md",
4 | "0.1.11": "messages/0.1.11.md"
5 | }
6 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | /*
2 | !/*ignore
3 | !/.python-version
4 | !/unlicense
5 | !/*.py
6 | !/*.md
7 | !/*.json
8 | !/*.sublime-*
9 |
10 | !/messages
11 | /messages/*
12 | !/messages/*.md
13 |
--------------------------------------------------------------------------------
/Fmt.sublime-commands:
--------------------------------------------------------------------------------
1 | [
2 | {"caption": "Fmt: Format Buffer", "command": "fmt_format_buffer"},
3 | {"caption": "Fmt: Format Selection", "command": "fmt_format_selection"},
4 | {
5 | "caption": "Preferences: Fmt Settings",
6 | "command": "edit_settings",
7 | "args": {
8 | "base_file": "${packages}/Fmt/Fmt.sublime-settings",
9 | "default": "{\n\t$0\n}",
10 | },
11 | },
12 | ]
13 |
--------------------------------------------------------------------------------
/Main.sublime-menu:
--------------------------------------------------------------------------------
1 | [
2 | {
3 | "id": "preferences",
4 | "children": [
5 | {
6 | "id": "package-settings",
7 | "children": [
8 | {
9 | "caption": "Fmt",
10 | "children": [
11 | {
12 | "caption": "Settings",
13 | "command": "edit_settings",
14 | "args": {
15 | "base_file": "${packages}/Fmt/Fmt.sublime-settings",
16 | "default": "{\n\t$0\n}"
17 | }
18 | }
19 | ]
20 | }
21 | ]
22 | }
23 | ]
24 | }
25 | ]
26 |
--------------------------------------------------------------------------------
/messages/0.1.8.md:
--------------------------------------------------------------------------------
1 | `Fmt` has a new command for selection formatting!
2 |
3 | * Palette title: `Fmt: Format Selection`
4 | * Command name for hotkeys: `fmt_format_selection`
5 |
6 | Example hotkey:
7 |
8 | ```json
9 | {"keys": ["primary+k", "primary+k"], "command": "fmt_format_selection"},
10 | ```
11 |
12 | Selection formatting even works for embedded syntaxes, choosing settings/rules by the scope at the start of each region. For example, you can format a block of JSON or Go embedded in a Markdown file.
13 |
14 | When you have different formatters configured for both inner and outer syntaxes, it may currently choose the outer one. This could be rectified on demand.
15 |
--------------------------------------------------------------------------------
/messages/0.1.11.md:
--------------------------------------------------------------------------------
1 | `Fmt` now uses `"merge_type": "replace"` by default. This avoids worst-case freezes caused by poor combinatorial complexity of the diff algorithm. When using a precise formatter that generates few diffs, such as `gofmt` or `rustfmt`, it's safe to opt into diffing, which is better at preserving scroll and cursor position.
2 |
3 | Example config:
4 |
5 | ```json
6 | {
7 | "rules": [
8 | // Explicit diff merge.
9 | {
10 | "selector": "source.go",
11 | "cmd": ["goimports"],
12 | "format_on_save": true,
13 | "merge_type": "diff",
14 | },
15 | // Uses default replace merge.
16 | {
17 | "selector": "source.json",
18 | "cmd": ["jsonfmt"],
19 | },
20 | ],
21 | }
22 | ```
23 |
--------------------------------------------------------------------------------
/messages/install.md:
--------------------------------------------------------------------------------
1 | ## Fmt Setup
2 |
3 | (More in the readme: https://github.com/mitranim/sublime-fmt)
4 |
5 | Fmt has NO DEFAULT FORMATTERS. It invokes CLI programs installed globally on your system. You must specify them in the plugin settings:
6 |
7 | menu → Preferences → Package Settings → Fmt → Settings
8 |
9 | Example for Go:
10 |
11 | {
12 | "rules": [
13 | {
14 | "selector": "source.go",
15 | "cmd": ["goimports"],
16 | "format_on_save": true,
17 | "merge_type": "diff",
18 | },
19 | ],
20 | }
21 |
22 | To understand Sublime scopes and selector matching, read this short official doc: https://www.sublimetext.com/docs/selectors.html.
23 |
24 | HOW TO GET SCOPE NAME:
25 |
26 | Option 1: menu → Tools → Developer → Show Scope Name.
27 |
28 | Option 2: run the command `Fmt: Format Buffer`, and if not configured for the current scope, it will tell you!
29 |
--------------------------------------------------------------------------------
/unlicense:
--------------------------------------------------------------------------------
1 | This is free and unencumbered software released into the public domain.
2 |
3 | Anyone is free to copy, modify, publish, use, compile, sell, or
4 | distribute this software, either in source code form or as a compiled
5 | binary, for any purpose, commercial or non-commercial, and by any
6 | means.
7 |
8 | In jurisdictions that recognize copyright laws, the author or authors
9 | of this software dedicate any and all copyright interest in the
10 | software to the public domain. We make this dedication for the benefit
11 | of the public at large and to the detriment of our heirs and
12 | successors. We intend this dedication to be an overt act of
13 | relinquishment in perpetuity of all present and future rights to this
14 | software under copyright law.
15 |
16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
19 | IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
20 | OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
21 | ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
22 | OTHER DEALINGS IN THE SOFTWARE.
23 |
24 | For more information, please refer to <https://unlicense.org>
25 |
--------------------------------------------------------------------------------
/Fmt.sublime-settings:
--------------------------------------------------------------------------------
1 | {
2 | /*
3 | Formatting rules should be added here. (In your own settings file.)
4 |
5 | Every rule is a dictionary, where the only required field is "selector":
6 | a syntax type such as "source.python". See the docs on selectors:
7 | https://www.sublimetext.com/docs/selectors.html. All other fields are
8 | overrides for global settings such as "cmd".
9 |
10 | Example for Go:
11 |
12 | "rules": [
13 | {
14 | "selector": "source.go",
15 | "cmd": ["goimports"],
16 | "format_on_save": true,
17 | "merge_type": "diff",
18 | },
19 | ],
20 | */
21 | "rules": [],
22 |
23 | /*
24 | Command to invoke, with command line arguments. Must be a list of strings.
25 | The command must communicate over standard input/output.
26 |
27 | While technically this can be set at the top level, in practice you should
28 | set this PER SELECTOR in the "rules" setting, using different fmters for
29 | different scopes.
30 |
31 | Supports variable substitution, using the shell variable interpolation
32 | syntax:
33 |
34 | "cmd": ["some_command", "$tab_size"]
35 |
36 | Supported variables:
37 |
38 | - Environment variables via `os.environ`.
39 |
40 | - Special variables available in build systems:
41 | https://www.sublimetext.com/docs/build_systems.html.
42 |
43 | - $tab_size -- Indent width, usually 2 or 4; takes the "tab_size" setting
44 | from the current view.
45 |
46 | - $indent -- Literal indent: either N spaces or a single tab.
47 | */
48 | "cmd": null,
49 |
50 | /*
51 | Additional environment variables for the subprocess. Environment is always
52 | inherited from Sublime Text, which generally tries to mimic your shell env.
53 | This is needed only for additional variables and overrides.
54 |
55 | Can be configured per rule / per selector.
56 | */
57 | "env": null,
58 |
59 | /*
60 | Format current buffer on save. Disabled by default. Can be overridden for
61 | individual scope selectors.
62 |
63 | Note that you can format the buffer manually via the "Fmt: Format Buffer"
64 | command. You can also format selection via the `Fmt: Format Selection`
65 | command.
66 | */
67 | "format_on_save": false,
68 |
69 | /*
70 | Determines the CWD of the subprocess. Possible values:
71 |
72 | - "auto" -- Try to use the current file's directory; fall back on
73 | the project root, which is assumed to be the first
74 | directory in the current window.
75 |
76 | - "project_root" -- Use the project root, which is assumed to be the first
77 | directory in the current window.
78 |
79 | - "none" -- Don't set the CWD.
80 |
81 | - ":<path>" -- Use hardcoded path; may be useful for project-specific
82 | settings.
83 | */
84 | "cwd_mode": "auto",
85 |
86 | /*
87 | How to show errors. Possible values:
88 |
89 | - "" -- Hide errors completely.
90 |
91 | - "console" -- Print errors to the Sublime console.
92 |
93 | - "panel" -- Show an output panel at the bottom.
94 |
95 | - "popup" -- Show obnoxious popup windows.
96 | */
97 | "error_style": "panel",
98 |
99 | /*
100 | Determines how to replace buffer contents. Can be overridden for individual
101 | scope selectors. "diff" is more precise and preserves scroll and cursor
102 | position, but can be EXTREMELY slow when the number of changes exceeds a few
103 | dozen.
104 |
105 | Possible values:
106 |
107 | - "replace" -- Simpler but doesn't preserve cursor position.
108 |
109 | - "diff" -- More complicated but better at preserving cursor position.
110 | */
111 | "merge_type": "replace",
112 |
113 | /*
114 | Subprocess timeout in seconds. If execution takes longer, Fmt kills the
115 | subprocess and aborts with an error.
116 | */
117 | "timeout": 60,
118 | }
119 |
--------------------------------------------------------------------------------
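The `"cmd"` comment in the settings above describes `$variable` substitution. As a rough illustration only, the sketch below approximates it with Python's standard `string.Template`. The plugin itself calls `sublime.expand_variables`, which is available only inside Sublime Text and may treat unknown variables differently. The helper names `build_variables` and `expand_cmd` and the `--indent-width` flag are hypothetical.

```python
import os
from string import Template


def build_variables(tab_size, use_spaces, environ=None):
    """Collect substitution variables, loosely mirroring the documented set."""
    variables = {
        "tab_size": str(tab_size),
        # Literal indent: either N spaces or a single tab.
        "indent": " " * tab_size if use_spaces else "\t",
    }
    # Environment variables are merged last, so they win on name clashes.
    variables.update(os.environ if environ is None else environ)
    return variables


def expand_cmd(cmd, variables):
    """Expand "$name" and "${name}" in every argument.

    Assumption: unknown names are left untouched (safe_substitute); the real
    Sublime API may behave differently.
    """
    return [Template(arg).safe_substitute(variables) for arg in cmd]


variables = build_variables(tab_size=2, use_spaces=True, environ={"HOME": "/home/me"})
print(expand_cmd(["some_command", "--indent-width", "$tab_size", "$HOME/.fmtrc"], variables))
```

Because the command is a list of strings rather than a shell line, each argument is expanded on its own and no shell quoting is involved.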
/readme.md:
--------------------------------------------------------------------------------
1 | ## Overview
2 |
3 | Sublime Text plugin for auto-formatting arbitrary code by calling arbitrary executables. Works for `gofmt`, `rustfmt`, any similar tool that's an executable and uses standard input/output.
4 |
5 | Features:
6 |
7 | * Format on demand. Optionally auto-format on save.
8 | * Configure executables and other settings per _scope_ (syntax type: `source.go`, `source.rust` and so on).
9 | * Optionally preserve cursor and scroll position when formatting, via `"merge_type": "diff"`.
10 | * Show errors in an output panel (configurable).
11 | * Format either an entire file, or only selection.
12 | * Selection formatting works for embedded syntaxes, such as JS inside HTML.
13 |
14 | Limitations:
15 |
16 | * Invokes a subprocess every time. Good enough for formatters written in compiled languages, such as `gofmt` and `rustfmt`. If a given formatter is written in JS and takes a second to start up, this tool might not be suitable.
17 |
18 | Based on https://github.com/mitranim/sublime-gofmt and fully replaces it. Also replaces [RustFmt](https://github.com/mitranim/sublime-rust-fmt) and countless others.
19 |
20 | ## Why
21 |
22 | Why does this exist?
23 |
24 | Package Control has special-case formatter plugins for different languages, and the monstrous Formatter with too many batteries included. This makes it hard to add formatters: someone has to make and publish a new plugin every time, or fork a repo and make a PR, etc.
25 |
26 | Many formatters just call a subprocess and use stdio. One plugin can handle them all, while letting the _user_ specify any new formatter for any new syntax! This works for `gofmt`, `rustfmt`, `clang-format`, and endless others.
27 |
28 | ## Installation
29 |
30 | ### Package Control
31 |
32 | 1. Get [Package Control](https://packagecontrol.io).
33 | 2. Open the command palette: ⇧⌘P or ⇧^P.
34 | 3. `Package Control: Install Package`.
35 | 4. `Fmt`.
36 |
37 | ### Manual
38 |
39 | Clone the repo and symlink it to your Sublime packages directory. Example for MacOS:
40 |
41 | ```sh
42 | git clone https://github.com/mitranim/sublime-fmt.git
43 | cd sublime-fmt
44 | ln -sf "$(pwd)" "$HOME/Library/Application Support/Sublime Text 3/Packages/Fmt"
45 | ```
46 |
47 | To find the packages directory on your system, use Sublime Text menu → Preferences → Browse Packages.
48 |
49 | ## Usage
50 |
51 | The plugin has _no default formatters_. It invokes CLI programs installed globally on your system. You must specify them in the plugin settings. Example for Go:
52 |
53 | ```json
54 | {
55 | "rules": [
56 | {
57 | "selector": "source.go",
58 | "cmd": ["goimports"],
59 | "format_on_save": true,
60 | "merge_type": "diff",
61 | },
62 | ],
63 | }
64 | ```
65 |
66 | To understand Sublime scopes and selector matching, read this short official doc: https://www.sublimetext.com/docs/selectors.html.
67 |
68 | **How to get scope name**. Option 1: menu → Tools → Developer → Show Scope Name. Option 2: run the command `Fmt: Format Buffer`, and if not configured for the current scope, it will tell you!
69 |
70 | To format on demand, run the `Fmt: Format Buffer` command from the command palette. See below how to configure hotkeys.
71 |
72 | To auto-format on save, set `"format_on_save": true` in the settings. Can be global or per rule.
73 |
74 | ## Settings
75 |
76 | See [`Fmt.sublime-settings`](Fmt.sublime-settings) for all available settings. To override them, open:
77 |
78 | ```
79 | menu → Preferences → Package Settings → Fmt → Settings
80 | ```
81 |
82 | The plugin looks for settings in the following places, with the following priority:
83 |
84 | * `"Fmt"` dict in general Sublime settings, project-specific or global.
85 | * `Fmt.sublime-settings`, user-created or default.
86 |
87 | For overrides, open project or global settings and make a `"Fmt"` entry:
88 |
89 | ```json
90 | {
91 | "Fmt": {
92 | "rules": [
93 | {
94 | "selector": "source.some_lang",
95 | "cmd": ["some_lang_fmt", "--some_arg"],
96 | },
97 | ],
98 | },
99 | }
100 | ```
101 |
102 | A rule may contain _any_ of the root-level settings, such as `format_on_save`. This allows fine-tuning.
103 |
104 | ## Commands
105 |
106 | In Sublime's command palette:
107 |
108 | * `Fmt: Format Buffer`
109 | * `Fmt: Format Selection`
110 |
111 | ## Hotkeys
112 |
113 | Hotkeys? More like _notkeys_!
114 |
115 | To avoid potential conflicts, this plugin does not come with hotkeys. To hotkey
116 | the format commands, add something like this to your `.sublime-keymap`:
117 |
118 | ```sublime-keymap
119 | {"keys": ["primary+k", "primary+j"], "command": "fmt_format_buffer"},
120 | {"keys": ["primary+k", "primary+k"], "command": "fmt_format_selection"},
121 | ```
122 |
123 | Sublime automatically resolves "primary" to "super" on MacOS and to "ctrl" on other systems.
124 |
125 | ## Changelog
126 |
127 | **2022-07-18**. Ignore informational output over stderr when the subprocess exits with 0 and stdout is non-empty.
128 |
129 | **2022-07-11**. Use `"merge_type": "replace"` by default. Diff is now opt-in due to extreme performance degradation for large amounts of diffs.
130 |
131 | **2020-12-28**. Support env variable substitution. Format-on-save is no longer enabled by default.
132 |
133 | **2020-11-26**. Support variable substitution in `cmd`.
134 |
135 | **2020-11-25**. Use scope selectors instead of exactly matching the scope name.
136 |
137 | **2020-10-25**. Support subprocess timeout, always kill the subprocess.
138 |
139 | **2020-10-23**. Support several ways of printing errors. By default, errors are shown in a transient output panel at the bottom.
140 |
141 | ## License
142 |
143 | https://unlicense.org
144 |
145 | `difflib.py` is based on code which is published under Apache license, see the hyperlink in the file. It underwent significant edits, and its licensing status is unknown to me. Everything else is original and under Unlicense.
146 |
--------------------------------------------------------------------------------
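The readme's "Settings" section describes a lookup order: the `"Fmt"` dict in Sublime settings first, then `Fmt.sublime-settings`, with a matching rule taking precedence over root-level values in each. The sketch below models that order outside Sublime Text. `selector_score` is a hypothetical, much simplified stand-in for `sublime.score_selector` (dotted-prefix matching only), and `lookup` is an illustrative name, not part of the plugin.

```python
def selector_score(scope, selector):
    """Hypothetical stand-in for sublime.score_selector: dotted-prefix match."""
    best = 0
    for name in scope.split():
        if name == selector or name.startswith(selector + "."):
            best = max(best, len(selector.split(".")))
    return best


def rule_for_scope(rules, scope):
    """Return the best-matching rule, or None when nothing matches."""
    if not rules:
        return None
    best = max(rules, key=lambda rule: selector_score(scope, rule["selector"]))
    return best if selector_score(scope, best["selector"]) > 0 else None


def lookup(key, scope, overrides, defaults):
    """Resolve a setting: "Fmt" dict first (rule, then root), then the settings file."""
    for layer in (overrides or {}, defaults or {}):
        rule = rule_for_scope(layer.get("rules"), scope)
        if rule is not None and key in rule:
            return rule[key]
        if key in layer:
            return layer[key]
    return None


defaults = {"merge_type": "replace", "rules": [{"selector": "source.go", "cmd": ["gofmt"]}]}
overrides = {"rules": [{"selector": "source.go", "merge_type": "diff"}]}
print(lookup("merge_type", "source.go", overrides, defaults))
print(lookup("cmd", "source.go", overrides, defaults))
```

In this model a project-level override only needs the keys it changes; anything it omits falls through to the package settings file.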
/Fmt.py:
--------------------------------------------------------------------------------
1 | import sublime
2 | import sublime_plugin
3 | import subprocess as sub
4 | import os
5 | import sys
6 | from . import difflib
7 |
8 | PLUGIN_NAME = 'Fmt'
9 | SETTINGS_KEY = PLUGIN_NAME + '.sublime-settings'
10 | IS_WINDOWS = os.name == 'nt'
11 | PANEL_OUTPUT_NAME = 'output.' + PLUGIN_NAME
12 |
13 | class fmt_listener(sublime_plugin.EventListener):
14 |     def on_pre_save(self, view):
15 |         if is_enabled(view) and get_setting(view, 'format_on_save'):
16 |             view.run_command('fmt_format_buffer')
17 |
18 | class fmt_format_buffer(sublime_plugin.TextCommand):
19 |     def run(self, edit):
20 |         view = self.view
21 |         try:
22 |             fmt_region(view, edit, view_region(view))
23 |         except Exception as err:
24 |             report(view, err)
25 |
26 | class fmt_format_selection(sublime_plugin.TextCommand):
27 |     def run(self, edit):
28 |         view = self.view
29 |
30 |         for region in view.sel():
31 |             try:
32 |                 fmt_region(view, edit, region)
33 |             except Exception as err:
34 |                 report(view, err)
35 |                 break
36 |
37 | class fmt_panel_replace_content(sublime_plugin.TextCommand):
38 |     def run(self, edit, text):
39 |         view = self.view
40 |         view.replace(edit, view_region(view), text)
41 |         view.sel().clear()
42 |
43 | # TODO: any other exception type should be printed with the stack. Only error
44 | # messages generated by Fmt, as `ErrMsg`, should have the stack suppressed
45 | # (which is the default behavior of `str.format`).
46 | class ErrMsg(Exception):
47 |     pass
48 |
49 | def fmt_region(view, edit, region):
50 |     if region.empty():
51 |         return
52 |
53 |     hide_panel(view.window())
54 |
55 |     source = view.substr(region)
56 |     scope = view.scope_name(region.begin())
57 |     fmted = fmt(view, source, view_encoding(view), scope)
58 |     if fmted == source:
59 |         return
60 |
61 |     merge_type = get_setting(view, 'merge_type', scope)
62 |
63 |     if merge_type == 'diff':
64 |         try:
65 |             merge_into_view(view, edit, fmted, region)
66 |         except difflib.TooManyDiffsException:
67 |             replace_view(view, edit, fmted, region)
68 |         return
69 |
70 |     if merge_type == 'replace':
71 |         replace_view(view, edit, fmted, region)
72 |         return
73 |
74 |     report(view, 'unknown value of setting "merge_type": {}'.format(merge_type))
75 |
76 | def fmt(view, input, encoding, scope):
77 |     cmd = get_setting(view, 'cmd', scope)
78 |
79 |     if not cmd:
80 |         raise ErrMsg('unable to find setting "cmd" for scope "{}"'.format(scope))
81 |
82 |     if not isinstance(cmd, list) or not every(cmd, is_string):
83 |         raise ErrMsg('expected setting "cmd" to be a list of strings, found {}'.format(cmd))
84 |
85 |     # Support "$variable" substitutions.
86 |     variables = extract_variables(view)
87 |     cmd = [sublime.expand_variables(arg, variables) for arg in cmd]
88 |
89 |     proc = sub.Popen(
90 |         args=cmd,
91 |         stdin=sub.PIPE,
92 |         stdout=sub.PIPE,
93 |         stderr=sub.PIPE,
94 |         startupinfo=process_startup_info(),
95 |         universal_newlines=False,
96 |         cwd=guess_cwd(view),
97 |         env=get_env(view, scope),
98 |     )
99 |
100 |     timeout = get_setting(view, 'timeout', scope)
101 |
102 |     try:
103 |         (stdout, stderr) = proc.communicate(input=bytes(input, encoding=encoding), timeout=timeout)
104 |     finally:
105 |         try:
106 |             proc.kill()
107 |         except:
108 |             pass
109 |
110 |     stdout = stdout.decode(encoding)
111 |     stderr = stderr.decode(encoding)
112 |
113 |     if proc.returncode != 0:
114 |         msg = str(sub.CalledProcessError(proc.returncode, cmd))
115 |         if len(stderr) > 0:
116 |             msg += ':\n' + stderr
117 |         elif len(stdout) > 0:
118 |             msg += ':\n' + stdout
119 |         raise ErrMsg(msg)
120 |
121 |     if len(stdout) == 0 and len(stderr) > 0:
122 |         raise ErrMsg(stderr)
123 |
124 |     return stdout
125 |
126 | def merge_into_view(view, edit, content, region):
127 |     def subview(start, end):
128 |         return view.substr(sublime.Region(start, end))
129 |
130 |     # Diff only the region being formatted, since `offset` starts at `region.begin()`.
131 |     diffs = difflib.myers_diffs(view.substr(region), content)
132 |     difflib.cleanup_efficiency(diffs)
133 |     offset = region.begin()
134 |
135 |     for (op_type, patch) in diffs:
136 |         patch_len = len(patch)
137 |         if op_type == difflib.Ops.EQUAL:
138 |             if subview(offset, offset+patch_len) != patch:
139 |                 report(view, "mismatch between diff's source and current content")
140 |                 return
141 |             offset += patch_len
142 |         elif op_type == difflib.Ops.INSERT:
143 |             view.insert(edit, offset, patch)
144 |             offset += patch_len
145 |         elif op_type == difflib.Ops.DELETE:
146 |             if subview(offset, offset+patch_len) != patch:
147 |                 report(view, "mismatch between diff's source and current content")
148 |                 return
149 |             view.erase(edit, sublime.Region(offset, offset+patch_len))
149 |
150 | def replace_view(view, edit, content, region):
151 |     position = view.viewport_position()
152 |     view.replace(edit, region, content)
153 |     # Works only on the main thread, hence lambda and timer.
154 |     restore = lambda: view.set_viewport_position(position, animate=False)
155 |     sublime.set_timeout(restore, 0)
156 |
157 | def report(view, msg):
158 |     window = view.window()
159 |     style = get_setting(view, 'error_style')
160 |
161 |     if style == '':
162 |         return
163 |
164 |     if style is None:
165 |         style = 'panel'
166 |
167 |     if style == 'console':
168 |         if isinstance(msg, Exception):
169 |             raise msg
170 |         msg = '[{}] {}'.format(PLUGIN_NAME, msg)
171 |         print(msg)
172 |         return
173 |
174 |     if style == 'panel':
175 |         msg = '[{}] {}'.format(PLUGIN_NAME, msg)
176 |         ensure_panel(window).run_command('fmt_panel_replace_content', {'text': norm_newlines(msg)})
177 |         show_panel(window)
178 |         return
179 |
180 |     if style == 'popup':
181 |         msg = '[{}] {}'.format(PLUGIN_NAME, msg)
182 |         sublime.error_message(msg)
183 |         return
184 |
185 |     sublime.error_message('[{}] unknown value of setting "error_style": {}'.format(PLUGIN_NAME, style))
186 |
187 | # Copied from other plugins, haven't personally tested on Windows.
188 | def process_startup_info():
189 |     if not IS_WINDOWS:
190 |         return None
191 |     startupinfo = sub.STARTUPINFO()
192 |     startupinfo.dwFlags |= sub.STARTF_USESHOWWINDOW
193 |     startupinfo.wShowWindow = sub.SW_HIDE
194 |     return startupinfo
195 |
196 | def guess_cwd(view):
197 |     window = view.window()
198 |     mode = get_setting(view, 'cwd_mode') or ''
199 |
200 |     if mode.startswith(':'):
201 |         return mode[1:]
202 |
203 |     if mode == 'none':
204 |         return None
205 |
206 |     if mode == 'project_root':
207 |         if len(window.folders()):
208 |             return window.folders()[0]
209 |         return None
210 |
211 |     if mode == 'auto':
212 |         if view.file_name():
213 |             return os.path.dirname(view.file_name())
214 |         if len(window.folders()):
215 |             return window.folders()[0]
216 |
217 | def get_in(val, *path):
218 |     for key in path:
219 |         val, ok = get(val, key)
220 |         if not ok:
221 |             return (None, False)
222 |     return (val, True)
223 |
224 | def get(val, key):
225 |     if (
226 |         isinstance(val, dict) and key in val
227 |     ) or (
228 |         (isinstance(val, list) or isinstance(val, tuple)) and
229 |         (isinstance(key, int) and len(val) > key)
230 |     ):
231 |         return (val[key], True)
232 |     return (None, False)
233 |
234 | def view_scope(view):
235 |     scopes = view.scope_name(0)
236 |     return scopes[0:scopes.find(' ')]
237 |
238 | def get_setting(view, key, scope = None):
239 |     if scope is None:
240 |         scope = view_scope(view)
241 |
242 |     overrides = view.settings().get(PLUGIN_NAME)
243 |
244 |     rule = rule_for_scope(get(overrides, 'rules')[0], scope)
245 |     (val, found) = get(rule, key)
246 |     if found:
247 |         return val
248 |
249 |     (val, found) = get_in(overrides, key)
250 |     if found:
251 |         return val
252 |
253 |     settings = sublime.load_settings(SETTINGS_KEY)
254 |
255 |     rule = rule_for_scope(settings.get('rules'), scope)
256 |     (val, found) = get(rule, key)
257 |     if found:
258 |         return val
259 |
260 |     return settings.get(key)
261 |
262 | def rule_for_scope(rules, scope):
263 |     if not rules:
264 |         return None
265 |
266 |     rule = max(rules, key = lambda rule: rule_score(rule, scope))
267 |
268 |     # Note: `max` doesn't ensure this condition.
269 |     if rule_score(rule, scope) > 0:
270 |         return rule
271 |
272 |     return None
273 |
274 | def rule_score(rule, scope):
275 |     if 'selector' not in rule:
276 |         raise ErrMsg('missing "selector" in rule {}'.format(rule))
277 |     return sublime.score_selector(scope, rule['selector'])
278 |
279 | def is_enabled(view):
280 |     return bool(get_setting(view, 'cmd'))
281 |
282 | def view_encoding(view):
283 |     encoding = view.encoding()
284 |     return 'UTF-8' if encoding == 'Undefined' else encoding
285 |
286 | def create_panel(window):
287 |     return window.create_output_panel(PLUGIN_NAME)
288 |
289 | def find_panel(window):
290 |     return window.find_output_panel(PLUGIN_NAME)  # Bare name, without the "output." prefix.
291 |
292 | def ensure_panel(window):
293 |     return find_panel(window) or create_panel(window)
294 |
295 | def hide_panel(window):
296 |     if window.active_panel() == PANEL_OUTPUT_NAME:
297 |         window.run_command('hide_panel', {'panel': PANEL_OUTPUT_NAME})
298 |
299 | def show_panel(window):
300 |     window.run_command('show_panel', {'panel': PANEL_OUTPUT_NAME})
301 |
302 | def every(iter, fun):
303 |     if iter:
304 |         for val in iter:
305 |             if not fun(val):
306 |                 return False
307 |     return True
308 |
309 | def is_string(val):
310 |     return isinstance(val, str)
311 |
312 | def extract_variables(view):
313 |     settings = view.settings()
314 |     tab_size = settings.get('tab_size') or 0
315 |     indent = ' ' * tab_size if settings.get('translate_tabs_to_spaces') else '\t'
316 |
317 |     vars = view.window().extract_variables()
318 |     vars['tab_size'] = str(tab_size)
319 |     vars['indent'] = indent
320 |     vars.update(os.environ)
321 |
322 |     return vars
323 |
324 | def view_region(view):
325 |     return sublime.Region(0, view.size())
326 |
327 | def get_env(view, scope):
328 |     val = get_setting(view, 'env', scope)
329 |     if val is None:
330 |         return None
331 |     env = os.environ.copy()
332 |     env.update(val)
333 |     return env
334 |
335 | def norm_newlines(src):
336 |     return src.replace('\r\n', '\n')
337 |
--------------------------------------------------------------------------------
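The `fmt` function above pipes the source text to a subprocess over stdin, reads stdout, and enforces a timeout. The standalone sketch below shows the same pattern with only the standard library, so it can be tried outside Sublime Text. The name `run_formatter` is hypothetical, `RuntimeError` stands in for the plugin's `ErrMsg`, and a Python one-liner plays the role of a real formatter such as `gofmt`.

```python
import subprocess
import sys


def run_formatter(cmd, source, encoding="utf-8", timeout=60):
    """Pipe `source` through `cmd` and return its stdout, raising on failure."""
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    try:
        stdout, stderr = proc.communicate(source.encode(encoding), timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()  # Reap the killed process so no pipes are left open.
        raise
    stdout, stderr = stdout.decode(encoding), stderr.decode(encoding)
    if proc.returncode != 0:
        raise RuntimeError("exit code {}: {}".format(proc.returncode, stderr or stdout))
    # Exit code 0 with empty stdout but non-empty stderr is treated as an error.
    if not stdout and stderr:
        raise RuntimeError(stderr)
    return stdout


# Stand-in "formatter": a Python one-liner that upper-cases stdin.
upper = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
print(run_formatter(upper, "package main"))
```

Passing the command as a list avoids shell parsing entirely, which matches the requirement that `"cmd"` be a list of strings.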
/difflib.py:
--------------------------------------------------------------------------------
1 | """
2 | Functions for diff, match and patch.
3 |
4 | Computes the difference between two texts to create a patch.
5 | Applies the patch onto another text, allowing for errors.
6 |
7 | Originally found at http://code.google.com/p/google-diff-match-patch/.
8 | Edited by Nelo Mitranim (2017, 2020).
9 | """
10 |
11 | import re
12 | from collections import namedtuple
13 |
14 | class Ops(object):
15 |     EQUAL = 'EQUAL'
16 |     INSERT = 'INSERT'
17 |     DELETE = 'DELETE'
18 |
19 | Diff = namedtuple('Diff', ['op', 'text'])
20 |
21 | # Cost of an empty edit operation in terms of edit characters.
22 | DIFF_EDIT_COST = 4
23 |
24 | BLANK_LINE_END = re.compile(r"\n\r?\n$")
25 |
26 | BLANK_LINE_START = re.compile(r"^\r?\n\r?\n")
27 |
28 | MAX_DIFFS_THRESHOLD = 32
29 |
30 | class TooManyDiffsException(Exception):
31 |     pass
32 |
33 | def myers_diffs(text1, text2, checklines=True):
34 |     """Find the differences between two texts. Simplifies the problem by
35 |     stripping any common prefix or suffix off the texts before diffing.
36 |
37 |     Args:
38 |         text1: Old string to be diffed.
39 |         text2: New string to be diffed.
40 |         checklines: Optional speedup flag. If present and false, then don't run
41 |             a line-level diff first to identify the changed areas.
42 |             Defaults to true, which does a faster, slightly less optimal diff.
43 |
44 |     Returns:
45 |         List of changes.
46 |     """
47 |     if text1 is None or text2 is None:
48 |         raise ValueError('Null inputs (myers_diffs)')
49 |
50 |     # Check for equality (speedup).
51 |     if text1 == text2:
52 |         if text1:
53 |             return [Diff(Ops.EQUAL, text1)]
54 |         return []
55 |
56 |     # Trim off common prefix (speedup).
57 |     common_length = common_prefix_length(text1, text2)
58 |     common_prefix = text1[:common_length]
59 |     text1 = text1[common_length:]
60 |     text2 = text2[common_length:]
61 |
62 |     # Trim off common suffix (speedup).
63 |     common_length = common_suffix_length(text1, text2)
64 |     if common_length == 0:
65 |         commonsuffix = ''
66 |     else:
67 |         commonsuffix = text1[-common_length:]
68 |         text1 = text1[:-common_length]
69 |         text2 = text2[:-common_length]
70 |
71 |     # Compute the diff on the middle block.
72 |     diffs = compute_diffs(text1, text2, checklines)
73 |
74 |     # Restore the prefix and suffix.
75 |     if common_prefix:
76 |         diffs[:0] = [Diff(Ops.EQUAL, common_prefix)]
77 |     if commonsuffix:
78 |         diffs.append(Diff(Ops.EQUAL, commonsuffix))
79 |     cleanup_merge(diffs)
80 |     return diffs
81 |
82 | def compute_diffs(text1, text2, checklines):
83 |     """Find the differences between two texts. Assumes that the texts do not
84 |     have any common prefix or suffix.
85 |
86 |     Args:
87 |         text1: Old string to be diffed.
88 |         text2: New string to be diffed.
89 |         checklines: Speedup flag. If false, then don't run a line-level diff
90 |             first to identify the changed areas.
91 |             If true, then run a faster, slightly less optimal diff.
92 |
93 |     Returns:
94 |         List of changes.
95 |     """
96 |     if not text1:
97 |         # Just add some text (speedup).
98 |         return [Diff(Ops.INSERT, text2)]
99 |
100 |     if not text2:
101 |         # Just delete some text (speedup).
102 |         return [Diff(Ops.DELETE, text1)]
103 |
104 |     if len(text1) > len(text2):
105 |         (longtext, shorttext) = (text1, text2)
106 |     else:
107 |         (shorttext, longtext) = (text1, text2)
108 |     i = longtext.find(shorttext)
109 |     if i != -1:
110 |         # Shorter text is inside the longer text (speedup).
111 |         diffs = [Diff(Ops.INSERT, longtext[:i]), Diff(Ops.EQUAL, shorttext),
112 |                  Diff(Ops.INSERT, longtext[i + len(shorttext):])]
113 |         # Swap insertions for deletions if diff is reversed.
114 |         if len(text1) > len(text2):
115 |             diffs[0] = diffs[0]._replace(op=Ops.DELETE)
116 |             diffs[2] = diffs[2]._replace(op=Ops.DELETE)
117 |         return diffs
118 |
119 |     if len(shorttext) == 1:
120 |         # Single character string.
121 |         # After the previous speedup, the character can't be an equality.
122 |         return [Diff(Ops.DELETE, text1), Diff(Ops.INSERT, text2)]
123 |
124 |     if checklines and len(text1) > 100 and len(text2) > 100:
125 |         return line_mode_diffs(text1, text2)
126 |
127 |     return diff_bisect(text1, text2)
128 |
129 | def line_mode_diffs(text1, text2):
130 |     """Do a quick line-level diff on both strings, then rediff the parts for
131 |     greater accuracy.
132 |     This speedup can produce non-minimal diffs.
133 |
134 |     Args:
135 |         text1: Old string to be diffed.
136 |         text2: New string to be diffed.
137 |
138 |     Returns:
139 |         List of changes.
140 |     """
141 |
142 |     # Scan the text on a line-by-line basis first.
143 |     (text1, text2, line_list) = lines_to_chars(text1, text2)
144 |
145 |     diffs = myers_diffs(text1, text2, False)
146 |
147 |     # Convert the diff back to original text.
148 |     diffs = [diff._replace(text=''.join(line_list[ord(char)] for char in diff.text)) for diff in diffs]
149 |
150 |     # Eliminate freak matches (e.g. blank lines)
151 |     cleanup_semantic(diffs)
152 |
153 |     # Rediff any replacement blocks, this time character-by-character.
154 |     # Add a dummy entry at the end.
155 |     diffs.append(Diff(Ops.EQUAL, ''))
156 |     pointer = 0
157 |     count_delete = 0
158 |     count_insert = 0
159 |     text_delete = ''
160 |     text_insert = ''
161 |     while pointer < len(diffs):
162 |         if diffs[pointer].op == Ops.INSERT:
163 |             count_insert += 1
164 |             text_insert += diffs[pointer].text
165 |         elif diffs[pointer].op == Ops.DELETE:
166 |             count_delete += 1
167 |             text_delete += diffs[pointer].text
168 |         elif diffs[pointer].op == Ops.EQUAL:
169 |             # Upon reaching an equality, check for prior redundancies.
170 |             if count_delete >= 1 and count_insert >= 1:
171 |                 # Delete the offending records and add the merged ones.
172 |                 a = myers_diffs(text_delete, text_insert, False)
173 |                 diffs[pointer - count_delete - count_insert : pointer] = a
174 |                 pointer = pointer - count_delete - count_insert + len(a)
175 |             count_insert = 0
176 |             count_delete = 0
177 |             text_delete = ''
178 |             text_insert = ''
179 |
180 |         pointer += 1
181 |
182 |     diffs.pop() # Remove the dummy entry at the end.
183 |
184 |     return diffs
185 |
186 | def diff_bisect(text1, text2):
187 | """Find the 'middle snake' of a diff, split the problem in two
188 | and return the recursively constructed diff.
189 | See Myers 1986 paper: An O(ND) Difference Algorithm and Its Variations.
190 |
191 | Args:
192 | text1: Old string to be diffed.
193 | text2: New string to be diffed.
194 |
195 | Returns:
196 | List of diff tuples.
197 | """
198 |
199 | # Cache the text lengths to prevent multiple calls.
200 | text1_length = len(text1)
201 | text2_length = len(text2)
202 | max_d = (text1_length + text2_length + 1) // 2
203 | v_offset = max_d
204 | v_length = 2 * max_d
205 | v1 = [-1] * v_length
206 | v1[v_offset + 1] = 0
207 | v2 = v1[:]
208 | delta = text1_length - text2_length
209 | # If the total number of characters is odd, then the front path will
210 | # collide with the reverse path.
211 | front = (delta % 2 != 0)
212 | # Offsets for start and end of k loop.
213 | # Prevents mapping of space beyond the grid.
214 | k1start = 0
215 | k1end = 0
216 | k2start = 0
217 | k2end = 0
218 | for d in range(max_d):
219 | # Walk the front path one step.
220 | for k1 in range(-d + k1start, d + 1 - k1end, 2):
221 | k1_offset = v_offset + k1
222 | if k1 == -d or (k1 != d and
223 | v1[k1_offset - 1] < v1[k1_offset + 1]):
224 | x1 = v1[k1_offset + 1]
225 | else:
226 | x1 = v1[k1_offset - 1] + 1
227 | y1 = x1 - k1
228 | while (x1 < text1_length and y1 < text2_length and
229 | text1[x1] == text2[y1]):
230 | x1 += 1
231 | y1 += 1
232 | v1[k1_offset] = x1
233 | if x1 > text1_length:
234 | # Ran off the right of the graph.
235 | k1end += 2
236 | elif y1 > text2_length:
237 | # Ran off the bottom of the graph.
238 | k1start += 2
239 | elif front:
240 | k2_offset = v_offset + delta - k1
241 | if k2_offset >= 0 and k2_offset < v_length and v2[k2_offset] != -1:
242 | # Mirror x2 onto top-left coordinate system.
243 | x2 = text1_length - v2[k2_offset]
244 | if x1 >= x2:
245 | # Overlap detected.
246 | return bisect_split_diffs(text1, text2, x1, y1)
247 |
248 | # Walk the reverse path one step.
249 | for k2 in range(-d + k2start, d + 1 - k2end, 2):
250 | k2_offset = v_offset + k2
251 | if k2 == -d or (k2 != d and
252 | v2[k2_offset - 1] < v2[k2_offset + 1]):
253 | x2 = v2[k2_offset + 1]
254 | else:
255 | x2 = v2[k2_offset - 1] + 1
256 | y2 = x2 - k2
257 | while (x2 < text1_length and y2 < text2_length and
258 | text1[-x2 - 1] == text2[-y2 - 1]):
259 | x2 += 1
260 | y2 += 1
261 | v2[k2_offset] = x2
262 | if x2 > text1_length:
263 | # Ran off the left of the graph.
264 | k2end += 2
265 | elif y2 > text2_length:
266 | # Ran off the top of the graph.
267 | k2start += 2
268 | elif not front:
269 | k1_offset = v_offset + delta - k2
270 | if k1_offset >= 0 and k1_offset < v_length and v1[k1_offset] != -1:
271 | x1 = v1[k1_offset]
272 | y1 = v_offset + x1 - k1_offset
273 | # Mirror x2 onto top-left coordinate system.
274 | x2 = text1_length - x2
275 | if x1 >= x2:
276 | # Overlap detected.
277 | return bisect_split_diffs(text1, text2, x1, y1)
278 |
279 | # Number of diffs equals number of characters, no commonality at all.
280 | return [Diff(Ops.DELETE, text1), Diff(Ops.INSERT, text2)]
281 |
282 | def bisect_split_diffs(text1, text2, x, y):
283 | """Given the location of the 'middle snake', split the diff in two parts
284 | and recurse.
285 |
286 | Args:
287 | text1: Old string to be diffed.
288 | text2: New string to be diffed.
289 | x: Index of split point in text1.
290 | y: Index of split point in text2.
291 |
292 | Returns:
293 | List of diff tuples.
294 | """
295 | text1a = text1[:x]
296 | text2a = text2[:y]
297 | text1b = text1[x:]
298 | text2b = text2[y:]
299 |
300 | # Compute both diffs serially.
301 | diffs = myers_diffs(text1a, text2a, False)
302 | diffsb = myers_diffs(text1b, text2b, False)
303 |
304 | if len(diffs) + len(diffsb) > MAX_DIFFS_THRESHOLD:
305 | raise TooManyDiffsException()
306 |
307 | return diffs + diffsb
308 |
309 | def lines_to_chars(text1, text2):
310 | """Split two texts into a list of unique lines. Reduce each text to a string
311 | in which each Unicode character encodes the index of one line.
312 |
313 | Args:
314 | text1: First string.
315 | text2: Second string.
316 |
317 | Returns:
318 | Three element tuple, containing the encoded text1, the encoded text2 and
319 | the list of unique strings. The zeroth element of the list of unique
320 | strings is intentionally blank.
321 | """
322 | line_list = [] # e.g. line_list[4] == "Hello\n"
323 | line_dict = {} # e.g. line_dict["Hello\n"] == 4
324 |
325 | # "\x00" is a valid character, but various debuggers don't like it.
326 | # So we'll insert a junk entry to avoid generating a null character.
327 | line_list.append('')
328 |
329 | def lines_to_chars_munge(text):
330 | """Split a text into lines. Reduce the text to a string
331 | in which each Unicode character encodes the index of one line.
332 | Modifies line_list and line_dict through being a closure.
333 |
334 | Args:
335 | text: String to encode.
336 |
337 | Returns:
338 | Encoded string.
339 | """
340 | chars = []
341 | # Walk the text, pulling out a substring for each line.
342 | # text.split('\n') would temporarily double our memory footprint.
343 | # Modifying text would create many large strings to garbage collect.
344 | line_start = 0
345 | line_end = -1
346 | while line_end < len(text) - 1:
347 | line_end = text.find('\n', line_start)
348 | if line_end == -1:
349 | line_end = len(text) - 1
350 | line = text[line_start:line_end + 1]
351 | line_start = line_end + 1
352 |
353 | if line in line_dict:
354 | chars.append(chr(line_dict[line]))
355 | else:
356 | line_list.append(line)
357 | line_dict[line] = len(line_list) - 1
358 | chars.append(chr(len(line_list) - 1))
359 | return ''.join(chars)
360 |
361 | chars1 = lines_to_chars_munge(text1)
362 | chars2 = lines_to_chars_munge(text2)
363 | return (chars1, chars2, line_list)
364 |
365 | def common_prefix_length(text1, text2):
366 | """Determine the common prefix of two strings.
367 |
368 | Args:
369 | text1: First string.
370 | text2: Second string.
371 |
372 | Returns:
373 | The number of characters common to the start of each string.
374 | """
375 | # Quick check for common null cases.
376 | if not text1 or not text2 or text1[0] != text2[0]:
377 | return 0
378 | # Binary search.
379 | # Performance analysis: http://neil.fraser.name/news/2007/10/09/
380 | pointermin = 0
381 | pointermax = min(len(text1), len(text2))
382 | pointermid = pointermax
383 | pointerstart = 0
384 | while pointermin < pointermid:
385 | if text1[pointerstart:pointermid] == text2[pointerstart:pointermid]:
386 | pointermin = pointermid
387 | pointerstart = pointermin
388 | else:
389 | pointermax = pointermid
390 | pointermid = (pointermax - pointermin) // 2 + pointermin
391 | return pointermid
392 |
393 | def common_suffix_length(text1, text2):
394 | """Determine the common suffix of two strings.
395 |
396 | Args:
397 | text1: First string.
398 | text2: Second string.
399 |
400 | Returns:
401 | The number of characters common to the end of each string.
402 | """
403 | # Quick check for common null cases.
404 | if not text1 or not text2 or text1[-1] != text2[-1]:
405 | return 0
406 | # Binary search.
407 | # Performance analysis: http://neil.fraser.name/news/2007/10/09/
408 | pointermin = 0
409 | pointermax = min(len(text1), len(text2))
410 | pointermid = pointermax
411 | pointerend = 0
412 | while pointermin < pointermid:
413 | if (text1[-pointermid:len(text1) - pointerend] ==
414 | text2[-pointermid:len(text2) - pointerend]):
415 | pointermin = pointermid
416 | pointerend = pointermin
417 | else:
418 | pointermax = pointermid
419 | pointermid = (pointermax - pointermin) // 2 + pointermin
420 | return pointermid
421 |
422 | def common_overlap(text1, text2):
423 | """Determine if the suffix of one string is the prefix of another.
424 |
425 | Args:
426 | text1: First string.
427 | text2: Second string.
428 |
429 | Returns:
430 | The number of characters common to the end of the first
431 | string and the start of the second string.
432 | """
433 | # Cache the text lengths to prevent multiple calls.
434 | text1_length = len(text1)
435 | text2_length = len(text2)
436 | # Eliminate the null case.
437 | if text1_length == 0 or text2_length == 0:
438 | return 0
439 | # Truncate the longer string.
440 | if text1_length > text2_length:
441 | text1 = text1[-text2_length:]
442 | elif text1_length < text2_length:
443 | text2 = text2[:text1_length]
444 | text_length = min(text1_length, text2_length)
445 | # Quick check for the worst case.
446 | if text1 == text2:
447 | return text_length
448 |
449 | # Start by looking for a single character match
450 | # and increase length until no match is found.
451 | # Performance analysis: http://neil.fraser.name/news/2010/11/04/
452 | best = 0
453 | length = 1
454 | while True:
455 | pattern = text1[-length:]
456 | found = text2.find(pattern)
457 | if found == -1:
458 | return best
459 | length += found
460 | if found == 0 or text1[-length:] == text2[:length]:
461 | best = length
462 | length += 1
463 |
464 | def cleanup_semantic(diffs):
465 | """Reduce the number of edits by eliminating semantically trivial
466 | equalities.
467 |
468 | Args:
469 | diffs: List of diff tuples.
470 | """
471 | changes = False
472 | equalities = [] # Stack of indices where equalities are found.
473 | lastequality = None # Always equal to diffs[equalities[-1]].text
474 | pointer = 0 # Index of current position.
475 | # Number of chars that changed prior to the equality.
476 | (length_insertions1, length_deletions1) = (0, 0)
477 | # Number of chars that changed after the equality.
478 | (length_insertions2, length_deletions2) = (0, 0)
479 | while pointer < len(diffs):
480 | if diffs[pointer].op == Ops.EQUAL: # Equality found.
481 | equalities.append(pointer)
482 | (length_insertions1, length_insertions2) = (length_insertions2, 0)
483 | (length_deletions1, length_deletions2) = (length_deletions2, 0)
484 | lastequality = diffs[pointer].text
485 | else: # An insertion or deletion.
486 | if diffs[pointer].op == Ops.INSERT:
487 | length_insertions2 += len(diffs[pointer].text)
488 | else:
489 | length_deletions2 += len(diffs[pointer].text)
490 | # Eliminate an equality that is smaller than or equal to the edits on both
491 | # sides of it.
492 | if (lastequality and (len(lastequality) <=
493 | max(length_insertions1, length_deletions1)) and
494 | (len(lastequality) <= max(length_insertions2, length_deletions2))):
495 | # Duplicate record.
496 | diffs.insert(equalities[-1], Diff(Ops.DELETE, lastequality))
497 | # Change second copy to insert.
498 | diffs[equalities[-1] + 1] = diffs[equalities[-1] + 1]._replace(op=Ops.INSERT)
499 | # Throw away the equality we just deleted.
500 | equalities.pop()
501 | # Throw away the previous equality (it needs to be reevaluated).
502 | if len(equalities):
503 | equalities.pop()
504 | if len(equalities):
505 | pointer = equalities[-1]
506 | else:
507 | pointer = -1
508 | # Reset the counters.
509 | length_insertions1, length_deletions1 = 0, 0
510 | length_insertions2, length_deletions2 = 0, 0
511 | lastequality = None
512 | changes = True
513 | pointer += 1
514 |
515 | # Normalize the diff.
516 | if changes:
517 | cleanup_merge(diffs)
518 | cleanup_semantic_lossless(diffs)
519 |
520 | # Find any overlaps between deletions and insertions ([-x-] = deleted, {+x+} = inserted).
521 | # e.g: [-abcxxx-]{+xxxdef+}
522 | # -> [-abc-]xxx{+def+}
523 | # e.g: [-xxxabc-]{+defxxx+}
524 | # -> {+def+}xxx[-abc-]
525 | # Only extract an overlap if it is as big as the edit ahead or behind it.
526 | pointer = 1
527 | while pointer < len(diffs):
528 | if (diffs[pointer - 1].op == Ops.DELETE and
529 | diffs[pointer].op == Ops.INSERT):
530 | deletion = diffs[pointer - 1].text
531 | insertion = diffs[pointer].text
532 | overlap_length1 = common_overlap(deletion, insertion)
533 | overlap_length2 = common_overlap(insertion, deletion)
534 | if overlap_length1 >= overlap_length2:
535 | if (overlap_length1 >= len(deletion) / 2.0 or
536 | overlap_length1 >= len(insertion) / 2.0):
537 | # Overlap found. Insert an equality and trim the surrounding edits.
538 | diffs.insert(pointer, Diff(Ops.EQUAL, insertion[:overlap_length1]))
539 | diffs[pointer - 1] = Diff(Ops.DELETE, deletion[:len(deletion) - overlap_length1])
540 | diffs[pointer + 1] = Diff(Ops.INSERT, insertion[overlap_length1:])
541 | pointer += 1
542 | else:
543 | if (overlap_length2 >= len(deletion) / 2.0 or
544 | overlap_length2 >= len(insertion) / 2.0):
545 | # Reverse overlap found.
546 | # Insert an equality and swap and trim the surrounding edits.
547 | diffs.insert(pointer, Diff(Ops.EQUAL, deletion[:overlap_length2]))
548 | diffs[pointer - 1] = Diff(Ops.INSERT, insertion[:len(insertion) - overlap_length2])
549 | diffs[pointer + 1] = Diff(Ops.DELETE, deletion[overlap_length2:])
550 | pointer += 1
551 | pointer += 1
552 | pointer += 1
553 |
554 | def cleanup_semantic_lossless(diffs):
555 | """Look for single edits surrounded on both sides by equalities
556 | which can be shifted sideways to align the edit to a word boundary.
557 | e.g: The c{+at c+}ame. -> The {+cat +}came. ({+x+} marks the inserted text)
558 |
559 | Args:
560 | diffs: List of diff tuples.
561 | """
562 |
563 | def cleanup_semantic_score(one, two):
564 | """Given two strings, compute a score representing whether the
565 | internal boundary falls on logical boundaries.
566 | Scores range from 6 (best) to 0 (worst).
567 | Closure, but does not reference any external variables.
568 |
569 | Args:
570 | one: First string.
571 | two: Second string.
572 |
573 | Returns:
574 | The score.
575 | """
576 | if not one or not two:
577 | # Edges are the best.
578 | return 6
579 |
580 | # Each port of this function behaves slightly differently due to
581 | # subtle differences in each language's definition of things like
582 | # 'whitespace'. Since this function's purpose is largely cosmetic,
583 | # the choice has been made to use each language's native features
584 | # rather than force total conformity.
585 | char1 = one[-1]
586 | char2 = two[0]
587 | non_alpha_numeric_1 = not char1.isalnum()
588 | non_alpha_numeric_2 = not char2.isalnum()
589 | whitespace1 = non_alpha_numeric_1 and char1.isspace()
590 | whitespace2 = non_alpha_numeric_2 and char2.isspace()
591 | line_break_1 = whitespace1 and (char1 == "\r" or char1 == "\n")
592 | line_break_2 = whitespace2 and (char2 == "\r" or char2 == "\n")
593 | blank_line_1 = line_break_1 and BLANK_LINE_END.search(one)
594 | blank_line_2 = line_break_2 and BLANK_LINE_START.match(two)
595 |
596 | if blank_line_1 or blank_line_2:
597 | # Five points for blank lines.
598 | return 5
599 | elif line_break_1 or line_break_2:
600 | # Four points for line breaks.
601 | return 4
602 | elif non_alpha_numeric_1 and not whitespace1 and whitespace2:
603 | # Three points for end of sentences.
604 | return 3
605 | elif whitespace1 or whitespace2:
606 | # Two points for whitespace.
607 | return 2
608 | elif non_alpha_numeric_1 or non_alpha_numeric_2:
609 | # One point for non-alphanumeric.
610 | return 1
611 | return 0
612 |
613 | pointer = 1
614 | # Intentionally ignore the first and last element (don't need checking).
615 | while pointer < len(diffs) - 1:
616 | if (diffs[pointer - 1].op == Ops.EQUAL and
617 | diffs[pointer + 1].op == Ops.EQUAL):
618 | # This is a single edit surrounded by equalities.
619 | equality1 = diffs[pointer - 1].text
620 | edit = diffs[pointer].text
621 | equality2 = diffs[pointer + 1].text
622 |
623 | # First, shift the edit as far left as possible.
624 | common_offset = common_suffix_length(equality1, edit)
625 | if common_offset:
626 | common_string = edit[-common_offset:]
627 | equality1 = equality1[:-common_offset]
628 | edit = common_string + edit[:-common_offset]
629 | equality2 = common_string + equality2
630 |
631 | # Second, step character by character right, looking for the best fit.
632 | best_equality_1 = equality1
633 | best_edit = edit
634 | best_equality_2 = equality2
635 | best_score = (cleanup_semantic_score(equality1, edit) + cleanup_semantic_score(edit, equality2))
636 | while edit and equality2 and edit[0] == equality2[0]:
637 | equality1 += edit[0]
638 | edit = edit[1:] + equality2[0]
639 | equality2 = equality2[1:]
640 | score = (cleanup_semantic_score(equality1, edit) + cleanup_semantic_score(edit, equality2))
641 | # The >= encourages trailing rather than leading whitespace on edits.
642 | if score >= best_score:
643 | best_score = score
644 | best_equality_1 = equality1
645 | best_edit = edit
646 | best_equality_2 = equality2
647 |
648 | if diffs[pointer - 1].text != best_equality_1:
649 | # We have an improvement, save it back to the diff.
650 | if best_equality_1:
651 | diffs[pointer - 1] = diffs[pointer - 1]._replace(text=best_equality_1)
652 | else:
653 | del diffs[pointer - 1]
654 | pointer -= 1
655 | diffs[pointer] = diffs[pointer]._replace(text=best_edit)
656 | if best_equality_2:
657 | diffs[pointer + 1] = diffs[pointer + 1]._replace(text=best_equality_2)
658 | else:
659 | del diffs[pointer + 1]
660 | pointer -= 1
661 | pointer += 1
662 |
663 | def cleanup_efficiency(diffs):
664 | """Reduce the number of edits by eliminating operationally trivial
665 | equalities.
666 |
667 | Args:
668 | diffs: List of diff tuples.
669 | """
670 | changes = False
671 | equalities = [] # Stack of indices where equalities are found.
672 | lastequality = None # Always equal to diffs[equalities[-1]].text
673 | pointer = 0 # Index of current position.
674 | pre_ins = False # Is there an insertion operation before the last equality.
675 | pre_del = False # Is there a deletion operation before the last equality.
676 | post_ins = False # Is there an insertion operation after the last equality.
677 | post_del = False # Is there a deletion operation after the last equality.
678 | while pointer < len(diffs):
679 | if diffs[pointer].op == Ops.EQUAL: # Equality found.
680 | if (len(diffs[pointer].text) < DIFF_EDIT_COST and
681 | (post_ins or post_del)):
682 | # Candidate found.
683 | equalities.append(pointer)
684 | pre_ins = post_ins
685 | pre_del = post_del
686 | lastequality = diffs[pointer].text
687 | else:
688 | # Not a candidate, and can never become one.
689 | equalities = []
690 | lastequality = None
691 |
692 | post_ins = post_del = False
693 | else: # An insertion or deletion.
694 | if diffs[pointer].op == Ops.DELETE:
695 | post_del = True
696 | else:
697 | post_ins = True
698 |
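699 | # Five types to be split ([-x-] = deleted, {+x+} = inserted):
700 | # {+A+}[-B-]XY{+C+}[-D-]
701 | # {+A+}X{+C+}[-D-]
702 | # {+A+}[-B-]X{+C+}
703 | # [-A-]X{+C+}[-D-]
704 | # {+A+}[-B-]X[-C-]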
705 |
706 | if lastequality and ((pre_ins and pre_del and post_ins and post_del) or
707 | ((len(lastequality) < DIFF_EDIT_COST / 2) and
708 | (pre_ins + pre_del + post_ins + post_del) == 3)):
709 | # Duplicate record.
710 | diffs.insert(equalities[-1], Diff(Ops.DELETE, lastequality))
711 | # Change second copy to insert.
712 | diffs[equalities[-1] + 1] = Diff(Ops.INSERT, diffs[equalities[-1] + 1].text)
713 | equalities.pop() # Throw away the equality we just deleted.
714 | lastequality = None
715 | if pre_ins and pre_del:
716 | # No changes made which could affect previous entry, keep going.
717 | post_ins = post_del = True
718 | equalities = []
719 | else:
720 | if len(equalities):
721 | equalities.pop() # Throw away the previous equality.
722 | if len(equalities):
723 | pointer = equalities[-1]
724 | else:
725 | pointer = -1
726 | post_ins = post_del = False
727 | changes = True
728 | pointer += 1
729 |
730 | if changes:
731 | cleanup_merge(diffs)
732 |
733 | def cleanup_merge(diffs):
734 | """Reorder and merge like edit sections. Merge equalities.
735 | Any edit section can move as long as it doesn't cross an equality.
736 |
737 | Args:
738 | diffs: List of diff tuples.
739 | """
740 | diffs.append(Diff(Ops.EQUAL, '')) # Add a dummy entry at the end.
741 | pointer = 0
742 | count_delete = 0
743 | count_insert = 0
744 | text_delete = ''
745 | text_insert = ''
746 | while pointer < len(diffs):
747 | if diffs[pointer].op == Ops.INSERT:
748 | count_insert += 1
749 | text_insert += diffs[pointer].text
750 | pointer += 1
751 | elif diffs[pointer].op == Ops.DELETE:
752 | count_delete += 1
753 | text_delete += diffs[pointer].text
754 | pointer += 1
755 | elif diffs[pointer].op == Ops.EQUAL:
756 | # Upon reaching an equality, check for prior redundancies.
757 | if count_delete + count_insert > 1:
758 | if count_delete != 0 and count_insert != 0:
759 | # Factor out any common prefixes.
760 | common_length = common_prefix_length(text_insert, text_delete)
761 | if common_length != 0:
762 | x = pointer - count_delete - count_insert - 1
763 | if x >= 0 and diffs[x].op == Ops.EQUAL:
764 | diffs[x] = diffs[x]._replace(text=(diffs[x].text + text_insert[:common_length]))
765 | else:
766 | diffs.insert(0, Diff(Ops.EQUAL, text_insert[:common_length]))
767 | pointer += 1
768 | text_insert = text_insert[common_length:]
769 | text_delete = text_delete[common_length:]
770 | # Factor out any common suffixes.
771 | common_length = common_suffix_length(text_insert, text_delete)
772 | if common_length != 0:
773 | diffs[pointer] = diffs[pointer]._replace(text=(
774 | text_insert[-common_length:] + diffs[pointer].text
775 | ))
776 | text_insert = text_insert[:-common_length]
777 | text_delete = text_delete[:-common_length]
778 | # Delete the offending records and add the merged ones.
779 | if count_delete == 0:
780 | diffs[pointer - count_insert : pointer] = [Diff(Ops.INSERT, text_insert)]
781 | elif count_insert == 0:
782 | diffs[pointer - count_delete : pointer] = [Diff(Ops.DELETE, text_delete)]
783 | else:
784 | diffs[pointer - count_delete - count_insert : pointer] = [
785 | Diff(Ops.DELETE, text_delete),
786 | Diff(Ops.INSERT, text_insert)]
787 | pointer = pointer - count_delete - count_insert + 1
788 | if count_delete != 0:
789 | pointer += 1
790 | if count_insert != 0:
791 | pointer += 1
792 | elif pointer != 0 and diffs[pointer - 1].op == Ops.EQUAL:
793 | # Merge this equality with the previous one.
794 | diffs[pointer - 1] = diffs[pointer - 1]._replace(text=(
795 | diffs[pointer - 1].text + diffs[pointer].text
796 | ))
797 | del diffs[pointer]
798 | else:
799 | pointer += 1
800 |
801 | count_insert = 0
802 | count_delete = 0
803 | text_delete = ''
804 | text_insert = ''
805 |
806 | if diffs[-1].text == '':
807 | diffs.pop() # Remove the dummy entry at the end.
808 |
809 | # Second pass: look for single edits surrounded on both sides by equalities
810 | # which can be shifted sideways to eliminate an equality.
811 | # e.g: A{+BA+}C -> {+AB+}AC ({+x+} marks the inserted text)
812 | changes = False
813 | pointer = 1
814 | # Intentionally ignore the first and last element (don't need checking).
815 | while pointer < len(diffs) - 1:
816 | if (diffs[pointer - 1].op == Ops.EQUAL and
817 | diffs[pointer + 1].op == Ops.EQUAL):
818 | # This is a single edit surrounded by equalities.
819 | if diffs[pointer].text.endswith(diffs[pointer - 1].text):
820 | # Shift the edit over the previous equality.
821 | diffs[pointer] = diffs[pointer]._replace(text=(
822 | diffs[pointer - 1].text + diffs[pointer].text[:-len(diffs[pointer - 1].text)]
823 | ))
824 | diffs[pointer + 1] = diffs[pointer + 1]._replace(text=(
825 | diffs[pointer - 1].text + diffs[pointer + 1].text
826 | ))
827 | del diffs[pointer - 1]
828 | changes = True
829 | elif diffs[pointer].text.startswith(diffs[pointer + 1].text):
830 | # Shift the edit over the next equality.
831 | diffs[pointer - 1] = diffs[pointer - 1]._replace(text=(
832 | diffs[pointer - 1].text + diffs[pointer + 1].text
833 | ))
834 | diffs[pointer] = diffs[pointer]._replace(text=(
835 | diffs[pointer].text[len(diffs[pointer + 1].text):] + diffs[pointer + 1].text
836 | ))
837 | del diffs[pointer + 1]
838 | changes = True
839 | pointer += 1
840 |
841 | # If shifts were made, the diff needs reordering and another shift sweep.
842 | if changes:
843 | cleanup_merge(diffs)
844 |
--------------------------------------------------------------------------------
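The line-mode speedup in `difflib.py` relies on the encoding done by `lines_to_chars`: every distinct line is assigned an index, and each text is reduced to a string with one character per line, so the character-level diff effectively runs on whole lines. The following is a minimal standalone sketch of that idea, not the module's own code; `encode_lines` and `decode_lines` are hypothetical names introduced only for illustration.

```python
def encode_lines(text, line_list, line_dict):
    """Return a string with one character per line of `text`.

    Each character's code point is the index of that line in `line_list`.
    Hypothetical helper; it mutates `line_list` and `line_dict` in place.
    """
    chars = []
    start = 0
    while start < len(text):
        # Split on '\n' only, keeping the newline as part of the line.
        end = text.find('\n', start)
        end = len(text) if end == -1 else end + 1
        line = text[start:end]
        start = end
        if line not in line_dict:
            line_list.append(line)
            line_dict[line] = len(line_list) - 1
        chars.append(chr(line_dict[line]))
    return ''.join(chars)


def decode_lines(chars, line_list):
    """Expand an encoded string back into the text it stands for."""
    return ''.join(line_list[ord(char)] for char in chars)


line_list = ['']  # Index 0 is a placeholder so chr(0) is never generated.
line_dict = {}
old = 'alpha\nbeta\ngamma\n'
new = 'alpha\ngamma\ndelta'
enc_old = encode_lines(old, line_list, line_dict)
enc_new = encode_lines(new, line_list, line_dict)

print(repr(enc_old), repr(enc_new))
print(decode_lines(enc_old, line_list) == old)
```

Both texts share one `line_list`, so identical lines map to the same character in either encoding. That shared mapping is what lets a diff of the encoded strings be translated back into a line-level diff.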