├── .python-version ├── messages.json ├── .gitignore ├── Fmt.sublime-commands ├── Main.sublime-menu ├── messages ├── 0.1.8.md ├── 0.1.11.md └── install.md ├── unlicense ├── Fmt.sublime-settings ├── readme.md ├── Fmt.py └── difflib.py /.python-version: -------------------------------------------------------------------------------- 1 | 3.8 -------------------------------------------------------------------------------- /messages.json: -------------------------------------------------------------------------------- 1 | { 2 | "install": "messages/install.md", 3 | "0.1.8": "messages/0.1.8.md", 4 | "0.1.11": "messages/0.1.11.md" 5 | } 6 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /* 2 | !/*ignore 3 | !/.python-version 4 | !/unlicense 5 | !/*.py 6 | !/*.md 7 | !/*.json 8 | !/*.sublime-* 9 | 10 | !/messages 11 | /messages/* 12 | !/messages/*.md 13 | -------------------------------------------------------------------------------- /Fmt.sublime-commands: -------------------------------------------------------------------------------- 1 | [ 2 | {"caption": "Fmt: Format Buffer", "command": "fmt_format_buffer"}, 3 | {"caption": "Fmt: Format Selection", "command": "fmt_format_selection"}, 4 | { 5 | "caption": "Preferences: Fmt Settings", 6 | "command": "edit_settings", 7 | "args": { 8 | "base_file": "${packages}/Fmt/Fmt.sublime-settings", 9 | "default": "{\n\t$0\n}", 10 | }, 11 | }, 12 | ] 13 | -------------------------------------------------------------------------------- /Main.sublime-menu: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "id": "preferences", 4 | "children": [ 5 | { 6 | "id": "package-settings", 7 | "children": [ 8 | { 9 | "caption": "Fmt", 10 | "children": [ 11 | { 12 | "caption": "Settings", 13 | "command": "edit_settings", 14 | "args": { 15 | "base_file": "${packages}/Fmt/Fmt.sublime-settings", 16 | "default": 
"{\n\t$0\n}" 17 | } 18 | } 19 | ] 20 | } 21 | ] 22 | } 23 | ] 24 | } 25 | ] 26 | -------------------------------------------------------------------------------- /messages/0.1.8.md: -------------------------------------------------------------------------------- 1 | `Fmt` has a new command for selection formatting! 2 | 3 | * Palette title: `Fmt: Format Selection` 4 | * Command name for hotkeys: `fmt_format_selection` 5 | 6 | Example hotkey: 7 | 8 | ```json 9 | {"keys": ["primary+k", "primary+k"], "command": "fmt_format_selection"}, 10 | ``` 11 | 12 | Selection formatting even works for embedded syntaxes, choosing settings/rules by the scope at the start of each region. For example, you can format a block of JSON or Go embedded in a Markdown file. 13 | 14 | When you have different formatters configured for both inner and outer syntaxes, it may currently choose the outer one. This could be rectified on demand. 15 | -------------------------------------------------------------------------------- /messages/0.1.11.md: -------------------------------------------------------------------------------- 1 | `Fmt` now uses `"merge_type": "replace"` by default. This avoids worst-case freezes caused by poor combinatorial complexity of the diff algorithm. When using a precise formatter that generates few diffs, such as `gofmt` or `rustfmt`, it's safe to opt into diffing, which is better at preserving scroll and cursor position. 2 | 3 | Example config: 4 | 5 | ```json 6 | { 7 | "rules": [ 8 | // Explicit diff merge. 9 | { 10 | "selector": "source.go", 11 | "cmd": ["goimports"], 12 | "format_on_save": true, 13 | "merge_type": "diff", 14 | }, 15 | // Uses default replace merge. 
16 | { 17 | "selector": "source.json", 18 | "cmd": ["jsonfmt"], 19 | }, 20 | ], 21 | } 22 | ``` 23 | -------------------------------------------------------------------------------- /messages/install.md: -------------------------------------------------------------------------------- 1 | ## Fmt Setup 2 | 3 | (More in the readme: https://github.com/mitranim/sublime-fmt) 4 | 5 | Fmt has NO DEFAULT FORMATTERS. It invokes CLI programs installed globally on your system. You must specify them in the plugin settings: 6 | 7 | menu → Preferences → Package Settings → Fmt → Settings 8 | 9 | Example for Go: 10 | 11 | { 12 | "rules": [ 13 | { 14 | "selector": "source.go", 15 | "cmd": ["goimports"], 16 | "format_on_save": true, 17 | "merge_type": "diff", 18 | }, 19 | ], 20 | } 21 | 22 | To understand Sublime scopes and selector matching, read this short official doc: https://www.sublimetext.com/docs/selectors.html. 23 | 24 | HOW TO GET SCOPE NAME: 25 | 26 | Option 1: menu → Tools → Developer → Show Scope Name. 27 | 28 | Option 2: run the command `Fmt: Format Buffer`, and if not configured for the current scope, it will tell you! 29 | -------------------------------------------------------------------------------- /unlicense: -------------------------------------------------------------------------------- 1 | This is free and unencumbered software released into the public domain. 2 | 3 | Anyone is free to copy, modify, publish, use, compile, sell, or 4 | distribute this software, either in source code form or as a compiled 5 | binary, for any purpose, commercial or non-commercial, and by any 6 | means. 7 | 8 | In jurisdictions that recognize copyright laws, the author or authors 9 | of this software dedicate any and all copyright interest in the 10 | software to the public domain. We make this dedication for the benefit 11 | of the public at large and to the detriment of our heirs and 12 | successors. 
We intend this dedication to be an overt act of 13 | relinquishment in perpetuity of all present and future rights to this 14 | software under copyright law. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 19 | IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR 20 | OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 21 | ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 22 | OTHER DEALINGS IN THE SOFTWARE. 23 | 24 | For more information, please refer to 25 | -------------------------------------------------------------------------------- /Fmt.sublime-settings: -------------------------------------------------------------------------------- 1 | { 2 | /* 3 | Formatting rules should be added here. (In your own settings file.) 4 | 5 | Every rule is a dictionary, where the only required field is "selector": 6 | a syntax type such as "source.python". See the docs on selectors: 7 | https://www.sublimetext.com/docs/selectors.html. All other fields are 8 | overrides for global settings such as "cmd". 9 | 10 | Example for Go: 11 | 12 | "rules": [ 13 | { 14 | "selector": "source.go", 15 | "cmd": ["goimports"], 16 | "format_on_save": true, 17 | "merge_type": "diff", 18 | }, 19 | ], 20 | */ 21 | "rules": [], 22 | 23 | /* 24 | Command to invoke, with command line arguments. Must be a list of strings. 25 | The command must communicate over standard input/output. 26 | 27 | While technically this can be set at the top level, in practice you should 28 | set this PER SELECTOR in the "rules" setting, using different fmters for 29 | different scopes. 
30 | 31 | Supports variable substitution, using the shell variable interpolation 32 | syntax: 33 | 34 | "cmd": ["some_command", "$tab_size"] 35 | 36 | Supported variables: 37 | 38 | - Environment variables via `os.environ`. 39 | 40 | - Special variables available in build systems: 41 | https://www.sublimetext.com/docs/build_systems.html. 42 | 43 | - $tab_size -- Indent width, usually 2 or 4; takes the "tab_size" setting 44 | from the current view. 45 | 46 | - $indent -- Literal indent: either N spaces or a single tab. 47 | */ 48 | "cmd": null, 49 | 50 | /* 51 | Additional environment variables for the subprocess. Environment is always 52 | inherited from Sublime Text, which generally tries to mimic your shell env. 53 | This is needed only for additional variables and overrides. 54 | 55 | Can be configured per rule / per selector. 56 | */ 57 | "env": null, 58 | 59 | /* 60 | Format current buffer on save. Disabled by default. Can be overridden for 61 | individual scope selectors. 62 | 63 | Note that you can format the buffer manually via the "Fmt: Format Buffer" 64 | command. You can also format the selection via the "Fmt: Format Selection" 65 | command. 66 | */ 67 | "format_on_save": false, 68 | 69 | /* 70 | Determines the CWD of the subprocess. Possible values: 71 | 72 | - "auto" -- Try to use the current file's directory; fall back on 73 | the project root, which is assumed to be the first 74 | directory in the current window. 75 | 76 | - "project_root" -- Use the project root, which is assumed to be the first 77 | directory in the current window. 78 | 79 | - "none" -- Don't set the CWD. 80 | 81 | - ":/path/to/dir" -- Colon followed by a hardcoded path, which is used as 82 | the CWD; may be useful for project-specific settings. 83 | */ 84 | "cwd_mode": "auto", 85 | 86 | /* 87 | How to show errors. Possible values: 88 | 89 | - "" -- Hide errors completely. 90 | 91 | - "console" -- Print errors to the Sublime console. 92 | 93 | - "panel" -- Show an output panel at the bottom. 
94 | 95 | - "popup" -- Show obnoxious popup windows. 96 | */ 97 | "error_style": "panel", 98 | 99 | /* 100 | Determines how to replace buffer contents. Can be overridden for individual 101 | scope selectors. "diff" is more precise and preserves scroll and cursor 102 | position, but can be EXTREMELY slow when the number of changes exceeds a few 103 | dozen. 104 | 105 | Possible values: 106 | 107 | - "replace" -- Simpler but doesn't preserve cursor position. 108 | 109 | - "diff" -- More complicated but better at preserving cursor position. 110 | */ 111 | "merge_type": "replace", 112 | 113 | /* 114 | Subprocess timeout in seconds. If execution takes longer, Fmt kills the 115 | subprocess and aborts with an error. 116 | */ 117 | "timeout": 60, 118 | } 119 | -------------------------------------------------------------------------------- /readme.md: -------------------------------------------------------------------------------- 1 | ## Overview 2 | 3 | Sublime Text plugin for auto-formatting arbitrary code by calling arbitrary executables. Works for `gofmt`, `rustfmt`, any similar tool that's an executable and uses standard input/output. 4 | 5 | Features: 6 | 7 | * Format on demand. Optionally auto-format on save. 8 | * Configure executables and other settings per _scope_ (syntax type: `source.go`, `source.rust` and so on). 9 | * Optionally preserve cursor and scroll position when formatting, via `"merge_type": "diff"`. 10 | * Show errors in an output panel (configurable). 11 | * Format either an entire file, or only selection. 12 | * Selection formatting works for embedded syntaxes, such as JS inside HTML. 13 | 14 | Limitations: 15 | 16 | * Invokes a subprocess every time. Good enough for formatters written in compiled languages, such as `gofmt` and `rustfmt`. If a given formatter is written in JS and takes a second to start up, this tool might not be suitable. 17 | 18 | Based on https://github.com/mitranim/sublime-gofmt and fully replaces it. 
Also replaces [RustFmt](https://github.com/mitranim/sublime-rust-fmt) and countless others. 19 | 20 | ## Why 21 | 22 | Why does this exist? 23 | 24 | Package Control has special-case formatter plugins for different languages, and the monstrous Formatter with too many batteries included. This makes it hard to add formatters: someone has to make and publish a new plugin every time, or fork a repo and make a PR, etc. 25 | 26 | Many formatters just call a subprocess and use stdio. One plugin can handle them all, while letting the _user_ specify any new formatter for any new syntax! This works for `gofmt`, `rustfmt`, `clang-format`, and endless others. 27 | 28 | ## Installation 29 | 30 | ### Package Control 31 | 32 | 1. Get [Package Control](https://packagecontrol.io). 33 | 2. Open the command palette: ⇪⌘P or ⇪^P. 34 | 3. Run `Package Control: Install Package`. 35 | 4. Select `Fmt`. 36 | 37 | ### Manual 38 | 39 | Clone the repo and symlink it to your Sublime packages directory. Example for MacOS: 40 | 41 | ```sh 42 | git clone https://github.com/mitranim/sublime-fmt.git 43 | cd sublime-fmt 44 | ln -sf "$(pwd)" "$HOME/Library/Application Support/Sublime Text 3/Packages/Fmt" 45 | ``` 46 | 47 | To find the packages directory on your system, use Sublime Text menu → Preferences → Browse Packages. 48 | 49 | ## Usage 50 | 51 | The plugin has _no default formatters_. It invokes CLI programs installed globally on your system. You must specify them in the plugin settings. Example for Go: 52 | 53 | ```json 54 | { 55 | "rules": [ 56 | { 57 | "selector": "source.go", 58 | "cmd": ["goimports"], 59 | "format_on_save": true, 60 | "merge_type": "diff", 61 | }, 62 | ], 63 | } 64 | ``` 65 | 66 | To understand Sublime scopes and selector matching, read this short official doc: https://www.sublimetext.com/docs/selectors.html. 67 | 68 | **How to get scope name**. Option 1: menu → Tools → Developer → Show Scope Name. 
Option 2: run the command `Fmt: Format Buffer`, and if not configured for the current scope, it will tell you! 69 | 70 | To format on demand, run the `Fmt: Format Buffer` command from the command palette. See below how to configure hotkeys. 71 | 72 | To auto-format on save, set `"format_on_save": true` in the settings. Can be global or per rule. 73 | 74 | ## Settings 75 | 76 | See [`Fmt.sublime-settings`](Fmt.sublime-settings) for all available settings. To override them, open: 77 | 78 | ``` 79 | menu → Preferences → Package Settings → Fmt → Settings 80 | ``` 81 | 82 | The plugin looks for settings in the following places, with the following priority: 83 | 84 | * `"Fmt"` dict in general Sublime settings, project-specific or global. 85 | * `Fmt.sublime-settings`, user-created or default. 86 | 87 | For overrides, open project or global settings and make a `"Fmt"` entry: 88 | 89 | ```json 90 | { 91 | "Fmt": { 92 | "rules": [ 93 | { 94 | "selector": "source.some_lang", 95 | "cmd": ["some_lang_fmt", "--some_arg"], 96 | }, 97 | ], 98 | }, 99 | } 100 | ``` 101 | 102 | A rule may contain _any_ of the root-level settings, such as `format_on_save`. This allows fine-tuning. 103 | 104 | ## Commands 105 | 106 | In Sublime's command palette: 107 | 108 | * `Fmt: Format Buffer` 109 | * `Fmt: Format Selection` 110 | 111 | ## Hotkeys 112 | 113 | Hotkeys? More like _notkeys_! 114 | 115 | To avoid potential conflicts, this plugin does not come with hotkeys. To hotkey 116 | the format commands, add something like this to your `.sublime-keymap`: 117 | 118 | ```sublime-keymap 119 | {"keys": ["primary+k", "primary+j"], "command": "fmt_format_buffer"}, 120 | {"keys": ["primary+k", "primary+k"], "command": "fmt_format_selection"}, 121 | ``` 122 | 123 | Sublime automatically resolves "primary" to "super" on MacOS and to "ctrl" on other systems. 124 | 125 | ## Changelog 126 | 127 | **2022-07-18**. 
Ignore informational output over stderr when the subprocess exits with 0 and stdout is non-empty. 128 | 129 | **2022-07-11**. Use `"merge_type": "replace"` by default. Diff is now opt-in due to extreme performance degradation for large amounts of diffs. 130 | 131 | **2020-12-28**. Support env variable substitution. Format-on-save is no longer enabled by default. 132 | 133 | **2020-11-26**. Support variable substitution in `cmd`. 134 | 135 | **2020-11-25**. Use scope selectors instead of exactly matching the scope name. 136 | 137 | **2020-10-25**. Support subprocess timeout, always kill the subprocess. 138 | 139 | **2020-10-23**. Support several ways of printing errors. By default, errors are shown in a transient output panel at the bottom. 140 | 141 | ## License 142 | 143 | https://unlicense.org 144 | 145 | `difflib.py` is based on code which is published under Apache license, see the hyperlink in the file. It underwent significant edits, and its licensing status is unknown to me. Everything else is original and under Unlicense. 146 | -------------------------------------------------------------------------------- /Fmt.py: -------------------------------------------------------------------------------- 1 | import sublime 2 | import sublime_plugin 3 | import subprocess as sub 4 | import os 5 | import sys 6 | from . import difflib 7 | 8 | PLUGIN_NAME = 'Fmt' 9 | SETTINGS_KEY = PLUGIN_NAME + '.sublime-settings' 10 | IS_WINDOWS = os.name == 'nt' 11 | PANEL_OUTPUT_NAME = 'output.' 
+ PLUGIN_NAME 12 | 13 | class fmt_listener(sublime_plugin.EventListener): 14 | def on_pre_save(self, view): 15 | if is_enabled(view) and get_setting(view, 'format_on_save'): 16 | view.run_command('fmt_format_buffer') 17 | 18 | class fmt_format_buffer(sublime_plugin.TextCommand): 19 | def run(self, edit): 20 | view = self.view 21 | try: 22 | fmt_region(view, edit, view_region(view)) 23 | except Exception as err: 24 | report(view, err) 25 | 26 | class fmt_format_selection(sublime_plugin.TextCommand): 27 | def run(self, edit): 28 | view = self.view 29 | 30 | for region in view.sel(): 31 | try: 32 | fmt_region(view, edit, region) 33 | except Exception as err: 34 | report(view, err) 35 | break 36 | 37 | class fmt_panel_replace_content(sublime_plugin.TextCommand): 38 | def run(self, edit, text): 39 | view = self.view 40 | view.replace(edit, view_region(view), text) 41 | view.sel().clear() 42 | 43 | # TODO: any other exception type should be printed with the stack. Only error 44 | # messages generated by Fmt, as `ErrMsg`, should have the stack suppressed 45 | # (which is the default behavior of `str.format`). 
46 | class ErrMsg(Exception): 47 | pass 48 | 49 | def fmt_region(view, edit, region): 50 | if region.empty(): 51 | return 52 | 53 | hide_panel(view.window()) 54 | 55 | source = view.substr(region) 56 | scope = view.scope_name(region.begin()) 57 | fmted = fmt(view, source, view_encoding(view), scope) 58 | if fmted == source: 59 | return 60 | 61 | merge_type = get_setting(view, 'merge_type', scope) 62 | 63 | if merge_type == 'diff': 64 | try: 65 | merge_into_view(view, edit, fmted, region) 66 | except difflib.TooManyDiffsException: 67 | replace_view(view, edit, fmted, region) 68 | return 69 | 70 | if merge_type == 'replace': 71 | replace_view(view, edit, fmted, region) 72 | return 73 | 74 | report(view, 'unknown value of setting "merge_type": {}'.format(merge_type)) 75 | 76 | def fmt(view, input, encoding, scope): 77 | cmd = get_setting(view, 'cmd', scope) 78 | 79 | if not cmd: 80 | raise ErrMsg('unable to find setting "cmd" for scope "{}"'.format(scope)) 81 | 82 | if not isinstance(cmd, list) or not every(cmd, is_string): 83 | raise ErrMsg('expected setting "cmd" to be a list of strings, found {}'.format(cmd)) 84 | 85 | # Support "$variable" substitutions. 
86 | variables = extract_variables(view) 87 | cmd = [sublime.expand_variables(arg, variables) for arg in cmd] 88 | 89 | proc = sub.Popen( 90 | args=cmd, 91 | stdin=sub.PIPE, 92 | stdout=sub.PIPE, 93 | stderr=sub.PIPE, 94 | startupinfo=process_startup_info(), 95 | universal_newlines=False, 96 | cwd=guess_cwd(view), 97 | env=get_env(view, scope), 98 | ) 99 | 100 | timeout = get_setting(view, 'timeout', scope) 101 | 102 | try: 103 | (stdout, stderr) = proc.communicate(input=bytes(input, encoding=encoding), timeout=timeout) 104 | finally: 105 | try: 106 | proc.kill() 107 | except Exception: 108 | pass 109 | 110 | stdout = stdout.decode(encoding) 111 | stderr = stderr.decode(encoding) 112 | 113 | if proc.returncode != 0: 114 | msg = str(sub.CalledProcessError(proc.returncode, cmd)) 115 | if len(stderr) > 0: 116 | msg += ':\n' + stderr 117 | elif len(stdout) > 0: 118 | msg += ':\n' + stdout 119 | raise ErrMsg(msg) 120 | 121 | if len(stdout) == 0 and len(stderr) > 0: 122 | raise ErrMsg(stderr) 123 | 124 | return stdout 125 | 126 | def merge_into_view(view, edit, content, region): 127 | def subview(start, end): 128 | return view.substr(sublime.Region(start, end)) 129 | # Diff only the region being formatted, so that offsets also line up for selections. 130 | diffs = difflib.myers_diffs(subview(region.begin(), region.end()), content) 131 | difflib.cleanup_efficiency(diffs) 132 | offset = region.begin() 133 | 134 | for (op_type, patch) in diffs: 135 | patch_len = len(patch) 136 | if op_type == difflib.Ops.EQUAL: 137 | if subview(offset, offset+patch_len) != patch: 138 | report(view, "mismatch between diff's source and current content") 139 | return 140 | offset += patch_len 141 | elif op_type == difflib.Ops.INSERT: 142 | view.insert(edit, offset, patch) 143 | offset += patch_len 144 | elif op_type == difflib.Ops.DELETE: 145 | if subview(offset, offset+patch_len) != patch: 146 | report(view, "mismatch between diff's source and current content") 147 | return 148 | view.erase(edit, sublime.Region(offset, offset+patch_len)) 149 | 150 | def replace_view(view, edit, content, region): 151 
| position = view.viewport_position() 152 | view.replace(edit, region, content) 153 | # Works only on the main thread, hence lambda and timer. 154 | restore = lambda: view.set_viewport_position(position, animate=False) 155 | sublime.set_timeout(restore, 0) 156 | 157 | def report(view, msg): 158 | window = view.window() 159 | style = get_setting(view, 'error_style') 160 | 161 | if style == '': 162 | return 163 | 164 | if style is None: 165 | style = 'panel' 166 | 167 | if style == 'console': 168 | if isinstance(msg, Exception): 169 | raise msg 170 | msg = '[{}] {}'.format(PLUGIN_NAME, msg) 171 | print(msg) 172 | return 173 | 174 | if style == 'panel': 175 | msg = '[{}] {}'.format(PLUGIN_NAME, msg) 176 | ensure_panel(window).run_command('fmt_panel_replace_content', {'text': norm_newlines(msg)}) 177 | show_panel(window) 178 | return 179 | 180 | if style == 'popup': 181 | msg = '[{}] {}'.format(PLUGIN_NAME, msg) 182 | sublime.error_message(msg) 183 | return 184 | 185 | sublime.error_message('[{}] unknown value of setting "error_style": {}'.format(PLUGIN_NAME, style)) 186 | 187 | # Copied from other plugins, haven't personally tested on Windows. 
188 | def process_startup_info(): 189 | if not IS_WINDOWS: 190 | return None 191 | startupinfo = sub.STARTUPINFO() 192 | startupinfo.dwFlags |= sub.STARTF_USESHOWWINDOW 193 | startupinfo.wShowWindow = sub.SW_HIDE 194 | return startupinfo 195 | 196 | def guess_cwd(view): 197 | window = view.window() 198 | mode = get_setting(view, 'cwd_mode') or '' 199 | 200 | if mode.startswith(':'): 201 | return mode[1:] 202 | 203 | if mode == 'none': 204 | return None 205 | 206 | if mode == 'project_root': 207 | if len(window.folders()): 208 | return window.folders()[0] 209 | return None 210 | 211 | if mode == 'auto': 212 | if view.file_name(): 213 | return os.path.dirname(view.file_name()) 214 | if len(window.folders()): 215 | return window.folders()[0] 216 | 217 | def get_in(val, *path): 218 | for key in path: 219 | val, ok = get(val, key) 220 | if not ok: 221 | return (None, False) 222 | return (val, True) 223 | 224 | def get(val, key): 225 | if ( 226 | isinstance(val, dict) and key in val 227 | ) or ( 228 | (isinstance(val, list) or isinstance(val, tuple)) and 229 | (isinstance(key, int) and len(val) > key) 230 | ): 231 | return (val[key], True) 232 | return (None, False) 233 | 234 | def view_scope(view): 235 | scopes = view.scope_name(0) 236 | return scopes[0:scopes.find(' ')] 237 | 238 | def get_setting(view, key, scope = None): 239 | if scope is None: 240 | scope = view_scope(view) 241 | 242 | overrides = view.settings().get(PLUGIN_NAME) 243 | 244 | rule = rule_for_scope(get(overrides, 'rules')[0], scope) 245 | (val, found) = get(rule, key) 246 | if found: 247 | return val 248 | 249 | (val, found) = get_in(overrides, key) 250 | if found: 251 | return val 252 | 253 | settings = sublime.load_settings(SETTINGS_KEY) 254 | 255 | rule = rule_for_scope(settings.get('rules'), scope) 256 | (val, found) = get(rule, key) 257 | if found: 258 | return val 259 | 260 | return settings.get(key) 261 | 262 | def rule_for_scope(rules, scope): 263 | if not rules: 264 | return None 265 | 266 | 
rule = max(rules, key = lambda rule: rule_score(rule, scope)) 267 | 268 | # Note: `max` doesn't ensure this condition. 269 | if rule_score(rule, scope) > 0: 270 | return rule 271 | 272 | return None 273 | 274 | def rule_score(rule, scope): 275 | if 'selector' not in rule: 276 | raise ErrMsg('missing "selector" in rule {}'.format(rule)) 277 | return sublime.score_selector(scope, rule['selector']) 278 | 279 | def is_enabled(view): 280 | return bool(get_setting(view, 'cmd')) 281 | 282 | def view_encoding(view): 283 | encoding = view.encoding() 284 | return 'UTF-8' if encoding == 'Undefined' else encoding 285 | 286 | def create_panel(window): 287 | return window.create_output_panel(PLUGIN_NAME) 288 | 289 | def find_panel(window): 290 | return window.find_output_panel(PANEL_OUTPUT_NAME) 291 | 292 | def ensure_panel(window): 293 | return find_panel(window) or create_panel(window) 294 | 295 | def hide_panel(window): 296 | if window.active_panel() == PANEL_OUTPUT_NAME: 297 | window.run_command('hide_panel', {'panel': PANEL_OUTPUT_NAME}) 298 | 299 | def show_panel(window): 300 | window.run_command('show_panel', {'panel': PANEL_OUTPUT_NAME}) 301 | 302 | def every(iter, fun): 303 | if iter: 304 | for val in iter: 305 | if not fun(val): 306 | return False 307 | return True 308 | 309 | def is_string(val): 310 | return isinstance(val, str) 311 | 312 | def extract_variables(view): 313 | settings = view.settings() 314 | tab_size = settings.get('tab_size') or 0 315 | indent = ' ' * tab_size if settings.get('translate_tabs_to_spaces') else '\t' 316 | 317 | vars = view.window().extract_variables() 318 | vars['tab_size'] = str(tab_size) 319 | vars['indent'] = indent 320 | vars.update(os.environ) 321 | 322 | return vars 323 | 324 | def view_region(view): 325 | return sublime.Region(0, view.size()) 326 | 327 | def get_env(view, scope): 328 | val = get_setting(view, 'env', scope) 329 | if val is None: 330 | return None 331 | env = os.environ.copy() 332 | env.update(val) 333 | return env 
334 | 335 | def norm_newlines(src): 336 | return src.replace('\r\n', '\n') 337 | -------------------------------------------------------------------------------- /difflib.py: -------------------------------------------------------------------------------- 1 | """ 2 | Functions for diff, match and patch. 3 | 4 | Computes the difference between two texts to create a patch. 5 | Applies the patch onto another text, allowing for errors. 6 | 7 | Originally found at http://code.google.com/p/google-diff-match-patch/. 8 | Edited by Nelo Mitranim (2017, 2020). 9 | """ 10 | 11 | import re 12 | from collections import namedtuple 13 | 14 | class Ops(object): 15 | EQUAL = 'EQUAL' 16 | INSERT = 'INSERT' 17 | DELETE = 'DELETE' 18 | 19 | Diff = namedtuple('Diff', ['op', 'text']) 20 | 21 | # Cost of an empty edit operation in terms of edit characters. 22 | DIFF_EDIT_COST = 4 23 | 24 | BLANK_LINE_END = re.compile(r"\n\r?\n$") 25 | 26 | BLANK_LINE_START = re.compile(r"^\r?\n\r?\n") 27 | # Combined diff count above which `bisect_split_diffs` raises `TooManyDiffsException`. 28 | MAX_DIFFS_THRESHOLD = 32 29 | 30 | class TooManyDiffsException(Exception): 31 | pass 32 | 33 | def myers_diffs(text1, text2, checklines=True): 34 | """Find the differences between two texts. Simplifies the problem by 35 | stripping any common prefix or suffix off the texts before diffing. 36 | 37 | Args: 38 | text1: Old string to be diffed. 39 | text2: New string to be diffed. 40 | checklines: Optional speedup flag. If present and false, then don't run 41 | a line-level diff first to identify the changed areas. 42 | Defaults to true, which does a faster, slightly less optimal diff. 43 | 44 | Returns: 45 | List of changes. 46 | """ 47 | if text1 is None or text2 is None: 48 | raise ValueError('Null inputs (myers_diffs)') 49 | 50 | # Check for equality (speedup). 51 | if text1 == text2: 52 | if text1: 53 | return [Diff(Ops.EQUAL, text1)] 54 | return [] 55 | 56 | # Trim off common prefix (speedup). 
57 | common_length = common_prefix_length(text1, text2) 58 | common_prefix = text1[:common_length] 59 | text1 = text1[common_length:] 60 | text2 = text2[common_length:] 61 | 62 | # Trim off common suffix (speedup). 63 | common_length = common_suffix_length(text1, text2) 64 | if common_length == 0: 65 | commonsuffix = '' 66 | else: 67 | commonsuffix = text1[-common_length:] 68 | text1 = text1[:-common_length] 69 | text2 = text2[:-common_length] 70 | 71 | # Compute the diff on the middle block. 72 | diffs = compute_diffs(text1, text2, checklines) 73 | 74 | # Restore the prefix and suffix. 75 | if common_prefix: 76 | diffs[:0] = [Diff(Ops.EQUAL, common_prefix)] 77 | if commonsuffix: 78 | diffs.append(Diff(Ops.EQUAL, commonsuffix)) 79 | cleanup_merge(diffs) 80 | return diffs 81 | 82 | def compute_diffs(text1, text2, checklines): 83 | """Find the differences between two texts. Assumes that the texts do not 84 | have any common prefix or suffix. 85 | 86 | Args: 87 | text1: Old string to be diffed. 88 | text2: New string to be diffed. 89 | checklines: Speedup flag. If false, then don't run a line-level diff 90 | first to identify the changed areas. 91 | If true, then run a faster, slightly less optimal diff. 92 | 93 | Returns: 94 | List of changes. 95 | """ 96 | if not text1: 97 | # Just add some text (speedup). 98 | return [Diff(Ops.INSERT, text2)] 99 | 100 | if not text2: 101 | # Just delete some text (speedup). 102 | return [Diff(Ops.DELETE, text1)] 103 | 104 | if len(text1) > len(text2): 105 | (longtext, shorttext) = (text1, text2) 106 | else: 107 | (shorttext, longtext) = (text1, text2) 108 | i = longtext.find(shorttext) 109 | if i != -1: 110 | # Shorter text is inside the longer text (speedup). 111 | diffs = [Diff(Ops.INSERT, longtext[:i]), Diff(Ops.EQUAL, shorttext), 112 | Diff(Ops.INSERT, longtext[i + len(shorttext):])] 113 | # Swap insertions for deletions if diff is reversed. 
114 | if len(text1) > len(text2): 115 | diffs[0] = diffs[0]._replace(op=Ops.DELETE) 116 | diffs[2] = diffs[2]._replace(op=Ops.DELETE) 117 | return diffs 118 | 119 | if len(shorttext) == 1: 120 | # Single character string. 121 | # After the previous speedup, the character can't be an equality. 122 | return [Diff(Ops.DELETE, text1), Diff(Ops.INSERT, text2)] 123 | 124 | if checklines and len(text1) > 100 and len(text2) > 100: 125 | return line_mode_diffs(text1, text2) 126 | 127 | return diff_bisect(text1, text2) 128 | 129 | def line_mode_diffs(text1, text2): 130 | """Do a quick line-level diff on both strings, then rediff the parts for 131 | greater accuracy. 132 | This speedup can produce non-minimal diffs. 133 | 134 | Args: 135 | text1: Old string to be diffed. 136 | text2: New string to be diffed. 137 | 138 | Returns: 139 | List of changes. 140 | """ 141 | 142 | # Scan the text on a line-by-line basis first. 143 | (text1, text2, line_list) = lines_to_chars(text1, text2) 144 | 145 | diffs = myers_diffs(text1, text2, False) 146 | 147 | # Convert the diff back to original text. 148 | diffs = [diff._replace(text=''.join(line_list[ord(char)] for char in diff.text)) for diff in diffs] 149 | 150 | # Eliminate freak matches (e.g. blank lines) 151 | cleanup_semantic(diffs) 152 | 153 | # Rediff any replacement blocks, this time character-by-character. 154 | # Add a dummy entry at the end. 155 | diffs.append(Diff(Ops.EQUAL, '')) 156 | pointer = 0 157 | count_delete = 0 158 | count_insert = 0 159 | text_delete = '' 160 | text_insert = '' 161 | while pointer < len(diffs): 162 | if diffs[pointer].op == Ops.INSERT: 163 | count_insert += 1 164 | text_insert += diffs[pointer].text 165 | elif diffs[pointer].op == Ops.DELETE: 166 | count_delete += 1 167 | text_delete += diffs[pointer].text 168 | elif diffs[pointer].op == Ops.EQUAL: 169 | # Upon reaching an equality, check for prior redundancies. 
170 | if count_delete >= 1 and count_insert >= 1: 171 | # Delete the offending records and add the merged ones. 172 | a = myers_diffs(text_delete, text_insert, False) 173 | diffs[pointer - count_delete - count_insert : pointer] = a 174 | pointer = pointer - count_delete - count_insert + len(a) 175 | count_insert = 0 176 | count_delete = 0 177 | text_delete = '' 178 | text_insert = '' 179 | 180 | pointer += 1 181 | 182 | diffs.pop() # Remove the dummy entry at the end. 183 | 184 | return diffs 185 | 186 | def diff_bisect(text1, text2): 187 | """Find the 'middle snake' of a diff, split the problem in two 188 | and return the recursively constructed diff. 189 | See Myers 1986 paper: An O(ND) Difference Algorithm and Its Variations. 190 | 191 | Args: 192 | text1: Old string to be diffed. 193 | text2: New string to be diffed. 194 | 195 | Returns: 196 | List of diff tuples. 197 | """ 198 | 199 | # Cache the text lengths to prevent multiple calls. 200 | text1_length = len(text1) 201 | text2_length = len(text2) 202 | max_d = (text1_length + text2_length + 1) // 2 203 | v_offset = max_d 204 | v_length = 2 * max_d 205 | v1 = [-1] * v_length 206 | v1[v_offset + 1] = 0 207 | v2 = v1[:] 208 | delta = text1_length - text2_length 209 | # If the total number of characters is odd, then the front path will 210 | # collide with the reverse path. 211 | front = (delta % 2 != 0) 212 | # Offsets for start and end of k loop. 213 | # Prevents mapping of space beyond the grid. 214 | k1start = 0 215 | k1end = 0 216 | k2start = 0 217 | k2end = 0 218 | for d in range(max_d): 219 | # Walk the front path one step. 
220 | for k1 in range(-d + k1start, d + 1 - k1end, 2): 221 | k1_offset = v_offset + k1 222 | if k1 == -d or (k1 != d and 223 | v1[k1_offset - 1] < v1[k1_offset + 1]): 224 | x1 = v1[k1_offset + 1] 225 | else: 226 | x1 = v1[k1_offset - 1] + 1 227 | y1 = x1 - k1 228 | while (x1 < text1_length and y1 < text2_length and 229 | text1[x1] == text2[y1]): 230 | x1 += 1 231 | y1 += 1 232 | v1[k1_offset] = x1 233 | if x1 > text1_length: 234 | # Ran off the right of the graph. 235 | k1end += 2 236 | elif y1 > text2_length: 237 | # Ran off the bottom of the graph. 238 | k1start += 2 239 | elif front: 240 | k2_offset = v_offset + delta - k1 241 | if k2_offset >= 0 and k2_offset < v_length and v2[k2_offset] != -1: 242 | # Mirror x2 onto top-left coordinate system. 243 | x2 = text1_length - v2[k2_offset] 244 | if x1 >= x2: 245 | # Overlap detected. 246 | return bisect_split_diffs(text1, text2, x1, y1) 247 | 248 | # Walk the reverse path one step. 249 | for k2 in range(-d + k2start, d + 1 - k2end, 2): 250 | k2_offset = v_offset + k2 251 | if k2 == -d or (k2 != d and 252 | v2[k2_offset - 1] < v2[k2_offset + 1]): 253 | x2 = v2[k2_offset + 1] 254 | else: 255 | x2 = v2[k2_offset - 1] + 1 256 | y2 = x2 - k2 257 | while (x2 < text1_length and y2 < text2_length and 258 | text1[-x2 - 1] == text2[-y2 - 1]): 259 | x2 += 1 260 | y2 += 1 261 | v2[k2_offset] = x2 262 | if x2 > text1_length: 263 | # Ran off the left of the graph. 264 | k2end += 2 265 | elif y2 > text2_length: 266 | # Ran off the top of the graph. 267 | k2start += 2 268 | elif not front: 269 | k1_offset = v_offset + delta - k2 270 | if k1_offset >= 0 and k1_offset < v_length and v1[k1_offset] != -1: 271 | x1 = v1[k1_offset] 272 | y1 = v_offset + x1 - k1_offset 273 | # Mirror x2 onto top-left coordinate system. 274 | x2 = text1_length - x2 275 | if x1 >= x2: 276 | # Overlap detected. 277 | return bisect_split_diffs(text1, text2, x1, y1) 278 | 279 | # Number of diffs equals number of characters, no commonality at all. 
280 | return [Diff(Ops.DELETE, text1), Diff(Ops.INSERT, text2)] 281 | 282 | def bisect_split_diffs(text1, text2, x, y): 283 | """Given the location of the 'middle snake', split the diff in two parts 284 | and recurse. 285 | 286 | Args: 287 | text1: Old string to be diffed. 288 | text2: New string to be diffed. 289 | x: Index of split point in text1. 290 | y: Index of split point in text2. 291 | 292 | Returns: 293 | List of diff tuples. 294 | """ 295 | text1a = text1[:x] 296 | text2a = text2[:y] 297 | text1b = text1[x:] 298 | text2b = text2[y:] 299 | 300 | # Compute both diffs serially. 301 | diffs = myers_diffs(text1a, text2a, False) 302 | diffsb = myers_diffs(text1b, text2b, False) 303 | 304 | if len(diffs) + len(diffsb) > MAX_DIFFS_THRESHOLD: 305 | raise TooManyDiffsException() 306 | 307 | return diffs + diffsb 308 | 309 | def lines_to_chars(text1, text2): 310 | """Split two texts into a list of strings. Reduce each text to a string 311 | in which each Unicode character represents one line. 312 | 313 | Args: 314 | text1: First string. 315 | text2: Second string. 316 | 317 | Returns: 318 | Three element tuple, containing the encoded text1, the encoded text2 and 319 | the list of unique strings. The zeroth element of the list of unique 320 | strings is intentionally blank. 321 | """ 322 | line_list = [] # e.g. line_list[4] == "Hello\n" 323 | line_dict = {} # e.g. line_dict["Hello\n"] == 4 324 | 325 | # "\x00" is a valid character, but various debuggers don't like it. 326 | # So we'll insert a junk entry to avoid generating a null character. 327 | line_list.append('') 328 | 329 | def lines_to_chars_munge(text): 330 | """Split a text into a list of strings. Reduce the text to a string 331 | in which each Unicode character represents one line. 332 | Modifies line_list and line_dict through being a closure. 333 | 334 | Args: 335 | text: String to encode. 336 | 337 | Returns: 338 | Encoded string.
339 | """ 340 | chars = [] 341 | # Walk the text, pulling out a substring for each line. 342 | # text.split('\n') would temporarily double our memory footprint. 343 | # Modifying text would create many large strings to garbage collect. 344 | line_start = 0 345 | line_end = -1 346 | while line_end < len(text) - 1: 347 | line_end = text.find('\n', line_start) 348 | if line_end == -1: 349 | line_end = len(text) - 1 350 | line = text[line_start:line_end + 1] 351 | line_start = line_end + 1 352 | 353 | if line in line_dict: 354 | chars.append(chr(line_dict[line])) 355 | else: 356 | line_list.append(line) 357 | line_dict[line] = len(line_list) - 1 358 | chars.append(chr(len(line_list) - 1)) 359 | return ''.join(chars) 360 | 361 | chars1 = lines_to_chars_munge(text1) 362 | chars2 = lines_to_chars_munge(text2) 363 | return (chars1, chars2, line_list) 364 | 365 | def common_prefix_length(text1, text2): 366 | """Determine the common prefix of two strings. 367 | 368 | Args: 369 | text1: First string. 370 | text2: Second string. 371 | 372 | Returns: 373 | The number of characters common to the start of each string. 374 | """ 375 | # Quick check for common null cases. 376 | if not text1 or not text2 or text1[0] != text2[0]: 377 | return 0 378 | # Binary search. 379 | # Performance analysis: http://neil.fraser.name/news/2007/10/09/ 380 | pointermin = 0 381 | pointermax = min(len(text1), len(text2)) 382 | pointermid = pointermax 383 | pointerstart = 0 384 | while pointermin < pointermid: 385 | if text1[pointerstart:pointermid] == text2[pointerstart:pointermid]: 386 | pointermin = pointermid 387 | pointerstart = pointermin 388 | else: 389 | pointermax = pointermid 390 | pointermid = (pointermax - pointermin) // 2 + pointermin 391 | return pointermid 392 | 393 | def common_suffix_length(text1, text2): 394 | """Determine the common suffix of two strings. 395 | 396 | Args: 397 | text1: First string. 398 | text2: Second string.
399 | 400 | Returns: 401 | The number of characters common to the end of each string. 402 | """ 403 | # Quick check for common null cases. 404 | if not text1 or not text2 or text1[-1] != text2[-1]: 405 | return 0 406 | # Binary search. 407 | # Performance analysis: http://neil.fraser.name/news/2007/10/09/ 408 | pointermin = 0 409 | pointermax = min(len(text1), len(text2)) 410 | pointermid = pointermax 411 | pointerend = 0 412 | while pointermin < pointermid: 413 | if (text1[-pointermid:len(text1) - pointerend] == 414 | text2[-pointermid:len(text2) - pointerend]): 415 | pointermin = pointermid 416 | pointerend = pointermin 417 | else: 418 | pointermax = pointermid 419 | pointermid = (pointermax - pointermin) // 2 + pointermin 420 | return pointermid 421 | 422 | def common_overlap(text1, text2): 423 | """Determine if the suffix of one string is the prefix of another. 424 | 425 | Args: 426 | text1: First string. 427 | text2: Second string. 428 | 429 | Returns: 430 | The number of characters common to the end of the first 431 | string and the start of the second string. 432 | """ 433 | # Cache the text lengths to prevent multiple calls. 434 | text1_length = len(text1) 435 | text2_length = len(text2) 436 | # Eliminate the null case. 437 | if text1_length == 0 or text2_length == 0: 438 | return 0 439 | # Truncate the longer string. 440 | if text1_length > text2_length: 441 | text1 = text1[-text2_length:] 442 | elif text1_length < text2_length: 443 | text2 = text2[:text1_length] 444 | text_length = min(text1_length, text2_length) 445 | # Quick check for the worst case. 446 | if text1 == text2: 447 | return text_length 448 | 449 | # Start by looking for a single character match 450 | # and increase length until no match is found.
451 | # Performance analysis: http://neil.fraser.name/news/2010/11/04/ 452 | best = 0 453 | length = 1 454 | while True: 455 | pattern = text1[-length:] 456 | found = text2.find(pattern) 457 | if found == -1: 458 | return best 459 | length += found 460 | if found == 0 or text1[-length:] == text2[:length]: 461 | best = length 462 | length += 1 463 | 464 | def cleanup_semantic(diffs): 465 | """Reduce the number of edits by eliminating semantically trivial 466 | equalities. 467 | 468 | Args: 469 | diffs: List of diff tuples. 470 | """ 471 | changes = False 472 | equalities = [] # Stack of indices where equalities are found. 473 | lastequality = None # Always equal to diffs[equalities[-1]].text 474 | pointer = 0 # Index of current position. 475 | # Number of chars that changed prior to the equality. 476 | (length_insertions1, length_deletions1) = (0, 0) 477 | # Number of chars that changed after the equality. 478 | (length_insertions2, length_deletions2) = (0, 0) 479 | while pointer < len(diffs): 480 | if diffs[pointer].op == Ops.EQUAL: # Equality found. 481 | equalities.append(pointer) 482 | (length_insertions1, length_insertions2) = (length_insertions2, 0) 483 | (length_deletions1, length_deletions2) = (length_deletions2, 0) 484 | lastequality = diffs[pointer].text 485 | else: # An insertion or deletion. 486 | if diffs[pointer].op == Ops.INSERT: 487 | length_insertions2 += len(diffs[pointer].text) 488 | else: 489 | length_deletions2 += len(diffs[pointer].text) 490 | # Eliminate an equality that is smaller or equal to the edits on both 491 | # sides of it. 492 | if (lastequality and (len(lastequality) <= 493 | max(length_insertions1, length_deletions1)) and 494 | (len(lastequality) <= max(length_insertions2, length_deletions2))): 495 | # Duplicate record. 496 | diffs.insert(equalities[-1], Diff(Ops.DELETE, lastequality)) 497 | # Change second copy to insert. 
498 | diffs[equalities[-1] + 1] = diffs[equalities[-1] + 1]._replace(op=Ops.INSERT) 499 | # Throw away the equality we just deleted. 500 | equalities.pop() 501 | # Throw away the previous equality (it needs to be reevaluated). 502 | if len(equalities): 503 | equalities.pop() 504 | if len(equalities): 505 | pointer = equalities[-1] 506 | else: 507 | pointer = -1 508 | # Reset the counters. 509 | length_insertions1, length_deletions1 = 0, 0 510 | length_insertions2, length_deletions2 = 0, 0 511 | lastequality = None 512 | changes = True 513 | pointer += 1 514 | 515 | # Normalize the diff. 516 | if changes: 517 | cleanup_merge(diffs) 518 | cleanup_semantic_lossless(diffs) 519 | 520 | # Find any overlaps between deletions and insertions. 521 | # e.g: <del>abcxxx</del><ins>xxxdef</ins> 522 | # -> <del>abc</del>xxx<ins>def</ins> 523 | # e.g: <del>xxxabc</del><ins>defxxx</ins> 524 | # -> <ins>def</ins>xxx<del>abc</del> 525 | # Only extract an overlap if it is as big as the edit ahead or behind it. 526 | pointer = 1 527 | while pointer < len(diffs): 528 | if (diffs[pointer - 1].op == Ops.DELETE and 529 | diffs[pointer].op == Ops.INSERT): 530 | deletion = diffs[pointer - 1].text 531 | insertion = diffs[pointer].text 532 | overlap_length1 = common_overlap(deletion, insertion) 533 | overlap_length2 = common_overlap(insertion, deletion) 534 | if overlap_length1 >= overlap_length2: 535 | if (overlap_length1 >= len(deletion) / 2.0 or 536 | overlap_length1 >= len(insertion) / 2.0): 537 | # Overlap found. Insert an equality and trim the surrounding edits. 538 | diffs.insert(pointer, Diff(Ops.EQUAL, insertion[:overlap_length1])) 539 | diffs[pointer - 1] = Diff(Ops.DELETE, deletion[:len(deletion) - overlap_length1]) 540 | diffs[pointer + 1] = Diff(Ops.INSERT, insertion[overlap_length1:]) 541 | pointer += 1 542 | else: 543 | if (overlap_length2 >= len(deletion) / 2.0 or 544 | overlap_length2 >= len(insertion) / 2.0): 545 | # Reverse overlap found. 546 | # Insert an equality and swap and trim the surrounding edits.
547 | diffs.insert(pointer, Diff(Ops.EQUAL, deletion[:overlap_length2])) 548 | diffs[pointer - 1] = Diff(Ops.INSERT, insertion[:len(insertion) - overlap_length2]) 549 | diffs[pointer + 1] = Diff(Ops.DELETE, deletion[overlap_length2:]) 550 | pointer += 1 551 | pointer += 1 552 | pointer += 1 553 | 554 | def cleanup_semantic_lossless(diffs): 555 | """Look for single edits surrounded on both sides by equalities 556 | which can be shifted sideways to align the edit to a word boundary. 557 | e.g: The c<ins>at c</ins>ame. -> The <ins>cat </ins>came. 558 | 559 | Args: 560 | diffs: List of diff tuples. 561 | """ 562 | 563 | def cleanup_semantic_score(one, two): 564 | """Given two strings, compute a score representing whether the 565 | internal boundary falls on logical boundaries. 566 | Scores range from 6 (best) to 0 (worst). 567 | Closure, but does not reference any external variables. 568 | 569 | Args: 570 | one: First string. 571 | two: Second string. 572 | 573 | Returns: 574 | The score. 575 | """ 576 | if not one or not two: 577 | # Edges are the best. 578 | return 6 579 | 580 | # Each port of this function behaves slightly differently due to 581 | # subtle differences in each language's definition of things like 582 | # 'whitespace'. Since this function's purpose is largely cosmetic, 583 | # the choice has been made to use each language's native features 584 | # rather than force total conformity.
585 | char1 = one[-1] 586 | char2 = two[0] 587 | non_alpha_numeric_1 = not char1.isalnum() 588 | non_alpha_numeric_2 = not char2.isalnum() 589 | whitespace1 = non_alpha_numeric_1 and char1.isspace() 590 | whitespace2 = non_alpha_numeric_2 and char2.isspace() 591 | line_break_1 = whitespace1 and (char1 == "\r" or char1 == "\n") 592 | line_break_2 = whitespace2 and (char2 == "\r" or char2 == "\n") 593 | blank_line_1 = line_break_1 and BLANK_LINE_END.search(one) 594 | blank_line_2 = line_break_2 and BLANK_LINE_START.match(two) 595 | 596 | if blank_line_1 or blank_line_2: 597 | # Five points for blank lines. 598 | return 5 599 | elif line_break_1 or line_break_2: 600 | # Four points for line breaks. 601 | return 4 602 | elif non_alpha_numeric_1 and not whitespace1 and whitespace2: 603 | # Three points for end of sentences. 604 | return 3 605 | elif whitespace1 or whitespace2: 606 | # Two points for whitespace. 607 | return 2 608 | elif non_alpha_numeric_1 or non_alpha_numeric_2: 609 | # One point for non-alphanumeric. 610 | return 1 611 | return 0 612 | 613 | pointer = 1 614 | # Intentionally ignore the first and last element (don't need checking). 615 | while pointer < len(diffs) - 1: 616 | if (diffs[pointer - 1].op == Ops.EQUAL and 617 | diffs[pointer + 1].op == Ops.EQUAL): 618 | # This is a single edit surrounded by equalities. 619 | equality1 = diffs[pointer - 1].text 620 | edit = diffs[pointer].text 621 | equality2 = diffs[pointer + 1].text 622 | 623 | # First, shift the edit as far left as possible. 624 | common_offset = common_suffix_length(equality1, edit) 625 | if common_offset: 626 | common_string = edit[-common_offset:] 627 | equality1 = equality1[:-common_offset] 628 | edit = common_string + edit[:-common_offset] 629 | equality2 = common_string + equality2 630 | 631 | # Second, step character by character right, looking for the best fit. 
632 | best_equality_1 = equality1 633 | best_edit = edit 634 | best_equality_2 = equality2 635 | best_score = (cleanup_semantic_score(equality1, edit) + cleanup_semantic_score(edit, equality2)) 636 | while edit and equality2 and edit[0] == equality2[0]: 637 | equality1 += edit[0] 638 | edit = edit[1:] + equality2[0] 639 | equality2 = equality2[1:] 640 | score = (cleanup_semantic_score(equality1, edit) + cleanup_semantic_score(edit, equality2)) 641 | # The >= encourages trailing rather than leading whitespace on edits. 642 | if score >= best_score: 643 | best_score = score 644 | best_equality_1 = equality1 645 | best_edit = edit 646 | best_equality_2 = equality2 647 | 648 | if diffs[pointer - 1].text != best_equality_1: 649 | # We have an improvement, save it back to the diff. 650 | if best_equality_1: 651 | diffs[pointer - 1] = diffs[pointer - 1]._replace(text=best_equality_1) 652 | else: 653 | del diffs[pointer - 1] 654 | pointer -= 1 655 | diffs[pointer] = diffs[pointer]._replace(text=best_edit) 656 | if best_equality_2: 657 | diffs[pointer + 1] = diffs[pointer + 1]._replace(text=best_equality_2) 658 | else: 659 | del diffs[pointer + 1] 660 | pointer -= 1 661 | pointer += 1 662 | 663 | def cleanup_efficiency(diffs): 664 | """Reduce the number of edits by eliminating operationally trivial 665 | equalities. 666 | 667 | Args: 668 | diffs: List of diff tuples. 669 | """ 670 | changes = False 671 | equalities = [] # Stack of indices where equalities are found. 672 | lastequality = None # Always equal to diffs[equalities[-1]].text 673 | pointer = 0 # Index of current position. 674 | pre_ins = False # Is there an insertion operation before the last equality. 675 | pre_del = False # Is there a deletion operation before the last equality. 676 | post_ins = False # Is there an insertion operation after the last equality. 677 | post_del = False # Is there a deletion operation after the last equality. 
678 | while pointer < len(diffs): 679 | if diffs[pointer].op == Ops.EQUAL: # Equality found. 680 | if (len(diffs[pointer].text) < DIFF_EDIT_COST and 681 | (post_ins or post_del)): 682 | # Candidate found. 683 | equalities.append(pointer) 684 | pre_ins = post_ins 685 | pre_del = post_del 686 | lastequality = diffs[pointer].text 687 | else: 688 | # Not a candidate, and can never become one. 689 | equalities = [] 690 | lastequality = None 691 | 692 | post_ins = post_del = False 693 | else: # An insertion or deletion. 694 | if diffs[pointer].op == Ops.DELETE: 695 | post_del = True 696 | else: 697 | post_ins = True 698 | 699 | # Five types to be split: 700 | # <ins>A</ins><del>B</del>XY<ins>C</ins><del>D</del> 701 | # <ins>A</ins>X<ins>C</ins><del>D</del> 702 | # <ins>A</ins><del>B</del>X<ins>C</ins> 703 | # <del>A</del>X<ins>C</ins><del>D</del> 704 | # <ins>A</ins><del>B</del>X<del>C</del> 705 | 706 | if lastequality and ((pre_ins and pre_del and post_ins and post_del) or 707 | ((len(lastequality) < DIFF_EDIT_COST / 2) and 708 | (pre_ins + pre_del + post_ins + post_del) == 3)): 709 | # Duplicate record. 710 | diffs.insert(equalities[-1], Diff(Ops.DELETE, lastequality)) 711 | # Change second copy to insert. 712 | diffs[equalities[-1] + 1] = Diff(Ops.INSERT, diffs[equalities[-1] + 1].text) 713 | equalities.pop() # Throw away the equality we just deleted. 714 | lastequality = None 715 | if pre_ins and pre_del: 716 | # No changes made which could affect previous entry, keep going. 717 | post_ins = post_del = True 718 | equalities = [] 719 | else: 720 | if len(equalities): 721 | equalities.pop() # Throw away the previous equality. 722 | if len(equalities): 723 | pointer = equalities[-1] 724 | else: 725 | pointer = -1 726 | post_ins = post_del = False 727 | changes = True 728 | pointer += 1 729 | 730 | if changes: 731 | cleanup_merge(diffs) 732 | 733 | def cleanup_merge(diffs): 734 | """Reorder and merge like edit sections. Merge equalities. 735 | Any edit section can move as long as it doesn't cross an equality. 736 | 737 | Args: 738 | diffs: List of diff tuples. 739 | """ 740 | diffs.append(Diff(Ops.EQUAL, '')) # Add a dummy entry at the end.
741 | pointer = 0 742 | count_delete = 0 743 | count_insert = 0 744 | text_delete = '' 745 | text_insert = '' 746 | while pointer < len(diffs): 747 | if diffs[pointer].op == Ops.INSERT: 748 | count_insert += 1 749 | text_insert += diffs[pointer].text 750 | pointer += 1 751 | elif diffs[pointer].op == Ops.DELETE: 752 | count_delete += 1 753 | text_delete += diffs[pointer].text 754 | pointer += 1 755 | elif diffs[pointer].op == Ops.EQUAL: 756 | # Upon reaching an equality, check for prior redundancies. 757 | if count_delete + count_insert > 1: 758 | if count_delete != 0 and count_insert != 0: 759 | # Factor out any common prefixes. 760 | common_length = common_prefix_length(text_insert, text_delete) 761 | if common_length != 0: 762 | x = pointer - count_delete - count_insert - 1 763 | if x >= 0 and diffs[x].op == Ops.EQUAL: 764 | diffs[x] = diffs[x]._replace(text=(diffs[x].text + text_insert[:common_length])) 765 | else: 766 | diffs.insert(0, Diff(Ops.EQUAL, text_insert[:common_length])) 767 | pointer += 1 768 | text_insert = text_insert[common_length:] 769 | text_delete = text_delete[common_length:] 770 | # Factor out any common suffixes. 771 | common_length = common_suffix_length(text_insert, text_delete) 772 | if common_length != 0: 773 | diffs[pointer] = diffs[pointer]._replace(text=( 774 | text_insert[-common_length:] + diffs[pointer].text 775 | )) 776 | text_insert = text_insert[:-common_length] 777 | text_delete = text_delete[:-common_length] 778 | # Delete the offending records and add the merged ones.
779 | if count_delete == 0: 780 | diffs[pointer - count_insert : pointer] = [Diff(Ops.INSERT, text_insert)] 781 | elif count_insert == 0: 782 | diffs[pointer - count_delete : pointer] = [Diff(Ops.DELETE, text_delete)] 783 | else: 784 | diffs[pointer - count_delete - count_insert : pointer] = [ 785 | Diff(Ops.DELETE, text_delete), 786 | Diff(Ops.INSERT, text_insert)] 787 | pointer = pointer - count_delete - count_insert + 1 788 | if count_delete != 0: 789 | pointer += 1 790 | if count_insert != 0: 791 | pointer += 1 792 | elif pointer != 0 and diffs[pointer - 1].op == Ops.EQUAL: 793 | # Merge this equality with the previous one. 794 | diffs[pointer - 1] = diffs[pointer - 1]._replace(text=( 795 | diffs[pointer - 1].text + diffs[pointer].text 796 | )) 797 | del diffs[pointer] 798 | else: 799 | pointer += 1 800 | 801 | count_insert = 0 802 | count_delete = 0 803 | text_delete = '' 804 | text_insert = '' 805 | 806 | if diffs[-1].text == '': 807 | diffs.pop() # Remove the dummy entry at the end. 808 | 809 | # Second pass: look for single edits surrounded on both sides by equalities 810 | # which can be shifted sideways to eliminate an equality. 811 | # e.g: A<ins>BA</ins>C -> <ins>AB</ins>AC 812 | changes = False 813 | pointer = 1 814 | # Intentionally ignore the first and last element (don't need checking). 815 | while pointer < len(diffs) - 1: 816 | if (diffs[pointer - 1].op == Ops.EQUAL and 817 | diffs[pointer + 1].op == Ops.EQUAL): 818 | # This is a single edit surrounded by equalities. 819 | if diffs[pointer].text.endswith(diffs[pointer - 1].text): 820 | # Shift the edit over the previous equality.
821 | diffs[pointer] = diffs[pointer]._replace(text=( 822 | diffs[pointer - 1].text + diffs[pointer].text[:-len(diffs[pointer - 1].text)] 823 | )) 824 | diffs[pointer + 1] = diffs[pointer + 1]._replace(text=( 825 | diffs[pointer - 1].text + diffs[pointer + 1].text 826 | )) 827 | del diffs[pointer - 1] 828 | changes = True 829 | elif diffs[pointer].text.startswith(diffs[pointer + 1].text): 830 | # Shift the edit over the next equality. 831 | diffs[pointer - 1] = diffs[pointer - 1]._replace(text=( 832 | diffs[pointer - 1].text + diffs[pointer + 1].text 833 | )) 834 | diffs[pointer] = diffs[pointer]._replace(text=( 835 | diffs[pointer].text[len(diffs[pointer + 1].text):] + diffs[pointer + 1].text 836 | )) 837 | del diffs[pointer + 1] 838 | changes = True 839 | pointer += 1 840 | 841 | # If shifts were made, the diff needs reordering and another shift sweep. 842 | if changes: 843 | cleanup_merge(diffs) 844 | --------------------------------------------------------------------------------
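The line-mode speedup in `line_mode_diffs` relies on the encoding performed by `lines_to_chars`: each distinct line is mapped to one Unicode character, so that a character-level diff of the encoded strings behaves like a line-level diff. The following is a minimal standalone sketch of that idea, not the module's own code. The names `encode_lines` and `decode_lines` are hypothetical, and the sketch splits lines with a regular expression rather than the `str.find` loop used above, which is simpler but does not avoid the temporary memory cost that loop is designed around.

```python
import re

# Matches one line including its trailing "\n", or a final unterminated line.
_LINE = re.compile(r'[^\n]*\n|[^\n]+')

def encode_lines(text1, text2):
    """Hypothetical helper: map each distinct line to a single character."""
    line_list = ['']  # Index 0 is a junk entry so chr(0) is never produced.
    line_dict = {}    # line -> index into line_list

    def munge(text):
        chars = []
        for line in _LINE.findall(text):
            if line not in line_dict:
                line_list.append(line)
                line_dict[line] = len(line_list) - 1
            chars.append(chr(line_dict[line]))
        return ''.join(chars)

    return munge(text1), munge(text2), line_list

def decode_lines(chars, line_list):
    """Hypothetical helper: expand encoded characters back into lines."""
    return ''.join(line_list[ord(char)] for char in chars)

chars1, chars2, line_list = encode_lines("alpha\nbeta\nalpha\n", "beta\nalpha\ngamma")
print(repr(chars1), repr(chars2), line_list)
```

Both texts share one `line_list`, so identical lines in either text encode to the same character, and decoding is the same `line_list[ord(char)]` lookup that `line_mode_diffs` applies to each diff's text.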