├── .gitattributes ├── .github └── workflows │ └── build.yml ├── .gitignore ├── .npmrc ├── LICENSE ├── README.md ├── cover.png ├── index.html ├── package.json └── spec.emu /.gitattributes: -------------------------------------------------------------------------------- 1 | index.html -diff merge=ours 2 | spec.js -diff merge=ours 3 | spec.css -diff merge=ours 4 | -------------------------------------------------------------------------------- /.github/workflows/build.yml: -------------------------------------------------------------------------------- 1 | name: Deploy spec 2 | 3 | on: [push] 4 | 5 | jobs: 6 | build: 7 | runs-on: ubuntu-latest 8 | 9 | steps: 10 | - uses: actions/checkout@v2 11 | - uses: actions/setup-node@v1 12 | with: 13 | node-version: '12.x' 14 | - run: npm install 15 | - run: npm run build 16 | - name: commit changes 17 | uses: elstudio/actions-js-build/commit@v3 18 | with: 19 | commitMessage: "fixup: [spec] `npm run build`" 20 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Logs 2 | logs 3 | *.log 4 | npm-debug.log* 5 | 6 | # Runtime data 7 | pids 8 | *.pid 9 | *.seed 10 | 11 | # Directory for instrumented libs generated by jscoverage/JSCover 12 | lib-cov 13 | 14 | # Coverage directory used by tools like istanbul 15 | coverage 16 | 17 | # nyc test coverage 18 | .nyc_output 19 | 20 | # Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files) 21 | .grunt 22 | 23 | # node-waf configuration 24 | .lock-wscript 25 | 26 | # Compiled binary addons (http://nodejs.org/api/addons.html) 27 | build/Release 28 | 29 | # Dependency directories 30 | node_modules 31 | jspm_packages 32 | 33 | # Optional npm cache directory 34 | .npm 35 | 36 | # Optional REPL history 37 | .node_repl_history 38 | 39 | # Only apps should have lockfiles 40 | yarn.lock 41 | package-lock.json 42 | npm-shrinkwrap.json 43 | 
pnpm-lock.yaml 44 | -------------------------------------------------------------------------------- /.npmrc: -------------------------------------------------------------------------------- 1 | package-lock=false 2 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2017 ECMA TC39 and contributors 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Reversible string split 2 | 3 | ![](./cover.png) 4 | 5 | Image from 6 | @DasSurma 7 | 8 | ## Status 9 | 10 | Champion(s): Luca Casonato 11 | 12 | Author(s): Luca Casonato 13 | 14 | Stage: 1 15 | 16 | ## Motivation 17 | 18 | The string split method in JavaScript behaves unlike the string split methods in 19 | nearly all other languages. In JavaScript, a `split(sep, N)` is essentially a 20 | regular split, but with the output array truncated to the first N values. 21 | JavaScript considers the N to mean the number of splits _and_ the number of 22 | return items. 23 | 24 | In most other languages a `splitN` instead splits the original string into N 25 | items, including a remainder. They thus split N-1 times. The last item in the 26 | returned array contains the "remainder" of the string. 27 | 28 | ```perl 29 | # Perl 30 | 31 | print join("\n", split(/\|/, 'a|b|c|d|e|f', 2)) 32 | 33 | # a 34 | # b|c|d|e|f 35 | ``` 36 | 37 | ```php 38 | 39 | >(); 77 | println!("{:?}", v); 78 | } 79 | 80 | // ["a", "b|c|d|e|f"] 81 | ``` 82 | 83 | ```java 84 | // Java 85 | 86 | class Playground { 87 | public static void main(String[] args) { 88 | String s = new String("a|b|c|d|e|f"); 89 | for(String val : s.split("\\|", 2)) { 90 | System.out.println(val); 91 | } 92 | } 93 | } 94 | 95 | // a 96 | // b|c|d|e|f 97 | ``` 98 | 99 | ```python 100 | # Python 101 | 102 | print('a|b|c|d|e|f'.split('|', 2)) 103 | 104 | # ['a', 'b', 'c|d|e|f'] 105 | ``` 106 | 107 | ```js 108 | // JavaScript 109 | 110 | console.log("a|b|c|d|e|f".split("|", 2)); 111 | 112 | // ["a", "b"] 113 | ``` 114 | 115 | The first 6 of the 8 languages agree here. They consider the N to mean "the number of 116 | items returned" and the remainder to be the last item in the returned array. 117 | This means they actually split N-1 times. 
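The "N items, including a remainder" behaviour described above can be sketched in today's JavaScript. This is only an illustration under stated assumptions: `splitN` here is a hypothetical free function (not a built-in and not a specified algorithm), it accepts only non-empty string separators, and it rejects any `n` that is not a positive integer.

```js
// Hypothetical helper for illustration only; not part of any engine or spec.
// Splits `str` on the string `sep` at most n-1 times and keeps the
// unsplit remainder as the last item of the returned array.
function splitN(str, sep, n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  if (typeof sep !== "string" || sep === "") {
    throw new TypeError("sep must be a non-empty string");
  }
  const parts = [];
  let start = 0;
  // Perform at most n-1 splits.
  while (parts.length < n - 1) {
    const index = str.indexOf(sep, start);
    if (index === -1) break; // no more separators to split on
    parts.push(str.slice(start, index));
    start = index + sep.length;
  }
  // Whatever is left, separators included, becomes the last item.
  parts.push(str.slice(start));
  return parts;
}

console.log(splitN("a|b|c|d|e|f", "|", 2));
// ["a", "b|c|d|e|f"]
```

Because the remainder is kept intact, joining the returned items with the same separator gives back the original string.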
118 | 119 | Python also agrees that the remainder should be returned as the last item in the 120 | array. It disagrees with the rest about what N means though. Python splits N 121 | times, and returns N+1 items. 122 | 123 | JavaScript diverges from the pack completely though: it splits N times, and 124 | returns N items, but does not return a remainder at all. It is the only language 125 | to do so. 126 | 127 | Even though Python and the other languages are slightly different from each 128 | other, all their algorithms have a common feature that JavaScript is missing: 129 | their splits are reversible. This means that if you split a string into N items, 130 | you can join them back together without losing any information. 131 | 132 | Reversible splits have the property that for any string V, any separator S, 133 | and any positive integer N, the following holds: 134 | 135 | ```js 136 | join(S, V.split(S, N)) == V; 137 | ``` 138 | 139 | This reversibility allows using string splits for some very useful tasks, where 140 | the current split method does not work. The most common use case for this is 141 | prefix splits: 142 | 143 | ### Prefix splits 144 | 145 | Many formats out there are character delimited. It is useful to be able to 146 | easily split a string at those predefined "split points" into two parts. For 147 | example the INI file format uses the `=` character to separate a key from its value, 148 | and the `\n` character to separate key-value pairs from each other. 
149 | 150 | ```ini 151 | key = value 152 | other_key = 'value contains an = sign' 153 | ``` 154 | 155 | With the current "split" in JavaScript, parsing this is not as obvious as with 156 | the "more popular" splitting algorithm: 157 | 158 | ```js 159 | // Current JavaScript 160 | const ini = Deno.readTextFileSync("./test.ini"); 161 | const entries = ini.split("\n").map((line) => { 162 | const [key, ...rest] = line.split("="); 163 | return [key, rest.join("=")]; 164 | }); 165 | 166 | // Other languages 167 | const ini = Deno.readTextFileSync("./test.ini"); 168 | const entries = ini.split("\n").map((line) => line.splitN("=", 2)); 169 | ``` 170 | 171 | > **Note:** I am aware this could be made more efficient with a different 172 | > "parser". That is not the point. The point is to make the obvious thing easy. 173 | 174 | This behaviour is not just relevant for the INI file format, but also for things 175 | like HTTP headers in HTTP/1.1, key-value pairs in `Cookie` headers, and many 176 | more. 177 | 178 | ## Proposal 179 | 180 | The proposal is to add reversible string split support to JavaScript. It 181 | adds a `String.prototype.splitN` method that splits 182 | the input string at most N-1 times, returning at most N substrings. The last item 183 | contains the remainder of the string. 184 | 185 | ```js 186 | console.log("a|b|c|d|e|f".splitN("|", 2)); 187 | // ["a", "b|c|d|e|f"] 188 | ``` 189 | 190 | The naming is taken from Rust and Go. 191 | 192 | ## Q&A 193 | 194 | ### Could this be an extra option for the `split` method? 195 | 196 | Yes. This could also be an option in a new options bag for split. 
Example: 197 | 198 | ```js 199 | console.log("a|b|c|d|e|f".split("|", { n: 2 })); 200 | // or 201 | console.log("a|b|c|d|e|f".split("|", 2, true)); 202 | // or 203 | console.log("a|b|c|d|e|f".split("|", { n: 2, remainder: true })); 204 | // or 205 | console.log("a|b|c|d|e|f".split("|", 2, { remainder: true })); 206 | ``` 207 | 208 | The first may be confusing to users though, as it is not obvious that the return 209 | values of `split("|", 2)` and `split("|", { n: 2 })` differ. These 210 | kinds of overloads exist on the web platform (e.g. `addEventListener`), but there the 211 | form you use does not impact behaviour. 212 | 213 | The second is clearer in that respect, but less clear in another: it is 214 | not obvious what the `true` value in the third argument means. 215 | 216 | The third option is pretty clear, but is also the most verbose. The verbosity 217 | may make it cumbersome to use. 218 | 219 | The 4th option is probably the "cleanest". However, because the extra option is ignored 220 | in current engines, it might make it look like the extra option is supported, 221 | whereas in fact it is not - it is just being ignored. 222 | 223 | Which of the 4 proposed options should ultimately be used should be up to the 224 | committee as a whole. I don't really care (although I prefer the `splitN` 225 | option). 226 | 227 | ### I like the current behaviour of split! 228 | 229 | No worries! It isn't going away. The new `splitN` function is meant to simplify 230 | the use cases described above. You can continue to use `split` as it exists now. 231 | 232 | ### Why is JavaScript's split so different from other languages? 233 | 234 | To be completely honest: we don't really know. Nobody we have asked has been able 235 | to give us a good answer yet. What we do know: 236 | 237 | - Netscape Navigator 4, released in June 1997, was the first browser to support 238 | the second argument to the `split` method with its current behaviour. 
NN3, 239 | released 10 months prior, did not have it. 240 | - IE 4, released 4 months later, does not have this feature/behaviour. 241 | - The first ECMA-262 edition in which this behaviour is specified is ES3. 242 | - The behaviour did not come from Java, which only added a `String.split` method in 243 | version 1.4 (2002). 244 | 245 | Thanks to @hax and @aimingoo for the research on this. 246 | -------------------------------------------------------------------------------- /cover.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tc39/proposal-reversible-string-split/a8936e091a302a1aeeabdd0b931a9af70d8343af/cover.png -------------------------------------------------------------------------------- /index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | Proposal Title Goes Here

Stage -1 Draft / January 26, 2022

Proposal Title Goes Here

2294 | 2295 | 2296 |

1 This is an emu-clause

2297 |

This is an algorithm:

2298 |
  1. Let proposal be undefined.
  2. If IsAccepted(proposal),
    1. Let stage be 0.
  3. Else,
    1. Let stage be -1.
  4. Return ? ToString(proposal).
2299 |
2300 |

A Copyright & Software License

2301 | 2302 |

Copyright Notice

2303 |

© 2022 Your Name(s) Here

2304 | 2305 |

Software License

2306 |

All Software contained in this document ("Software") is protected by copyright and is being made available under the "BSD License", included below. This Software may be subject to third party rights (rights from parties other than Ecma International), including patent rights, and no licenses under such third party rights are granted under this license even if the third party concerned is a member of Ecma International. SEE THE ECMA CODE OF CONDUCT IN PATENT MATTERS AVAILABLE AT https://ecma-international.org/memento/codeofconduct.htm FOR INFORMATION REGARDING THE LICENSING OF PATENT CLAIMS THAT ARE REQUIRED TO IMPLEMENT ECMA INTERNATIONAL STANDARDS.

2307 | 2308 |

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

2309 | 2310 |
2311 |
  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2312 |
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
2313 |
  3. Neither the name of the authors nor Ecma International may be used to endorse or promote products derived from this software without specific prior written permission.
2314 |
2315 | 2316 |

THIS SOFTWARE IS PROVIDED BY THE ECMA INTERNATIONAL "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL ECMA INTERNATIONAL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

2317 | 2318 |
2319 |
-------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- 1 | { 2 | "private": true, 3 | "name": "template-for-proposals", 4 | "description": "A repository template for ECMAScript proposals.", 5 | "scripts": { 6 | "start": "npm run build-loose -- --watch", 7 | "build": "npm run build-loose -- --strict", 8 | "build-loose": "ecmarkup --verbose spec.emu index.html" 9 | }, 10 | "homepage": "https://github.com/tc39/template-for-proposals#readme", 11 | "repository": { 12 | "type": "git", 13 | "url": "git+https://github.com/tc39/template-for-proposals.git" 14 | }, 15 | "license": "MIT", 16 | "devDependencies": { 17 | "ecmarkup": "^8.1.0" 18 | } 19 | } 20 | -------------------------------------------------------------------------------- /spec.emu: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 |
 7 | title: Proposal Title Goes Here
 8 | stage: -1
 9 | contributors: Your Name(s) Here
10 | 
11 | 12 | 13 |

This is an emu-clause

14 |

This is an algorithm:

15 | 16 | 1. Let _proposal_ be *undefined*. 17 | 1. If IsAccepted(_proposal_), 18 | 1. Let _stage_ be *0*. 19 | 1. Else, 20 | 1. Let _stage_ be *-1*. 21 | 1. Return ? ToString(_proposal_). 22 | 23 |
24 | --------------------------------------------------------------------------------